Commit Graph

1385 Commits

Author SHA1 Message Date
Andrew Lunn
3095383a8a net: dsa: mv88e6xxx: Unique IRQ name
Dynamically generate a unique switch interrupt name, based on the
device name.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 18:30:14 -08:00
Vladimir Oltean
bdeced75b1 net: dsa: felix: Add PCS operations for PHYLINK
Layerscape SoCs traditionally expose the SerDes configuration/status for
Ethernet protocols (PCS for SGMII/USXGMII/10GBase-R etc etc) in a register
format that is compatible with clause 22 or clause 45 (depending on
SerDes protocol). Each MAC has its own internal MDIO bus on which there
is one or more of these PCS's, responding to commands at a configurable
PHY address. The per-port internal MDIO bus (which is just for PCSs) is
totally separate and has nothing to do with the dedicated external MDIO
controller (which is just for PHYs), but the register map for the MDIO
controller is the same.

The VSC9959 (Felix) switch instantiated in the LS1028A is integrated
in hardware with the ENETC PCS of its DSA master, and reuses its MDIO
controller driver, so Felix has been made to depend on it in Kconfig.

 +------------------------------------------------------------------------+
 |                   +--------+ GMII (typically disabled via RCW)         |
 | ENETC PCI         |  ENETC |--------------------------+                |
 | Root Complex      | port 3 |-----------------------+  |                |
 | Integrated        +--------+                       |  |                |
 | Endpoint                                           |  |                |
 |                   +--------+ 2.5G GMII             |  |                |
 |                   |  ENETC |--------------+        |  |                |
 |                   | port 2 |-----------+  |        |  |                |
 |                   +--------+           |  |        |  |                |
 |                                     +--------+  +--------+             |
 |                                     |  Felix |  |  Felix |             |
 |                                     | port 4 |  | port 5 |             |
 |                                     +--------+  +--------+             |
 |                                                                        |
 | +--------+  +--------+  +--------+  +--------+  +--------+  +--------+ |
 | |  ENETC |  |  ENETC |  |  Felix |  |  Felix |  |  Felix |  |  Felix | |
 | | port 0 |  | port 1 |  | port 0 |  | port 1 |  | port 2 |  | port 3 | |
 +------------------------------------------------------------------------+
 |    ||||  SerDes |          ||||        ||||        ||||        ||||    |
 | +--------+block |       +--------------------------------------------+ |
 | |  ENETC |      |       |       ENETC port 2 internal MDIO bus       | |
 | | port 0 |      |       |  PCS         PCS          PCS        PCS   | |
 | |   PCS  |      |       |   0           1            2          3    | |
 +-----------------|------------------------------------------------------+
        v          v           v           v            v          v
     SGMII/      RGMII    QSGMII/QSXGMII/4xSGMII/4x1000Base-X/4x2500Base-X
    USXGMII/   (bypasses
  1000Base-X/   SerDes)
  2500Base-X

In the LS1028A SoC described above, the VSC9959 Felix switch is PF5 of
the ENETC root complex, and has 2 BARs:
- BAR 4: the switch's effective registers
- BAR 0: the MDIO controller register map lended from ENETC port 2
         (PF2), for accessing its associated PCS's.

This explanation is necessary because the patch does some renaming
"pci_bar" -> "switch_pci_bar" for clarity, which would otherwise appear
a bit obtuse.

The fact that the internal MDIO bus is "borrowed" is relevant because
the register map is found in PF5 (the switch) but it triggers an access
fault if PF2 (the ENETC DSA master) is not enabled. This is not treated
in any way (and I don't think it can be treated).

All of this is so SoC-specific, that it was contained as much as
possible in the platform-integration file felix_vsc9959.c.

We need to parse and pre-validate the device tree because of 2 reasons:
- The PHY mode (SerDes protocol) cannot change at runtime due to SoC
  design.
- There is a circular dependency in that we need to know what clause the
  PCS speaks in order to find it on the internal MDIO bus. But the
  clause of the PCS depends on what phy-mode it is configured for.

The goal of this patch is to make steps towards removing the bootloader
dependency for SGMII PCS pre-configuration, as well as to add support
for monitoring the in-band SGMII AN between the PCS and the system-side
link partner (PHY or other MAC).

In practice the bootloader dependency is not completely removed. U-Boot
pre-programs the PHY address at which each PCS can be found on the
internal MDIO bus (MDEV_PORT). This is needed because the PCS of each
port has the same out-of-reset PHY address of zero. The SerDes register
for changing MDEV_PORT is pretty deep in the SoC (outside the addresses
of the ENETC PCI BARs) and therefore inaccessible to us from here.

Felix VSC9959 and Ocelot VSC7514 are integrated very differently in
their respective SoCs, and for that reason Felix does not use the Ocelot
core library for PHYLINK. On one hand we don't want to impose the
fixed phy-mode limitation to Ocelot, and on the other hand Felix doesn't
need to force the MAC link speed the way Ocelot does, since the MAC is
connected to the PCS through a fixed GMII, and the PCS is the one who
does the rate adaptation at lower link speeds, which the MAC does not
even need to know about. In fact changing the GMII speed for Felix
irrecoverably breaks transmission through that port until a reset.

The pair with ENETC port 3 and Felix port 5 is optional and doesn't
support tagging. When we enable it, swp5 is a regular slave port, albeit
an internal one. The trouble is that it doesn't work, and that is
because the DSA PHYLIB adaptation layer doesn't treat fixed-link slave
ports. So that is yet another reason for wanting to convert Felix to the
native PHYLINK API.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 23:22:33 -08:00
Vladimir Oltean
a68578c20a net: dsa: Make deferred_xmit private to sja1105
There are 3 things that are wrong with the DSA deferred xmit mechanism:

1. Its introduction has made the DSA hotpath ever so slightly more
   inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs
   to be initialized to false for every transmitted frame, in order to
   figure out whether the driver requested deferral or not (a very rare
   occasion, rare even for the only driver that does use this mechanism:
   sja1105). That was necessary to avoid kfree_skb from freeing the skb.

2. Because L2 PTP is a link-local protocol like STP, it requires
   management routes and deferred xmit with this switch. But as opposed
   to STP, the deferred work mechanism needs to schedule the packet
   rather quickly for the TX timstamp to be collected in time and sent
   to user space. But there is no provision for controlling the
   scheduling priority of this deferred xmit workqueue. Too bad this is
   a rather specific requirement for a feature that nobody else uses
   (more below).

3. Perhaps most importantly, it makes the DSA core adhere a bit too
   much to the NXP company-wide policy "Innovate Where It Doesn't
   Matter". The sja1105 is probably the only DSA switch that requires
   some frames sent from the CPU to be routed to the slave port via an
   out-of-band configuration (register write) rather than in-band (DSA
   tag). And there are indeed very good reasons to not want to do that:
   if that out-of-band register is at the other end of a slow bus such
   as SPI, then you limit that Ethernet flow's throughput to effectively
   the throughput of the SPI bus. So hardware vendors should definitely
   not be encouraged to design this way. We do _not_ want more
   widespread use of this mechanism.

Luckily we have a solution for each of the 3 issues:

For 1, we can just remove that variable in the skb->cb and counteract
the effect of kfree_skb with skb_get, much to the same effect. The
advantage, of course, being that anybody who doesn't use deferred xmit
doesn't need to do any extra operation in the hotpath.

For 2, we can create a kernel thread for each port's deferred xmit work.
If the user switch ports are named swp0, swp1, swp2, the kernel threads
will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15
character length limit on kernel thread names). With this, the user can
change the scheduling priority with chrt $(pidof swp2_xmit).

For 3, we can actually move the entire implementation to the sja1105
driver.

So this patch deletes the generic implementation from the DSA core and
adds a new one, more adequate to the requirements of PTP TX
timestamping, in sja1105_main.c.

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 15:13:13 -08:00
Vladimir Oltean
0a51826c6e net: dsa: sja1105: Always send through management routes in slot 0
I finally found out how the 4 management route slots are supposed to
be used, but.. it's not worth it.

The description from the comment I've just deleted in this commit is
still true: when more than 1 management slot is active at the same time,
the switch will match frames incoming [from the CPU port] on the lowest
numbered management slot that matches the frame's DMAC.

My issue was that one was not supposed to statically assign each port a
slot. Yes, there are 4 slots and also 4 non-CPU ports, but that is a
mere coincidence.

Instead, the switch can be used like this: every management frame gets a
slot at the right of the most recently assigned slot:

Send mgmt frame 1 through S0:    S0 x  x  x
Send mgmt frame 2 through S1:    S0 S1 x  x
Send mgmt frame 3 through S2:    S0 S1 S2 x
Send mgmt frame 4 through S3:    S0 S1 S2 S3

The difference compared to the old usage is that the transmission of
frames 1-4 doesn't need to wait until the completion of the management
route. It is safe to use a slot to the right of the most recently used
one, because by protocol nobody will program a slot to your left and
"steal" your route towards the correct egress port.

So there is a potential throughput benefit here.

But mgmt frame 5 has no more free slot to use, so it has to wait until
_all_ of S0, S1, S2, S3 are full, in order to use S0 again.

And that's actually exactly the problem: I was looking for something
that would bring more predictable transmission latency, but this is
exactly the opposite: 3 out of 4 frames would be transmitted quicker,
but the 4th would draw the short straw and have a worse worst-case
latency than before.

Useless.

Things are made even worse by PTP TX timestamping, which is something I
won't go deeply into here. Suffice to say that the fact there is a
driver-level lock on the SPI bus offsets any potential throughput gains
that parallelism might bring.

So there's no going back to the multi-slot scheme, remove the
"mgmt_slot" variable from sja1105_port and the dummy static assignment
made at probe time.

While passing by, also remove the assignment to casc_port altogether.
Don't pretend that we support cascaded setups.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 15:13:12 -08:00
Florian Fainelli
aa1d54c65d net: dsa: vsc73xx: Remove dependency on CONFIG_OF
There is no build time dependency on CONFIG_OF, but we do need to make
sure we gate the initialization of the gpio_chip::of_node member with a
proper check on CONFIG_OF_GPIO. This enables the driver to build on
platforms that do not have CONFIG_OF enabled.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 14:23:48 -08:00
David S. Miller
31d518f35e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Simple overlapping changes in bpf land wrt. bpf_helper_defs.h
handling.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-31 13:37:13 -08:00
Vladimir Oltean
19d1f0ed74 net: dsa: sja1105: Empty the RX timestamping queue on PTP settings change
When disabling PTP timestamping, don't reset the switch with the new
static config until all existing PTP frames have been timestamped on the
RX path or dropped. There's nothing we can do with these afterwards.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30 20:31:40 -08:00
Vladimir Oltean
1e762bd278 net: dsa: sja1105: Use PTP core's dedicated kernel thread for RX timestamping
And move the queue of skb's waiting for RX timestamps into the ptp_data
structure, since it isn't needed if PTP is not compiled.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30 20:31:40 -08:00
Vladimir Oltean
54fa49ee88 net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S
For first-generation switches (SJA1105E and SJA1105T):
- TPID means C-Tag (typically 0x8100)
- TPID2 means S-Tag (typically 0x88A8)

While for the second generation switches (SJA1105P, SJA1105Q, SJA1105R,
SJA1105S) it is the other way around:
- TPID means S-Tag (typically 0x88A8)
- TPID2 means C-Tag (typically 0x8100)

In other words, E/T tags untagged traffic with TPID, and P/Q/R/S with
TPID2.

So the patch mentioned below fixed VLAN filtering for P/Q/R/S, but broke
it for E/T.

We strive for a common code path for all switches in the family, so just
lie in the static config packing functions that TPID and TPID2 are at
swapped bit offsets than they actually are, for P/Q/R/S. This will make
both switches understand TPID to be ETH_P_8021Q and TPID2 to be
ETH_P_8021AD. The meaning from the original E/T was chosen over P/Q/R/S
because E/T is actually the one with public documentation available
(UM10944.pdf).

Fixes: f9a1a7646c ("net: dsa: sja1105: Reverse TPID and TPID2")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30 20:15:02 -08:00
Vladimir Oltean
d00bdc0a88 net: dsa: sja1105: Remove restriction of zero base-time for taprio offload
The check originates from the initial implementation which was not based
on PTP time but on a standalone clock source. In the meantime we can now
program the PTPSCHTM register at runtime with the dynamic base time
(actually with a value that is 200 ns smaller, to avoid writing DELTA=0
in the Schedule Entry Points Parameters Table). And we also have logic
for moving the actual base time in the future of the PHC's current time
base, so the check for zero serves no purpose, since even if the user
will specify zero, that's not what will end up in the static config
table where the limitation is.

Fixes: 86db36a347 ("net: dsa: sja1105: Implement state machine for TAS with PTP clock source")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30 20:13:11 -08:00
Vladimir Oltean
5a47f588ee net: dsa: sja1105: Really make the PTP command read-write
When activating tc-taprio offload on the switch ports, the TAS state
machine will try to check whether it is running or not, but will find
both the STARTED and STOPPED bits as false in the
sja1105_tas_check_running function. So the function will return -EINVAL
(an abnormal situation) and the kernel will keep printing this from the
TAS FSM workqueue:

[   37.691971] sja1105 spi0.1: An operation returned -22

The reason is that the underlying function that gets called,
sja1105_ptp_commit, does not actually do a SPI_READ, but a SPI_WRITE. So
the command buffer remains initialized with zeroes instead of retrieving
the hardware state. Fix that.

Fixes: 41603d78b3 ("net: dsa: sja1105: Make the PTP command read-write")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30 20:11:28 -08:00
Vladimir Oltean
9fcf024dd6 net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slot
The PTP egress timestamp N must be captured from register PTPEGR_TS[n],
where n = 2 * PORT + TSREG. There are 10 PTPEGR_TS registers, 2 per
port. We are only using TSREG=0.

As opposed to the management slots, which are 4 in number
(SJA1105_NUM_PORTS, minus the CPU port). Any management frame (which
includes PTP frames) can be sent to any non-CPU port through any
management slot. When the CPU port is not the last port (#4), there will
be a mismatch between the slot and the port number.

Luckily, the only mainline occurrence with this switch
(arch/arm/boot/dts/ls1021a-tsn.dts) does have the CPU port as #4, so the
issue did not manifest itself thus far.

Fixes: 47ed985e97 ("net: dsa: sja1105: Add logic for TX timestamping")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30 20:10:20 -08:00
Nikita Yushchenko
0df9528736 mv88e6xxx: Add serdes Rx statistics
If packet checker is enabled in the serdes, then Rx counter registers
start working, and no side effects have been detected.

This patch enables packet checker automatically when powering serdes on,
and exposes Rx counter registers via ethtool statistics interface.

Code partially basded by older attempt by Andrew Lunn.

Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-27 16:34:15 -08:00
Mao Wenan
c8f957df6e net: dsa: qca: ar9331: drop pointless static qualifier in ar9331_sw_mbus_init
There is no need to set variable 'mbus' static
since new value always be assigned before use it.

Signed-off-by: Mao Wenan <maowenan@huawei.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-27 16:31:59 -08:00
Florian Fainelli
7c3125f0a6 net: dsa: bcm_sf2: Fix IP fragment location and behavior
The IP fragment is specified through user-defined field as the first
bit of the first user-defined word. We were previously trying to extract
it from the user-defined mask which could not possibly work. The ip_frag
is also supposed to be a boolean, if we do not cast it as such, we risk
overwriting the next fields in CFP_DATA(6) which would render the rule
inoperative.

Fixes: 7318166cac ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24 16:08:49 -08:00
David S. Miller
ac80010fc9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Mere overlapping changes in the conflicts here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-22 15:15:05 -08:00
Oleksij Rempel
ec6698c272 net: dsa: add support for Atheros AR9331 built-in switch
Provide basic support for Atheros AR9331 built-in switch. So far it
works as port multiplexer without any hardware offloading support.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-20 17:05:47 -08:00
Arnd Bergmann
95bed1a9fb net: dsa: ocelot: add NET_VENDOR_MICROSEMI dependency
Selecting MSCC_OCELOT_SWITCH is not possible when NET_VENDOR_MICROSEMI
is disabled:

WARNING: unmet direct dependencies detected for MSCC_OCELOT_SWITCH
  Depends on [n]: NETDEVICES [=y] && ETHERNET [=n] && NET_VENDOR_MICROSEMI [=n] && NET_SWITCHDEV [=y] && HAS_IOMEM [=y]
  Selected by [m]:
  - NET_DSA_MSCC_FELIX [=m] && NETDEVICES [=y] && HAVE_NET_DSA [=y] && NET_DSA [=y] && PCI [=y]

Add a Kconfig dependency on NET_VENDOR_MICROSEMI, which also implies
CONFIG_NETDEVICES.

Depending on a vendor config violates menuconfig locality for the DSA
driver, but is the smallest compromise since all other solutions are
much more complicated (see [0]).

https://www.spinics.net/lists/netdev/msg618808.html

Fixes: 5605194877 ("net: dsa: ocelot: add driver for Felix switch family")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-16 19:24:26 -08:00
Florian Fainelli
63cc54a6f0 net: dsa: b53: Fix egress flooding settings
There were several issues with 53568438e3 ("net: dsa: b53: Add support for port_egress_floods callback") that resulted in breaking connectivity for standalone ports:

- both user and CPU ports must allow unicast and multicast forwarding by
  default otherwise this just flat out breaks connectivity for
  standalone DSA ports
- IP multicast is treated similarly as multicast, but has separate
  control registers
- the UC, MC and IPMC lookup failure register offsets were wrong, and
  instead used bit values that are meaningful for the
  B53_IP_MULTICAST_CTRL register

Fixes: 53568438e3 ("net: dsa: b53: Add support for port_egress_floods callback")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-16 16:08:09 -08:00
David S. Miller
adf6f8cb3f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in networking bug fixes for merge window.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25 14:57:26 -08:00
Oleksij Rempel
9bca3a0a92 net: dsa: sja1105: fix sja1105_parse_rgmii_delays()
This function was using configuration of port 0 in devicetree for all ports.
In case CPU port was not 0, the delay settings was ignored. This resulted not
working communication between CPU and the switch.

Fixes: f5b8631c29 ("net: dsa: sja1105: Error out if RGMII delays are requested in DT")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25 10:56:12 -08:00
Chen Wandun
3243e04ab1 net: dsa: ocelot: fix "should it be static?" warnings
Fix following sparse warnings:
drivers/net/dsa/ocelot/felix.c:351:6: warning: symbol 'felix_txtstamp' was not declared. Should it be static?

Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22 10:09:10 -08:00
Yangbo Lu
c0bcf53766 net: dsa: ocelot: add hardware timestamping support for Felix
This patch is to reuse ocelot functions as possible to enable PTP
clock and to support hardware timestamping on Felix.
On TX path, timestamping works on packet which requires timestamp.
The injection header will be configured accordingly, and skb clone
requires timestamp will be added into a list. The TX timestamp
is final handled in threaded interrupt handler when PTP timestamp
FIFO is ready.
On RX path, timestamping is always working. The RX timestamp could
be got from extraction header.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:39:02 -08:00
Yangbo Lu
5df66c48bc net: dsa: ocelot: define PTP registers for felix_vsc9959
This patch is to define PTP registers for felix_vsc9959.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21 14:39:02 -08:00
Vladimir Oltean
b8fc7177d8 net: dsa: felix: Fix CPU port assignment when not last port
On the NXP LS1028A, there are 2 Ethernet links between the Felix switch
and the ENETC:
- eno2 <-> swp4, at 2.5G
- eno3 <-> swp5, at 1G

Only one of the above Ethernet port pairs can act as a DSA link for
tagging.

When adding initial support for the driver, it was tested only on the 1G
eno3 <-> swp5 interface, due to the necessity of using PHYLIB initially
(which treats fixed-link interfaces as emulated C22 PHYs, so it doesn't
support fixed-link speeds higher than 1G).

After making PHYLINK work, it appears that swp4 still can't act as CPU
port. So it looks like ocelot_set_cpu_port was being called for swp4,
but then it was called again for swp5, overwriting the CPU port assigned
in the DT.

It appears that when you call dsa_upstream_port for a port that is not
defined in the device tree (such as swp5 when using swp4 as CPU port),
its dp->cpu_dp pointer is not initialized by dsa_tree_setup_default_cpu,
and this trips up the following condition in dsa_upstream_port:

	if (!cpu_dp)
		return port;

So the moral of the story is: don't call dsa_upstream_port for a port
that is not defined in the device tree, and therefore its dsa_port
structure is not completely initialized (ds->num_ports is still 6).

Fixes: 5605194877 ("net: dsa: ocelot: add driver for Felix switch family")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19 15:21:45 -08:00
David S. Miller
19b7e21c55 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Lots of overlapping changes and parallel additions, stuff
like that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-16 21:51:42 -08:00
Richard Cochran
c019b4be5d mv88e6xxx: Reject requests to enable time stamping on both edges.
This driver enables rising edge or falling edge, but not both, and so
this patch validates that the request contains only one of the two
edges.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-15 12:48:32 -08:00
Richard Cochran
6138e687c7 ptp: Introduce strict checking of external time stamp options.
User space may request time stamps on rising edges, falling edges, or
both.  However, the particular mode may or may not be supported in the
hardware or in the driver.  This patch adds a "strict" flag that tells
drivers to ensure that the requested mode will be honored.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-15 12:48:32 -08:00
Jacob Keller
7d9465ebcc mv88e6xxx: reject unsupported external timestamp flags
Fix the mv88e6xxx PTP support to explicitly reject any future flags that
get added to the external timestamp request ioctl.

In order to maintain currently functioning code, this patch accepts all
three current flags. This is because the PTP_RISING_EDGE and
PTP_FALLING_EDGE flags have unclear semantics and each driver seems to
have interpreted them slightly differently.

For the record, the semantics of this driver are:

  flags                                                 Meaning
  ----------------------------------------------------  --------------------------
  PTP_ENABLE_FEATURE                                    Time stamp falling edge
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE                    Time stamp rising edge
  PTP_ENABLE_FEATURE|PTP_FALLING_EDGE                   Time stamp falling edge
  PTP_ENABLE_FEATURE|PTP_RISING_EDGE|PTP_FALLING_EDGE   Time stamp rising edge

Cc: Brandon Streiff <brandon.streiff@ni.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-15 12:48:32 -08:00
Vladimir Oltean
5605194877 net: dsa: ocelot: add driver for Felix switch family
This supports an Ethernet switching core from Vitesse / Microsemi /
Microchip (VSC9959) which is part of the Ocelot family (a brand name),
and whose code name is Felix. The switch can be (and is) integrated on
different SoCs as a PCIe endpoint device.

The functionality is provided by the core of the Ocelot switch driver
(drivers/net/ethernet/mscc). In this regard, the current driver is an
instance of Microsemi's Ocelot core driver, with a DSA front-end. It
inherits its name from VSC9959's code name, to distinguish itself from
the switchdev ocelot driver.

The patch adds the logic for probing a PCI device and defines the
register map for the VSC9959 switch core, since it has some differences
in register addresses and bitfield mappings compared to the other Ocelot
switches (VSC7511, VSC7512, VSC7513, VSC7514).

The Felix driver declares the register map as part of the "instance
table". Currently the VSC9959 inside NXP LS1028A is the only instance,
but presumably it can support other switches in the Ocelot family, when
used in DSA mode (Linux running on the external CPU, and not on the
embedded MIPS).

In a few cases, some h/w operations have to be done differently on
VSC9959 due to missing bitfields.  This is the case for the switch core
reset and init.  Because for this operation Ocelot uses some bits that
are not present on Felix, the latter has to use a register from the
global registers block (GCB) instead.

Although it is a PCI driver, it relies on DT bindings for compatibility
with DSA (CPU port link, PHY library). It does not have any custom
device tree bindings, since we would like to minimize its dependency on
device tree though.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-15 12:32:16 -08:00
Vladimir Oltean
abfb228ae6 net: dsa: sja1105: Simplify reset handling
We don't really need 10k species of reset. Remove everything except cold
reset which is what is actually used. Too bad the hardware designers
couldn't agree to use the same bit field for rev 1 and rev 2, so the
(*reset_cmd) function pointer is there to stay.

However let's simplify the prototype and give it a struct dsa_switch (we
want to avoid forward-declarations of structures, in this case struct
sja1105_private, wherever we can).

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-14 15:11:17 -08:00
Vladimir Oltean
86db36a347 net: dsa: sja1105: Implement state machine for TAS with PTP clock source
Tested using the following bash script and the tc from iproute2-next:

	#!/bin/bash

	set -e -u -o pipefail

	NSEC_PER_SEC="1000000000"

	gatemask() {
		local tc_list="$1"
		local mask=0

		for tc in ${tc_list}; do
			mask=$((${mask} | (1 << ${tc})))
		done

		printf "%02x" ${mask}
	}

	if ! systemctl is-active --quiet ptp4l; then
		echo "Please start the ptp4l service"
		exit
	fi

	now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
	# Phase-align the base time to the start of the next second.
	sec=$(echo "${now}" | gawk -F. '{ print $1; }')
	base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"

	tc qdisc add dev swp5 parent root handle 100 taprio \
		num_tc 8 \
		map 0 1 2 3 5 6 7 \
		queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
		base-time ${base_time} \
		sched-entry S $(gatemask 7) 100000 \
		sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
		clockid CLOCK_TAI flags 2

The "state machine" is a workqueue invoked after each manipulation
command on the PTP clock (reset, adjust time, set time, adjust
frequency) which checks over the state of the time-aware scheduler.
So it is not monitored periodically, only in reaction to a PTP command
typically triggered from a userspace daemon (linuxptp). Otherwise there
is no reason for things to go wrong.

Now that the timecounter/cyclecounter has been replaced with hardware
operations on the PTP clock, the TAS Kconfig now depends upon PTP and
the standalone clocksource operating mode has been removed.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-14 14:50:35 -08:00
Vladimir Oltean
41603d78b3 net: dsa: sja1105: Make the PTP command read-write
The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate
whether the time-aware scheduler is running or not. We will be using
that for monitoring the scheduler in the next patch, so refactor the PTP
command API in order to allow that.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-14 14:50:35 -08:00
Vladimir Oltean
2eea1fa82f net: dsa: sja1105: Print the reset reason
Sometimes it can be quite opaque even for me why the driver decided to
reset the switch. So instead of adding dump_stack() calls each time for
debugging, just add a reset reason to sja1105_static_config_reload
calls which gets printed to the console.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12 19:53:07 -08:00
Colin Ian King
4e4637b103 net: dsa: mv88e6xxx: fix broken if statement because of a stray semicolon
There is a stray semicolon in an if statement that will cause a dev_err
message to be printed unconditionally. Fix this by removing the stray
semicolon.

Addresses-Coverity: ("Stay semicolon")
Fixes: f0942e00a1 ("net: dsa: mv88e6xxx: Add support for port mirroring")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12 11:29:20 -08:00
Iwan R Timmer
f0942e00a1 net: dsa: mv88e6xxx: Add support for port mirroring
Add support for configuring port mirroring through the cls_matchall
classifier. We do a full ingress and/or egress capture towards a
capture port. It allows setting a different capture port for ingress
and egress traffic.

It keeps track of the mirrored ports and the destination ports to
prevent changes to the capture port while other ports are being
mirrored.

Signed-off-by: Iwan R Timmer <irtimmer@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11 12:51:03 -08:00
Iwan R Timmer
5c74c54ce6 net: dsa: mv88e6xxx: Split monitor port configuration
Separate the configuration of the egress and ingress monitor port.
This allows the port mirror functionality to do ingress and egress
port mirroring to separate ports.

Signed-off-by: Iwan R Timmer <irtimmer@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11 12:51:03 -08:00
Vladimir Oltean
af580ae2dc net: dsa: sja1105: Disallow management xmit during switch reset
The purpose here is to avoid ptp4l fail due to this condition:

  timed out while polling for tx timestamp
  increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
  port 1: send peer delay request failed

So either reset the switch before the management frame was sent, or
after it was timestamped as well, but not in the middle.

The condition may arise either due to a true timeout (i.e. because
re-uploading the static config takes time), or due to the TX timestamp
actually getting lost due to reset. For the former we can increase
tx_timestamp_timeout in userspace, for the latter we need this patch.

Locking all traffic during switch reset does not make sense at all,
though. Forcing all CPU-originated traffic to potentially block waiting
for a sleepable context to send > 800 bytes over SPI is not a good idea.
Flows that are autonomously forwarded by the switch will get dropped
anyway during switch reset no matter what. So just let all other
CPU-originated traffic be dropped as well.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11 12:45:31 -08:00
Vladimir Oltean
6cf99c13ea net: dsa: sja1105: Restore PTP time after switch reset
The PTP time of the switch is not preserved when uploading a new static
configuration. Work around this hardware oddity by reading its PTP time
before a static config upload, and restoring it afterwards.

Static config changes are expected to occur at runtime even in scenarios
directly related to PTP, i.e. the Time-Aware Scheduler of the switch is
programmed in this way.

Perhaps the larger implication of this patch is that the PTP .gettimex64
and .settime functions need to be exposed to sja1105_main.c, where the
PTP lock needs to be held during this entire process. So their core
implementation needs to move to some common functions which get exposed
in sja1105_ptp.h.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11 12:45:30 -08:00
Vladimir Oltean
34d76e9fa8 net: dsa: sja1105: Implement the .gettimex64 system call for PTP
Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace
applications (i.e. phc2sys) to compensate for the delays incurred while
reading the PHC's time.

The task itself of taking the software timestamp is delegated to the SPI
subsystem, through the newly introduced API in struct spi_transfer. The
goal is to cross-timestamp I/O operations on the switch's PTP clock with
values in the local system clock (CLOCK_REALTIME). For that we need to
understand a bit of the hardware internals.

The 'read PTP time' message is a 12 byte structure, first 4 bytes of
which represent the SPI header, and the last 8 bytes represent the
64-bit PTP time. The switch itself starts processing the command
immediately after receiving the last bit of the address, i.e. at the
middle of byte 3 (last byte of header). The PTP time is shadowed to a
buffer register in the switch, and retrieved atomically during the
subsequent SPI frames.

A similar thing goes on for the 'write PTP time' message, although in
that case the switch waits until the 64-bit PTP time becomes fully
available before taking any action. So the byte that needs to be
software-timestamped is byte 11 (last) of the transfer.

The patch creates a common (and local) sja1105_xfer implementation for
the SPI I/O, and offers 3 front-ends:

- sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally
  requesting a PTP timestamp

- sja1105_xfer_buf: this is for large transfers (e.g. the static config
  buffer) and other misc data, and there is no point in giving
  timestamping capabilities to this.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11 12:45:30 -08:00
David S. Miller
14684b9301 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
One conflict in the BPF samples Makefile, some fixes in 'net' whilst
we were converting over to Makefile.target rules in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-09 11:04:37 -08:00
Andrew Lunn
64a26007a8 net: dsa: mv8e6xxx: Fix stub function parameters
mv88e6xxx_g2_atu_stats_get() takes two parameters. Make the stub
function also take two, otherwise we get compile errors.

Fixes: c5f299d592 ("net: dsa: mv88e6xxx: global1_atu: Add helper for get next")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 21:42:53 -08:00
Andrew Lunn
e0c69ca7df net: dsa: mv88e6xxx: Add ATU occupancy via devlink resources
The ATU can report how many entries it contains. It does this per bin,
there being 4 bins in total. Export the ATU as a devlink resource, and
provide a method the needed callback to get the resource occupancy.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 18:09:45 -08:00
Andrew Lunn
c5f299d592 net: dsa: mv88e6xxx: global1_atu: Add helper for get next
When retrieving the ATU statistics, and ATU get next has to be
performed to trigger the ATU to collect the statistics. Export a
helper from global1_atu to perform this.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 18:09:45 -08:00
Andrew Lunn
6239a386e7 net: dsa: mv88e6xxx: global2: Expose ATU stats register
Add helpers to set/get the ATU statistics register.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 18:09:45 -08:00
Andrew Lunn
d9ea56206c net: dsa: mv88e6xxx: Add number of MACs in the ATU
For each supported switch, add an entry to the info structure for the
number of MACs which can be stored in the ATU. This will later be used
to export the ATU as a devlink resource, and indicate its occupancy,
how full the ATU is.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 18:09:45 -08:00
Florian Fainelli
eee87e4377 net: dsa: bcm_sf2: Add support for optional reset controller line
Grab an optional and exclusive reset controller line for the switch and
manage it during probe/remove functions accordingly. For 7278 devices we
change bcm_sf2_sw_rst() to use the reset controller line since the
WATCHDOG_CTRL register does not reset the switch contrary to stated
documentation.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 18:06:38 -08:00
Florian Fainelli
e684000b8a net: dsa: bcm_sf2: Fix driver removal
With the DSA core doing the call to dsa_port_disable() we do not need to
do that within the driver itself. This could cause an use after free
since past dsa_unregister_switch() we should not be accessing any
dsa_switch internal structures.

Fixes: 0394a63acf ("net: dsa: enable and disable all ports")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 17:54:59 -08:00
Andrew Lunn
0c65b2b90d net: of_get_phy_mode: Change API to solve int/unit warnings
Before this change of_get_phy_mode() returned an enum,
phy_interface_t. On error, -ENODEV etc, is returned. If the result of
the function is stored in a variable of type phy_interface_t, and the
compiler has decided to represent this as an unsigned int, comparision
with -ENODEV etc, is a signed vs unsigned comparision.

Fix this problem by changing the API. Make the function return an
error, or 0 on success, and pass a pointer, of type phy_interface_t,
where the phy mode should be stored.

v2:
Return with *interface set to PHY_INTERFACE_MODE_NA on error.
Add error checks to all users of of_get_phy_mode()
Fixup a few reverse christmas tree errors
Fixup a few slightly malformed reverse christmas trees

v3:
Fix 0-day reported errors.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-04 11:21:25 -08:00
David S. Miller
d31e95585c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.

The rest were (relatively) trivial in nature.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-02 13:54:56 -07:00