Merge branch 'stmmac-dwmac-meson8b-configurable-RGMII-TX-delay'

Martin Blumenstingl says:

====================
stmmac: dwmac-meson8b: configurable RGMII TX delay

Currently the dwmac-meson8b stmmac glue driver uses a hardcoded 1/4
cycle (= 2ns) TX clock delay. This seems to work fine for many boards
(for example Odroid-C2 or Amlogic's reference boards) but there are
some others where TX traffic is simply broken.
There are probably multiple reasons why it's working on some boards
while it's broken on others:
- some of Amlogic's reference boards are using a Micrel PHY
- hardware circuit design
- maybe more...

iperf3 results on my Mecool BB2 board (Meson GXM, RTL8211F PHY) with
TX clock delay disabled on the MAC (as it's enabled in the PHY driver).
TX throughput was virtually zero before:
$ iperf3 -c 192.168.1.100 -R
Connecting to host 192.168.1.100, port 5201
Reverse mode, remote host 192.168.1.100 is sending
[  4] local 192.168.1.206 port 52828 connected to 192.168.1.100 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   108 MBytes   901 Mbits/sec
[  4]   1.00-2.00   sec  94.2 MBytes   791 Mbits/sec
[  4]   2.00-3.00   sec  96.5 MBytes   810 Mbits/sec
[  4]   3.00-4.00   sec  96.2 MBytes   808 Mbits/sec
[  4]   4.00-5.00   sec  96.6 MBytes   810 Mbits/sec
[  4]   5.00-6.00   sec  96.5 MBytes   810 Mbits/sec
[  4]   6.00-7.00   sec  96.6 MBytes   810 Mbits/sec
[  4]   7.00-8.00   sec  96.5 MBytes   809 Mbits/sec
[  4]   8.00-9.00   sec   105 MBytes   884 Mbits/sec
[  4]   9.00-10.00  sec   111 MBytes   934 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1000 MBytes   839 Mbits/sec    0             sender
[  4]   0.00-10.00  sec   998 MBytes   837 Mbits/sec                  receiver

iperf Done.
$ iperf3 -c 192.168.1.100
Connecting to host 192.168.1.100, port 5201
[  4] local 192.168.1.206 port 52832 connected to 192.168.1.100 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.01   sec  99.5 MBytes   829 Mbits/sec  117    139 KBytes
[  4]   1.01-2.00   sec   105 MBytes   884 Mbits/sec  129   70.7 KBytes
[  4]   2.00-3.01   sec   107 MBytes   889 Mbits/sec  106    187 KBytes
[  4]   3.01-4.01   sec   105 MBytes   878 Mbits/sec   92    143 KBytes
[  4]   4.01-5.00   sec   105 MBytes   882 Mbits/sec  140    129 KBytes
[  4]   5.00-6.01   sec   106 MBytes   883 Mbits/sec  115    195 KBytes
[  4]   6.01-7.00   sec   102 MBytes   863 Mbits/sec  133   70.7 KBytes
[  4]   7.00-8.01   sec   106 MBytes   884 Mbits/sec  143   97.6 KBytes
[  4]   8.01-9.01   sec   104 MBytes   875 Mbits/sec  124    107 KBytes
[  4]   9.01-10.01  sec   105 MBytes   876 Mbits/sec   90    139 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.01  sec  1.02 GBytes   874 Mbits/sec  1189             sender
[  4]   0.00-10.01  sec  1.02 GBytes   873 Mbits/sec                  receiver

iperf Done.

I get similar TX throughput on my Meson GXBB "MXQ Pro+" board when I
disable the PHY's TX-delay and configure a 4ms TX-delay on the MAC.
So changes to at least the RTL8211F PHY driver are needed to get it
working properly in all situations.

Changes since v4:
- add a fallback of 2ns (the value which was previously hardcoded) for
  the TX delay so we are backwards-compatible with older .dts'
- update the documentation with the new fallback value and add a small
  note that the "amlogic,tx-delay" property is ignored when the phy-mode
  is "rmii".

Changes since v3:
- rebased to apply against current net-next branch (fixes a conflict
  with d2ed0a7755 "net: ethernet: stmmac: fix of-node and
  fixed-link-phydev leaks")

Changes since v2:
- moved all .dts patches (3-7) to a separate series
- removed the default 2ns TX delay when phy-mode RGMII is specified
- (rebased against current net-next)

Changes since v1:
- renamed the devicetree property "amlogic,tx-delay" to
  "amlogic,tx-delay-ns", which makes the .dts easier to read as we can
  simply specify human-readable values instead of having "preprocessor
  defines and calculation in human brain". Thanks to Andrew Lunn for
  the suggestion!
- improved documentation to indicate when the MAC TX-delay should be
  configured and how to use the PHY's TX-delay
- changed the default TX-delay in the dwmac-meson8b driver from 2ns
  to 0ms when any of the rgmii-*id modes are used (the 2ns default
  value still applies for phy-mode "rgmii")
- added patches to properly reset the PHY on Meson GXBB devices and to
  use a similar configuration than the one we use on Meson GXL devices
  (by passing a phy-handle to stmmac and defining the PHY in the mdio0
  bus - patch 3-6)
- add the "amlogic,tx-delay-ns" property to all boards which are using
  the RGMII PHY (patch 7)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
David S. Miller 2017-01-24 13:35:41 -05:00
commit dd8e01fbff
2 changed files with 30 additions and 6 deletions

View File

@ -25,6 +25,22 @@ Required properties on Meson8b and newer:
- "clkin0" - first parent clock of the internal mux
- "clkin1" - second parent clock of the internal mux
Optional properties on Meson8b and newer:
- amlogic,tx-delay-ns: The internal RGMII TX clock delay (provided
by this driver) in nanoseconds. Allowed values
are: 0ns, 2ns, 4ns, 6ns.
When phy-mode is set to "rgmii" then the TX
delay should be explicitly configured. When
not configured a fallback of 2ns is used.
When the phy-mode is set to either "rgmii-id"
or "rgmii-txid" the TX clock delay is already
provided by the PHY. In that case this
property should be set to 0ns (which disables
the TX clock delay in the MAC to prevent the
clock from going off because both PHY and MAC
are adding a delay).
Any configuration is ignored when the phy-mode
is set to "rmii".
Example for Meson6:

View File

@ -35,10 +35,6 @@
#define PRG_ETH0_TXDLY_SHIFT 5
#define PRG_ETH0_TXDLY_MASK GENMASK(6, 5)
#define PRG_ETH0_TXDLY_OFF (0x0 << PRG_ETH0_TXDLY_SHIFT)
#define PRG_ETH0_TXDLY_QUARTER (0x1 << PRG_ETH0_TXDLY_SHIFT)
#define PRG_ETH0_TXDLY_HALF (0x2 << PRG_ETH0_TXDLY_SHIFT)
#define PRG_ETH0_TXDLY_THREE_QUARTERS (0x3 << PRG_ETH0_TXDLY_SHIFT)
/* divider for the result of m250_sel */
#define PRG_ETH0_CLK_M250_DIV_SHIFT 7
@ -69,6 +65,8 @@ struct meson8b_dwmac {
struct clk_divider m25_div;
struct clk *m25_div_clk;
u32 tx_delay_ns;
};
static void meson8b_dwmac_mask_bits(struct meson8b_dwmac *dwmac, u32 reg,
@ -179,6 +177,7 @@ static int meson8b_init_prg_eth(struct meson8b_dwmac *dwmac)
{
int ret;
unsigned long clk_rate;
u8 tx_dly_val;
switch (dwmac->phy_mode) {
case PHY_INTERFACE_MODE_RGMII:
@ -196,9 +195,13 @@ static int meson8b_init_prg_eth(struct meson8b_dwmac *dwmac)
meson8b_dwmac_mask_bits(dwmac, PRG_ETH0,
PRG_ETH0_INVERTED_RMII_CLK, 0);
/* TX clock delay - all known boards use a 1/4 cycle delay */
/* TX clock delay in ns = "8ns / 4 * tx_dly_val" (where
* 8ns are exactly one cycle of the 125MHz RGMII TX clock):
* 0ns = 0x0, 2ns = 0x1, 4ns = 0x2, 6ns = 0x3
*/
tx_dly_val = dwmac->tx_delay_ns >> 1;
meson8b_dwmac_mask_bits(dwmac, PRG_ETH0, PRG_ETH0_TXDLY_MASK,
PRG_ETH0_TXDLY_QUARTER);
tx_dly_val << PRG_ETH0_TXDLY_SHIFT);
break;
case PHY_INTERFACE_MODE_RMII:
@ -284,6 +287,11 @@ static int meson8b_dwmac_probe(struct platform_device *pdev)
goto err_remove_config_dt;
}
/* use 2ns as fallback since this value was previously hardcoded */
if (of_property_read_u32(pdev->dev.of_node, "amlogic,tx-delay-ns",
&dwmac->tx_delay_ns))
dwmac->tx_delay_ns = 2;
ret = meson8b_init_clk(dwmac);
if (ret)
goto err_remove_config_dt;