net/mlx5e: Use dma_rmb rather than rmb in CQE fetch routine

Use dma_rmb in mlx5e_get_cqe rather than aggressive rmb (at least on
some architectures), this should help improve the performance on such
CPU archs where dma_rmb is optimized.

Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

Test case                   Baseline      Now      improvement
---------------------------------------------------------------
TX packets (24 threads)     45Mpps        50Mpps      11%
TC stack Drop (1 core)      3.45Mpps      3.6Mpps     5%
XDP Drop      (1 core)      14Mpps        16.9Mpps    20%
XDP TX        (1 core)      10.4Mpps      12Mpps      15%

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is contained in:
Saeed Mahameed 2017-03-25 00:52:03 +03:00 committed by David S. Miller
parent 68e498554f
commit 80fe326ab8

View File

@ -44,7 +44,7 @@ struct mlx5_cqe64 *mlx5e_get_cqe(struct mlx5e_cq *cq)
return NULL;
/* ensure cqe content is read after cqe ownership bit */
rmb();
dma_rmb();
return cqe;
}