linux_dsm_epyc7002/drivers/infiniband/hw/mlx5
Israel Rukshin de0ae958de RDMA/mlx5: Improve PI handover performance
Under some workloads, using a KLM mkey instead of an MTT mkey degrades
performance, because KLM descriptors are accessed through an
indirection that may require more HW resources and cycles.
A KLM descriptor is not necessary when there are no gaps in the
data/metadata sg lists. As an optimization, use an MTT mkey whenever
possible. To that end, allocate an internal MTT mkey and choose the
effective pi_mr for the transaction according to the required mapping
scheme.
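
The selection rule described above can be sketched in user-space C. This is an illustration only, not the driver code: `struct sketch_sg`, `sketch_sg_is_gapless()`, `sketch_choose_scheme()` and the 4 KiB page size are hypothetical names and assumptions introduced here. The sketch treats a list as gap-free when every entry except the first starts on a page boundary and every entry except the last ends on one, and picks the MTT scheme only if both the data and the metadata lists pass that check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the illustration; the real driver derives it per MR. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for one scatter/gather entry (not struct scatterlist). */
struct sketch_sg {
	uint64_t addr;	/* DMA address of the buffer */
	uint32_t len;	/* buffer length in bytes */
};

/* The two mapping schemes named in the commit message. */
enum sketch_scheme { SKETCH_MTT, SKETCH_KLM };

/*
 * Return true when the list has no gaps at page granularity: every entry
 * except the first starts page-aligned, and every entry except the last
 * ends page-aligned. An empty or single-entry list is trivially gap-free.
 */
static bool sketch_sg_is_gapless(const struct sketch_sg *sg, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		bool starts_aligned = (sg[i].addr % SKETCH_PAGE_SIZE) == 0;
		bool ends_aligned =
			((sg[i].addr + sg[i].len) % SKETCH_PAGE_SIZE) == 0;

		if (i > 0 && !starts_aligned)
			return false;
		if (i + 1 < n && !ends_aligned)
			return false;
	}
	return true;
}

/* Prefer the cheaper MTT scheme; fall back to KLM if either list has gaps. */
static enum sketch_scheme
sketch_choose_scheme(const struct sketch_sg *data, size_t ndata,
		     const struct sketch_sg *meta, size_t nmeta)
{
	if (sketch_sg_is_gapless(data, ndata) &&
	    sketch_sg_is_gapless(meta, nmeta))
		return SKETCH_MTT;
	return SKETCH_KLM;
}
```

For example, two page-sized, page-aligned data buffers with a single metadata buffer would select `SKETCH_MTT`, while a 512-byte first buffer followed by another buffer would leave a gap and select `SKETCH_KLM`.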

The setup of the tested benchmark (using iSER ULP):
 - 2 servers with 24 cores (1 initiator and 1 target)
 - ConnectX-4/ConnectX-5 adapters
 - 24 target sessions with 1 LUN each
 - ramdisk backstore
 - PI active

Performance results running fio (24 jobs, 128 iodepth) using
write_generate=1 and read_verify=1 (w/w.o/baseline):

bs      IOPS(read)                IOPS(write)
----    ----------                ----------
512   1262.4K/1243.3K/1147.1K    1732.1K/1725.1K/1423.8K
4k    570902/571233/457874       773982/743293/642080
32k   72086/72388/71933          96164/71789/93249

Using write_generate=0 and read_verify=0 (w/w.o/baseline):

bs      IOPS(read)                IOPS(write)
----    ----------                ----------
512   1600.1K/1572.1K/1393.3K    1830.3K/1823.5K/1557.2K
4k    937272/921992/762934       815304/753772/646071
32k   77369/75052/72058          97435/73180/94612

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Suggested-by: Max Gurtovoy <maxg@mellanox.com>
Suggested-by: Idan Burstein <idanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-24 11:49:28 -03:00
ah.c RDMA: Handle AH allocations by IB/core 2019-04-08 13:05:25 -03:00
cmd.c IB/mlx5: Add steering SW ICM device memory type 2019-05-06 12:51:51 -03:00
cmd.h IB/mlx5: Add steering SW ICM device memory type 2019-05-06 12:51:51 -03:00
cong.c infiniband: mlx5: no need to check return value of debugfs_create functions 2019-01-24 09:22:29 -07:00
cq.c RDMA: Check umem pointer validity prior to release 2019-06-20 15:17:59 -04:00
devx.c IB/mlx5: Verify DEVX general object type correctly 2019-05-14 10:22:09 -03:00
doorbell.c IB/{core,hw}: Have ib_umem_get extract the ib_ucontext from ib_udata 2019-01-10 17:07:45 -07:00
flow.c RDMA/mlx5: Allow DEVX and raw creation flow on reps 2019-04-22 15:24:05 -03:00
gsi.c RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments const 2018-07-30 20:09:34 -06:00
ib_rep.c {IB,net}/mlx5: Constify rep ops functions pointers 2019-05-31 12:28:14 -07:00
ib_rep.h {IB,net}/mlx5: Constify rep ops functions pointers 2019-05-31 12:28:14 -07:00
ib_virt.c IB/mlx5: Restore IB guid/policy for virtual functions 2017-07-24 10:34:28 -04:00
Kconfig IB/{core,uverbs}: Move ib_umem_xxx functions from ib_core to ib_uverbs 2019-01-10 17:06:44 -07:00
mad.c RDMA/mad: Reduce MAD scope to mlx5_ib only 2019-01-15 10:02:29 +02:00
main.c RDMA/core: Rename signature qp create flag and signature device capability 2019-06-24 11:49:27 -03:00
Makefile net/mlx5: Move SRQ functions to RDMA part 2018-12-04 09:14:30 +02:00
mem.c RDMA/umem: Move page_shift from ib_umem to ib_odp_umem 2019-05-21 15:23:24 -03:00
mlx5_ib.h RDMA/mlx5: Improve PI handover performance 2019-06-24 11:49:28 -03:00
mr.c RDMA/mlx5: Improve PI handover performance 2019-06-24 11:49:28 -03:00
odp.c Merge remote-tracking branch 'mlx5-next/mlx5-next' into HEAD 2019-06-18 22:44:36 -04:00
qp.c RDMA/mlx5: Improve PI handover performance 2019-06-24 11:49:28 -03:00
srq_cmd.c RDMA: Handle SRQ allocations by IB/core 2019-04-08 13:05:25 -03:00
srq.c RDMA: Handle SRQ allocations by IB/core 2019-04-08 13:05:25 -03:00
srq.h RDMA: Handle SRQ allocations by IB/core 2019-04-08 13:05:25 -03:00