linux_dsm_epyc7002/net/sunrpc/xprtrdma
Chuck Lever 33849792cb xprtrdma: Detect unreachable NFS/RDMA servers more reliably
Current NFS clients rely on connection loss to determine when to
retransmit. In particular, for protocols like NFSv4, clients no
longer rely on RPC timeouts to drive retransmission: NFSv4 servers
are required to terminate a connection when they need a client to
retransmit pending RPCs.

When a server is no longer reachable, either because it has crashed
or because the network path has broken, the server cannot actively
terminate a connection. Thus NFS clients depend on transport-level
keepalive to determine when a connection must be replaced and
pending RPCs retransmitted.

However, RDMA RC connections do not have a native keepalive
mechanism. If an NFS/RDMA server crashes after a client has sent
RPCs successfully (an RC ACK has been received for all on-the-wire
RDMA requests), the client has no way to know that the connection
is moribund.

In addition, new RDMA requests are subject to the RPC-over-RDMA
credit limit. If the client has consumed all granted credits with
NFS traffic, it is not allowed to send another RDMA request until
the server replies. Thus it has no way to send a true keepalive when
the workload has already consumed all credits with pending RPCs.
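
The credit constraint described above can be illustrated with a small
user-space model. This is only a sketch: the struct and function names
are hypothetical and are not kernel symbols, and the model assumes a
fixed grant, whereas a real server may adjust the grant in each reply.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an RPC-over-RDMA credit window. */
struct credit_window {
	unsigned int granted;   /* credits granted by the server */
	unsigned int in_flight; /* requests sent but not yet answered */
};

/* Consume one credit and return true if a new request may be sent. */
static bool credit_try_send(struct credit_window *cw)
{
	assert(cw->in_flight <= cw->granted);
	if (cw->in_flight == cw->granted)
		return false;   /* window exhausted: a keepalive is blocked too */
	cw->in_flight++;
	return true;
}

/* A reply from the server returns one credit to the sender. */
static void credit_reply_received(struct credit_window *cw)
{
	assert(cw->in_flight > 0);
	cw->in_flight--;
}
```

With a grant of two credits, a third credit_try_send() fails until
credit_reply_received() runs. If the server is dead, no reply ever
arrives, so the window never reopens.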

To address this, forcibly disconnect a transport when an RPC times
out. This ensures that a moribund connection cannot block the
detection of failover or other configuration changes on the server.

Note that even if the connection is still good, retransmitting
any RPC will trigger a disconnect thanks to this logic in
xprt_rdma_send_request:

	/* Must suppress retransmit to maintain credits */
	if (req->rl_connect_cookie == xprt->connect_cookie)
		goto drop_connection;
	req->rl_connect_cookie = xprt->connect_cookie;
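
The effect of the cookie check can be seen in a stand-alone model.
This is a sketch under stated assumptions: the model_* names, the
"connected" flag, and the -1 return value are hypothetical, and only
the cookie comparison mirrors the excerpt above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the transport. */
struct model_xprt {
	unsigned int connect_cookie; /* bumped on every (re)connect */
	bool connected;
};

/* Hypothetical stand-in for one RPC request. */
struct model_req {
	unsigned int rl_connect_cookie; /* cookie recorded at last send */
};

/* Establish a new connection instance. */
static void model_connect(struct model_xprt *xprt)
{
	xprt->connect_cookie++;
	xprt->connected = true;
}

/* Return 0 if the request was sent, or -1 if a disconnect was forced. */
static int model_send_request(struct model_xprt *xprt, struct model_req *req)
{
	assert(xprt->connected);
	/* Same cookie: this request was already sent on this connection. */
	if (req->rl_connect_cookie == xprt->connect_cookie) {
		xprt->connected = false; /* models "goto drop_connection" */
		return -1;
	}
	req->rl_connect_cookie = xprt->connect_cookie;
	return 0;
}
```

In this model the first send on a connection succeeds, a second send
of the same request on that connection forces a disconnect, and the
request can be sent again once model_connect() has bumped the cookie.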

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-04-25 16:12:19 -04:00
backchannel.c xprtrdma: Cap size of callback buffer resources 2016-11-29 16:45:44 -05:00
fmr_ops.c xprtrdma: Refactor management of mw_list field 2017-02-10 14:02:37 -05:00
frwr_ops.c xprtrdma: Refactor management of mw_list field 2017-02-10 14:02:37 -05:00
Makefile xprtrdma: Remove ALLPHYSICAL memory registration mode 2016-07-11 15:50:43 -04:00
module.c rpcrdma: Merge svcrdma and xprtrdma modules into one 2015-06-04 16:56:02 -04:00
rpc_rdma.c xprtrdma: Refactor management of mw_list field 2017-02-10 14:02:37 -05:00
svc_rdma_backchannel.c The nfsd update this round is mainly a lot of miscellaneous cleanups and 2017-02-28 15:39:09 -08:00
svc_rdma_marshal.c svcrdma: Clean up RPC-over-RDMA Call header decoder 2017-02-08 14:41:57 -05:00
svc_rdma_recvfrom.c svcrdma: Poll CQs in "workqueue" mode 2017-02-08 14:42:01 -05:00
svc_rdma_sendto.c svcrdma: Clean up RPC-over-RDMA Reply header encoder 2017-02-08 14:41:41 -05:00
svc_rdma_transport.c svcrdma: set XPT_CONG_CTRL flag for bc xprt 2017-03-28 21:25:55 -04:00
svc_rdma.c svcrdma: Define maximum number of backchannel requests 2016-01-19 15:30:48 -05:00
transport.c xprtrdma: Detect unreachable NFS/RDMA servers more reliably 2017-04-25 16:12:19 -04:00
verbs.c xprtrdma: Cancel refresh worker during buffer shutdown 2017-04-25 16:12:14 -04:00
xprt_rdma.h xprtrdma: Refactor management of mw_list field 2017-02-10 14:02:37 -05:00