License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX license identifier to apply to
a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
   SPDX license identifier                               # files
   ---------------------------------------------------|-------
   GPL-2.0                                                 11139
and resulted in the first patch in this series.
If that file was under a */uapi/* path, it was tagged "GPL-2.0 WITH
Linux-syscall-note"; otherwise it was tagged "GPL-2.0". The results were:
   SPDX license identifier                               # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                           930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
   SPDX license identifier                               # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                           270
   GPL-2.0+ WITH Linux-syscall-note                          169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)        21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)        17
   LGPL-2.1+ WITH Linux-syscall-note                          15
   GPL-1.0+ WITH Linux-syscall-note                           14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)        5
   LGPL-2.0+ WITH Linux-syscall-note                           4
   LGPL-2.1 WITH Linux-syscall-note                            3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                  3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)                 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
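The first and fourth heuristics above can be sketched in a few lines of C. This is only an illustration of the decision rule as described in this message, not the tooling that was actually used: the helper names are hypothetical, a NULL scanner result stands for "nothing detected", and the special handling of */uapi/* files that already carry a license is left out.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the real tooling): pick the SPDX
 * identifier for a file in which neither scanner found license text.
 */
static const char *spdx_for_unlicensed_file(const char *path)
{
	/* Files under a uapi directory carry the syscall exception. */
	if (strstr(path, "/uapi/") != NULL)
		return "GPL-2.0 WITH Linux-syscall-note";
	/* Everything else falls back to the top-level COPYING license. */
	return "GPL-2.0";
}

/*
 * Hypothetical helper: combine two scanner results.  Returns the
 * concluded license, or NULL when the file needs manual inspection.
 * A NULL input means "this scanner detected nothing".
 */
static const char *spdx_conclude(const char *path,
				 const char *scan_a, const char *scan_b)
{
	if (scan_a == NULL && scan_b == NULL)
		return spdx_for_unlicensed_file(path);
	if (scan_a != NULL && scan_b != NULL && strcmp(scan_a, scan_b) == 0)
		return scan_a;
	return NULL;	/* disagreement: inspect by hand */
}
```

With this sketch, spdx_conclude("net/sunrpc/xprtsock.c", NULL, NULL) yields "GPL-2.0", and any disagreement between the two inputs yields NULL so that a person looks at the file.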
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based in part on an older version of FOSSology, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to
have copy/paste license identifier errors; these have been fixed to
reflect the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
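The script itself is not shown here. As a rough illustration of the header/source distinction it had to make, the sketch below formats the tag line for a given path. The function name and the decision by file extension are assumptions made for this example; the two comment forms follow the kernel convention of a C++-style comment in .c files and a C-style comment in headers.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the SPDX tag line for "path" into "out".
 * Returns the snprintf() result, or -1 for file types this sketch
 * does not handle (Makefiles, scripts, assembly, ...).
 */
static int spdx_tag_line(const char *path, const char *id,
			 char *out, size_t len)
{
	const char *ext = strrchr(path, '.');

	/* Headers get a C-style comment. */
	if (ext != NULL && strcmp(ext, ".h") == 0)
		return snprintf(out, len,
				"/* SPDX-License-Identifier: %s */", id);
	/* C source files get a C++-style comment. */
	if (ext != NULL && strcmp(ext, ".c") == 0)
		return snprintf(out, len,
				"// SPDX-License-Identifier: %s", id);
	return -1;
}
```

For "net/sunrpc/xprtsock.c" and "GPL-2.0" this produces the same first line that the patched file below carries.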
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 21:07:57 +07:00
// SPDX-License-Identifier: GPL-2.0
/*
 * linux/net/sunrpc/xprtsock.c
 *
 * Client-side transport implementation for sockets.
 *
 * TCP callback races fixes (C) 1998 Red Hat
 * TCP send fixes (C) 1998 Red Hat
 * TCP NFS related read + write fixes
 *  (C) 1999 Dave Airlie, University of Limerick, Ireland <airlied@linux.ie>
 *
 * Rewrite of larges part of the code in order to stabilize TCP stuff.
 * Fix behaviour when socket buffer is full.
 *  (C) 1999 Trond Myklebust <trond.myklebust@fys.uio.no>
 *
 * IP socket transport implementation, (C) 2005 Chuck Lever <cel@netapp.com>
 *
 * IPv6 support contributed by Gilles Quillard, Bull Open Source, 2005.
 *   <gilles.quillard@bull.net>
 */

#include <linux/types.h>
#include <linux/string.h>
#include <linux/slab.h>
#include <linux/module.h>
#include <linux/capability.h>
#include <linux/pagemap.h>
#include <linux/errno.h>
#include <linux/socket.h>
#include <linux/in.h>
#include <linux/net.h>
#include <linux/mm.h>
#include <linux/un.h>
#include <linux/udp.h>
#include <linux/tcp.h>
#include <linux/sunrpc/clnt.h>
#include <linux/sunrpc/addr.h>
#include <linux/sunrpc/sched.h>
#include <linux/sunrpc/svcsock.h>
#include <linux/sunrpc/xprtsock.h>
#include <linux/file.h>
#ifdef CONFIG_SUNRPC_BACKCHANNEL
#include <linux/sunrpc/bc_xprt.h>
#endif

#include <net/sock.h>
#include <net/checksum.h>
#include <net/udp.h>
#include <net/tcp.h>
#include <linux/bvec.h>
#include <linux/uio.h>

#include <trace/events/sunrpc.h>

#include "sunrpc.h"

static void xs_close(struct rpc_xprt *xprt);
static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
		struct socket *sock);

/*
 * xprtsock tunables
 */
static unsigned int xprt_udp_slot_table_entries = RPC_DEF_SLOT_TABLE;
static unsigned int xprt_tcp_slot_table_entries = RPC_MIN_SLOT_TABLE;
static unsigned int xprt_max_tcp_slot_table_entries = RPC_MAX_SLOT_TABLE;

static unsigned int xprt_min_resvport = RPC_DEF_MIN_RESVPORT;
static unsigned int xprt_max_resvport = RPC_DEF_MAX_RESVPORT;

#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)

#define XS_TCP_LINGER_TO	(15U * HZ)
static unsigned int xs_tcp_fin_timeout __read_mostly = XS_TCP_LINGER_TO;

/*
 * We can register our own files under /proc/sys/sunrpc by
 * calling register_sysctl_table() again. The files in that
 * directory become the union of all files registered there.
 *
 * We simply need to make sure that we don't collide with
 * someone else's file names!
 */

static unsigned int min_slot_table_size = RPC_MIN_SLOT_TABLE;
static unsigned int max_slot_table_size = RPC_MAX_SLOT_TABLE;
static unsigned int max_tcp_slot_table_limit = RPC_MAX_SLOT_TABLE_LIMIT;
static unsigned int xprt_min_resvport_limit = RPC_MIN_RESVPORT;
static unsigned int xprt_max_resvport_limit = RPC_MAX_RESVPORT;

static struct ctl_table_header *sunrpc_table_header;

/*
 * FIXME: changing the UDP slot table size should also resize the UDP
 *        socket buffers for existing UDP transports
 */
static struct ctl_table xs_tunables_table[] = {
	{
		.procname	= "udp_slot_table_entries",
		.data		= &xprt_udp_slot_table_entries,
		.maxlen		= sizeof(unsigned int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= &min_slot_table_size,
		.extra2		= &max_slot_table_size
	},
	{
		.procname	= "tcp_slot_table_entries",
		.data		= &xprt_tcp_slot_table_entries,
		.maxlen		= sizeof(unsigned int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= &min_slot_table_size,
		.extra2		= &max_slot_table_size
	},
	{
		.procname	= "tcp_max_slot_table_entries",
		.data		= &xprt_max_tcp_slot_table_entries,
		.maxlen		= sizeof(unsigned int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= &min_slot_table_size,
		.extra2		= &max_tcp_slot_table_limit
	},
	{
		.procname	= "min_resvport",
		.data		= &xprt_min_resvport,
		.maxlen		= sizeof(unsigned int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= &xprt_min_resvport_limit,
		.extra2		= &xprt_max_resvport_limit
	},
	{
		.procname	= "max_resvport",
		.data		= &xprt_max_resvport,
		.maxlen		= sizeof(unsigned int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= &xprt_min_resvport_limit,
		.extra2		= &xprt_max_resvport_limit
	},
	{
		.procname	= "tcp_fin_timeout",
		.data		= &xs_tcp_fin_timeout,
		.maxlen		= sizeof(xs_tcp_fin_timeout),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_jiffies,
	},
	{ },
};

static struct ctl_table sunrpc_table[] = {
	{
		.procname	= "sunrpc",
		.mode		= 0555,
		.child		= xs_tunables_table
	},
	{ },
};

#endif

/*
 * Wait duration for a reply from the RPC portmapper.
 */
#define XS_BIND_TO		(60U * HZ)

/*
 * Delay if a UDP socket connect error occurs. This is most likely some
 * kind of resource problem on the local host.
 */
#define XS_UDP_REEST_TO		(2U * HZ)

/*
 * The reestablish timeout allows clients to delay for a bit before attempting
 * to reconnect to a server that just dropped our connection.
 *
 * We implement an exponential backoff when trying to reestablish a TCP
 * transport connection with the server. Some servers like to drop a TCP
 * connection when they are overworked, so we start with a short timeout and
 * increase over time if the server is down or not responding.
 */
#define XS_TCP_INIT_REEST_TO	(3U * HZ)
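The exponential backoff described in the comment above can be illustrated with a small userspace sketch. The code that actually applies the backoff is not part of this excerpt, so everything here is an assumption for illustration: the tick rate, the upper bound and the helper name are hypothetical, and only the 3-second starting value mirrors XS_TCP_INIT_REEST_TO.

```c
#define HZ_SKETCH		100U			/* assumed tick rate */
#define INIT_REEST_TO_SKETCH	(3U * HZ_SKETCH)	/* mirrors XS_TCP_INIT_REEST_TO */
#define MAX_REEST_TO_SKETCH	(5U * 60 * HZ_SKETCH)	/* assumed upper bound */

/*
 * Return the next reconnect delay in ticks: start at the initial
 * value, then double on every failed attempt up to the assumed cap.
 */
static unsigned long next_reestablish_timeout(unsigned long cur)
{
	if (cur < INIT_REEST_TO_SKETCH)
		return INIT_REEST_TO_SKETCH;
	cur <<= 1;
	if (cur > MAX_REEST_TO_SKETCH)
		cur = MAX_REEST_TO_SKETCH;
	return cur;
}
```

Starting from zero, repeated calls give 3 s, 6 s, 12 s and so on, until the delay stops growing at the cap. A short first delay keeps recovery quick when an overworked server drops a connection once, while the doubling avoids hammering a server that stays down.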
/*
 * TCP idle timeout; client drops the transport socket if it is idle
 * for this long. Note that we also timeout UDP sockets to prevent
 * holding port numbers when there is no RPC traffic.
 */
#define XS_IDLE_DISC_TO		(5U * 60 * HZ)

#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
# undef RPC_DEBUG_DATA
# define RPCDBG_FACILITY RPCDBG_TRANS
#endif

#ifdef RPC_DEBUG_DATA
static void xs_pktdump(char *msg, u32 *packet, unsigned int count)
{
	u8 *buf = (u8 *) packet;
	int j;

	dprintk("RPC: %s\n", msg);
	for (j = 0; j < count && j < 128; j += 4) {
		if (!(j & 31)) {
			if (j)
				dprintk("\n");
			dprintk("0x%04x ", j);
		}
		dprintk("%02x%02x%02x%02x ",
			buf[j], buf[j+1], buf[j+2], buf[j+3]);
	}
	dprintk("\n");
}
#else
static inline void xs_pktdump(char *msg, u32 *packet, unsigned int count)
{
	/* NOP */
}
#endif
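To see what the dump layout of xs_pktdump() looks like, here is a userspace sketch that writes to a stdio stream instead of calling dprintk(). The function name is hypothetical, and one behaviour differs on purpose: the loop only prints whole 4-byte groups, so it never reads past the end of a short buffer.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace sketch of the xs_pktdump() layout: 4 bytes per group,
 * a new line with a hex offset every 32 bytes, at most 128 bytes.
 */
static void pktdump_sketch(FILE *out, const char *msg,
			   const unsigned char *buf, size_t count)
{
	size_t j;

	fprintf(out, "RPC: %s\n", msg);
	for (j = 0; j + 4 <= count && j < 128; j += 4) {
		if (!(j & 31)) {
			/* Start a new row, except before the first one. */
			if (j)
				fputc('\n', out);
			fprintf(out, "0x%04zx ", j);
		}
		fprintf(out, "%02x%02x%02x%02x ",
			buf[j], buf[j + 1], buf[j + 2], buf[j + 3]);
	}
	fputc('\n', out);
}
```

Calling pktdump_sketch(stdout, "reply", data, len) on eight bytes prints the message line followed by a single row that starts with the offset 0x0000 and holds two 4-byte groups.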
static inline struct rpc_xprt *xprt_from_sock(struct sock *sk)
{
	return (struct rpc_xprt *) sk->sk_user_data;
}

static inline struct sockaddr *xs_addr(struct rpc_xprt *xprt)
{
	return (struct sockaddr *) &xprt->addr;
}

static inline struct sockaddr_un *xs_addr_un(struct rpc_xprt *xprt)
{
	return (struct sockaddr_un *) &xprt->addr;
}

static inline struct sockaddr_in *xs_addr_in(struct rpc_xprt *xprt)
{
	return (struct sockaddr_in *) &xprt->addr;
}

static inline struct sockaddr_in6 *xs_addr_in6(struct rpc_xprt *xprt)
{
	return (struct sockaddr_in6 *) &xprt->addr;
}

static void xs_format_common_peer_addresses(struct rpc_xprt *xprt)
{
	struct sockaddr *sap = xs_addr(xprt);
	struct sockaddr_in6 *sin6;
	struct sockaddr_in *sin;
	struct sockaddr_un *sun;
	char buf[128];

	switch (sap->sa_family) {
	case AF_LOCAL:
		sun = xs_addr_un(xprt);
		strlcpy(buf, sun->sun_path, sizeof(buf));
		xprt->address_strings[RPC_DISPLAY_ADDR] =
						kstrdup(buf, GFP_KERNEL);
		break;
	case AF_INET:
		(void)rpc_ntop(sap, buf, sizeof(buf));
		xprt->address_strings[RPC_DISPLAY_ADDR] =
						kstrdup(buf, GFP_KERNEL);
		sin = xs_addr_in(xprt);
		snprintf(buf, sizeof(buf), "%08x", ntohl(sin->sin_addr.s_addr));
		break;
	case AF_INET6:
		(void)rpc_ntop(sap, buf, sizeof(buf));
		xprt->address_strings[RPC_DISPLAY_ADDR] =
						kstrdup(buf, GFP_KERNEL);
		sin6 = xs_addr_in6(xprt);
		snprintf(buf, sizeof(buf), "%pi6", &sin6->sin6_addr);
		break;
	default:
		BUG();
	}

	xprt->address_strings[RPC_DISPLAY_HEX_ADDR] = kstrdup(buf, GFP_KERNEL);
}

static void xs_format_common_peer_ports(struct rpc_xprt *xprt)
{
	struct sockaddr *sap = xs_addr(xprt);
	char buf[128];

	snprintf(buf, sizeof(buf), "%u", rpc_get_port(sap));
	xprt->address_strings[RPC_DISPLAY_PORT] = kstrdup(buf, GFP_KERNEL);

	snprintf(buf, sizeof(buf), "%4hx", rpc_get_port(sap));
	xprt->address_strings[RPC_DISPLAY_HEX_PORT] = kstrdup(buf, GFP_KERNEL);
}

static void xs_format_peer_addresses(struct rpc_xprt *xprt,
				     const char *protocol,
				     const char *netid)
{
	xprt->address_strings[RPC_DISPLAY_PROTO] = protocol;
	xprt->address_strings[RPC_DISPLAY_NETID] = netid;
	xs_format_common_peer_addresses(xprt);
	xs_format_common_peer_ports(xprt);
}

static void xs_update_peer_port(struct rpc_xprt *xprt)
{
	kfree(xprt->address_strings[RPC_DISPLAY_HEX_PORT]);
	kfree(xprt->address_strings[RPC_DISPLAY_PORT]);

	xs_format_common_peer_ports(xprt);
}

static void xs_free_peer_addresses(struct rpc_xprt *xprt)
{
	unsigned int i;

	for (i = 0; i < RPC_DISPLAY_MAX; i++)
		switch (i) {
		case RPC_DISPLAY_PROTO:
		case RPC_DISPLAY_NETID:
			continue;
		default:
			kfree(xprt->address_strings[i]);
		}
}

static size_t
xs_alloc_sparse_pages(struct xdr_buf *buf, size_t want, gfp_t gfp)
{
	size_t i,n;

	if (!(buf->flags & XDRBUF_SPARSE_PAGES))
		return want;
	if (want > buf->page_len)
		want = buf->page_len;
	n = (buf->page_base + want + PAGE_SIZE - 1) >> PAGE_SHIFT;
	for (i = 0; i < n; i++) {
		if (buf->pages[i])
			continue;
		buf->bvec[i].bv_page = buf->pages[i] = alloc_page(gfp);
		if (!buf->pages[i]) {
			buf->page_len = (i * PAGE_SIZE) - buf->page_base;
			return buf->page_len;
		}
	}
	return want;
}
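The page count in xs_alloc_sparse_pages() is a round-up division: the data starts page_base bytes into the first page, so the number of pages touched depends on both the offset and the length. A minimal userspace sketch of that arithmetic follows, assuming 4 KiB pages; the macro and function names are hypothetical and exist only for this example.

```c
#include <stddef.h>

#define PAGE_SHIFT_SKETCH	12	/* assumed 4 KiB pages */
#define PAGE_SIZE_SKETCH	((size_t)1 << PAGE_SHIFT_SKETCH)

/*
 * Number of pages needed to hold "want" bytes that begin
 * "page_base" bytes into the first page (round-up division).
 */
static size_t pages_spanned(size_t page_base, size_t want)
{
	return (page_base + want + PAGE_SIZE_SKETCH - 1) >> PAGE_SHIFT_SKETCH;
}
```

For instance, 4096 bytes starting at offset 0 fit in one page, while the same 4096 bytes starting at offset 100 spill into a second page.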
static ssize_t
xs_sock_recvmsg(struct socket *sock, struct msghdr *msg, int flags, size_t seek)
{
	ssize_t ret;
	if (seek != 0)
		iov_iter_advance(&msg->msg_iter, seek);
	ret = sock_recvmsg(sock, msg, flags);
	return ret > 0 ? ret + seek : ret;
}

static ssize_t
xs_read_kvec(struct socket *sock, struct msghdr *msg, int flags,
		struct kvec *kvec, size_t count, size_t seek)
{
	iov_iter_kvec(&msg->msg_iter, READ, kvec, 1, count);
	return xs_sock_recvmsg(sock, msg, flags, seek);
}

static ssize_t
xs_read_bvec(struct socket *sock, struct msghdr *msg, int flags,
		struct bio_vec *bvec, unsigned long nr, size_t count,
		size_t seek)
{
	iov_iter_bvec(&msg->msg_iter, READ, bvec, nr, count);
	return xs_sock_recvmsg(sock, msg, flags, seek);
}

static ssize_t
xs_read_discard(struct socket *sock, struct msghdr *msg, int flags,
		size_t count)
{
	struct kvec kvec = { 0 };
	return xs_read_kvec(sock, msg, flags | MSG_TRUNC, &kvec, count, 0);
}

static ssize_t
xs_read_xdr_buf(struct socket *sock, struct msghdr *msg, int flags,
		struct xdr_buf *buf, size_t count, size_t seek, size_t *read)
{
	size_t want, seek_init = seek, offset = 0;
	ssize_t ret;

	if (seek < buf->head[0].iov_len) {
		want = min_t(size_t, count, buf->head[0].iov_len);
		ret = xs_read_kvec(sock, msg, flags, &buf->head[0], want, seek);
		if (ret <= 0)
			goto sock_err;
		offset += ret;
		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
			goto out;
		if (ret != want)
			goto eagain;
		seek = 0;
	} else {
		seek -= buf->head[0].iov_len;
		offset += buf->head[0].iov_len;
	}
	if (seek < buf->page_len) {
		want = xs_alloc_sparse_pages(buf,
				min_t(size_t, count - offset, buf->page_len),
				GFP_NOWAIT);
		ret = xs_read_bvec(sock, msg, flags, buf->bvec,
				xdr_buf_pagecount(buf),
				want + buf->page_base,
				seek + buf->page_base);
		if (ret <= 0)
			goto sock_err;
		offset += ret - buf->page_base;
		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
			goto out;
		if (ret != want)
			goto eagain;
		seek = 0;
	} else {
		seek -= buf->page_len;
		offset += buf->page_len;
	}
	if (seek < buf->tail[0].iov_len) {
		want = min_t(size_t, count - offset, buf->tail[0].iov_len);
		ret = xs_read_kvec(sock, msg, flags, &buf->tail[0], want, seek);
		if (ret <= 0)
			goto sock_err;
		offset += ret;
		if (offset == count || msg->msg_flags & (MSG_EOR|MSG_TRUNC))
			goto out;
		if (ret != want)
			goto eagain;
	} else
		offset += buf->tail[0].iov_len;
	ret = -EMSGSIZE;
	msg->msg_flags |= MSG_TRUNC;
out:
	*read = offset - seek_init;
	return ret;
eagain:
	ret = -EAGAIN;
	goto out;
sock_err:
	offset += seek;
	goto out;
}

static void
xs_read_header(struct sock_xprt *transport, struct xdr_buf *buf)
{
	if (!transport->recv.copied) {
		if (buf->head[0].iov_len >= transport->recv.offset)
			memcpy(buf->head[0].iov_base,
					&transport->recv.xid,
					transport->recv.offset);
		transport->recv.copied = transport->recv.offset;
	}
}

static bool
xs_read_stream_request_done(struct sock_xprt *transport)
{
	return transport->recv.fraghdr & cpu_to_be32(RPC_LAST_STREAM_FRAGMENT);
}

static ssize_t
xs_read_stream_request(struct sock_xprt *transport, struct msghdr *msg,
		int flags, struct rpc_rqst *req)
{
	struct xdr_buf *buf = &req->rq_private_buf;
	size_t want, read;
	ssize_t ret;

	xs_read_header(transport, buf);

	want = transport->recv.len - transport->recv.offset;
	ret = xs_read_xdr_buf(transport->sock, msg, flags, buf,
			transport->recv.copied + want, transport->recv.copied,
			&read);
	transport->recv.offset += read;
	transport->recv.copied += read;
	if (transport->recv.offset == transport->recv.len) {
		if (xs_read_stream_request_done(transport))
			msg->msg_flags |= MSG_EOR;
		return transport->recv.copied;
	}

	switch (ret) {
	case -EMSGSIZE:
		return transport->recv.copied;
	case 0:
		return -ESHUTDOWN;
	default:
		if (ret < 0)
			return ret;
	}
	return -EAGAIN;
}

static size_t
xs_read_stream_headersize(bool isfrag)
{
	if (isfrag)
		return sizeof(__be32);
	return 3 * sizeof(__be32);
}

static ssize_t
xs_read_stream_header(struct sock_xprt *transport, struct msghdr *msg,
		int flags, size_t want, size_t seek)
{
	struct kvec kvec = {
		.iov_base = &transport->recv.fraghdr,
		.iov_len = want,
	};
	return xs_read_kvec(transport->sock, msg, flags, &kvec, want, seek);
}

#if defined(CONFIG_SUNRPC_BACKCHANNEL)
static ssize_t
xs_read_stream_call(struct sock_xprt *transport, struct msghdr *msg, int flags)
{
	struct rpc_xprt *xprt = &transport->xprt;
	struct rpc_rqst *req;
	ssize_t ret;

	/* Look up and lock the request corresponding to the given XID */
	req = xprt_lookup_bc_request(xprt, transport->recv.xid);
	if (!req) {
		printk(KERN_WARNING "Callback slot table overflowed\n");
		return -ESHUTDOWN;
	}

	ret = xs_read_stream_request(transport, msg, flags, req);
	if (msg->msg_flags & (MSG_EOR|MSG_TRUNC))
		xprt_complete_bc_request(req, ret);

	return ret;
}
#else /* CONFIG_SUNRPC_BACKCHANNEL */
static ssize_t
xs_read_stream_call(struct sock_xprt *transport, struct msghdr *msg, int flags)
{
	return -ESHUTDOWN;
}
#endif /* CONFIG_SUNRPC_BACKCHANNEL */

static ssize_t
xs_read_stream_reply(struct sock_xprt *transport, struct msghdr *msg, int flags)
{
	struct rpc_xprt *xprt = &transport->xprt;
	struct rpc_rqst *req;
	ssize_t ret = 0;

	/* Look up and lock the request corresponding to the given XID */
	spin_lock(&xprt->queue_lock);
	req = xprt_lookup_rqst(xprt, transport->recv.xid);
	if (!req) {
		msg->msg_flags |= MSG_TRUNC;
		goto out;
	}
	xprt_pin_rqst(req);
	spin_unlock(&xprt->queue_lock);

	ret = xs_read_stream_request(transport, msg, flags, req);

	spin_lock(&xprt->queue_lock);
	if (msg->msg_flags & (MSG_EOR|MSG_TRUNC))
		xprt_complete_rqst(req->rq_task, ret);
	xprt_unpin_rqst(req);
out:
	spin_unlock(&xprt->queue_lock);
	return ret;
}

static ssize_t
xs_read_stream(struct sock_xprt *transport, int flags)
{
	struct msghdr msg = { 0 };
	size_t want, read = 0;
	ssize_t ret = 0;

	if (transport->recv.len == 0) {
		want = xs_read_stream_headersize(transport->recv.copied != 0);
		ret = xs_read_stream_header(transport, &msg, flags, want,
				transport->recv.offset);
		if (ret <= 0)
			goto out_err;
		transport->recv.offset = ret;
		if (ret != want) {
			ret = -EAGAIN;
			goto out_err;
		}
		transport->recv.len = be32_to_cpu(transport->recv.fraghdr) &
			RPC_FRAGMENT_SIZE_MASK;
		transport->recv.offset -= sizeof(transport->recv.fraghdr);
		read = ret;
	}

	switch (be32_to_cpu(transport->recv.calldir)) {
	case RPC_CALL:
		ret = xs_read_stream_call(transport, &msg, flags);
		break;
	case RPC_REPLY:
		ret = xs_read_stream_reply(transport, &msg, flags);
	}
	if (msg.msg_flags & MSG_TRUNC) {
		transport->recv.calldir = cpu_to_be32(-1);
		transport->recv.copied = -1;
	}
	if (ret < 0)
		goto out_err;
	read += ret;
	if (transport->recv.offset < transport->recv.len) {
		ret = xs_read_discard(transport->sock, &msg, flags,
				transport->recv.len - transport->recv.offset);
		if (ret <= 0)
			goto out_err;
		transport->recv.offset += ret;
		read += ret;
		if (transport->recv.offset != transport->recv.len)
			return -EAGAIN;
	}
	if (xs_read_stream_request_done(transport)) {
		trace_xs_stream_read_request(transport);
		transport->recv.copied = 0;
	}
	transport->recv.offset = 0;
	transport->recv.len = 0;
	return read;
out_err:
	switch (ret) {
	case 0:
	case -ESHUTDOWN:
		xprt_force_disconnect(&transport->xprt);
		return -ESHUTDOWN;
	}
	return ret;
}
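xs_read_stream() relies on the RPC record marker: a 4-byte big-endian word in front of every fragment whose top bit flags the last fragment of a record and whose low 31 bits give the fragment length. The userspace sketch below decodes such a marker from raw bytes. The struct, the function and the two constants are hypothetical names for this example; the constants are assumed to correspond to RPC_LAST_STREAM_FRAGMENT and RPC_FRAGMENT_SIZE_MASK as used above.

```c
#include <stdbool.h>
#include <stdint.h>

#define LAST_FRAGMENT_SKETCH		0x80000000u	/* assumed: top bit */
#define FRAGMENT_SIZE_MASK_SKETCH	0x7fffffffu	/* assumed: low 31 bits */

struct record_marker {
	uint32_t len;	/* fragment length in bytes */
	bool last;	/* true if this fragment ends the record */
};

/* Decode a 4-byte big-endian record marker as read from the socket. */
static struct record_marker decode_record_marker(const unsigned char raw[4])
{
	uint32_t host = ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16) |
			((uint32_t)raw[2] << 8) | (uint32_t)raw[3];
	struct record_marker rm = {
		.len = host & FRAGMENT_SIZE_MASK_SKETCH,
		.last = (host & LAST_FRAGMENT_SKETCH) != 0,
	};

	return rm;
}
```

The bytes 80 00 00 1c therefore describe a final fragment of 28 bytes, and 00 00 01 00 describes a 256-byte fragment with more to follow. Assembling the word byte by byte keeps the sketch independent of host endianness, which is what be32_to_cpu() provides in the kernel code.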
static void xs_stream_data_receive(struct sock_xprt *transport)
{
	size_t read = 0;
	ssize_t ret = 0;

	mutex_lock(&transport->recv_mutex);
	if (transport->sock == NULL)
		goto out;
	clear_bit(XPRT_SOCK_DATA_READY, &transport->sock_state);
	for (;;) {
		ret = xs_read_stream(transport, MSG_DONTWAIT);
		if (ret <= 0)
			break;
		read += ret;
		cond_resched();
	}
out:
	mutex_unlock(&transport->recv_mutex);
	trace_xs_stream_read_data(&transport->xprt, ret, read);
}

static void xs_stream_data_receive_workfn(struct work_struct *work)
{
	struct sock_xprt *transport =
		container_of(work, struct sock_xprt, recv_worker);
	xs_stream_data_receive(transport);
}

static void
xs_stream_reset_connect(struct sock_xprt *transport)
{
	transport->recv.offset = 0;
	transport->recv.len = 0;
	transport->recv.copied = 0;
	transport->xmit.offset = 0;
	transport->xprt.stat.connect_count++;
	transport->xprt.stat.connect_start = jiffies;
}

#define XS_SENDMSG_FLAGS (MSG_DONTWAIT | MSG_NOSIGNAL)
|
|
|
|
|
2006-10-18 02:06:22 +07:00
|
|
|
static int xs_send_kvec(struct socket *sock, struct sockaddr *addr, int addrlen, struct kvec *vec, unsigned int base, int more)
|
2005-08-12 03:25:29 +07:00
|
|
|
{
|
|
|
|
struct msghdr msg = {
|
|
|
|
.msg_name = addr,
|
|
|
|
.msg_namelen = addrlen,
|
2006-10-18 02:06:22 +07:00
|
|
|
.msg_flags = XS_SENDMSG_FLAGS | (more ? MSG_MORE : 0),
|
|
|
|
};
|
|
|
|
struct kvec iov = {
|
|
|
|
.iov_base = vec->iov_base + base,
|
|
|
|
.iov_len = vec->iov_len - base,
|
2005-08-12 03:25:29 +07:00
|
|
|
};
|
|
|
|
|
2006-10-18 02:06:22 +07:00
|
|
|
if (iov.iov_len != 0)
|
2005-08-12 03:25:29 +07:00
|
|
|
return kernel_sendmsg(sock, &msg, &iov, 1, iov.iov_len);
|
|
|
|
return kernel_sendmsg(sock, &msg, NULL, 0, 0);
|
|
|
|
}

static int xs_send_pagedata(struct socket *sock, struct xdr_buf *xdr, unsigned int base, int more, bool zerocopy, int *sent_p)
{
SUNRPC: Fix a data corruption issue when retransmitting RPC calls
The following scenario can cause silent data corruption when doing
NFS writes. It has mainly been observed when doing database writes
using O_DIRECT.
1) The RPC client uses sendpage() to do zero-copy of the page data.
2) Due to networking issues, the reply from the server is delayed,
and so the RPC client times out.
3) The client issues a second sendpage of the page data as part of
an RPC call retransmission.
4) The reply to the first transmission arrives from the server
_before_ the client hardware has emptied the TCP socket send
buffer.
5) After processing the reply, the RPC state machine rules the
call to be done, and triggers the completion callbacks.
6) The application notices the RPC call is done, and reuses the
pages to store something else (e.g. a new write).
7) The client NIC drains the TCP socket send buffer. Since the
page data has now changed, it reads a corrupted version of the
initial RPC call, and puts it on the wire.
This patch fixes the problem in the following manner:
The ordering guarantees of TCP ensure that when the server sends a
reply, then we know that the _first_ transmission has completed. Using
zero-copy in that situation is therefore safe.
If a timeout occurs, we then send the retransmission using sendmsg()
(i.e. no zero-copy). We then know that the socket contains a full copy of
the data, and so it will retransmit a faithful reproduction even if the
RPC call completes, and the application reuses the O_DIRECT buffer in
the meantime.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-11-09 04:03:50 +07:00
	ssize_t (*do_sendpage)(struct socket *sock, struct page *page,
			int offset, size_t size, int flags);
	struct page **ppage;
	unsigned int remainder;
	int err;

	remainder = xdr->page_len - base;
	base += xdr->page_base;
	ppage = xdr->pages + (base >> PAGE_SHIFT);
	base &= ~PAGE_MASK;
	do_sendpage = sock->ops->sendpage;
	if (!zerocopy)
		do_sendpage = sock_no_sendpage;
	for(;;) {
		unsigned int len = min_t(unsigned int, PAGE_SIZE - base, remainder);
		int flags = XS_SENDMSG_FLAGS;

		remainder -= len;
		if (more)
			flags |= MSG_MORE;
		if (remainder != 0)
			flags |= MSG_SENDPAGE_NOTLAST | MSG_MORE;
		err = do_sendpage(sock, *ppage, base, len, flags);
		if (remainder == 0 || err != len)
			break;
		*sent_p += err;
		ppage++;
		base = 0;
	}
	if (err > 0) {
		*sent_p += err;
		err = 0;
	}
	return err;
}
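
The page walk in xs_send_pagedata() can be hard to follow from the bit arithmetic alone. The sketch below reproduces only the index/offset/length computation in userspace, assuming 4 KiB pages. `demo_chunk`, `demo_split_pages` and the `DEMO_PAGE_*` constants are hypothetical names introduced for illustration; no socket calls are made.

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (1u << DEMO_PAGE_SHIFT)

struct demo_chunk {
	unsigned int page;	/* index into the page array */
	unsigned int offset;	/* byte offset within that page */
	unsigned int len;	/* bytes to send from that page */
};

/*
 * Split the page section of a buffer into per-page chunks, starting
 * 'base' bytes into the section. Returns the number of chunks written.
 */
static size_t demo_split_pages(unsigned int page_base, unsigned int page_len,
			       unsigned int base, struct demo_chunk *out,
			       size_t max)
{
	unsigned int remainder = page_len - base;
	unsigned int pos = base + page_base;
	unsigned int page = pos >> DEMO_PAGE_SHIFT;
	unsigned int off = pos & (DEMO_PAGE_SIZE - 1);	/* same as base & ~PAGE_MASK */
	size_t n = 0;

	while (remainder != 0 && n < max) {
		unsigned int len = DEMO_PAGE_SIZE - off;

		if (len > remainder)
			len = remainder;
		out[n].page = page;
		out[n].offset = off;
		out[n].len = len;
		n++;
		remainder -= len;
		page++;
		off = 0;	/* every page after the first starts at offset 0 */
	}
	return n;
}
```

Only the first chunk can start mid-page; this is why the kernel loop resets `base` to 0 after each successful send.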

/**
 * xs_sendpages - write pages directly to a socket
 * @sock: socket to send on
 * @addr: UDP only -- address of destination
 * @addrlen: UDP only -- length of destination address
 * @xdr: buffer containing this request
 * @base: starting position in the buffer
 * @zerocopy: true if it is safe to use sendpage()
 * @sent_p: return the total number of bytes successfully queued for sending
 *
 */
static int xs_sendpages(struct socket *sock, struct sockaddr *addr, int addrlen, struct xdr_buf *xdr, unsigned int base, bool zerocopy, int *sent_p)
{
	unsigned int remainder = xdr->len - base;
	int err = 0;
	int sent = 0;

	if (unlikely(!sock))
		return -ENOTSOCK;

	if (base != 0) {
		addr = NULL;
		addrlen = 0;
	}

	if (base < xdr->head[0].iov_len || addr != NULL) {
		unsigned int len = xdr->head[0].iov_len - base;
		remainder -= len;
		err = xs_send_kvec(sock, addr, addrlen, &xdr->head[0], base, remainder != 0);
		if (remainder == 0 || err != len)
			goto out;
		*sent_p += err;
		base = 0;
	} else
		base -= xdr->head[0].iov_len;

	if (base < xdr->page_len) {
		unsigned int len = xdr->page_len - base;
		remainder -= len;
		err = xs_send_pagedata(sock, xdr, base, remainder != 0, zerocopy, &sent);
		*sent_p += sent;
		if (remainder == 0 || sent != len)
			goto out;
		base = 0;
	} else
		base -= xdr->page_len;

	if (base >= xdr->tail[0].iov_len)
		return 0;
	err = xs_send_kvec(sock, NULL, 0, &xdr->tail[0], base, 0);
out:
	if (err > 0) {
		*sent_p += err;
		err = 0;
	}
	return err;
}
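
xs_sendpages() resumes a partially sent buffer by walking the head, page and tail sections and subtracting each section's length from `base` until it lands inside one. A minimal sketch of that bookkeeping follows. `demo_plan` and `demo_plan_send` are hypothetical names, and the sketch deliberately ignores the `addr != NULL` case that forces the head to be sent for UDP.

```c
/* Bytes still to be sent from each section of a head/pages/tail buffer. */
struct demo_plan {
	unsigned int head;
	unsigned int pages;
	unsigned int tail;
};

/* Compute what remains to be sent when resuming at byte offset 'base'. */
static struct demo_plan demo_plan_send(unsigned int head_len,
				       unsigned int page_len,
				       unsigned int tail_len,
				       unsigned int base)
{
	struct demo_plan plan = { 0, 0, 0 };

	if (base < head_len) {
		plan.head = head_len - base;
		base = 0;		/* later sections start from their beginning */
	} else {
		base -= head_len;	/* skip the fully sent head */
	}

	if (base < page_len) {
		plan.pages = page_len - base;
		base = 0;
	} else {
		base -= page_len;	/* skip the fully sent pages */
	}

	if (base < tail_len)
		plan.tail = tail_len - base;
	return plan;
}
```

For example, with a 100-byte head, 4096 bytes of page data and a 20-byte tail, resuming at offset 150 skips the head entirely and sends the last 4046 bytes of page data followed by the whole tail.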

/**
 * xs_nospace - handle transmit was incomplete
 * @req: pointer to RPC request
 *
 */
static int xs_nospace(struct rpc_rqst *req)
{
	struct rpc_xprt *xprt = req->rq_xprt;
	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
	struct sock *sk = transport->inet;
	int ret = -EAGAIN;

	dprintk("RPC: %5u xmit incomplete (%u left of %u)\n",
			req->rq_task->tk_pid,
			req->rq_slen - transport->xmit.offset,
			req->rq_slen);

	/* Protect against races with write_space */
	spin_lock_bh(&xprt->transport_lock);

	/* Don't race with disconnect */
	if (xprt_connected(xprt)) {
		/* wait for more buffer space */
		sk->sk_write_pending++;
		xprt_wait_for_buffer_space(xprt);
	} else
		ret = -ENOTCONN;

	spin_unlock_bh(&xprt->transport_lock);

	/* Race breaker in case memory is freed before above code is called */
	if (ret == -EAGAIN) {
		struct socket_wq *wq;

		rcu_read_lock();
		wq = rcu_dereference(sk->sk_wq);
		set_bit(SOCKWQ_ASYNC_NOSPACE, &wq->flags);
		rcu_read_unlock();

		sk->sk_write_space(sk);
	}
	return ret;
}

static void
xs_stream_prepare_request(struct rpc_rqst *req)
{
	req->rq_task->tk_status = xdr_alloc_bvec(&req->rq_rcv_buf, GFP_NOIO);
}

/*
 * Determine if the previous message in the stream was aborted before it
 * could complete transmission.
 */
static bool
xs_send_request_was_aborted(struct sock_xprt *transport, struct rpc_rqst *req)
{
	return transport->xmit.offset != 0 && req->rq_bytes_sent == 0;
}

/*
 * Construct a stream transport record marker in @buf.
 */
static inline void xs_encode_stream_record_marker(struct xdr_buf *buf)
{
	u32 reclen = buf->len - sizeof(rpc_fraghdr);
	rpc_fraghdr *base = buf->head[0].iov_base;
	*base = cpu_to_be32(RPC_LAST_STREAM_FRAGMENT | reclen);
}
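
To make the record marker concrete, here is a small userspace sketch of the same encoding: the record length excludes the 4-byte marker itself, the top bit is set to flag a single (last) fragment, and the result is stored big-endian. `demo_encode_record_marker` and `DEMO_LAST_FRAGMENT` are hypothetical names; the constant is assumed to match `RPC_LAST_STREAM_FRAGMENT`.

```c
#include <stdint.h>

#define DEMO_LAST_FRAGMENT 0x80000000u	/* assumed value of RPC_LAST_STREAM_FRAGMENT */

/*
 * Write the record marker for a buffer of 'total_len' bytes, where
 * total_len already includes the 4 bytes reserved for the marker.
 */
static void demo_encode_record_marker(uint8_t marker[4], uint32_t total_len)
{
	uint32_t reclen = total_len - 4;	/* marker is not part of the record */
	uint32_t value = DEMO_LAST_FRAGMENT | reclen;

	/* Store in network (big-endian) byte order, as cpu_to_be32() would. */
	marker[0] = (uint8_t)(value >> 24);
	marker[1] = (uint8_t)(value >> 16);
	marker[2] = (uint8_t)(value >> 8);
	marker[3] = (uint8_t)value;
}
```

A 104-byte send buffer therefore yields the marker bytes 0x80 0x00 0x00 0x64, i.e. "last fragment, 100 bytes follow".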

/**
 * xs_local_send_request - write an RPC request to an AF_LOCAL socket
 * @req: pointer to RPC request
 *
 * Return values:
 *        0:	The request has been sent
 *   EAGAIN:	The socket was blocked, please call again later to
 *		complete the request
 * ENOTCONN:	Caller needs to invoke connect logic then call again
 *    other:	Some other error occurred, the request was not sent
 */
static int xs_local_send_request(struct rpc_rqst *req)
{
	struct rpc_xprt *xprt = req->rq_xprt;
	struct sock_xprt *transport =
				container_of(xprt, struct sock_xprt, xprt);
	struct xdr_buf *xdr = &req->rq_snd_buf;
	int status;
	int sent = 0;

	/* Close the stream if the previous transmission was incomplete */
	if (xs_send_request_was_aborted(transport, req)) {
		xs_close(xprt);
		return -ENOTCONN;
	}

	xs_encode_stream_record_marker(&req->rq_snd_buf);

	xs_pktdump("packet data:",
			req->rq_svec->iov_base, req->rq_svec->iov_len);

	req->rq_xtime = ktime_get();
	status = xs_sendpages(transport->sock, NULL, 0, xdr,
			      transport->xmit.offset,
			      true, &sent);
	dprintk("RPC: %s(%u) = %d\n",
			__func__, xdr->len - transport->xmit.offset, status);

	if (status == -EAGAIN && sock_writeable(transport->inet))
		status = -ENOBUFS;

	if (likely(sent > 0) || status == 0) {
		transport->xmit.offset += sent;
		req->rq_bytes_sent = transport->xmit.offset;
		if (likely(req->rq_bytes_sent >= req->rq_slen)) {
			req->rq_xmit_bytes_sent += transport->xmit.offset;
			req->rq_bytes_sent = 0;
			transport->xmit.offset = 0;
			return 0;
		}
		status = -EAGAIN;
	}

	switch (status) {
	case -ENOBUFS:
		break;
	case -EAGAIN:
		status = xs_nospace(req);
		break;
	default:
		dprintk("RPC: sendmsg returned unrecognized error %d\n",
			-status);
		/* fall through */
	case -EPIPE:
		xs_close(xprt);
		status = -ENOTCONN;
	}

	return status;
}
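
The stream send path above keeps a running transmit offset so that a partially sent request can be resumed on the next call. The following sketch isolates that accounting. `demo_xmit` and `demo_account_sent` are hypothetical names and model only the offset arithmetic, not the socket I/O or the xs_nospace() wait.

```c
#include <errno.h>

struct demo_xmit {
	unsigned int offset;	/* bytes of the current request already queued */
};

/*
 * Record that 'sent' more bytes of a 'total_len'-byte request were queued.
 * Returns 0 once the whole request is out (and resets the offset for the
 * next request), or -EAGAIN if the caller must try again later.
 */
static int demo_account_sent(struct demo_xmit *xmit, unsigned int total_len,
			     unsigned int sent)
{
	xmit->offset += sent;
	if (xmit->offset >= total_len) {
		xmit->offset = 0;
		return 0;
	}
	return -EAGAIN;
}
```

A non-zero offset left behind by an abandoned request is what xs_send_request_was_aborted() detects before the next request is written to the stream.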

/**
 * xs_udp_send_request - write an RPC request to a UDP socket
 * @req: pointer to RPC request
 *
 * Return values:
 *        0:	The request has been sent
 *   EAGAIN:	The socket was blocked, please call again later to
 *		complete the request
 * ENOTCONN:	Caller needs to invoke connect logic then call again
 *    other:	Some other error occurred, the request was not sent
 */
static int xs_udp_send_request(struct rpc_rqst *req)
{
	struct rpc_xprt *xprt = req->rq_xprt;
	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
	struct xdr_buf *xdr = &req->rq_snd_buf;
	int sent = 0;
	int status;

	xs_pktdump("packet data:",
				req->rq_svec->iov_base,
				req->rq_svec->iov_len);

	if (!xprt_bound(xprt))
		return -ENOTCONN;

	if (!xprt_request_get_cong(xprt, req))
		return -EBADSLT;

	req->rq_xtime = ktime_get();
	status = xs_sendpages(transport->sock, xs_addr(xprt), xprt->addrlen,
			      xdr, 0, true, &sent);

	dprintk("RPC: xs_udp_send_request(%u) = %d\n",
			xdr->len, status);
rpc: Add -EPERM processing for xs_udp_send_request()
If an iptables drop rule is added for an nfs server, the client can end up in
a softlockup. Because of the way that xs_sendpages() is structured, the -EPERM
is ignored since the prior bits of the packet may have been successfully queued
and thus xs_sendpages() returns a non-zero value. Then, xs_udp_send_request()
thinks that because some bits were queued it should return -EAGAIN. We then try
the request again and again, resulting in cpu spinning. Reproducer:
1) open a file on the nfs server '/nfs/foo' (mounted using udp)
2) iptables -A OUTPUT -d <nfs server ip> -j DROP
3) write to /nfs/foo
4) close /nfs/foo
5) iptables -D OUTPUT -d <nfs server ip> -j DROP
The softlockup occurs in step 4 above.
The previous patch allows xs_sendpages() to return both a sent count and
any error values that may have occurred. Thus, if we get an -EPERM, return
that to the higher level code.
With this patch in place we can successfully abort the above sequence and
avoid the softlockup.
I also tried the above test case on an nfs mount on tcp and although the system
does not softlockup, I still ended up with the 'hung_task' firing after 120
seconds, due to the i/o being stuck. The tcp case appears a bit harder to fix,
since -EPERM appears to get ignored much lower down in the stack and does not
propagate up to xs_sendpages(). This case is not quite as insidious as the
softlockup and it is not addressed here.
Reported-by: Yigong Lou <ylou@akamai.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-25 01:08:04 +07:00

	/* firewall is blocking us, don't return -EAGAIN or we end up looping */
	if (status == -EPERM)
		goto process_status;

	if (status == -EAGAIN && sock_writeable(transport->inet))
		status = -ENOBUFS;

	if (sent > 0 || status == 0) {
		req->rq_xmit_bytes_sent += sent;
		if (sent >= req->rq_slen)
			return 0;
		/* Still some bytes left; set up for a retry later. */
		status = -EAGAIN;
	}

process_status:
	switch (status) {
	case -ENOTSOCK:
		status = -ENOTCONN;
		/* Should we call xs_close() here? */
		break;
	case -EAGAIN:
		status = xs_nospace(req);
		break;
	case -ENETUNREACH:
	case -ENOBUFS:
	case -EPIPE:
	case -ECONNREFUSED:
	case -EPERM:
		/* When the server has died, an ICMP port unreachable message
		 * prompts ECONNREFUSED. */
		break;
	default:
		dprintk("RPC: sendmsg returned unrecognized error %d\n",
			-status);
	}

	return status;
}
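
The order of the status checks in the UDP send path matters: -EPERM is tested before the "some bytes were queued" branch, so a firewall drop is never converted into -EAGAIN and retried forever. A rough sketch of just that decision logic is shown below. `demo_udp_status` is a hypothetical helper; it returns -EAGAIN directly where the kernel code would call xs_nospace(), and it omits the dprintk for unrecognized errors.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Map the raw send result to the status reported to the caller.
 *   status:    error code from the send attempt (0 or negative errno)
 *   sent:      bytes queued by this attempt
 *   slen:      total length of the request
 *   writeable: whether the socket still has send-buffer space
 */
static int demo_udp_status(int status, int sent, int slen, bool writeable)
{
	/* Firewall drop: report it as-is so the caller does not loop. */
	if (status == -EPERM)
		return status;

	/* Socket has room, so the shortage is elsewhere (e.g. memory). */
	if (status == -EAGAIN && writeable)
		status = -ENOBUFS;

	if (sent > 0 || status == 0) {
		if (sent >= slen)
			return 0;
		/* Still some bytes left; ask for a retry later. */
		status = -EAGAIN;
	}

	if (status == -ENOTSOCK)
		status = -ENOTCONN;
	return status;
}
```

With the -EPERM check removed, a call such as `demo_udp_status(-EPERM, 50, 100, true)` would fall into the partial-send branch and return -EAGAIN, which is the spin described in the commit message above.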

/**
 * xs_tcp_send_request - write an RPC request to a TCP socket
 * @req: pointer to RPC request
 *
 * Return values:
 *        0:	The request has been sent
 *   EAGAIN:	The socket was blocked, please call again later to
 *		complete the request
 * ENOTCONN:	Caller needs to invoke connect logic then call again
 *    other:	Some other error occurred, the request was not sent
 *
 * XXX: In the case of soft timeouts, should we eventually give up
 *	if sendmsg is not able to make progress?
 */
static int xs_tcp_send_request(struct rpc_rqst *req)
{
	struct rpc_xprt *xprt = req->rq_xprt;
	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
	struct xdr_buf *xdr = &req->rq_snd_buf;
	bool zerocopy = true;
	bool vm_wait = false;
	int status;
	int sent;

	/* Close the stream if the previous transmission was incomplete */
	if (xs_send_request_was_aborted(transport, req)) {
		if (transport->sock != NULL)
			kernel_sock_shutdown(transport->sock, SHUT_RDWR);
		return -ENOTCONN;
	}

	xs_encode_stream_record_marker(&req->rq_snd_buf);

	xs_pktdump("packet data:",
				req->rq_svec->iov_base,
				req->rq_svec->iov_len);
SUNRPC: Fix a data corruption issue when retransmitting RPC calls
The following scenario can cause silent data corruption when doing
NFS writes. It has mainly been observed when doing database writes
using O_DIRECT.
1) The RPC client uses sendpage() to do zero-copy of the page data.
2) Due to networking issues, the reply from the server is delayed,
and so the RPC client times out.
3) The client issues a second sendpage of the page data as part of
an RPC call retransmission.
4) The reply to the first transmission arrives from the server
_before_ the client hardware has emptied the TCP socket send
buffer.
5) After processing the reply, the RPC state machine rules that
the call to be done, and triggers the completion callbacks.
6) The application notices the RPC call is done, and reuses the
pages to store something else (e.g. a new write).
7) The client NIC drains the TCP socket send buffer. Since the
page data has now changed, it reads a corrupted version of the
initial RPC call, and puts it on the wire.
This patch fixes the problem in the following manner:
The ordering guarantees of TCP ensure that when the server sends a
reply, then we know that the _first_ transmission has completed. Using
zero-copy in that situation is therefore safe.
If a timeout occurs, we then send the retransmission using sendmsg()
(i.e. no zero-copy). We then know that the socket contains a full copy of
the data, and so it will retransmit a faithful reproduction even if the
RPC call completes, and the application reuses the O_DIRECT buffer in
the meantime.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-11-09 04:03:50 +07:00
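The hazard described in the commit message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct queued_send`, `may_use_zerocopy()`, `queue_send()` and `drain()` are hypothetical names invented for illustration, not kernel APIs. A "zero-copy" entry keeps a pointer to the caller's page, while a copying entry takes a snapshot at queue time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_LEN 16

/* Toy model of one entry in a socket send queue (hypothetical type). */
struct queued_send {
	const char *page_ref;   /* zero-copy: points at the caller's page */
	char copy[PAGE_LEN];    /* copying send: private snapshot of the data */
	bool zerocopy;
};

/* The rule from the patch: a retransmission must not use zero-copy. */
static bool may_use_zerocopy(bool already_sent)
{
	return !already_sent;
}

/* Queue data: keep a reference for zero-copy, else snapshot the bytes now. */
static void queue_send(struct queued_send *q, const char *page, bool zerocopy)
{
	memset(q->copy, 0, PAGE_LEN);
	q->zerocopy = zerocopy;
	q->page_ref = page;
	if (!zerocopy)
		memcpy(q->copy, page, PAGE_LEN);
}

/* What the NIC would read when it finally drains the queue. */
static const char *drain(const struct queued_send *q)
{
	return q->zerocopy ? q->page_ref : q->copy;
}
```

In this model, a retransmission queued with zero-copy reads whatever the application has since written into the page, whereas one queued with `may_use_zerocopy(true)` (which returns false) still drains the original bytes.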
|
|
|
/* Don't use zero copy if this is a resend. If the RPC call
|
|
|
|
* completes while the socket holds a reference to the pages,
|
|
|
|
* then we may end up resending corrupted data.
|
|
|
|
*/
|
2018-08-31 00:27:29 +07:00
|
|
|
if (req->rq_task->tk_flags & RPC_TASK_SENT)
|
2013-11-09 04:03:50 +07:00
|
|
|
zerocopy = false;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2017-02-08 23:17:54 +07:00
|
|
|
if (test_bit(XPRT_SOCK_UPD_TIMEOUT, &transport->sock_state))
|
|
|
|
xs_tcp_set_socket_timeouts(xprt, transport->sock);
|
|
|
|
|
2005-08-12 03:25:23 +07:00
|
|
|
/* Continue transmitting the packet/record. We must be careful
|
|
|
|
* to cope with writespace callbacks arriving _after_ we have
|
2005-08-12 03:25:56 +07:00
|
|
|
* called sendmsg(). */
|
2018-03-06 03:13:07 +07:00
|
|
|
req->rq_xtime = ktime_get();
|
2005-08-12 03:25:23 +07:00
|
|
|
while (1) {
|
2014-09-25 01:08:00 +07:00
|
|
|
sent = 0;
|
|
|
|
status = xs_sendpages(transport->sock, NULL, 0, xdr,
|
2018-08-14 03:54:57 +07:00
|
|
|
transport->xmit.offset,
|
|
|
|
zerocopy, &sent);
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: xs_tcp_send_request(%u) = %d\n",
|
2018-08-14 03:54:57 +07:00
|
|
|
xdr->len - transport->xmit.offset, status);
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2005-08-12 03:25:56 +07:00
|
|
|
/* If we've sent the entire packet, immediately
|
|
|
|
* reset the count of bytes sent. */
|
2018-08-14 03:54:57 +07:00
|
|
|
transport->xmit.offset += sent;
|
|
|
|
req->rq_bytes_sent = transport->xmit.offset;
|
2005-08-12 03:25:56 +07:00
|
|
|
if (likely(req->rq_bytes_sent >= req->rq_slen)) {
|
2018-08-14 03:54:57 +07:00
|
|
|
req->rq_xmit_bytes_sent += transport->xmit.offset;
|
2005-08-12 03:25:56 +07:00
|
|
|
req->rq_bytes_sent = 0;
|
2018-08-14 03:54:57 +07:00
|
|
|
transport->xmit.offset = 0;
|
2005-08-12 03:25:56 +07:00
|
|
|
return 0;
|
|
|
|
}
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2016-05-29 11:42:03 +07:00
|
|
|
WARN_ON_ONCE(sent == 0 && status == 0);
|
|
|
|
|
|
|
|
if (status == -EAGAIN) {
|
|
|
|
/*
|
|
|
|
* Return EAGAIN if we're sure we're hitting the
|
|
|
|
* socket send buffer limits.
|
|
|
|
*/
|
|
|
|
if (test_bit(SOCK_NOSPACE, &transport->sock->flags))
|
|
|
|
break;
|
|
|
|
/*
|
|
|
|
* Did we hit a memory allocation failure?
|
|
|
|
*/
|
|
|
|
if (sent == 0) {
|
|
|
|
status = -ENOBUFS;
|
|
|
|
if (vm_wait)
|
|
|
|
break;
|
|
|
|
/* Retry, knowing now that we're below the
|
|
|
|
* socket send buffer limit
|
|
|
|
*/
|
|
|
|
vm_wait = true;
|
|
|
|
}
|
|
|
|
continue;
|
|
|
|
}
|
2015-07-11 22:48:52 +07:00
|
|
|
if (status < 0)
|
|
|
|
break;
|
2016-05-29 11:42:03 +07:00
|
|
|
vm_wait = false;
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
2005-08-12 03:25:56 +07:00
|
|
|
switch (status) {
|
2009-03-12 01:06:41 +07:00
|
|
|
case -ENOTSOCK:
|
|
|
|
status = -ENOTCONN;
|
|
|
|
/* Should we call xs_close() here? */
|
|
|
|
break;
|
2005-08-12 03:25:56 +07:00
|
|
|
case -EAGAIN:
|
2018-09-04 10:39:27 +07:00
|
|
|
status = xs_nospace(req);
|
2005-08-12 03:25:56 +07:00
|
|
|
break;
|
|
|
|
case -ECONNRESET:
|
2008-10-29 02:21:39 +07:00
|
|
|
case -ECONNREFUSED:
|
2005-08-12 03:25:56 +07:00
|
|
|
case -ENOTCONN:
|
2015-02-09 09:44:04 +07:00
|
|
|
case -EADDRINUSE:
|
2015-07-03 20:32:23 +07:00
|
|
|
case -ENOBUFS:
|
2012-10-23 22:40:02 +07:00
|
|
|
case -EPIPE:
|
2016-01-06 20:57:06 +07:00
|
|
|
break;
|
|
|
|
default:
|
|
|
|
dprintk("RPC: sendmsg returned unrecognized error %d\n",
|
|
|
|
-status);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
2010-03-08 13:49:01 +07:00
|
|
|
|
2005-08-12 03:25:23 +07:00
|
|
|
return status;
|
|
|
|
}
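The retry decisions made inside the send loop of the function above can be summarised as a small pure function. This is a user-space sketch, not kernel code: `toy_after_send()` and `enum toy_action` are hypothetical names, and the `nospace` argument stands in for the SOCK_NOSPACE test on the socket flags.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_action { TOY_RETRY, TOY_STOP };

/*
 * Decide what the loop does after one send attempt that did not
 * complete the request. *status may be rewritten to -ENOBUFS.
 */
static enum toy_action toy_after_send(int *status, int sent,
				      bool nospace, bool *vm_wait)
{
	if (*status == -EAGAIN) {
		if (nospace)
			return TOY_STOP;        /* send buffer limit reached */
		if (sent == 0) {
			*status = -ENOBUFS;     /* probably an allocation failure */
			if (*vm_wait)
				return TOY_STOP; /* second failure in a row */
			*vm_wait = true;        /* allow exactly one retry */
		}
		return TOY_RETRY;
	}
	if (*status < 0)
		return TOY_STOP;                /* any other error ends the loop */
	*vm_wait = false;                       /* progress was made */
	return TOY_RETRY;
}
```

The `vm_wait` flag is what bounds the retries: one -EAGAIN with nothing sent is retried, a second consecutive one leaves the loop with -ENOBUFS.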
|
|
|
|
|
2008-10-29 02:21:39 +07:00
|
|
|
static void xs_save_old_callbacks(struct sock_xprt *transport, struct sock *sk)
|
|
|
|
{
|
|
|
|
transport->old_data_ready = sk->sk_data_ready;
|
|
|
|
transport->old_state_change = sk->sk_state_change;
|
|
|
|
transport->old_write_space = sk->sk_write_space;
|
2014-01-01 01:22:59 +07:00
|
|
|
transport->old_error_report = sk->sk_error_report;
|
2008-10-29 02:21:39 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *sk)
|
|
|
|
{
|
|
|
|
sk->sk_data_ready = transport->old_data_ready;
|
|
|
|
sk->sk_state_change = transport->old_state_change;
|
|
|
|
sk->sk_write_space = transport->old_write_space;
|
2014-01-01 01:22:59 +07:00
|
|
|
sk->sk_error_report = transport->old_error_report;
|
|
|
|
}
|
|
|
|
|
2016-05-23 20:24:55 +07:00
|
|
|
static void xs_sock_reset_state_flags(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
|
|
|
|
clear_bit(XPRT_SOCK_DATA_READY, &transport->sock_state);
|
|
|
|
}
|
|
|
|
|
2015-02-09 21:41:32 +07:00
|
|
|
static void xs_sock_reset_connection_flags(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
smp_mb__before_atomic();
|
|
|
|
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
|
|
|
|
clear_bit(XPRT_CLOSING, &xprt->state);
|
2016-05-23 20:24:55 +07:00
|
|
|
xs_sock_reset_state_flags(xprt);
|
2015-02-09 21:41:32 +07:00
|
|
|
smp_mb__after_atomic();
|
|
|
|
}
|
|
|
|
|
2014-01-01 01:22:59 +07:00
|
|
|
/**
|
|
|
|
* xs_error_report - callback to handle TCP socket state errors
|
|
|
|
* @sk: socket
|
|
|
|
*
|
|
|
|
* Note: we don't call sock_error() since there may be a rpc_task
|
|
|
|
* using the socket, and so we don't want to clear sk->sk_err.
|
|
|
|
*/
|
|
|
|
static void xs_error_report(struct sock *sk)
|
|
|
|
{
|
|
|
|
struct rpc_xprt *xprt;
|
|
|
|
int err;
|
|
|
|
|
|
|
|
read_lock_bh(&sk->sk_callback_lock);
|
|
|
|
if (!(xprt = xprt_from_sock(sk)))
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
err = -sk->sk_err;
|
|
|
|
if (err == 0)
|
|
|
|
goto out;
|
|
|
|
dprintk("RPC: xs_error_report client %p, error=%d...\n",
|
|
|
|
xprt, -err);
|
2014-01-01 01:39:22 +07:00
|
|
|
trace_rpc_socket_error(xprt, sk->sk_socket, err);
|
2014-01-01 01:22:59 +07:00
|
|
|
xprt_wake_pending_tasks(xprt, err);
|
|
|
|
out:
|
|
|
|
read_unlock_bh(&sk->sk_callback_lock);
|
2008-10-29 02:21:39 +07:00
|
|
|
}
|
|
|
|
|
2009-03-12 01:10:21 +07:00
|
|
|
static void xs_reset_transport(struct sock_xprt *transport)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2006-12-06 04:35:15 +07:00
|
|
|
struct socket *sock = transport->sock;
|
|
|
|
struct sock *sk = transport->inet;
|
2015-02-09 06:35:25 +07:00
|
|
|
struct rpc_xprt *xprt = &transport->xprt;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2009-03-12 01:10:21 +07:00
|
|
|
if (sk == NULL)
|
|
|
|
return;
|
2005-08-12 03:25:26 +07:00
|
|
|
|
2015-06-04 03:14:27 +07:00
|
|
|
if (atomic_read(&transport->xprt.swapper))
|
|
|
|
sk_clear_memalloc(sk);
|
|
|
|
|
2015-08-30 09:11:21 +07:00
|
|
|
kernel_sock_shutdown(sock, SHUT_RDWR);
|
|
|
|
|
2015-10-05 21:53:49 +07:00
|
|
|
mutex_lock(&transport->recv_mutex);
|
2005-08-12 03:25:23 +07:00
|
|
|
write_lock_bh(&sk->sk_callback_lock);
|
2006-12-06 04:35:15 +07:00
|
|
|
transport->inet = NULL;
|
|
|
|
transport->sock = NULL;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
sk->sk_user_data = NULL;
|
2008-10-29 02:21:39 +07:00
|
|
|
|
|
|
|
xs_restore_old_callbacks(transport, sk);
|
2015-08-30 03:36:30 +07:00
|
|
|
xprt_clear_connected(xprt);
|
2005-08-12 03:25:23 +07:00
|
|
|
write_unlock_bh(&sk->sk_callback_lock);
|
2015-02-09 06:35:25 +07:00
|
|
|
xs_sock_reset_connection_flags(xprt);
|
2015-10-05 21:53:49 +07:00
|
|
|
mutex_unlock(&transport->recv_mutex);
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2015-02-09 06:35:25 +07:00
|
|
|
trace_rpc_socket_close(xprt, sock);
|
2005-08-12 03:25:23 +07:00
|
|
|
sock_release(sock);
|
2009-03-12 01:10:21 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* xs_close - close a socket
|
|
|
|
* @xprt: transport
|
|
|
|
*
|
|
|
|
* This is used when all requests are complete; ie, no DRC state remains
|
|
|
|
* on the server we want to save.
|
2009-04-22 04:18:20 +07:00
|
|
|
*
|
|
|
|
* The caller _must_ be holding XPRT_LOCKED in order to avoid issues with
|
|
|
|
* xs_reset_transport() zeroing the socket from underneath a writer.
|
2009-03-12 01:10:21 +07:00
|
|
|
*/
|
|
|
|
static void xs_close(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
|
|
|
|
dprintk("RPC: xs_close xprt %p\n", xprt);
|
|
|
|
|
|
|
|
xs_reset_transport(transport);
|
NFS/RPC: fix problems with reestablish_timeout and related code.
[[resending with correct cc: - "vfs.kernel.org" just isn't right!]]
xprt->reestablish_timeout is used to cause TCP connection attempts to
back off if the connection fails so as not to hammer the network,
but to still allow immediate connections when there is no reason to
believe there is a problem.
It is not used for the first connection (when transport->sock is NULL)
but only on reconnects.
It is currently set:
a/ to 0 when xs_tcp_state_change finds a state of TCP_FIN_WAIT1
on the assumption that the client has closed the connection
so the reconnect should be immediate when needed.
b/ to at least XS_TCP_INIT_REEST_TO when xs_tcp_state_change
detects TCP_CLOSING or TCP_CLOSE_WAIT on the assumption that the
server closed the connection so a small delay at least is
required.
c/ as above when xs_tcp_state_change detects TCP_SYN_SENT, so that
it is never 0 while a connection has been attempted, else
the doubling will produce 0 and there will be no backoff.
d/ to double its value (up to a limit) when delaying a connection,
thus providing exponential backoff, and
e/ to XS_TCP_INIT_REEST_TO in xs_setup_tcp as simple initialisation.
So you can see it is highly dependent on xs_tcp_state_change being
called as expected. However, experimental evidence shows that
xs_tcp_state_change does not see all state changes.
("rpcdebug -m rpc trans" can help show what actually happens).
Results show:
TCP_ESTABLISHED is reported when a connection is made. TCP_SYN_SENT
is never reported, so rule 'c' above is never effective.
When the server closes the connection, TCP_CLOSE_WAIT and
TCP_LAST_ACK *might* be reported, and TCP_CLOSE is always
reported. Thus rule 'b' above will sometimes be effective, but
not reliably.
When the client closes the connection, it used to result in
TCP_FIN_WAIT1, TCP_FIN_WAIT2, TCP_CLOSE. However since commit
f75e674 (SUNRPC: Fix the problem of EADDRNOTAVAIL syslog floods on
reconnect) we don't see *any* events on client-close. I think this
is because xs_restore_old_callbacks is called to disconnect
xs_tcp_state_change before the socket is closed.
In any case, rule 'a' no longer applies.
So all that is left are rule d, which successfully doubles the
timeout which is never reset, and rule e which initialises the timeout.
Even if the rules worked as expected, there would be a problem because
a successful connection does not reset the timeout, so a sequence
of events where the server closes the connection (e.g. during failover
testing) will cause longer and longer timeouts with no good reason.
This patch:
- sets reestablish_timeout to 0 in xs_close thus effecting rule 'a'
- sets it to 0 in xs_tcp_data_ready to ensure that a successful
connection resets the timeout
- sets it to at least XS_TCP_INIT_REEST_TO after it is doubled,
thus effecting rule c
I have not reimplemented rule b and the new version of rule c
seems sufficient.
I suspect other code in xs_tcp_data_ready needs to be revised as well.
For example I don't think connect_cookie is being incremented as often
as it should be.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-09-24 01:36:37 +07:00
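The backoff behaviour this commit message aims for (double the timeout on each delayed reconnect, keep it at or above an initial value, cap it, and reset it to 0 on close or on received data) can be sketched as follows. The constants and the `struct backoff` helpers are hypothetical and use arbitrary units; the kernel's XS_TCP_INIT_REEST_TO is expressed in jiffies and its value is not taken from this page.

```c
#include <assert.h>

/* Hypothetical values in arbitrary time units. */
#define INIT_REEST_TO 3UL
#define MAX_REEST_TO  300UL

struct backoff {
	unsigned long reestablish_timeout;
};

/*
 * Return the delay to apply to this connection attempt, then double
 * the stored timeout, clamped to [INIT_REEST_TO, MAX_REEST_TO].
 */
static unsigned long backoff_next_delay(struct backoff *b)
{
	unsigned long delay = b->reestablish_timeout;

	b->reestablish_timeout <<= 1;
	if (b->reestablish_timeout < INIT_REEST_TO)
		b->reestablish_timeout = INIT_REEST_TO; /* doubling 0 stays 0 */
	if (b->reestablish_timeout > MAX_REEST_TO)
		b->reestablish_timeout = MAX_REEST_TO;
	return delay;
}

/* Called on a clean close or when data arrives: reconnect immediately. */
static void backoff_reset(struct backoff *b)
{
	b->reestablish_timeout = 0;
}
```

The lower clamp is the point of the "at least XS_TCP_INIT_REEST_TO after it is doubled" rule: without it, a timeout of 0 would double to 0 forever and there would be no backoff at all.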
|
|
|
xprt->reestablish_timeout = 0;
|
2009-03-12 01:10:21 +07:00
|
|
|
|
2007-11-07 06:44:20 +07:00
|
|
|
xprt_disconnect_done(xprt);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
2015-05-12 01:02:25 +07:00
|
|
|
static void xs_inject_disconnect(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
dprintk("RPC: injecting transport disconnect on xprt=%p\n",
|
|
|
|
xprt);
|
|
|
|
xprt_disconnect_done(xprt);
|
|
|
|
}
|
|
|
|
|
2014-03-24 10:07:22 +07:00
|
|
|
static void xs_xprt_free(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
xs_free_peer_addresses(xprt);
|
|
|
|
xprt_free(xprt);
|
|
|
|
}
|
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
/**
|
|
|
|
* xs_destroy - prepare to shutdown a transport
|
|
|
|
* @xprt: doomed transport
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static void xs_destroy(struct rpc_xprt *xprt)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2015-09-17 21:42:27 +07:00
|
|
|
struct sock_xprt *transport = container_of(xprt,
|
|
|
|
struct sock_xprt, xprt);
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: xs_destroy xprt %p\n", xprt);
|
2005-08-12 03:25:26 +07:00
|
|
|
|
2015-09-17 21:42:27 +07:00
|
|
|
cancel_delayed_work_sync(&transport->connect_worker);
|
2013-10-31 20:18:49 +07:00
|
|
|
xs_close(xprt);
|
2015-10-05 21:53:49 +07:00
|
|
|
cancel_work_sync(&transport->recv_worker);
|
2014-03-24 10:07:22 +07:00
|
|
|
xs_xprt_free(xprt);
|
2013-10-31 20:18:49 +07:00
|
|
|
module_put(THIS_MODULE);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
/**
|
2015-10-07 03:26:05 +07:00
|
|
|
* xs_udp_data_read_skb - receive callback for UDP sockets
|
|
|
|
* @xprt: transport
|
|
|
|
* @sk: socket
|
|
|
|
* @skb: skbuff
|
2005-08-12 03:25:26 +07:00
|
|
|
*
|
2005-08-12 03:25:23 +07:00
|
|
|
*/
|
2015-10-07 03:26:05 +07:00
|
|
|
static void xs_udp_data_read_skb(struct rpc_xprt *xprt,
|
|
|
|
struct sock *sk,
|
|
|
|
struct sk_buff *skb)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2005-08-12 03:25:26 +07:00
|
|
|
struct rpc_task *task;
|
2005-08-12 03:25:23 +07:00
|
|
|
struct rpc_rqst *rovr;
|
2015-10-07 03:26:05 +07:00
|
|
|
int repsize, copied;
|
2006-09-27 12:29:38 +07:00
|
|
|
u32 _xid;
|
|
|
|
__be32 *xp;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2016-04-07 22:44:58 +07:00
|
|
|
repsize = skb->len;
|
2005-08-12 03:25:23 +07:00
|
|
|
if (repsize < 4) {
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: impossible RPC reply size %d!\n", repsize);
|
2015-10-07 03:26:05 +07:00
|
|
|
return;
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Copy the XID from the skb... */
|
2016-04-07 22:44:58 +07:00
|
|
|
xp = skb_header_pointer(skb, 0, sizeof(_xid), &_xid);
|
2005-08-12 03:25:23 +07:00
|
|
|
if (xp == NULL)
|
2015-10-07 03:26:05 +07:00
|
|
|
return;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
|
|
|
/* Look up and lock the request corresponding to the given XID */
|
2018-08-31 21:21:00 +07:00
|
|
|
spin_lock(&xprt->queue_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
rovr = xprt_lookup_rqst(xprt, *xp);
|
|
|
|
if (!rovr)
|
|
|
|
goto out_unlock;
|
2017-08-13 21:03:59 +07:00
|
|
|
xprt_pin_rqst(rovr);
|
2018-03-06 03:12:57 +07:00
|
|
|
xprt_update_rtt(rovr->rq_task);
|
2018-08-31 21:21:00 +07:00
|
|
|
spin_unlock(&xprt->queue_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
task = rovr->rq_task;
|
|
|
|
|
|
|
|
if ((copied = rovr->rq_private_buf.buflen) > repsize)
|
|
|
|
copied = repsize;
|
|
|
|
|
|
|
|
/* Suck it into the iovec, verify checksum if not done by hw. */
|
2007-12-12 02:30:32 +07:00
|
|
|
if (csum_partial_copy_to_xdr(&rovr->rq_private_buf, skb)) {
|
2018-08-31 21:21:00 +07:00
|
|
|
spin_lock(&xprt->queue_lock);
|
2018-02-09 21:39:42 +07:00
|
|
|
__UDPX_INC_STATS(sk, UDP_MIB_INERRORS);
|
2017-08-13 21:03:59 +07:00
|
|
|
goto out_unpin;
|
2007-12-12 02:30:32 +07:00
|
|
|
}
|
|
|
|
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2017-08-13 21:03:59 +07:00
|
|
|
spin_lock_bh(&xprt->transport_lock);
|
2013-01-08 21:48:15 +07:00
|
|
|
xprt_adjust_cwnd(xprt, task, copied);
|
2017-08-17 02:30:35 +07:00
|
|
|
spin_unlock_bh(&xprt->transport_lock);
|
2018-08-31 21:21:00 +07:00
|
|
|
spin_lock(&xprt->queue_lock);
|
2005-08-26 06:25:52 +07:00
|
|
|
xprt_complete_rqst(task, copied);
|
2018-02-09 21:39:42 +07:00
|
|
|
__UDPX_INC_STATS(sk, UDP_MIB_INDATAGRAMS);
|
2017-08-13 21:03:59 +07:00
|
|
|
out_unpin:
|
|
|
|
xprt_unpin_rqst(rovr);
|
2005-08-12 03:25:23 +07:00
|
|
|
out_unlock:
|
2018-08-31 21:21:00 +07:00
|
|
|
spin_unlock(&xprt->queue_lock);
|
2015-10-07 03:26:05 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static void xs_udp_data_receive(struct sock_xprt *transport)
|
|
|
|
{
|
|
|
|
struct sk_buff *skb;
|
|
|
|
struct sock *sk;
|
|
|
|
int err;
|
|
|
|
|
|
|
|
mutex_lock(&transport->recv_mutex);
|
|
|
|
sk = transport->inet;
|
|
|
|
if (sk == NULL)
|
|
|
|
goto out;
|
2018-09-15 04:45:23 +07:00
|
|
|
clear_bit(XPRT_SOCK_DATA_READY, &transport->sock_state);
|
2015-10-07 03:26:05 +07:00
|
|
|
for (;;) {
|
2016-11-04 17:28:59 +07:00
|
|
|
skb = skb_recv_udp(sk, 0, 1, &err);
|
2018-09-15 04:45:23 +07:00
|
|
|
if (skb == NULL)
|
2015-10-07 03:26:05 +07:00
|
|
|
break;
|
2018-09-15 04:45:23 +07:00
|
|
|
xs_udp_data_read_skb(&transport->xprt, sk, skb);
|
|
|
|
consume_skb(skb);
|
|
|
|
cond_resched();
|
2015-10-07 03:26:05 +07:00
|
|
|
}
|
|
|
|
out:
|
|
|
|
mutex_unlock(&transport->recv_mutex);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void xs_udp_data_receive_workfn(struct work_struct *work)
|
|
|
|
{
|
|
|
|
struct sock_xprt *transport =
|
|
|
|
container_of(work, struct sock_xprt, recv_worker);
|
|
|
|
xs_udp_data_receive(transport);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* xs_data_ready - "data ready" callback for UDP sockets
|
|
|
|
* @sk: socket with data to read
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static void xs_data_ready(struct sock *sk)
|
|
|
|
{
|
|
|
|
struct rpc_xprt *xprt;
|
|
|
|
|
|
|
|
read_lock_bh(&sk->sk_callback_lock);
|
|
|
|
dprintk("RPC: xs_data_ready...\n");
|
|
|
|
xprt = xprt_from_sock(sk);
|
|
|
|
if (xprt != NULL) {
|
|
|
|
struct sock_xprt *transport = container_of(xprt,
|
|
|
|
struct sock_xprt, xprt);
|
2016-05-29 21:13:24 +07:00
|
|
|
transport->old_data_ready(sk);
|
|
|
|
/* Any data means we had a useful conversation, so
|
|
|
|
* then we don't need to delay the next reconnect
|
|
|
|
*/
|
|
|
|
if (xprt->reestablish_timeout)
|
|
|
|
xprt->reestablish_timeout = 0;
|
2016-05-23 20:24:55 +07:00
|
|
|
if (!test_and_set_bit(XPRT_SOCK_DATA_READY, &transport->sock_state))
|
2016-05-27 21:39:50 +07:00
|
|
|
queue_work(xprtiod_workqueue, &transport->recv_worker);
|
2015-10-07 03:26:05 +07:00
|
|
|
}
|
2010-09-22 19:43:39 +07:00
|
|
|
read_unlock_bh(&sk->sk_callback_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
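The "queue the receive worker only once" pattern used by the callback above (test-and-set a DATA_READY bit, queue work only if it was previously clear, and clear the bit in the receive path) can be sketched in user space with C11 atomics. All names here are hypothetical, and `queued` is just a counter standing in for the call to queue_work().

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_DATA_READY 0x1u

struct toy_transport {
	atomic_uint sock_state;  /* bit flags, like transport->sock_state */
	int queued;              /* how many times work was queued */
};

static void toy_transport_init(struct toy_transport *t)
{
	atomic_init(&t->sock_state, 0u);
	t->queued = 0;
}

/* Data-ready callback: queue work only on the 0 -> 1 transition. */
static void toy_data_ready(struct toy_transport *t)
{
	unsigned int old = atomic_fetch_or(&t->sock_state, TOY_DATA_READY);

	if (!(old & TOY_DATA_READY))
		t->queued++;
}

/* Receive worker: clear the flag before draining the socket. */
static void toy_receive(struct toy_transport *t)
{
	atomic_fetch_and(&t->sock_state, ~TOY_DATA_READY);
}
```

Clearing the flag before draining means a callback that fires during the drain queues the worker again, so no wakeup is lost.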
|
|
|
|
|
2012-09-13 03:49:15 +07:00
|
|
|
/*
|
|
|
|
* Helper function to force a TCP close if the server is sending
|
|
|
|
* junk and/or it has put us in CLOSE_WAIT
|
|
|
|
*/
|
|
|
|
static void xs_tcp_force_close(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
xprt_force_disconnect(xprt);
|
|
|
|
}
|
|
|
|
|
2011-07-14 06:20:49 +07:00
|
|
|
#if defined(CONFIG_SUNRPC_BACKCHANNEL)
|
2015-10-25 04:28:32 +07:00
|
|
|
static int xs_tcp_bc_up(struct svc_serv *serv, struct net *net)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = svc_create_xprt(serv, "tcp-bc", net, PF_INET, 0,
|
|
|
|
SVC_SOCK_ANONYMOUS);
|
|
|
|
if (ret < 0)
|
|
|
|
return ret;
|
|
|
|
return 0;
|
|
|
|
}
|
2016-05-03 01:40:40 +07:00
|
|
|
|
|
|
|
static size_t xs_tcp_bc_maxpayload(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
return PAGE_SIZE;
|
|
|
|
}
|
2011-07-14 06:20:49 +07:00
|
|
|
#endif /* CONFIG_SUNRPC_BACKCHANNEL */
|
2009-04-01 20:23:02 +07:00
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
/**
|
|
|
|
* xs_tcp_state_change - callback to handle TCP socket state changes
|
|
|
|
* @sk: socket whose state has changed
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static void xs_tcp_state_change(struct sock *sk)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2005-08-12 03:25:26 +07:00
|
|
|
struct rpc_xprt *xprt;
|
2015-09-17 10:43:17 +07:00
|
|
|
struct sock_xprt *transport;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2010-09-22 19:43:39 +07:00
|
|
|
read_lock_bh(&sk->sk_callback_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
if (!(xprt = xprt_from_sock(sk)))
|
|
|
|
goto out;
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: xs_tcp_state_change client %p...\n", xprt);
|
2010-08-10 21:19:53 +07:00
|
|
|
dprintk("RPC: state %x conn %d dead %d zapped %d sk_shutdown %d\n",
|
2007-02-01 00:14:08 +07:00
|
|
|
sk->sk_state, xprt_connected(xprt),
|
|
|
|
sock_flag(sk, SOCK_DEAD),
|
2010-08-10 21:19:53 +07:00
|
|
|
sock_flag(sk, SOCK_ZAPPED),
|
|
|
|
sk->sk_shutdown);
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2015-09-17 10:43:17 +07:00
|
|
|
transport = container_of(xprt, struct sock_xprt, xprt);
|
2013-09-04 23:16:23 +07:00
|
|
|
trace_rpc_socket_state_change(xprt, sk->sk_socket);
|
2005-08-12 03:25:23 +07:00
|
|
|
switch (sk->sk_state) {
|
|
|
|
case TCP_ESTABLISHED:
|
2010-09-22 19:43:39 +07:00
|
|
|
spin_lock(&xprt->transport_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
if (!xprt_test_and_set_connected(xprt)) {
|
2013-09-26 21:18:04 +07:00
|
|
|
xprt->connect_cookie++;
|
2015-09-17 10:43:17 +07:00
|
|
|
clear_bit(XPRT_SOCK_CONNECTING, &transport->sock_state);
|
|
|
|
xprt_clear_connecting(xprt);
|
2006-12-06 04:35:19 +07:00
|
|
|
|
2018-10-02 01:25:36 +07:00
|
|
|
xprt->stat.connect_count++;
|
|
|
|
xprt->stat.connect_time += (long)jiffies -
|
|
|
|
xprt->stat.connect_start;
|
2009-03-12 01:38:00 +07:00
|
|
|
xprt_wake_pending_tasks(xprt, -EAGAIN);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
2010-09-22 19:43:39 +07:00
|
|
|
spin_unlock(&xprt->transport_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
break;
|
2007-11-06 05:42:39 +07:00
|
|
|
case TCP_FIN_WAIT1:
|
|
|
|
/* The client initiated a shutdown of the socket */
|
2008-04-18 03:52:57 +07:00
|
|
|
xprt->connect_cookie++;
|
2008-01-02 06:42:12 +07:00
|
|
|
xprt->reestablish_timeout = 0;
|
2007-11-06 05:42:39 +07:00
|
|
|
set_bit(XPRT_CLOSING, &xprt->state);
|
2014-03-18 00:06:10 +07:00
|
|
|
smp_mb__before_atomic();
|
2007-11-06 05:42:39 +07:00
|
|
|
clear_bit(XPRT_CONNECTED, &xprt->state);
|
2008-01-01 04:19:17 +07:00
|
|
|
clear_bit(XPRT_CLOSE_WAIT, &xprt->state);
|
2014-03-18 00:06:10 +07:00
|
|
|
smp_mb__after_atomic();
|
2005-08-12 03:25:23 +07:00
|
|
|
break;
|
2006-01-03 15:55:55 +07:00
|
|
|
case TCP_CLOSE_WAIT:
|
2007-11-06 05:42:39 +07:00
|
|
|
/* The server initiated a shutdown of the socket */
|
2008-04-18 03:52:57 +07:00
|
|
|
xprt->connect_cookie++;
|
2012-10-23 22:35:47 +07:00
|
|
|
clear_bit(XPRT_CONNECTED, &xprt->state);
|
2012-09-13 03:49:15 +07:00
|
|
|
xs_tcp_force_close(xprt);
|
2017-10-20 23:48:30 +07:00
|
|
|
/* fall through */
|
2008-01-02 06:42:12 +07:00
|
|
|
case TCP_CLOSING:
|
|
|
|
/*
|
|
|
|
* If the server closed down the connection, make sure that
|
|
|
|
* we back off before reconnecting
|
|
|
|
*/
|
|
|
|
if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO)
|
|
|
|
xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
|
2007-11-06 05:42:39 +07:00
|
|
|
break;
|
|
|
|
case TCP_LAST_ACK:
|
2009-03-12 01:37:58 +07:00
|
|
|
set_bit(XPRT_CLOSING, &xprt->state);
|
2014-03-18 00:06:10 +07:00
|
|
|
smp_mb__before_atomic();
|
2007-11-06 05:42:39 +07:00
|
|
|
clear_bit(XPRT_CONNECTED, &xprt->state);
|
2014-03-18 00:06:10 +07:00
|
|
|
smp_mb__after_atomic();
|
2007-11-06 05:42:39 +07:00
|
|
|
break;
|
|
|
|
case TCP_CLOSE:
|
2015-09-17 10:43:17 +07:00
|
|
|
if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
|
|
|
|
&transport->sock_state))
|
|
|
|
xprt_clear_connecting(xprt);
|
2018-02-05 22:20:06 +07:00
|
|
|
clear_bit(XPRT_CLOSING, &xprt->state);
|
2017-07-19 11:05:01 +07:00
|
|
|
if (sk->sk_err)
|
|
|
|
xprt_wake_pending_tasks(xprt, -sk->sk_err);
|
2018-02-05 22:20:06 +07:00
|
|
|
/* Trigger the socket release */
|
|
|
|
xs_tcp_force_close(xprt);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
out:
|
2010-09-22 19:43:39 +07:00
|
|
|
read_unlock_bh(&sk->sk_callback_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
net/sunrpc/xprtsock.c: some common code found
$ diff-funcs xs_udp_write_space net/sunrpc/xprtsock.c
net/sunrpc/xprtsock.c xs_tcp_write_space
--- net/sunrpc/xprtsock.c:xs_udp_write_space()
+++ net/sunrpc/xprtsock.c:xs_tcp_write_space()
@@ -1,4 +1,4 @@
- * xs_udp_write_space - callback invoked when socket buffer space
+ * xs_tcp_write_space - callback invoked when socket buffer space
* becomes available
* @sk: socket whose state has changed
*
@@ -7,12 +7,12 @@
* progress, otherwise we'll waste resources thrashing kernel_sendmsg
* with a bunch of small requests.
*/
-static void xs_udp_write_space(struct sock *sk)
+static void xs_tcp_write_space(struct sock *sk)
{
read_lock(&sk->sk_callback_lock);
- /* from net/core/sock.c:sock_def_write_space */
- if (sock_writeable(sk)) {
+ /* from net/core/stream.c:sk_stream_write_space */
+ if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) {
struct socket *sock;
struct rpc_xprt *xprt;
$ codiff net/sunrpc/xprtsock.o net/sunrpc/xprtsock.o.new
net/sunrpc/xprtsock.c:
xs_tcp_write_space | -163
xs_udp_write_space | -163
2 functions changed, 326 bytes removed
net/sunrpc/xprtsock.c:
xs_write_space | +179
1 function changed, 179 bytes added
net/sunrpc/xprtsock.o.new:
3 functions changed, 179 bytes added, 326 bytes removed, diff: -147
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-07 14:48:33 +07:00
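The refactoring described in the commit message above keeps two protocol-specific "is there room to write?" checks and moves the shared body into one helper. A simplified user-space model is shown below. `struct toy_sock` and all `toy_*` functions are hypothetical; the two predicates are modelled loosely on sock_writeable() and on the sk_stream_wspace() >= sk_stream_min_wspace() comparison quoted in the diff, and the field arithmetic is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the socket fields the two checks read. */
struct toy_sock {
	int sndbuf;       /* send buffer limit */
	int wmem_alloc;   /* bytes committed to in-flight datagrams */
	int wmem_queued;  /* bytes queued on a stream socket */
	int wakeups;      /* counts calls to the shared helper */
};

/* Shared body, playing the role of xs_write_space(). */
static void toy_write_space(struct toy_sock *sk)
{
	sk->wakeups++;
}

/* Datagram check: less than half of the send buffer is in use. */
static bool toy_dgram_writeable(const struct toy_sock *sk)
{
	return sk->wmem_alloc < (sk->sndbuf >> 1);
}

/* Stream check: free space is at least half of what is queued. */
static bool toy_stream_writeable(const struct toy_sock *sk)
{
	return (sk->sndbuf - sk->wmem_queued) >= (sk->wmem_queued >> 1);
}

static void toy_udp_write_space(struct toy_sock *sk)
{
	if (toy_dgram_writeable(sk))
		toy_write_space(sk);
}

static void toy_tcp_write_space(struct toy_sock *sk)
{
	if (toy_stream_writeable(sk))
		toy_write_space(sk);
}
```

Only the predicate differs between the two callbacks, which is why the size report in the commit message shows two functions shrinking and one shared function being added.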
|
|
|
static void xs_write_space(struct sock *sk)
|
|
|
|
{
|
2016-01-06 20:57:06 +07:00
|
|
|
struct socket_wq *wq;
|
2009-02-07 14:48:33 +07:00
|
|
|
struct rpc_xprt *xprt;
|
|
|
|
|
2016-01-06 20:57:06 +07:00
|
|
|
if (!sk->sk_socket)
|
2009-02-07 14:48:33 +07:00
|
|
|
return;
|
2016-01-06 20:57:06 +07:00
|
|
|
clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
|
2009-02-07 14:48:33 +07:00
|
|
|
|
|
|
|
if (unlikely(!(xprt = xprt_from_sock(sk))))
|
|
|
|
return;
|
2016-01-06 20:57:06 +07:00
|
|
|
rcu_read_lock();
|
|
|
|
wq = rcu_dereference(sk->sk_wq);
|
|
|
|
if (!wq || test_and_clear_bit(SOCKWQ_ASYNC_NOSPACE, &wq->flags) == 0)
|
|
|
|
goto out;
|
2009-02-07 14:48:33 +07:00
|
|
|
|
2018-09-04 10:39:27 +07:00
|
|
|
if (xprt_write_space(xprt))
|
|
|
|
sk->sk_write_pending--;
|
2016-01-06 20:57:06 +07:00
|
|
|
out:
|
|
|
|
rcu_read_unlock();
|
2009-02-07 14:48:33 +07:00
|
|
|
}
|
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
/**
|
2005-08-12 03:25:50 +07:00
|
|
|
* xs_udp_write_space - callback invoked when socket buffer space
|
|
|
|
* becomes available
|
2005-08-12 03:25:26 +07:00
|
|
|
* @sk: socket whose state has changed
|
|
|
|
*
|
2005-08-12 03:25:23 +07:00
|
|
|
* Called when more output buffer space is available for this socket.
|
|
|
|
* We try not to wake our writers until they can make "significant"
|
2005-08-12 03:25:50 +07:00
|
|
|
* progress, otherwise we'll waste resources thrashing kernel_sendmsg
|
2005-08-12 03:25:23 +07:00
|
|
|
* with a bunch of small requests.
|
|
|
|
*/
|
2005-08-12 03:25:50 +07:00
|
|
|
static void xs_udp_write_space(struct sock *sk)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2010-09-22 19:43:39 +07:00
|
|
|
read_lock_bh(&sk->sk_callback_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2005-08-12 03:25:50 +07:00
|
|
|
/* from net/core/sock.c:sock_def_write_space */
|
2009-02-07 14:48:33 +07:00
|
|
|
if (sock_writeable(sk))
|
|
|
|
xs_write_space(sk);
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2010-09-22 19:43:39 +07:00
|
|
|
read_unlock_bh(&sk->sk_callback_lock);
|
2005-08-12 03:25:50 +07:00
|
|
|
}
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2005-08-12 03:25:50 +07:00
|
|
|
/**
|
|
|
|
* xs_tcp_write_space - callback invoked when socket buffer space
|
|
|
|
* becomes available
|
|
|
|
* @sk: socket whose state has changed
|
|
|
|
*
|
|
|
|
* Called when more output buffer space is available for this socket.
|
|
|
|
* We try not to wake our writers until they can make "significant"
|
|
|
|
* progress, otherwise we'll waste resources thrashing kernel_sendmsg
|
|
|
|
* with a bunch of small requests.
|
|
|
|
*/
|
|
|
|
static void xs_tcp_write_space(struct sock *sk)
|
|
|
|
{
|
2010-09-22 19:43:39 +07:00
|
|
|
read_lock_bh(&sk->sk_callback_lock);
|
2005-08-12 03:25:50 +07:00
|
|
|
|
|
|
|
/* from net/core/stream.c:sk_stream_write_space */
|
2013-07-23 10:26:31 +07:00
|
|
|
if (sk_stream_is_writeable(sk))
|
2009-02-07 14:48:33 +07:00
|
|
|
xs_write_space(sk);
|
2008-04-18 05:52:19 +07:00
|
|
|
|
2010-09-22 19:43:39 +07:00
|
|
|
read_unlock_bh(&sk->sk_callback_lock);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
2005-08-26 06:25:56 +07:00
|
|
|
static void xs_udp_do_set_buffer_size(struct rpc_xprt *xprt)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2006-12-06 04:35:15 +07:00
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
struct sock *sk = transport->inet;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2006-12-06 04:35:30 +07:00
|
|
|
if (transport->rcvsize) {
|
2005-08-12 03:25:23 +07:00
|
|
|
sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
|
2006-12-06 04:35:30 +07:00
|
|
|
sk->sk_rcvbuf = transport->rcvsize * xprt->max_reqs * 2;
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
2006-12-06 04:35:30 +07:00
|
|
|
if (transport->sndsize) {
|
2005-08-12 03:25:23 +07:00
|
|
|
sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
|
2006-12-06 04:35:30 +07:00
|
|
|
sk->sk_sndbuf = transport->sndsize * xprt->max_reqs * 2;
|
2005-08-12 03:25:23 +07:00
|
|
|
sk->sk_write_space(sk);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2005-08-26 06:25:49 +07:00
|
|
|
/**
|
2005-08-26 06:25:56 +07:00
|
|
|
* xs_udp_set_buffer_size - set send and receive limits
|
2005-08-26 06:25:49 +07:00
|
|
|
* @xprt: generic transport
|
2005-08-26 06:25:56 +07:00
|
|
|
* @sndsize: requested size of send buffer, in bytes
|
|
|
|
* @rcvsize: requested size of receive buffer, in bytes
|
2005-08-26 06:25:49 +07:00
|
|
|
*
|
2005-08-26 06:25:56 +07:00
|
|
|
* Set socket send and receive buffer size limits.
|
2005-08-26 06:25:49 +07:00
|
|
|
*/
|
2005-08-26 06:25:56 +07:00
|
|
|
static void xs_udp_set_buffer_size(struct rpc_xprt *xprt, size_t sndsize, size_t rcvsize)
|
2005-08-26 06:25:49 +07:00
|
|
|
{
|
2006-12-06 04:35:30 +07:00
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
|
|
|
|
transport->sndsize = 0;
|
2005-08-26 06:25:56 +07:00
|
|
|
if (sndsize)
|
2006-12-06 04:35:30 +07:00
|
|
|
transport->sndsize = sndsize + 1024;
|
|
|
|
transport->rcvsize = 0;
|
2005-08-26 06:25:56 +07:00
|
|
|
if (rcvsize)
|
2006-12-06 04:35:30 +07:00
|
|
|
transport->rcvsize = rcvsize + 1024;
|
2005-08-26 06:25:56 +07:00
|
|
|
|
|
|
|
xs_udp_do_set_buffer_size(xprt);
|
2005-08-26 06:25:49 +07:00
|
|
|
}
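The sizing arithmetic spread across xs_udp_set_buffer_size() and xs_udp_do_set_buffer_size() above can be condensed into one helper. This is only a userspace sketch: the name udp_buffer_limit() is hypothetical, and returning 0 to mean "leave the socket default alone" is an assumption that mirrors the `if (transport->sndsize)` and `if (transport->rcvsize)` guards.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: combines the "+ 1024" step from
 * xs_udp_set_buffer_size() with the "* max_reqs * 2" step from
 * xs_udp_do_set_buffer_size(). A request of 0 yields 0, which the
 * caller is assumed to treat as "do not touch the socket limit".
 */
static size_t udp_buffer_limit(size_t requested, unsigned int max_reqs)
{
	size_t per_request;

	assert(max_reqs > 0);
	if (requested == 0)
		return 0;
	per_request = requested + 1024;    /* fixed 1024-byte allowance */
	return per_request * max_reqs * 2; /* doubled, for every request slot */
}
```

For example, a 32768-byte request with 16 request slots gives (32768 + 1024) * 16 * 2 bytes.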
|
|
|
|
|
2005-08-26 06:25:52 +07:00
|
|
|
/**
|
|
|
|
* xs_udp_timer - called when a retransmit timeout occurs on a UDP transport
|
|
|
|
* @xprt: controlling transport
* @task: task that timed out
|
|
|
|
*
|
|
|
|
* Adjust the congestion window after a retransmit timeout has occurred.
|
|
|
|
*/
|
2013-01-08 21:48:15 +07:00
|
|
|
static void xs_udp_timer(struct rpc_xprt *xprt, struct rpc_task *task)
|
2005-08-26 06:25:52 +07:00
|
|
|
{
|
2017-02-09 05:00:51 +07:00
|
|
|
spin_lock_bh(&xprt->transport_lock);
|
2013-01-08 21:48:15 +07:00
|
|
|
xprt_adjust_cwnd(xprt, task, -ETIMEDOUT);
|
2017-02-09 05:00:51 +07:00
|
|
|
spin_unlock_bh(&xprt->transport_lock);
|
2005-08-26 06:25:52 +07:00
|
|
|
}
|
|
|
|
|
2018-10-19 02:27:02 +07:00
|
|
|
static int xs_get_random_port(void)
|
2006-05-25 12:40:49 +07:00
|
|
|
{
|
2018-10-19 02:27:02 +07:00
|
|
|
unsigned short min = xprt_min_resvport, max = xprt_max_resvport;
|
|
|
|
unsigned short range;
|
|
|
|
unsigned short rand;
|
|
|
|
|
|
|
|
if (max < min)
|
|
|
|
return -EADDRINUSE;
|
|
|
|
range = max - min + 1;
|
|
|
|
rand = (unsigned short) prandom_u32() % range;
|
|
|
|
return rand + min;
|
2006-05-25 12:40:49 +07:00
|
|
|
}
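The range computation in xs_get_random_port() can be tried outside the kernel. The following is a minimal sketch, assuming rand() as a stand-in for prandom_u32(); the function name pick_port_in_range() is hypothetical. Unlike the original, it keeps the range in an unsigned int so that a full 0..65535 span cannot wrap to zero; that is a choice made for this sketch, not a statement about the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for xs_get_random_port(): returns a
 * port in [min, max], or -EADDRINUSE when the range is empty.
 */
static int pick_port_in_range(unsigned short min, unsigned short max)
{
	unsigned int range;

	if (max < min)
		return -EADDRINUSE;
	range = (unsigned int)max - min + 1; /* 1..65536, never 0 */
	assert(range != 0);
	/* rand() replaces prandom_u32() purely for illustration */
	return (int)((unsigned int)rand() % range) + min;
}
```

A caller has to check for a negative result before using the value as a port, which is what the `if (port <= 0)` test in xs_bind() does.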
|
|
|
|
|
2015-02-09 03:00:06 +07:00
|
|
|
/**
|
|
|
|
* xs_sock_set_reuseport - set the socket's port and address reuse options
|
|
|
|
* @sock: socket
|
|
|
|
*
|
|
|
|
* Note that this function has to be called on all sockets that share the
|
|
|
|
* same port, and it must be called before binding.
|
|
|
|
*/
|
|
|
|
static void xs_sock_set_reuseport(struct socket *sock)
|
|
|
|
{
|
2015-02-10 05:20:14 +07:00
|
|
|
int opt = 1;
|
2015-02-09 03:00:06 +07:00
|
|
|
|
2015-02-10 05:20:14 +07:00
|
|
|
kernel_setsockopt(sock, SOL_SOCKET, SO_REUSEPORT,
|
|
|
|
(char *)&opt, sizeof(opt));
|
2015-02-09 03:00:06 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static unsigned short xs_sock_getport(struct socket *sock)
|
|
|
|
{
|
|
|
|
struct sockaddr_storage buf;
|
|
|
|
unsigned short port = 0;
|
|
|
|
|
2018-02-13 02:00:20 +07:00
|
|
|
if (kernel_getsockname(sock, (struct sockaddr *)&buf) < 0)
|
2015-02-09 03:00:06 +07:00
|
|
|
goto out;
|
|
|
|
switch (buf.ss_family) {
|
|
|
|
case AF_INET6:
|
|
|
|
port = ntohs(((struct sockaddr_in6 *)&buf)->sin6_port);
|
|
|
|
break;
|
|
|
|
case AF_INET:
|
|
|
|
port = ntohs(((struct sockaddr_in *)&buf)->sin_port);
|
|
|
|
}
|
|
|
|
out:
|
|
|
|
return port;
|
|
|
|
}
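The address-family switch in xs_sock_getport() has a direct userspace analogue. The sketch below assumes a POSIX system; port_from_sockaddr() is a hypothetical name, and it takes an already filled-in sockaddr_storage instead of calling getsockname() itself.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: extracts the port, in host byte order, from a
 * generic socket address. Unknown families yield 0, as in the original.
 */
static unsigned short port_from_sockaddr(const struct sockaddr_storage *ss)
{
	assert(ss != NULL);
	switch (ss->ss_family) {
	case AF_INET6:
		return ntohs(((const struct sockaddr_in6 *)ss)->sin6_port);
	case AF_INET:
		return ntohs(((const struct sockaddr_in *)ss)->sin_port);
	default:
		return 0;
	}
}
```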
|
|
|
|
|
2006-01-03 15:55:51 +07:00
|
|
|
/**
|
|
|
|
* xs_set_port - reset the port number in the remote endpoint address
|
|
|
|
* @xprt: generic transport
|
|
|
|
* @port: new port number
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static void xs_set_port(struct rpc_xprt *xprt, unsigned short port)
|
|
|
|
{
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: setting port for xprt %p to %u\n", xprt, port);
|
2006-08-23 07:06:19 +07:00
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
rpc_set_port(xs_addr(xprt), port);
|
|
|
|
xs_update_peer_port(xprt);
|
2006-01-03 15:55:51 +07:00
|
|
|
}
|
|
|
|
|
2015-02-09 03:00:06 +07:00
|
|
|
static void xs_set_srcport(struct sock_xprt *transport, struct socket *sock)
|
|
|
|
{
|
|
|
|
if (transport->srcport == 0)
|
|
|
|
transport->srcport = xs_sock_getport(sock);
|
|
|
|
}
|
|
|
|
|
2018-10-19 02:27:02 +07:00
|
|
|
static int xs_get_srcport(struct sock_xprt *transport)
|
2007-11-06 05:40:58 +07:00
|
|
|
{
|
2018-10-19 02:27:02 +07:00
|
|
|
int port = transport->srcport;
|
2007-11-06 05:40:58 +07:00
|
|
|
|
|
|
|
if (port == 0 && transport->xprt.resvport)
|
|
|
|
port = xs_get_random_port();
|
|
|
|
return port;
|
|
|
|
}
|
|
|
|
|
2010-10-04 19:51:56 +07:00
|
|
|
static unsigned short xs_next_srcport(struct sock_xprt *transport, unsigned short port)
|
2007-11-06 05:40:58 +07:00
|
|
|
{
|
2009-08-10 02:09:46 +07:00
|
|
|
if (transport->srcport != 0)
|
|
|
|
transport->srcport = 0;
|
2007-11-06 05:40:58 +07:00
|
|
|
if (!transport->xprt.resvport)
|
|
|
|
return 0;
|
|
|
|
if (port <= xprt_min_resvport || port > xprt_max_resvport)
|
|
|
|
return xprt_max_resvport;
|
|
|
|
return --port;
|
|
|
|
}
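The stepping rule in xs_next_srcport() counts down through the reserved range and wraps to the top, and xs_bind() stops after the second wrap via its nloop counter. The sketch below reproduces only that control flow in userspace, pretending every bind attempt fails with EADDRINUSE. The bounds 665 and 1023 are assumed stand-ins for xprt_min_resvport and xprt_max_resvport, and both function names are hypothetical.

```c
#include <assert.h>

/* Assumed stand-ins for xprt_min_resvport / xprt_max_resvport. */
enum { MIN_RESVPORT = 665, MAX_RESVPORT = 1023 };

/* Same rule as xs_next_srcport() for the reserved-port case. */
static unsigned short next_port(unsigned short port)
{
	if (port <= MIN_RESVPORT || port > MAX_RESVPORT)
		return MAX_RESVPORT;
	return (unsigned short)(port - 1);
}

/*
 * Counts the bind attempts made before the loop gives up, assuming
 * every attempt fails with EADDRINUSE.
 */
static int attempts_before_giving_up(unsigned short start)
{
	unsigned short port = start, last;
	int nloop = 0, attempts = 0;

	assert(MIN_RESVPORT <= MAX_RESVPORT);
	do {
		attempts++;      /* one failed bind on "port" */
		last = port;
		port = next_port(port);
		if (port > last) /* wrapped from the bottom to the top */
			nloop++;
	} while (nloop != 2);
	return attempts;
}
```

Starting at 700, the loop walks down to 665, wraps to 1023, and walks the whole range once more before stopping, so every reserved port is tried at least once.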
|
2010-10-05 18:53:08 +07:00
|
|
|
static int xs_bind(struct sock_xprt *transport, struct socket *sock)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2010-10-05 18:53:08 +07:00
|
|
|
struct sockaddr_storage myaddr;
|
2007-11-06 05:40:58 +07:00
|
|
|
int err, nloop = 0;
|
2018-10-19 02:27:02 +07:00
|
|
|
int port = xs_get_srcport(transport);
|
2007-11-06 05:40:58 +07:00
|
|
|
unsigned short last;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
rpc: xs_bind - do not bind when requesting a random ephemeral port
When attempting to establish a local ephemeral endpoint for a TCP or UDP
socket, do not explicitly call bind; instead let it happen implicitly when the
socket is first used.
The main motivating factor for this change is when TCP runs out of unique
ephemeral ports (i.e. cannot find any ephemeral ports which are not a part of
*any* TCP connection). In this situation if you explicitly call bind, then the
call will fail with EADDRINUSE. However, if you allow the allocation of an
ephemeral port to happen implicitly as part of connect (or other functions),
then ephemeral ports can be reused, so long as the combination of (local_ip,
local_port, remote_ip, remote_port) is unique for TCP sockets on the system.
This doesn't matter for UDP sockets, but it seemed easiest to treat TCP and UDP
sockets the same.
This can allow mount.nfs(8) to continue to function successfully, even in the
face of misbehaving applications which are creating a large number of TCP
connections.
Signed-off-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-06 02:40:21 +07:00
|
|
|
/*
|
|
|
|
* If we are asking for any ephemeral port (i.e. port == 0 &&
|
|
|
|
* transport->xprt.resvport == 0), don't bind. Let the local
|
|
|
|
* port selection happen implicitly when the socket is used
|
|
|
|
* (for example at connect time).
|
|
|
|
*
|
|
|
|
* This ensures that we can continue to establish TCP
|
|
|
|
* connections even when all local ephemeral ports are already
|
|
|
|
* a part of some TCP connection. This makes no difference
|
|
|
|
* for UDP sockets, but also doesn't harm them.
|
|
|
|
*
|
|
|
|
* If we're asking for any reserved port (i.e. port == 0 &&
|
|
|
|
* transport->xprt.resvport == 1) xs_get_srcport above will
|
|
|
|
* ensure that port is non-zero and we will bind as needed.
|
|
|
|
*/
|
2018-10-19 02:27:02 +07:00
|
|
|
if (port <= 0)
|
|
|
|
return port;
|
2014-09-06 02:40:21 +07:00
|
|
|
|
2010-10-05 18:53:08 +07:00
|
|
|
memcpy(&myaddr, &transport->srcaddr, transport->xprt.addrlen);
|
2005-08-12 03:25:23 +07:00
|
|
|
do {
|
2010-10-05 18:53:08 +07:00
|
|
|
rpc_set_port((struct sockaddr *)&myaddr, port);
|
|
|
|
err = kernel_bind(sock, (struct sockaddr *)&myaddr,
|
|
|
|
transport->xprt.addrlen);
|
2005-08-12 03:25:23 +07:00
|
|
|
if (err == 0) {
|
2009-08-10 02:09:46 +07:00
|
|
|
transport->srcport = port;
|
2007-07-10 03:23:35 +07:00
|
|
|
break;
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
2007-11-06 05:40:58 +07:00
|
|
|
last = port;
|
2010-10-04 19:51:56 +07:00
|
|
|
port = xs_next_srcport(transport, port);
|
2007-11-06 05:40:58 +07:00
|
|
|
if (port > last)
|
|
|
|
nloop++;
|
|
|
|
} while (err == -EADDRINUSE && nloop != 2);
|
2007-08-06 22:57:33 +07:00
|
|
|
|
2010-10-20 22:52:51 +07:00
|
|
|
if (myaddr.ss_family == AF_INET)
|
2010-10-05 18:53:08 +07:00
|
|
|
dprintk("RPC: %s %pI4:%u: %s (%d)\n", __func__,
|
|
|
|
&((struct sockaddr_in *)&myaddr)->sin_addr,
|
|
|
|
port, err ? "failed" : "ok", err);
|
|
|
|
else
|
|
|
|
dprintk("RPC: %s %pI6:%u: %s (%d)\n", __func__,
|
|
|
|
&((struct sockaddr_in6 *)&myaddr)->sin6_addr,
|
|
|
|
port, err ? "failed" : "ok", err);
|
2005-08-12 03:25:23 +07:00
|
|
|
return err;
|
|
|
|
}
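The comment in xs_bind() relies on the local port being chosen implicitly when the socket is first used. A small userspace sketch can show that effect, assuming a POSIX system with a working loopback interface. It uses UDP because connect() on a datagram socket needs no listener; it does not demonstrate the TCP four-tuple reuse described in the commit message. The name implicit_local_port() is hypothetical.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Connects an unbound UDP socket to 127.0.0.1:<remote_port> and returns
 * the local port the kernel picked, or -1 on any failure.
 */
static int implicit_local_port(unsigned short remote_port)
{
	struct sockaddr_in remote, local;
	socklen_t len = sizeof(local);
	int port = -1;
	int fd;

	assert(remote_port != 0);
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&remote, 0, sizeof(remote));
	remote.sin_family = AF_INET;
	remote.sin_port = htons(remote_port);
	remote.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	/* No bind() call: connect() makes the kernel choose the local port. */
	if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) == 0 &&
	    getsockname(fd, (struct sockaddr *)&local, &len) == 0)
		port = ntohs(local.sin_port);

	close(fd);
	return port;
}
```

On success the returned port is non-zero even though the program never called bind(), which is the behavior xs_bind() depends on when it returns early for port 0.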
|
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
/*
|
|
|
|
* We don't support autobind on AF_LOCAL sockets
|
|
|
|
*/
|
|
|
|
static void xs_local_rpcbind(struct rpc_task *task)
|
|
|
|
{
|
2016-01-31 04:39:26 +07:00
|
|
|
xprt_set_bound(task->tk_xprt);
|
2011-05-10 02:22:44 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static void xs_local_set_port(struct rpc_xprt *xprt, unsigned short port)
|
|
|
|
{
|
|
|
|
}
|
2010-10-05 18:53:08 +07:00
|
|
|
|
2006-12-07 11:35:24 +07:00
|
|
|
#ifdef CONFIG_DEBUG_LOCK_ALLOC
|
|
|
|
static struct lock_class_key xs_key[2];
|
|
|
|
static struct lock_class_key xs_slock_key[2];
|
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
static inline void xs_reclassify_socketu(struct socket *sock)
|
|
|
|
{
|
|
|
|
struct sock *sk = sock->sk;
|
|
|
|
|
|
|
|
sock_lock_init_class_and_name(sk, "slock-AF_LOCAL-RPC",
|
|
|
|
&xs_slock_key[1], "sk_lock-AF_LOCAL-RPC", &xs_key[1]);
|
|
|
|
}
|
|
|
|
|
2007-08-06 22:58:04 +07:00
|
|
|
static inline void xs_reclassify_socket4(struct socket *sock)
|
2006-12-07 11:35:24 +07:00
|
|
|
{
|
|
|
|
struct sock *sk = sock->sk;
|
2007-08-06 22:58:04 +07:00
|
|
|
|
|
|
|
sock_lock_init_class_and_name(sk, "slock-AF_INET-RPC",
|
|
|
|
&xs_slock_key[0], "sk_lock-AF_INET-RPC", &xs_key[0]);
|
|
|
|
}
|
2006-12-07 11:35:24 +07:00
|
|
|
|
2007-08-06 22:58:04 +07:00
|
|
|
static inline void xs_reclassify_socket6(struct socket *sock)
|
|
|
|
{
|
|
|
|
struct sock *sk = sock->sk;
|
2006-12-07 11:35:24 +07:00
|
|
|
|
2007-08-06 22:58:04 +07:00
|
|
|
sock_lock_init_class_and_name(sk, "slock-AF_INET6-RPC",
|
|
|
|
&xs_slock_key[1], "sk_lock-AF_INET6-RPC", &xs_key[1]);
|
2006-12-07 11:35:24 +07:00
|
|
|
}
|
2010-10-04 19:56:38 +07:00
|
|
|
|
|
|
|
static inline void xs_reclassify_socket(int family, struct socket *sock)
|
|
|
|
{
|
2016-04-08 20:11:27 +07:00
|
|
|
if (WARN_ON_ONCE(!sock_allow_reclassification(sock->sk)))
|
2012-10-23 21:43:39 +07:00
|
|
|
return;
|
|
|
|
|
2010-10-20 22:52:51 +07:00
|
|
|
switch (family) {
|
2011-05-10 02:22:44 +07:00
|
|
|
case AF_LOCAL:
|
|
|
|
xs_reclassify_socketu(sock);
|
|
|
|
break;
|
2010-10-20 22:52:51 +07:00
|
|
|
case AF_INET:
|
2010-10-04 19:56:38 +07:00
|
|
|
xs_reclassify_socket4(sock);
|
2010-10-20 22:52:51 +07:00
|
|
|
break;
|
|
|
|
case AF_INET6:
|
2010-10-04 19:56:38 +07:00
|
|
|
xs_reclassify_socket6(sock);
|
2010-10-20 22:52:51 +07:00
|
|
|
break;
|
|
|
|
}
|
2010-10-04 19:56:38 +07:00
|
|
|
}
|
2006-12-07 11:35:24 +07:00
|
|
|
#else
|
2010-10-04 19:56:38 +07:00
|
|
|
static inline void xs_reclassify_socket(int family, struct socket *sock)
|
|
|
|
{
|
|
|
|
}
|
2006-12-07 11:35:24 +07:00
|
|
|
#endif
|
|
|
|
|
2013-10-31 12:14:36 +07:00
|
|
|
static void xs_dummy_setup_socket(struct work_struct *work)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
2010-10-04 19:56:38 +07:00
|
|
|
static struct socket *xs_create_sock(struct rpc_xprt *xprt,
|
2015-02-09 03:00:06 +07:00
|
|
|
struct sock_xprt *transport, int family, int type,
|
|
|
|
int protocol, bool reuseport)
|
2010-10-04 19:54:26 +07:00
|
|
|
{
|
|
|
|
struct socket *sock;
|
|
|
|
int err;
|
|
|
|
|
2010-10-04 19:56:38 +07:00
|
|
|
err = __sock_create(xprt->xprt_net, family, type, protocol, &sock, 1);
|
2010-10-04 19:54:26 +07:00
|
|
|
if (err < 0) {
|
|
|
|
dprintk("RPC: can't create %d transport socket (%d).\n",
|
|
|
|
protocol, -err);
|
|
|
|
goto out;
|
|
|
|
}
|
2010-10-04 19:56:38 +07:00
|
|
|
xs_reclassify_socket(family, sock);
|
2010-10-04 19:54:26 +07:00
|
|
|
|
2015-02-09 03:00:06 +07:00
|
|
|
if (reuseport)
|
|
|
|
xs_sock_set_reuseport(sock);
|
|
|
|
|
2011-02-23 04:54:34 +07:00
|
|
|
err = xs_bind(transport, sock);
|
|
|
|
if (err) {
|
2010-10-04 19:54:26 +07:00
|
|
|
sock_release(sock);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
return sock;
|
|
|
|
out:
|
|
|
|
return ERR_PTR(err);
|
|
|
|
}
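xs_create_sock() reports failure through ERR_PTR(), and its callers test the result with IS_ERR(). The idiom encodes a small negative errno value in the pointer itself. The following userspace re-creation is a sketch only: the lower-case names are hypothetical, and the 4095 bound is taken from the kernel's MAX_ERRNO convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer lies in the top MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}
```

A caller in this style checks is_err() first and only then treats the value as a usable object, as xs_udp_setup_socket() does with `if (IS_ERR(sock))`.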
|
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
static int xs_local_finish_connecting(struct rpc_xprt *xprt,
|
|
|
|
struct socket *sock)
|
|
|
|
{
|
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt,
|
|
|
|
xprt);
|
|
|
|
|
|
|
|
if (!transport->inet) {
|
|
|
|
struct sock *sk = sock->sk;
|
|
|
|
|
|
|
|
write_lock_bh(&sk->sk_callback_lock);
|
|
|
|
|
|
|
|
xs_save_old_callbacks(transport, sk);
|
|
|
|
|
|
|
|
sk->sk_user_data = xprt;
|
2015-10-07 04:03:00 +07:00
|
|
|
sk->sk_data_ready = xs_data_ready;
|
2011-05-10 02:22:44 +07:00
|
|
|
sk->sk_write_space = xs_udp_write_space;
|
2016-05-13 11:41:39 +07:00
|
|
|
sock_set_flag(sk, SOCK_FASYNC);
|
2014-01-01 01:22:59 +07:00
|
|
|
sk->sk_error_report = xs_error_report;
|
2015-08-20 09:46:15 +07:00
|
|
|
sk->sk_allocation = GFP_NOIO;
|
2011-05-10 02:22:44 +07:00
|
|
|
|
|
|
|
xprt_clear_connected(xprt);
|
|
|
|
|
|
|
|
/* Reset to new socket */
|
|
|
|
transport->sock = sock;
|
|
|
|
transport->inet = sk;
|
|
|
|
|
|
|
|
write_unlock_bh(&sk->sk_callback_lock);
|
|
|
|
}
|
|
|
|
|
2018-09-15 01:32:45 +07:00
|
|
|
xs_stream_reset_connect(transport);
|
2018-08-14 03:54:57 +07:00
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
return kernel_connect(sock, xs_addr(xprt), xprt->addrlen, 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* xs_local_setup_socket - create AF_LOCAL socket, connect to a local endpoint
|
|
|
|
* @transport: socket transport to connect
|
|
|
|
*/
|
2013-02-21 05:52:19 +07:00
|
|
|
static int xs_local_setup_socket(struct sock_xprt *transport)
|
2011-05-10 02:22:44 +07:00
|
|
|
{
|
|
|
|
struct rpc_xprt *xprt = &transport->xprt;
|
|
|
|
struct socket *sock;
|
|
|
|
int status = -EIO;
|
|
|
|
|
|
|
|
status = __sock_create(xprt->xprt_net, AF_LOCAL,
|
|
|
|
SOCK_STREAM, 0, &sock, 1);
|
|
|
|
if (status < 0) {
|
|
|
|
dprintk("RPC: can't create AF_LOCAL "
|
|
|
|
"transport socket (%d).\n", -status);
|
|
|
|
goto out;
|
|
|
|
}
|
2015-12-02 13:17:52 +07:00
|
|
|
xs_reclassify_socket(AF_LOCAL, sock);
|
2011-05-10 02:22:44 +07:00
|
|
|
|
|
|
|
dprintk("RPC: worker connecting xprt %p via AF_LOCAL to %s\n",
|
|
|
|
xprt, xprt->address_strings[RPC_DISPLAY_ADDR]);
|
|
|
|
|
|
|
|
status = xs_local_finish_connecting(xprt, sock);
|
2013-09-04 23:16:23 +07:00
|
|
|
trace_rpc_socket_connect(xprt, sock, status);
|
2011-05-10 02:22:44 +07:00
|
|
|
switch (status) {
|
|
|
|
case 0:
|
|
|
|
dprintk("RPC: xprt %p connected to %s\n",
|
|
|
|
xprt, xprt->address_strings[RPC_DISPLAY_ADDR]);
|
2018-10-02 01:25:36 +07:00
|
|
|
xprt->stat.connect_count++;
|
|
|
|
xprt->stat.connect_time += (long)jiffies -
|
|
|
|
xprt->stat.connect_start;
|
2011-05-10 02:22:44 +07:00
|
|
|
xprt_set_connected(xprt);
|
2014-07-01 00:42:19 +07:00
|
|
|
case -ENOBUFS:
|
2011-05-10 02:22:44 +07:00
|
|
|
break;
|
|
|
|
case -ENOENT:
|
|
|
|
dprintk("RPC: xprt %p: socket %s does not exist\n",
|
|
|
|
xprt, xprt->address_strings[RPC_DISPLAY_ADDR]);
|
|
|
|
break;
|
2012-12-16 05:02:29 +07:00
|
|
|
case -ECONNREFUSED:
|
|
|
|
dprintk("RPC: xprt %p: connection refused for %s\n",
|
|
|
|
xprt, xprt->address_strings[RPC_DISPLAY_ADDR]);
|
|
|
|
break;
|
2011-05-10 02:22:44 +07:00
|
|
|
default:
|
|
|
|
printk(KERN_ERR "%s: unhandled error (%d) connecting to %s\n",
|
|
|
|
__func__, -status,
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR]);
|
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
|
|
|
xprt_clear_connecting(xprt);
|
|
|
|
xprt_wake_pending_tasks(xprt, status);
|
2013-02-21 05:52:19 +07:00
|
|
|
return status;
|
|
|
|
}
|
|
|
|
|
2013-03-01 09:02:55 +07:00
|
|
|
static void xs_local_connect(struct rpc_xprt *xprt, struct rpc_task *task)
|
2013-02-21 05:52:19 +07:00
|
|
|
{
|
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (RPC_IS_ASYNC(task)) {
|
|
|
|
/*
|
|
|
|
* We want the AF_LOCAL connect to be resolved in the
|
|
|
|
* filesystem namespace of the process making the rpc
|
|
|
|
* call. Thus we connect synchronously.
|
|
|
|
*
|
|
|
|
* If we want to support asynchronous AF_LOCAL calls,
|
|
|
|
* we'll need to figure out how to pass a namespace to
|
|
|
|
* connect.
|
|
|
|
*/
|
|
|
|
rpc_exit(task, -ENOTCONN);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
ret = xs_local_setup_socket(transport);
|
|
|
|
if (ret && !RPC_IS_SOFTCONN(task))
|
|
|
|
msleep_interruptible(15000);
|
2011-05-10 02:22:44 +07:00
|
|
|
}
|
|
|
|
|
2015-06-04 03:14:25 +07:00
|
|
|
#if IS_ENABLED(CONFIG_SUNRPC_SWAP)
|
2015-06-04 03:14:28 +07:00
|
|
|
/*
|
|
|
|
* Note that this should be called with XPRT_LOCKED held (or when we otherwise
|
|
|
|
* know that we have exclusive access to the socket), to guard against
|
|
|
|
* races with xs_reset_transport.
|
|
|
|
*/
|
2012-08-01 06:45:12 +07:00
|
|
|
static void xs_set_memalloc(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt,
|
|
|
|
xprt);
|
|
|
|
|
2015-06-04 03:14:28 +07:00
|
|
|
/*
|
|
|
|
* If there's no sock, then we have nothing to set. The
|
|
|
|
* reconnecting process will get it for us.
|
|
|
|
*/
|
|
|
|
if (!transport->inet)
|
|
|
|
return;
|
2015-06-04 03:14:26 +07:00
|
|
|
if (atomic_read(&xprt->swapper))
|
2012-08-01 06:45:12 +07:00
|
|
|
sk_set_memalloc(transport->inet);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2015-06-04 03:14:29 +07:00
|
|
|
* xs_enable_swap - Tag this transport as being used for swap.
|
2012-08-01 06:45:12 +07:00
|
|
|
* @xprt: transport to tag
|
|
|
|
*
|
2015-06-04 03:14:26 +07:00
|
|
|
* Take a reference to this transport on behalf of the rpc_clnt, and
|
|
|
|
* optionally mark it for swapping if it wasn't already.
|
2012-08-01 06:45:12 +07:00
|
|
|
*/
|
2015-06-04 03:14:29 +07:00
|
|
|
static int
|
|
|
|
xs_enable_swap(struct rpc_xprt *xprt)
|
2012-08-01 06:45:12 +07:00
|
|
|
{
|
2015-06-04 03:14:28 +07:00
|
|
|
struct sock_xprt *xs = container_of(xprt, struct sock_xprt, xprt);
|
2012-08-01 06:45:12 +07:00
|
|
|
|
2015-06-04 03:14:28 +07:00
|
|
|
if (atomic_inc_return(&xprt->swapper) != 1)
|
|
|
|
return 0;
|
|
|
|
if (wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_KILLABLE))
|
|
|
|
return -ERESTARTSYS;
|
|
|
|
if (xs->inet)
|
|
|
|
sk_set_memalloc(xs->inet);
|
|
|
|
xprt_release_xprt(xprt, NULL);
|
2015-06-04 03:14:26 +07:00
|
|
|
return 0;
|
|
|
|
}
|
2012-08-01 06:45:12 +07:00
|
|
|
|
2015-06-04 03:14:26 +07:00
|
|
|
/**
|
2015-06-04 03:14:29 +07:00
|
|
|
* xs_disable_swap - Untag this transport as being used for swap.
|
2015-06-04 03:14:26 +07:00
|
|
|
* @xprt: transport to untag
|
|
|
|
*
|
|
|
|
* Drop a "swapper" reference to this xprt on behalf of the rpc_clnt. If the
|
|
|
|
* swapper refcount goes to 0, untag the socket as a memalloc socket.
|
|
|
|
*/
|
2015-06-04 03:14:29 +07:00
|
|
|
static void
|
|
|
|
xs_disable_swap(struct rpc_xprt *xprt)
|
2015-06-04 03:14:26 +07:00
|
|
|
{
|
2015-06-04 03:14:28 +07:00
|
|
|
struct sock_xprt *xs = container_of(xprt, struct sock_xprt, xprt);
|
2015-06-04 03:14:26 +07:00
|
|
|
|
2015-06-04 03:14:28 +07:00
|
|
|
if (!atomic_dec_and_test(&xprt->swapper))
|
|
|
|
return;
|
|
|
|
if (wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_KILLABLE))
|
|
|
|
return;
|
|
|
|
if (xs->inet)
|
|
|
|
sk_clear_memalloc(xs->inet);
|
|
|
|
xprt_release_xprt(xprt, NULL);
|
2012-08-01 06:45:12 +07:00
|
|
|
}
|
|
|
|
#else
|
|
|
|
static void xs_set_memalloc(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
}
|
2015-06-04 03:14:29 +07:00
|
|
|
|
|
|
|
static int
|
|
|
|
xs_enable_swap(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
xs_disable_swap(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
}
|
2012-08-01 06:45:12 +07:00
|
|
|
#endif
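The swapper counting in xs_enable_swap() and xs_disable_swap() follows a "first user sets, last user clears" pattern. The sketch below models it with C11 atomics in userspace. It deliberately leaves out the XPRT_LOCKED wait and the socket itself; struct swap_state and its memalloc flag are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct swap_state {
	atomic_int swapper; /* number of users that enabled swap */
	bool memalloc;      /* stands in for the socket's memalloc tag */
};

static void swap_state_init(struct swap_state *s)
{
	atomic_init(&s->swapper, 0);
	s->memalloc = false;
}

static void enable_swap(struct swap_state *s)
{
	/* atomic_fetch_add returns the old value: 0 means first user */
	if (atomic_fetch_add(&s->swapper, 1) == 0)
		s->memalloc = true;
}

static void disable_swap(struct swap_state *s)
{
	assert(atomic_load(&s->swapper) > 0);
	/* old value 1 means the count just dropped to zero */
	if (atomic_fetch_sub(&s->swapper, 1) == 1)
		s->memalloc = false;
}
```

Two enables followed by one disable leave the tag set; only the second disable clears it.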
|
|
|
|
|
2007-08-06 22:57:38 +07:00
|
|
|
static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
|
|
|
|
{
|
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
|
|
|
|
if (!transport->inet) {
|
|
|
|
struct sock *sk = sock->sk;
|
|
|
|
|
|
|
|
write_lock_bh(&sk->sk_callback_lock);
|
|
|
|
|
2008-10-29 02:21:39 +07:00
|
|
|
xs_save_old_callbacks(transport, sk);
|
|
|
|
|
2007-08-06 22:57:38 +07:00
|
|
|
sk->sk_user_data = xprt;
|
2015-10-07 03:26:05 +07:00
|
|
|
sk->sk_data_ready = xs_data_ready;
|
2007-08-06 22:57:38 +07:00
|
|
|
sk->sk_write_space = xs_udp_write_space;
|
2016-05-13 11:41:39 +07:00
|
|
|
sock_set_flag(sk, SOCK_FASYNC);
|
2015-08-20 09:46:15 +07:00
|
|
|
sk->sk_allocation = GFP_NOIO;
|
2007-08-06 22:57:38 +07:00
|
|
|
|
|
|
|
xprt_set_connected(xprt);
|
|
|
|
|
|
|
|
/* Reset to new socket */
|
|
|
|
transport->sock = sock;
|
|
|
|
transport->inet = sk;
|
|
|
|
|
2012-08-01 06:45:12 +07:00
|
|
|
xs_set_memalloc(xprt);
|
|
|
|
|
2007-08-06 22:57:38 +07:00
|
|
|
write_unlock_bh(&sk->sk_callback_lock);
|
|
|
|
}
|
|
|
|
xs_udp_do_set_buffer_size(xprt);
|
2016-08-04 11:00:33 +07:00
|
|
|
|
|
|
|
xprt->stat.connect_start = jiffies;
|
2007-08-06 22:57:38 +07:00
|
|
|
}
|
|
|
|
|
2010-10-04 19:58:02 +07:00
|
|
|
static void xs_udp_setup_socket(struct work_struct *work)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2006-12-08 03:48:15 +07:00
|
|
|
struct sock_xprt *transport =
|
|
|
|
container_of(work, struct sock_xprt, connect_worker.work);
|
2006-12-06 04:35:26 +07:00
|
|
|
struct rpc_xprt *xprt = &transport->xprt;
|
2017-09-18 18:21:14 +07:00
|
|
|
struct socket *sock;
|
2010-10-04 19:53:46 +07:00
|
|
|
int status = -EIO;
|
2005-08-12 03:25:26 +07:00
|
|
|
|
2010-10-04 19:58:02 +07:00
|
|
|
sock = xs_create_sock(xprt, transport,
|
2015-02-09 03:00:06 +07:00
|
|
|
xs_addr(xprt)->sa_family, SOCK_DGRAM,
|
|
|
|
IPPROTO_UDP, false);
|
2010-10-04 19:53:46 +07:00
|
|
|
if (IS_ERR(sock))
|
2005-08-12 03:25:53 +07:00
|
|
|
goto out;
|
2007-08-06 22:57:48 +07:00
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
dprintk("RPC: worker connecting xprt %p via %s to "
|
|
|
|
"%s (port %s)\n", xprt,
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PROTO],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PORT]);
|
2007-08-06 22:57:48 +07:00
|
|
|
|
|
|
|
xs_udp_finish_connecting(xprt, sock);
|
2013-09-04 23:16:23 +07:00
|
|
|
trace_rpc_socket_connect(xprt, sock, 0);
|
2005-08-12 03:25:53 +07:00
|
|
|
status = 0;
|
|
|
|
out:
|
2015-02-09 06:19:25 +07:00
|
|
|
xprt_unlock_connect(xprt, transport);
|
2005-08-12 03:25:53 +07:00
|
|
|
xprt_clear_connecting(xprt);
|
2009-03-12 01:38:03 +07:00
|
|
|
xprt_wake_pending_tasks(xprt, status);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
2015-06-20 03:17:57 +07:00
|
|
|
/**
|
|
|
|
* xs_tcp_shutdown - gracefully shut down a TCP socket
|
|
|
|
* @xprt: transport
|
|
|
|
*
|
|
|
|
* Initiates a graceful shutdown of the TCP socket by calling the
|
|
|
|
* equivalent of shutdown(SHUT_RDWR);
|
|
|
|
*/
|
|
|
|
static void xs_tcp_shutdown(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
struct socket *sock = transport->sock;
|
2018-02-05 22:20:06 +07:00
|
|
|
int skst = transport->inet ? transport->inet->sk_state : TCP_CLOSE;
|
2015-06-20 03:17:57 +07:00
|
|
|
|
|
|
|
if (sock == NULL)
|
|
|
|
return;
|
2018-02-05 22:20:06 +07:00
|
|
|
switch (skst) {
|
|
|
|
default:
|
2015-06-20 03:17:57 +07:00
|
|
|
kernel_sock_shutdown(sock, SHUT_RDWR);
|
|
|
|
trace_rpc_socket_shutdown(xprt, sock);
|
2018-02-05 22:20:06 +07:00
|
|
|
break;
|
|
|
|
case TCP_CLOSE:
|
|
|
|
case TCP_TIME_WAIT:
|
2015-06-20 03:17:57 +07:00
|
|
|
xs_reset_transport(transport);
|
2018-02-05 22:20:06 +07:00
|
|
|
}
|
2015-06-20 03:17:57 +07:00
|
|
|
}
|
|
|
|
|
2017-02-08 23:17:53 +07:00
|
|
|
static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
|
|
|
|
struct socket *sock)
|
|
|
|
{
|
2017-02-08 23:17:54 +07:00
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
unsigned int keepidle;
|
|
|
|
unsigned int keepcnt;
|
2017-02-08 23:17:53 +07:00
|
|
|
unsigned int opt_on = 1;
|
|
|
|
unsigned int timeo;
|
|
|
|
|
2017-02-08 23:17:54 +07:00
|
|
|
spin_lock_bh(&xprt->transport_lock);
|
|
|
|
keepidle = DIV_ROUND_UP(xprt->timeout->to_initval, HZ);
|
|
|
|
keepcnt = xprt->timeout->to_retries + 1;
|
|
|
|
timeo = jiffies_to_msecs(xprt->timeout->to_initval) *
|
|
|
|
(xprt->timeout->to_retries + 1);
|
|
|
|
clear_bit(XPRT_SOCK_UPD_TIMEOUT, &transport->sock_state);
|
|
|
|
spin_unlock_bh(&xprt->transport_lock);
|
|
|
|
|
2017-02-08 23:17:53 +07:00
|
|
|
/* TCP Keepalive options */
|
|
|
|
kernel_setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE,
|
|
|
|
(char *)&opt_on, sizeof(opt_on));
|
|
|
|
kernel_setsockopt(sock, SOL_TCP, TCP_KEEPIDLE,
|
|
|
|
(char *)&keepidle, sizeof(keepidle));
|
|
|
|
kernel_setsockopt(sock, SOL_TCP, TCP_KEEPINTVL,
|
|
|
|
(char *)&keepidle, sizeof(keepidle));
|
|
|
|
kernel_setsockopt(sock, SOL_TCP, TCP_KEEPCNT,
|
|
|
|
(char *)&keepcnt, sizeof(keepcnt));
|
|
|
|
|
|
|
|
/* TCP user timeout (see RFC5482) */
|
|
|
|
kernel_setsockopt(sock, SOL_TCP, TCP_USER_TIMEOUT,
|
|
|
|
(char *)&timeo, sizeof(timeo));
|
|
|
|
}
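The keepalive and user-timeout values set above are all derived from the RPC timeout. A minimal userspace sketch of that arithmetic follows; `SKETCH_HZ`, `struct sketch_keepalive` and the function names are hypothetical stand-ins for illustration, not kernel identifiers, and the millisecond conversion only approximates `jiffies_to_msecs()`.

```c
#include <assert.h>

/* Hypothetical tick rate; the kernel's HZ is a build-time constant. */
#define SKETCH_HZ 250UL

/* Same rounding as the kernel's DIV_ROUND_UP(). */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

struct sketch_keepalive {
	unsigned int keepidle;	/* seconds: TCP_KEEPIDLE and TCP_KEEPINTVL */
	unsigned int keepcnt;	/* probes: TCP_KEEPCNT */
	unsigned int timeo_ms;	/* milliseconds: TCP_USER_TIMEOUT */
};

/* to_initval is in ticks, to_retries is the retry count. */
static struct sketch_keepalive
sketch_keepalive_from_timeout(unsigned long to_initval, unsigned int to_retries)
{
	struct sketch_keepalive ka;

	/* The tick-to-millisecond stand-in below is exact only in this case. */
	assert(1000UL % SKETCH_HZ == 0);

	ka.keepidle = (unsigned int)SKETCH_DIV_ROUND_UP(to_initval, SKETCH_HZ);
	ka.keepcnt = to_retries + 1;
	ka.timeo_ms = (unsigned int)(to_initval * (1000UL / SKETCH_HZ)) *
		      (to_retries + 1);
	return ka;
}

/* Usage: a 60 second timeout with 2 retries. */
static void sketch_keepalive_demo(void)
{
	struct sketch_keepalive ka;

	ka = sketch_keepalive_from_timeout(60 * SKETCH_HZ, 2);
	assert(ka.keepidle == 60);
	assert(ka.keepcnt == 3);
	assert(ka.timeo_ms == 180000);
}
```

The same idle value is used for both the idle time and the probe interval, so the connection is declared dead after roughly the full RPC timeout sequence has elapsed.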
|
|
|
|
|
2017-02-08 23:17:54 +07:00
|
|
|
static void xs_tcp_set_connect_timeout(struct rpc_xprt *xprt,
		unsigned long connect_timeout,
		unsigned long reconnect_timeout)
{
	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
	struct rpc_timeout to;
	unsigned long initval;

	spin_lock_bh(&xprt->transport_lock);
	if (reconnect_timeout < xprt->max_reconnect_timeout)
		xprt->max_reconnect_timeout = reconnect_timeout;
	if (connect_timeout < xprt->connect_timeout) {
		memcpy(&to, xprt->timeout, sizeof(to));
		initval = DIV_ROUND_UP(connect_timeout, to.to_retries + 1);
		/* Arbitrary lower limit */
		if (initval < XS_TCP_INIT_REEST_TO << 1)
			initval = XS_TCP_INIT_REEST_TO << 1;
		to.to_initval = initval;
		to.to_maxval = initval;
		memcpy(&transport->tcp_timeout, &to,
				sizeof(transport->tcp_timeout));
		xprt->timeout = &transport->tcp_timeout;
		xprt->connect_timeout = connect_timeout;
	}
	set_bit(XPRT_SOCK_UPD_TIMEOUT, &transport->sock_state);
	spin_unlock_bh(&xprt->transport_lock);
}
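The per-attempt timeout computed in xs_tcp_set_connect_timeout() spreads the connect timeout across all attempts and then applies a lower limit. The sketch below isolates that calculation in userspace. `SKETCH_HZ` is a hypothetical tick rate, and `SKETCH_INIT_REEST_TO` is only assumed to correspond to `XS_TCP_INIT_REEST_TO` (taken here as three seconds of ticks), since its definition is not shown in this part of the file.

```c
#include <assert.h>

/* Hypothetical tick rate standing in for the kernel's HZ. */
#define SKETCH_HZ 250UL

/* Assumption: mirrors XS_TCP_INIT_REEST_TO as three seconds of ticks. */
#define SKETCH_INIT_REEST_TO (3UL * SKETCH_HZ)

/* Same rounding as the kernel's DIV_ROUND_UP(). */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Returns the per-attempt timeout in ticks. */
static unsigned long sketch_connect_initval(unsigned long connect_timeout,
					    unsigned int to_retries)
{
	unsigned long attempts = (unsigned long)to_retries + 1;
	unsigned long initval;

	initval = SKETCH_DIV_ROUND_UP(connect_timeout, attempts);
	/* Lower limit: twice the initial re-establish timeout. */
	if (initval < SKETCH_INIT_REEST_TO << 1)
		initval = SKETCH_INIT_REEST_TO << 1;
	return initval;
}

/* Usage: 90 seconds over 3 attempts, then a timeout short enough to clamp. */
static void sketch_connect_initval_demo(void)
{
	assert(sketch_connect_initval(90 * SKETCH_HZ, 2) == 30 * SKETCH_HZ);
	assert(sketch_connect_initval(1000, 2) == 6 * SKETCH_HZ);
}
```

Because both to_initval and to_maxval are set to this value in the original, the retransmit timeout does not grow between attempts once a connect timeout has been applied.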
|
|
|
|
|
2007-08-06 22:57:38 +07:00
|
|
|
static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2007-08-06 22:57:38 +07:00
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
2011-03-19 07:21:23 +07:00
|
|
|
int ret = -ENOTCONN;
|
2006-08-23 07:06:18 +07:00
|
|
|
|
2006-12-06 04:35:15 +07:00
|
|
|
if (!transport->inet) {
|
2005-08-12 03:25:53 +07:00
|
|
|
struct sock *sk = sock->sk;
|
2016-08-04 13:24:28 +07:00
|
|
|
unsigned int addr_pref = IPV6_PREFER_SRC_PUBLIC;
|
2013-09-24 22:25:22 +07:00
|
|
|
|
2016-08-04 13:24:28 +07:00
|
|
|
/* Avoid temporary addresses; they are bad for long-lived
|
|
|
|
* connections such as NFS mounts.
|
|
|
|
* RFC4941, section 3.6 suggests that:
|
|
|
|
* Individual applications, which have specific
|
|
|
|
* knowledge about the normal duration of connections,
|
|
|
|
* MAY override this as appropriate.
|
|
|
|
*/
|
|
|
|
kernel_setsockopt(sock, SOL_IPV6, IPV6_ADDR_PREFERENCES,
|
|
|
|
(char *)&addr_pref, sizeof(addr_pref));
|
|
|
|
|
2017-02-08 23:17:53 +07:00
|
|
|
xs_tcp_set_socket_timeouts(xprt, sock);
|
2015-06-21 02:31:54 +07:00
|
|
|
|
2005-08-12 03:25:53 +07:00
|
|
|
write_lock_bh(&sk->sk_callback_lock);
|
|
|
|
|
2008-10-29 02:21:39 +07:00
|
|
|
xs_save_old_callbacks(transport, sk);
|
|
|
|
|
2005-08-12 03:25:53 +07:00
|
|
|
sk->sk_user_data = xprt;
|
2016-05-29 21:13:24 +07:00
|
|
|
sk->sk_data_ready = xs_data_ready;
|
2005-08-12 03:25:53 +07:00
|
|
|
sk->sk_state_change = xs_tcp_state_change;
|
|
|
|
sk->sk_write_space = xs_tcp_write_space;
|
2016-05-13 11:41:39 +07:00
|
|
|
sock_set_flag(sk, SOCK_FASYNC);
|
2014-01-01 01:22:59 +07:00
|
|
|
sk->sk_error_report = xs_error_report;
|
2015-08-20 09:46:15 +07:00
|
|
|
sk->sk_allocation = GFP_NOIO;
|
2005-08-26 06:25:55 +07:00
|
|
|
|
|
|
|
/* socket options */
|
|
|
|
sock_reset_flag(sk, SOCK_LINGER);
|
|
|
|
tcp_sk(sk)->nonagle |= TCP_NAGLE_OFF;
|
2005-08-12 03:25:53 +07:00
|
|
|
|
|
|
|
xprt_clear_connected(xprt);
|
|
|
|
|
|
|
|
/* Reset to new socket */
|
2006-12-06 04:35:15 +07:00
|
|
|
transport->sock = sock;
|
|
|
|
transport->inet = sk;
|
2005-08-12 03:25:53 +07:00
|
|
|
|
|
|
|
write_unlock_bh(&sk->sk_callback_lock);
|
|
|
|
}
|
|
|
|
|
2009-03-12 01:09:39 +07:00
|
|
|
if (!xprt_bound(xprt))
|
2011-03-19 07:21:23 +07:00
|
|
|
goto out;
|
2009-03-12 01:09:39 +07:00
|
|
|
|
2012-08-01 06:45:12 +07:00
|
|
|
xs_set_memalloc(xprt);
|
|
|
|
|
2018-08-14 03:50:49 +07:00
|
|
|
/* Reset TCP record info */
|
2018-09-15 01:32:45 +07:00
|
|
|
xs_stream_reset_connect(transport);
|
2018-08-14 03:50:49 +07:00
|
|
|
|
2005-08-12 03:25:53 +07:00
|
|
|
/* Tell the socket layer to start connecting... */
|
2015-09-17 10:43:17 +07:00
|
|
|
set_bit(XPRT_SOCK_CONNECTING, &transport->sock_state);
|
2011-03-19 07:21:23 +07:00
|
|
|
ret = kernel_connect(sock, xs_addr(xprt), xprt->addrlen, O_NONBLOCK);
|
|
|
|
switch (ret) {
|
|
|
|
case 0:
|
2015-02-09 03:00:06 +07:00
|
|
|
xs_set_srcport(transport, sock);
|
2017-10-20 23:48:30 +07:00
|
|
|
/* fall through */
|
2011-03-19 07:21:23 +07:00
|
|
|
case -EINPROGRESS:
|
|
|
|
/* SYN_SENT! */
|
|
|
|
if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO)
|
|
|
|
xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
|
2016-08-02 00:36:08 +07:00
|
|
|
break;
|
|
|
|
case -EADDRNOTAVAIL:
|
|
|
|
/* Source port number is unavailable. Try a new one! */
|
|
|
|
transport->srcport = 0;
|
2011-03-19 07:21:23 +07:00
|
|
|
}
|
|
|
|
out:
|
|
|
|
return ret;
|
2007-08-06 22:57:38 +07:00
|
|
|
}
|
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
/**
|
2009-03-12 01:38:04 +07:00
|
|
|
* xs_tcp_setup_socket - create a TCP socket and connect to a remote endpoint
|
2005-08-12 03:25:26 +07:00
|
|
|
*
|
|
|
|
* Invoked by a work queue tasklet.
|
2005-08-12 03:25:23 +07:00
|
|
|
*/
|
2010-10-04 19:57:40 +07:00
|
|
|
static void xs_tcp_setup_socket(struct work_struct *work)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2010-10-04 19:57:40 +07:00
|
|
|
struct sock_xprt *transport =
|
|
|
|
container_of(work, struct sock_xprt, connect_worker.work);
|
2006-12-06 04:35:15 +07:00
|
|
|
struct socket *sock = transport->sock;
|
2010-10-04 19:52:25 +07:00
|
|
|
struct rpc_xprt *xprt = &transport->xprt;
|
2009-03-12 01:38:04 +07:00
|
|
|
int status = -EIO;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2006-12-06 04:35:15 +07:00
|
|
|
if (!sock) {
|
2010-10-04 19:57:40 +07:00
|
|
|
sock = xs_create_sock(xprt, transport,
|
2015-02-09 03:00:06 +07:00
|
|
|
xs_addr(xprt)->sa_family, SOCK_STREAM,
|
|
|
|
IPPROTO_TCP, true);
|
2009-03-12 01:38:04 +07:00
|
|
|
if (IS_ERR(sock)) {
|
|
|
|
status = PTR_ERR(sock);
|
2005-08-26 06:25:55 +07:00
|
|
|
goto out;
|
|
|
|
}
|
2009-03-12 01:38:03 +07:00
|
|
|
}
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
dprintk("RPC: worker connecting xprt %p via %s to "
|
|
|
|
"%s (port %s)\n", xprt,
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PROTO],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PORT]);
|
2006-08-23 07:06:18 +07:00
|
|
|
|
2007-08-06 22:57:38 +07:00
|
|
|
status = xs_tcp_finish_connecting(xprt, sock);
|
2013-09-04 23:16:23 +07:00
|
|
|
trace_rpc_socket_connect(xprt, sock, status);
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: %p connect status %d connected %d sock state %d\n",
|
|
|
|
xprt, -status, xprt_connected(xprt),
|
|
|
|
sock->sk->sk_state);
|
2009-03-12 01:38:00 +07:00
|
|
|
switch (status) {
|
2009-04-22 04:18:20 +07:00
|
|
|
default:
|
|
|
|
printk("%s: connect returned unhandled error %d\n",
|
|
|
|
__func__, status);
|
2017-10-20 23:48:30 +07:00
|
|
|
/* fall through */
|
2009-04-22 04:18:20 +07:00
|
|
|
case -EADDRNOTAVAIL:
|
|
|
|
/* We're probably in TIME_WAIT. Get rid of existing socket,
|
|
|
|
* and retry
|
|
|
|
*/
|
2012-09-13 03:49:15 +07:00
|
|
|
xs_tcp_force_close(xprt);
|
2009-06-18 03:22:57 +07:00
|
|
|
break;
|
2009-03-12 01:38:00 +07:00
|
|
|
case 0:
|
|
|
|
case -EINPROGRESS:
|
|
|
|
case -EALREADY:
|
2015-02-09 06:19:25 +07:00
|
|
|
xprt_unlock_connect(xprt, transport);
|
2009-03-12 01:38:03 +07:00
|
|
|
return;
|
2010-03-03 01:06:21 +07:00
|
|
|
case -EINVAL:
|
|
|
|
/* Happens, for instance, if the user specified a link
|
|
|
|
* local IPv6 address without a scope-id.
|
|
|
|
*/
|
2013-03-05 05:29:33 +07:00
|
|
|
case -ECONNREFUSED:
|
|
|
|
case -ECONNRESET:
|
2017-11-30 19:21:33 +07:00
|
|
|
case -ENETDOWN:
|
2013-03-05 05:29:33 +07:00
|
|
|
case -ENETUNREACH:
|
2017-11-25 00:00:24 +07:00
|
|
|
case -EHOSTUNREACH:
|
2015-02-09 09:44:04 +07:00
|
|
|
case -EADDRINUSE:
|
2014-07-01 00:42:19 +07:00
|
|
|
case -ENOBUFS:
|
2017-05-25 14:00:32 +07:00
|
|
|
/*
|
|
|
|
* xs_tcp_force_close() wakes tasks with -EIO.
|
|
|
|
* We need to wake them first to ensure the
|
|
|
|
* correct error code.
|
|
|
|
*/
|
|
|
|
xprt_wake_pending_tasks(xprt, status);
|
2015-02-09 03:34:28 +07:00
|
|
|
xs_tcp_force_close(xprt);
|
2010-03-03 01:06:21 +07:00
|
|
|
goto out;
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
2009-03-12 01:38:00 +07:00
|
|
|
status = -EAGAIN;
|
2005-08-12 03:25:23 +07:00
|
|
|
out:
|
2015-02-09 06:19:25 +07:00
|
|
|
xprt_unlock_connect(xprt, transport);
|
2007-08-06 22:57:48 +07:00
|
|
|
xprt_clear_connecting(xprt);
|
2009-03-12 01:38:03 +07:00
|
|
|
xprt_wake_pending_tasks(xprt, status);
|
2007-08-06 22:57:48 +07:00
|
|
|
}
|
2005-08-12 03:25:53 +07:00
|
|
|
|
2016-08-04 11:00:33 +07:00
|
|
|
static unsigned long xs_reconnect_delay(const struct rpc_xprt *xprt)
{
	unsigned long start, now = jiffies;

	start = xprt->stat.connect_start + xprt->reestablish_timeout;
	if (time_after(start, now))
		return start - now;
	return 0;
}
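xs_reconnect_delay() relies on time_after() so that the result stays correct when the tick counter wraps. A minimal userspace sketch of the same calculation is shown below; `sketch_time_after()` is a hypothetical stand-in written for illustration, and it assumes the usual two's-complement conversion from unsigned long to long.

```c
#include <assert.h>
#include <limits.h>

/*
 * Stand-in for the kernel's time_after(a, b): true if a is later than b,
 * even when the unsigned tick counter has wrapped in between.
 */
static int sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Ticks still to wait before the next connection attempt may start. */
static unsigned long sketch_reconnect_delay(unsigned long connect_start,
					    unsigned long reestablish_timeout,
					    unsigned long now)
{
	unsigned long start = connect_start + reestablish_timeout;

	if (sketch_time_after(start, now))
		return start - now;
	return 0;
}

/* Usage: part of the timeout left, timeout expired, and a wrapped counter. */
static void sketch_reconnect_delay_demo(void)
{
	assert(sketch_reconnect_delay(1000, 750, 1200) == 550);
	assert(sketch_reconnect_delay(1000, 750, 2000) == 0);
	assert(sketch_reconnect_delay(ULONG_MAX - 100, 750,
				      ULONG_MAX - 50) == 700);
}
```

A plain `start > now` comparison would return 0 in the third case, because start has wrapped to a small value while now is still close to ULONG_MAX.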
|
|
|
|
|
2016-08-04 11:08:45 +07:00
|
|
|
static void xs_reconnect_backoff(struct rpc_xprt *xprt)
{
	xprt->reestablish_timeout <<= 1;
	if (xprt->reestablish_timeout > xprt->max_reconnect_timeout)
		xprt->reestablish_timeout = xprt->max_reconnect_timeout;
	if (xprt->reestablish_timeout < XS_TCP_INIT_REEST_TO)
		xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
}
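xs_reconnect_backoff() is a doubling backoff clamped to an upper and a lower bound. The sketch below restates it as a pure function so the sequence of timeouts is easy to follow. The parameter names and the tick values in the usage function are hypothetical, chosen only for illustration.

```c
#include <assert.h>

/*
 * Double the timeout, cap it at max_timeout, then raise it to min_timeout.
 * The lower bound is applied last, so it wins if max_timeout < min_timeout.
 */
static unsigned long sketch_backoff(unsigned long timeout,
				    unsigned long max_timeout,
				    unsigned long min_timeout)
{
	timeout <<= 1;
	if (timeout > max_timeout)
		timeout = max_timeout;
	if (timeout < min_timeout)
		timeout = min_timeout;
	return timeout;
}

/* Usage with a floor of 750 ticks and a cap of 75000 ticks. */
static void sketch_backoff_demo(void)
{
	unsigned long t = 0;

	t = sketch_backoff(t, 75000, 750);	/* 0 doubles to 0, floor applies */
	assert(t == 750);
	t = sketch_backoff(t, 75000, 750);
	assert(t == 1500);
	t = sketch_backoff(60000, 75000, 750);	/* 120000 is capped */
	assert(t == 75000);
}
```

The final lower-bound check is what lets a zero timeout start the sequence at the initial re-establish value.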
|
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
/**
|
|
|
|
* xs_connect - connect a socket to a remote endpoint
|
2013-01-08 21:26:49 +07:00
|
|
|
* @xprt: pointer to transport structure
|
2005-08-12 03:25:26 +07:00
|
|
|
* @task: address of RPC task that manages state of connect request
|
|
|
|
*
|
|
|
|
* TCP: If the remote end dropped the connection, delay reconnecting.
|
2005-08-26 06:25:55 +07:00
|
|
|
*
|
|
|
|
* UDP socket connects are synchronous, but we use a work queue anyway
|
|
|
|
* to guarantee that even unprivileged user processes can set up a
|
|
|
|
* socket on a privileged port.
|
|
|
|
*
|
|
|
|
* If a UDP socket connect fails, the delay behavior here prevents
|
|
|
|
* retry floods (hard mounts).
|
2005-08-12 03:25:26 +07:00
|
|
|
*/
|
2013-01-08 21:26:49 +07:00
|
|
|
static void xs_connect(struct rpc_xprt *xprt, struct rpc_task *task)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2006-12-06 04:35:15 +07:00
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
2016-08-04 11:00:33 +07:00
|
|
|
unsigned long delay = 0;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2015-02-09 06:19:25 +07:00
|
|
|
WARN_ON_ONCE(!xprt_lock_connect(xprt, task, transport));
|
|
|
|
|
2015-08-14 02:33:51 +07:00
|
|
|
if (transport->sock != NULL) {
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: xs_connect delayed xprt %p for %lu "
|
|
|
|
"seconds\n",
|
2005-08-26 06:25:55 +07:00
|
|
|
xprt, xprt->reestablish_timeout / HZ);
|
2015-08-14 02:33:51 +07:00
|
|
|
|
|
|
|
/* Start by resetting any existing state */
|
|
|
|
xs_reset_transport(transport);
|
|
|
|
|
2016-08-04 11:00:33 +07:00
|
|
|
delay = xs_reconnect_delay(xprt);
|
2016-08-04 11:08:45 +07:00
|
|
|
xs_reconnect_backoff(xprt);
|
2016-08-04 11:00:33 +07:00
|
|
|
|
|
|
|
} else
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: xs_connect scheduled xprt %p\n", xprt);
|
2016-08-04 11:00:33 +07:00
|
|
|
|
|
|
|
queue_delayed_work(xprtiod_workqueue,
|
|
|
|
&transport->connect_worker,
|
|
|
|
delay);
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
/**
|
|
|
|
* xs_local_print_stats - display AF_LOCAL socket-specific stats
|
|
|
|
* @xprt: rpc_xprt struct containing statistics
|
|
|
|
* @seq: output file
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static void xs_local_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
|
|
|
|
{
|
|
|
|
long idle_time = 0;
|
|
|
|
|
|
|
|
if (xprt_connected(xprt))
|
|
|
|
idle_time = (long)(jiffies - xprt->last_used) / HZ;
|
|
|
|
|
|
|
|
seq_printf(seq, "\txprt:\tlocal %lu %lu %lu %ld %lu %lu %lu "
|
2012-02-15 04:19:18 +07:00
|
|
|
"%llu %llu %lu %llu %llu\n",
|
2011-05-10 02:22:44 +07:00
|
|
|
xprt->stat.bind_count,
|
|
|
|
xprt->stat.connect_count,
|
2018-10-02 01:25:41 +07:00
|
|
|
xprt->stat.connect_time / HZ,
|
2011-05-10 02:22:44 +07:00
|
|
|
idle_time,
|
|
|
|
xprt->stat.sends,
|
|
|
|
xprt->stat.recvs,
|
|
|
|
xprt->stat.bad_xids,
|
|
|
|
xprt->stat.req_u,
|
2012-02-15 04:19:18 +07:00
|
|
|
xprt->stat.bklog_u,
|
|
|
|
xprt->stat.max_slots,
|
|
|
|
xprt->stat.sending_u,
|
|
|
|
xprt->stat.pending_u);
|
2011-05-10 02:22:44 +07:00
|
|
|
}
|
|
|
|
|
2006-03-21 01:44:16 +07:00
|
|
|
/**
|
|
|
|
* xs_udp_print_stats - display UDP socket-specific stats
|
|
|
|
* @xprt: rpc_xprt struct containing statistics
|
|
|
|
* @seq: output file
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static void xs_udp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
|
|
|
|
{
|
2006-12-06 04:35:26 +07:00
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
|
2012-02-15 04:19:18 +07:00
|
|
|
seq_printf(seq, "\txprt:\tudp %u %lu %lu %lu %lu %llu %llu "
|
|
|
|
"%lu %llu %llu\n",
|
2009-08-10 02:09:46 +07:00
|
|
|
transport->srcport,
|
2006-03-21 01:44:16 +07:00
|
|
|
xprt->stat.bind_count,
|
|
|
|
xprt->stat.sends,
|
|
|
|
xprt->stat.recvs,
|
|
|
|
xprt->stat.bad_xids,
|
|
|
|
xprt->stat.req_u,
|
2012-02-15 04:19:18 +07:00
|
|
|
xprt->stat.bklog_u,
|
|
|
|
xprt->stat.max_slots,
|
|
|
|
xprt->stat.sending_u,
|
|
|
|
xprt->stat.pending_u);
|
2006-03-21 01:44:16 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* xs_tcp_print_stats - display TCP socket-specific stats
|
|
|
|
* @xprt: rpc_xprt struct containing statistics
|
|
|
|
* @seq: output file
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static void xs_tcp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
|
|
|
|
{
|
2006-12-06 04:35:26 +07:00
|
|
|
struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
|
2006-03-21 01:44:16 +07:00
|
|
|
long idle_time = 0;
|
|
|
|
|
|
|
|
if (xprt_connected(xprt))
|
|
|
|
idle_time = (long)(jiffies - xprt->last_used) / HZ;
|
|
|
|
|
2012-02-15 04:19:18 +07:00
|
|
|
seq_printf(seq, "\txprt:\ttcp %u %lu %lu %lu %ld %lu %lu %lu "
|
|
|
|
"%llu %llu %lu %llu %llu\n",
|
2009-08-10 02:09:46 +07:00
|
|
|
transport->srcport,
|
2006-03-21 01:44:16 +07:00
|
|
|
xprt->stat.bind_count,
|
|
|
|
xprt->stat.connect_count,
|
2018-10-02 01:25:41 +07:00
|
|
|
xprt->stat.connect_time / HZ,
|
2006-03-21 01:44:16 +07:00
|
|
|
idle_time,
|
|
|
|
xprt->stat.sends,
|
|
|
|
xprt->stat.recvs,
|
|
|
|
xprt->stat.bad_xids,
|
|
|
|
xprt->stat.req_u,
|
2012-02-15 04:19:18 +07:00
|
|
|
xprt->stat.bklog_u,
|
|
|
|
xprt->stat.max_slots,
|
|
|
|
xprt->stat.sending_u,
|
|
|
|
xprt->stat.pending_u);
|
2006-03-21 01:44:16 +07:00
|
|
|
}
|
|
|
|
|
2009-09-10 21:32:28 +07:00
|
|
|
/*
|
|
|
|
* Allocate a bunch of pages for a scratch buffer for the rpc code. The reason
|
|
|
|
* we allocate pages instead of doing a kmalloc like rpc_malloc is because we want
|
|
|
|
* to use the server side send routines.
|
|
|
|
*/
|
2016-09-15 21:55:20 +07:00
|
|
|
static int bc_malloc(struct rpc_task *task)
|
2009-09-10 21:32:28 +07:00
|
|
|
{
|
2016-09-15 21:55:20 +07:00
|
|
|
struct rpc_rqst *rqst = task->tk_rqstp;
|
|
|
|
size_t size = rqst->rq_callsize;
|
2009-09-10 21:32:28 +07:00
|
|
|
struct page *page;
|
|
|
|
struct rpc_buffer *buf;
|
|
|
|
|
2016-09-15 21:55:20 +07:00
|
|
|
if (size > PAGE_SIZE - sizeof(struct rpc_buffer)) {
|
|
|
|
WARN_ONCE(1, "xprtsock: large bc buffer request (size %zu)\n",
|
|
|
|
size);
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
2009-09-10 21:32:28 +07:00
|
|
|
|
2012-10-23 21:43:43 +07:00
|
|
|
page = alloc_page(GFP_KERNEL);
|
2009-09-10 21:32:28 +07:00
|
|
|
if (!page)
|
2016-09-15 21:55:20 +07:00
|
|
|
return -ENOMEM;
|
2009-09-10 21:32:28 +07:00
|
|
|
|
|
|
|
buf = page_address(page);
|
|
|
|
buf->len = PAGE_SIZE;
|
|
|
|
|
2016-09-15 21:55:20 +07:00
|
|
|
rqst->rq_buffer = buf->data;
|
2016-10-25 07:33:23 +07:00
|
|
|
rqst->rq_rbuffer = (char *)rqst->rq_buffer + rqst->rq_callsize;
|
2016-09-15 21:55:20 +07:00
|
|
|
return 0;
|
2009-09-10 21:32:28 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Free the space allocated in the bc_malloc routine
|
|
|
|
*/
|
2016-09-15 21:55:29 +07:00
|
|
|
static void bc_free(struct rpc_task *task)
|
2009-09-10 21:32:28 +07:00
|
|
|
{
|
2016-09-15 21:55:29 +07:00
|
|
|
void *buffer = task->tk_rqstp->rq_buffer;
|
2009-09-10 21:32:28 +07:00
|
|
|
struct rpc_buffer *buf;
|
|
|
|
|
|
|
|
buf = container_of(buffer, struct rpc_buffer, data);
|
|
|
|
free_page((unsigned long)buf);
|
|
|
|
}
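bc_malloc() hands out a pointer to the data area that follows a small header, and bc_free() uses container_of() to step back from that pointer to the start of the allocation. The userspace sketch below shows the same pattern with malloc(). `struct sketch_buffer`, `sketch_container_of()` and the helper functions are hypothetical analogues written for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical analogue of struct rpc_buffer: a header, then the payload. */
struct sketch_buffer {
	size_t len;
	char data[];
};

/* Allocate total bytes and return a pointer to the payload area. */
static void *sketch_buf_alloc(size_t total)
{
	struct sketch_buffer *buf;

	if (total <= sizeof(*buf))
		return NULL;
	buf = malloc(total);
	if (!buf)
		return NULL;
	buf->len = total;
	return buf->data;
}

/* Recover the header from the payload pointer and read the stored length. */
static size_t sketch_buf_len(void *data)
{
	struct sketch_buffer *buf;

	buf = sketch_container_of(data, struct sketch_buffer, data);
	return buf->len;
}

/* Free the whole allocation, starting from the recovered header. */
static void sketch_buf_free(void *data)
{
	free(sketch_container_of(data, struct sketch_buffer, data));
}

/* Usage: the caller only ever sees the payload pointer. */
static void sketch_buf_demo(void)
{
	void *p = sketch_buf_alloc(4096);

	assert(p != NULL);
	assert(sketch_buf_len(p) == 4096);
	sketch_buf_free(p);
}
```

Freeing the payload pointer directly would be a bug in this layout, which is why the free routine has to step back to the header first.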
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Use the svc_sock to send the callback. Must be called with the svc_xprt's xpt_mutex
|
|
|
|
* held. Borrows heavily from svc_tcp_sendto and xs_tcp_send_request.
|
|
|
|
*/
|
|
|
|
static int bc_sendto(struct rpc_rqst *req)
|
|
|
|
{
|
|
|
|
int len;
|
|
|
|
struct xdr_buf *xbufp = &req->rq_snd_buf;
|
|
|
|
struct rpc_xprt *xprt = req->rq_xprt;
|
|
|
|
struct sock_xprt *transport =
|
|
|
|
container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
struct socket *sock = transport->sock;
|
|
|
|
unsigned long headoff;
|
|
|
|
unsigned long tailoff;
|
|
|
|
|
2011-05-10 02:22:34 +07:00
|
|
|
xs_encode_stream_record_marker(xbufp);
|
2009-09-10 21:32:28 +07:00
|
|
|
|
|
|
|
tailoff = (unsigned long)xbufp->tail[0].iov_base & ~PAGE_MASK;
|
|
|
|
headoff = (unsigned long)xbufp->head[0].iov_base & ~PAGE_MASK;
|
|
|
|
len = svc_send_common(sock, xbufp,
|
|
|
|
virt_to_page(xbufp->head[0].iov_base), headoff,
|
|
|
|
xbufp->tail[0].iov_base, tailoff);
|
|
|
|
|
|
|
|
if (len != xbufp->len) {
|
|
|
|
printk(KERN_NOTICE "Error sending entire callback!\n");
|
|
|
|
len = -EAGAIN;
|
|
|
|
}
|
|
|
|
|
|
|
|
return len;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The send routine. Borrows from svc_send
|
|
|
|
*/
|
2018-09-04 10:58:59 +07:00
|
|
|
static int bc_send_request(struct rpc_rqst *req)
|
2009-09-10 21:32:28 +07:00
|
|
|
{
|
|
|
|
struct svc_xprt *xprt;
|
2015-09-24 21:00:09 +07:00
|
|
|
int len;
|
2009-09-10 21:32:28 +07:00
|
|
|
|
|
|
|
dprintk("sending request with xid: %08x\n", ntohl(req->rq_xid));
|
|
|
|
/*
|
|
|
|
* Get the server socket associated with this callback xprt
|
|
|
|
*/
|
|
|
|
xprt = req->rq_xprt->bc_xprt;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Grab the mutex to serialize data as the connection is shared
|
|
|
|
* with the fore channel
|
|
|
|
*/
|
2018-09-04 10:39:27 +07:00
|
|
|
mutex_lock(&xprt->xpt_mutex);
|
2009-09-10 21:32:28 +07:00
|
|
|
if (test_bit(XPT_DEAD, &xprt->xpt_flags))
|
|
|
|
len = -ENOTCONN;
|
|
|
|
else
|
|
|
|
len = bc_sendto(req);
|
|
|
|
mutex_unlock(&xprt->xpt_mutex);
|
|
|
|
|
|
|
|
if (len > 0)
|
|
|
|
len = 0;
|
|
|
|
|
|
|
|
return len;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The close routine. Since this is client initiated, we do nothing
|
|
|
|
*/
|
|
|
|
|
|
|
|
static void bc_close(struct rpc_xprt *xprt)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The xprt destroy routine. Because this connection is client
* initiated, there is no socket to tear down here; we only free the
* xprt and drop the module reference.
|
|
|
|
*/
|
|
|
|
|
|
|
|
static void bc_destroy(struct rpc_xprt *xprt)
|
|
|
|
{
|
2014-03-24 10:58:16 +07:00
|
|
|
dprintk("RPC: bc_destroy xprt %p\n", xprt);
|
|
|
|
|
|
|
|
xs_xprt_free(xprt);
|
|
|
|
module_put(THIS_MODULE);
|
2009-09-10 21:32:28 +07:00
|
|
|
}
|
|
|
|
|
2017-08-01 23:00:39 +07:00
|
|
|
static const struct rpc_xprt_ops xs_local_ops = {
|
2011-05-10 02:22:44 +07:00
|
|
|
.reserve_xprt = xprt_reserve_xprt,
|
2018-08-31 21:00:02 +07:00
|
|
|
.release_xprt = xprt_release_xprt,
|
2012-09-07 22:08:50 +07:00
|
|
|
.alloc_slot = xprt_alloc_slot,
|
2018-05-05 02:34:59 +07:00
|
|
|
.free_slot = xprt_free_slot,
|
2011-05-10 02:22:44 +07:00
|
|
|
.rpcbind = xs_local_rpcbind,
|
|
|
|
.set_port = xs_local_set_port,
|
2013-02-21 05:52:19 +07:00
|
|
|
.connect = xs_local_connect,
|
2011-05-10 02:22:44 +07:00
|
|
|
.buf_alloc = rpc_malloc,
|
|
|
|
.buf_free = rpc_free,
|
2018-09-15 01:32:45 +07:00
|
|
|
.prepare_request = xs_stream_prepare_request,
|
2011-05-10 02:22:44 +07:00
|
|
|
.send_request = xs_local_send_request,
|
|
|
|
.set_retrans_timeout = xprt_set_retrans_timeout_def,
|
|
|
|
.close = xs_close,
|
2013-10-31 20:18:49 +07:00
|
|
|
.destroy = xs_destroy,
|
2011-05-10 02:22:44 +07:00
|
|
|
.print_stats = xs_local_print_stats,
|
2015-06-04 03:14:29 +07:00
|
|
|
.enable_swap = xs_enable_swap,
|
|
|
|
.disable_swap = xs_disable_swap,
|
2011-05-10 02:22:44 +07:00
|
|
|
};
|
|
|
|
|
2017-08-01 23:00:39 +07:00
|
|
|
static const struct rpc_xprt_ops xs_udp_ops = {
|
2005-08-26 06:25:49 +07:00
|
|
|
.set_buffer_size = xs_udp_set_buffer_size,
|
2005-08-26 06:25:51 +07:00
|
|
|
.reserve_xprt = xprt_reserve_xprt_cong,
|
2005-08-26 06:25:51 +07:00
|
|
|
.release_xprt = xprt_release_xprt_cong,
|
2012-09-07 22:08:50 +07:00
|
|
|
.alloc_slot = xprt_alloc_slot,
|
2018-05-05 02:34:59 +07:00
|
|
|
.free_slot = xprt_free_slot,
|
2007-07-01 23:13:17 +07:00
|
|
|
.rpcbind = rpcb_getport_async,
|
2006-01-03 15:55:51 +07:00
|
|
|
.set_port = xs_set_port,
|
2005-08-12 03:25:56 +07:00
|
|
|
.connect = xs_connect,
|
2006-01-03 15:55:49 +07:00
|
|
|
.buf_alloc = rpc_malloc,
|
|
|
|
.buf_free = rpc_free,
|
2005-08-12 03:25:56 +07:00
|
|
|
.send_request = xs_udp_send_request,
|
2005-08-26 06:25:50 +07:00
|
|
|
.set_retrans_timeout = xprt_set_retrans_timeout_rtt,
|
2005-08-26 06:25:52 +07:00
|
|
|
.timer = xs_udp_timer,
|
2005-08-26 06:25:53 +07:00
|
|
|
.release_request = xprt_release_rqst_cong,
|
2005-08-12 03:25:56 +07:00
|
|
|
.close = xs_close,
|
|
|
|
.destroy = xs_destroy,
|
2006-03-21 01:44:16 +07:00
|
|
|
.print_stats = xs_udp_print_stats,
|
2015-06-04 03:14:29 +07:00
|
|
|
.enable_swap = xs_enable_swap,
|
|
|
|
.disable_swap = xs_disable_swap,
|
2015-05-12 01:02:25 +07:00
|
|
|
.inject_disconnect = xs_inject_disconnect,
|
2005-08-12 03:25:56 +07:00
|
|
|
};
|
|
|
|
|
2017-08-01 23:00:39 +07:00
|
|
|
static const struct rpc_xprt_ops xs_tcp_ops = {
|
2005-08-26 06:25:51 +07:00
|
|
|
.reserve_xprt = xprt_reserve_xprt,
|
2018-08-31 21:00:02 +07:00
|
|
|
.release_xprt = xprt_release_xprt,
|
2018-09-04 05:41:32 +07:00
|
|
|
.alloc_slot = xprt_alloc_slot,
|
2018-05-05 02:34:59 +07:00
|
|
|
.free_slot = xprt_free_slot,
|
2007-07-01 23:13:17 +07:00
|
|
|
.rpcbind = rpcb_getport_async,
|
2006-01-03 15:55:51 +07:00
|
|
|
.set_port = xs_set_port,
|
2010-04-17 03:41:57 +07:00
|
|
|
.connect = xs_connect,
|
2006-01-03 15:55:49 +07:00
|
|
|
.buf_alloc = rpc_malloc,
|
|
|
|
.buf_free = rpc_free,
|
2018-09-14 20:49:06 +07:00
|
|
|
.prepare_request = xs_stream_prepare_request,
|
2005-08-12 03:25:56 +07:00
|
|
|
.send_request = xs_tcp_send_request,
|
2005-08-26 06:25:50 +07:00
|
|
|
.set_retrans_timeout = xprt_set_retrans_timeout_def,
|
2015-02-10 23:06:04 +07:00
|
|
|
.close = xs_tcp_shutdown,
|
2005-08-12 03:25:26 +07:00
|
|
|
.destroy = xs_destroy,
|
2017-02-08 23:17:54 +07:00
|
|
|
.set_connect_timeout = xs_tcp_set_connect_timeout,
|
2006-03-21 01:44:16 +07:00
|
|
|
.print_stats = xs_tcp_print_stats,
|
2015-06-04 03:14:29 +07:00
|
|
|
.enable_swap = xs_enable_swap,
|
|
|
|
.disable_swap = xs_disable_swap,
|
2015-05-12 01:02:25 +07:00
|
|
|
.inject_disconnect = xs_inject_disconnect,
|
2015-10-25 04:27:35 +07:00
|
|
|
#ifdef CONFIG_SUNRPC_BACKCHANNEL
|
|
|
|
.bc_setup = xprt_setup_bc,
|
2015-10-25 04:28:32 +07:00
|
|
|
.bc_up = xs_tcp_bc_up,
|
2016-05-03 01:40:40 +07:00
|
|
|
.bc_maxpayload = xs_tcp_bc_maxpayload,
|
2015-10-25 04:27:35 +07:00
|
|
|
.bc_free_rqst = xprt_free_bc_rqst,
|
|
|
|
.bc_destroy = xprt_destroy_bc,
|
|
|
|
#endif
|
2005-08-12 03:25:23 +07:00
|
|
|
};
|
|
|
|
|
2009-09-10 21:32:28 +07:00
|
|
|
/*
|
|
|
|
* The rpc_xprt_ops for the server backchannel
|
|
|
|
*/
|
|
|
|
|
2017-08-01 23:00:39 +07:00
|
|
|
static const struct rpc_xprt_ops bc_tcp_ops = {
|
2009-09-10 21:32:28 +07:00
|
|
|
.reserve_xprt = xprt_reserve_xprt,
|
|
|
|
.release_xprt = xprt_release_xprt,
|
2012-09-25 00:39:01 +07:00
|
|
|
.alloc_slot = xprt_alloc_slot,
|
2018-05-05 02:34:59 +07:00
|
|
|
.free_slot = xprt_free_slot,
|
2009-09-10 21:32:28 +07:00
|
|
|
.buf_alloc = bc_malloc,
|
|
|
|
.buf_free = bc_free,
|
|
|
|
.send_request = bc_send_request,
|
|
|
|
.set_retrans_timeout = xprt_set_retrans_timeout_def,
|
|
|
|
.close = bc_close,
|
|
|
|
.destroy = bc_destroy,
|
|
|
|
.print_stats = xs_tcp_print_stats,
|
2015-06-04 03:14:29 +07:00
|
|
|
.enable_swap = xs_enable_swap,
|
|
|
|
.disable_swap = xs_disable_swap,
|
2015-05-12 01:02:25 +07:00
|
|
|
.inject_disconnect = xs_inject_disconnect,
|
2009-09-10 21:32:28 +07:00
|
|
|
};
|
|
|
|
|
SUNRPC: Properly initialize sock_xprt.srcaddr in all cases
The source address field in the transport's sock_xprt is initialized
ONLY IF the RPC application passed a pointer to a source address
during the call to rpc_create(). However, xs_bind() subsequently uses
the value of this field without regard to whether the source address
was initialized during transport creation or not.
So far we've been lucky: the uninitialized value of this field is
zeroes. xs_bind(), until recently, used only the sin[6]_addr field in
this sockaddr, and all zeroes is a valid value for this: it means
ANYADDR. This is a happy coincidence.
However, xs_bind() now wants to use the sa_family field as well, and
expects it to be initialized to something other than zero.
Therefore, the source address sockaddr field should be fully
initialized at transport create time in _every_ case, not just when
the RPC application wants to use a specific bind address.
Bruce added a workaround for this missing initialization by adjusting
commit 6bc9638a, but the "right" way to do this is to ensure that the
source address sockaddr is always correctly initialized from the
get-go.
This patch doesn't introduce a behavior change. It's simply a
clean-up of Bruce's fix, to prevent future problems of this kind. It
may look like overkill, but
a) it clearly documents the default initial value of this field,
b) it doesn't assume that the sockaddr_storage memory is first
initialized to any particular value, and
c) it will fail verbosely if some unknown address family is passed
in
Originally introduced by commit d3bc9a1d.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-20 22:53:01 +07:00
|
|
|
static int xs_init_anyaddr(const int family, struct sockaddr *sap)
|
|
|
|
{
|
|
|
|
static const struct sockaddr_in sin = {
|
|
|
|
.sin_family = AF_INET,
|
|
|
|
.sin_addr.s_addr = htonl(INADDR_ANY),
|
|
|
|
};
|
|
|
|
static const struct sockaddr_in6 sin6 = {
|
|
|
|
.sin6_family = AF_INET6,
|
|
|
|
.sin6_addr = IN6ADDR_ANY_INIT,
|
|
|
|
};
|
|
|
|
|
|
|
|
switch (family) {
|
2011-05-10 02:22:44 +07:00
|
|
|
case AF_LOCAL:
|
|
|
|
break;
|
2010-10-20 22:53:01 +07:00
|
|
|
case AF_INET:
|
|
|
|
memcpy(sap, &sin, sizeof(sin));
|
|
|
|
break;
|
|
|
|
case AF_INET6:
|
|
|
|
memcpy(sap, &sin6, sizeof(sin6));
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
dprintk("RPC: %s: Bad address family\n", __func__);
|
|
|
|
return -EAFNOSUPPORT;
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
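The helper that ends here gives the source address a fully initialized wildcard value for each supported address family and rejects anything else. The following is a minimal userspace sketch of the same idea, not the kernel code: the name `init_anyaddr_sketch` and the use of `memset` on a `sockaddr_storage` are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical userspace analogue: fill *ss with the wildcard ("any")
 * address for the given family. The buffer is cleared first, so the
 * result does not depend on whatever the memory held before.
 */
static int init_anyaddr_sketch(int family, struct sockaddr_storage *ss)
{
	memset(ss, 0, sizeof(*ss));

	switch (family) {
	case AF_INET: {
		struct sockaddr_in *sin = (struct sockaddr_in *)ss;

		sin->sin_family = AF_INET;
		sin->sin_addr.s_addr = htonl(INADDR_ANY);
		return 0;
	}
	case AF_INET6: {
		struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)ss;

		sin6->sin6_family = AF_INET6;
		sin6->sin6_addr = in6addr_any;
		return 0;
	}
	default:
		/* Unknown family: report it instead of leaving zeroes behind. */
		return -EAFNOSUPPORT;
	}
}
```

The point of the sketch is that the family field is always set explicitly, so a later bind step can switch on it without relying on zero-filled memory.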
|
|
|
|
|
2007-09-11 00:47:07 +07:00
|
|
|
static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
|
2011-07-18 05:11:30 +07:00
|
|
|
unsigned int slot_table_size,
|
|
|
|
unsigned int max_slot_table_size)
|
2006-10-18 01:44:27 +07:00
|
|
|
{
|
|
|
|
struct rpc_xprt *xprt;
|
2006-12-06 04:35:11 +07:00
|
|
|
struct sock_xprt *new;
|
2006-10-18 01:44:27 +07:00
|
|
|
|
2007-07-08 18:08:54 +07:00
|
|
|
if (args->addrlen > sizeof(xprt->addr)) {
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: xs_setup_xprt: address too large\n");
|
2006-10-18 01:44:27 +07:00
|
|
|
return ERR_PTR(-EBADF);
|
|
|
|
}
|
|
|
|
|
2011-07-18 05:11:30 +07:00
|
|
|
xprt = xprt_alloc(args->net, sizeof(*new), slot_table_size,
|
|
|
|
max_slot_table_size);
|
2010-09-29 19:02:43 +07:00
|
|
|
if (xprt == NULL) {
|
2007-02-01 00:14:08 +07:00
|
|
|
dprintk("RPC: xs_setup_xprt: couldn't allocate "
|
|
|
|
"rpc_xprt\n");
|
2006-10-18 01:44:27 +07:00
|
|
|
return ERR_PTR(-ENOMEM);
|
|
|
|
}
|
|
|
|
|
2010-09-29 19:02:43 +07:00
|
|
|
new = container_of(xprt, struct sock_xprt, xprt);
|
2015-10-05 21:53:49 +07:00
|
|
|
mutex_init(&new->recv_mutex);
|
2007-07-08 18:08:54 +07:00
|
|
|
memcpy(&xprt->addr, args->dstaddr, args->addrlen);
|
|
|
|
xprt->addrlen = args->addrlen;
|
2007-07-10 03:23:35 +07:00
|
|
|
if (args->srcaddr)
|
2009-08-10 02:09:46 +07:00
|
|
|
memcpy(&new->srcaddr, args->srcaddr, args->addrlen);
|
2010-10-20 22:53:01 +07:00
|
|
|
else {
|
|
|
|
int err;
|
|
|
|
err = xs_init_anyaddr(args->dstaddr->sa_family,
|
|
|
|
(struct sockaddr *)&new->srcaddr);
|
2011-11-10 18:33:23 +07:00
|
|
|
if (err != 0) {
|
|
|
|
xprt_free(xprt);
|
2010-10-20 22:53:01 +07:00
|
|
|
return ERR_PTR(err);
|
2011-11-10 18:33:23 +07:00
|
|
|
}
|
2010-10-20 22:53:01 +07:00
|
|
|
}
|
2006-10-18 01:44:27 +07:00
|
|
|
|
|
|
|
return xprt;
|
|
|
|
}
|
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
static const struct rpc_timeout xs_local_default_timeout = {
|
|
|
|
.to_initval = 10 * HZ,
|
|
|
|
.to_maxval = 10 * HZ,
|
|
|
|
.to_retries = 2,
|
|
|
|
};
|
|
|
|
|
|
|
|
/**
|
|
|
|
* xs_setup_local - Set up transport to use an AF_LOCAL socket
|
|
|
|
* @args: rpc transport creation arguments
|
|
|
|
*
|
|
|
|
* AF_LOCAL is a "tpi_cots_ord" transport, just like TCP
|
|
|
|
*/
|
|
|
|
static struct rpc_xprt *xs_setup_local(struct xprt_create *args)
|
|
|
|
{
|
|
|
|
struct sockaddr_un *sun = (struct sockaddr_un *)args->dstaddr;
|
|
|
|
struct sock_xprt *transport;
|
|
|
|
struct rpc_xprt *xprt;
|
|
|
|
struct rpc_xprt *ret;
|
|
|
|
|
2011-07-18 05:11:30 +07:00
|
|
|
xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries,
|
|
|
|
xprt_max_tcp_slot_table_entries);
|
2011-05-10 02:22:44 +07:00
|
|
|
if (IS_ERR(xprt))
|
|
|
|
return xprt;
|
|
|
|
transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
|
|
|
|
xprt->prot = 0;
|
|
|
|
xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32);
|
|
|
|
xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
|
|
|
|
|
|
|
|
xprt->bind_timeout = XS_BIND_TO;
|
|
|
|
xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
|
|
|
|
xprt->idle_timeout = XS_IDLE_DISC_TO;
|
|
|
|
|
|
|
|
xprt->ops = &xs_local_ops;
|
|
|
|
xprt->timeout = &xs_local_default_timeout;
|
|
|
|
|
2018-09-15 01:32:45 +07:00
|
|
|
INIT_WORK(&transport->recv_worker, xs_stream_data_receive_workfn);
|
|
|
|
INIT_DELAYED_WORK(&transport->connect_worker, xs_dummy_setup_socket);
|
2013-10-31 12:14:36 +07:00
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
switch (sun->sun_family) {
|
|
|
|
case AF_LOCAL:
|
|
|
|
if (sun->sun_path[0] != '/') {
|
|
|
|
dprintk("RPC: bad AF_LOCAL address: %s\n",
|
|
|
|
sun->sun_path);
|
|
|
|
ret = ERR_PTR(-EINVAL);
|
|
|
|
goto out_err;
|
|
|
|
}
|
|
|
|
xprt_set_bound(xprt);
|
|
|
|
xs_format_peer_addresses(xprt, "local", RPCBIND_NETID_LOCAL);
|
2013-02-21 22:14:22 +07:00
|
|
|
ret = ERR_PTR(xs_local_setup_socket(transport));
|
|
|
|
if (ret)
|
|
|
|
goto out_err;
|
2011-05-10 02:22:44 +07:00
|
|
|
break;
|
|
|
|
default:
|
|
|
|
ret = ERR_PTR(-EAFNOSUPPORT);
|
|
|
|
goto out_err;
|
|
|
|
}
|
|
|
|
|
|
|
|
dprintk("RPC: set up xprt to %s via AF_LOCAL\n",
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR]);
|
|
|
|
|
|
|
|
if (try_module_get(THIS_MODULE))
|
|
|
|
return xprt;
|
|
|
|
ret = ERR_PTR(-EINVAL);
|
|
|
|
out_err:
|
2014-03-24 10:07:22 +07:00
|
|
|
xs_xprt_free(xprt);
|
2011-05-10 02:22:44 +07:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2007-12-21 04:03:54 +07:00
|
|
|
static const struct rpc_timeout xs_udp_default_timeout = {
|
|
|
|
.to_initval = 5 * HZ,
|
|
|
|
.to_maxval = 30 * HZ,
|
|
|
|
.to_increment = 5 * HZ,
|
|
|
|
.to_retries = 5,
|
|
|
|
};
|
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
/**
|
|
|
|
* xs_setup_udp - Set up transport to use a UDP socket
|
2007-07-08 18:08:54 +07:00
|
|
|
* @args: rpc transport creation arguments
|
2005-08-12 03:25:26 +07:00
|
|
|
*
|
|
|
|
*/
|
2007-10-24 23:24:02 +07:00
|
|
|
static struct rpc_xprt *xs_setup_udp(struct xprt_create *args)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2007-08-06 22:57:53 +07:00
|
|
|
struct sockaddr *addr = args->dstaddr;
|
2006-10-18 01:44:27 +07:00
|
|
|
struct rpc_xprt *xprt;
|
2006-12-06 04:35:26 +07:00
|
|
|
struct sock_xprt *transport;
|
2010-05-26 19:42:24 +07:00
|
|
|
struct rpc_xprt *ret;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2011-07-18 05:11:30 +07:00
|
|
|
xprt = xs_setup_xprt(args, xprt_udp_slot_table_entries,
|
|
|
|
xprt_udp_slot_table_entries);
|
2006-10-18 01:44:27 +07:00
|
|
|
if (IS_ERR(xprt))
|
|
|
|
return xprt;
|
2006-12-06 04:35:26 +07:00
|
|
|
transport = container_of(xprt, struct sock_xprt, xprt);
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2006-08-23 07:06:15 +07:00
|
|
|
xprt->prot = IPPROTO_UDP;
|
2005-08-26 06:25:49 +07:00
|
|
|
xprt->tsh_size = 0;
|
2005-08-12 03:25:23 +07:00
|
|
|
/* XXX: header size can vary due to auth type, IPv6, etc. */
|
|
|
|
xprt->max_payload = (1U << 16) - (MAX_HEADER << 3);
|
|
|
|
|
2005-08-26 06:25:55 +07:00
|
|
|
xprt->bind_timeout = XS_BIND_TO;
|
|
|
|
xprt->reestablish_timeout = XS_UDP_REEST_TO;
|
|
|
|
xprt->idle_timeout = XS_IDLE_DISC_TO;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2005-08-12 03:25:56 +07:00
|
|
|
xprt->ops = &xs_udp_ops;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2007-12-21 04:03:55 +07:00
|
|
|
xprt->timeout = &xs_udp_default_timeout;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2015-10-07 03:26:05 +07:00
|
|
|
INIT_WORK(&transport->recv_worker, xs_udp_data_receive_workfn);
|
2015-10-05 21:53:49 +07:00
|
|
|
INIT_DELAYED_WORK(&transport->connect_worker, xs_udp_setup_socket);
|
|
|
|
|
2007-08-06 22:57:53 +07:00
|
|
|
switch (addr->sa_family) {
|
|
|
|
case AF_INET:
|
|
|
|
if (((struct sockaddr_in *)addr)->sin_port != htons(0))
|
|
|
|
xprt_set_bound(xprt);
|
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
xs_format_peer_addresses(xprt, "udp", RPCBIND_NETID_UDP);
|
2007-08-06 22:57:53 +07:00
|
|
|
break;
|
|
|
|
case AF_INET6:
|
|
|
|
if (((struct sockaddr_in6 *)addr)->sin6_port != htons(0))
|
|
|
|
xprt_set_bound(xprt);
|
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
xs_format_peer_addresses(xprt, "udp", RPCBIND_NETID_UDP6);
|
2007-08-06 22:57:53 +07:00
|
|
|
break;
|
|
|
|
default:
|
2010-05-26 19:42:24 +07:00
|
|
|
ret = ERR_PTR(-EAFNOSUPPORT);
|
|
|
|
goto out_err;
|
2007-08-06 22:57:53 +07:00
|
|
|
}
|
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
if (xprt_bound(xprt))
|
|
|
|
dprintk("RPC: set up xprt to %s (port %s) via %s\n",
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PORT],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PROTO]);
|
|
|
|
else
|
|
|
|
dprintk("RPC: set up xprt to %s (autobind) via %s\n",
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PROTO]);
|
2006-08-23 07:06:18 +07:00
|
|
|
|
2007-09-11 00:46:39 +07:00
|
|
|
if (try_module_get(THIS_MODULE))
|
|
|
|
return xprt;
|
2010-05-26 19:42:24 +07:00
|
|
|
ret = ERR_PTR(-EINVAL);
|
|
|
|
out_err:
|
2014-03-24 10:07:22 +07:00
|
|
|
xs_xprt_free(xprt);
|
2010-05-26 19:42:24 +07:00
|
|
|
return ret;
|
2005-08-12 03:25:23 +07:00
|
|
|
}
|
|
|
|
|
2007-12-21 04:03:54 +07:00
|
|
|
static const struct rpc_timeout xs_tcp_default_timeout = {
|
|
|
|
.to_initval = 60 * HZ,
|
|
|
|
.to_maxval = 60 * HZ,
|
|
|
|
.to_retries = 2,
|
|
|
|
};
|
|
|
|
|
2005-08-12 03:25:26 +07:00
|
|
|
/**
|
|
|
|
* xs_setup_tcp - Set up transport to use a TCP socket
|
2007-07-08 18:08:54 +07:00
|
|
|
* @args: rpc transport creation arguments
|
2005-08-12 03:25:26 +07:00
|
|
|
*
|
|
|
|
*/
|
2007-10-24 23:24:02 +07:00
|
|
|
static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
|
2005-08-12 03:25:23 +07:00
|
|
|
{
|
2007-08-06 22:57:53 +07:00
|
|
|
struct sockaddr *addr = args->dstaddr;
|
2006-10-18 01:44:27 +07:00
|
|
|
struct rpc_xprt *xprt;
|
2006-12-06 04:35:26 +07:00
|
|
|
struct sock_xprt *transport;
|
2010-05-26 19:42:24 +07:00
|
|
|
struct rpc_xprt *ret;
|
2013-04-14 22:42:00 +07:00
|
|
|
unsigned int max_slot_table_size = xprt_max_tcp_slot_table_entries;
|
|
|
|
|
|
|
|
if (args->flags & XPRT_CREATE_INFINITE_SLOTS)
|
|
|
|
max_slot_table_size = RPC_MAX_SLOT_TABLE_LIMIT;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2011-07-18 05:11:30 +07:00
|
|
|
xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries,
|
2013-04-14 22:42:00 +07:00
|
|
|
max_slot_table_size);
|
2006-10-18 01:44:27 +07:00
|
|
|
if (IS_ERR(xprt))
|
|
|
|
return xprt;
|
2006-12-06 04:35:26 +07:00
|
|
|
transport = container_of(xprt, struct sock_xprt, xprt);
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2006-08-23 07:06:15 +07:00
|
|
|
xprt->prot = IPPROTO_TCP;
|
2005-08-26 06:25:49 +07:00
|
|
|
xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32);
|
|
|
|
xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2005-08-26 06:25:55 +07:00
|
|
|
xprt->bind_timeout = XS_BIND_TO;
|
|
|
|
xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
|
|
|
|
xprt->idle_timeout = XS_IDLE_DISC_TO;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2005-08-12 03:25:56 +07:00
|
|
|
xprt->ops = &xs_tcp_ops;
|
2007-12-21 04:03:55 +07:00
|
|
|
xprt->timeout = &xs_tcp_default_timeout;
|
2005-08-12 03:25:23 +07:00
|
|
|
|
2016-08-04 11:08:45 +07:00
|
|
|
xprt->max_reconnect_timeout = xprt->timeout->to_maxval;
|
2017-02-08 23:17:54 +07:00
|
|
|
xprt->connect_timeout = xprt->timeout->to_initval *
|
|
|
|
(xprt->timeout->to_retries + 1);
|
2016-08-04 11:08:45 +07:00
|
|
|
|
2018-09-15 01:26:28 +07:00
|
|
|
INIT_WORK(&transport->recv_worker, xs_stream_data_receive_workfn);
|
2015-10-05 21:53:49 +07:00
|
|
|
INIT_DELAYED_WORK(&transport->connect_worker, xs_tcp_setup_socket);
|
|
|
|
|
2007-08-06 22:57:53 +07:00
|
|
|
switch (addr->sa_family) {
|
|
|
|
case AF_INET:
|
|
|
|
if (((struct sockaddr_in *)addr)->sin_port != htons(0))
|
|
|
|
xprt_set_bound(xprt);
|
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
xs_format_peer_addresses(xprt, "tcp", RPCBIND_NETID_TCP);
|
2007-08-06 22:57:53 +07:00
|
|
|
break;
|
|
|
|
case AF_INET6:
|
|
|
|
if (((struct sockaddr_in6 *)addr)->sin6_port != htons(0))
|
|
|
|
xprt_set_bound(xprt);
|
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
xs_format_peer_addresses(xprt, "tcp", RPCBIND_NETID_TCP6);
|
2007-08-06 22:57:53 +07:00
|
|
|
break;
|
|
|
|
default:
|
2010-05-26 19:42:24 +07:00
|
|
|
ret = ERR_PTR(-EAFNOSUPPORT);
|
|
|
|
goto out_err;
|
2007-08-06 22:57:53 +07:00
|
|
|
}
|
|
|
|
|
2009-08-10 02:09:46 +07:00
|
|
|
if (xprt_bound(xprt))
|
|
|
|
dprintk("RPC: set up xprt to %s (port %s) via %s\n",
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PORT],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PROTO]);
|
|
|
|
else
|
|
|
|
dprintk("RPC: set up xprt to %s (autobind) via %s\n",
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PROTO]);
|
|
|
|
|
2007-09-11 00:46:39 +07:00
|
|
|
if (try_module_get(THIS_MODULE))
|
|
|
|
return xprt;
|
2010-05-26 19:42:24 +07:00
|
|
|
ret = ERR_PTR(-EINVAL);
|
|
|
|
out_err:
|
2014-03-24 10:07:22 +07:00
|
|
|
xs_xprt_free(xprt);
|
2010-05-26 19:42:24 +07:00
|
|
|
return ret;
|
2005-08-12 03:25:23 +07:00
|
|
|
}
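The TCP setup above derives `connect_timeout` as `to_initval * (to_retries + 1)`. With the default TCP timeout shown earlier (`60 * HZ`, 2 retries) that works out to `180 * HZ`. The sketch below only reproduces that arithmetic; `SKETCH_HZ`, `struct timeout_sketch`, and `connect_timeout_sketch` are hypothetical names, and the tick rate of 1000 is an assumption because the real `HZ` depends on kernel configuration.

```c
/* Assumed tick rate for illustration only; the kernel's HZ is configurable. */
#define SKETCH_HZ 1000UL

/* Hypothetical stand-in for the three timeout fields used above. */
struct timeout_sketch {
	unsigned long to_initval;
	unsigned long to_maxval;
	unsigned int to_retries;
};

/* Initial timeout multiplied by the total number of attempts. */
static unsigned long connect_timeout_sketch(const struct timeout_sketch *t)
{
	return t->to_initval * (t->to_retries + 1);
}

/* Mirrors the default TCP values from the listing. */
static const struct timeout_sketch tcp_default_sketch = {
	.to_initval = 60 * SKETCH_HZ,
	.to_maxval = 60 * SKETCH_HZ,
	.to_retries = 2,
};
```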
|
2006-12-06 04:35:51 +07:00
|
|
|
|
2009-09-10 21:33:30 +07:00
|
|
|
/**
|
|
|
|
* xs_setup_bc_tcp - Set up transport to use a TCP backchannel socket
|
|
|
|
* @args: rpc transport creation arguments
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static struct rpc_xprt *xs_setup_bc_tcp(struct xprt_create *args)
|
|
|
|
{
|
|
|
|
struct sockaddr *addr = args->dstaddr;
|
|
|
|
struct rpc_xprt *xprt;
|
|
|
|
struct sock_xprt *transport;
|
|
|
|
struct svc_sock *bc_sock;
|
2010-05-26 19:42:24 +07:00
|
|
|
struct rpc_xprt *ret;
|
2009-09-10 21:33:30 +07:00
|
|
|
|
2011-07-18 05:11:30 +07:00
|
|
|
xprt = xs_setup_xprt(args, xprt_tcp_slot_table_entries,
|
|
|
|
xprt_tcp_slot_table_entries);
|
2009-09-10 21:33:30 +07:00
|
|
|
if (IS_ERR(xprt))
|
|
|
|
return xprt;
|
|
|
|
transport = container_of(xprt, struct sock_xprt, xprt);
|
|
|
|
|
|
|
|
xprt->prot = IPPROTO_TCP;
|
|
|
|
xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32);
|
|
|
|
xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
|
|
|
|
xprt->timeout = &xs_tcp_default_timeout;
|
|
|
|
|
|
|
|
/* backchannel */
|
|
|
|
xprt_set_bound(xprt);
|
|
|
|
xprt->bind_timeout = 0;
|
|
|
|
xprt->reestablish_timeout = 0;
|
|
|
|
xprt->idle_timeout = 0;
|
|
|
|
|
|
|
|
xprt->ops = &bc_tcp_ops;
|
|
|
|
|
|
|
|
switch (addr->sa_family) {
|
|
|
|
case AF_INET:
|
|
|
|
xs_format_peer_addresses(xprt, "tcp",
|
|
|
|
RPCBIND_NETID_TCP);
|
|
|
|
break;
|
|
|
|
case AF_INET6:
|
|
|
|
xs_format_peer_addresses(xprt, "tcp",
|
|
|
|
RPCBIND_NETID_TCP6);
|
|
|
|
break;
|
|
|
|
default:
|
2010-05-26 19:42:24 +07:00
|
|
|
ret = ERR_PTR(-EAFNOSUPPORT);
|
|
|
|
goto out_err;
|
2009-09-10 21:33:30 +07:00
|
|
|
}
|
|
|
|
|
2010-10-05 23:49:35 +07:00
|
|
|
dprintk("RPC: set up xprt to %s (port %s) via %s\n",
|
|
|
|
xprt->address_strings[RPC_DISPLAY_ADDR],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PORT],
|
|
|
|
xprt->address_strings[RPC_DISPLAY_PROTO]);
|
2009-09-10 21:33:30 +07:00
|
|
|
|
2010-12-09 00:45:44 +07:00
|
|
|
/*
|
|
|
|
* Once we've associated a backchannel xprt with a connection,
|
2013-11-30 16:56:44 +07:00
|
|
|
* we want to keep it around as long as the connection lasts,
|
|
|
|
* in case we need to start using it for a backchannel again;
|
|
|
|
* this reference won't be dropped until bc_xprt is destroyed.
|
2010-12-09 00:45:44 +07:00
|
|
|
*/
|
|
|
|
xprt_get(xprt);
|
|
|
|
args->bc_xprt->xpt_bc_xprt = xprt;
|
|
|
|
xprt->bc_xprt = args->bc_xprt;
|
|
|
|
bc_sock = container_of(args->bc_xprt, struct svc_sock, sk_xprt);
|
|
|
|
transport->sock = bc_sock->sk_sock;
|
|
|
|
transport->inet = bc_sock->sk_sk;
|
|
|
|
|
2009-09-10 21:33:30 +07:00
|
|
|
/*
|
|
|
|
* Since we don't want connections for the backchannel, we set
|
|
|
|
* the xprt status to connected
|
|
|
|
*/
|
|
|
|
xprt_set_connected(xprt);
|
|
|
|
|
|
|
|
if (try_module_get(THIS_MODULE))
|
|
|
|
return xprt;
|
2014-03-24 11:00:28 +07:00
|
|
|
|
|
|
|
args->bc_xprt->xpt_bc_xprt = NULL;
|
2016-05-17 23:38:21 +07:00
|
|
|
args->bc_xprt->xpt_bc_xps = NULL;
|
2010-12-09 00:45:44 +07:00
|
|
|
xprt_put(xprt);
|
2010-05-26 19:42:24 +07:00
|
|
|
ret = ERR_PTR(-EINVAL);
|
|
|
|
out_err:
|
2014-03-24 10:07:22 +07:00
|
|
|
xs_xprt_free(xprt);
|
2010-05-26 19:42:24 +07:00
|
|
|
return ret;
|
2009-09-10 21:33:30 +07:00
|
|
|
}
|
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
static struct xprt_class xs_local_transport = {
|
|
|
|
.list = LIST_HEAD_INIT(xs_local_transport.list),
|
|
|
|
.name = "named UNIX socket",
|
|
|
|
.owner = THIS_MODULE,
|
|
|
|
.ident = XPRT_TRANSPORT_LOCAL,
|
|
|
|
.setup = xs_setup_local,
|
|
|
|
};
|
|
|
|
|
2007-09-11 00:46:39 +07:00
|
|
|
static struct xprt_class xs_udp_transport = {
|
|
|
|
.list = LIST_HEAD_INIT(xs_udp_transport.list),
|
|
|
|
.name = "udp",
|
|
|
|
.owner = THIS_MODULE,
|
2009-09-10 21:33:30 +07:00
|
|
|
.ident = XPRT_TRANSPORT_UDP,
|
2007-09-11 00:46:39 +07:00
|
|
|
.setup = xs_setup_udp,
|
|
|
|
};
|
|
|
|
|
|
|
|
static struct xprt_class xs_tcp_transport = {
|
|
|
|
.list = LIST_HEAD_INIT(xs_tcp_transport.list),
|
|
|
|
.name = "tcp",
|
|
|
|
.owner = THIS_MODULE,
|
2009-09-10 21:33:30 +07:00
|
|
|
.ident = XPRT_TRANSPORT_TCP,
|
2007-09-11 00:46:39 +07:00
|
|
|
.setup = xs_setup_tcp,
|
|
|
|
};
|
|
|
|
|
2009-09-10 21:33:30 +07:00
|
|
|
static struct xprt_class xs_bc_tcp_transport = {
|
|
|
|
.list = LIST_HEAD_INIT(xs_bc_tcp_transport.list),
|
|
|
|
.name = "tcp NFSv4.1 backchannel",
|
|
|
|
.owner = THIS_MODULE,
|
|
|
|
.ident = XPRT_TRANSPORT_BC_TCP,
|
|
|
|
.setup = xs_setup_bc_tcp,
|
|
|
|
};
|
|
|
|
|
2006-12-06 04:35:51 +07:00
|
|
|
/**
|
2007-09-11 00:46:39 +07:00
|
|
|
* init_socket_xprt - set up xprtsock's sysctls, register with RPC client
|
2006-12-06 04:35:51 +07:00
|
|
|
*
|
|
|
|
*/
|
|
|
|
int init_socket_xprt(void)
|
|
|
|
{
|
2014-11-18 04:58:04 +07:00
|
|
|
#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
|
2007-02-14 15:33:24 +07:00
|
|
|
if (!sunrpc_table_header)
|
2007-02-14 15:34:09 +07:00
|
|
|
sunrpc_table_header = register_sysctl_table(sunrpc_table);
|
2006-12-06 04:35:54 +07:00
|
|
|
#endif
|
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
xprt_register_transport(&xs_local_transport);
|
2007-09-11 00:46:39 +07:00
|
|
|
xprt_register_transport(&xs_udp_transport);
|
|
|
|
xprt_register_transport(&xs_tcp_transport);
|
2009-09-10 21:33:30 +07:00
|
|
|
xprt_register_transport(&xs_bc_tcp_transport);
|
2007-09-11 00:46:39 +07:00
|
|
|
|
2006-12-06 04:35:51 +07:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2007-09-11 00:46:39 +07:00
|
|
|
* cleanup_socket_xprt - remove xprtsock's sysctls, unregister
|
2006-12-06 04:35:51 +07:00
|
|
|
*
|
|
|
|
*/
|
|
|
|
void cleanup_socket_xprt(void)
|
|
|
|
{
|
2014-11-18 04:58:04 +07:00
|
|
|
#if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
|
2006-12-06 04:35:54 +07:00
|
|
|
if (sunrpc_table_header) {
|
|
|
|
unregister_sysctl_table(sunrpc_table_header);
|
|
|
|
sunrpc_table_header = NULL;
|
|
|
|
}
|
|
|
|
#endif
|
2007-09-11 00:46:39 +07:00
|
|
|
|
2011-05-10 02:22:44 +07:00
|
|
|
xprt_unregister_transport(&xs_local_transport);
|
2007-09-11 00:46:39 +07:00
|
|
|
xprt_unregister_transport(&xs_udp_transport);
|
|
|
|
xprt_unregister_transport(&xs_tcp_transport);
|
2009-09-10 21:33:30 +07:00
|
|
|
xprt_unregister_transport(&xs_bc_tcp_transport);
|
2006-12-06 04:35:51 +07:00
|
|
|
}
|
2009-08-10 02:06:19 +07:00
|
|
|
|
2010-08-12 12:04:12 +07:00
|
|
|
static int param_set_uint_minmax(const char *val,
|
|
|
|
const struct kernel_param *kp,
|
2009-08-10 02:06:19 +07:00
|
|
|
unsigned int min, unsigned int max)
|
|
|
|
{
|
2014-06-21 19:06:38 +07:00
|
|
|
unsigned int num;
|
2009-08-10 02:06:19 +07:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (!val)
|
|
|
|
return -EINVAL;
|
2014-06-21 19:06:38 +07:00
|
|
|
ret = kstrtouint(val, 0, &num);
|
2017-02-19 04:34:59 +07:00
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
if (num < min || num > max)
|
2009-08-10 02:06:19 +07:00
|
|
|
return -EINVAL;
|
|
|
|
*((unsigned int *)kp->arg) = num;
|
|
|
|
return 0;
|
|
|
|
}
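The function above parses a module parameter string as an unsigned integer and stores it only if it lies within `[min, max]`. A rough userspace sketch of the same pattern follows. It uses `strtoul` from the C standard library rather than the kernel's `kstrtouint`, so the accepted input syntax is not identical (for example, this version rejects a trailing newline). The name `set_uint_minmax_sketch` is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue: parse val (base auto-detected, as with
 * base 0 in the original) and store it in *out only when min <= value <= max.
 * On any failure *out is left untouched.
 */
static int set_uint_minmax_sketch(const char *val, unsigned int *out,
				  unsigned int min, unsigned int max)
{
	char *end;
	unsigned long num;

	if (!val || *val == '\0')
		return -EINVAL;

	errno = 0;
	num = strtoul(val, &end, 0);
	/* Reject overflow, empty conversions, and trailing garbage. */
	if (errno != 0 || end == val || *end != '\0')
		return -EINVAL;

	if (num < min || num > max)
		return -EINVAL;

	*out = (unsigned int)num;
	return 0;
}
```

Keeping the bounds as parameters lets one parser serve several setters, as the port-number and slot-table setters in the listing do by passing their own limits.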
|
|
|
|
|
2010-08-12 12:04:12 +07:00
|
|
|
static int param_set_portnr(const char *val, const struct kernel_param *kp)
|
2009-08-10 02:06:19 +07:00
|
|
|
{
|
2016-07-09 04:35:25 +07:00
|
|
|
return param_set_uint_minmax(val, kp,
|
2018-10-19 02:27:02 +07:00
|
|
|
RPC_MIN_RESVPORT,
|
2009-08-10 02:06:19 +07:00
|
|
|
RPC_MAX_RESVPORT);
|
|
|
|
}
|
|
|
|
|
2015-05-27 08:39:38 +07:00
|
|
|
static const struct kernel_param_ops param_ops_portnr = {
|
2010-08-12 12:04:12 +07:00
|
|
|
.set = param_set_portnr,
|
|
|
|
.get = param_get_uint,
|
|
|
|
};
|
|
|
|
|
2009-08-10 02:06:19 +07:00
|
|
|
#define param_check_portnr(name, p) \
|
|
|
|
__param_check(name, p, unsigned int);
|
|
|
|
|
|
|
|
module_param_named(min_resvport, xprt_min_resvport, portnr, 0644);
|
|
|
|
module_param_named(max_resvport, xprt_max_resvport, portnr, 0644);
|
|
|
|
|
2010-08-12 12:04:12 +07:00
|
|
|
static int param_set_slot_table_size(const char *val,
|
|
|
|
const struct kernel_param *kp)
|
2009-08-10 02:06:19 +07:00
|
|
|
{
|
|
|
|
return param_set_uint_minmax(val, kp,
|
|
|
|
RPC_MIN_SLOT_TABLE,
|
|
|
|
RPC_MAX_SLOT_TABLE);
|
|
|
|
}
|
|
|
|
|
2015-05-27 08:39:38 +07:00
|
|
|
static const struct kernel_param_ops param_ops_slot_table_size = {
|
2010-08-12 12:04:12 +07:00
|
|
|
.set = param_set_slot_table_size,
|
|
|
|
.get = param_get_uint,
|
|
|
|
};
|
|
|
|
|
2009-08-10 02:06:19 +07:00
|
|
|
#define param_check_slot_table_size(name, p) \
|
|
|
|
__param_check(name, p, unsigned int);
|
|
|
|
|
2011-07-18 05:11:30 +07:00
|
|
|
static int param_set_max_slot_table_size(const char *val,
|
|
|
|
const struct kernel_param *kp)
|
|
|
|
{
|
|
|
|
return param_set_uint_minmax(val, kp,
|
|
|
|
RPC_MIN_SLOT_TABLE,
|
|
|
|
RPC_MAX_SLOT_TABLE_LIMIT);
|
|
|
|
}
|
|
|
|
|
2015-05-27 08:39:38 +07:00
|
|
|
static const struct kernel_param_ops param_ops_max_slot_table_size = {
|
2011-07-18 05:11:30 +07:00
|
|
|
.set = param_set_max_slot_table_size,
|
|
|
|
.get = param_get_uint,
|
|
|
|
};
|
|
|
|
|
|
|
|
#define param_check_max_slot_table_size(name, p) \
|
|
|
|
__param_check(name, p, unsigned int);
|
|
|
|
|
2009-08-10 02:06:19 +07:00
|
|
|
module_param_named(tcp_slot_table_entries, xprt_tcp_slot_table_entries,
|
|
|
|
slot_table_size, 0644);
|
2011-07-18 05:11:30 +07:00
|
|
|
module_param_named(tcp_max_slot_table_entries, xprt_max_tcp_slot_table_entries,
|
|
|
|
max_slot_table_size, 0644);
|
2009-08-10 02:06:19 +07:00
|
|
|
module_param_named(udp_slot_table_entries, xprt_udp_slot_table_entries,
|
|
|
|
slot_table_size, 0644);
|