linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-26 18:35:26 +07:00

Author	SHA1	Message	Date
Trond Myklebust	400417b05f	pNFS: Fix a typo in pnfs_update_layout We're supposed to wait for the outstanding layout count to go to zero, but that got lost somehow. Fixes: `d03360aaf5` ("pNFS: Ensure we return the error if someone...") Reported-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-12 16:04:51 -04:00
Olga Kornievskaia	f87b543af4	fix null pointer deref in tracepoints in back channel Backchannel doesn't have the rq_task->tk_clientid pointer set. Otherwise can lead to the following oops: ocalhost login: [ 111.385319] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 [ 111.388073] #PF error: [normal kernel read fault] [ 111.389452] PGD 80000000290d8067 P4D 80000000290d8067 PUD 75f25067 PMD 0 [ 111.391224] Oops: 0000 [#1] SMP PTI [ 111.392151] CPU: 0 PID: 3533 Comm: NFSv4 callback Not tainted 5.0.0-rc7+ #1 [ 111.393787] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 111.396340] RIP: 0010:trace_event_raw_event_xprt_enq_xmit+0x6f/0xf0 [sunrpc] [ 111.397974] Code: 00 00 00 48 89 ee 48 89 e7 e8 bd 0a 85 d7 48 85 c0 74 4a 41 0f b7 94 24 e0 00 00 00 48 89 e7 89 50 08 49 8b 94 24 a8 00 00 00 <8b> 52 04 89 50 0c 49 8b 94 24 c0 00 00 00 8b 92 a8 00 00 00 0f ca [ 111.402215] RSP: 0018:ffffb98743263cf8 EFLAGS: 00010286 [ 111.403406] RAX: ffffa0890fc3bc88 RBX: 0000000000000003 RCX: 0000000000000000 [ 111.405057] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffb98743263cf8 [ 111.406656] RBP: ffffa0896f5368f0 R08: 0000000000000246 R09: 0000000000000000 [ 111.408437] R10: ffffe19b01c01500 R11: 0000000000000000 R12: ffffa08977d28a00 [ 111.410210] R13: 0000000000000004 R14: ffffa089315303f0 R15: ffffa08931530000 [ 111.411856] FS: 0000000000000000(0000) GS:ffffa0897bc00000(0000) knlGS:0000000000000000 [ 111.413699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 111.415068] CR2: 0000000000000004 CR3: 000000002ac90004 CR4: 00000000001606f0 [ 111.416745] Call Trace: [ 111.417339] xprt_request_enqueue_transmit+0x2b6/0x4a0 [sunrpc] [ 111.418709] ? rpc_task_need_encode+0x40/0x40 [sunrpc] [ 111.419957] call_bc_transmit+0xd5/0x170 [sunrpc] [ 111.421067] __rpc_execute+0x7e/0x3f0 [sunrpc] [ 111.422177] rpc_run_bc_task+0x78/0xd0 [sunrpc] [ 111.423212] bc_svc_process+0x281/0x340 [sunrpc] [ 111.424325] nfs41_callback_svc+0x130/0x1c0 [nfsv4] [ 111.425430] ? remove_wait_queue+0x60/0x60 [ 111.426398] kthread+0xf5/0x130 [ 111.427155] ? nfs_callback_authenticate+0x50/0x50 [nfsv4] [ 111.428388] ? kthread_bind+0x10/0x10 [ 111.429270] ret_from_fork+0x1f/0x30 localhost login: [ 467.462259] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 [ 467.464411] #PF error: [normal kernel read fault] [ 467.465445] PGD 80000000728c1067 P4D 80000000728c1067 PUD 728c0067 PMD 0 [ 467.466980] Oops: 0000 [#1] SMP PTI [ 467.467759] CPU: 0 PID: 3517 Comm: NFSv4 callback Not tainted 5.0.0-rc7+ #1 [ 467.469393] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 467.471840] RIP: 0010:trace_event_raw_event_xprt_transmit+0x7c/0xf0 [sunrpc] [ 467.473392] Code: f6 48 85 c0 74 4b 49 8b 94 24 98 00 00 00 48 89 e7 0f b7 92 e0 00 00 00 89 50 08 49 8b 94 24 98 00 00 00 48 8b 92 a8 00 00 00 <8b> 52 04 89 50 0c 41 8b 94 24 a8 00 00 00 0f ca 89 50 10 41 8b 94 [ 467.477605] RSP: 0018:ffffabe7434fbcd0 EFLAGS: 00010282 [ 467.478793] RAX: ffff99720fc3bce0 RBX: 0000000000000003 RCX: 0000000000000000 [ 467.480409] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffabe7434fbcd0 [ 467.482011] RBP: ffff99726f631948 R08: 0000000000000246 R09: 0000000000000000 [ 467.483591] R10: 0000000070000000 R11: 0000000000000000 R12: ffff997277dfcc00 [ 467.485226] R13: 0000000000000000 R14: 0000000000000000 R15: ffff99722fecdca8 [ 467.486830] FS: 0000000000000000(0000) GS:ffff99727bc00000(0000) knlGS:0000000000000000 [ 467.488596] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 467.489931] CR2: 0000000000000004 CR3: 00000000270e6006 CR4: 00000000001606f0 [ 467.491559] Call Trace: [ 467.492128] xprt_transmit+0x303/0x3f0 [sunrpc] [ 467.493143] ? rpc_task_need_encode+0x40/0x40 [sunrpc] [ 467.494328] call_bc_transmit+0x49/0x170 [sunrpc] [ 467.495379] __rpc_execute+0x7e/0x3f0 [sunrpc] [ 467.496451] rpc_run_bc_task+0x78/0xd0 [sunrpc] [ 467.497467] bc_svc_process+0x281/0x340 [sunrpc] [ 467.498507] nfs41_callback_svc+0x130/0x1c0 [nfsv4] [ 467.499751] ? remove_wait_queue+0x60/0x60 [ 467.500686] kthread+0xf5/0x130 [ 467.501438] ? nfs_callback_authenticate+0x50/0x50 [nfsv4] [ 467.502640] ? kthread_bind+0x10/0x10 [ 467.503454] ret_from_fork+0x1f/0x30 Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-12 16:01:39 -04:00
Trond Myklebust	4d6c671ace	SUNRPC: Take the transport send lock before binding+connecting Before trying to bind a port, ensure we grab the send lock to ensure that we don't change the port while another task is busy transmitting requests. The connect code already takes the send lock in xprt_connect(), but it is harmless to take it before that. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-10 14:08:19 -04:00
Trond Myklebust	009a82f643	SUNRPC: Micro-optimise when the task is known not to be sleeping In cases where we know the task is not sleeping, try to optimise away the indirect call to task->tk_action() by replacing it with a direct call. Only change tail calls, to allow gcc to perform tail call elimination. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-10 14:08:19 -04:00
Trond Myklebust	03e51d32da	SUNRPC: Check whether the task was transmitted before rebind/reconnect Before initiating transport actions that require putting the task to sleep, such as rebinding or reconnecting, we should check whether or not the task was already transmitted. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-10 11:55:58 -04:00
Trond Myklebust	6b5f590016	SUNRPC: Remove redundant calls to RPC_IS_QUEUED() The RPC task wakeup calls all check for RPC_IS_QUEUED() before taking any locks. In addition, rpc_exit() already calls rpc_wake_up_queued_task(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-09 16:22:58 -05:00
Trond Myklebust	cea57789e4	SUNRPC: Clean up Replace remaining callers of call_timeout() with rpc_check_timeout(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-09 16:22:58 -05:00
Trond Myklebust	7b3fef8e41	SUNRPC: Respect RPC call timeouts when retrying transmission Fix a regression where soft and softconn requests are not timing out as expected. Fixes: `89f90fe1ad` ("SUNRPC: Allow calls to xprt_transmit() to drain...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-07 16:45:51 -05:00
Trond Myklebust	477687e111	SUNRPC: Fix up RPC back channel transmission Now that transmissions happen through a queue, we require the RPC tasks to handle error conditions that may have been set while they were sleeping. The back channel does not currently do this, but assumes that any error condition happens during its own call to xprt_transmit(). The solution is to ensure that the back channel splits out the error handling just like the forward channel does. Fixes: `89f90fe1ad` ("SUNRPC: Allow calls to xprt_transmit() to drain...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-07 16:45:21 -05:00
Trond Myklebust	ed7dc973bd	SUNRPC: Prevent thundering herd when the socket is not connected If the socket is not connected, then we want to initiate a reconnect rather that trying to transmit requests. If there is a large number of requests queued and waiting for the lock in call_transmit(), then it can take a while for one of the to loop back and retake the lock in call_connect. Fixes: `89f90fe1ad` ("SUNRPC: Allow calls to xprt_transmit() to drain...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-07 16:45:19 -05:00
Trond Myklebust	0d1bf3407c	SUNRPC: Allow dynamic allocation of back channel slots Now that the reads happen in a process context rather than a softirq, it is safe to allocate back channel slots using a reclaiming allocation. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-02 16:25:26 -05:00
Trond Myklebust	067c469671	NFSv4.1: Bump the default callback session slot count to 16 Users can still control this value explicitly using the max_session_cb_slots module parameter, but let's bump the default up to 16 for now. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-02 16:25:26 -05:00
Trond Myklebust	12a3ad6184	SUNRPC: Convert remaining GFP_NOIO, and GFP_NOWAIT sites in sunrpc Convert the remaining gfp_flags arguments in sunrpc to standard reclaiming allocations, now that we set memalloc_nofs_save() as appropriate. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-02 16:25:26 -05:00
Trond Myklebust	cefa587a40	NFS/flexfiles: Clean up mirror DS initialisation Get rid of the redundant parameter and rename the function ff_layout_mirror_valid() to ff_layout_init_mirror_ds() for clarity. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:39 -05:00
Trond Myklebust	29a23909e4	NFS/flexfiles: Remove dead code in ff_layout_mirror_valid() nfs4_ff_alloc_deviceid_node() guarantees that if mirror->mirror_ds is a valid pointer, then so is mirror->mirror_ds->ds. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	4cbc8a571c	NFS/flexfile: Simplify nfs4_ff_layout_select_ds_stateid() Pass in a pointer to the mirror rather than forcing another array access. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	626d48b12c	NFS/flexfile: Simplify nfs4_ff_layout_ds_version() Pass in a pointer to the mirror rather than forcing another array access. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	312cd4cb12	NFS/flexfiles: Simplify ff_layout_get_ds_cred() Pass in a pointer to the mirror rather than forcing another array access. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	561d6f8aaf	NFS/flexfiles: Simplify nfs4_ff_find_or_create_ds_client() Pass in a pointer to the mirror rather than forcing another array access. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	749da527b3	NFS/flexfiles: Simplify nfs4_ff_layout_select_ds_fh() Pass in a pointer to the mirror rather than having to retrieve it from the array and then verify the resulting pointer. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	76c6690522	NFS/flexfiles: Speed up read failover when DSes are down If we notice that a DS may be down, we should attempt to read from the other mirrors first before we go back to retry the dead DS. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	17aaec8167	NFS/flexfiles: Don't invalidate DS deviceids for being unresponsive If the DS is unresponsive, we want to just mark it as such, while reporting the errors. If the server later returns the same deviceid in a new layout, then we don't want to have to look it up again. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	d082d4b5a0	NFS/flexfiles: Remove bogus checks for invalid deviceids We already check the deviceids before we start the RPC call. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:38 -05:00
Trond Myklebust	0a156dd582	NFS/flexfiles: Avoid unnecessary layout invalidations In ff_layout_mirror_valid() we may not want to invalidate the layout segment despite the call to GETDEVICEINFO failing. The reason is that a read may still be able to make progress on another mirror. So instead we let the caller (in this case nfs4_ff_layout_prepare_ds()) decide whether or not it needs to invalidate. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:37 -05:00
Trond Myklebust	2444ff277a	NFS/flexfiles: refactor calls to fs4_ff_layout_prepare_ds() While we may want to skip attempting to connect to a downed mirror when we're deciding which mirror to select for a read, we do not want to do so once we've committed to attempting the I/O in ff_layout_read/write_pagelist(), or ff_layout_initiate_commit() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:37 -05:00
Trond Myklebust	18c0778a65	NFSv4: Handle early exit in layoutget by returning an error If the LAYOUTGET rpc call exits early without an error, convert it to EAGAIN. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:37 -05:00
Trond Myklebust	f0922a6c0c	NFS/flexfiles: Send LAYOUTERROR when failing over mirrored reads When a read to the preferred mirror returns an error, the flexfiles driver records the error in the inode list and currently marks the layout for return before failing over the attempted read to the next mirror. What we actually want to do is fire off a LAYOUTERROR to notify the MDS that there is an issue with the preferred mirror, then we fail over. Only once we've failed to read from all mirrors should we return the layout. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 22:37:37 -05:00
Trond Myklebust	3eb86093ea	NFSv4.2: Add client support for the generic 'layouterror' RPC call Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 16:20:16 -05:00
Trond Myklebust	a79f194aa4	NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated If a layout segment gets invalidated while a pNFS I/O operation is queued for transmission, then we ideally want to abort immediately. This is particularly the case when there is a large number of I/O related RPCs queued in the RPC layer, and the layout segment gets invalidated due to an ENOSPC error, or an EACCES (because the client was fenced). We may end up forced to spam the MDS with a lot of otherwise unnecessary LAYOUTERRORs after that I/O fails. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 16:20:16 -05:00
Trond Myklebust	39a5201a2b	NFSv4/pnfs: Fix barriers in nfs4_mark_deviceid_unavailable() Fix the memory barriers in nfs4_mark_deviceid_unavailable() and nfs4_test_deviceid_unavailable(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 16:20:16 -05:00
Trond Myklebust	762bb7e973	NFS/flexfiles: Fix up sparse RCU annotations Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 16:20:16 -05:00
Trond Myklebust	108bb4afd3	NFSv4/flexfiles: Fix invalid deref in FF_LAYOUT_DEVID_NODE() If the attempt to instantiate the mirror's layout DS pointer failed, then that pointer may hold a value of type ERR_PTR(), so we need to check that before we dereference it. Fixes: `65990d1afb` ("pNFS/flexfiles: Fix a deadlock on LAYOUTGET") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 16:20:16 -05:00
Anna Schumaker	1a3466aed3	NFS: Add missing encode / decode sequence_maxsz to v4.2 operations These really should have been there from the beginning, but we never noticed because there was enough slack in the RPC request for the extra bytes. Chuck's recent patch to use au_cslack and au_rslack to compute buffer size shrunk the buffer enough that this was now a problem for SEEK operations on my test client. Fixes: `f4ac1674f5` ("nfs: Add ALLOCATE support") Fixes: `2e72448b07` ("NFS: Add COPY nfs operation") Fixes: `cb95deea0b` ("NFS OFFLOAD_CANCEL xdr") Fixes: `624bd5b7b6` ("nfs: Add DEALLOCATE support") Fixes: `1c6dcbe5ce` ("NFS: Implement SEEK") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 16:19:46 -05:00
Trond Myklebust	c71c46f015	NFSv4.1: Don't process the sequence op more than once. Ensure that if we call nfs41_sequence_process() a second time for the same rpc_task, then we only process the results once. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-03-01 12:16:28 -05:00
Trond Myklebust	c1dffe0bf7	NFSv4.1: Reinitialise sequence results before retransmitting a request If we have to retransmit a request, we should ensure that we reinitialise the sequence results structure, since in the event of a signal we need to treat the request as if it had not been sent. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org	2019-03-01 12:13:34 -05:00
Trond Myklebust	a73881c96d	SUNRPC: Fix an Oops in udp_poll() udp_poll() checks the struct file for the O_NONBLOCK flag, so we must not call it with a NULL file pointer. Fixes: `0ffe86f480` ("SUNRPC: Use poll() to fix up the socket requeue races") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-26 06:33:02 -05:00
Trond Myklebust	06b5fc3ad9	NFSoRDMA client updates for 5.1 New features: - Convert rpc auth layer to use xdr_streams - Config option to disable insecure enctypes - Reduce size of RPC receive buffers Bugfixes and cleanups: - Fix sparse warnings - Check inline size before providing a write chunk - Reduce the receive doorbell rate - Various tracepoint improvements -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlxwX1IACgkQ18tUv7Cl QOv0tBAA3VXVuKdAtUH4b70q4ufBLkwz40puenDzlQEZXa4XjsGif+Iq62qHmAWW oYQfdaof3P+p1G/k9wEmFd6g+vk75a+2QYmmnlzVcSoOHc1teg8we39AbQt6Nz4X CZnb1VAuVYctprMXatZugyKsHi+EGWX4raUDtVlx8Zbte6BOSlzn/Cbnvvozeyi4 bMDQ5mi6vof/20o1qf9FhIjrx3UTYvqF6XOPDdMsQZs8pxDF8Z21LiRgKpPTRNrb ci1oIaqraai5SV2riDtMpVnGxR+GDQXaYnyozPnF7kFOwG5nIFyQ56m5aTd2ntd2 q09lRBHnmiy2sWaocoziXqUonnNi1sZI+fbdCzSTRD45tM0B34DkrvOKsDJuzuba m5xZqpoI8hL874EO0AFSEkPmv55BF+K7IMotPmzGo7i4ic+IlyLACDUXh5OkPx6D 2VSPvXOoAY1U4iJGg6LS9aLWNX99ShVJAuhD5InUW12FLC4GuRwVTIWY3v1s5TIJ boUe2EFVoKIxwVkNvf5tKAR1LTNsqtFBPTs1ENtXIdFo1+9ucZX7REhp3bxTlODM HheDAqUjlVV5CboB+c1Pggekyv3ON8ihyV3P+dlZ6MFwHnN9s8YOPcReQ91quBZY 0RNIMaNo2lBgLrkvCMlbDC05AZG6P8LuKhPTcAQ4+7/vfL4PpWI= =EiRa -----END PGP SIGNATURE----- Merge tag 'nfs-rdma-for-5.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs NFSoRDMA client updates for 5.1 New features: - Convert rpc auth layer to use xdr_streams - Config option to disable insecure enctypes - Reduce size of RPC receive buffers Bugfixes and cleanups: - Fix sparse warnings - Check inline size before providing a write chunk - Reduce the receive doorbell rate - Various tracepoint improvements [Trond: Fix up merge conflicts] Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-25 09:35:49 -05:00
Trond Myklebust	5085607d20	NFS/pnfs: Bulk destroy of layouts needs to be safe w.r.t. umount If a bulk layout recall or a metadata server reboot coincides with a umount, then holding a reference to an inode is unsafe unless we also hold a reference to the super block. Fixes: `fd9a8d7160` ("NFSv4.1: Fix bulk recall and destroy of layouts") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-23 13:59:29 -05:00
Trond Myklebust	6f9449be53	NFS: Fix a soft lockup in the delegation recovery code Fix a soft lockup when NFS client delegation recovery is attempted but the inode is in the process of being freed. When the igrab(inode) call fails, and we have to restart the recovery process, we need to ensure that we won't attempt to recover the same delegation again. Fixes: `45870d6909` ("NFSv4.1: Test delegation stateids when server...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-21 14:51:25 -05:00
Trond Myklebust	3453d5708b	NFSv4.1: Avoid false retries when RPC calls are interrupted A 'false retry' in NFSv4.1 occurs when the client attempts to transmit a new RPC call using a slot+sequence number combination that references an already cached one. Currently, the Linux NFS client will do this if a user process interrupts an RPC call that is in progress. The problem with doing so is that we defeat the main mechanism used by the server to differentiate between a new call and a replayed one. Even if the server is able to perfectly cache the arguments of the old call, it cannot know if the client intended to replay or send a new call. The obvious fix is to bump the sequence number pre-emptively if an RPC call is interrupted, but in order to deal with the corner cases where the interrupted call is not actually received and processed by the server, we need to interpret the error NFS4ERR_SEQ_MISORDERED as a sign that we need to either wait or locate a correct sequence number that lies between the value we sent, and the last value that was acked by a SEQUENCE call on that slot. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Jason Tibbitts <tibbs@math.uh.edu>	2019-02-21 13:22:43 -05:00
Trond Myklebust	6f903b111e	SUNRPC: Remove the redundant 'zerocopy' argument to xs_sendpages() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:35:58 -05:00
Trond Myklebust	c87dc4c73b	SUNRPC: Further cleanups of xs_sendpages() Now that we send the pages using a struct msghdr, instead of using sendpage(), we no longer need to 'prime the socket' with an address for unconnected UDP messages. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:35:58 -05:00
Trond Myklebust	0472e47660	SUNRPC: Convert socket page send code to use iov_iter() Simplify the page send code using iov_iter and bvecs. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:35:58 -05:00
Trond Myklebust	e791f8e938	SUNRPC: Convert xs_send_kvec() to use iov_iter_kvec() Prepare to the socket transmission code to use iov_iter. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:35:58 -05:00
Trond Myklebust	5f52a9d429	SUNRPC: Initiate a connection close on an ESHUTDOWN error in stream receive If the client stream receive code receives an ESHUTDOWN error either because the server closed the connection, or because it sent a callback which cannot be processed, then we should shut down the connection. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:35:58 -05:00
Trond Myklebust	727fcc64a0	SUNRPC: Don't suppress socket errors when a message read completes If the message read completes, but the socket returned an error condition, we should ensure to propagate that error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:35:58 -05:00
Trond Myklebust	e92053a52e	SUNRPC: Handle zero length fragments correctly A zero length fragment is really a bug, but let's ensure we don't go nuts when one turns up. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:35:58 -05:00
Trond Myklebust	ae05355151	SUNRPC: Don't reset the stream record info when the receive worker is running To ensure that the receive worker has exclusive access to the stream record info, we must not reset the contents other than when holding the transport->recv_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:33:55 -05:00
ZhangXiaoxu	ded52fbe70	nfs: fix xfstest generic/099 failed on nfsv3 After setxattr, the nfsv3 cached the acl which set by user. But at the backend, the shared file system (eg. ext4) will check the acl, if it can merged with mode, it won't add acl to the file. So, the nfsv3 cached acl is redundant. Don't 'set_cached_acl' when setxattr. Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:33:55 -05:00
Kazuo Ito	2cde04e90d	pNFS: Avoid read/modify/write when it is not necessary As the block and SCSI layouts can only read/write fixed-length blocks, we must perform read-modify-write when data to be written is not aligned to a block boundary or smaller than the block size. (`612aa983a0` pnfs: add flag to force read-modify-write in ->write_begin) The current code tries to see if we have to do read-modify-write on block-oriented pNFS layouts by just checking !PageUptodate(page), but the same condition also applies for overwriting of any uncached potions of existing files, making such operations excessively slow even it is block-aligned. The change does not affect the optimization for modify-write-read cases (`38c73044f5` NFS: read-modify-write page updating), because partial update of !PageUptodate() pages can only happen in layouts that can do arbitrary length read/write and never in block-based ones. Testing results: We ran fio on one of the pNFS clients running 4.20 kernel (vanilla and patched) in this configuration to read/write/overwrite files on the storage array, exported as pnfs share by the server. pNFS clients ---1G Ethernet--- pNFS server (HP DL360 G8) (HP DL360 G8) \| \| \| \| +------8G Fiber Channel--------+ \| Storage Array (HP P6350) Throughput of overwrite (both buffered and O_SYNC) is noticeably improved. Ops. \|block size\| Throughput \| \| (KiB) \| (MiB/s) \| \| \| 4.20 \| patched\| ---------+----------+----------------+ buffered \| 4\| 21.3 \| 232 \| overwrite\| 32\| 22.2 \| 256 \| \| 512\| 22.4 \| 260 \| ---------+----------+----------------+ O_SYNC \| 4\| 3.84\| 4.77\| overwrite\| 32\| 12.2 \| 32.0 \| \| 512\| 18.5 \| 152 \| ---------+----------+----------------+ Read and write (buffered and O_SYNC) by the same client remain unchanged by the patch either negatively or positively, as they should do. Ops. \|block size\| Throughput \| \| (KiB) \| (MiB/s) \| \| \| 4.20 \| patched\| ---------+----------+----------------+ read \| 4\| 548 \| 550 \| \| 32\| 547 \| 551 \| \| 512\| 548 \| 551 \| ---------+----------+----------------+ buffered \| 4\| 237 \| 244 \| write \| 32\| 261 \| 268 \| \| 512\| 265 \| 272 \| ---------+----------+----------------+ O_SYNC \| 4\| 0.46\| 0.46\| write \| 32\| 3.60\| 3.57\| \| 512\| 105 \| 106 \| ---------+----------+----------------+ Signed-off-by: Kazuo Ito <ito_kazuo_g3@lab.ntt.co.jp> Tested-by: Hiroyuki Watanabe <watanabe.hiroyuki@lab.ntt.co.jp> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2019-02-20 17:33:55 -05:00

1 2 3 4 5 ...

812451 Commits