linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-25 00:50:54 +07:00

Author	SHA1	Message	Date
Chuck Lever	1890896b4e	xprtrdma: Refactor rpcrdma_ep_connect I'm about to add another arm to if (ep->rep_connected != 0) It will be cleaner to use a switch statement here. We'll be looking for a couple of specific errnos, or "anything else," basically to sort out the difference between a normal reconnect and recovery from device removal. This is a refactoring change only. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-04-25 16:12:27 -04:00
Chuck Lever	bebd031866	xprtrdma: Support unplugging an HCA from under an NFS mount The device driver for the underlying physical device associated with an RPC-over-RDMA transport can be removed while RPC-over-RDMA transports are still in use (ie, while NFS filesystems are still mounted and active). The IB core performs a connection event upcall to request that consumers free all RDMA resources associated with a transport. There may be pending RPCs when this occurs. Care must be taken to release associated resources without leaving references that can trigger a subsequent crash if a signal or soft timeout occurs. We rely on the caller of the transport's ->close method to ensure that the previous RPC task has invoked xprt_release but the transport remains write-locked. A DEVICE_REMOVE upcall forces a disconnect then sleeps. When ->close is invoked, it destroys the transport's H/W resources, then wakes the upcall, which completes and allows the core driver unload to continue. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=266 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-04-25 16:12:24 -04:00
Chuck Lever	91a10c5297	xprtrdma: Use same device when mapping or syncing DMA buffers When the underlying device driver is reloaded, ia->ri_device will be replaced. All cached copies of that device pointer have to be updated as well. Commit `54cbd6b0c6` ("xprtrdma: Delay DMA mapping Send and Receive buffers") added the rg_device field to each regbuf. As part of handling a device removal, rpcrdma_dma_unmap_regbuf is invoked on all regbufs for a transport. Simply calling rpcrdma_dma_map_regbuf for each Receive buffer after the driver has been reloaded should reinitialize rg_device correctly for every case except rpcrdma_wc_receive, which still uses rpcrdma_rep::rr_device. Ensure the same device that was used to map a Receive buffer is also used to sync it in rpcrdma_wc_receive by using rg_device there instead of rr_device. This is the only use of rr_device, so it can be removed. The use of regbufs in the send path is also updated, for completeness. Fixes: `54cbd6b0c6` ("xprtrdma: Delay DMA mapping Send and ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-04-25 16:12:22 -04:00
Chuck Lever	fff09594ed	xprtrdma: Refactor rpcrdma_ia_open() In order to unload a device driver and reload it, xprtrdma will need to close a transport's interface adapter, and then call rpcrdma_ia_open again, possibly finding a different interface adapter. Make rpcrdma_ia_open safe to call on the same transport multiple times. This is a refactoring change only. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-04-25 16:12:20 -04:00
Chuck Lever	33849792cb	xprtrdma: Detect unreachable NFS/RDMA servers more reliably Current NFS clients rely on connection loss to determine when to retransmit. In particular, for protocols like NFSv4, clients no longer rely on RPC timeouts to drive retransmission: NFSv4 servers are required to terminate a connection when they need a client to retransmit pending RPCs. When a server is no longer reachable, either because it has crashed or because the network path has broken, the server cannot actively terminate a connection. Thus NFS clients depend on transport-level keepalive to determine when a connection must be replaced and pending RPCs retransmitted. However, RDMA RC connections do not have a native keepalive mechanism. If an NFS/RDMA server crashes after a client has sent RPCs successfully (an RC ACK has been received for all OTW RDMA requests), there is no way for the client to know the connection is moribund. In addition, new RDMA requests are subject to the RPC-over-RDMA credit limit. If the client has consumed all granted credits with NFS traffic, it is not allowed to send another RDMA request until the server replies. Thus it has no way to send a true keepalive when the workload has already consumed all credits with pending RPCs. To address this, forcibly disconnect a transport when an RPC times out. This prevents moribund connections from stopping the detection of failover or other configuration changes on the server. Note that even if the connection is still good, retransmitting any RPC will trigger a disconnect thanks to this logic in xprt_rdma_send_request: /* Must suppress retransmit to maintain credits */ if (req->rl_connect_cookie == xprt->connect_cookie) goto drop_connection; req->rl_connect_cookie = xprt->connect_cookie; Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-04-25 16:12:19 -04:00
Chuck Lever	e2a4f4fbef	sunrpc: Export xprt_force_disconnect() xprt_force_disconnect() is already invoked from the socket transport. I want to invoke xprt_force_disconnect() from the RPC-over-RDMA transport, which is a separate module from sunrpc.ko. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-04-25 16:12:16 -04:00
Chuck Lever	9378b274e1	xprtrdma: Cancel refresh worker during buffer shutdown Trying to create MRs while the transport is being torn down can cause a crash. Fixes: `e2ac236c0b` ("xprtrdma: Allocate MRs on demand") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2017-04-25 16:12:14 -04:00
Linus Torvalds	39da7c509a	Linux 4.11-rc6	2017-04-09 09:49:44 -07:00
Linus Torvalds	84ced7fd06	Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 Pull CIFS fixes from Steve French: "This is a set of CIFS/SMB3 fixes for stable. There is another set of four SMB3 reconnect fixes for stable in progress but they are still being reviewed/tested, so didn't want to wait any longer to send these five below" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: Reset TreeId to zero on SMB2 TREE_CONNECT CIFS: Fix build failure with smb2 Introduce cifs_copy_file_range() SMB3: Rename clone_range to copychunk_range Handle mismatched open calls	2017-04-09 09:10:02 -07:00
Linus Torvalds	462e9a355e	Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM fixes from Russell King: "A number of ARM fixes: - prevent oopses caused by dma_get_sgtable() and declared DMA coherent memory - fix boot failure on nommu caused by ID_PFR1 access - a number of kprobes fixes from Jon Medhurst and Masami Hiramatsu" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8665/1: nommu: access ID_PFR1 only if CPUID scheme ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory arm: kprobes: Align stack to 8-bytes in test code arm: kprobes: Fix the return address of multiple kretprobes arm: kprobes: Skip single-stepping in recursing path if possible arm: kprobes: Allow to handle reentered kprobe on single-stepping	2017-04-09 09:05:25 -07:00
Linus Torvalds	5b50be743f	Driver core fixes for 4.11-rc6 Here are 3 small fixes for 4.11-rc6. One resolves a reported issue with sysfs files that NeilBrown found, one is a documenatation fix for the stable kernel rules, and the last is a small MAINTAINERS file update for kernfs. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWOnrMw8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yk/JQCfQKjOpGDAR9Hs6u4YQ4hJrAHFneYAn1F4MLDW 3b0ZMnlZHkDq834UwKnB =iiei -----END PGP SIGNATURE----- Merge tag 'driver-core-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are 3 small fixes for 4.11-rc6. One resolves a reported issue with sysfs files that NeilBrown found, one is a documenatation fix for the stable kernel rules, and the last is a small MAINTAINERS file update for kernfs" * tag 'driver-core-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: MAINTAINERS: separate out kernfs maintainership sysfs: be careful of error returns from ops->show() Documentation: stable-kernel-rules: fix stable-tag format	2017-04-09 09:03:51 -07:00
Linus Torvalds	62e1fd08ed	Staging/IIO fixes for 4.11-rc6 Here are a number of small IIO and staging driver fixes for 4.11-rc6. Nothing big here, just iio fixes for reported issues, and an ashmem fix for a very old bug that has been reported by a number of Android vendors. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWOnsZA8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymb4ACfSnGU4ndDTKoyTaJ7B/ZO/RF5lZUAni9d3kYF 3Ztp0ssmF8PBNvQhyIs0 =aeZf -----END PGP SIGNATURE----- Merge tag 'staging-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/IIO driver rfixes from Greg KH: "Here are a number of small IIO and staging driver fixes for 4.11-rc6. Nothing big here, just iio fixes for reported issues, and an ashmem fix for a very old bug that has been reported by a number of Android vendors" * tag 'staging-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: android: ashmem: lseek failed due to no FMODE_LSEEK. iio: hid-sensor-attributes: Fix sensor property setting failure. iio: accel: hid-sensor-accel-3d: Fix duplicate scan index error iio: core: Fix IIO_VAL_FRACTIONAL_LOG2 for negative values iio: st_pressure: initialize lps22hb bootime iio: bmg160: reset chip when probing iio: cros_ec_sensors: Fix return value to get raw and calibbias data.	2017-04-09 09:02:31 -07:00
Linus Torvalds	2a610b8aa8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS fixes from Al Viro: "statx followup fixes and a fix for stack-smashing on alpha" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: alpha: fix stack smashing in old_adjtimex(2) statx: Include a mask for stx_attributes in struct statx statx: Reserve the top bit of the mask for future struct expansion xfs: report crtime and attribute flags to statx ext4: Add statx support statx: optimize copy of struct statx to userspace statx: remove incorrect part of vfs_statx() comment statx: reject unknown flags when using NULL path Documentation/filesystems: fix documentation for ->getattr()	2017-04-09 08:26:21 -07:00
Linus Torvalds	78d91a75b4	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Here's a pull request for 4.11-rc, fixing a set of issues mostly centered around the new scheduling framework. These have been brewing for a while, but split up into what we absolutely need in 4.11, and what we can defer until 4.12. These are well tested, on both single queue and multiqueue setups, and with and without shared tags. They fix several hangs that have happened in testing. This is obviously larger than I would have preferred at this point in time, but I don't think we can shave much off this and still get the desired results. In detail, this pull request contains: - a set of five fixes for NVMe, mostly from Christoph and one from Roland. - a series from Bart, fixing issues with dm-mq and SCSI shared tags and scheduling. Note that one of those patches commit messages may read like an optimization, but it is in fact an important fix for queue restarts in particular. - a series from Omar, most importantly fixing a hang with multiple hardware queues when we fail to get a driver tag. Another important fix in there is for resizing hardware queues, which nbd does when handling multiple sockets for one connection. - fixing an imbalance in putting the ctx for hctx request allocations from Minchan" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: Restart a single queue if tag sets are shared dm rq: Avoid that request processing stalls sporadically scsi: Avoid that SCSI queues get stuck blk-mq: Introduce blk_mq_delay_run_hw_queue() blk-mq: remap queues when adding/removing hardware queues blk-mq-sched: fix crash in switch error path blk-mq-sched: set up scheduler tags when bringing up new queues blk-mq-sched: refactor scheduler initialization blk-mq: use the right hctx when getting a driver tag fails nvmet: fix byte swap in nvmet_parse_io_cmd nvmet: fix byte swap in nvmet_execute_write_zeroes nvmet: add missing byte swap in nvmet_get_smart_log nvme: add missing byte swap in nvme_setup_discard nvme: Correct NVMF enum values to match NVMe-oF rev 1.0 block: do not put mq context in blk_mq_alloc_request_hctx	2017-04-08 11:56:58 -07:00
Linus Torvalds	c3df1c7c36	Late pin control fix for v4.11: An issue was detected with pin control hos on the Freescale i.MX after the refactorings for more general group and function handling. We now have the proper fix for this. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJY6LkVAAoJEEEQszewGV1znr0P/17ltjCxoR9qYkMsreCs6FIk BSCx2UEmYt03WKizyj1M1/YKP2NYcngp8TXsRsMyi7vqMjVoL1BsPo8BjFGNI7lq znLyUWuP3xo9Y/naagxkfLw5TbfNF4hyL0JBchvg6ox1Kt7Z47Sed7KDXtB5QQdJ WbU4Hdo6ZG/nvl3LAc1wivF3qtnBsxIzx6CMiR2dyiOmLGADHj7jiJ70BuRMyTlo 4no0Cfm93lnPo1ccNMVZY2Rqt09XhwPppewL7j2IqOin/Kr88qWKwdOheCu/Ojsp GJfTgKjVpieKW2PjkIiDDSiTKKkUvVmzEQz+qqXozjQSwwKtJ106xZ8fW+d5xFeY EJ3jsQtKdmI3q7M0mbYpfK0vM9C1MKMg71CJt8pvbtg2NXfAfLsA9BioVOGKrOua upy6RCMDhoBRh4jRjd5DcJPKRq45m/toVSZ+tfS1Nur2k3tXd41CI3y6D+wUlz95 oq8QW2bWsC52vLXS6qywJkUM7CQiBs61FIryf84YC7mE4AqRFJpCZfBqrUYLkctN 5OHF++wu6tEXYfgR6rtWY+c26xgc6PK/rALtYvzDC4o72Z0xQLlQqFnf6hGAp3Dl eosuW5TUvnlFUEMF3CEQwVHj3awpgdo6X4UnYDIxZDRU4R/vODH46s1H719TMIWx ZBztLllUHpn57LVRvudT =06og -----END PGP SIGNATURE----- Merge tag 'pinctrl-v4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fix from Linus Walleij: "This late fix for pin control is hopefully the last I send this cycle. The problem was detected early in the v4.11 release cycle and there has been some back and forth on how to solve it. Sadly the proper fix arrives late, but at least not too late. An issue was detected with pin control on the Freescale i.MX after the refactorings for more general group and function handling. We now have the proper fix for this" * tag 'pinctrl-v4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()	2017-04-08 11:43:38 -07:00
Linus Torvalds	894ca30cf6	powerpc fixes for 4.11 #7 Headed to stable: - disable HFSCR[TM] if TM is not supported, fixes a potential host kernel crash triggered by a hostile guest, but only in configurations that no one uses - don't try to fix up misaligned load-with-reservation instructions - fix flush_(d\|i)cache_range() called from modules on little endian kernels - add missing global TLB invalidate if cxl is active - fix missing preempt_disable() in crc32c-vpmsum And a fix for selftests build changes that went in this release: - selftests/powerpc: Fix standalone powerpc build Thanks to: Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran, Paul Mackerras. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJY6LIKAAoJEFHr6jzI4aWAhfcQAKORHx/tJf9w8KqcfSfKfeEL O8cZEl5/N3ArNXVM5J5QK5KnMVHnoWWR3FWYwntOjt3RJywjJYJ02YvhOVvt4q+M YinRS34KzAhnT1f526zx97v0BGqi//UJamrcFBUBTd4rLuHGbol7fdtWHVrsMYa0 KWQ+ooPLEpGDk4I3sDz37yeJBQXVpyhC/UF8vzHpvHGPvIQ8Dw8rfWwOZ0HooJuZ ewKdkeIsYF8SrM461c1GhOI0VXB0q+CMn9mzIaEKMuZMhHDKyiaM5rm8mWXapzcT HsCQKlF9X9YHAbhbSbz9DGvNCEYaW7T4vnudSNHjQaAJlA4HsmeRwWXy4+zqZuPc rIbRIFZAyV3wYowN7j3P6Se3lLBDMmlHZvVkygJnwoaR4rmoujePGwdAv8ZH4Udn hrbieC41HKVxcm5t3whIDOcHmxaAo1MDqmrVhyxJSjgnkdBtN/gnZXvHDb0VeOJV 9wFGGE8WvMXnTKEcjM2l+a14CuOrV/wRbHQ1B1O0Kfk613cPrukMYab6eLPqyJzF lmkCm1o46bib5oBOmvlqK+5oVuwNyfHmJSzvL+VOylhLVbJPmFJUhHQFssCvsTUf k36ZAUxH4fbz1TzAPipXl+wrkE/yzthGmA9FTC9hLkYE/rzvrZt9IKowFw1mq5n/ 2zFabXQBl5JBQ4hdL54f =bTuf -----END PGP SIGNATURE----- Merge tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.11: Headed to stable: - disable HFSCR[TM] if TM is not supported, fixes a potential host kernel crash triggered by a hostile guest, but only in configurations that no one uses - don't try to fix up misaligned load-with-reservation instructions - fix flush_(d\|i)cache_range() called from modules on little endian kernels - add missing global TLB invalidate if cxl is active - fix missing preempt_disable() in crc32c-vpmsum And a fix for selftests build changes that went in this release: - selftests/powerpc: Fix standalone powerpc build Thanks to: Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran, Paul Mackerras" * tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable() powerpc/mm: Add missing global TLB invalidate if cxl is active powerpc/64: Fix flush_(d\|i)cache_range() called from modules powerpc: Don't try to fix up misaligned load-with-reservation instructions powerpc: Disable HFSCR[TM] if TM is not supported selftests/powerpc: Fix standalone powerpc build	2017-04-08 11:06:12 -07:00
Chris Salls	cf01fb9985	mm/mempolicy.c: fix error handling in set_mempolicy and mbind. In the case that compat_get_bitmap fails we do not want to copy the bitmap to the user as it will contain uninitialized stack data and leak sensitive data. Signed-off-by: Chris Salls <salls@cs.ucsb.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 10:57:55 -07:00
Liping Zhang	425fffd886	sysctl: report EINVAL if value is larger than UINT_MAX for proc_douintvec Currently, inputting the following command will succeed but actually the value will be truncated: # echo 0x12ffffffff > /proc/sys/net/ipv4/tcp_notsent_lowat This is not friendly to the user, so instead, we should report error when the value is larger than UINT_MAX. Fixes: `e7d316a02f` ("sysctl: handle error writing UINT_MAX to u32 fields") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 10:27:40 -07:00
Tejun Heo	27f395b857	MAINTAINERS: separate out kernfs maintainership Separate out kernfs from driver core and add myself as a co-maintainer. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 18:15:32 +02:00
NeilBrown	c8a139d001	sysfs: be careful of error returns from ops->show() ops->show() can return a negative error code. Commit `65da3484d9` ("sysfs: correctly handle short reads on PREALLOC attrs.") (in v4.4) caused this to be stored in an unsigned 'size_t' variable, so errors would look like large numbers. As a result, if an error is returned, sysfs_kf_read() will return the value of 'count', typically 4096. Commit `17d0774f80` ("sysfs: correctly handle read offset on PREALLOC attrs") (in v4.8) extended this error to use the unsigned large 'len' as a size for memmove(). Consequently, if ->show returns an error, then the first read() on the sysfs file will return 4096 and could return uninitialized memory to user-space. If the application performs a subsequent read, this will trigger a memmove() with extremely large count, and is likely to crash the machine is bizarre ways. This bug can currently only be triggered by reading from an md sysfs attribute declared with __ATTR_PREALLOC() during the brief period between when mddev_put() deletes an mddev from the ->all_mddevs list, and when mddev_delayed_delete() - which is scheduled on a workqueue - completes. Before this, an error won't be returned by the ->show() After this, the ->show() won't be called. I can reproduce it reliably only by putting delay like usleep_range(500000,700000); early in mddev_delayed_delete(). Then after creating an md device md0 run echo clear > /sys/block/md0/md/array_state; cat /sys/block/md0/md/array_state The bug can be triggered without the usleep. Fixes: `65da3484d9` ("sysfs: correctly handle short reads on PREALLOC attrs.") Fixes: `17d0774f80` ("sysfs: correctly handle read offset on PREALLOC attrs") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Reported-and-tested-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 17:33:32 +02:00
Johan Hovold	cf903e9d3a	Documentation: stable-kernel-rules: fix stable-tag format A patch documenting how to specify which kernels a particular fix should be backported to (seemingly) inadvertently added a minus sign after the kernel version. This particular stable-tag format had never been used prior to this patch, and was neither present when the patch in question was first submitted (it was added in v2 without any comment). Drop the minus sign to avoid any confusion. Fixes: `fdc81b7910` ("stable_kernel_rules: Add clause about specification of kernel versions to patch.") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 17:33:31 +02:00
Shuxiao Zhang	97fbfef6bd	staging: android: ashmem: lseek failed due to no FMODE_LSEEK. vfs_llseek will check whether the file mode has FMODE_LSEEK, no return failure. But ashmem can be lseek, so add FMODE_LSEEK to ashmem file. Comment From Greg Hackmann: ashmem_llseek() passes the llseek() call through to the backing shmem file. `91360b02ab` ("ashmem: use vfs_llseek()") changed this from directly calling the file's llseek() op into a VFS layer call. This also adds a check for the FMODE_LSEEK bit, so without that bit ashmem_llseek() now always fails with -ESPIPE. Fixes: `91360b02ab` ("ashmem: use vfs_llseek()") Signed-off-by: Shuxiao Zhang <zhangshuxiao@xiaomi.com> Tested-by: Greg Hackmann <ghackmann@google.com> Cc: stable <stable@vger.kernel.org> # 3.18+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-04-08 12:13:11 +02:00
Linus Torvalds	8b65bb57d8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc Pull sparc fixes from David Miller: "Several fixes here, mostly having to due with either build errors or memory corruptions depending upon whether you have THP enabled or not" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: remove unused wp_works_ok macro sparc32: Export vac_cache_size to fix build error sparc64: Fix memory corruption when THP is enabled sparc64: Fix kernel panic due to erroneous #ifdef surrounding pmd_write() arch/sparc: Avoid DCTI Couples sparc64: kern_addr_valid regression sparc64: Add support for 2G hugepages sparc64: Fix size check in huge_pte_alloc	2017-04-08 01:42:05 -07:00
Linus Torvalds	542380a208	KVM fixes for v4.11-rc6 ARM: - Fix a problem with GICv3 userspace save/restore - Clarify GICv2 userspace save/restore ABI - Be more careful in clearing GIC LRs - Add missing synchronization primitive to our MMU handling code PPC: - Check for a NULL return from kzalloc s390: - Prevent translation exception errors on valid page tables for the instruction-exection-protection support x86: - Fix Page-Modification Logging when running a nested guest -----BEGIN PGP SIGNATURE----- iQEcBAABCAAGBQJY5/X8AAoJEED/6hsPKofo8hQH/As3CbihZMysaK6JJTx5oMZw b3W8p8xVXVu4dKM8WnXa6m5xBDFmOa7eBB+CtT3gP68XnFvMpr/vPmDv6v6i9p8q 7VyALDqqk2fxDmgHEwuETw9XZyuhdyCz/GaINCdnAJs25wTFOA7r0WEW5W8qRJpA 9nQirapdJcknymIch1JqeWlYYmbIaFzT8jItfA9QQ7F9mG4pxC8D1k2D56lNYwTf FJIgXgkMPe7CPDXmgc/KqT5+iVsc/+SgzP/WdH6bX/007TV71sksxxfz6fIrao0X RtcL2WIZTXBdSNrvXflHhCfYgogPgCnYp8AsYTIa+IEijcfteJx7UiET47Ne0Ow= =/SPG -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Radim Krčmář: "ARM: - Fix a problem with GICv3 userspace save/restore - Clarify GICv2 userspace save/restore ABI - Be more careful in clearing GIC LRs - Add missing synchronization primitive to our MMU handling code PPC: - Check for a NULL return from kzalloc s390: - Prevent translation exception errors on valid page tables for the instruction-exection-protection support x86: - Fix Page-Modification Logging when running a nested guest" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Check for kmalloc errors in ioctl KVM: nVMX: initialize PML fields in vmcs02 KVM: nVMX: do not leak PML full vmexit to L1 KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI KVM: arm64: Ensure LRs are clear when they should be kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd KVM: s390: remove change-recording override support arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm	2017-04-08 01:39:43 -07:00
Linus Torvalds	62fedca5ce	Merge branch 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit Pull audit cleanup from Paul Moore: "A week later than I had hoped, but as promised, here is the audit uninline-fix we talked about during the last audit pull request. The patch is slightly different than what we originally discussed as it made more sense to keep the audit_signal_info() function in auditsc.c rather than move it and bunch of other related variables/definitions into audit.c/audit.h. At some point in the future I need to look at how the audit code is organized across kernel/audit, I suspect we could do things a bit better, but it doesn't seem like a -rc release is a good place for that ;) Regardless, this patch passes our tests without problem and looks good for v4.11" 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit: audit: move audit_signal_info() into kernel/auditsc.c	2017-04-08 01:37:25 -07:00
Linus Torvalds	56c2997965	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: move pcp and lru-pcp draining into single wq mailmap: update Yakir Yang email address mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff() dax: fix radix tree insertion race mm, thp: fix setting of defer+madvise thp defrag mode ptrace: fix PTRACE_LISTEN race corrupting task->state vmlinux.lds: add missing VMLINUX_SYMBOL macros mm/page_alloc.c: fix print order in show_free_areas() userfaultfd: report actual registered features in fdinfo mm: fix page_vma_mapped_walk() for ksm pages	2017-04-08 01:35:32 -07:00
Michal Hocko	ce612879dd	mm: move pcp and lru-pcp draining into single wq We currently have 2 specific WQ_RECLAIM workqueues in the mm code. vmstat_wq for updating pcp stats and lru_add_drain_wq dedicated to drain per cpu lru caches. This seems more than necessary because both can run on a single WQ. Both do not block on locks requiring a memory allocation nor perform any allocations themselves. We will save one rescuer thread this way. On the other hand drain_all_pages() queues work on the system wq which doesn't have rescuer and so this depend on memory allocation (when all workers are stuck allocating and new ones cannot be created). Initially we thought this would be more of a theoretical problem but Hugh Dickins has reported: : 4.11-rc has been giving me hangs after hours of swapping load. At : first they looked like memory leaks ("fork: Cannot allocate memory"); : but for no good reason I happened to do "cat /proc/sys/vm/stat_refresh" : before looking at /proc/meminfo one time, and the stat_refresh stuck : in D state, waiting for completion of flush_work like many kworkers. : kthreadd waiting for completion of flush_work in drain_all_pages(). This worker should be using WQ_RECLAIM as well in order to guarantee a forward progress. We can reuse the same one as for lru draining and vmstat. Link: http://lkml.kernel.org/r/20170307131751.24936-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Tested-by: Yang Li <pku.leo@gmail.com> Tested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:49 -07:00
Jeffy Chen	cdcf4330d5	mailmap: update Yakir Yang email address Set current email address to replace previous employers email addresses. Link: http://lkml.kernel.org/r/1491450722-6633-1-git-send-email-jeffy.chen@rock-chips.com Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:49 -07:00
David Rientjes	460bcec84e	mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff() We got need_resched() warnings in swap_cgroup_swapoff() because swap_cgroup_ctrl[type].length is particularly large. Reschedule when needed. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704061315270.80559@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:49 -07:00
Ross Zwisler	e11f8b7b6c	dax: fix radix tree insertion race While running generic/340 in my test setup I hit the following race. It can happen with kernels that support FS DAX PMDs, so v4.10 thru v4.11-rc5. Thread 1 Thread 2 -------- -------- dax_iomap_pmd_fault() grab_mapping_entry() spin_lock_irq() get_unlocked_mapping_entry() 'entry' is NULL, can't call lock_slot() spin_unlock_irq() radix_tree_preload() dax_iomap_pmd_fault() grab_mapping_entry() spin_lock_irq() get_unlocked_mapping_entry() ... lock_slot() spin_unlock_irq() dax_pmd_insert_mapping() <inserts a PMD mapping> spin_lock_irq() __radix_tree_insert() fails with -EEXIST <fall back to 4k fault, and die horribly when inserting a 4k entry where a PMD exists> The issue is that we have to drop mapping->tree_lock while calling radix_tree_preload(), but since we didn't have a radix tree entry to lock (unlike in the pmd_downgrade case) we have no protection against Thread 2 coming along and inserting a PMD at the same index. For 4k entries we handled this with a special-case response to -EEXIST coming from the __radix_tree_insert(), but this doesn't save us for PMDs because the -EEXIST case can also mean that we collided with a 4k entry in the radix tree at a different index, but one that is covered by our PMD range. So, correctly handle both the 4k and 2M collision cases by explicitly re-checking the radix tree for an entry at our index once we reacquire mapping->tree_lock. This patch has made it through a clean xfstests run with the current v4.11-rc5 based linux/master, and it also ran generic/340 500 times in a loop. It used to fail within the first 10 iterations. Link: http://lkml.kernel.org/r/20170406212944.2866-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> [4.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:49 -07:00
David Rientjes	4fad7fb6b0	mm, thp: fix setting of defer+madvise thp defrag mode Setting thp defrag mode of "defer+madvise" actually sets "defer" in the kernel due to the name similarity and the out-of-order way the string is checked in defrag_store(). Check the string in the correct order so that TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG is set appropriately for "defer+madvise". Fixes: `21440d7eb9` ("mm, thp: add new defer+madvise defrag option") Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704051814420.137626@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:48 -07:00
bsegall@google.com	5402e97af6	ptrace: fix PTRACE_LISTEN race corrupting task->state In PT_SEIZED + LISTEN mode STOP/CONT signals cause a wakeup against __TASK_TRACED. If this races with the ptrace_unfreeze_traced at the end of a PTRACE_LISTEN, this can wake the task /after/ the check against __TASK_TRACED, but before the reset of state to TASK_TRACED. This causes it to instead clobber TASK_WAKING, allowing a subsequent wakeup against TRACED while the task is still on the rq wake_list, corrupting it. Oleg said: "The kernel can crash or this can lead to other hard-to-debug problems. In short, "task->state = TASK_TRACED" in ptrace_unfreeze_traced() assumes that nobody else can wake it up, but PTRACE_LISTEN breaks the contract. Obviusly it is very wrong to manipulate task->state if this task is already running, or WAKING, or it sleeps again" [akpm@linux-foundation.org: coding-style fixes] Fixes: `9899d11f` ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL") Link: http://lkml.kernel.org/r/xm26y3vfhmkp.fsf_-_@bsegall-linux.mtv.corp.google.com Signed-off-by: Ben Segall <bsegall@google.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:48 -07:00
Jessica Yu	d79bf21e0e	vmlinux.lds: add missing VMLINUX_SYMBOL macros When __{start,end}_ro_after_init is referenced from C code, we run into the following build errors on blackfin: kernel/extable.c:169: undefined reference to `__start_ro_after_init' kernel/extable.c:169: undefined reference to `__end_ro_after_init' The build error is due to the fact that blackfin is one of the few arches that prepends an underscore '_' to all symbols defined in C. Fix this by wrapping __{start,end}_ro_after_init in vmlinux.lds.h with VMLINUX_SYMBOL(), which adds the necessary prefix for arches that have HAVE_UNDERSCORE_SYMBOL_PREFIX. Link: http://lkml.kernel.org/r/1491259387-15869-1-git-send-email-jeyu@redhat.com Signed-off-by: Jessica Yu <jeyu@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Eddie Kovsky <ewk@edkovsky.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:48 -07:00
Alexander Polakov	1f06b81aea	mm/page_alloc.c: fix print order in show_free_areas() Fixes: `11fb998986` ("mm: move most file-based accounting to the node") Link: http://lkml.kernel.org/r/1490377730.30219.2.camel@beget.ru Signed-off-by: Alexander Polyakov <apolyakov@beget.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:48 -07:00
Mike Rapoport	045098e944	userfaultfd: report actual registered features in fdinfo fdinfo for userfault file descriptor reports UFFD_API_FEATURES. Up until recently, the UFFD_API_FEATURES was defined as 0, therefore corresponding field in fdinfo always contained zero. Now, with introduction of several additional features, UFFD_API_FEATURES is not longer 0 and it seems better to report actual features requested for the userfaultfd object described by the fdinfo. First, the applications that were using userfault will still see zero at the features field in fdinfo. Next, reporting actual features rather than available features, gives clear indication of what userfault features are used by an application. Link: http://lkml.kernel.org/r/1491140181-22121-1-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:48 -07:00
Hugh Dickins	d75450ff40	mm: fix page_vma_mapped_walk() for ksm pages Doug Smythies reports oops with KSM in this backtrace, I've been seeing the same: page_vma_mapped_walk+0xe6/0x5b0 page_referenced_one+0x91/0x1a0 rmap_walk_ksm+0x100/0x190 rmap_walk+0x4f/0x60 page_referenced+0x149/0x170 shrink_active_list+0x1c2/0x430 shrink_node_memcg+0x67a/0x7a0 shrink_node+0xe1/0x320 kswapd+0x34b/0x720 Just as observed in commit `4b0ece6fa0` ("mm: migrate: fix remove_migration_pte() for ksm pages"), you cannot use page->index calculations on ksm pages. page_vma_mapped_walk() is relying on __vma_address(), where a ksm page can lead it off the end of the page table, and into whatever nonsense is in the next page, ending as an oops inside check_pte()'s pte_page(). KSM tells page_vma_mapped_walk() exactly where to look for the page, it does not need any page->index calculation: and that's so also for all the normal and file and anon pages - just not for THPs and their subpages. Get out early in most cases: instead of a PageKsm test, move down the earlier not-THP-page test, as suggested by Kirill. I'm also slightly worried that this loop can stray into other vmas, so added a vm_end test to prevent surprises; though I have not imagined anything worse than a very contrived case, in which a page mlocked in the next vma might be reclaimed because it is not mlocked in this vma. Fixes: `ace71a19ce` ("mm: introduce page_vma_mapped_walk()") Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1704031104400.1118@eggly.anvils Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-08 00:47:48 -07:00
Martin Brandenburg	cefdc26e86	orangefs: move features validation to fix filesystem hang Without this fix (and another to the userspace component itself described later), the kernel will be unable to process any OrangeFS requests after the userspace component is restarted (due to a crash or at the administrator's behest). The bug here is that inside orangefs_remount, the orangefs_request_mutex is locked. When the userspace component restarts while the filesystem is mounted, it sends a ORANGEFS_DEV_REMOUNT_ALL ioctl to the device, which causes the kernel to send it a few requests aimed at synchronizing the state between the two. While this is happening the orangefs_request_mutex is locked to prevent any other requests going through. This is only half of the bugfix. The other half is in the userspace component which outright ignores(!) requests made before it considers the filesystem remounted, which is after the ioctl returns. Of course the ioctl doesn't return until after the userspace component responds to the request it ignores. The userspace component has been changed to allow ORANGEFS_VFS_OP_FEATURES regardless of the mount status. Mike Marshall says: "I've tested this patch against the fixed userspace part. This patch is real important, I hope it can make it into 4.11... Here's what happens when the userspace daemon is restarted, without the patch: ============================================= [ INFO: possible recursive locking detected ] [ 4.10.0-00007-ge98bdb3 #1 Not tainted ] --------------------------------------------- pvfs2-client-co/29032 is trying to acquire lock: (orangefs_request_mutex){+.+.+.}, at: service_operation+0x3c7/0x7b0 [orangefs] but task is already holding lock: (orangefs_request_mutex){+.+.+.}, at: dispatch_ioctl_command+0x1bf/0x330 [orangefs] CPU: 0 PID: 29032 Comm: pvfs2-client-co Not tainted 4.10.0-00007-ge98bdb3 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014 Call Trace: __lock_acquire+0x7eb/0x1290 lock_acquire+0xe8/0x1d0 mutex_lock_killable_nested+0x6f/0x6e0 service_operation+0x3c7/0x7b0 [orangefs] orangefs_remount+0xea/0x150 [orangefs] dispatch_ioctl_command+0x227/0x330 [orangefs] orangefs_devreq_ioctl+0x29/0x70 [orangefs] do_vfs_ioctl+0xa3/0x6e0 SyS_ioctl+0x79/0x90" Signed-off-by: Martin Brandenburg <martin@omnibond.com> Acked-by: Mike Marshall <hubcap@omnibond.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-07 13:41:22 -07:00
Linus Torvalds	c2eb7beac7	pci-v4.11-fixes-4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJY5+YTAAoJEFmIoMA60/r8SNAQAI12y86AkLENC5HeYtOpvv5s hvXkvQMstaftsV5sBZ9vpvooCZZWugIDIAXBASTJw2xouWLWWAVRBBEKWm3Xht/e bifc23jSyKu2noiIjfsRqWRUZFnu9+At8nd0LYsG0NGAgcyQJdW2MdZ7GVSBS+CV tFSd1jsaq6eLFrHAf6u31SV2D5ASAkegzSFhLZSOdihD67zwXjO1ibxKzVc00f4Y 3J7bXsAiUcdTt7I6mAmVAcxa1Hb9cmLMt+WD80lrePskHyIYDnoTyrIleLh3l0UG oT25hOzAbbTm+pe/3wjEAEczeiITXw/zf/JJJLql3u8OM4THaJjAwTiJ+ovaRdLz /P+Bd2TrsiVNNp34AwnjqZ3zz8Ah9b6OKIJEXSZ97ROziZWKTz6q9JmyXn7NZnYz zc8ZqC4fbPpJ0rN7PnOimrDo72/ZIlndCb8+mVvjepA1X32TKo91YhosjJwBu4uu gJ88Vh+D5V1gTjCROpasDBHY5/1SUwQ+LNtcHQ2k7hCEJJ4VKm1VvFV5cSdgVkGQ MeKH4YQtiUDyyA6p67jDgU7US8RcGc8zSpKm5kZ9Jp6y/WAIUqBJqUIE3duoDuzG WuV3ouGP7PttU1WEWmQmX25WmK8DC8ykF5Vo6qHWpWvgeT0cm5jK04MN5Ex2k/EN NEmtTrVwDoigzP4fFpRD =Xs4R -----END PGP SIGNATURE----- Merge tag 'pci-v4.11-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - fix ThunderX legacy firmware resources - fix ARTPEC-6 and DesignWare platform driver NULL pointer dereferences - fix HiSilicon link error * tag 'pci-v4.11-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: dwc: Fix dw_pcie_ops NULL pointer dereference PCI: dwc: Select PCI_HOST_COMMON for hisi PCI: thunder-pem: Fix legacy firmware PEM-specific resources	2017-04-07 12:26:36 -07:00
Bart Van Assche	6d8c6c0f97	blk-mq: Restart a single queue if tag sets are shared To improve scalability, if hardware queues are shared, restart a single hardware queue in round-robin fashion. Rename blk_mq_sched_restart_queues() to reflect the new semantics. Remove blk_mq_sched_mark_restart_queue() because this function has no callers. Remove flag QUEUE_FLAG_RESTART because this patch removes the code that uses this flag. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-07 12:40:09 -06:00
Bart Van Assche	6077c2d706	dm rq: Avoid that request processing stalls sporadically While running the srp-test software I noticed that request processing stalls sporadically at the beginning of a test, namely when mkfs is run against a dm-mpath device. Every time when that happened the following command was sufficient to resume request processing: echo run >/sys/kernel/debug/block/dm-0/state This patch avoids that such request processing stalls occur. The test I ran is as follows: while srp-test/run_tests -d -r 30 -t 02-mq; do :; done Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-07 12:27:10 -06:00
Bart Van Assche	36e3cf2739	scsi: Avoid that SCSI queues get stuck If a .queue_rq() function returns BLK_MQ_RQ_QUEUE_BUSY then the block driver that implements that function is responsible for rerunning the hardware queue once requests can be queued again successfully. commit `52d7f1b5c2` ("blk-mq: Avoid that requeueing starts stopped queues") removed the blk_mq_stop_hw_queue() call from scsi_queue_rq() for the BLK_MQ_RQ_QUEUE_BUSY case. Hence change all calls to functions that are intended to rerun a busy queue such that these examine all hardware queues instead of only stopped queues. Since no other functions than scsi_internal_device_block() and scsi_internal_device_unblock() should ever stop or restart a SCSI queue, change the blk_mq_delay_queue() call into a blk_mq_delay_run_hw_queue() call. Fixes: commit `52d7f1b5c2` ("blk-mq: Avoid that requeueing starts stopped queues") Fixes: commit `7e79dadce2` ("blk-mq: stop hardware queue in blk_mq_delay_queue()") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Long Li <longli@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-07 12:27:08 -06:00
Bart Van Assche	7587a5ae7e	blk-mq: Introduce blk_mq_delay_run_hw_queue() Introduce a function that runs a hardware queue unconditionally after a delay. Note: there is already a function that stops and restarts a hardware queue after a delay, namely blk_mq_delay_queue(). This function will be used in the next patch in this series. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Long Li <longli@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-07 12:27:06 -06:00
Linus Torvalds	81d4bab4ce	- Two stable@ fixes for the verity target's FEC support - A stable@ fix for raid target's raid1 support (when no bitmap is used) - A 4.11 cache metadata v2 format fix to properly test blocks are clean -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJY56smAAoJEMUj8QotnQNa3xYH/39l25eGzam0cnITa31cX9uu lb+oWnqbgvbd65HZr2QPu9RO8LQMK9wxw40wapyYTEnkDfgeW+hmwYo3BUZ0IpdT Ry39KGCGaxk3L3cATSgtZT18AsWRHmKqlHLf6y98RdeFLVb3lyUFllkLF9r3M2ep 1Ga2MiMJYffaiTsSKxwZQG3XG7mq9MNfRnCehGAQwjGgWL3EsYHNsq+Hosn/tdtZ 2D7BvAMr2X+3xEUVevqL2dFmJ1D2tbJjtedeAKVOccErV/BofwWPUvTOFX8202+Y CUC9pW+hDQqpCm15Pr4N6oU4TeC4mHMwGK0SLWmoXkl3VDPbUUO3qC5AwKxsepA= =cWkE -----END PGP SIGNATURE----- Merge tag 'dm-4.11-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - two stable fixes for the verity target's FEC support - a stable fix for raid target's raid1 support (when no bitmap is used) - a 4.11 cache metadata v2 format fix to properly test blocks are clean * tag 'dm-4.11-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm verity fec: fix bufio leaks dm raid: fix NULL pointer dereference for raid1 without bitmap dm cache metadata: fix metadata2 format's blocks_are_clean_separate_dirty dm verity fec: limit error correction recursion	2017-04-07 10:47:20 -07:00
Linus Torvalds	dc25ad3fe1	arm64 fixes: - Restore previous SIGBUS behaviour for unhandled unaligned user accesses - Revert broken support for the contiguous bit in hugetlb (again...) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJY53foAAoJELescNyEwWM0ILEH/An3v6VSnbABDRxvXWkrTvKZ y4KoHVgSDqehrH8MysrrI7SlB5J5AEGjQI2SzI2InVS4j4Dd/kfqZMeZlo2Z2Idv KlC4FXb6QhRjJrrLCVIWCZxQL8gqP9KEI+DwB76a46WSYHHWP4ihtfYTxpTSAZbj mDHOmZ2udc/GjEpPzzPNOhXs0+1dEAHkQa+gW8T5HotQK+VVBwFTJKPXGNjm/YQa A1lLzYW/R9xRzAeEaJIGa6/jy6jJQ09vkXUdriibRi9qu7+A/xecgq3nb6puwT3j 0BQqvVQ3eAEejlXA5L4xtdwNb3fhe8hK4pq9OgNnhSytntAtbSiqvGTHea03XKY= =d5YQ -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "We've got a regression fix for the signal raised when userspace makes an unsupported unaligned access and a revert of the contiguous (hugepte) support for hugetlb, which has once again been found to be broken. One day, maybe, we'll get it right. Summary: - restore previous SIGBUS behaviour for unhandled unaligned user accesses - revert broken support for the contiguous bit in hugetlb (again...)" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "Revert "arm64: hugetlb: partial revert of 66b3923a1a0f"" arm64: mm: unaligned access by user-land should be received as SIGBUS	2017-04-07 10:43:22 -07:00
Linus Torvalds	4f0d14b0c9	metag/usercopy: Fault handling fixes These patches fix a bunch of longstanding (some over a decade old) metag user copy fault handling bugs. Thanks go to Al Viro for spotting some of the questionable code in the first place. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJY5g8zAAoJEGwLaZPeOHZ6m8sP/3M5e2VPrtqK7u22QVrjIkOx XwtIRCbEFUf9XJhDATKnwraRcKlfespu3ibc2BrQ7e2FCa/Nx6nSipUIMW+zUmGX nu2DHnxh6rEEzHc5pBzmiUH8+AsoK5Q12jeQRu8PviCkn7QOOMFt6ZvOHltE0AOh OKycaZAbZnvgKQkYmAqxcakesALc0gSRmXLvxlIba7fnhR8fYhSCow3Fxf+DLBbh Hq7/7cRyZi/GNnYd0NNLVyifbZk2Xt3nfN9TadysCMc4InsYSz4uJycZZy2p/WW+ feHJy0EUvDCRBggU7vgSpAd+7By7+tVSjGTH+dwDVwcP3ukFx6qZcu8dvrbyUoMK E2QBLb3DSOqPtRmjIq4AYrQUOnzCNwxDG7f02GQGFmV8VudNHUf6Y7W3xIZrT5Ke 0Y+mcYN4ZN/g5rzBTgj5+zOMsQr1kRlBSGwt2LZq6G5oFk4ZvFXyO4pA7ZHLzP30 cRuky9uRYvTBGzwa/vxkecjJ4w7xGZAgWHHtc9yPuetX81bJYTY2bQVeqlMvfCrb NzOsObBjqokkYqYe4ywaKhyxFo/Ks4X1LkvCepcj5CfcLvfFW6BuZFKw2WvPGGoB RSEhvWhtuok5eafb+6QDd6G2ZLo4wSnt5qbCjjWjW9sHLxWG88cOsCctq4mBT1tl Vnd2140H9FoAMY0H2xhc =MwhE -----END PGP SIGNATURE----- Merge tag 'metag-for-v4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull metag usercopy fixes from James Hogan: "Metag usercopy fault handling fixes These patches fix a bunch of longstanding (some over a decade old) metag user copy fault handling bugs. Thanks go to Al Viro for spotting some of the questionable code in the first place" * tag 'metag-for-v4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: metag/usercopy: Add missing fixups metag/usercopy: Fix src fixup in from user rapf loops metag/usercopy: Set flags before ADDZ metag/usercopy: Zero rest of buffer from copy_from_user metag/usercopy: Add early abort to copy_to_user metag/usercopy: Fix alignment error checking metag/usercopy: Drop unused macros	2017-04-07 10:11:53 -07:00
Linus Torvalds	7ab661856b	ACPI fix for v4.11-rc6 - Refine the check for the existence of _HID in find_child_checks() so that it doesn't trigger for device objects with device IDs made up by the kernel (Rafael Wysocki). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJY531xAAoJEILEb/54YlRxyoMQAJ7wKKP1+1TBf+rIAb817cdp BBP8gPEmkEyTVaF8DkC4byLeIipbRHezq2sjChVM2uj+I1L9HnDg3lkbyISlH1h3 UgS5/6Sg6aY+nEhNN6REZY/5es+tmUX5bLYsmttETLjluld02XbOQHPVlkY4HR8n o5qaMxR0sKSs9AhzWG+NLqSqTSL30uqKzxaBf7ZZPiKkKbOWctgv6ts5fKdXO8mx Z4FLvuRygva5L704jycGDJ/3W8gidrfi63n2QAoP5dIoc8UHfMEHVIMYxqW6HJkp AuZTk0zJ+KmawP+FIKAXBvPk+T4XIbzV6tWmQWmMJYuEwmilp0V4oGBIZdMdYklo zShQaUdpiEAsJswzQDbl6gyl2iGzeCbTgxzhPqP7h3Q7kl5igUYLw491b5S+vEPf Kw+eI0T14pnoyk569Abf4S25BBU5fsUiNuFBNncIZNKUd9+V7TKf8WVcxlP0BfBQ sCciwlAlb6R+Mxunv4pLSA3gqEdDPHfPNjS3VoWN/B23KY2YbHPk8vlXjClIVPXR lvtaZtInR2Ri4AlDMcFRYF7fdyMOxgN+qaLSClyCFuW1ZjyG6JPbjRoadjIgAfrO 9B1CLiCngAYqw/QZLvAfp5ISxhR4y9WXrb9fbVCy86jPS3foKONbhe094d2V0EH8 5s7qzE8c6a2URDjZL7sr =04+L -----END PGP SIGNATURE----- Merge tag 'acpi-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "This fixes a core device enumeration code change made in 4.10, in order to address a reported issue, that went too far. Specifics: - Refine the check for the existence of _HID in find_child_checks() so that it doesn't trigger for device objects with device IDs made up by the kernel (Rafael Wysocki)" * tag 'acpi-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / scan: Prefer devices without _HID for _ADR matching	2017-04-07 10:01:45 -07:00
Linus Torvalds	50bdd7a0c9	fix for 4.11 rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJY5ysAAAoJELDendYovxMvEIoH/2Jl9vM9DzfdTxfyvSS0+i3/ e3UTL4igNSBT5W4t3DL0o5IZ9e6LQ17j8VWqbpT8d6AzdvtW6xx4ZpulPuh/qTnA pHNUy9yPXr91k4KzSgV3ASaqxdBOIAq74t0u+BwpDjWV8Vok5oONxPf03vigfrKb jFLZfP0DilSI6YIsExtUqZhT1ydnG6mm0PMXGT5VfHgnotUVapSnZw8Ht/STXRbH SZ5a1QLcQxfFziFEmBGSOqIKLmA3TYpzBcbSVg4vjysdmf910C8tU6KFkdDhg8Gu 3vWsjBpgxoAmdHYfElD3PDp1pyWTmKkF0jtJavcaNp0GOGD8f5jg1Rf/eozD9CA= =6Oc6 -----END PGP SIGNATURE----- Merge tag 'for-linus-4.11b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen fix from Juergen Gross: "A fix for error path cleanup in the xenbus handler" * tag 'for-linus-4.11b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xenbus: remove transaction holder from list before freeing	2017-04-07 09:58:01 -07:00
Liping Zhang	5380e5644a	sysctl: don't print negative flag for proc_douintvec I saw some very confusing sysctl output on my system: # cat /proc/sys/net/core/xfrm_aevent_rseqth -2 # cat /proc/sys/net/core/xfrm_aevent_etime -10 # cat /proc/sys/net/ipv4/tcp_notsent_lowat -4294967295 Because we forget to set the negp flag in proc_douintvec, so it will become a garbage value. Since the value related to proc_douintvec is always an unsigned integer, so we can set negp to false explictily to fix this issue. Fixes: `e7d316a02f` ("sysctl: handle error writing UINT_MAX to u32 fields") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-07 09:46:44 -07:00
Liping Zhang	1680a3868f	sysctl: add sanity check for proc_douintvec Commit `e7d316a02f` ("sysctl: handle error writing UINT_MAX to u32 fields") introduced the proc_douintvec helper function, but it forgot to add the related sanity check when doing register_sysctl_table. So add it now. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-07 09:46:44 -07:00
Omar Sandoval	ebe8bddb6e	blk-mq: remap queues when adding/removing hardware queues blk_mq_update_nr_hw_queues() used to remap hardware queues, which is the behavior that drivers expect. However, commit `4e68a01142` changed blk_mq_queue_reinit() to not remap queues for the case of CPU hotplugging, inadvertently making blk_mq_update_nr_hw_queues() not remap queues as well. This breaks, for example, NBD's multi-connection mode, leaving the added hardware queues unused. Fix it by making blk_mq_update_nr_hw_queues() explicitly remap the queues. Fixes: `4e68a01142` ("blk-mq: don't redistribute hardware queues on a CPU hotplug event") Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-07 08:56:49 -06:00

1 2 3 4 5 ...

663220 Commits