linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-01-25 04:39:48 +07:00

Author	SHA1	Message	Date
Jason Gunthorpe	e522a788eb	RDMA/siw,rxe: Make emulated devices virtual in the device tree [ Upstream commit a9d2e9ae953f0ddd0327479c81a085adaa76d903 ] This moves siw and rxe to be virtual devices in the device tree: lrwxrwxrwx 1 root root 0 Nov 6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/ Previously they were trying to parent themselves to the physical device of their attached netdev, which doesn't make alot of sense. My hope is this will solve some weird syzkaller hits related to sysfs as it could be possible that the parent of a netdev is another netdev, eg under bonding or some other syzkaller found netdev configuration. Nesting a ib_device under anything but a physical device is going to cause inconsistencies in sysfs during destructions. Link: https://lore.kernel.org/r/0-v1-dcbfc68c4b4a+d6-virtual_dev_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-09 13:46:24 +01:00
Christoph Hellwig	404fa09374	RDMA/core: remove use of dma_virt_ops [ Upstream commit 5a7a9e038b032137ae9c45d5429f18a2ffdf7d42 ] Use the ib_dma_* helpers to skip the DMA translation instead. This removes the last user if dma_virt_ops and keeps the weird layering violation inside the RDMA core instead of burderning the DMA mapping subsystems with it. This also means the software RDMA drivers now don't have to mess with DMA parameters that are not relevant to them at all, and that in the future we can use PCI P2P transfers even for software RDMA, as there is no first fake layer of DMA mapping that the P2P DMA support. Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-09 13:46:24 +01:00
Stanley Chu	2a54ad3066	scsi: ufs: Re-enable WriteBooster after device reset [ Upstream commit bd14bf0e4a084514aa62d24d2109e0f09a93822f ] UFS 3.1 specification mentions that the WriteBooster flags listed below will be set to their default values, i.e. disabled, after power cycle or any type of reset event. Thus we need to reset the flag variables kept in struct hba to align with the device status and ensure that WriteBooster-related functions are configured properly after device reset. Without this fix, WriteBooster will not be enabled successfully after by ufshcd_wb_ctrl() after device reset because hba->wb_enabled remains true. Flags required to be reset to default values: - fWriteBoosterEn: hba->wb_enabled - fWriteBoosterBufferFlushEn: hba->wb_buf_flush_enabled - fWriteBoosterBufferFlushDuringHibernate: No variable mapped Link: https://lore.kernel.org/r/20201208135635.15326-2-stanley.chu@mediatek.com Fixes: `3d17b9b5ab` ("scsi: ufs: Add write booster feature support") Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-09 13:46:23 +01:00
Adrian Hunter	acbf7db67a	scsi: ufs: Allow an error return value from ->device_reset() [ Upstream commit 151f1b664ffbb847c7fbbce5a5b8580f1b9b1d98 ] It is simpler for drivers to provide a ->device_reset() callback irrespective of whether the GPIO, or firmware interface necessary to do the reset, is discovered during probe. Change ->device_reset() to return an error code. Drivers that provide the callback, but do not do the reset operation should return -EOPNOTSUPP. Link: https://lore.kernel.org/r/20201103141403.2142-3-adrian.hunter@intel.com Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bean huo <beanhuo@micron.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-09 13:46:23 +01:00
Imre Deak	8cba903992	drm/i915/tgl: Fix Combo PHY DPLL fractional divider for 38.4MHz ref clock commit 0e2497e334de42dbaaee8e325241b5b5b34ede7e upstream. Apply Display WA #22010492432 for combo PHY PLLs too. This should fix a problem where the PLL output frequency is slightly off with the current PLL fractional divider value. I haven't seen an actual case where this causes a problem, but let's follow the spec. It's also needed on some EHL platforms, but for that we also need a way to distinguish the affected EHL SKUs, so I leave that for a follow-up. v2: - Apply the WA at one place when calculating the PLL dividers from the frequency and the frequency from the dividers for all the combo PLL use cases (DP, HDMI, TBT). (Ville) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201003001846.1271151-6-imre.deak@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:23 +01:00
Takashi Iwai	adee1c5126	ALSA: hda/hdmi: Fix incorrect mutex unlock in silent_stream_disable() commit 3d5c5fdcee0f9a94deb0472e594706018b00aa31 upstream. The silent_stream_disable() function introduced by the commit b1a5039759cb ("ALSA: hda/hdmi: fix silent stream for first playback to DP") takes the per_pin->lock mutex, but it unlocks the wrong one, spec->pcm_lock, which causes a deadlock. This patch corrects it. Fixes: b1a5039759cb ("ALSA: hda/hdmi: fix silent stream for first playback to DP") Reported-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org> Cc: <stable@vger.kernel.org> Acked-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20210101083852.12094-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:23 +01:00
Kailang Yang	e235fd076e	ALSA: hda/realtek - Modify Dell platform name commit c1e8952395c1f44a6304c71401519d19ed2ac56a upstream. Dell platform SSID:0x0a58 change platform name. Use the generic name instead for avoiding confusion. Fixes: 150927c3674d ("ALSA: hda/realtek - Supported Dell fixed type headset") Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/efe7c196158241aa817229df7835d645@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:23 +01:00
Edward Vear	ce9163cf7a	Bluetooth: Fix attempting to set RPA timeout when unsupported commit a31489d2a368d2f9225ed6a6f595c63bc7d10de8 upstream. During controller initialization, an LE Set RPA Timeout command is sent to the controller if supported. However, the value checked to determine if the command is supported is incorrect. Page 1921 of the Bluetooth Core Spec v5.2 shows that bit 2 of octet 35 of the Supported_Commands field corresponds to the LE Set RPA Timeout command, but currently bit 6 of octet 35 is checked. This patch checks the correct value instead. This issue led to the error seen in the following btmon output during initialization of an adapter (rtl8761b) and prevented initialization from completing. < HCI Command: LE Set Resolvable Private Address Timeout (0x08\|0x002e) plen 2 Timeout: 900 seconds > HCI Event: Command Complete (0x0e) plen 4 LE Set Resolvable Private Address Timeout (0x08\|0x002e) ncmd 2 Status: Unsupported Remote Feature / Unsupported LMP Feature (0x1a) = Close Index: 00:E0:4C:6B:E5:03 The error did not appear when running with this patch. Signed-off-by: Edward Vear <edwardvear@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:23 +01:00
Josh Poimboeuf	3e07350892	kdev_t: always inline major/minor helper functions commit aa8c7db494d0a83ecae583aa193f1134ef25d506 upstream. Silly GCC doesn't always inline these trivial functions. Fixes the following warning: arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> [build-tested] Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:23 +01:00
Rasmus Villemoes	fd3ec3b251	dt-bindings: rtc: add reset-source property commit 320d159e2d63a97a40f24cd6dfda5a57eec65b91 upstream. Some RTCs, e.g. the pcf2127, can be used as a hardware watchdog. But if the reset pin is not actually wired up, the driver exposes a watchdog device that doesn't actually work. Provide a standard binding that can be used to indicate that a given RTC can perform a reset of the machine, similar to wakeup-source. Suggested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201218101054.25416-2-rasmus.villemoes@prevas.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:22 +01:00
Uwe Kleine-König	757cd94ac8	rtc: pcf2127: only use watchdog when explicitly available commit 71ac13457d9d1007effde65b54818106b2c2b525 upstream. Most boards using the pcf2127 chip (in my bubble) don't make use of the watchdog functionality and the respective output is not connected. The effect on such a board is that there is a watchdog device provided that doesn't work. So only register the watchdog if the device tree has a "reset-source" property. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> [RV: s/has-watchdog/reset-source/] Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201218101054.25416-3-rasmus.villemoes@prevas.dk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:22 +01:00
Uwe Kleine-König	acb821425c	rtc: pcf2127: move watchdog initialisation to a separate function commit 5d78533a0c53af9659227c803df944ba27cd56e0 upstream. The obvious advantages are: - The linker can drop the watchdog functions if CONFIG_WATCHDOG is off. - All watchdog stuff grouped together with only a single function call left in generic code. - Watchdog register is only read when it is actually used. - Less #ifdefery Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20200924105256.18162-2-u.kleine-koenig@pengutronix.de Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:22 +01:00
Felix Fietkau	b001952411	Revert "mtd: spinand: Fix OOB read" This reverts stable commit `baad618d07`. This commit is adding lines to spinand_write_to_cache_op, wheras the upstream commit 868cbe2a6dcee451bd8f87cbbb2a73cf463b57e5 that this was supposed to backport was touching spinand_read_from_cache_op. It causes a crash on writing OOB data by attempting to write to read-only kernel memory. Cc: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:22 +01:00
Alex Deucher	261f4d03ad	Revert "drm/amd/display: Fix memory leaks in S3 resume" This reverts commit a135a1b4c4db1f3b8cbed9676a40ede39feb3362. This leads to blank screens on some boards after replugging a display. Revert until we understand the root cause and can fix both the leak and the blank screen after replug. Cc: Stylon Wang <stylon.wang@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Andre Tomt <andre@tomt.net> Cc: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-09 13:46:22 +01:00
Greg Kroah-Hartman	f5247949c0	Linux 5.10.5 Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20210104155708.800470590@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:56 +01:00
Dan Williams	12d377b93e	device-dax: Fix range release [ Upstream commit 6268d7da4d192af339f4d688942b9ccb45a65e04 ] There are multiple locations that open-code the release of the last range in a device-dax instance. Consolidate this into a new dev_dax_trim_range() helper. This also addresses a kmemleak report: # cat /sys/kernel/debug/kmemleak [..] unreferenced object 0xffff976bd46f6240 (size 64): comm "ndctl", pid 23556, jiffies 4299514316 (age 5406.733s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 20 c3 37 00 00 00 .......... .7... ff ff ff 7f 38 00 00 00 00 00 00 00 00 00 00 00 ....8........... backtrace: [<00000000064003cf>] __kmalloc_track_caller+0x136/0x379 [<00000000d85e3c52>] krealloc+0x67/0x92 [<00000000d7d3ba8a>] __alloc_dev_dax_range+0x73/0x25c [<0000000027d58626>] devm_create_dev_dax+0x27d/0x416 [<00000000434abd43>] __dax_pmem_probe+0x1c9/0x1000 [dax_pmem_core] [<0000000083726c1c>] dax_pmem_probe+0x10/0x1f [dax_pmem] [<00000000b5f2319c>] nvdimm_bus_probe+0x9d/0x340 [libnvdimm] [<00000000c055e544>] really_probe+0x230/0x48d [<000000006cabd38e>] driver_probe_device+0x122/0x13b [<0000000029c7b95a>] device_driver_attach+0x5b/0x60 [<0000000053e5659b>] bind_store+0xb7/0xc3 [<00000000d3bdaadc>] drv_attr_store+0x27/0x31 [<00000000949069c5>] sysfs_kf_write+0x4a/0x57 [<000000004a8b5adf>] kernfs_fop_write+0x150/0x1e5 [<00000000bded60f0>] __vfs_write+0x1b/0x34 [<00000000b92900f0>] vfs_write+0xd8/0x1d1 Reported-by: Jane Chu <jane.chu@oracle.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/160834570161.1791850.14911670304441510419.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:56 +01:00
Chunguang Xu	aceb8ae8e3	ext4: avoid s_mb_prefetch to be zero in individual scenarios [ Upstream commit 82ef1370b0c1757ab4ce29f34c52b4e93839b0aa ] Commit `cfd7323772` ("ext4: add prefetching for block allocation bitmaps") introduced block bitmap prefetch, and expects to read block bitmaps of flex_bg through an IO. However, it seems to ignore the value range of s_log_groups_per_flex. In the scenario where the value of s_log_groups_per_flex is greater than 27, s_mb_prefetch or s_mb_prefetch_limit will overflow, cause a divide zero exception. In addition, the logic of calculating nr is also flawed, because the size of flexbg is fixed during a single mount, but s_mb_prefetch can be modified, which causes nr to fail to meet the value condition of [1, flexbg_size]. To solve this problem, we need to set the upper limit of s_mb_prefetch. Since we expect to load block bitmaps of a flex_bg through an IO, we can consider determining a reasonable upper limit among the IO limit parameters. After consideration, we chose BLK_MAX_SEGMENT_SIZE. This is a good choice to solve divide zero problem and avoiding performance degradation. [ Some minor code simplifications to make the changes easy to follow -- TYT ] Reported-by: Tosk Robot <tencent_os_robot@tencent.com> Signed-off-by: Chunguang Xu <brookxu@tencent.com> Reviewed-by: Samuel Liao <samuelliao@tencent.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/1607051143-24508-1-git-send-email-brookxu@tencent.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:56 +01:00
Hyeongseok Kim	aff18aa806	dm verity: skip verity work if I/O error when system is shutting down [ Upstream commit 252bd1256396cebc6fc3526127fdb0b317601318 ] If emergency system shutdown is called, like by thermal shutdown, a dm device could be alive when the block device couldn't process I/O requests anymore. In this state, the handling of I/O errors by new dm I/O requests or by those already in-flight can lead to a verity corruption state, which is a misjudgment. So, skip verity work in response to I/O error when system is shutting down. Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:56 +01:00
Takashi Iwai	610d2fa0ec	ALSA: pcm: Clear the full allocated memory at hw_params [ Upstream commit 618de0f4ef11acd8cf26902e65493d46cc20cc89 ] The PCM hw_params core function tries to clear up the PCM buffer before actually using for avoiding the information leak from the previous usages or the usage before a new allocation. It performs the memset() with runtime->dma_bytes, but this might still leave some remaining bytes untouched; namely, the PCM buffer size is aligned in page size for mmap, hence runtime->dma_bytes doesn't necessarily cover all PCM buffer pages, and the remaining bytes are exposed via mmap. This patch changes the memory clearance to cover the all buffer pages if the stream is supposed to be mmap-ready (that guarantees that the buffer size is aligned in page size). Reviewed-by: Lars-Peter Clausen <lars@metafoo.de> Link: https://lore.kernel.org/r/20201218145625.2045-3-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:56 +01:00
Pavel Begunkov	c7b04d27c9	io_uring: remove racy overflow list fast checks [ Upstream commit 9cd2be519d05ee78876d55e8e902b7125f78b74f ] list_empty_careful() is not racy only if some conditions are met, i.e. no re-adds after del_init. io_cqring_overflow_flush() does list_move(), so it's actually racy. Remove those checks, we have ->cq_check_overflow for the fast path. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Heiko Carstens	13f9eec229	s390: always clear kernel stack backchain before calling functions [ Upstream commit 9365965db0c7ca7fc81eee27c21d8522d7102c32 ] Clear the kernel stack backchain before potentially calling the lockdep trace_hardirqs_off/on functions. Without this walking the kernel backchain, e.g. during a panic, might stop too early. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Thomas Gleixner	330c1ee7d5	tick/sched: Remove bogus boot "safety" check [ Upstream commit ba8ea8e7dd6e1662e34e730eadfc52aa6816f9dd ] can_stop_idle_tick() checks whether the do_timer() duty has been taken over by a CPU on boot. That's silly because the boot CPU always takes over with the initial clockevent device. But even if no CPU would have installed a clockevent and taken over the duty then the question whether the tick on the current CPU can be stopped or not is moot. In that case the current CPU would have no clockevent either, so there would be nothing to keep ticking. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20201206212002.725238293@linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Jake Wang	9b22bc0f16	drm/amd/display: updated wm table for Renoir [ Upstream commit 410066d24cfc1071be25e402510367aca9db5cb6 ] [Why] For certain timings, Renoir may underflow due to sr exit latency being too slow. [How] Updated wm table for renoir. Signed-off-by: Jake Wang <haonan.wang2@amd.com> Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Jeff Layton	86be0f2a0e	ceph: fix inode refcount leak when ceph_fill_inode on non-I_NEW inode fails [ Upstream commit 68cbb8056a4c24c6a38ad2b79e0a9764b235e8fa ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Trond Myklebust	8bcfa178f9	NFSv4.2: Don't error when exiting early on a READ_PLUS buffer overflow [ Upstream commit 503b934a752f7e789a5f33217520e0a79f3096ac ] Expanding the READ_PLUS extents can cause the read buffer to overflow. If it does, then don't error, but just exit early. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Gabriel Krisman Bertazi	ef3b9ad967	um: ubd: Submit all data segments atomically [ Upstream commit fc6b6a872dcd48c6f39c7975836d75113db67d37 ] Internally, UBD treats each physical IO segment as a separate command to be submitted in the execution pipe. If the pipe returns a transient error after a few segments have already been written, UBD will tell the block layer to requeue the request, but there is no way to reclaim the segments already submitted. When a new attempt to dispatch the request is done, those segments already submitted will get duplicated, causing the WARN_ON below in the best case, and potentially data corruption. In my system, running a UML instance with 2GB of RAM and a 50M UBD disk, I can reproduce the WARN_ON by simply running mkfs.fvat against the disk on a freshly booted system. There are a few ways to around this, like reducing the pressure on the pipe by reducing the queue depth, which almost eliminates the occurrence of the problem, increasing the pipe buffer size on the host system, or by limiting the request to one physical segment, which causes the block layer to submit way more requests to resolve a single operation. Instead, this patch modifies the format of a UBD command, such that all segments are sent through a single element in the communication pipe, turning the command submission atomic from the point of view of the block layer. The new format has a variable size, depending on the number of elements, and looks like this: +------------+-----------+-----------+------------ \| cmd_header \| segment 0 \| segment 1 \| segment ... +------------+-----------+-----------+------------ With this format, we push a pointer to cmd_header in the submission pipe. This has the advantage of reducing the memory footprint of executing a single request, since it allow us to merge some fields in the header. It is possible to reduce even further each segment memory footprint, by merging bitmap_words and cow_offset, for instance, but this is not the focus of this patch and is left as future work. One issue with the patch is that for a big number of segments, we now perform one big memory allocation instead of multiple small ones, but I wasn't able to trigger any real issues or -ENOMEM because of this change, that wouldn't be reproduced otherwise. This was tested using fio with the verify-crc32 option, and by running an ext4 filesystem over this UBD device. The original WARN_ON was: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at lib/refcount.c:28 refcount_warn_saturate+0x13f/0x141 refcount_t: underflow; use-after-free. Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.5.0-rc6-00002-g2a5bb2cf75c8 #346 Stack: 6084eed0 6063dc77 00000009 6084ef60 00000000 604b8d9f 6084eee0 6063dcbc 6084ef40 6006ab8d e013d780 1c00000000 Call Trace: [<600a0c1c>] ? printk+0x0/0x94 [<6004a888>] show_stack+0x13b/0x155 [<6063dc77>] ? dump_stack_print_info+0xdf/0xe8 [<604b8d9f>] ? refcount_warn_saturate+0x13f/0x141 [<6063dcbc>] dump_stack+0x2a/0x2c [<6006ab8d>] __warn+0x107/0x134 [<6008da6c>] ? wake_up_process+0x17/0x19 [<60487628>] ? blk_queue_max_discard_sectors+0x0/0xd [<6006b05f>] warn_slowpath_fmt+0xd1/0xdf [<6006af8e>] ? warn_slowpath_fmt+0x0/0xdf [<600acc14>] ? raw_read_seqcount_begin.constprop.0+0x0/0x15 [<600619ae>] ? os_nsecs+0x1d/0x2b [<604b8d9f>] refcount_warn_saturate+0x13f/0x141 [<6048bc8f>] refcount_sub_and_test.constprop.0+0x2f/0x37 [<6048c8de>] blk_mq_free_request+0xf1/0x10d [<6048ca06>] __blk_mq_end_request+0x10c/0x114 [<6005ac0f>] ubd_intr+0xb5/0x169 [<600a1a37>] __handle_irq_event_percpu+0x6b/0x17e [<600a1b70>] handle_irq_event_percpu+0x26/0x69 [<600a1bd9>] handle_irq_event+0x26/0x34 [<600a1bb3>] ? handle_irq_event+0x0/0x34 [<600a5186>] ? unmask_irq+0x0/0x37 [<600a57e6>] handle_edge_irq+0xbc/0xd6 [<600a131a>] generic_handle_irq+0x21/0x29 [<60048f6e>] do_IRQ+0x39/0x54 [...] ---[ end trace c6e7444e55386c0f ]--- Cc: Christopher Obbard <chris.obbard@collabora.com> Reported-by: Martyn Welch <martyn@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Tested-by: Christopher Obbard <chris.obbard@collabora.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Christopher Obbard	a8b49c4bdf	um: random: Register random as hwrng-core device [ Upstream commit 72d3e093afae79611fa38f8f2cfab9a888fe66f2 ] The UML random driver creates a dummy device under the guest, /dev/hw_random. When this file is read from the guest, the driver reads from the host machine's /dev/random, in-turn reading from the host kernel's entropy pool. This entropy pool could have been filled by a hardware random number generator or just the host kernel's internal software entropy generator. Currently the driver does not fill the guests kernel entropy pool, this requires a userspace tool running inside the guest (like rng-tools) to read from the dummy device provided by this driver, which then would fill the guest's internal entropy pool. This all seems quite pointless when we are already reading from an entropy pool, so this patch aims to register the device as a hwrng device using the hwrng-core framework. This not only improves and cleans up the driver, but also fills the guest's entropy pool without having to resort to using extra userspace tools in the guest. This is typically a nuisance when booting a guest: the random pool takes a long time (~200s) to build up enough entropy since the dummy hwrng is not used to fill the guest's pool. This port was originally attempted by Alexander Neville "dark" (in CC, discussion in Link), but the conversation there stalled since the handling of -EAGAIN errors were no removed and longer handled by the driver. This patch attempts to use the existing method of error handling but utilises the new hwrng core. The issue can be noticed when booting a UML guest: [ 2.560000] random: fast init done [ 214.000000] random: crng init done With the patch applied, filling the pool becomes a lot quicker: [ 2.560000] random: fast init done [ 12.000000] random: crng init done Cc: Alexander Neville <dark@volatile.bz> Link: https://lore.kernel.org/lkml/20190828204609.02a7ff70@TheDarkness/ Link: https://lore.kernel.org/lkml/20190829135001.6a5ff940@TheDarkness.local/ Cc: Sjoerd Simons <sjoerd.simons@collabora.co.uk> Signed-off-by: Christopher Obbard <chris.obbard@collabora.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:55 +01:00
Zhang Qilong	0aa2eecf85	watchdog: rti-wdt: fix reference leak in rti_wdt_probe [ Upstream commit 8711071e9700b67045fe5518161d63f7a03e3c9e ] pm_runtime_get_sync() will increment pm usage counter even it failed. Forgetting to call pm_runtime_put_noidle will result in reference leak in rti_wdt_probe, so we should fix it. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20201030154909.100023-1-zhangqilong3@huawei.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Eric Biggers	eae1fb3bc5	fs/namespace.c: WARN if mnt_count has become negative [ Upstream commit edf7ddbf1c5eb98b720b063b73e20e8a4a1ce673 ] Missing calls to mntget() (or equivalently, too many calls to mntput()) are hard to detect because mntput() delays freeing mounts using task_work_add(), then again using call_rcu(). As a result, mnt_count can often be decremented to -1 without getting a KASAN use-after-free report. Such cases are still bugs though, and they point to real use-after-frees being possible. For an example of this, see the bug fixed by commit `1b0b9cc8d3` ("vfs: fsmount: add missing mntget()"), discussed at https://lkml.kernel.org/linux-fsdevel/20190605135401.GB30925@xxxxxxxxxxxxxxxxxxxxxxxxx/T/#u. This bug should have been trivial to find. But actually, it wasn't found until syzkaller happened to use fchdir() to manipulate the reference count just right for the bug to be noticeable. Address this by making mntput_no_expire() issue a WARN if mnt_count has become negative. Suggested-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Nicholas Piggin	b1e155ccc8	powerpc/64: irq replay remove decrementer overflow check [ Upstream commit 59d512e4374b2d8a6ad341475dc94c4a4bdec7d3 ] This is way to catch some cases of decrementer overflow, when the decrementer has underflowed an odd number of times, while MSR[EE] was disabled. With a typical small decrementer, a timer that fires when MSR[EE] is disabled will be "lost" if MSR[EE] remains disabled for between 4.3 and 8.6 seconds after the timer expires. In any case, the decrementer interrupt would be taken at 8.6 seconds and the timer would be found at that point. So this check is for catching extreme latency events, and it prevents those latencies from being a further few seconds long. It's not obvious this is a good tradeoff. This is already a watchdog magnitude event and that situation is not improved a significantly with this check. For large decrementers, it's useless. Therefore remove this check, which avoids a mftb when enabling hard disabled interrupts (e.g., when enabling after coming from hardware interrupt handlers). Perhaps more importantly, it also removes the clunky MSR[EE] vs PACA_IRQ_HARD_DIS incoherency in soft-interrupt replay which simplifies the code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201107014336.2337337-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Jessica Yu	8b5b2b7683	module: delay kobject uevent until after module init call [ Upstream commit 38dc717e97153e46375ee21797aa54777e5498f3 ] Apparently there has been a longstanding race between udev/systemd and the module loader. Currently, the module loader sends a uevent right after sysfs initialization, but before the module calls its init function. However, some udev rules expect that the module has initialized already upon receiving the uevent. This race has been triggered recently (see link in references) in some systemd mount unit files. For instance, the configfs module creates the /sys/kernel/config mount point in its init function, however the module loader issues the uevent before this happens. sys-kernel-config.mount expects to be able to mount /sys/kernel/config upon receipt of the module loading uevent, but if the configfs module has not called its init function yet, then this directory will not exist and the mount unit fails. A similar situation exists for sys-fs-fuse-connections.mount, as the fuse sysfs mount point is created during the fuse module's init function. If udev is faster than module initialization then the mount unit would fail in a similar fashion. To fix this race, delay the module KOBJ_ADD uevent until after the module has finished calling its init routine. References: https://github.com/systemd/systemd/issues/17586 Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-By: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Daeho Jeong	db6129f6ad	f2fs: fix race of pending_pages in decompression [ Upstream commit 6422a71ef40e4751d59b8c9412e7e2dafe085878 ] I found out f2fs_free_dic() is invoked in a wrong timing, but f2fs_verify_bio() still needed the dic info and it triggered the below kernel panic. It has been caused by the race condition of pending_pages value between decompression and verity logic, when the same compression cluster had been split in different bios. By split bios, f2fs_verify_bio() ended up with decreasing pending_pages value before it is reset to nr_cpages by f2fs_decompress_pages() and caused the kernel panic. [ 4416.564763] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 ... [ 4416.896016] Workqueue: fsverity_read_queue f2fs_verity_work [ 4416.908515] pc : fsverity_verify_page+0x20/0x78 [ 4416.913721] lr : f2fs_verify_bio+0x11c/0x29c [ 4416.913722] sp : ffffffc019533cd0 [ 4416.913723] x29: ffffffc019533cd0 x28: 0000000000000402 [ 4416.913724] x27: 0000000000000001 x26: 0000000000000100 [ 4416.913726] x25: 0000000000000001 x24: 0000000000000004 [ 4416.913727] x23: 0000000000001000 x22: 0000000000000000 [ 4416.913728] x21: 0000000000000000 x20: ffffffff2076f9c0 [ 4416.913729] x19: ffffffff2076f9c0 x18: ffffff8a32380c30 [ 4416.913731] x17: ffffffc01f966d97 x16: 0000000000000298 [ 4416.913732] x15: 0000000000000000 x14: 0000000000000000 [ 4416.913733] x13: f074faec89ffffff x12: 0000000000000000 [ 4416.913734] x11: 0000000000001000 x10: 0000000000001000 [ 4416.929176] x9 : ffffffff20d1f5c7 x8 : 0000000000000000 [ 4416.929178] x7 : 626d7464ff286b6b x6 : ffffffc019533ade [ 4416.929179] x5 : 000000008049000e x4 : ffffffff2793e9e0 [ 4416.929180] x3 : 000000008049000e x2 : ffffff89ecfa74d0 [ 4416.929181] x1 : 0000000000000c40 x0 : ffffffff2076f9c0 [ 4416.929184] Call trace: [ 4416.929187] fsverity_verify_page+0x20/0x78 [ 4416.929189] f2fs_verify_bio+0x11c/0x29c [ 4416.929192] f2fs_verity_work+0x58/0x84 [ 4417.050667] process_one_work+0x270/0x47c [ 4417.055354] worker_thread+0x27c/0x4d8 [ 4417.059784] kthread+0x13c/0x320 [ 4417.063693] ret_from_fork+0x10/0x18 Chao pointed this can happen by the below race condition. Thread A f2fs_post_read_wq fsverity_wq - f2fs_read_multi_pages() - f2fs_alloc_dic - dic->pending_pages = 2 - submit_bio() - submit_bio() - f2fs_post_read_work() handle first bio - f2fs_decompress_work() - __read_end_io() - f2fs_decompress_pages() - dic->pending_pages-- - enqueue f2fs_verity_work() - f2fs_verity_work() handle first bio - f2fs_verify_bio() - dic->pending_pages-- - f2fs_post_read_work() handle second bio - f2fs_decompress_work() - enqueue f2fs_verity_work() - f2fs_verify_pages() - f2fs_free_dic() - f2fs_verity_work() handle second bio - f2fs_verfy_bio() - use-after-free on dic Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Jaegeuk Kim	ee3f8aefd0	f2fs: avoid race condition for shrinker count [ Upstream commit a95ba66ac1457b76fe472c8e092ab1006271f16c ] Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in wrong do_shinker work. Let's avoid to return insane overflowed value by adding single tracking value. Reported-by: Light Hsieh <Light.Hsieh@mediatek.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Trond Myklebust	3c0f0f5f58	NFSv4: Fix a pNFS layout related use-after-free race when freeing the inode [ Upstream commit b6d49ecd1081740b6e632366428b960461f8158b ] When returning the layout in nfs4_evict_inode(), we need to ensure that the layout is actually done being freed before we can proceed to free the inode itself. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:54 +01:00
Qinglang Miao	06ac2ca098	i3c master: fix missing destroy_workqueue() on error in i3c_master_register [ Upstream commit 59165d16c699182b86b5c65181013f1fd88feb62 ] Add the missing destroy_workqueue() before return from i3c_master_register in the error handling case. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://lore.kernel.org/linux-i3c/20201028091543.136167-1-miaoqinglang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Qinglang Miao	498d90690f	powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe() [ Upstream commit ffa1797040c5da391859a9556be7b735acbe1242 ] I noticed that iounmap() of msgr_block_addr before return from mpic_msgr_probe() in the error handling case is missing. So use devm_ioremap() instead of just ioremap() when remapping the message register block, so the mapping will be automatically released on probe failure. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201028091551.136400-1-miaoqinglang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Zheng Liang	acc3c8cc27	rtc: pl031: fix resource leak in pl031_probe [ Upstream commit 1eab0fea2514b269e384c117f5b5772b882761f0 ] When devm_rtc_allocate_device is failed in pl031_probe, it should release mem regions with device. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zheng Liang <zhengliang6@huawei.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20201112093139.32566-1-zhengliang6@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Jan Kara	26058c397b	quota: Don't overflow quota file offsets [ Upstream commit 10f04d40a9fa29785206c619f80d8beedb778837 ] The on-disk quota format supports quota files with upto 2^32 blocks. Be careful when computing quota file offsets in the quota files from block numbers as they can overflow 32-bit types. Since quota files larger than 4GB would require ~26 millions of quota users, this is mostly a theoretical concern now but better be careful, fuzzers would find the problem sooner or later anyway... Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Miroslav Benes	bb2ab902f6	module: set MODULE_STATE_GOING state when a module fails to load [ Upstream commit 5e8ed280dab9eeabc1ba0b2db5dbe9fe6debb6b5 ] If a module fails to load due to an error in prepare_coming_module(), the following error handling in load_module() runs with MODULE_STATE_COMING in module's state. Fix it by correctly setting MODULE_STATE_GOING under "bug_cleanup" label. Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Dinghao Liu	0ad9a6e613	rtc: sun6i: Fix memleak in sun6i_rtc_clk_init [ Upstream commit 28d211919e422f58c1e6c900e5810eee4f1ce4c8 ] When clk_hw_register_fixed_rate_with_accuracy() fails, clk_data should be freed. It's the same for the subsequent two error paths, but we should also unregister the already registered clocks in them. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201020061226.6572-1-dinghao.liu@zju.edu.cn Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-01-06 14:56:53 +01:00
Xiaoguang Wang	b5a2f093b6	io_uring: check kthread stopped flag when sq thread is unparked commit 65b2b213484acd89a3c20dbb524e52a2f3793b78 upstream. syzbot reports following issue: INFO: task syz-executor.2:12399 can't die for more than 143 seconds. task:syz-executor.2 state:D stack:28744 pid:12399 ppid: 8504 flags:0x00004004 Call Trace: context_switch kernel/sched/core.c:3773 [inline] __schedule+0x893/0x2170 kernel/sched/core.c:4522 schedule+0xcf/0x270 kernel/sched/core.c:4600 schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1847 do_wait_for_common kernel/sched/completion.c:85 [inline] __wait_for_common kernel/sched/completion.c:106 [inline] wait_for_common kernel/sched/completion.c:117 [inline] wait_for_completion+0x163/0x260 kernel/sched/completion.c:138 kthread_stop+0x17a/0x720 kernel/kthread.c:596 io_put_sq_data fs/io_uring.c:7193 [inline] io_sq_thread_stop+0x452/0x570 fs/io_uring.c:7290 io_finish_async fs/io_uring.c:7297 [inline] io_sq_offload_create fs/io_uring.c:8015 [inline] io_uring_create fs/io_uring.c:9433 [inline] io_uring_setup+0x19b7/0x3730 fs/io_uring.c:9507 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45deb9 Code: Unable to access opcode bytes at RIP 0x45de8f. RSP: 002b:00007f174e51ac78 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9 RAX: ffffffffffffffda RBX: 0000000000008640 RCX: 000000000045deb9 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 00000000000050e5 RBP: 000000000118bf58 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118bf2c R13: 00007ffed9ca723f R14: 00007f174e51b9c0 R15: 000000000118bf2c INFO: task syz-executor.2:12399 blocked for more than 143 seconds. Not tainted 5.10.0-rc3-next-20201110-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Currently we don't have a reproducer yet, but seems that there is a race in current codes: => io_put_sq_data ctx_list is empty now. \| ==> kthread_park(sqd->thread); \| \| T1: sq thread is parked now. ==> kthread_stop(sqd->thread); \| KTHREAD_SHOULD_STOP is set now.\| ===> kthread_unpark(k); \| \| T2: sq thread is now unparkd, run again. \| \| T3: sq thread is now preempted out. \| ===> wake_up_process(k); \| \| \| T4: Since sqd ctx_list is empty, needs_sched will be true, \| then sq thread sets task state to TASK_INTERRUPTIBLE, \| and schedule, now sq thread will never be waken up. ===> wait_for_completion \| I have artificially used mdelay() to simulate above race, will get same stack like this syzbot report, but to be honest, I'm not sure this code race triggers syzbot report. To fix this possible code race, when sq thread is unparked, need to check whether sq thread has been stopped. Reported-by: syzbot+03beeb595f074db9cfd1@syzkaller.appspotmail.com Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:53 +01:00
Boqun Feng	9080305017	fcntl: Fix potential deadlock in send_sig{io, urg}() commit 8d1ddb5e79374fb277985a6b3faa2ed8631c5b4c upstream. Syzbot reports a potential deadlock found by the newly added recursive read deadlock detection in lockdep: [...] ======================================================== [...] WARNING: possible irq lock inversion dependency detected [...] 5.9.0-rc2-syzkaller #0 Not tainted [...] -------------------------------------------------------- [...] syz-executor.1/10214 just changed the state of lock: [...] ffff88811f506338 (&f->f_owner.lock){.+..}-{2:2}, at: send_sigurg+0x1d/0x200 [...] but this lock was taken by another, HARDIRQ-safe lock in the past: [...] (&dev->event_lock){-...}-{2:2} [...] [...] [...] and interrupts could create inverse lock ordering between them. [...] [...] [...] other info that might help us debug this: [...] Chain exists of: [...] &dev->event_lock --> &new->fa_lock --> &f->f_owner.lock [...] [...] Possible interrupt unsafe locking scenario: [...] [...] CPU0 CPU1 [...] ---- ---- [...] lock(&f->f_owner.lock); [...] local_irq_disable(); [...] lock(&dev->event_lock); [...] lock(&new->fa_lock); [...] <Interrupt> [...] lock(&dev->event_lock); [...] [...] * DEADLOCK * The corresponding deadlock case is as followed: CPU 0 CPU 1 CPU 2 read_lock(&fown->lock); spin_lock_irqsave(&dev->event_lock, ...) write_lock_irq(&filp->f_owner.lock); // wait for the lock read_lock(&fown-lock); // have to wait until the writer release // due to the fairness <interrupted> spin_lock_irqsave(&dev->event_lock); // wait for the lock The lock dependency on CPU 1 happens if there exists a call sequence: input_inject_event(): spin_lock_irqsave(&dev->event_lock,...); input_handle_event(): input_pass_values(): input_to_handler(): handler->event(): // evdev_event() evdev_pass_values(): spin_lock(&client->buffer_lock); __pass_event(): kill_fasync(): kill_fasync_rcu(): read_lock(&fa->fa_lock); send_sigio(): read_lock(&fown->lock); To fix this, make the reader in send_sigurg() and send_sigio() use read_lock_irqsave() and read_lock_irqrestore(). Reported-by: syzbot+22e87cdf94021b984aa6@syzkaller.appspotmail.com Reported-by: syzbot+c5e32344981ad9f33750@syzkaller.appspotmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:53 +01:00
Theodore Ts'o	721972b866	ext4: check for invalid block size early when mounting a file system commit c9200760da8a728eb9767ca41a956764b28c1310 upstream. Check for valid block size directly by validating s_log_block_size; we were doing this in two places. First, by calculating blocksize via BLOCK_SIZE << s_log_block_size, and then checking that the blocksize was valid. And then secondly, by checking s_log_block_size directly. The first check is not reliable, and can trigger an UBSAN warning if s_log_block_size on a maliciously corrupted superblock is greater than 22. This is harmless, since the second test will correctly reject the maliciously fuzzed file system, but to make syzbot shut up, and because the two checks are duplicative in any case, delete the blocksize check, and move the s_log_block_size earlier in ext4_fill_super(). Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reported-by: syzbot+345b75652b1d24227443@syzkaller.appspotmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:52 +01:00
Randy Dunlap	8ed894f111	bfs: don't use WARNING: string when it's just info. commit dc889b8d4a8122549feabe99eead04e6b23b6513 upstream. Make the printk() [bfs "printf" macro] seem less severe by changing "WARNING:" to "NOTE:". <asm-generic/bug.h> warns us about using WARNING or BUG in a format string other than in WARN() or BUG() family macros. bfs/inode.c is doing just that in a normal printk() call, so change the "WARNING" string to be "NOTE". Link: https://lkml.kernel.org/r/20201203212634.17278-1-rdunlap@infradead.org Reported-by: syzbot+3fd34060f26e766536ff@syzkaller.appspotmail.com Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: "Tigran A. Aivazian" <aivazian.tigran@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:52 +01:00
Takashi Iwai	fb05e983ea	ALSA: rawmidi: Access runtime->avail always in spinlock commit 88a06d6fd6b369d88cec46c62db3e2604a2f50d5 upstream. The runtime->avail field may be accessed concurrently while some places refer to it without taking the runtime->lock spinlock, as detected by KCSAN. Usually this isn't a big problem, but for consistency and safety, we should take the spinlock at each place referencing this field. Reported-by: syzbot+a23a6f1215c84756577c@syzkaller.appspotmail.com Reported-by: syzbot+3d367d1df1d2b67f5c19@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20201206083527.21163-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:52 +01:00
Takashi Iwai	cf7fe671cd	ALSA: seq: Use bool for snd_seq_queue internal flags commit 4ebd47037027c4beae99680bff3b20fdee5d7c1e upstream. The snd_seq_queue struct contains various flags in the bit fields. Those are categorized to two different use cases, both of which are protected by different spinlocks. That implies that there are still potential risks of the bad operations for bit fields by concurrent accesses. For addressing the problem, this patch rearranges those flags to be a standard bool instead of a bit field. Reported-by: syzbot+63cbe31877bb80ef58f5@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20201206083456.21110-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:52 +01:00
Chao Yu	1c5a034710	f2fs: fix shift-out-of-bounds in sanity_check_raw_super() commit e584bbe821229a3e7cc409eecd51df66f9268c21 upstream. syzbot reported a bug which could cause shift-out-of-bounds issue, fix it. Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x107/0x163 lib/dump_stack.c:120 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395 sanity_check_raw_super fs/f2fs/super.c:2812 [inline] read_raw_super_block fs/f2fs/super.c:3267 [inline] f2fs_fill_super.cold+0x16c9/0x16f6 fs/f2fs/super.c:3519 mount_bdev+0x34d/0x410 fs/super.c:1366 legacy_get_tree+0x105/0x220 fs/fs_context.c:592 vfs_get_tree+0x89/0x2f0 fs/super.c:1496 do_new_mount fs/namespace.c:2896 [inline] path_mount+0x12ae/0x1e70 fs/namespace.c:3227 do_mount fs/namespace.c:3240 [inline] __do_sys_mount fs/namespace.c:3448 [inline] __se_sys_mount fs/namespace.c:3425 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: syzbot+ca9a785f8ac472085994@syzkaller.appspotmail.com Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:52 +01:00
Mauro Carvalho Chehab	2b56f16e34	media: gp8psk: initialize stats at power control logic commit d0ac1a26ed5943127cb0156148735f5f52a07075 upstream. As reported on: https://lore.kernel.org/linux-media/20190627222020.45909-1-willemdebruijn.kernel@gmail.com/ if gp8psk_usb_in_op() returns an error, the status var is not initialized. Yet, this var is used later on, in order to identify: - if the device was already started; - if firmware has loaded; - if the LNBf was powered on. Using status = 0 seems to ensure that everything will be properly powered up. So, instead of the proposed solution, let's just set status = 0. Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:52 +01:00
Anant Thazhemadam	f290cffdf7	misc: vmw_vmci: fix kernel info-leak by initializing dbells in vmci_ctx_get_chkpt_doorbells() commit 31dcb6c30a26d32650ce134820f27de3c675a45a upstream. A kernel-infoleak was reported by syzbot, which was caused because dbells was left uninitialized. Using kzalloc() instead of kmalloc() fixes this issue. Reported-by: syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com Tested-by: syzbot+a79e17c39564bedf0930@syzkaller.appspotmail.com Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> Link: https://lore.kernel.org/r/20201122224534.333471-1-anant.thazhemadam@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:52 +01:00
Rustam Kovhaev	a021b66961	reiserfs: add check for an invalid ih_entry_count commit d24396c5290ba8ab04ba505176874c4e04a2d53c upstream. when directory item has an invalid value set for ih_entry_count it might trigger use-after-free or out-of-bounds read in bin_search_in_dir_item() ih_entry_count * IH_SIZE for directory item should not be larger than ih_item_len Link: https://lore.kernel.org/r/20201101140958.3650143-1-rkovhaev@gmail.com Reported-and-tested-by: syzbot+83b6f7cf9922cae5c4d7@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=83b6f7cf9922cae5c4d7 Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-06 14:56:52 +01:00

1 2 3 4 5 ...

969594 Commits