linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-26 10:55:09 +07:00

Author	SHA1	Message	Date
Christophe JAILLET	8f995afae9	net: mscc: ocelot: Fix a resource leak in the error handling path of the probe function [ Upstream commit f87675b836b324d270fd52f1f5e6d6bb9f4bd1d5 ] In case of error after calling 'ocelot_init()', it must be undone by a corresponding 'ocelot_deinit()' call, as already done in the remove function. Fixes: `a556c76adc` ("net: mscc: Add initial Ocelot switch support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20201213114838.126922-1-christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:57 +01:00
Christophe JAILLET	1e7524c981	net: bcmgenet: Fix a resource leak in an error handling path in the probe functin [ Upstream commit 4375ada01963d1ebf733d60d1bb6e5db401e1ac6 ] If the 'register_netdev()' call fails, we must undo a previous 'bcmgenet_mii_init()' call. Fixes: `1c1008c793` ("net: bcmgenet: add main driver file") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20201212182005.120437-1-christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:57 +01:00
Ioana Ciornei	5c0109f779	dpaa2-eth: fix the size of the mapped SGT buffer [ Upstream commit 54a57d1c449275ee727154ac106ec1accae012e3 ] This patch fixes an error condition triggered when the code path which transmits a S/G frame descriptor when the skb's headroom is not enough for DPAA2's needs. We are greated with a splat like the one below when a SGT structure is recycled and that is because even though a dma_unmap is performed on the Tx confirmation path, the unmap is not done with the proper size. [ 714.464927] WARNING: CPU: 13 PID: 0 at drivers/iommu/io-pgtable-arm.c:281 __arm_lpae_map+0x2d4/0x30c (...) [ 714.465343] Call trace: [ 714.465348] __arm_lpae_map+0x2d4/0x30c [ 714.465353] __arm_lpae_map+0x114/0x30c [ 714.465357] __arm_lpae_map+0x114/0x30c [ 714.465362] __arm_lpae_map+0x114/0x30c [ 714.465366] arm_lpae_map+0xf4/0x180 [ 714.465373] arm_smmu_map+0x4c/0xc0 [ 714.465379] __iommu_map+0x100/0x2bc [ 714.465385] iommu_map_atomic+0x20/0x30 [ 714.465391] __iommu_dma_map+0xb0/0x110 [ 714.465397] iommu_dma_map_page+0xb8/0x120 [ 714.465404] dma_map_page_attrs+0x1a8/0x210 [ 714.465413] __dpaa2_eth_tx+0x384/0xbd0 [fsl_dpaa2_eth] [ 714.465421] dpaa2_eth_tx+0x84/0x134 [fsl_dpaa2_eth] [ 714.465427] dev_hard_start_xmit+0x10c/0x2b0 [ 714.465433] sch_direct_xmit+0x1a0/0x550 (...) The dpaa2-eth driver uses an area of software annotations to transmit necessary information from the Tx path to the Tx confirmation one. This SWA structure has a different layout for each kind of frame that we are dealing with: linear, S/G or XDP. The commit referenced was incorrectly setting up the 'sgt_size' field for the S/G type of SWA even though we are dealing with a linear skb here. Fixes: `d70446ee1f` ("dpaa2-eth: send a scatter-gather FD instead of realloc-ing") Reported-by: Daniel Thompson <daniel.thompson@linaro.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20201211171607.108034-1-ciorneiioana@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:56 +01:00
Oleksij Rempel	d50170ac30	net: dsa: qca: ar9331: fix sleeping function called from invalid context bug [ Upstream commit 3e47495fc4de4122598dd51ae8527b09b8209646 ] With lockdep enabled, we will get following warning: ar9331_switch ethernet.1:10 lan0 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:00] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13) BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 18, name: kworker/0:1 INFO: lockdep is turned off. irq event stamp: 602 hardirqs last enabled at (601): [<8073fde0>] _raw_spin_unlock_irq+0x3c/0x80 hardirqs last disabled at (602): [<8073a4f4>] __schedule+0x184/0x800 softirqs last enabled at (0): [<80080f60>] copy_process+0x578/0x14c8 softirqs last disabled at (0): [<00000000>] 0x0 CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.10.0-rc3-ar9331-00734-g7d644991df0c #31 Workqueue: events deferred_probe_work_func Stack : 80980000 80980000 8089ef70 80890000 804b5414 80980000 00000002 80b53728 00000000 800d1268 804b5414 ffffffde 00000017 800afe08 81943860 0f5bfc32 00000000 00000000 8089ef70 819436c0 ffffffea 00000000 00000000 00000000 8194390c 808e353c 0000000f 66657272 80980000 00000000 00000000 80890000 804b5414 80980000 00000002 80b53728 00000000 00000000 00000000 80d40000 ... Call Trace: [<80069ce0>] show_stack+0x9c/0x140 [<800afe08>] ___might_sleep+0x220/0x244 [<8073bfb0>] __mutex_lock+0x70/0x374 [<8073c2e0>] mutex_lock_nested+0x2c/0x38 [<804b5414>] regmap_update_bits_base+0x38/0x8c [<804ee584>] regmap_update_bits+0x1c/0x28 [<804ee714>] ar9331_sw_unmask_irq+0x34/0x60 [<800d91f0>] unmask_irq+0x48/0x70 [<800d93d4>] irq_startup+0x114/0x11c [<800d65b4>] __setup_irq+0x4f4/0x6d0 [<800d68a0>] request_threaded_irq+0x110/0x190 [<804e3ef0>] phy_request_interrupt+0x4c/0xe4 [<804df508>] phylink_bringup_phy+0x2c0/0x37c [<804df7bc>] phylink_of_phy_connect+0x118/0x130 [<806c1a64>] dsa_slave_create+0x3d0/0x578 [<806bc4ec>] dsa_register_switch+0x934/0xa20 [<804eef98>] ar9331_sw_probe+0x34c/0x364 [<804eb48c>] mdio_probe+0x44/0x70 [<8049e3b4>] really_probe+0x30c/0x4f4 [<8049ea10>] driver_probe_device+0x264/0x26c [<8049bc10>] bus_for_each_drv+0xb4/0xd8 [<8049e684>] __device_attach+0xe8/0x18c [<8049ce58>] bus_probe_device+0x48/0xc4 [<8049db70>] deferred_probe_work_func+0xdc/0xf8 [<8009ff64>] process_one_work+0x2e4/0x4a0 [<800a0770>] worker_thread+0x2a8/0x354 [<800a774c>] kthread+0x16c/0x174 [<8006306c>] ret_from_kernel_thread+0x14/0x1c ar9331_switch ethernet.1:10 lan1 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:02] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13) DSA: tree 0 setup To fix it, it is better to move access to MDIO register to the .irq_bus_sync_unlock call back. Fixes: `ec6698c272` ("net: dsa: add support for Atheros AR9331 built-in switch") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20201211110317.17061-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:56 +01:00
Björn Töpel	bc79bf6c58	i40e, xsk: clear the status bits for the next_to_use descriptor [ Upstream commit 64050b5b8706d304ba647591b06e1eddc55e8bd9 ] On the Rx side, the next_to_use index points to the next item in the HW ring to be refilled/allocated, and next_to_clean points to the next item to potentially be processed. When the HW Rx ring is fully refilled, i.e. no packets has been processed, the next_to_use will be next_to_clean - 1. When the ring is fully processed next_to_clean will be equal to next_to_use. The latter case is where a bug is triggered. If the next_to_use bits are not cleared, and the "fully processed" state is entered, a stale descriptor can be processed. The skb-path correctly clear the status bit for the next_to_use descriptor, but the AF_XDP zero-copy path did not do that. This change adds the status bits clearing of the next_to_use descriptor. Fixes: `3b4f0b66c2` ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:56 +01:00
Björn Töpel	0c3d87fa50	ice, xsk: clear the status bits for the next_to_use descriptor [ Upstream commit 8d14768a7972b92c73259f0c9c45b969d85e3a60 ] On the Rx side, the next_to_use index points to the next item in the HW ring to be refilled/allocated, and next_to_clean points to the next item to potentially be processed. When the HW Rx ring is fully refilled, i.e. no packets has been processed, the next_to_use will be next_to_clean - 1. When the ring is fully processed next_to_clean will be equal to next_to_use. The latter case is where a bug is triggered. If the next_to_use bits are not cleared, and the "fully processed" state is entered, a stale descriptor can be processed. The skb-path correctly clear the status bit for the next_to_use descriptor, but the AF_XDP zero-copy path did not do that. This change adds the status bits clearing of the next_to_use descriptor. Fixes: `2d4238f556` ("ice: Add support for AF_XDP") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:56 +01:00
Sven Van Asbroeck	f290c9bed1	lan743x: fix rx_napi_poll/interrupt ping-pong [ Upstream commit 57030a0b620f735bf557696e5ceb9f32c2b3bb8f ] Even if there is more rx data waiting on the chip, the rx napi poll fn will never run more than once - it will always read a few buffers, then bail out and re-arm interrupts. Which results in ping-pong between napi and interrupt. This defeats the purpose of napi, and is bad for performance. Fix by making the rx napi poll behave identically to other ethernet drivers: 1. initialize rx napi polling with an arbitrary budget (64). 2. in the polling fn, return full weight if rx queue is not depleted, this tells the napi core to "keep polling". 3. update the rx tail ("ring the doorbell") once for every 8 processed rx ring buffers. Thanks to Jakub Kicinski, Eric Dumazet and Andrew Lunn for their expert opinions and suggestions. Tested with 20 seconds of full bandwidth receive (iperf3): rx irqs softirqs(NET_RX) ----------------------------- before 23827 33620 after 129 4081 Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430 Fixes: `23f0703c12` ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Link: https://lore.kernel.org/r/20201215161954.5950-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:56 +01:00
Heiko Carstens	1856405862	s390/test_unwind: fix CALL_ON_STACK tests [ Upstream commit f22b9c219a798e1bf11110a3d2733d883e6da059 ] The CALL_ON_STACK tests use the no_dat stack to switch to a different stack for unwinding tests. If an interrupt or machine check happens while using that stack, and previously being on the async stack, the interrupt / machine check entry code (SWITCH_ASYNC) will assume that the previous context did not use the async stack and happily use the async stack again. This will lead to stack corruption of the previous context. To solve this disable both interrupts and machine checks before switching to the no_dat stack. Fixes: `7868249fbb` ("s390/test_unwind: add CALL_ON_STACK tests") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:56 +01:00
Dwaipayan Ray	0633094ec7	checkpatch: fix unescaped left brace [ Upstream commit 03f4935135b9efeb780b970ba023c201f81cf4e6 ] There is an unescaped left brace in a regex in OPEN_BRACE check. This throws a runtime error when checkpatch is run with --fix flag and the OPEN_BRACE check is executed. Fix it by escaping the left brace. Link: https://lkml.kernel.org/r/20201115202928.81955-1-dwaipayanray1@gmail.com Fixes: `8d1824780f` ("checkpatch: add --fix option for a couple OPEN_BRACE misuses") Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:56 +01:00
Alexey Dobriyan	b202ac9c73	proc: fix lookup in /proc/net subdirectories after setns(2) [ Upstream commit c6c75deda81344c3a95d1d1f606d5cee109e5d54 ] Commit `1fde6f21d9` ("proc: fix /proc/net/* after setns(2)") only forced revalidation of regular files under /proc/net/ However, /proc/net/ is unusual in the sense of /proc/net/foo handlers take netns pointer from parent directory which is old netns. Steps to reproduce: (void)open("/proc/net/sctp/snmp", O_RDONLY); unshare(CLONE_NEWNET); int fd = open("/proc/net/sctp/snmp", O_RDONLY); read(fd, &c, 1); Read will read wrong data from original netns. Patch forces lookup on every directory under /proc/net . Link: https://lkml.kernel.org/r/20201205160916.GA109739@localhost.localdomain Fixes: `1da4d377f9` ("proc: revalidate misc dentries") Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: "Rantala, Tommi T. (Nokia - FI/Espoo)" <tommi.t.rantala@nokia.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:56 +01:00
Johannes Weiner	bd3f4b6fd9	mm: don't wake kswapd prematurely when watermark boosting is disabled [ Upstream commit 597c892038e08098b17ccfe65afd9677e6979800 ] On 2-node NUMA hosts we see bursts of kswapd reclaim and subsequent pressure spikes and stalls from cache refaults while there is plenty of free memory in the system. Usually, kswapd is woken up when all eligible nodes in an allocation are full. But the code related to watermark boosting can wake kswapd on one full node while the other one is mostly empty. This may be justified to fight fragmentation, but is currently unconditionally done whether watermark boosting is occurring or not. In our case, many of our workloads' throughput scales with available memory, and pure utilization is a more tangible concern than trends around longer-term fragmentation. As a result we generally disable watermark boosting. Wake kswapd only woken when watermark boosting is requested. Link: https://lkml.kernel.org/r/20201020175833.397286-1-hannes@cmpxchg.org Fixes: `1c30844d2d` ("mm: reclaim small amounts of memory when an external fragmentation event occurs") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:55 +01:00
Dan Carpenter	9b52a37fb3	hugetlb: fix an error code in hugetlb_reserve_pages() [ Upstream commit 7fc2513aa237e2ce239ab54d7b04d1d79b317110 ] Preserve the error code from region_add() instead of returning success. Link: https://lkml.kernel.org/r/X9NGZWnZl5/Mt99R@mwanda Fixes: `0db9d74ed8` ("hugetlb: disable region_add file_region coalescing") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Mina Almasry <almasrymina@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:55 +01:00
Oscar Salvador	b7bf8ed8d1	mm,memory_failure: always pin the page in madvise_inject_error [ Upstream commit 1e8aaedb182d6ddffc894b832e4962629907b3e0 ] madvise_inject_error() uses get_user_pages_fast to translate the address we specified to a page. After [1], we drop the extra reference count for memory_failure() path. That commit says that memory_failure wanted to keep the pin in order to take the page out of circulation. The truth is that we need to keep the page pinned, otherwise the page might be re-used after the put_page() and we can end up messing with someone else's memory. E.g: CPU0 process X CPU1 madvise_inject_error get_user_pages put_page page gets reclaimed process Y allocates the page memory_failure // We mess with process Y memory madvise() is meant to operate on a self address space, so messing with pages that do not belong to us seems the wrong thing to do. To avoid that, let us keep the page pinned for memory_failure as well. Pages for DAX mappings will release this extra refcount in memory_failure_dev_pagemap. [1] ("23e7b5c2e271: mm, madvise_inject_error: Let memory_failure() optionally take a page reference") Link: https://lkml.kernel.org/r/20201207094818.8518-1-osalvador@suse.de Fixes: `23e7b5c2e2` ("mm, madvise_inject_error: Let memory_failure() optionally take a page reference") Signed-off-by: Oscar Salvador <osalvador@suse.de> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:55 +01:00
Vincenzo Frascino	23713b480d	mm/vmalloc.c: fix kasan shadow poisoning size [ Upstream commit c041098c690fe53cea5d20c62f128a4f7a5c19fe ] The size of vm area can be affected by the presence or not of the guard page. In particular when VM_NO_GUARD is present, the actual accessible size has to be considered like the real size minus the guard page. Currently kasan does not keep into account this information during the poison operation and in particular tries to poison the guard page as well. This approach, even if incorrect, does not cause an issue because the tags for the guard page are written in the shadow memory. With the future introduction of the Tag-Based KASAN, being the guard page inaccessible by nature, the write tag operation on this page triggers a fault. Fix kasan shadow poisoning size invoking get_vm_area_size() instead of accessing directly the field in the data structure to detect the correct value. Link: https://lkml.kernel.org/r/20201027160213.32904-1-vincenzo.frascino@arm.com Fixes: `d98c9e83b5` ("kasan: fix crashes on access to memory mapped by vm_map_ram()") Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:55 +01:00
Waiman Long	4a9d8b0789	mm/vmalloc: Fix unlock order in s_stop() [ Upstream commit 0a7dd4e901b8a4ee040ba953900d1d7120b34ee5 ] When multiple locks are acquired, they should be released in reverse order. For s_start() and s_stop() in mm/vmalloc.c, that is not the case. s_start: mutex_lock(&vmap_purge_lock); spin_lock(&vmap_area_lock); s_stop : mutex_unlock(&vmap_purge_lock); spin_unlock(&vmap_area_lock); This unlock sequence, though allowed, is not optimal. If a waiter is present, mutex_unlock() will need to go through the slowpath of waking up the waiter with preemption disabled. Fix that by releasing the spinlock first before the mutex. Link: https://lkml.kernel.org/r/20201213180843.16938-1-longman@redhat.com Fixes: `e36176be1c` ("mm/vmalloc: rework vmap_area_lock") Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:55 +01:00
Matthew Wilcox (Oracle)	bbb7c059fd	sparc: fix handling of page table constructor failure [ Upstream commit 06517c9a336f4c20f2064611bf4b1e7881a95fe1 ] The page has just been allocated, so its refcount is 1. free_unref_page() is for use on pages which have a zero refcount. Use __free_page() like the other implementations of pte_alloc_one(). Link: https://lkml.kernel.org/r/20201125034655.27687-1-willy@infradead.org Fixes: `1ae9ae5f7d` ("sparc: handle pgtable_page_ctor() fail") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:55 +01:00
Shakeel Butt	dd156e3fca	mm/rmap: always do TTU_IGNORE_ACCESS [ Upstream commit 013339df116c2ee0d796dd8bfb8f293a2030c063 ] Since commit `369ea8242c` ("mm/rmap: update to new mmu_notifier semantic v2"), the code to check the secondary MMU's page table access bit is broken for !(TTU_IGNORE_ACCESS) because the page is unmapped from the secondary MMU's page table before the check. More specifically for those secondary MMUs which unmap the memory in mmu_notifier_invalidate_range_start() like kvm. However memory reclaim is the only user of !(TTU_IGNORE_ACCESS) or the absence of TTU_IGNORE_ACCESS and it explicitly performs the page table access check before trying to unmap the page. So, at worst the reclaim will miss accesses in a very short window if we remove page table access check in unmapping code. There is an unintented consequence of !(TTU_IGNORE_ACCESS) for the memcg reclaim. From memcg reclaim the page_referenced() only account the accesses from the processes which are in the same memcg of the target page but the unmapping code is considering accesses from all the processes, so, decreasing the effectiveness of memcg reclaim. The simplest solution is to always assume TTU_IGNORE_ACCESS in unmapping code. Link: https://lkml.kernel.org/r/20201104231928.1494083-1-shakeelb@google.com Fixes: `369ea8242c` ("mm/rmap: update to new mmu_notifier semantic v2") Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:55 +01:00
Muchun Song	6d48fff6d3	mm: memcg/slab: fix use after free in obj_cgroup_charge [ Upstream commit eefbfa7fd678805b38a46293e78543f98f353d3e ] The rcu_read_lock/unlock only can guarantee that the memcg will not be freed, but it cannot guarantee the success of css_get to memcg. If the whole process of a cgroup offlining is completed between reading a objcg->memcg pointer and bumping the css reference on another CPU, and there are exactly 0 external references to this memory cgroup (how we get to the obj_cgroup_charge() then?), css_get() can change the ref counter from 0 back to 1. Link: https://lkml.kernel.org/r/20201028035013.99711-2-songmuchun@bytedance.com Fixes: `bf4f059954` ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Roman Gushchin <guro@fb.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Yafang Shao <laoar.shao@gmail.com> Cc: Chris Down <chris@chrisdown.name> Cc: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:54 +01:00
Muchun Song	02314d05e8	mm: memcg/slab: fix return of child memcg objcg for root memcg [ Upstream commit 2f7659a314736b32b66273dbf91c19874a052fde ] Consider the following memcg hierarchy. root / \ A B If we failed to get the reference on objcg of memcg A, the get_obj_cgroup_from_current can return the wrong objcg for the root memcg. Link: https://lkml.kernel.org/r/20201029164429.58703-1-songmuchun@bytedance.com Fixes: `bf4f059954` ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Yafang Shao <laoar.shao@gmail.com> Cc: Chris Down <chris@chrisdown.name> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Eugene Syromiatnikov <esyr@redhat.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Adrian Reber <areber@redhat.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:54 +01:00
Jason Gunthorpe	cfde6c1810	mm/gup: combine put_compound_head() and unpin_user_page() [ Upstream commit 4509b42c38963f495b49aa50209c34337286ecbe ] These functions accomplish the same thing but have different implementations. unpin_user_page() has a bug where it calls mod_node_page_state() after calling put_page() which creates a risk that the page could have been hot-uplugged from the system. Fix this by using put_compound_head() as the only implementation. __unpin_devmap_managed_user_page() and related can be deleted as well in favour of the simpler, but slower, version in put_compound_head() that has an extra atomic page_ref_sub, but always calls put_page() which internally contains the special devmap code. Move put_compound_head() to be directly after try_grab_compound_head() so people can find it in future. Link: https://lkml.kernel.org/r/0-v1-6730d4ee0d32+40e6-gup_combine_put_jgg@nvidia.com Fixes: `1970dc6f52` ("mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> CC: Joao Martins <joao.m.martins@oracle.com> CC: Jonathan Corbet <corbet@lwn.net> CC: Dan Williams <dan.j.williams@intel.com> CC: Dave Chinner <david@fromorbit.com> CC: Christoph Hellwig <hch@infradead.org> CC: Jane Chu <jane.chu@oracle.com> CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> CC: Michal Hocko <mhocko@suse.com> CC: Mike Kravetz <mike.kravetz@oracle.com> CC: Shuah Khan <shuah@kernel.org> CC: Muchun Song <songmuchun@bytedance.com> CC: Vlastimil Babka <vbabka@suse.cz> CC: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:54 +01:00
Jason Gunthorpe	537946556c	mm/gup: prevent gup_fast from racing with COW during fork [ Upstream commit 57efa1fe5957694fa541c9062de0a127f0b9acb0 ] Since commit `70e806e4e6` ("mm: Do early cow for pinned pages during fork() for ptes") pages under a FOLL_PIN will not be write protected during COW for fork. This means that pages returned from pin_user_pages(FOLL_WRITE) should not become write protected while the pin is active. However, there is a small race where get_user_pages_fast(FOLL_PIN) can establish a FOLL_PIN at the same time copy_present_page() is write protecting it: CPU 0 CPU 1 get_user_pages_fast() internal_get_user_pages_fast() copy_page_range() pte_alloc_map_lock() copy_present_page() atomic_read(has_pinned) == 0 page_maybe_dma_pinned() == false atomic_set(has_pinned, 1); gup_pgd_range() gup_pte_range() pte_t pte = gup_get_pte(ptep) pte_access_permitted(pte) try_grab_compound_head() pte = pte_wrprotect(pte) set_pte_at(); pte_unmap_unlock() // GUP now returns with a write protected page The first attempt to resolve this by using the write protect caused problems (and was missing a barrrier), see commit `f3c64eda3e` ("mm: avoid early COW write protect games during fork()") Instead wrap copy_p4d_range() with the write side of a seqcount and check the read side around gup_pgd_range(). If there is a collision then get_user_pages_fast() fails and falls back to slow GUP. Slow GUP is safe against this race because copy_page_range() is only called while holding the exclusive side of the mmap_lock on the src mm_struct. [akpm@linux-foundation.org: coding style fixes] Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.com Link: https://lkml.kernel.org/r/2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com Fixes: `f3c64eda3e` ("mm: avoid early COW write protect games during fork()") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: "Ahmed S. Darwish" <a.darwish@linutronix.de> [seqcount_t parts] Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kirill Shutemov <kirill@shutemov.name> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:54 +01:00
Jason Gunthorpe	bcb0f647c1	mm/gup: reorganize internal_get_user_pages_fast() [ Upstream commit c28b1fc70390df32e29991eedd52bd86e7aba080 ] Patch series "Add a seqcount between gup_fast and copy_page_range()", v4. As discussed and suggested by Linus use a seqcount to close the small race between gup_fast and copy_page_range(). Ahmed confirms that raw_write_seqcount_begin() is the correct API to use in this case and it doesn't trigger any lockdeps. I was able to test it using two threads, one forking and the other using ibv_reg_mr() to trigger GUP fast. Modifying copy_page_range() to sleep made the window large enough to reliably hit to test the logic. This patch (of 2): The next patch in this series makes the lockless flow a little more complex, so move the entire block into a new function and remove a level of indention. Tidy a bit of cruft: - addr is always the same as start, so use start - Use the modern check_add_overflow() for computing end = start + len - nr_pinned/pages << PAGE_SHIFT needs the LHS to be unsigned long to avoid shift overflow, make the variables unsigned long to avoid coding casts in both places. nr_pinned was missing its cast - The handling of ret and nr_pinned can be streamlined a bit No functional change. Link: https://lkml.kernel.org/r/0-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com Link: https://lkml.kernel.org/r/1-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:54 +01:00
Alex Deucher	c51e3679eb	drm/amdgpu: fix regression in vbios reservation handling on headless [ Upstream commit 7eded018bfeccb365963bb51be731a9f99aeea59 ] We need to move the check under the non-headless case, otherwise we always reserve the VGA save size. Fixes: `157fe68d74` ("drm/amdgpu: fix size calculation with stolen vga memory") Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:54 +01:00
Kajol Jain	33e8ef090b	perf test: Fix metric parsing test [ Upstream commit b2ce5dbc15819ea4bef47dbd368239cb1e965158 ] Commit `e1c92a7fbb` ("perf tests: Add another metric parsing test") add another test for metric parsing. The test goes through all metrics compiled for arch within pmu events and try to parse them. Right now this test is failing in powerpc machine. Result in power9 platform: [command]# ./perf test 10 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Skip (some metrics failed) 10.4: Parsing of PMU event table metrics with fake PMUs : FAILED! Issue is we are passing different runtime parameter value in "expr__find_other" and "expr__parse" function which is called from function `metric_parse_fake`. And because of this parsing of hv-24x7 metrics is failing. [command]# ./perf test 10 -vv ..... hv_24x7/pm_mcs01_128b_rd_disp_port01,chip=1/ not found expr__parse failed test child finished with -1 ---- end ---- PMU events subtest 4: FAILED! This patch fix this issue and change runtime parameter value to '0' in expr__parse function. Result in power9 platform after this patch: [command]# ./perf test 10 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Skip (some metrics failed) 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Fixes: `e1c92a7fbb` ("perf tests: Add another metric parsing test") Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lore.kernel.org/lkml/20201119152411.46041-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:54 +01:00
Vincent Stehlé	280f29c6ba	powerpc/ps3: use dma_mapping_error() [ Upstream commit d0edaa28a1f7830997131cbce87b6c52472825d1 ] The DMA address returned by dma_map_single() should be checked with dma_mapping_error(). Fix the ps3stor_setup() function accordingly. Fixes: `80071802cb` ("[POWERPC] PS3: Storage Driver Core") Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201213182622.23047-1-vincent.stehle@laposte.net Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:53 +01:00
Madhavan Srinivasan	34169b582a	powerpc/perf: Fix Threshold Event Counter Multiplier width for P10 [ Upstream commit ef0e3b650f8ddc54bb70868852f50642ee3ae765 ] Threshold Event Counter Multiplier (TECM) is part of Monitor Mode Control Register A (MMCRA). This field along with Threshold Event Counter Exponent (TECE) is used to get threshould counter value. In Power10, this is a 8bit field, so patch fixes the current code to modify the MMCRA[TECM] extraction macro to handle this change. ISA v3.1 says this is a 7 bit field but POWER10 it's actually 8 bits which will hopefully be fixed in ISA v3.1 update. Fixes: `170a315f41` ("powerpc/perf: Support to export MMCRA[TEC*] field to userspace") Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1608022578-1532-1-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:53 +01:00
Guido Günther	b6fba53d44	drm: mxsfb: Silence -EPROBE_DEFER while waiting for bridge [ Upstream commit ee46d16d2e40bebc2aa790fd7b6a056466ff895c ] It can take multiple iterations until all components for an attached DSI bridge are up leading to several: [ 3.796425] mxsfb 30320000.lcd-controller: Cannot connect bridge: -517 [ 3.816952] mxsfb 30320000.lcd-controller: [drm:mxsfb_probe [mxsfb]] ERROR failed to attach bridge: -517 Silence this by checking for -EPROBE_DEFER and using dev_err_probe() so we set a deferred reason in case a dependency fails to probe (which quickly happens on small config/DT changes due to the rather long probe chain which can include bridges, phys, panels, backights, leds, etc.). This also removes the only DRM_DEV_ERROR() usage, the rest of the driver uses dev_err(). Signed-off-by: Guido Günther <agx@sigxcpu.org> Fixes: `c42001e357` ("drm: mxsfb: Use drm_panel_bridge") Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/d5761eb871adde5464ba112b89d966568bc2ff6c.1608020391.git.agx@sigxcpu.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:53 +01:00
Bongsu Jeon	582e1021fb	nfc: s3fwrn5: Release the nfc firmware [ Upstream commit a4485baefa1efa596702ebffd5a9c760d42b14b5 ] add the code to release the nfc firmware when the firmware image size is wrong. Fixes: `c04c674fad` ("nfc: s3fwrn5: Add driver for Samsung S3FWRN5 NFC Chip") Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20201213095850.28169-1-bongsu.jeon@samsung.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:53 +01:00
Leon Romanovsky	04ca5e7fa4	RDMA/cma: Don't overwrite sgid_attr after device is released [ Upstream commit e246b7c035d74abfb3507fa10082d0c42cc016c3 ] As part of the cma_dev release, that pointer will be set to NULL. In case it happens in rdma_bind_addr() (part of an error flow), the next call to addr_handler() will have a call to cma_acquire_dev_by_src_ip() which will overwrite sgid_attr without releasing it. WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline] WARNING: CPU: 2 PID: 108 at drivers/infiniband/core/cma.c:606 cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649 CPU: 2 PID: 108 Comm: kworker/u8:1 Not tainted 5.10.0-rc6+ #257 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: ib_addr process_one_req RIP: 0010:cma_bind_sgid_attr drivers/infiniband/core/cma.c:606 [inline] RIP: 0010:cma_acquire_dev_by_src_ip+0x470/0x4b0 drivers/infiniband/core/cma.c:649 Code: 66 d9 4a ff 4d 8b 6e 10 49 8d bd 1c 08 00 00 e8 b6 d6 4a ff 45 0f b6 bd 1c 08 00 00 41 83 e7 01 e9 49 fd ff ff e8 90 c5 29 ff <0f> 0b e9 80 fe ff ff e8 84 c5 29 ff 4c 89 f7 e8 2c d9 4a ff 4d 8b RSP: 0018:ffff8881047c7b40 EFLAGS: 00010293 RAX: ffff888104789c80 RBX: 0000000000000001 RCX: ffffffff820b8ef8 RDX: 0000000000000000 RSI: ffffffff820b9080 RDI: ffff88810cd4c998 RBP: ffff8881047c7c08 R08: ffff888104789c80 R09: ffffed10209f4036 R10: ffff888104fa01ab R11: ffffed10209f4035 R12: ffff88810cd4c800 R13: ffff888105750e28 R14: ffff888108f0a100 R15: ffff88810cd4c998 FS: 0000000000000000(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000104e60005 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: addr_handler+0x266/0x350 drivers/infiniband/core/cma.c:3190 process_one_req+0xa3/0x300 drivers/infiniband/core/addr.c:645 process_one_work+0x54c/0x930 kernel/workqueue.c:2272 worker_thread+0x82/0x830 kernel/workqueue.c:2418 kthread+0x1ca/0x220 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296 Fixes: `ff11c6cd52` ("RDMA/cma: Introduce and use cma_acquire_dev_by_src_ip()") Link: https://lore.kernel.org/r/20201213132940.345554-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:53 +01:00
Maor Gottlieb	d9a7b8fcd0	RDMA/mlx5: Fix MR cache memory leak [ Upstream commit e89938902927a54abebccc9537991aca5237dfaf ] If the MR cache entry invalidation failed, then we detach this entry from the cache, therefore we must to free the memory as well. Allcation backtrace for the leaker: [<00000000d8e423b0>] alloc_cache_mr+0x23/0xc0 [mlx5_ib] [<000000001f21304c>] create_cache_mr+0x3f/0xf0 [mlx5_ib] [<000000009d6b45dc>] mlx5_ib_alloc_implicit_mr+0x41/0×210 [mlx5_ib] [<00000000879d0d68>] mlx5_ib_reg_user_mr+0x9e/0×6e0 [mlx5_ib] [<00000000be74bf89>] create_qp+0x2fc/0xf00 [ib_uverbs] [<000000001a532d22>] ib_uverbs_handler_UVERBS_METHOD_COUNTERS_READ+0x1d9/0×230 [ib_uverbs] [<0000000070f46001>] rdma_alloc_commit_uobject+0xb5/0×120 [ib_uverbs] [<000000006d8a0b38>] uverbs_alloc+0x2b/0xf0 [ib_uverbs] [<00000000075217c9>] ksysioctl+0x234/0×7d0 [<00000000eb5c120b>] __x64_sys_ioctl+0x16/0×20 [<00000000db135b48>] do_syscall_64+0x59/0×2e0 Fixes: `1769c4c575` ("RDMA/mlx5: Always remove MRs from the cache before destroying them") Link: https://lore.kernel.org/r/20201213132940.345554-2-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:53 +01:00
Dan Aloni	c02c1df4fb	sunrpc: fix xs_read_xdr_buf for partial pages receive [ Upstream commit ac9645c87380e39a8fa87a1b51721efcdea89dbf ] When receiving pages data, return value 'ret' when positive includes `buf->page_base`, so we should subtract that before it is used for changing `offset` and comparing against `want`. This was discovered on the very rare cases where the server returned a chunk of bytes that when added to the already received amount of bytes for the pages happened to match the current `recv.len`, for example on this case: buf->page_base : 258356 actually received from socket: 1740 ret : 260096 want : 260096 In this case neither of the two 'if ... goto out' trigger, and we continue to tail parsing. Worth to mention that the ensuing EMSGSIZE from the continued execution of `xs_read_xdr_buf` may be observed by an application due to 4 superfluous bytes being added to the pages data. Fixes: `277e4ab7d5` ("SUNRPC: Simplify TCP receive code by switching to using iterators") Signed-off-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:53 +01:00
Anton Ivanov	d26a4edda5	um: chan_xterm: Fix fd leak [ Upstream commit 9431f7c199ab0d02da1482d62255e0b4621cb1b5 ] xterm serial channel was leaking a fd used in setting up the port helper This bug is prehistoric - it predates switching to git. The "fixes" header here is really just to mark all the versions we would like this to apply to which is "Anything from the Cretaceous period onwards". No dinosaurs were harmed in fixing this bug. Fixes: `b40997b872` ("um: drivers/xterm.c: fix a file descriptor leak") Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:53 +01:00
Anton Ivanov	48628ec96d	um: tty: Fix handling of close in tty lines [ Upstream commit 9b1c0c0e25dcccafd30e7d4c150c249cc65550eb ] Fix a logical error in tty reading. We get 0 and errno == EAGAIN on the first attempt to read from a closed file descriptor. Compared to that a true EAGAIN is EAGAIN and -1. If we check errno for EAGAIN first, before checking the return value we miss the fact that the descriptor is closed. This bug is as old as the driver. It was not showing up with the original POLL based IRQ controller, because it was producing multiple events. Switching to EPOLL unmasked it. Fixes: `ff6a17989c` ("Epoll based IRQ controller") Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:52 +01:00
Anton Ivanov	4553c8cecb	um: Monitor error events in IRQ controller [ Upstream commit e3a01cbee9c5f2c6fc813dd6af007716e60257e7 ] Ensure that file closes, connection closes, etc are propagated as interrupts in the interrupt controller. Fixes: `ff6a17989c` ("Epoll based IRQ controller") Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:52 +01:00
Wang ShaoBo	0cc9725e4b	ubifs: Fix error return code in ubifs_init_authentication() [ Upstream commit 3cded66330591cfd2554a3fd5edca8859ea365a2 ] Fix to return PTR_ERR() error code from the error handling case where ubifs_hash_get_desc() failed instead of 0 in ubifs_init_authentication(), as done elsewhere in this function. Fixes: `49525e5eec` ("ubifs: Add helper functions for authentication support") Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:52 +01:00
Wang Wensheng	9dc1b44d4f	watchdog: Fix potential dereferencing of null pointer [ Upstream commit 6f733cb2e7db38f8141b14740bcde577844a03b7 ] A reboot notifier, which stops the WDT by calling the stop hook without any check, would be registered when we set WDOG_STOP_ON_REBOOT flag. Howerer we allow the WDT driver to omit the stop hook since commit "d0684c8a93549" ("watchdog: Make stop function optional") and provide a module parameter for user that controls the WDOG_STOP_ON_REBOOT flag in commit `9232c80659` ("watchdog: Add stop_on_reboot parameter to control reboot policy"). Together that commits make user potential to insert a watchdog driver that don't provide a stop hook but with the stop_on_reboot parameter set, then dereferencing of null pointer occurs on system reboot. Check the stop hook before registering the reboot notifier to fix the issue. Fixes: `d0684c8a93` ("watchdog: Make stop function optional") Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20201109130512.28121-1-wangwensheng4@huawei.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:52 +01:00
Lingling Xu	17a3ee0003	watchdog: sprd: check busy bit before new loading rather than after that [ Upstream commit 3e07d240939803bed9feb2a353d94686a411a7ca ] As the specification described, users must check busy bit before start a new loading operation to make sure that the previous loading is done and the device is ready to accept a new one. [ chunyan: Massaged changelog ] Fixes: `4776034670` ("watchdog: Add Spreadtrum watchdog driver") Signed-off-by: Lingling Xu <ling_ling.xu@unisoc.com> Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20201029023933.24548-3-zhang.lyra@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:52 +01:00
Lingling Xu	f71f75acea	watchdog: sprd: remove watchdog disable from resume fail path [ Upstream commit f61a59acb462840bebcc192f754fe71b6a16ff99 ] sprd_wdt_start() would return fail if the loading operation is not completed in a certain time, disabling watchdog for that case would probably cause the kernel crash when kick watchdog later, that's too bad, so remove the watchdog disable operation for the fail case to make sure other parts in the kernel can run normally. [ chunyan: Massaged changelog ] Fixes: `4776034670` ("watchdog: Add Spreadtrum watchdog driver") Signed-off-by: Lingling Xu <ling_ling.xu@unisoc.com> Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20201029023933.24548-2-zhang.lyra@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:52 +01:00
Guenter Roeck	2d42e0354d	watchdog: sirfsoc: Add missing dependency on HAS_IOMEM [ Upstream commit 8ae2511112d2e18bc7d324b77f965d34083a25a2 ] If HAS_IOMEM is not defined and SIRFSOC_WATCHDOG is enabled, the build fails with the following error. drivers/watchdog/sirfsoc_wdt.o: in function `sirfsoc_wdt_probe': sirfsoc_wdt.c:(.text+0x112): undefined reference to `devm_platform_ioremap_resource' Reported-by: Necip Fazil Yildiran <fazilyildiran@gmail.com> Fixes: `da2a68b3eb` ("watchdog: Enable COMPILE_TEST where possible") Link: https://lore.kernel.org/r/20201108162550.27660-2-linux@roeck-us.net Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:52 +01:00
Guenter Roeck	118a8b7e4d	watchdog: armada_37xx: Add missing dependency on HAS_IOMEM [ Upstream commit 7f6f1dfb2dcbe5d2bfa213f2df5d74c147cd5954 ] The following kbuild warning is seen on a system without HAS_IOMEM. WARNING: unmet direct dependencies detected for MFD_SYSCON Depends on [n]: HAS_IOMEM [=n] Selected by [y]: - ARMADA_37XX_WATCHDOG [=y] && WATCHDOG [=y] && (ARCH_MVEBU \|\| COMPILE_TEST This results in a subsequent compile error. drivers/watchdog/armada_37xx_wdt.o: in function `armada_37xx_wdt_probe': armada_37xx_wdt.c:(.text+0xdc): undefined reference to `devm_ioremap' Add the missing dependency. Reported-by: Necip Fazil Yildiran <fazilyildiran@gmail.com> Fixes: `54e3d9b518` ("watchdog: Add support for Armada 37xx CPU watchdog") Link: https://lore.kernel.org/r/20201108162550.27660-1-linux@roeck-us.net Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:51 +01:00
Douglas Anderson	9721dd96e9	irqchip/qcom-pdc: Fix phantom irq when changing between rising/falling [ Upstream commit 2f5fbc4305d07725bfebaedb09e57271315691ef ] We have a problem if we use gpio-keys and configure wakeups such that we only want one edge to wake us up. AKA: wakeup-event-action = <EV_ACT_DEASSERTED>; wakeup-source; Specifically we end up with a phantom interrupt that blocks suspend if the line was already high and we want wakeups on rising edges (AKA we want the GPIO to go low and then high again before we wake up). The opposite is also problematic. Specifically, here's what's happening today: 1. Normally, gpio-keys configures to look for both edges. Due to the current workaround introduced in commit `c3c0c2e18d` ("pinctrl: qcom: Handle broken/missing PDC dual edge IRQs on sc7180"), if the line was high we'd configure for falling edges. 2. At suspend time, we change to look for rising edges. 3. After qcom_pdc_gic_set_type() runs, we get a phantom interrupt. We can solve this by just clearing the phantom interrupt. NOTE: it is possible that this could cause problems for a client with very specific needs, but there's not much we can do with this hardware. As an example, let's say the interrupt signal is currently high and the client is looking for falling edges. The client now changes to look for rising edges. The client could possibly expect that if the line has a short pulse low (and back high) that it would always be detected. Specifically no matter when the pulse happened, it should either have tripped the (old) falling edge trigger or the (new) rising edge trigger. We will simply not trip it. We could narrow down the race a bit by polling our parent before changing types, but no matter what we do there will still be a period of time where we can't tell the difference between a real transition (or more than one transition) and the phantom. Fixes: `f55c73aef8` ("irqchip/pdc: Add PDC interrupt controller for QCOM SoCs") Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Maulik Shah <mkshah@codeaurora.org> Reviewed-by: Maulik Shah <mkshah@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20201211141514.v4.1.I2702919afc253e2a451bebc3b701b462b2d22344@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:51 +01:00
Pradeep Kumar Chitrapu	6003ff9ca7	ath11k: Fix incorrect tlvs in scan start command [ Upstream commit f57ad6a9885e8399897daee3249cabccf9c972f8 ] Currently 6G specific tlvs have duplicate entries which is causing scan failures. Fix this by removing the duplicate entries of the same tlv. This also fixes out-of-bound memory writes caused due to adding tlvs when num_hint_bssid and num_hint_s_ssid are ZEROs. Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.4.0.1-01386-QCAHKSWPL_SILICONZ-1 Fixes: `74601ecfef` ("ath11k: Add support for 6g scan hint") Reported-by: Carl Huang <cjhuang@codeaurora.org> Signed-off-by: Pradeep Kumar Chitrapu <pradeepc@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1607609124-17250-7-git-send-email-kvalo@codeaurora.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:51 +01:00
Nikita Shubin	2d9284c188	gpiolib: irq hooks: fix recursion in gpiochip_irq_unmask [ Upstream commit 9d5522199505c761575c8ea31dcfd9a2a8d73614 ] irqchip shared with multiple gpiochips, leads to recursive call of gpiochip_irq_mask/gpiochip_irq_unmask which was assigned to rqchip->irq_mask/irqchip->irq_unmask, these happens becouse of only irqchip->irq_enable == gpiochip_irq_enable is checked. Let's add an additional check to make sure shared irqchip is detected even if irqchip->irq_enable wasn't defined. Fixes: `a8173820f4` ("gpio: gpiolib: Allow GPIO IRQs to lazy disable") Signed-off-by: Nikita Shubin <nikita.shubin@maquefel.me> Link: https://lore.kernel.org/r/20201210070514.13238-1-nikita.shubin@maquefel.me Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:51 +01:00
Weihang Li	78d22dd989	RDMA/hns: Do shift on traffic class when using RoCEv2 [ Upstream commit 603bee935f38080a3674c763c50787751e387779 ] The high 6 bits of traffic class in GRH is DSCP (Differentiated Services Codepoint), the driver should shift it before the hardware gets it when using RoCEv2. Fixes: `606bf89e98` ("RDMA/hns: Refactor for hns_roce_v2_modify_qp function") Fixes: fba429fcf9a5 ("RDMA/hns: Fix missing fields in address vector") Link: https://lore.kernel.org/r/1607650657-35992-4-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:51 +01:00
Wenpeng Liang	44dd35a017	RDMA/hns: Normalization the judgment of some features [ Upstream commit 4ddeacf68a3dd05f346b63f4507e1032a15cc3cc ] Whether to enable the these features should better depend on the enable flags, not the value of related fields. Fixes: `5c1f167af1` ("RDMA/hns: Init SRQ table for hip08") Fixes: `3cb2c996c9` ("RDMA/hns: Add support for SCCC in size of 64 Bytes") Link: https://lore.kernel.org/r/1607650657-35992-3-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:51 +01:00
Wenpeng Liang	27f2d59a4a	RDMA/hns: Limit the length of data copied between kernel and userspace [ Upstream commit 1c0ca9cd1741687f529498ddb899805fc2c51caa ] For ib_copy_from_user(), the length of udata may not be the same as that of cmd. For ib_copy_to_user(), the length of udata may not be the same as that of resp. So limit the length to prevent out-of-bounds read and write operations from ib_copy_from_user() and ib_copy_to_user(). Fixes: `de77503a59` ("RDMA/hns: RDMA/hns: Assign rq head pointer when enable rq record db") Fixes: `633fb4d9fd` ("RDMA/hns: Use structs to describe the uABI instead of opencoding") Fixes: `ae85bf92ef` ("RDMA/hns: Optimize qp param setup flow") Fixes: `6fd610c573` ("RDMA/hns: Support 0 hop addressing for SRQ buffer") Fixes: `9d9d4ff788` ("RDMA/hns: Update the kernel header file of hns") Link: https://lore.kernel.org/r/1607650657-35992-2-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:51 +01:00
Peter Ujfalusi	567a8417b7	dmaengine: ti: k3-udma: Correct normal channel offset when uchan_cnt is not 0 [ Upstream commit e2de925bbfe321ba0588c99f577c59386ab1f428 ] According to different sections of the TRM, the hchan_cnt of CAP3 includes the number of uchan in UDMA, thus the start offset of the normal channels are hchan_cnt. Fixes: `daf4ad0499` ("dmaengine: ti: k3-udma: Query throughput level information from hardware") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Link: https://lore.kernel.org/r/20201208090440.31792-2-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:51 +01:00
Lokesh Vutla	1ce041fad2	irqchip/ti-sci-intr: Fix freeing of irqs [ Upstream commit fc6c7cd3878641fd43189f15697e7ad0871f5c1a ] ti_sci_intr_irq_domain_free() assumes that out_irq of intr is stored in data->chip_data and uses it for calling ti_sci irq_free() and then mark the out_irq as available resource. But ti_sci_intr_irq_domain_alloc() is storing p_hwirq(parent's hardware irq) which is translated from out_irq. This is causing resource leakage and eventually out_irq resources might be exhausted. Fix ti_sci_intr_irq_domain_alloc() by storing the out_irq in data->chip_data. Fixes: `a5b659bd4b` ("irqchip/ti-sci-intr: Add support for INTR being a parent to INTR") Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201102120631.11165-1-lokeshvutla@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:50 +01:00
Lokesh Vutla	629d6ba81f	irqchip/ti-sci-inta: Fix printing of inta id on probe success [ Upstream commit b10d5fd489b0c67f59cbdd28d95f4bd9f76a62f2 ] On a successful probe, the driver tries to print a success message with INTA device id. It uses pdev->id for printing the id but id is stored in inta->ti_sci_id. Fix it by correcting the dev_info parameter. Fixes: `5c4b585d29` ("irqchip/ti-sci-inta: Add support for INTA directly connecting to GIC") Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201102120614.11109-1-lokeshvutla@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:50 +01:00
Marc Zyngier	d05c219375	irqchip/alpine-msi: Fix freeing of interrupts on allocation error path [ Upstream commit 3841245e8498a789c65dedd7ffa8fb2fee2c0684 ] The alpine-msi driver has an interesting allocation error handling, where it frees the same interrupts repeatedly. Hilarity follows. This code is probably never executed, but let's fix it nonetheless. Fixes: `e6b78f2c3e` ("irqchip: Add the Alpine MSIX interrupt controller") Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Antoine Tenart <atenart@kernel.org> Cc: Tsahee Zidenberg <tsahee@annapurnalabs.com> Cc: Antoine Tenart <atenart@kernel.org> Link: https://lore.kernel.org/r/20201129135525.396671-1-maz@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:53:50 +01:00

1 2 3 4 5 ...

969264 Commits