linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-02-24 07:34:15 +07:00

Author	SHA1	Message	Date
Mathy Vanhoef	f74df6e086	mac80211: Fix NULL ptr deref for injected rate info commit bddc0c411a45d3718ac535a070f349be8eca8d48 upstream. The commit `cb17ed29a7` ("mac80211: parse radiotap header when selecting Tx queue") moved the code to validate the radiotap header from ieee80211_monitor_start_xmit to ieee80211_parse_tx_radiotap. This made is possible to share more code with the new Tx queue selection code for injected frames. But at the same time, it now required the call of ieee80211_parse_tx_radiotap at the beginning of functions which wanted to handle the radiotap header. And this broke the rate parser for radiotap header parser. The radiotap parser for rates is operating most of the time only on the data in the actual radiotap header. But for the 802.11a/b/g rates, it must also know the selected band from the chandef information. But this information is only written to the ieee80211_tx_info at the end of the ieee80211_monitor_start_xmit - long after ieee80211_parse_tx_radiotap was already called. The info->band information was therefore always 0 (NL80211_BAND_2GHZ) when the parser code tried to access it. For a 5GHz only device, injecting a frame with 802.11a rates would cause a NULL pointer dereference because local->hw.wiphy->bands[NL80211_BAND_2GHZ] would most likely have been NULL when the radiotap parser searched for the correct rate index of the driver. Cc: stable@vger.kernel.org Reported-by: Ben Greear <greearb@candelatech.com> Fixes: `cb17ed29a7` ("mac80211: parse radiotap header when selecting Tx queue") Signed-off-by: Mathy Vanhoef <Mathy.Vanhoef@kuleuven.be> [sven@narfation.org: added commit message] Signed-off-by: Sven Eckelmann <sven@narfation.org> Link: https://lore.kernel.org/r/20210530133226.40587-1-sven@narfation.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Bumyong Lee	df203c1fda	dmaengine: pl330: fix wrong usage of spinlock flags in dma_cyclc commit 4ad5dd2d7876d79507a20f026507d1a93b8fff10 upstream. flags varible which is the input parameter of pl330_prep_dma_cyclic() should not be used by spinlock_irq[save/restore] function. Signed-off-by: Jongho Park <jongho7.park@samsung.com> Signed-off-by: Bumyong Lee <bumyong.lee@samsung.com> Signed-off-by: Chanho Park <chanho61.park@samsung.com> Link: https://lore.kernel.org/r/20210507063647.111209-1-chanho61.park@samsung.com Fixes: `f6f2421c0a` ("dmaengine: pl330: Merge dma_pl330_dmac and pl330_dmac structs") Cc: stable@vger.kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Pingfan Liu	b842b568a5	crash_core, vmcoreinfo: append 'SECTION_SIZE_BITS' to vmcoreinfo commit 4f5aecdff25f59fb5ea456d5152a913906ecf287 upstream. As mentioned in kernel commit `1d50e5d0c5` ("crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo"), SECTION_SIZE_BITS in the formula: #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Besides SECTIONS_SHIFT, SECTION_SIZE_BITS is also used to calculate PAGES_PER_SECTION in makedumpfile just like kernel. Unfortunately, this arch-dependent macro SECTION_SIZE_BITS changes, e.g. recently in kernel commit f0b13ee23241 ("arm64/sparsemem: reduce SECTION_SIZE_BITS"). But user space wants a stable interface to get this info. Such info is impossible to be deduced from a crashdump vmcore. Hence append SECTION_SIZE_BITS to vmcoreinfo. Link: https://lkml.kernel.org/r/20210608103359.84907-1-kernelfans@gmail.com Link: http://lists.infradead.org/pipermail/kexec/2021-June/022676.html Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Bhupesh Sharma <bhupesh.sharma@linaro.org> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: Dave Young <dyoung@redhat.com> Cc: Boris Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: James Morse <james.morse@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Anderson <anderson@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Thomas Gleixner	63ba83563e	x86/fpu: Reset state for all signal restore failures commit efa165504943f2128d50f63de0c02faf6dcceb0d upstream. If access_ok() or fpregs_soft_set() fails in __fpu__restore_sig() then the function just returns but does not clear the FPU state as it does for all other fatal failures. Clear the FPU state for these failures as well. Fixes: `72a671ced6` ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87mtryyhhz.ffs@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Andy Lutomirski	a7748e021b	x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer commit d8778e393afa421f1f117471144f8ce6deb6953a upstream. Both Intel and AMD consider it to be architecturally valid for XRSTOR to fail with #PF but nonetheless change the register state. The actual conditions under which this might occur are unclear [1], but it seems plausible that this might be triggered if one sibling thread unmaps a page and invalidates the shared TLB while another sibling thread is executing XRSTOR on the page in question. __fpu__restore_sig() can execute XRSTOR while the hardware registers are preserved on behalf of a different victim task (using the fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but modify the registers. If this happens, then there is a window in which __fpu__restore_sig() could schedule out and the victim task could schedule back in without reloading its own FPU registers. This would result in part of the FPU state that __fpu__restore_sig() was attempting to load leaking into the victim task's user-visible state. Invalidate preserved FPU registers on XRSTOR failure to prevent this situation from corrupting any state. [1] Frequent readers of the errata lists might imagine "complex microarchitectural conditions". Fixes: `1d731e731c` ("x86/fpu: Add a fastpath to __fpu__restore_sig()") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210608144345.758116583@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Thomas Gleixner	076f732b16	x86/fpu: Prevent state corruption in __fpu__restore_sig() commit 484cea4f362e1eeb5c869abbfb5f90eae6421b38 upstream. The non-compacted slowpath uses __copy_from_user() and copies the entire user buffer into the kernel buffer, verbatim. This means that the kernel buffer may now contain entirely invalid state on which XRSTOR will #GP. validate_user_xstate_header() can detect some of that corruption, but that leaves the onus on callers to clear the buffer. Prior to XSAVES support, it was possible just to reinitialize the buffer, completely, but with supervisor states that is not longer possible as the buffer clearing code split got it backwards. Fixing that is possible but not corrupting the state in the first place is more robust. Avoid corruption of the kernel XSAVE buffer by using copy_user_to_xstate() which validates the XSAVE header contents before copying the actual states to the kernel. copy_user_to_xstate() was previously only called for compacted-format kernel buffers, but it works for both compacted and non-compacted forms. Using it for the non-compacted form is slower because of multiple __copy_from_user() operations, but that cost is less important than robust code in an already slow path. [ Changelog polished by Dave Hansen ] Fixes: `b860eb8dce` ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates") Reported-by: syzbot+2067e764dbcd10721e2e@syzkaller.appspotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210608144345.611833074@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Thomas Gleixner	abc790bdbb	x86/pkru: Write hardware init value to PKRU when xstate is init commit 510b80a6a0f1a0d114c6e33bcea64747d127973c upstream. When user space brings PKRU into init state, then the kernel handling is broken: T1 user space xsave(state) state.header.xfeatures &= ~XFEATURE_MASK_PKRU; xrstor(state) T1 -> kernel schedule() XSAVE(S) -> T1->xsave.header.xfeatures[PKRU] == 0 T1->flags \|= TIF_NEED_FPU_LOAD; wrpkru(); schedule() ... pk = get_xsave_addr(&T1->fpu->state.xsave, XFEATURE_PKRU); if (pk) wrpkru(pk->pkru); else wrpkru(DEFAULT_PKRU); Because the xfeatures bit is 0 and therefore the value in the xsave storage is not valid, get_xsave_addr() returns NULL and switch_to() writes the default PKRU. -> FAIL #1! So that wrecks any copy_to/from_user() on the way back to user space which hits memory which is protected by the default PKRU value. Assumed that this does not fail (pure luck) then T1 goes back to user space and because TIF_NEED_FPU_LOAD is set it ends up in switch_fpu_return() __fpregs_load_activate() if (!fpregs_state_valid()) { load_XSTATE_from_task(); } But if nothing touched the FPU between T1 scheduling out and back in, then the fpregs_state is still valid which means switch_fpu_return() does nothing and just clears TIF_NEED_FPU_LOAD. Back to user space with DEFAULT_PKRU loaded. -> FAIL #2! The fix is simple: if get_xsave_addr() returns NULL then set the PKRU value to 0 instead of the restrictive default PKRU value in init_pkru_value. [ bp: Massage in minor nitpicks from folks. ] Fixes: `0cecca9d03` ("x86/fpu: Eager switch PKRU state") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Tested-by: Babu Moger <babu.moger@amd.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210608144346.045616965@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Tom Lendacky	208bb686e7	x86/ioremap: Map EFI-reserved memory as encrypted for SEV commit 8d651ee9c71bb12fc0c8eb2786b66cbe5aa3e43b upstream. Some drivers require memory that is marked as EFI boot services data. In order for this memory to not be re-used by the kernel after ExitBootServices(), efi_mem_reserve() is used to preserve it by inserting a new EFI memory descriptor and marking it with the EFI_MEMORY_RUNTIME attribute. Under SEV, memory marked with the EFI_MEMORY_RUNTIME attribute needs to be mapped encrypted by Linux, otherwise the kernel might crash at boot like below: EFI Variables Facility v0.08 2004-May-17 general protection fault, probably for non-canonical address 0x3597688770a868b2: 0000 [#1] SMP NOPTI CPU: 13 PID: 1 Comm: swapper/0 Not tainted 5.12.4-2-default #1 openSUSE Tumbleweed Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:efi_mokvar_entry_next [...] Call Trace: efi_mokvar_sysfs_init ? efi_mokvar_table_init do_one_initcall ? __kmalloc kernel_init_freeable ? rest_init kernel_init ret_from_fork Expand the __ioremap_check_other() function to additionally check for this other type of boot data reserved at runtime and indicate that it should be mapped encrypted for an SEV guest. [ bp: Massage commit message. ] Fixes: `58c909022a` ("efi: Support for MOK variable config table") Reported-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Joerg Roedel <jroedel@suse.de> Cc: <stable@vger.kernel.org> # 5.10+ Link: https://lkml.kernel.org/r/20210608095439.12668-2-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Thomas Gleixner	75a55bc2e5	x86/process: Check PF_KTHREAD and not current->mm for kernel threads commit 12f7764ac61200e32c916f038bdc08f884b0b604 upstream. switch_fpu_finish() checks current->mm as indicator for kernel threads. That's wrong because kernel threads can temporarily use a mm of a user process via kthread_use_mm(). Check the task flags for PF_KTHREAD instead. Fixes: `0cecca9d03` ("x86/fpu: Eager switch PKRU state") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20210608144345.912645927@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Fan Du	ddaaf38e19	x86/mm: Avoid truncating memblocks for SGX memory commit 28e5e44aa3f4e0e0370864ed008fb5e2d85f4dc8 upstream. tl;dr: Several SGX users reported seeing the following message on NUMA systems: sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0. This turned out to be the memblock code mistakenly throwing away SGX memory. === Full Changelog === The 'max_pfn' variable represents the highest known RAM address. It can be used, for instance, to quickly determine for which physical addresses there is mem_map[] space allocated. The numa_meminfo code makes an effort to throw out ("trim") all memory blocks which are above 'max_pfn'. SGX memory is not considered RAM (it is marked as "Reserved" in the e820) and is not taken into account by max_pfn. Despite this, SGX memory areas have NUMA affinity and are enumerated in the ACPI SRAT table. The existing SGX code uses the numa_meminfo mechanism to look up the NUMA affinity for its memory areas. In cases where SGX memory was above max_pfn (usually just the one EPC section in the last highest NUMA node), the numa_memblock is truncated at 'max_pfn', which is below the SGX memory. When the SGX code tries to look up the affinity of this memory, it fails and produces an error message: sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0. and assigns the memory to NUMA node 0. Instead of silently truncating the memory block at 'max_pfn' and dropping the SGX memory, add the truncated portion to 'numa_reserved_meminfo'. This allows the SGX code to later determine the NUMA affinity of its 'Reserved' area. Before, numa_meminfo looked like this (from 'crash'): blk = { start = 0x0, end = 0x2080000000, nid = 0x0 } { start = 0x2080000000, end = 0x4000000000, nid = 0x1 } numa_reserved_meminfo is empty. With this, numa_meminfo looks like this: blk = { start = 0x0, end = 0x2080000000, nid = 0x0 } { start = 0x2080000000, end = 0x4000000000, nid = 0x1 } and numa_reserved_meminfo has an entry for node 1's SGX memory: blk = { start = 0x4000000000, end = 0x4080000000, nid = 0x1 } [ daveh: completely rewrote/reworked changelog ] Fixes: `5d30f92e76` ("x86/NUMA: Provide a range-to-target_node lookup facility") Reported-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20210617194657.0A99CB22@viggo.jf.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:52 +02:00
Vineet Gupta	f6bcb1a628	ARCv2: save ABI registers across signal handling commit 96f1b00138cb8f04c742c82d0a7c460b2202e887 upstream. ARCv2 has some configuration dependent registers (r30, r58, r59) which could be targetted by the compiler. To keep the ABI stable, these were unconditionally part of the glibc ABI (sysdeps/unix/sysv/linux/arc/sys/ucontext.h:mcontext_t) however we missed populating them (by saving/restoring them across signal handling). This patch fixes the issue by - adding arcv2 ABI regs to kernel struct sigcontext - populating them during signal handling Change to struct sigcontext might seem like a glibc ABI change (although it primarily uses ucontext_t:mcontext_t) but the fact is - it has only been extended (existing fields are not touched) - the old sigcontext was ABI incomplete to begin with anyways Fixes: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/53 Cc: <stable@vger.kernel.org> Tested-by: kernel test robot <lkp@intel.com> Reported-by: Vladimir Isaev <isaev@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Harald Freudenberger	b516daed99	s390/ap: Fix hanging ioctl caused by wrong msg counter commit e73a99f3287a740a07d6618e9470f4d6cb217da8 upstream. When a AP queue is switched to soft offline, all pending requests are purged out of the pending requests list and 'received' by the upper layer like zcrypt device drivers. This is also done for requests which are already enqueued into the firmware queue. A request in a firmware queue may eventually produce an response message, but there is no waiting process any more. However, the response was counted with the queue_counter and as this counter was reset to 0 with the offline switch, the pending response caused the queue_counter to get negative. The next request increased this counter to 0 (instead of 1) which caused the ap code to assume there is nothing to receive and so the response for this valid request was never tried to fetch from the firmware queue. This all caused a queue to not work properly after a switch offline/online and in the end processes to hang forever when trying to send a crypto request after an queue offline/online switch cicle. Fixed by a) making sure the counter does not drop below 0 and b) on a successful enqueue of a message has at least a value of 1. Additionally a warning is emitted, when a reply can't get assigned to a waiting process. This may be normal operation (process had timeout or has been killed) but may give a hint that something unexpected happened (like this odd behavior described above). Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Alexander Gordeev	7c003dab43	s390/mcck: fix calculation of SIE critical section size commit 5bcbe3285fb614c49db6b238253f7daff7e66312 upstream. The size of SIE critical section is calculated wrongly as result of a missed subtraction in commit `0b0ed657fe` ("s390: remove critical section cleanup from entry.S") Fixes: `0b0ed657fe` ("s390: remove critical section cleanup from entry.S") Cc: <stable@vger.kernel.org> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Wanpeng Li	3a9934d6b8	KVM: X86: Fix x86_emulator slab cache leak commit dfdc0a714d241bfbf951886c373cd1ae463fcc25 upstream. Commit `c9b8b07cde` (KVM: x86: Dynamically allocate per-vCPU emulation context) tries to allocate per-vCPU emulation context dynamically, however, the x86_emulator slab cache is still exiting after the kvm module is unload as below after destroying the VM and unloading the kvm module. grep x86_emulator /proc/slabinfo x86_emulator 36 36 2672 12 8 : tunables 0 0 0 : slabdata 3 3 0 This patch fixes this slab cache leak by destroying the x86_emulator slab cache when the kvm module is unloaded. Fixes: `c9b8b07cde` (KVM: x86: Dynamically allocate per-vCPU emulation context) Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1623387573-5969-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Sean Christopherson	18eca69f88	KVM: x86/mmu: Calculate and check "full" mmu_role for nested MMU commit 654430efde27248be563df9a88631204b5fe2df2 upstream. Calculate and check the full mmu_role when initializing the MMU context for the nested MMU, where "full" means the bits and pieces of the role that aren't handled by kvm_calc_mmu_role_common(). While the nested MMU isn't used for shadow paging, things like the number of levels in the guest's page tables are surprisingly important when walking the guest page tables. Failure to reinitialize the nested MMU context if L2's paging mode changes can result in unexpected and/or missed page faults, and likely other explosions. E.g. if an L1 vCPU is running both a 32-bit PAE L2 and a 64-bit L2, the "common" role calculation will yield the same role for both L2s. If the 64-bit L2 is run after the 32-bit PAE L2, L0 will fail to reinitialize the nested MMU context, ultimately resulting in a bad walk of L2's page tables as the MMU will still have a guest root_level of PT32E_ROOT_LEVEL. WARNING: CPU: 4 PID: 167334 at arch/x86/kvm/vmx/vmx.c:3075 ept_save_pdptrs+0x15/0xe0 [kvm_intel] Modules linked in: kvm_intel] CPU: 4 PID: 167334 Comm: CPU 3/KVM Not tainted 5.13.0-rc1-d849817d5673-reqs #185 Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014 RIP: 0010:ept_save_pdptrs+0x15/0xe0 [kvm_intel] Code: <0f> 0b c3 f6 87 d8 02 00f RSP: 0018:ffffbba702dbba00 EFLAGS: 00010202 RAX: 0000000000000011 RBX: 0000000000000002 RCX: ffffffff810a2c08 RDX: ffff91d7bc30acc0 RSI: 0000000000000011 RDI: ffff91d7bc30a600 RBP: ffff91d7bc30a600 R08: 0000000000000010 R09: 0000000000000007 R10: 0000000000000000 R11: 0000000000000000 R12: ffff91d7bc30a600 R13: ffff91d7bc30acc0 R14: ffff91d67c123460 R15: 0000000115d7e005 FS: 00007fe8e9ffb700(0000) GS:ffff91d90fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000029f15a001 CR4: 00000000001726e0 Call Trace: kvm_pdptr_read+0x3a/0x40 [kvm] paging64_walk_addr_generic+0x327/0x6a0 [kvm] paging64_gva_to_gpa_nested+0x3f/0xb0 [kvm] kvm_fetch_guest_virt+0x4c/0xb0 [kvm] __do_insn_fetch_bytes+0x11a/0x1f0 [kvm] x86_decode_insn+0x787/0x1490 [kvm] x86_decode_emulated_instruction+0x58/0x1e0 [kvm] x86_emulate_instruction+0x122/0x4f0 [kvm] vmx_handle_exit+0x120/0x660 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xe25/0x1cb0 [kvm] kvm_vcpu_ioctl+0x211/0x5a0 [kvm] __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x40/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: stable@vger.kernel.org Fixes: `bf627a9288` ("x86/kvm/mmu: check if MMU reconfiguration is needed in init_kvm_nested_mmu()") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210610220026.1364486-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Sean Christopherson	669a8866e4	KVM: x86: Immediately reset the MMU context when the SMM flag is cleared commit 78fcb2c91adfec8ce3a2ba6b4d0dda89f2f4a7c6 upstream. Immediately reset the MMU context when the vCPU's SMM flag is cleared so that the SMM flag in the MMU role is always synchronized with the vCPU's flag. If RSM fails (which isn't correctly emulated), KVM will bail without calling post_leave_smm() and leave the MMU in a bad state. The bad MMU role can lead to a NULL pointer dereference when grabbing a shadow page's rmap for a page fault as the initial lookups for the gfn will happen with the vCPU's SMM flag (=0), whereas the rmap lookup will use the shadow page's SMM flag, which comes from the MMU (=1). SMM has an entirely different set of memslots, and so the initial lookup can find a memslot (SMM=0) and then explode on the rmap memslot lookup (SMM=1). general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 PID: 8410 Comm: syz-executor382 Not tainted 5.13.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__gfn_to_rmap arch/x86/kvm/mmu/mmu.c:935 [inline] RIP: 0010:gfn_to_rmap+0x2b0/0x4d0 arch/x86/kvm/mmu/mmu.c:947 Code: <42> 80 3c 20 00 74 08 4c 89 ff e8 f1 79 a9 00 4c 89 fb 4d 8b 37 44 RSP: 0018:ffffc90000ffef98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888015b9f414 RCX: ffff888019669c40 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001 RBP: 0000000000000001 R08: ffffffff811d9cdb R09: ffffed10065a6002 R10: ffffed10065a6002 R11: 0000000000000000 R12: dffffc0000000000 R13: 0000000000000003 R14: 0000000000000001 R15: 0000000000000000 FS: 000000000124b300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000028e31000 CR4: 00000000001526e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rmap_add arch/x86/kvm/mmu/mmu.c:965 [inline] mmu_set_spte+0x862/0xe60 arch/x86/kvm/mmu/mmu.c:2604 __direct_map arch/x86/kvm/mmu/mmu.c:2862 [inline] direct_page_fault+0x1f74/0x2b70 arch/x86/kvm/mmu/mmu.c:3769 kvm_mmu_do_page_fault arch/x86/kvm/mmu.h:124 [inline] kvm_mmu_page_fault+0x199/0x1440 arch/x86/kvm/mmu/mmu.c:5065 vmx_handle_exit+0x26/0x160 arch/x86/kvm/vmx/vmx.c:6122 vcpu_enter_guest+0x3bdd/0x9630 arch/x86/kvm/x86.c:9428 vcpu_run+0x416/0xc20 arch/x86/kvm/x86.c:9494 kvm_arch_vcpu_ioctl_run+0x4e8/0xa40 arch/x86/kvm/x86.c:9722 kvm_vcpu_ioctl+0x70f/0xbb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3460 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:1069 [inline] __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:1055 do_syscall_64+0x3f/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x440ce9 Cc: stable@vger.kernel.org Reported-by: syzbot+fb0b6a7e8713aeb0319c@syzkaller.appspotmail.com Fixes: `9ec19493fb` ("KVM: x86: clear SMM flags before loading state while leaving SMM") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210609185619.992058-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Chiqijun	077cb8946f	PCI: Work around Huawei Intelligent NIC VF FLR erratum commit ce00322c2365e1f7b0312f2f493539c833465d97 upstream. pcie_flr() starts a Function Level Reset (FLR), waits 100ms (the maximum time allowed for FLR completion by PCIe r5.0, sec 6.6.2), and waits for the FLR to complete. It assumes the FLR is complete when a config read returns valid data. When we do an FLR on several Huawei Intelligent NIC VFs at the same time, firmware on the NIC processes them serially. The VF may respond to config reads before the firmware has completed its reset processing. If we bind a driver to the VF (e.g., by assigning the VF to a virtual machine) in the interval between the successful config read and completion of the firmware reset processing, the NIC VF driver may fail to load. Prevent this driver failure by waiting for the NIC firmware to complete its reset processing. Not all NIC firmware supports this feature. [bhelgaas: commit log] Link: https://support.huawei.com/enterprise/en/doc/EDOC1100063073/87950645/vm-oss-occasionally-fail-to-load-the-in200-driver-when-the-vf-performs-flr Link: https://lore.kernel.org/r/20210414132301.1793-1-chiqijun@huawei.com Signed-off-by: Chiqijun <chiqijun@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Sriharsha Basavapatna	ee1a9cfed2	PCI: Add ACS quirk for Broadcom BCM57414 NIC commit db2f77e2bd99dbd2fb23ddde58f0fae392fe3338 upstream. The Broadcom BCM57414 NIC may be a multi-function device. While it does not advertise an ACS capability, peer-to-peer transactions are not possible between the individual functions, so it is safe to treat them as fully isolated. Add an ACS quirk for this device so the functions can be in independent IOMMU groups and attached individually to userspace applications using VFIO. [bhelgaas: commit log] Link: https://lore.kernel.org/r/1621645997-16251-1-git-send-email-michael.chan@broadcom.com Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Pali Rohár	1a1dbc4473	PCI: aardvark: Fix kernel panic during PIO transfer commit f18139966d072dab8e4398c95ce955a9742e04f7 upstream. Trying to start a new PIO transfer by writing value 0 in PIO_START register when previous transfer has not yet completed (which is indicated by value 1 in PIO_START) causes an External Abort on CPU, which results in kernel panic: SError Interrupt on CPU0, code 0xbf000002 -- SError Kernel panic - not syncing: Asynchronous SError Interrupt To prevent kernel panic, it is required to reject a new PIO transfer when previous one has not finished yet. If previous PIO transfer is not finished yet, the kernel may issue a new PIO request only if the previous PIO transfer timed out. In the past the root cause of this issue was incorrectly identified (as it often happens during link retraining or after link down event) and special hack was implemented in Trusted Firmware to catch all SError events in EL3, to ignore errors with code 0xbf000002 and not forwarding any other errors to kernel and instead throw panic from EL3 Trusted Firmware handler. Links to discussion and patches about this issue: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/?id=3c7dcdac5c50 https://lore.kernel.org/linux-pci/20190316161243.29517-1-repk@triplefau.lt/ https://lore.kernel.org/linux-pci/971be151d24312cc533989a64bd454b4@www.loen.fr/ https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/1541 But the real cause was the fact that during link retraining or after link down event the PIO transfer may take longer time, up to the 1.44s until it times out. This increased probability that a new PIO transfer would be issued by kernel while previous one has not finished yet. After applying this change into the kernel, it is possible to revert the mentioned TF-A hack and SError events do not have to be caught in TF-A EL3. Link: https://lore.kernel.org/r/20210608203655.31228-1-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Marek Behún <kabel@kernel.org> Cc: stable@vger.kernel.org # `7fbcb5da81` ("PCI: aardvark: Don't rely on jiffies while holding spinlock") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Shanker Donthineni	dac77a14fa	PCI: Mark some NVIDIA GPUs to avoid bus reset commit 4c207e7121fa92b66bf1896bf8ccb9edfb0f9731 upstream. Some NVIDIA GPU devices do not work with SBR. Triggering SBR leaves the device inoperable for the current system boot. It requires a system hard-reboot to get the GPU device back to normal operating condition post-SBR. For the affected devices, enable NO_BUS_RESET quirk to avoid the issue. This issue will be fixed in the next generation of hardware. Link: https://lore.kernel.org/r/20210608054857.18963-8-ameynarkhede03@gmail.com Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Antti Järvinen	1e460ddf5b	PCI: Mark TI C667X to avoid bus reset commit b5cf198e74a91073d12839a3e2db99994a39995d upstream. Some TI KeyStone C667X devices do not support bus/hot reset. The PCIESS automatically disables LTSSM when Secondary Bus Reset is received and device stops working. Prevent bus reset for these devices. With this change, the device can be assigned to VMs with VFIO, but it will leak state between VMs. Reference: https://e2e.ti.com/support/processors/f/791/t/954382 Link: https://lore.kernel.org/r/20210315102606.17153-1-antti.jarvinen@gmail.com Signed-off-by: Antti Järvinen <antti.jarvinen@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Steven Rostedt (VMware)	c9fd0ab39f	tracing: Do no increment trace_clock_global() by one commit 89529d8b8f8daf92d9979382b8d2eb39966846ea upstream. The trace_clock_global() tries to make sure the events between CPUs is somewhat in order. A global value is used and updated by the latest read of a clock. If one CPU is ahead by a little, and is read by another CPU, a lock is taken, and if the timestamp of the other CPU is behind, it will simply use the other CPUs timestamp. The lock is also only taken with a "trylock" due to tracing, and strange recursions can happen. The lock is not taken at all in NMI context. In the case where the lock is not able to be taken, the non synced timestamp is returned. But it will not be less than the saved global timestamp. The problem arises because when the time goes "backwards" the time returned is the saved timestamp plus 1. If the lock is not taken, and the plus one to the timestamp is returned, there's a small race that can cause the time to go backwards! CPU0 CPU1 ---- ---- trace_clock_global() { ts = clock() [ 1000 ] trylock(clock_lock) [ success ] global_ts = ts; [ 1000 ] <interrupted by NMI> trace_clock_global() { ts = clock() [ 999 ] if (ts < global_ts) ts = global_ts + 1 [ 1001 ] trylock(clock_lock) [ fail ] return ts [ 1001] } unlock(clock_lock); return ts; [ 1000 ] } trace_clock_global() { ts = clock() [ 1000 ] if (ts < global_ts) [ false 1000 == 1000 ] trylock(clock_lock) [ success ] global_ts = ts; [ 1000 ] unlock(clock_lock) return ts; [ 1000 ] } The above case shows to reads of trace_clock_global() on the same CPU, but the second read returns one less than the first read. That is, time when backwards, and this is not what is allowed by trace_clock_global(). This was triggered by heavy tracing and the ring buffer checker that tests for the clock going backwards: Ring buffer clock went backwards: 20613921464 -> 20613921463 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 0 at kernel/trace/ring_buffer.c:3412 check_buffer+0x1b9/0x1c0 Modules linked in: [..] [CPU: 2]TIME DOES NOT MATCH expected:20620711698 actual:20620711697 delta:6790234 before:20613921463 after:20613921463 [20613915818] PAGE TIME STAMP [20613915818] delta:0 [20613915819] delta:1 [20613916035] delta:216 [20613916465] delta:430 [20613916575] delta:110 [20613916749] delta:174 [20613917248] delta:499 [20613917333] delta:85 [20613917775] delta:442 [20613917921] delta:146 [20613918321] delta:400 [20613918568] delta:247 [20613918768] delta:200 [20613919306] delta:538 [20613919353] delta:47 [20613919980] delta:627 [20613920296] delta:316 [20613920571] delta:275 [20613920862] delta:291 [20613921152] delta:290 [20613921464] delta:312 [20613921464] delta:0 TIME EXTEND [20613921464] delta:0 This happened more than once, and always for an off by one result. It also started happening after commit aafe104aa9096 was added. Cc: stable@vger.kernel.org Fixes: aafe104aa9096 ("tracing: Restructure trace_clock_global() to never block") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Steven Rostedt (VMware)	b313bd944d	tracing: Do not stop recording comms if the trace file is being read commit 4fdd595e4f9a1ff6d93ec702eaecae451cfc6591 upstream. A while ago, when the "trace" file was opened, tracing was stopped, and code was added to stop recording the comms to saved_cmdlines, for mapping of the pids to the task name. Code has been added that only records the comm if a trace event occurred, and there's no reason to not trace it if the trace file is opened. Cc: stable@vger.kernel.org Fixes: `7ffbd48d5c` ("tracing: Cache comms only after an event occurred") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Steven Rostedt (VMware)	adb3849ed8	tracing: Do not stop recording cmdlines when tracing is off commit 85550c83da421fb12dc1816c45012e1e638d2b38 upstream. The saved_cmdlines is used to map pids to the task name, such that the output of the tracing does not just show pids, but also gives a human readable name for the task. If the name is not mapped, the output looks like this: <...>-1316 [005] ...2 132.044039: ... Instead of this: gnome-shell-1316 [005] ...2 132.044039: ... The names are updated when tracing is running, but are skipped if tracing is stopped. Unfortunately, this stops the recording of the names if the top level tracer is stopped, and not if there's other tracers active. The recording of a name only happens when a new event is written into a ring buffer, so there is no need to test if tracing is on or not. If tracing is off, then no event is written and no need to test if tracing is off or not. Remove the check, as it hides the names of tasks for events in the instance buffers. Cc: stable@vger.kernel.org Fixes: `7ffbd48d5c` ("tracing: Cache comms only after an event occurred") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Breno Lima	1a91fafa3e	usb: chipidea: imx: Fix Battery Charger 1.2 CDP detection commit c6d580d96f140596d69220f60ce0cfbea4ee5c0f upstream. i.MX8MM cannot detect certain CDP USB HUBs. usbmisc_imx.c driver is not following CDP timing requirements defined by USB BC 1.2 specification and section 3.2.4 Detection Timing CDP. During Primary Detection the i.MX device should turn on VDP_SRC and IDM_SINK for a minimum of 40ms (TVDPSRC_ON). After a time of TVDPSRC_ON, the i.MX is allowed to check the status of the D- line. Current implementation is waiting between 1ms and 2ms, and certain BC 1.2 complaint USB HUBs cannot be detected. Increase delay to 40ms allowing enough time for primary detection. During secondary detection the i.MX is required to disable VDP_SRC and IDM_SNK, and enable VDM_SRC and IDP_SINK for at least 40ms (TVDMSRC_ON). Current implementation is not disabling VDP_SRC and IDM_SNK, introduce disable sequence in imx7d_charger_secondary_detection() function. VDM_SRC and IDP_SINK should be enabled for at least 40ms (TVDMSRC_ON). Increase delay allowing enough time for detection. Cc: <stable@vger.kernel.org> Fixes: `746f316b75` ("usb: chipidea: introduce imx7d USB charger detection") Signed-off-by: Breno Lima <breno.lima@nxp.com> Signed-off-by: Jun Li <jun.li@nxp.com> Link: https://lore.kernel.org/r/20210614175013.495808-1-breno.lima@nxp.com Signed-off-by: Peter Chen <peter.chen@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Andrew Lunn	576996b64e	usb: core: hub: Disable autosuspend for Cypress CY7C65632 commit a7d8d1c7a7f73e780aa9ae74926ae5985b2f895f upstream. The Cypress CY7C65632 appears to have an issue with auto suspend and detecting devices, not too dissimilar to the SMSC 5534B hub. It is easiest to reproduce by connecting multiple mass storage devices to the hub at the same time. On a Lenovo Yoga, around 1 in 3 attempts result in the devices not being detected. It is however possible to make them appear using lsusb -v. Disabling autosuspend for this hub resolves the issue. Fixes: `1208f9e1d7` ("USB: hub: Fix the broken detection of USB3 device in SMSC hub") Cc: stable@vger.kernel.org Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210614155524.2228800-1-andrew@lunn.ch Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Pavel Skripkin	6bd3d80d1f	can: mcba_usb: fix memory leak in mcba_usb commit 91c02557174be7f72e46ed7311e3bea1939840b0 upstream. Syzbot reported memory leak in SocketCAN driver for Microchip CAN BUS Analyzer Tool. The problem was in unfreed usb_coherent. In mcba_usb_start() 20 coherent buffers are allocated and there is nothing, that frees them: 1) In callback function the urb is resubmitted and that's all 2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER is not set (see mcba_usb_start) and this flag cannot be used with coherent buffers. Fail log: \| [ 1354.053291][ T8413] mcba_usb 1-1:0.0 can0: device disconnected \| [ 1367.059384][ T8420] kmemleak: 20 new suspected memory leaks (see /sys/kernel/debug/kmem) So, all allocated buffers should be freed with usb_free_coherent() explicitly NOTE: The same pattern for allocating and freeing coherent buffers is used in drivers/net/can/usb/kvaser_usb/kvaser_usb_core.c Fixes: `51f3baad7d` ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer") Link: https://lore.kernel.org/r/20210609215833.30393-1-paskripkin@gmail.com Cc: linux-stable <stable@vger.kernel.org> Reported-and-tested-by: syzbot+57281c762a3922e14dfe@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Oleksij Rempel	509ab6bfdd	can: j1939: fix Use-after-Free, hold skb ref while in use commit 2030043e616cab40f510299f09b636285e0a3678 upstream. This patch fixes a Use-after-Free found by the syzbot. The problem is that a skb is taken from the per-session skb queue, without incrementing the ref count. This leads to a Use-after-Free if the skb is taken concurrently from the session queue due to a CTS. Fixes: `9d71dd0c70` ("can: add support of SAE J1939 protocol") Link: https://lore.kernel.org/r/20210521115720.7533-1-o.rempel@pengutronix.de Cc: Hillf Danton <hdanton@sina.com> Cc: linux-stable <stable@vger.kernel.org> Reported-by: syzbot+220c1a29987a9a490903@syzkaller.appspotmail.com Reported-by: syzbot+45199c1b73b4013525cf@syzkaller.appspotmail.com Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Tetsuo Handa	0cf4b37790	can: bcm/raw/isotp: use per module netdevice notifier commit 8d0caedb759683041d9db82069937525999ada53 upstream. syzbot is reporting hung task at register_netdevice_notifier() [1] and unregister_netdevice_notifier() [2], for cleanup_net() might perform time consuming operations while CAN driver's raw/bcm/isotp modules are calling {register,unregister}_netdevice_notifier() on each socket. Change raw/bcm/isotp modules to call register_netdevice_notifier() from module's __init function and call unregister_netdevice_notifier() from module's __exit function, as with gw/j1939 modules are doing. Link: https://syzkaller.appspot.com/bug?id=391b9498827788b3cc6830226d4ff5be87107c30 [1] Link: https://syzkaller.appspot.com/bug?id=1724d278c83ca6e6df100a2e320c10d991cf2bce [2] Link: https://lore.kernel.org/r/54a5f451-05ed-f977-8534-79e7aa2bcc8f@i-love.sakura.ne.jp Cc: linux-stable <stable@vger.kernel.org> Reported-by: syzbot <syzbot+355f8edb2ff45d5f95fa@syzkaller.appspotmail.com> Reported-by: syzbot <syzbot+0f1827363a305f74996f@syzkaller.appspotmail.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Tested-by: syzbot <syzbot+355f8edb2ff45d5f95fa@syzkaller.appspotmail.com> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Norbert Slusarek	acb755be1f	can: bcm: fix infoleak in struct bcm_msg_head commit 5e87ddbe3942e27e939bdc02deb8579b0cbd8ecc upstream. On 64-bit systems, struct bcm_msg_head has an added padding of 4 bytes between struct members count and ival1. Even though all struct members are initialized, the 4-byte hole will contain data from the kernel stack. This patch zeroes out struct bcm_msg_head before usage, preventing infoleaks to userspace. Fixes: `ffd980f976` ("[CAN]: Add broadcast manager (bcm) protocol") Link: https://lore.kernel.org/r/trinity-7c1b2e82-e34f-4885-8060-2cd7a13769ce-1623532166177@3c-app-gmx-bs52 Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Norbert Slusarek <nslusarek@gmx.net> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:50 +02:00
Daniel Borkmann	8c82c52d1d	bpf: Do not mark insn as seen under speculative path verification [ Upstream commit fe9a5ca7e370e613a9a75a13008a3845ea759d6e ] ... in such circumstances, we do not want to mark the instruction as seen given the goal is still to jmp-1 rewrite/sanitize dead code, if it is not reachable from the non-speculative path verification. We do however want to verify it for safety regardless. With the patch as-is all the insns that have been marked as seen before the patch will also be marked as seen after the patch (just with a potentially different non-zero count). An upcoming patch will also verify paths that are unreachable in the non-speculative domain, hence this extension is needed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de> Reviewed-by: Piotr Krysiuk <piotras@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Daniel Borkmann	e9d271731d	bpf: Inherit expanded/patched seen count from old aux data [ Upstream commit d203b0fd863a2261e5d00b97f3d060c4c2a6db71 ] Instead of relying on current env->pass_cnt, use the seen count from the old aux data in adjust_insn_aux_data(), and expand it to the new range of patched instructions. This change is valid given we always expand 1:n with n>=1, so what applies to the old/original instruction needs to apply for the replacement as well. Not relying on env->pass_cnt is a prerequisite for a later change where we want to avoid marking an instruction seen when verified under speculative execution path. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de> Reviewed-by: Piotr Krysiuk <piotras@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Marc Zyngier	ed423d80bb	irqchip/gic-v3: Workaround inconsistent PMR setting on NMI entry [ Upstream commit 382e6e177bc1c02473e56591fe5083ae1e4904f6 ] The arm64 entry code suffers from an annoying issue on taking a NMI, as it sets PMR to a value that actually allows IRQs to be acknowledged. This is done for consistency with other parts of the code, and is in the process of being fixed. This shouldn't be a problem, as we are not enabling interrupts whilst in NMI context. However, in the infortunate scenario that we took a spurious NMI (retired before the read of IAR) and that there is an IRQ pending at the same time, we'll ack the IRQ in NMI context. Too bad. In order to avoid deadlocks while running something like perf, teach the GICv3 driver about this situation: if we were in a context where no interrupt should have fired, transiently set PMR to a value that only allows NMIs before acking the pending interrupt, and restore the original value after that. This papers over the core issue for the time being, and makes NMIs great again. Sort of. Fixes: 4d6a38da8e79e94c ("arm64: entry: always set GIC_PRIO_PSR_I_SET during entry") Co-developed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/lkml/20210610145731.1350460-1-maz@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Feng Tang	103c4a08ba	mm: relocate 'write_protect_seq' in struct mm_struct [ Upstream commit 2e3025434a6ba090c85871a1d4080ff784109e1f ] 0day robot reported a 9.2% regression for will-it-scale mmap1 test case[1], caused by commit 57efa1fe5957 ("mm/gup: prevent gup_fast from racing with COW during fork"). Further debug shows the regression is due to that commit changes the offset of hot fields 'mmap_lock' inside structure 'mm_struct', thus some cache alignment changes. From the perf data, the contention for 'mmap_lock' is very severe and takes around 95% cpu cycles, and it is a rw_semaphore struct rw_semaphore { atomic_long_t count; /* 8 bytes / atomic_long_t owner; / 8 bytes / struct optimistic_spin_queue osq; / spinner MCS lock / ... Before commit 57efa1fe5957 adds the 'write_protect_seq', it happens to have a very optimal cache alignment layout, as Linus explained: "and before the addition of the 'write_protect_seq' field, the mmap_sem was at offset 120 in 'struct mm_struct'. Which meant that count and owner were in two different cachelines, and then when you have contention and spend time in rwsem_down_write_slowpath(), this is probably exactly* the kind of layout you want. Because first the rwsem_write_trylock() will do a cmpxchg on the first cacheline (for the optimistic fast-path), and then in the case of contention, rwsem_down_write_slowpath() will just access the second cacheline. Which is probably just optimal for a load that spends a lot of time contended - new waiters touch that first cacheline, and then they queue themselves up on the second cacheline." After the commit, the rw_semaphore is at offset 128, which means the 'count' and 'owner' fields are now in the same cacheline, and causes more cache bouncing. Currently there are 3 "#ifdef CONFIG_XXX" before 'mmap_lock' which will affect its offset: CONFIG_MMU CONFIG_MEMBARRIER CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES The layout above is on 64 bits system with 0day's default kernel config (similar to RHEL-8.3's config), in which all these 3 options are 'y'. And the layout can vary with different kernel configs. Relayouting a structure is usually a double-edged sword, as sometimes it can helps one case, but hurt other cases. For this case, one solution is, as the newly added 'write_protect_seq' is a 4 bytes long seqcount_t (when CONFIG_DEBUG_LOCK_ALLOC=n), placing it into an existing 4 bytes hole in 'mm_struct' will not change other fields' alignment, while restoring the regression. Link: https://lore.kernel.org/lkml/20210525031636.GB7744@xsang-OptiPlex-9020/ [1] Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Feng Tang <feng.tang@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Riwen Lu	a87abba03a	hwmon: (scpi-hwmon) shows the negative temperature properly [ Upstream commit 78d13552346289bad4a9bf8eabb5eec5e5a321a5 ] The scpi hwmon shows the sub-zero temperature in an unsigned integer, which would confuse the users when the machine works in low temperature environment. This shows the sub-zero temperature in an signed value and users can get it properly from sensors. Signed-off-by: Riwen Lu <luriwen@kylinos.cn> Tested-by: Xin Chen <chenxin@kylinos.cn> Link: https://lore.kernel.org/r/20210604030959.736379-1-luriwen@kylinos.cn Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Chen Li	57b21ef118	radeon: use memcpy_to/fromio for UVD fw upload [ Upstream commit ab8363d3875a83f4901eb1cc00ce8afd24de6c85 ] I met a gpu addr bug recently and the kernel log tells me the pc is memcpy/memset and link register is radeon_uvd_resume. As we know, in some architectures, optimized memcpy/memset may not work well on device memory. Trival memcpy_toio/memset_io can fix this problem. BTW, amdgpu has already done it in: commit `ba0b2275a6` ("drm/amdgpu: use memcpy_to/fromio for UVD fw upload"), that's why it has no this issue on the same gpu and platform. Signed-off-by: Chen Li <chenli@uniontech.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Srinivasa Rao Mandadapu	3e4b0fbb72	ASoC: qcom: lpass-cpu: Fix pop noise during audio capture begin [ Upstream commit c8a4556d98510ca05bad8d02265a4918b03a8c0b ] This patch fixes PoP noise of around 15ms observed during audio capture begin. Enables BCLK and LRCLK in snd_soc_dai_ops prepare call for introducing some delay before capture start. (am from https://patchwork.kernel.org/patch/12276369/) (also found at https://lore.kernel.org/r/20210524142114.18676-1-srivasam@codeaurora.org) Co-developed-by: Judy Hsiao <judyhsiao@chromium.org> Signed-off-by: Judy Hsiao <judyhsiao@chromium.org> Signed-off-by: Srinivasa Rao Mandadapu <srivasam@codeaurora.org> Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20210604154545.1198337-1-judyhsiao@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Saravana Kannan	360609fc8b	drm/sun4i: dw-hdmi: Make HDMI PHY into a platform device [ Upstream commit 9bf3797796f570b34438235a6a537df85832bdad ] On sunxi boards that use HDMI output, HDMI device probe keeps being avoided indefinitely with these repeated messages in dmesg: platform 1ee0000.hdmi: probe deferral - supplier 1ef0000.hdmi-phy not ready There's a fwnode_link being created with fw_devlink=on between hdmi and hdmi-phy nodes, because both nodes have 'compatible' property set. Fw_devlink code assumes that nodes that have compatible property set will also have a device associated with them by some driver eventually. This is not the case with the current sun8i-hdmi driver. This commit makes sun8i-hdmi-phy into a proper platform device and fixes the display pipeline probe on sunxi boards that use HDMI. More context: https://lkml.org/lkml/2021/5/16/203 Signed-off-by: Saravana Kannan <saravanak@google.com> Signed-off-by: Ondrej Jirman <megous@megous.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210607085836.2827429-1-megous@megous.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Sergio Paracuellos	5bd6bcb353	pinctrl: ralink: rt2880: avoid to error in calls is pin is already enabled [ Upstream commit eb367d875f94a228c17c8538e3f2efcf2eb07ead ] In 'rt2880_pmx_group_enable' driver is printing an error and returning -EBUSY if a pin has been already enabled. This begets anoying messages in the caller when this happens like the following: rt2880-pinmux pinctrl: pcie is already enabled mt7621-pci 1e140000.pcie: Error applying setting, reverse things back To avoid this just print the already enabled message in the pinctrl driver and return 0 instead to not confuse the user with a real bad problem. Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Link: https://lore.kernel.org/r/20210604055337.20407-1-sergio.paracuellos@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Oder Chiou	6d0dc1b34c	ASoC: rt5682: Fix the fast discharge for headset unplugging in soundwire mode [ Upstream commit 49783c6f4a4f49836b5a109ae0daf2f90b0d7713 ] Based on ("5a15cd7fce20b1fd4aece6a0240e2b58cd6a225d"), the setting also should be set in soundwire mode. Signed-off-by: Oder Chiou <oder_chiou@realtek.com> Link: https://lore.kernel.org/r/20210604063150.29925-1-oder_chiou@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:49 +02:00
Axel Lin	ba8a26a7ce	regulator: rt4801: Fix NULL pointer dereference if priv->enable_gpios is NULL [ Upstream commit cb2381cbecb81a8893b2d1e1af29bc2e5531df27 ] devm_gpiod_get_array_optional may return NULL if no GPIO was assigned. Signed-off-by: Axel Lin <axel.lin@ingics.com> Link: https://lore.kernel.org/r/20210603094944.1114156-1-axel.lin@ingics.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
Patrice Chotard	2f8f0e97ce	spi: stm32-qspi: Always wait BUSY bit to be cleared in stm32_qspi_wait_cmd() [ Upstream commit d38fa9a155b2829b7e2cfcf8a4171b6dd3672808 ] In U-boot side, an issue has been encountered when QSPI source clock is running at low frequency (24 MHz for example), waiting for TCF bit to be set didn't ensure that all data has been send out the FIFO, we should also wait that BUSY bit is cleared. To prevent similar issue in kernel driver, we implement similar behavior by always waiting BUSY bit to be cleared. Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com> Link: https://lore.kernel.org/r/20210603073421.8441-1-patrice.chotard@foss.st.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
Richard Weinberger	e03c8b3516	ASoC: tas2562: Fix TDM_CFG0_SAMPRATE values [ Upstream commit 8bef925e37bdc9b6554b85eda16ced9a8e3c135f ] TAS2562_TDM_CFG0_SAMPRATE_MASK starts at bit 1, not 0. So all values need to be left shifted by 1. Signed-off-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20210530203446.19022-1-richard@nod.at Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
Vincent Guittot	813ff24f1d	sched/pelt: Ensure that _sum is always synced with _avg [ Upstream commit fcf6631f3736985ec89bdd76392d3c7bfb60119f ] Rounding in PELT calculation happening when entities are attached/detached of a cfs_rq can result into situations where util/runnable_avg is not null but util/runnable_sum is. This is normally not possible so we need to ensure that util/runnable_sum stays synced with util/runnable_avg. detach_entity_load_avg() is the last place where we don't sync util/runnable_sum with util/runnbale_avg when moving some sched_entities Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210601085832.12626-1-vincent.guittot@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
zpershuai	f6d28f0e36	spi: spi-zynq-qspi: Fix some wrong goto jumps & missing error code [ Upstream commit f131767eefc47de2f8afb7950cdea78397997d66 ] In zynq_qspi_probe function, when enable the device clock is done, the return of all the functions should goto the clk_dis_all label. If num_cs is not right then this should return a negative error code but currently it returns success. Signed-off-by: zpershuai <zpershuai@gmail.com> Link: https://lore.kernel.org/r/1622110857-21812-1-git-send-email-zpershuai@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
ChiYuan Huang	0ea21221dd	regulator: rtmv20: Fix to make regcache value first reading back from HW [ Upstream commit 46639a5e684edd0b80ae9dff220f193feb356277 ] - Fix to make regcache value first reading back from HW. Signed-off-by: ChiYuan Huang <cy_huang@richtek.com> Link: https://lore.kernel.org/r/1622542155-6373-1-git-send-email-u0084500@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
Nicolas Cavallari	3c5064cd29	ASoC: fsl-asoc-card: Set .owner attribute when registering card. [ Upstream commit a8437f05384cb472518ec21bf4fffbe8f0a47378 ] Otherwise, when compiled as module, a WARN_ON is triggered: WARNING: CPU: 0 PID: 5 at sound/core/init.c:208 snd_card_new+0x310/0x39c [snd] [...] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.10.39 #1 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) Workqueue: events deferred_probe_work_func [<c0111988>] (unwind_backtrace) from [<c010c8ac>] (show_stack+0x10/0x14) [<c010c8ac>] (show_stack) from [<c092784c>] (dump_stack+0xdc/0x104) [<c092784c>] (dump_stack) from [<c0129710>] (__warn+0xd8/0x114) [<c0129710>] (__warn) from [<c0922a48>] (warn_slowpath_fmt+0x5c/0xc4) [<c0922a48>] (warn_slowpath_fmt) from [<bf0496f8>] (snd_card_new+0x310/0x39c [snd]) [<bf0496f8>] (snd_card_new [snd]) from [<bf1d7df8>] (snd_soc_bind_card+0x334/0x9c4 [snd_soc_core]) [<bf1d7df8>] (snd_soc_bind_card [snd_soc_core]) from [<bf1e9cd8>] (devm_snd_soc_register_card+0x30/0x6c [snd_soc_core]) [<bf1e9cd8>] (devm_snd_soc_register_card [snd_soc_core]) from [<bf22d964>] (fsl_asoc_card_probe+0x550/0xcc8 [snd_soc_fsl_asoc_card]) [<bf22d964>] (fsl_asoc_card_probe [snd_soc_fsl_asoc_card]) from [<c060c930>] (platform_drv_probe+0x48/0x98) [...] Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com> Link: https://lore.kernel.org/r/20210527163409.22049-1-nicolas.cavallari@green-communications.fr Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
Tiezhu Yang	9a17907946	phy: phy-mtk-tphy: Fix some resource leaks in mtk_phy_init() [ Upstream commit aaac9a1bd370338ce372669eb9a6059d16b929aa ] Use clk_disable_unprepare() in the error path of mtk_phy_init() to fix some resource leaks. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Reviewed-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Link: https://lore.kernel.org/r/1621420659-15858-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
Jack Yu	02e2455748	ASoC: rt5659: Fix the lost powers for the HDA header [ Upstream commit 6308c44ed6eeadf65c0a7ba68d609773ed860fbb ] The power of "LDO2", "MICBIAS1" and "Mic Det Power" were powered off after the DAPM widgets were added, and these powers were set by the JD settings "RT5659_JD_HDA_HEADER" in the probe function. In the codec probe function, these powers were ignored to prevent them controlled by DAPM. Signed-off-by: Oder Chiou <oder_chiou@realtek.com> Signed-off-by: Jack Yu <jack.yu@realtek.com> Message-Id: <15fced51977b458798ca4eebf03dafb9@realtek.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:48 +02:00
Til Jasper Ullrich	3fb6c6acc1	platform/x86: thinkpad_acpi: Add X1 Carbon Gen 9 second fan support [ Upstream commit c0e0436cb4f6627146acdae8c77828f18db01151 ] The X1 Carbon Gen 9 uses two fans instead of one like the previous generation. This adds support for the second fan. It has been tested on my X1 Carbon Gen 9 (20XXS00100) and works fine. Signed-off-by: Til Jasper Ullrich <tju@tju.me> Link: https://lore.kernel.org/r/20210525150950.14805-1-tju@tju.me Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-23 14:42:47 +02:00

... 2 3 4 5 6 ...

974998 Commits