linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-01-23 22:39:37 +07:00

Author	SHA1	Message	Date
Deepak Rawat	26b82873a4	drm/vmwgfx: Split surface metadata from struct vmw_surface Create a new structure vmw_surface_metadata representing the metadata used for creating surface. With this can make the surface_define_priv a bit cleaner. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:35 +01:00
Deepak Rawat	e8bead9c5c	drm/vmwgfx: Add support for streamoutput with mob commands With SM5 capability a new version of streamoutput is supported by device which need backing mob and a new field. With this change the new command is supported in command buffer. v2: Also track streamoutput context binding in binding manager. v3: Track only one streamoutput as only one can be set to context. v4: Fix comment typos Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Signed-off-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:35 +01:00
Deepak Rawat	403fef50e3	drm/vmwgfx: Rename stream output target binding tracker struct Previous name vmw_ctx_bindinfo_so is misleading because it actually represent so target and stream output is a new resource type that needs tracking for SM5 capable device. Also rename binding type enum and internal functions to reflect these belongs to so targets. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:35 +01:00
Deepak Rawat	b6fad73975	drm/vmwgfx: Add support for indirect and dispatch commands Validate indirect and dispatch commands in command buffer. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:34 +01:00
Deepak Rawat	5e8ec0d919	drm/vmwgfx: Add support for UA view commands Virtual device now support new commands to manage unordered access views. Allow them as part of user-space command buffer. This involves adding UA view cotable, binding tracker info, new view type and command verifier functions. v2: fix comment typo v3: style fixes (don't use deprecated PTR_RET) Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Signed-off-by: Neha Bhende <bhenden@vmware.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:34 +01:00
Deepak Rawat	d2e90ab374	drm/vmwgfx: Support SM5 shader type in command buffer Virtual device now supports new shader types, allow them as valid shader type in command buffer. Also add per shader bind info in binding manager state for new shader type. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:34 +01:00
Deepak Rawat	7ebb47c9f9	drm/vmwgfx: Read new register for GB memory when available Virtual device added new register for suggested GB memory, read the new register when available. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:34 +01:00
Deepak Rawat	4dec28053b	drm/vmwgfx: Add a new enum for SM5 graphics context capability A new enum to represent new SM5 graphics context capability in vmwgfx. v2: use new correct cap bits (merged several later commits into it). Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Signed-off-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:34 +01:00
Deepak Rawat	0651dfabd9	drm/vmwgfx: Sync virtual device headers for new feature Get the latest device headers for SM5 and other features development. v2: sync to newer bits (merge later commits) v3: sync to even newer bits Co-developed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Signed-off-by: Neha Bhende <bhenden@vmware.com> Signed-off-by: Charmaine Lee <charmainel@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org>	2020-03-23 22:39:34 +01:00
Deepak Rawat	878c6ecd3e	drm/vmwgfx: Use enum to represent graphics context capabilities Instead of having different bool in device private to represent incremental graphics context capabilities, add a new sm type enum. v2: Use enum instead of bit flag. v3: Incorporated review comments. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:34 +01:00
Deepak Rawat	3d14395422	drm/vmwgfx: Deprecate logic ops commands Logic ops commands are marked as deprecated by virtual device and were never used by vmwgfx. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:33 +01:00
Deepak Rawat	0652ff3363	drm/vmwgfx: Sync legacy multisampling device capability In favor of SM4.1 multisampling capability, virtual device deprecated old multisampling device capability. Mark legacy multisampling device capability as dead. Rename the function that masks legacy multisample capability to reflect that now it is masking a deprecated feature. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:33 +01:00
Deepak Rawat	ef7c7b7497	drm/vmwgfx: Also check for SVGA_CAP_DX before reading DX context support Virtual device consider SVGA_CAP_DX and SVGA3D_DEVCAP_DXCONTEXT independent of each other. Some of the commands in cmd_buf depends on SVGA_CAP_DX, so better to check for that as well. Signed-off-by: Deepak Rawat <drawat.floss@gmail.com> Reviewed-by: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Roland Scheidegger <sroland@vmware.com>	2020-03-23 22:39:33 +01:00
Eric Dumazet	6cd6cbf593	tcp: repair: fix TCP_QUEUE_SEQ implementation When application uses TCP_QUEUE_SEQ socket option to change tp->rcv_next, we must also update tp->copied_seq. Otherwise, stuff relying on tcp_inq() being precise can eventually be confused. For example, tcp_zerocopy_receive() might crash because it does not expect tcp_recv_skb() to return NULL. We could add tests in various places to fix the issue, or simply make sure tcp_inq() wont return a random value, and leave fast path as it is. Note that this fixes ioctl(fd, SIOCINQ, &val) at the same time. Fixes: `ee9952831c` ("tcp: Initial repair mode") Fixes: `05255b823a` ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-23 13:02:01 -07:00
Nick Desaulniers	428b8f1d9f	KVM: VMX: don't allow memory operands for inline asm that modifies SP THUNK_TARGET defines [thunk_target] as having "rm" input constraints when CONFIG_RETPOLINE is not set, which isn't constrained enough for this specific case. For inline assembly that modifies the stack pointer before using this input, the underspecification of constraints is dangerous, and results in an indirect call to a previously pushed flags register. In this case `entry`'s stack slot is good enough to satisfy the "m" constraint in "rm", but the inline assembly in handle_external_interrupt_irqoff() modifies the stack pointer via push+pushf before using this input, which in this case results in calling what was the previous state of the flags register, rather than `entry`. Be more specific in the constraints by requiring `entry` be in a register, and not a memory operand. Reported-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com Debugged-by: Alexander Potapenko <glider@google.com> Debugged-by: Paolo Bonzini <pbonzini@redhat.com> Debugged-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Message-Id: <20200323191243.30002-1-ndesaulniers@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 15:40:51 -04:00
Golan Ben Ami	0433ae556e	iwlwifi: don't send GEO_TX_POWER_LIMIT if no wgds table The GEO_TX_POWER_LIMIT command was sent although there is no wgds table, so the fw got wrong SAR values from the driver. Fix this by avoiding sending the command if no wgds tables are available. Signed-off-by: Golan Ben Ami <golan.ben.ami@intel.com> Fixes: `39c1a9728f` ("iwlwifi: refactor the SAR tables from mvm to acpi") Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Tested-By: Jonathan McDowell <noodles@earth.li> Tested-by: Len Brown <len.brown@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/iwlwifi.20200318081237.46db40617cc6.Id5cf852ec8c5dbf20ba86bad7b165a0c828f8b2e@changeid	2020-03-23 18:38:03 +02:00
Luca Coelho	cf52c8a776	iwlwifi: pcie: add 0x2526/0x401* devices back to cfg detection Three devices, with PCI device ID 0x2526 and subdevice IDs 0x4010, 0x4018 and 0x401C were removed accidentally. Add them back. Reported-by: Brett Hassal <brett.hassal@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206661 Fixes: `0b295a1eb8` ("iwlwifi: add device name to device_info") Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/iwlwifi.20200317123331.16762b29f26c.I928bcaa799e7b3d33838c0667714eeb9fa665290@changeid	2020-03-23 18:36:26 +02:00
He Zhe	edec6e015a	KVM: LAPIC: Mark hrtimer for period or oneshot mode to expire in hard interrupt context apic->lapic_timer.timer was initialized with HRTIMER_MODE_ABS_HARD but started later with HRTIMER_MODE_ABS, which may cause the following warning in PREEMPT_RT kernel. WARNING: CPU: 1 PID: 2957 at kernel/time/hrtimer.c:1129 hrtimer_start_range_ns+0x348/0x3f0 CPU: 1 PID: 2957 Comm: qemu-system-x86 Not tainted 5.4.23-rt11 #1 Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1a 09/18/2018 RIP: 0010:hrtimer_start_range_ns+0x348/0x3f0 Code: 4d b8 0f 94 c1 0f b6 c9 e8 35 f1 ff ff 4c 8b 45 b0 e9 3b fd ff ff e8 d7 3f fa ff 48 98 4c 03 34 c5 a0 26 bf 93 e9 a1 fd ff ff <0f> 0b e9 fd fc ff ff 65 8b 05 fa b7 90 6d 89 c0 48 0f a3 05 60 91 RSP: 0018:ffffbc60026ffaf8 EFLAGS: 00010202 RAX: 0000000000000001 RBX: ffff9d81657d4110 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000006cc7987bcf RDI: ffff9d81657d4110 RBP: ffffbc60026ffb58 R08: 0000000000000001 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000000 R12: 0000006cc7987bcf R13: 0000000000000000 R14: 0000006cc7987bcf R15: ffffbc60026d6a00 FS: 00007f401daed700(0000) GS:ffff9d81ffa40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000ffffffff CR3: 0000000fa7574000 CR4: 00000000003426e0 Call Trace: ? kvm_release_pfn_clean+0x22/0x60 [kvm] start_sw_timer+0x85/0x230 [kvm] ? vmx_vmexit+0x1b/0x30 [kvm_intel] kvm_lapic_switch_to_sw_timer+0x72/0x80 [kvm] vmx_pre_block+0x1cb/0x260 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_vmexit+0x1b/0x30 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_vmexit+0x1b/0x30 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_vmexit+0x1b/0x30 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_vmexit+0x1b/0x30 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_vmexit+0x1b/0x30 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_vmexit+0x1b/0x30 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_vmexit+0x1b/0x30 [kvm_intel] ? vmx_vmexit+0xf/0x30 [kvm_intel] ? vmx_sync_pir_to_irr+0x9e/0x100 [kvm_intel] ? kvm_apic_has_interrupt+0x46/0x80 [kvm] kvm_arch_vcpu_ioctl_run+0x85b/0x1fa0 [kvm] ? _raw_spin_unlock_irqrestore+0x18/0x50 ? _copy_to_user+0x2c/0x30 kvm_vcpu_ioctl+0x235/0x660 [kvm] ? rt_spin_unlock+0x2c/0x50 do_vfs_ioctl+0x3e4/0x650 ? __fget+0x7a/0xa0 ksys_ioctl+0x67/0x90 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x4d/0x120 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f4027cc54a7 Code: 00 00 90 48 8b 05 e9 59 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b9 59 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007f401dae9858 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00005558bd029690 RCX: 00007f4027cc54a7 RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000d RBP: 00007f4028b72000 R08: 00005558bc829ad0 R09: 00000000ffffffff R10: 00005558bcf90ca0 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 00005558bce1c840 --[ end trace 0000000000000002 ]-- Signed-off-by: He Zhe <zhe.he@windriver.com> Message-Id: <1584687967-332859-1-git-send-email-zhe.he@windriver.com> Reviewed-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 09:01:14 -04:00
Tom Lendacky	2e2409afe5	KVM: SVM: Issue WBINVD after deactivating an SEV guest Currently, CLFLUSH is used to flush SEV guest memory before the guest is terminated (or a memory hotplug region is removed). However, CLFLUSH is not enough to ensure that SEV guest tagged data is flushed from the cache. With `33af3a7ef9` ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations"), the original WBINVD was removed. This then exposed crashes at random times because of a cache flush race with a page that had both a hypervisor and a guest tag in the cache. Restore the WBINVD when destroying an SEV guest and add a WBINVD to the svm_unregister_enc_region() function to ensure hotplug memory is flushed when removed. The DF_FLUSH can still be avoided at this point. Fixes: `33af3a7ef9` ("KVM: SVM: Reduce WBINVD/DF_FLUSH invocations") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <c8bf9087ca3711c5770bdeaafa3e45b717dc5ef4.1584720426.git.thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-03-23 09:01:04 -04:00
Luis Henriques	c8d6ee0144	ceph: fix memory leak in ceph_cleanup_snapid_map() kmemleak reports the following memory leak: unreferenced object 0xffff88821feac8a0 (size 96): comm "kworker/1:0", pid 17, jiffies 4294896362 (age 20.512s) hex dump (first 32 bytes): a0 c8 ea 1f 82 88 ff ff 00 c9 ea 1f 82 88 ff ff ................ 00 00 00 00 00 00 00 00 00 01 00 00 00 00 ad de ................ backtrace: [<00000000b3ea77fb>] ceph_get_snapid_map+0x75/0x2a0 [<00000000d4060942>] fill_inode+0xb26/0x1010 [<0000000049da6206>] ceph_readdir_prepopulate+0x389/0xc40 [<00000000e2fe2549>] dispatch+0x11ab/0x1521 [<000000007700b894>] ceph_con_workfn+0xf3d/0x3240 [<0000000039138a41>] process_one_work+0x24d/0x590 [<00000000eb751f34>] worker_thread+0x4a/0x3d0 [<000000007e8f0d42>] kthread+0xfb/0x130 [<00000000d49bd1fa>] ret_from_fork+0x3a/0x50 A kfree is missing while looping the 'to_free' list of ceph_snapid_map objects. Cc: stable@vger.kernel.org Fixes: `75c9627efb` ("ceph: map snapid to anonymous bdev ID") Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2020-03-23 13:07:08 +01:00
Ilya Dryomov	e886274031	libceph: fix alloc_msg_with_page_vector() memory leaks Make it so that CEPH_MSG_DATA_PAGES data item can own pages, fixing a bunch of memory leaks for a page vector allocated in alloc_msg_with_page_vector(). Currently, only watch-notify messages trigger this allocation, and normally the page vector is freed either in handle_watch_notify() or by the caller of ceph_osdc_notify(). But if the message is freed before that (e.g. if the session faults while reading in the message or if the notify is stale), we leak the page vector. This was supposed to be fixed by switching to a message-owned pagelist, but that never happened. Fixes: `1907920324` ("libceph: support for sending notifies") Reported-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Roman Penyaev <rpenyaev@suse.de>	2020-03-23 13:07:08 +01:00
Ilya Dryomov	7614209736	ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult per-pool flags as well. Unfortunately the backwards compatibility here is lacking: - the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but was guarded by require_osd_release >= RELEASE_LUMINOUS - it was subsequently backported to luminous in v12.2.2, but that makes no difference to clients that only check OSDMAP_FULL/NEARFULL because require_osd_release is not client-facing -- it is for OSDs Since all kernels are affected, the best we can do here is just start checking both map flags and pool flags and send that to stable. These checks are best effort, so take osdc->lock and look up pool flags just once. Remove the FIXME, since filesystem quotas are checked above and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set. Cc: stable@vger.kernel.org Reported-by: Yanhu Cao <gmayyyha@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Sage Weil <sage@redhat.com>	2020-03-23 13:07:08 +01:00
Mauro Carvalho Chehab	692b65c84f	i2c: fix a doc warning Don't let non-letters inside a literal block without escaping it, as the toolchain would mis-interpret it: ./include/linux/i2c.h:518: WARNING: Inline strong start-string without end-string. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2020-03-23 11:17:22 +01:00
Sungbo Eo	deeabb4c13	ARM: dts: oxnas: Fix clear-mask property Disable all rps-irq interrupts during driver initialization to prevent an accidental interrupt on GIC. Fixes: `84316f4ef1` ("ARM: boot: dts: Add Oxford Semiconductor OX810SE dtsi") Fixes: `38d4a53733` ("ARM: dts: Add support for OX820 and Pogoplug V3") Signed-off-by: Sungbo Eo <mans0n@gorani.run> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>	2020-03-23 09:34:09 +01:00
Christophe JAILLET	018af9be3d	dmaengine: ti: k3-udma-glue: Fix an error handling path in 'k3_udma_glue_cfg_rx_flow()' All but one error handling paths in the 'k3_udma_glue_cfg_rx_flow()' function 'goto err' and call 'k3_udma_glue_release_rx_flow()'. This not correct because this function has a 'channel->flows_ready--;' at the end, but 'flows_ready' has not been incremented here, when we branch to the error handling path. In order to keep a correct value in 'flows_ready', un-roll 'k3_udma_glue_release_rx_flow()', simplify it, add some labels and branch at the correct places when an error is detected. Doing so, we also NULLify 'flow->udma_rflow' in a path that was lacking it. Fixes: `d702419134` ("dmaengine: ti: k3-udma: Add glue layer for non DMAengine user") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Link: https://lore.kernel.org/r/20200318191209.1267-1-christophe.jaillet@wanadoo.fr Signed-off-by: Vinod Koul <vkoul@kernel.org>	2020-03-23 11:48:34 +05:30
Zhou Wang	01c4df39a2	MAINTAINERS: Add maintainer for HiSilicon DMA engine driver Add myself as the maintainer of HiSilicon DMA engine driver. Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Link: https://lore.kernel.org/r/1584062624-196854-1-git-send-email-wangzhou1@hisilicon.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2020-03-23 11:38:49 +05:30
Dave Jiang	988aad2f11	dmaengine: idxd: fix off by one on cdev dwq refcount The refcount check for dedicated workqueue (dwq) is off by one and allows more than 1 user to open the char device. Fix check so only a single user can open the device. Fixes: `42d279f913` ("dmaengine: idxd: add char driver to expose submission portal to userland") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/158403020187.10208.14117394394540710774.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2020-03-23 11:32:06 +05:30
Linus Torvalds	16fbf79b0f	Linux 5.6-rc7	2020-03-22 18:31:56 -07:00
Nicolas Saenz Julienne	55c7c06210	ARM: dts: bcm283x: Fix vc4's firmware bus DMA limitations The bus is virtual and devices have to inherit their DMA constraints from the underlying interconnect. So add an empty dma-ranges property to the bus node, implying the firmware bus' DMA constraints are identical to its parent's. Fixes: `7dbe8c62ce` ("ARM: dts: Add minimal Raspberry Pi 4 support") Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>	2020-03-22 14:45:24 -07:00
Linus Torvalds	67d584e33e	for-5.6-rc6-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl53Vh0ACgkQxWXV+ddt WDtfOQ//bbUyKXcdH0FBZOCEcJmegcK1eUFYqKrwR2bHGe5JRdLM8pAvjCcqmWeO jtaRiFC4NSCqTIl3mkBUb+XmQtjZwixBUHRxJpuEO8zqawvFZXTqg/KJklNvi2rd KdflSNia6KrozTT+B/lpwZ5emS+wSdj5XTZ6VGj4riwtphSfWAjOu+4cOASMeFu+ Gfn+N9xu0ZcR/6zO20xAg0Xz+WU2uj4EfeM35dtRP2bPLG0yOGmiYT15Ll9h74Wm 7F+28iNTQfYutAexGvUpiouanGXE+ka3TCsJg5LuVTpdKGraOVGEuX+RhsyoKQrB E8bk91fbkLlooluhUC306iNA9/+RN/yFGtILX8JsgI2Od26ZuU01l/OHrc19MDIm gw1w3PMsD/hXLsG5ba4QsIYOzXofSrPdWej29h/o5p0VEQrAoCJEpAi7fVsiJDR1 sx6kCodw5jYhVs1P6DdXO1pgjE7iFUmjUQCFkl40edPMLy/LwB99A4zNnCOwI0KZ 49CMWHDe+tXVJBTzPvtma/PycQHIxJYMf1f8ko9E4stB7HtfH4dnUERDkb1UwQ5n aJgyhsCCnp/EJoPunUT7g9nLUdyu0Rtwknn3NascWZEieX2QhKEF5RcjAUSL+Hlo jbGGvoLhG0nOtYkU7BNSQbL8wxPJEEAq8e6F4tWMcOkhX4pNZP8= =YkB0 -----END PGP SIGNATURE----- Merge tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Two fixes. The first is a regression: when dropping some incompat bits the conditions were reversed. The other is a fix for rename whiteout potentially leaving stack memory linked to a list" * tag 'for-5.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix removal of raid[56\|1c34} incompat flags after removing block group btrfs: fix log context list corruption after rename whiteout error	2020-03-22 11:35:33 -07:00
Linus Torvalds	b3c03db67e	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: x86/mm: split vmalloc_sync_all() mm, slub: prevent kmalloc_node crashes and memory leaks mm/mmu_notifier: silence PROVE_RCU_LIST warnings epoll: fix possible lost wakeup on epoll_ctl() path mm: do not allow MADV_PAGEOUT for CoW pages mm, memcg: throttle allocators based on ancestral memory.high mm, memcg: fix corruption on 64-bit divisor in memory.high throttling page-flags: fix a crash at SetPageError(THP_SWAP) mm/hotplug: fix hot remove failure in SPARSEMEM\|!VMEMMAP case memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event	2020-03-22 10:46:50 -07:00
Chuhong Yuan	e1b9f99ff8	i2c: hix5hd2: add missed clk_disable_unprepare in remove The driver forgets to disable and unprepare clk when remove. Add a call to clk_disable_unprepare to fix it. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org	2020-03-22 17:20:11 +01:00
Alan Maguire	83a9b6f639	selftests/net: add definition for SOL_DCCP to fix compilation errors for old libc Many systems build/test up-to-date kernels with older libcs, and an older glibc (2.17) lacks the definition of SOL_DCCP in /usr/include/bits/socket.h (it was added in the 4.6 timeframe). Adding the definition to the test program avoids a compilation failure that gets in the way of building tools/testing/selftests/net. The test itself will work once the definition is added; either skipping due to DCCP not being configured in the kernel under test or passing, so there are no other more up-to-date glibc dependencies here it seems beyond that missing definition. Fixes: `11fb60d108` ("selftests: net: reuseport_addr_any: add DCCP") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 20:23:10 -07:00
Doug Berger	9a9ba2a4aa	net: bcmgenet: always enable status blocks The hardware offloading of the NETIF_F_HW_CSUM and NETIF_F_RXCSUM features requires the use of Transmit Status Blocks before transmit frame data and Receive Status Blocks before receive frame data to carry the checksum information. Unfortunately, these status blocks are currently only enabled when the NETIF_F_HW_CSUM feature is enabled. As a result NETIF_F_RXCSUM will not actually be offloaded to the hardware unless both it and NETIF_F_HW_CSUM are enabled. Fortunately, that is the default configuration. This commit addresses this issue by always enabling the use of status blocks on both transmit and receive frames. Further, it replaces the use of a dedicated flag within the driver private data structure with direct use of the netdev features flags. Fixes: `8101553978` ("net: bcmgenet: use CHECKSUM_COMPLETE for NETIF_F_RXCSUM") Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 20:18:13 -07:00
Grygorii Strashko	749f6f6843	net: phy: dp83867: w/a for fld detect threshold bootstrapping issue When the DP83867 PHY is strapped to enable Fast Link Drop (FLD) feature STRAP_STS2.STRAP_ FLD (reg 0x006F bit 10), the Energy Lost Threshold for FLD Energy Lost Mode FLD_THR_CFG.ENERGY_LOST_FLD_THR (reg 0x002e bits 2:0) will be defaulted to 0x2. This may cause the phy link to be unstable. The new DP83867 DM recommends to always restore ENERGY_LOST_FLD_THR to 0x1. Hence, restore default value of FLD_THR_CFG.ENERGY_LOST_FLD_THR to 0x1 when FLD is enabled by bootstrapping as recommended by DM. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 20:09:57 -07:00
Emil Renner Berthing	9de9aa487d	net: stmmac: dwmac-rk: fix error path in rk_gmac_probe Make sure we clean up devicetree related configuration also when clock init fails. Fixes: `fecd4d7eef` ("net: stmmac: dwmac-rk: Add integrated PHY support") Signed-off-by: Emil Renner Berthing <kernel@esmil.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:58:48 -07:00
Oliver Hartkopp	2091a3d42b	slcan: not call free_netdev before rtnl_unlock in slcan_open As the description before netdev_run_todo, we cannot call free_netdev before rtnl_unlock, fix it by reorder the code. This patch is a 1:1 copy of upstream slip.c commit `f596c87005` ("slip: not call free_netdev before rtnl_unlock in slip_open"). Reported-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:57:15 -07:00
Lukas Bulwahn	06e9bfc1e5	ionic: make spdxcheck.py happy Headers ionic_if.h and ionic_regs.h are licensed under three alternative licenses and the used SPDX-License-Identifier expression makes ./scripts/spdxcheck.py complain: drivers/net/ethernet/pensando/ionic/ionic_if.h: 1:52 Syntax error: OR drivers/net/ethernet/pensando/ionic/ionic_regs.h: 1:52 Syntax error: OR As OR is associative, it is irrelevant if the parentheses are put around the first or the second OR-expression. Simply add parentheses to make spdxcheck.py happy. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:53:57 -07:00
Taehee Yoo	3a303cfdd2	hsr: fix general protection fault in hsr_addr_is_self() The port->hsr is used in the hsr_handle_frame(), which is a callback of rx_handler. hsr master and slaves are initialized in hsr_add_port(). This function initializes several pointers, which includes port->hsr after registering rx_handler. So, in the rx_handler routine, un-initialized pointer would be used. In order to fix this, pointers should be initialized before registering rx_handler. Test commands: ip netns del left ip netns del right modprobe -rv veth modprobe -rv hsr killall ping modprobe hsr ip netns add left ip netns add right ip link add veth0 type veth peer name veth1 ip link add veth2 type veth peer name veth3 ip link add veth4 type veth peer name veth5 ip link set veth1 netns left ip link set veth3 netns right ip link set veth4 netns left ip link set veth5 netns right ip link set veth0 up ip link set veth2 up ip link set veth0 address fc:00:00:00:00:01 ip link set veth2 address fc:00:00:00:00:02 ip netns exec left ip link set veth1 up ip netns exec left ip link set veth4 up ip netns exec right ip link set veth3 up ip netns exec right ip link set veth5 up ip link add hsr0 type hsr slave1 veth0 slave2 veth2 ip a a 192.168.100.1/24 dev hsr0 ip link set hsr0 up ip netns exec left ip link add hsr1 type hsr slave1 veth1 slave2 veth4 ip netns exec left ip a a 192.168.100.2/24 dev hsr1 ip netns exec left ip link set hsr1 up ip netns exec left ip n a 192.168.100.1 dev hsr1 lladdr \ fc:00:00:00:00:01 nud permanent ip netns exec left ip n r 192.168.100.1 dev hsr1 lladdr \ fc:00:00:00:00:01 nud permanent for i in {1..100} do ip netns exec left ping 192.168.100.1 & done ip netns exec left hping3 192.168.100.1 -2 --flood & ip netns exec right ip link add hsr2 type hsr slave1 veth3 slave2 veth5 ip netns exec right ip a a 192.168.100.3/24 dev hsr2 ip netns exec right ip link set hsr2 up ip netns exec right ip n a 192.168.100.1 dev hsr2 lladdr \ fc:00:00:00:00:02 nud permanent ip netns exec right ip n r 192.168.100.1 dev hsr2 lladdr \ fc:00:00:00:00:02 nud permanent for i in {1..100} do ip netns exec right ping 192.168.100.1 & done ip netns exec right hping3 192.168.100.1 -2 --flood & while : do ip link add hsr0 type hsr slave1 veth0 slave2 veth2 ip a a 192.168.100.1/24 dev hsr0 ip link set hsr0 up ip link del hsr0 done Splat looks like: [ 120.954938][ C0] general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1]I [ 120.957761][ C0] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037] [ 120.959064][ C0] CPU: 0 PID: 1511 Comm: hping3 Not tainted 5.6.0-rc5+ #460 [ 120.960054][ C0] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 120.962261][ C0] RIP: 0010:hsr_addr_is_self+0x65/0x2a0 [hsr] [ 120.963149][ C0] Code: 44 24 18 70 73 2f c0 48 c1 eb 03 48 8d 04 13 c7 00 f1 f1 f1 f1 c7 40 04 00 f2 f2 f2 4 [ 120.966277][ C0] RSP: 0018:ffff8880d9c09af0 EFLAGS: 00010206 [ 120.967293][ C0] RAX: 0000000000000006 RBX: 1ffff1101b38135f RCX: 0000000000000000 [ 120.968516][ C0] RDX: dffffc0000000000 RSI: ffff8880d17cb208 RDI: 0000000000000000 [ 120.969718][ C0] RBP: 0000000000000030 R08: ffffed101b3c0e3c R09: 0000000000000001 [ 120.972203][ C0] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 0000000000000000 [ 120.973379][ C0] R13: ffff8880aaf80100 R14: ffff8880aaf800f2 R15: ffff8880aaf80040 [ 120.974410][ C0] FS: 00007f58e693f740(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000 [ 120.979794][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 120.980773][ C0] CR2: 00007ffcb8b38f29 CR3: 00000000afe8e001 CR4: 00000000000606f0 [ 120.981945][ C0] Call Trace: [ 120.982411][ C0] <IRQ> [ 120.982848][ C0] ? hsr_add_node+0x8c0/0x8c0 [hsr] [ 120.983522][ C0] ? rcu_read_lock_held+0x90/0xa0 [ 120.984159][ C0] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 120.984944][ C0] hsr_handle_frame+0x1db/0x4e0 [hsr] [ 120.985597][ C0] ? hsr_nl_nodedown+0x2b0/0x2b0 [hsr] [ 120.986289][ C0] __netif_receive_skb_core+0x6bf/0x3170 [ 120.992513][ C0] ? check_chain_key+0x236/0x5d0 [ 120.993223][ C0] ? do_xdp_generic+0x1460/0x1460 [ 120.993875][ C0] ? register_lock_class+0x14d0/0x14d0 [ 120.994609][ C0] ? __netif_receive_skb_one_core+0x8d/0x160 [ 120.995377][ C0] __netif_receive_skb_one_core+0x8d/0x160 [ 120.996204][ C0] ? __netif_receive_skb_core+0x3170/0x3170 [ ... ] Reported-by: syzbot+fcf5dd39282ceb27108d@syzkaller.appspotmail.com Fixes: `c5a7591172` ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:49:34 -07:00
David S. Miller	4abe5a1b96	Merge branch 'hinic-BugFixes' Luo bin says: ==================== hinic: BugFixes Fix a number of bugs which have been present since the first commit. The bugs fixed in these patchs are hardly exposed unless given very specific conditions. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:44:23 -07:00
Luo bin	7296695fc1	hinic: fix wrong value of MIN_SKB_LEN the minimum value of skb len that hw supports is 32 rather than 17 Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:44:16 -07:00
Luo bin	0da7c322f1	hinic: fix wrong para of wait_for_completion_timeout the second input parameter of wait_for_completion_timeout should be jiffies instead of millisecond Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:43:38 -07:00
Luo bin	33f15da216	hinic: fix out-of-order excution in arm cpu add read barrier in driver code to keep from reading other fileds in dma memory which is writable for hw until we have verified the memory is valid for driver Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:43:38 -07:00
Luo bin	614eaa943e	hinic: fix the bug of clearing event queue should disable eq irq before freeing it, must clear event queue depth in hw before freeing relevant memory to avoid illegal memory access and update consumer idx to avoid invalid interrupt Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:43:38 -07:00
Luo bin	96758117dc	hinic: fix a bug of waitting for IO stopped it's unreliable for fw to check whether IO is stopped, so driver wait for enough time to ensure IO process is done in hw before freeing resources Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-21 19:43:38 -07:00
Joerg Roedel	763802b53a	x86/mm: split vmalloc_sync_all() Commit `3f8fd02b1b` ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in the vunmap() code-path. While this change was necessary to maintain correctness on x86-32-pae kernels, it also adds additional cycles for architectures that don't need it. Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported severe performance regressions in micro-benchmarks because it now also calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But the vmalloc_sync_all() implementation on x86-64 is only needed for newly created mappings. To avoid the unnecessary work on x86-64 and to gain the performance back, split up vmalloc_sync_all() into two functions: * vmalloc_sync_mappings(), and * vmalloc_sync_unmappings() Most call-sites to vmalloc_sync_all() only care about new mappings being synchronized. The only exception is the new call-site added in the above mentioned commit. Shile Zhang directed us to a report of an 80% regression in reaim throughput. Fixes: `3f8fd02b1b` ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Shile Zhang <shile.zhang@linux.alibaba.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Borislav Petkov <bp@suse.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [GHES] Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20191009124418.8286-1-joro@8bytes.org Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/ Link: http://lkml.kernel.org/r/20191113095530.228959-1-shile.zhang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-21 18:56:06 -07:00
Vlastimil Babka	0715e6c516	mm, slub: prevent kmalloc_node crashes and memory leaks Sachin reports [1] a crash in SLUB __slab_alloc(): BUG: Kernel NULL pointer dereference on read at 0x000073b0 Faulting instruction address: 0xc0000000003d55f4 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 19 PID: 1 Comm: systemd Not tainted 5.6.0-rc2-next-20200218-autotest #1 NIP: c0000000003d55f4 LR: c0000000003d5b94 CTR: 0000000000000000 REGS: c0000008b37836d0 TRAP: 0300 Not tainted (5.6.0-rc2-next-20200218-autotest) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004844 XER: 00000000 CFAR: c00000000000dec4 DAR: 00000000000073b0 DSISR: 40000000 IRQMASK: 1 GPR00: c0000000003d5b94 c0000008b3783960 c00000000155d400 c0000008b301f500 GPR04: 0000000000000dc0 0000000000000002 c0000000003443d8 c0000008bb398620 GPR08: 00000008ba2f0000 0000000000000001 0000000000000000 0000000000000000 GPR12: 0000000024004844 c00000001ec52a00 0000000000000000 0000000000000000 GPR16: c0000008a1b20048 c000000001595898 c000000001750c18 0000000000000002 GPR20: c000000001750c28 c000000001624470 0000000fffffffe0 5deadbeef0000122 GPR24: 0000000000000001 0000000000000dc0 0000000000000002 c0000000003443d8 GPR28: c0000008b301f500 c0000008bb398620 0000000000000000 c00c000002287180 NIP ___slab_alloc+0x1f4/0x760 LR __slab_alloc+0x34/0x60 Call Trace: ___slab_alloc+0x334/0x760 (unreliable) __slab_alloc+0x34/0x60 __kmalloc_node+0x110/0x490 kvmalloc_node+0x58/0x110 mem_cgroup_css_online+0x108/0x270 online_css+0x48/0xd0 cgroup_apply_control_enable+0x2ec/0x4d0 cgroup_mkdir+0x228/0x5f0 kernfs_iop_mkdir+0x90/0xf0 vfs_mkdir+0x110/0x230 do_mkdirat+0xb0/0x1a0 system_call+0x5c/0x68 This is a PowerPC platform with following NUMA topology: available: 2 nodes (0-1) node 0 cpus: node 0 size: 0 MB node 0 free: 0 MB node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 node 1 size: 35247 MB node 1 free: 30907 MB node distances: node 0 1 0: 10 40 1: 40 10 possible numa nodes: 0-31 This only happens with a mmotm patch "mm/memcontrol.c: allocate shrinker_map on appropriate NUMA node" [2] which effectively calls kmalloc_node for each possible node. SLUB however only allocates kmem_cache_node on online N_NORMAL_MEMORY nodes, and relies on node_to_mem_node to return such valid node for other nodes since commit `a561ce00b0` ("slub: fall back to node_to_mem_node() node if allocating on memoryless node"). This is however not true in this configuration where the _node_numa_mem_ array is not initialized for nodes 0 and 2-31, thus it contains zeroes and get_partial() ends up accessing non-allocated kmem_cache_node. A related issue was reported by Bharata (originally by Ramachandran) [3] where a similar PowerPC configuration, but with mainline kernel without patch [2] ends up allocating large amounts of pages by kmalloc-1k kmalloc-512. This seems to have the same underlying issue with node_to_mem_node() not behaving as expected, and might probably also lead to an infinite loop with CONFIG_SLUB_CPU_PARTIAL [4]. This patch should fix both issues by not relying on node_to_mem_node() anymore and instead simply falling back to NUMA_NO_NODE, when kmalloc_node(node) is attempted for a node that's not online, or has no usable memory. The "usable memory" condition is also changed from node_present_pages() to N_NORMAL_MEMORY node state, as that is exactly the condition that SLUB uses to allocate kmem_cache_node structures. The check in get_partial() is removed completely, as the checks in ___slab_alloc() are now sufficient to prevent get_partial() being reached with an invalid node. [1] https://lore.kernel.org/linux-next/3381CD91-AB3D-4773-BA04-E7A072A63968@linux.vnet.ibm.com/ [2] https://lore.kernel.org/linux-mm/fff0e636-4c36-ed10-281c-8cdb0687c839@virtuozzo.com/ [3] https://lore.kernel.org/linux-mm/20200317092624.GB22538@in.ibm.com/ [4] https://lore.kernel.org/linux-mm/088b5996-faae-8a56-ef9c-5b567125ae54@suse.cz/ Fixes: `a561ce00b0` ("slub: fall back to node_to_mem_node() node if allocating on memoryless node") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Reported-by: PUVICHAKRAVARTHY RAMACHANDRAN <puvichakravarthy@in.ibm.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Tested-by: Bharata B Rao <bharata@linux.ibm.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Christopher Lameter <cl@linux.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200320115533.9604-1-vbabka@suse.cz Debugged-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-21 18:56:06 -07:00
Qian Cai	63886bad90	mm/mmu_notifier: silence PROVE_RCU_LIST warnings It is safe to traverse mm->notifier_subscriptions->list either under SRCU read lock or mm->notifier_subscriptions->lock using hlist_for_each_entry_rcu(). Silence the PROVE_RCU_LIST false positives, for example, WARNING: suspicious RCU usage ----------------------------- mm/mmu_notifier.c:484 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by libvirtd/802: #0: ffff9321e3f58148 (&mm->mmap_sem#2){++++}, at: do_mprotect_pkey+0xe1/0x3e0 #1: ffffffff91ae6160 (mmu_notifier_invalidate_range_start){+.+.}, at: change_p4d_range+0x5fa/0x800 #2: ffffffff91ae6e08 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x178/0x460 stack backtrace: CPU: 7 PID: 802 Comm: libvirtd Tainted: G I 5.6.0-rc6-next-20200317+ #2 Hardware name: HP ProLiant BL460c Gen8, BIOS I31 11/02/2014 Call Trace: dump_stack+0xa4/0xfe lockdep_rcu_suspicious+0xeb/0xf5 __mmu_notifier_invalidate_range_start+0x3ff/0x460 change_p4d_range+0x746/0x800 change_protection+0x1df/0x300 mprotect_fixup+0x245/0x3e0 do_mprotect_pkey+0x23b/0x3e0 __x64_sys_mprotect+0x51/0x70 do_syscall_64+0x91/0xae8 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Link: http://lkml.kernel.org/r/20200317175640.2047-1-cai@lca.pw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-21 18:56:06 -07:00
Roman Penyaev	1b53734bd0	epoll: fix possible lost wakeup on epoll_ctl() path This fixes possible lost wakeup introduced by commit `a218cc4914`. Originally modifications to ep->wq were serialized by ep->wq.lock, but in commit `a218cc4914` ("epoll: use rwlock in order to reduce ep_poll_callback() contention") a new rw lock was introduced in order to relax fd event path, i.e. callers of ep_poll_callback() function. After the change ep_modify and ep_insert (both are called on epoll_ctl() path) were switched to ep->lock, but ep_poll (epoll_wait) was using ep->wq.lock on wqueue list modification. The bug doesn't lead to any wqueue list corruptions, because wake up path and list modifications were serialized by ep->wq.lock internally, but actual waitqueue_active() check prior wake_up() call can be reordered with modifications of ep ready list, thus wake up can be lost. And yes, can be healed by explicit smp_mb(): list_add_tail(&epi->rdlink, &ep->rdllist); smp_mb(); if (waitqueue_active(&ep->wq)) wake_up(&ep->wp); But let's make it simple, thus current patch replaces ep->wq.lock with the ep->lock for wqueue modifications, thus wake up path always observes activeness of the wqueue correcty. Fixes: `a218cc4914` ("epoll: use rwlock in order to reduce ep_poll_callback() contention") Reported-by: Max Neunhoeffer <max@arangodb.com> Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Max Neunhoeffer <max@arangodb.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Jes Sorensen <jes.sorensen@gmail.com> Cc: <stable@vger.kernel.org> [5.1+] Link: http://lkml.kernel.org/r/20200214170211.561524-1-rpenyaev@suse.de References: https://bugzilla.kernel.org/show_bug.cgi?id=205933 Bisected-by: Max Neunhoeffer <max@arangodb.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-21 18:56:06 -07:00
Michal Hocko	12e967fd8e	mm: do not allow MADV_PAGEOUT for CoW pages Jann has brought up a very interesting point [1]. While shared pages are excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed that way. This can lead to all sorts of hard to debug problems. E.g. performance problems outlined by Daniel [2]. There are runtime environments where there is a substantial memory shared among security domains via CoW memory and a easy to reclaim way of that memory, which MADV_{COLD,PAGEOUT} offers, can lead to either performance degradation in for the parent process which might be more privileged or even open side channel attacks. The feasibility of the latter is not really clear to me TBH but there is no real reason for exposure at this stage. It seems there is no real use case to depend on reclaiming CoW memory via madvise at this stage so it is much easier to simply disallow it and this is what this patch does. Put it simply MADV_{PAGEOUT,COLD} can operate only on the exclusively owned memory which is a straightforward semantic. [1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com [2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com Fixes: `9c276cc65a` ("mm: introduce MADV_COLD") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Daniel Colascione <dancol@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200312082248.GS23944@dhcp22.suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-03-21 18:56:06 -07:00

... 5 6 7 8 9 ...

904296 Commits