linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-16 13:17:02 +07:00

Author	SHA1	Message	Date
Sebastian Andrzej Siewior	822f312d47	kvm: x86: make kvm_{load\|put}_guest_fpu() static The functions kvm_load_guest_fpu() kvm_put_guest_fpu() are only used locally, make them static. This requires also that both functions are moved because they are used before their implementation. Those functions were exported (via EXPORT_SYMBOL) before commit `e5bb40251a` ("KVM: Drop kvm_{load,put}_guest_fpu() exports"). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:43 +02:00
Vitaly Kuznetsov	a1efa9b700	x86/hyper-v: rename ipi_arg_{ex,non_ex} structures These structures are going to be used from KVM code so let's make their names reflect their Hyper-V origin. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:42 +02:00
Sean Christopherson	d264ee0c2e	KVM: VMX: use preemption timer to force immediate VMExit A VMX preemption timer value of '0' is guaranteed to cause a VMExit prior to the CPU executing any instructions in the guest. Use the preemption timer (if it's supported) to trigger immediate VMExit in place of the current method of sending a self-IPI. This ensures that pending VMExit injection to L1 occurs prior to executing any instructions in the guest (regardless of nesting level). When deferring VMExit injection, KVM generates an immediate VMExit from the (possibly nested) guest by sending itself an IPI. Because hardware interrupts are blocked prior to VMEnter and are unblocked (in hardware) after VMEnter, this results in taking a VMExit(INTR) before any guest instruction is executed. But, as this approach relies on the IPI being received before VMEnter executes, it only works as intended when KVM is running as L0. Because there are no architectural guarantees regarding when IPIs are delivered, when running nested the INTR may "arrive" long after L2 is running e.g. L0 KVM doesn't force an immediate switch to L1 to deliver an INTR. For the most part, this unintended delay is not an issue since the events being injected to L1 also do not have architectural guarantees regarding their timing. The notable exception is the VMX preemption timer[1], which is architecturally guaranteed to cause a VMExit prior to executing any instructions in the guest if the timer value is '0' at VMEnter. Specifically, the delay in injecting the VMExit causes the preemption timer KVM unit test to fail when run in a nested guest. Note: this approach is viable even on CPUs with a broken preemption timer, as broken in this context only means the timer counts at the wrong rate. There are no known errata affecting timer value of '0'. [1] I/O SMIs also have guarantees on when they arrive, but I have no idea if/how those are emulated in KVM. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> [Use a hook for SVM instead of leaving the default in x86.c - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:42 +02:00
Sean Christopherson	f459a707ed	KVM: VMX: modify preemption timer bit only when arming timer Provide a singular location where the VMX preemption timer bit is set/cleared so that future usages of the preemption timer can ensure the VMCS bit is up-to-date without having to modify unrelated code paths. For example, the preemption timer can be used to force an immediate VMExit. Cache the status of the timer to avoid redundant VMREAD and VMWRITE, e.g. if the timer stays armed across multiple VMEnters/VMExits. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:51:41 +02:00
Sean Christopherson	4c008127e4	KVM: VMX: immediately mark preemption timer expired only for zero value A VMX preemption timer value of '0' at the time of VMEnter is architecturally guaranteed to cause a VMExit prior to the CPU executing any instructions in the guest. This architectural definition is in place to ensure that a previously expired timer is correctly recognized by the CPU as it is possible for the timer to reach zero and not trigger a VMexit due to a higher priority VMExit being signalled instead, e.g. a pending #DB that morphs into a VMExit. Whether by design or coincidence, commit `f4124500c2` ("KVM: nVMX: Fully emulate preemption timer") special cased timer values of '0' and '1' to ensure prompt delivery of the VMExit. Unlike '0', a timer value of '1' has no has no architectural guarantees regarding when it is delivered. Modify the timer emulation to trigger immediate VMExit if and only if the timer value is '0', and document precisely why '0' is special. Do this even if calibration of the virtual TSC failed, i.e. VMExit will occur immediately regardless of the frequency of the timer. Making only '0' a special case gives KVM leeway to be more aggressive in ensuring the VMExit is injected prior to executing instructions in the nested guest, and also eliminates any ambiguity as to why '1' is a special case, e.g. why wasn't the threshold for a "short timeout" set to 10, 100, 1000, etc... Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:46 +02:00
Andy Shevchenko	a101c9d63e	KVM: SVM: Switch to bitmap_zalloc() Switch to bitmap_zalloc() to show clearly what we are allocating. Besides that it returns pointer of bitmap type instead of opaque void *. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:45 +02:00
Tianyu Lan	9a9845867c	KVM/MMU: Fix comment in walk_shadow_page_lockless_end() kvm_commit_zap_page() has been renamed to kvm_mmu_commit_zap_page() This patch is to fix the commit. Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:45 +02:00
Lei Yang	6bd317d3c8	kvm: selftests: use -pthread instead of -lpthread I run into the following error testing/selftests/kvm/dirty_log_test.c:285: undefined reference to `pthread_create' testing/selftests/kvm/dirty_log_test.c:297: undefined reference to `pthread_join' collect2: error: ld returned 1 exit status my gcc version is gcc version 4.8.4 "-pthread" would work everywhere Signed-off-by: Lei Yang <Lei.Yang@windriver.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:44 +02:00
Wei Yang	83b20b28c6	KVM: x86: don't reset root in kvm_mmu_setup() Here is the code path which shows kvm_mmu_setup() is invoked after kvm_mmu_create(). Since kvm_mmu_setup() is only invoked in this code path, this means the root_hpa and prev_roots are guaranteed to be invalid. And it is not necessary to reset it again. kvm_vm_ioctl_create_vcpu() kvm_arch_vcpu_create() vmx_create_vcpu() kvm_vcpu_init() kvm_arch_vcpu_init() kvm_mmu_create() kvm_arch_vcpu_setup() kvm_mmu_setup() kvm_init_mmu() This patch set reset_roots to false in kmv_mmu_setup(). Fixes: `50c28f21d0` Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:44 +02:00
Junaid Shahid	d35b34a9a7	kvm: mmu: Don't read PDPTEs when paging is not enabled kvm should not attempt to read guest PDPTEs when CR0.PG = 0 and CR4.PAE = 1. Signed-off-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:43 +02:00
Vitaly Kuznetsov	d176620277	x86/kvm/lapic: always disable MMIO interface in x2APIC mode When VMX is used with flexpriority disabled (because of no support or if disabled with module parameter) MMIO interface to lAPIC is still available in x2APIC mode while it shouldn't be (kvm-unit-tests): PASS: apic_disable: Local apic enabled in x2APIC mode PASS: apic_disable: CPUID.1H:EDX.APIC[bit 9] is set FAIL: apic_disable: *0xfee00030: 50014 The issue appears because we basically do nothing while switching to x2APIC mode when APIC access page is not used. apic_mmio_{read,write} only check if lAPIC is disabled before proceeding to actual write. When APIC access is virtualized we correctly manipulate with VMX controls in vmx_set_virtual_apic_mode() and we don't get vmexits from memory writes in x2APIC mode so there's no issue. Disabling MMIO interface seems to be easy. The question is: what do we do with these reads and writes? If we add apic_x2apic_mode() check to apic_mmio_in_range() and return -EOPNOTSUPP these reads and writes will go to userspace. When lAPIC is in kernel, Qemu uses this interface to inject MSIs only (see kvm_apic_mem_write() in hw/i386/kvm/apic.c). This somehow works with disabled lAPIC but when we're in xAPIC mode we will get a real injected MSI from every write to lAPIC. Not good. The simplest solution seems to be to just ignore writes to the region and return ~0 for all reads when we're in x2APIC mode. This is what this patch does. However, this approach is inconsistent with what currently happens when flexpriority is enabled: we allocate APIC access page and create KVM memory region so in x2APIC modes all reads and writes go to this pre-allocated page which is, btw, the same for all vCPUs. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-09-20 00:26:43 +02:00
Greg Kroah-Hartman	eb9a29f9e5	Various bug fixes for nct6775 driver -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJbon6nAAoJEMsfJm/On5mByxoP/3Z5fQWqWKZHlOYgej2xPR8I 9gMTwCca95awTgD/0toiQLOlwfevIwbdk0MWJSkr3fIqoFY/hOwwvF3mqOo/i6oN f8/oFTE0IKoaW0glNWkKK+kZb3d1WopJ1SP9is6KGYhx3b462mS0/i/on6VgQ1Yv HitLKs+UF+8rEDnXRMECnbg3gw8lznoxExOSIsxW3m8uWIgIQECdN1DHDza9QYux 4CLf1gHhk4q4eCgZCRXVPM4NeQlXrPCevnTbLkBUfqR1lCNohqJf2oNcDZYLlCYs uQAcByLkfPcifP7a9zMB7QVsWrZmOFgiTIS+Ct9d1XhA2n1+hUA+W3up79JXpr/n GS2V1b9KktL5XyRIUmRR5OoV09x2eAuuwLfq/d/Ff7z2KT+gZHdJqz1II0lzivtb xgNql/JBjmQvu1fX+g3PYqtXY/BVyNtUMWOtNnfgTBX39Z7L28P1HPCdRjglVKZi JVGD5G8XYhVfB3RMSE7b19lFOeRDvnoCNCiU4fmKPRI8IeT6aQUT5sOwjkxtrKkP pmSaHqYgawWvTojpD9DrhwHR77WIi1n482OFwB5XjCxc70LaJ0RC0wtRCPhQ1HtB cOY/JUt0biqHoX3UD1tcjMl+x3sQ23TViI5hik3iOJ7rwg6gwjL/Mj+LzJRKPfZ0 906DisKO8TUM/tJlULx9 =rHpH -----END PGP SIGNATURE----- Merge tag 'hwmon-for-linus-v4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Guenter writes: "Various bug fixes for nct6775 driver"	2018-09-19 22:59:30 +02:00
Greg Kroah-Hartman	6ad49fa199	SCSI fixes on 20180919 A couple of small but important fixes, one affecting big endian and the other fixing a BUG_ON in scatterlist processing. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCW6IviSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdrPAP4h8ouR z4qewZsVK9hwySouIfC2xAPRu7aFUBEPw12O4gEAshqLg/61w7PYS0t9NjQVpRw3 nR6xr6ymTbImn09w1Wg= =MQ30 -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi James writes: "SCSI fixes on 20180919 A couple of small but important fixes, one affecting big endian and the other fixing a BUG_ON in scatterlist processing. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>"	2018-09-19 22:34:22 +02:00
Icenowy Zheng	558a9ef94a	drm: sun4i: drop second PLL from A64 HDMI PHY The A64 HDMI PHY seems to be not able to use the second video PLL as clock parent in experiments. Drop the support for the second PLL from A64 HDMI PHY driver. Fixes: `b46e2c9f5f` ("drm/sun4i: Add support for A64 HDMI PHY") Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180916043409.62374-2-icenowy@aosc.io	2018-09-19 09:58:40 +02:00
Greg Kroah-Hartman	4ca719a338	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Crypto stuff from Herbert: "This push fixes a potential boot hang in ccp and an incorrect CPU capability check in aegis/morus on x86." * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: x86/aegis,morus - Do not require OSXSAVE for SSE2 crypto: ccp - add timeout support in the SEV command	2018-09-19 08:29:42 +02:00
Greg Kroah-Hartman	f21f7fa263	Vaibhav Nagarnaik found that modifying the ring buffer size could cause a huge latency in the system because it does a while loop to free pages without releasing the CPU (on non preempt kernels). In a case where there are hundreds of thousands of pages to free it could actually cause a system stall. A properly place cond_resched() solves this issue. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCW6GGJhQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qo2dAQDN4SUsItEc28ij5vYKoP1mSLt8aax1 1UoIHrh1pTLUMQD+PSlbtZnUq27vfGwyEFrIWLQ5eeDy3IESkQzoXWcs0gY= =HpN3 -----END PGP SIGNATURE----- Merge tag 'trace-v4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Steven writes: "Vaibhav Nagarnaik found that modifying the ring buffer size could cause a huge latency in the system because it does a while loop to free pages without releasing the CPU (on non preempt kernels). In a case where there are hundreds of thousands of pages to free it could actually cause a system stall. A properly place cond_resched() solves this issue."	2018-09-19 07:41:46 +02:00
Greg Kroah-Hartman	eba2d6b34a	platform-drivers-x86 for v4.19-2 Free allocated ACPI buffers in two drivers. The following is an automated git shortlog grouped by driver: alienware-wmi: - Correct a memory leak dell-smbios-wmi: - Correct a memory leak -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJboX3QAAoJEKbMaAwKp364hGwH/0CydzV81cQNsZijbLG7cZGW FF4cRwsdHGbjq2jwi1OWGV9zGUrIrambL3WjXCjogU8T+iOSt/ng+TslBgwuhc/T d/bggNuV0jw+2oKVpdHVRCedEVlhUqLJBn/8VInBcHP30vOKUNdzZSIymJvpYksi wCJph04RsKN2BR2rtyiKRuQO4iKuaRfAcQz2CPg5aUftQ1im/+Ksj5OuhcYYq0m5 lo8s8ZphzRHORkoTwNVP8zsdubH83FJeR6S4WVQmcqzK6TfKgmOldR3CKqmaZUcl bbQ8ky9MA3GtYkaMafc8sViXQW0ugVplwaRs9gCdPIMnCzu0SWqa5RNXafjxHEQ= =X2ER -----END PGP SIGNATURE----- Merge tag 'platform-drivers-x86-v4.19-2' of git://git.infradead.org/linux-platform-drivers-x86 Darren writes: "platform-drivers-x86 for v4.19-2 Free allocated ACPI buffers in two drivers. The following is an automated git shortlog grouped by driver: alienware-wmi: - Correct a memory leak dell-smbios-wmi: - Correct a memory leak" * tag 'platform-drivers-x86-v4.19-2' of git://git.infradead.org/linux-platform-drivers-x86: platform/x86: alienware-wmi: Correct a memory leak platform/x86: dell-smbios-wmi: Correct a memory leak	2018-09-19 07:21:21 +02:00
Rodrigo Vivi	a530bf948a	Merge tag 'gvt-fixes-2018-09-18' of https://github.com/intel/gvt-linux into drm-intel-fixes gvt-fixes-2018-09-18 - Fix initial DPIO PHY register state for BXT (Colin) - BXT untracked GEN9_CLKGATE_DIS_4 warning fix (Colin) - Fix srcu lock for GFN valid check (Weinan) - Should clear GGTT entry value after vGPU destroy (Zhipeng) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> From: Zhenyu Wang <zhenyuw@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180918073349.GQ20737@zhen-hp.sh.intel.com	2018-09-18 08:09:44 -07:00
Paolo Bonzini	1795f81f61	Second set of PPC KVM fixes for 4.19 Two fixes for KVM on POWER machines. Both of these relate to memory corruption and host crashes seen when transparent huge pages are enabled. The first fixes a host crash that can occur when a DMA mapping is removed by the guest and the page mapped was part of a transparent huge page; the second fixes corruption that could occur when a hypervisor page fault for a radix guest is being serviced at the same time that the backing page is being collapsed or split. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJbmFIyAAoJEJ2a6ncsY3Gf+L8H/jRQ0ONUpv2xrgirXdPmfuVv xIVejn5chiygpo3ZY2YkRGjqMoX8usA5pDQONk9duoc48FedSjjmurfAkSA8NESI y6DSRGB6pir/reP/7tBVk0eeeMBjbYnHPA7KfI8ijK424VmRpCT5stiUm7gQvSEm LSRUSLwWKfCCjU78HVtiTuK865WZifrOCy6wiNEl79F1K6T1A+LeGaKrcDLjeK/Q GsNSbwBK37BOvcsm0W1xrlnCmYtR/nVrhjTFMc5noBuc4znQd3wxitgiInFsOH5V LUWL6IStFkbGKSxVZuJilkhVF58AAisrJnwvlZsjrExWYf1J42kbyvVoURt0O8I= =blkZ -----END PGP SIGNATURE----- Merge tag 'kvm-ppc-fixes-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD Second set of PPC KVM fixes for 4.19 Two fixes for KVM on POWER machines. Both of these relate to memory corruption and host crashes seen when transparent huge pages are enabled. The first fixes a host crash that can occur when a DMA mapping is removed by the guest and the page mapped was part of a transparent huge page; the second fixes corruption that could occur when a hypervisor page fault for a radix guest is being serviced at the same time that the backing page is being collapsed or split.	2018-09-18 15:13:00 +02:00
Paolo Bonzini	cb5fb87a2f	KVM: s390: Fixes for 4.19 - more fallout from the hugetlbfs enablement - bugfix for vma handling -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJbm68DAAoJEBF7vIC1phx8i3cP/R3UwXGrTq3u5+78PfUu+Nvm zTSgHID25/pCsytvRcK7SSJQt+h53RjuiQlOcrtSJZvq3Btt23gl+pFltqvWGZDM PYgy+LaaaoIXNeOJqwkwcnNsq4usQ0N1jQ8EgNUxEc2EBkffA6mKcTS0lWDm47AE Yxsyz0RsvwcbYCOj10s1b4D9h97+ZkqAC2RDnafzQOb54j/aOH7blsStNDXfdkte xHJ7mELOwAWre2QhD7OeTX1aCjBfKUJzWK2sPxm7qt5nEtH18o9F6j2Y/d8dnWg8 kqthyyb+nH0aHgv3G+7sKBOp1Ra7ftHAVrrbh3m9hceaW5+VuEhWSN6zVZKVdWta 4bjiWa9I1gKS/Wz479ztdbpt+koRAYSg5CLHAD7YsBR3hHtnAZwuRO38S4Xg+Tfb Hp4Lhf8gnq9yQqrcyfKWQKtU9mb84oNtEw/wQPUvr2tZyeoB5NbnVy5UDa9V1DRS FnF1/MIZ/MszkwbNEnhk+WWFy41h+uduizfXwQ/YO/xciEEdM2TaDHuLIpBegiSO 4kFfHRSMxqbwNhQSsJI8dIFpSHqXrtn2p6JTsWdi94B0QBYaNm4XMNpgEZIs5CsE aNGhQa+IZ5jk6YsGMx4mqAmV2jN9bMEoKR/SaOvrSkPvAO9QxZP4V6W72YGoi6le RZK8Z+HGxg6qpeaAiG0t =TT85 -----END PGP SIGNATURE----- Merge tag 'kvm-s390-master-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes for 4.19 - more fallout from the hugetlbfs enablement - bugfix for vma handling	2018-09-18 15:12:51 +02:00
Dave Airlie	57078338b2	drm: fix drm_drv_uses_atomic_modeset on non modesetting drivers. vgem seems to oops on the intel CI due to the vgem debugfs init hitting this path now. Check if we have mode_config funcs before checking one. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180918062018.24942-1-airlied@gmail.com	2018-09-18 11:17:06 +02:00
Boris Brezillon	4a3e85f267	mtd: devices: m25p80: Make sure the buffer passed in op is DMA-able As documented in spi-mem.h, spi_mem_op->data.buf.{in,out} must be DMA-able, and commit `4120f8d158` ("mtd: spi-nor: Use the spi_mem_xx() API") failed to follow this rule as buffers passed to ->{read,write}_reg() are usually placed on the stack. Fix that by allocating a scratch buffer and copying the data around. Fixes: `4120f8d158` ("mtd: spi-nor: Use the spi_mem_xx() API") Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>	2018-09-18 10:17:48 +02:00
Greg Kroah-Hartman	5211da9ca5	Merge gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net Dave writes: "Various fixes, all over the place: 1) OOB data generation fix in bluetooth, from Matias Karhumaa. 2) BPF BTF boundary calculation fix, from Martin KaFai Lau. 3) Don't bug on excessive frags, to be compatible in situations mixing older and newer kernels on each end. From Juergen Gross. 4) Scheduling in RCU fix in hv_netvsc, from Stephen Hemminger. 5) Zero keying information in TLS layer before freeing copies of them, from Sabrina Dubroca. 6) Fix NULL deref in act_sample, from Davide Caratti. 7) Orphan SKB before GRO in veth to prevent crashes with XDP, from Toshiaki Makita. 8) Fix use after free in ip6_xmit, from Eric Dumazet. 9) Fix VF mac address regression in bnxt_en, from Micahel Chan. 10) Fix MSG_PEEK behavior in TLS layer, from Daniel Borkmann. 11) Programming adjustments to r8169 which fix not being to enter deep sleep states on some machines, from Kai-Heng Feng and Hans de Goede. 12) Fix DST_NOCOUNT flag handling for ipv6 routes, from Peter Oskolkov." * gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (45 commits) net/ipv6: do not copy dst flags on rt init qmi_wwan: set DTR for modems in forced USB2 mode clk: x86: Stop marking clocks as CLK_IS_CRITICAL r8169: Get and enable optional ether_clk clock clk: x86: add "ether_clk" alias for Bay Trail / Cherry Trail r8169: enable ASPM on RTL8106E r8169: Align ASPM/CLKREQ setting function with vendor driver Revert "kcm: remove any offset before parsing messages" kcm: remove any offset before parsing messages net: ethernet: Fix a unused function warning. net: dsa: mv88e6xxx: Fix ATU Miss Violation tls: fix currently broken MSG_PEEK behavior hv_netvsc: pair VF based on serial number PCI: hv: support reporting serial number as slot information bnxt_en: Fix VF mac address regression. ipv6: fix possible use-after-free in ip6_xmit() net: hp100: fix always-true check for link up state ARM: dts: at91: add new compatibility string for macb on sama5d3 net: macb: disable scatter-gather for macb on sama5d3 net: mvpp2: let phylink manage the carrier state ...	2018-09-18 09:31:53 +02:00
Peter Oskolkov	30bfd93062	net/ipv6: do not copy dst flags on rt init DST_NOCOUNT in dst_entry::flags tracks whether the entry counts toward route cache size (net->ipv6.sysctl.ip6_rt_max_size). If the flag is NOT set, dst_ops::pcpuc_entries counter is incremented in dist_init() and decremented in dst_destroy(). This flag is tied to allocation/deallocation of dst_entry and should not be copied from another dst/route. Otherwise it can happen that dst_ops::pcpuc_entries counter grows until no new routes can be allocated because the counter reached ip6_rt_max_size due to DST_NOCOUNT not set and thus no counter decrements on gc-ed routes. Fixes: `3b6761d18b` ("net/ipv6: Move dst flags to booleans in fib entries") Cc: David Ahern <dsahern@gmail.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 19:42:14 -07:00
Zhipeng Gong	7759ca3aac	drm/i915/gvt: clear ggtt entries when destroy vgpu When one vgpu is destroyed, its ggtt entries are not cleared. This patch clears ggtt entries to avoid information leak. v2: add 'Fixes' tag (Zhenyu) Fixes: `2707e44466` ("drm/i915/gvt: vGPU graphics memory virtualization") Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com> Reviewed-by: Hang Yuan <hang.yuan@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-09-18 10:39:44 +08:00
Weinan Li	a1ac5f0943	drm/i915/gvt: request srcu_read_lock before checking if one gfn is valid Fix the suspicious RCU usage issue in intel_vgpu_emulate_mmio_write. Here need to request the srcu read lock of kvm->srcu before doing gfn_to_memslot(). The detailed log is as below: [ 218.710688] ============================= [ 218.710690] WARNING: suspicious RCU usage [ 218.710693] 4.14.15-dd+ #314 Tainted: G U [ 218.710695] ----------------------------- [ 218.710697] ./include/linux/kvm_host.h:575 suspicious rcu_dereference_check() usage! [ 218.710699] other info that might help us debug this: [ 218.710702] rcu_scheduler_active = 2, debug_locks = 1 [ 218.710704] 1 lock held by qemu-system-x86/2144: [ 218.710706] #0: (&gvt->lock){+.+.}, at: [<ffffffff816a1eea>] intel_vgpu_emulate_mmio_write+0x5a/0x2d0 [ 218.710721] stack backtrace: [ 218.710724] CPU: 0 PID: 2144 Comm: qemu-system-x86 Tainted: G U 4.14.15-dd+ #314 [ 218.710727] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.1.1 10/07/2015 [ 218.710729] Call Trace: [ 218.710734] dump_stack+0x7c/0xb3 [ 218.710739] gfn_to_memslot+0x15f/0x170 [ 218.710743] kvm_is_visible_gfn+0xa/0x30 [ 218.710746] intel_vgpu_emulate_gtt_mmio_write+0x267/0x3c0 [ 218.710751] ? __mutex_unlock_slowpath+0x3b/0x260 [ 218.710754] intel_vgpu_emulate_mmio_write+0x182/0x2d0 [ 218.710759] intel_vgpu_rw+0xba/0x170 [kvmgt] [ 218.710763] intel_vgpu_write+0x14d/0x1a0 [kvmgt] [ 218.710767] __vfs_write+0x23/0x130 [ 218.710770] vfs_write+0xb0/0x1b0 [ 218.710774] SyS_pwrite64+0x73/0x90 [ 218.710777] entry_SYSCALL_64_fastpath+0x25/0x9c [ 218.710780] RIP: 0033:0x7f33e8a91da3 [ 218.710783] RSP: 002b:00007f33dddc8700 EFLAGS: 00000293 v2: add 'Fixes' tag, refine log format.(Zhenyu) Fixes: `cc753fbe1a` ("drm/i915/gvt: validate gfn before set shadow page") Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-09-18 10:37:55 +08:00
Colin Xu	d817de3bc1	drm/i915/gvt: Add GEN9_CLKGATE_DIS_4 to default BXT mmio handler Host prints lots of untracked MMIO at 0x4653c when creating linux guest. "gvt: vgpu 2: untracked MMIO 0004653c len 4" GEN9_CLKGATE_DIS_4 (0x4653c) is accessed by i915 for gmbus clockgating. However vgpu doesn't support any clockgating powergating operations on related mmio access trap so need add it to default handler. GEN9_CLKGATE_DIS_4 is accessed in bxt_gmbus_clock_gating() which only applies to GEN9_LP so doens't show the warning on other platforms. The solution is to add it to default handler init_bxt_mmio_info(). Reviewed-by: He, Min <min.he@intel.com> Signed-off-by: Colin Xu <colin.xu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-09-18 10:37:44 +08:00
Colin Xu	db7c8f1e5f	drm/i915/gvt: Init PHY related registers for BXT Recent patch fixed the call trace "ERROR Port B enabled but PHY powered down? (PHY_CTL 00000000)". but introduced another similar call trace shown as: "ERROR Port C enabled but PHY powered down? (PHY_CTL 00000200)". The call trace will appear when host and guest enabled different ports, i.e. host using PORT C or neither PORT is enabled, while guest is always using PORT B as simulated by gvt. The issue is actually covered previously before the commit and reverals now when the commit do the right thing. On BXT, some PHY registers are initialized by vbios, before i915 loaded. Later i915 will re-program some, or skip some based on the implementation. The initialized mmio for guest i915 is done by gvt, based on the snapshot taken from host. If host and guest have different PORT enabled, some DPIO PHY mmios that gvt initialized for guest i915 will not match the simualted monitor for guest, which leads to guest i915 print the calltrace when it's trying to enable PHY and PORT. The solution is to init these DPIO PHY registers to default value, then guest i915 will program them to reasonable value based on the default powerwell table and enabled PORT. Together with the old patch, all similar call trace in guest kernel on BXT can be resolved. v2: Move PHY register init to intel_vgpu_reset_mmio (Min) v3: Do not delete empty line in issue fix patch. (zhenyu) Fixes: `c8ab5ac30c` ("drm/i915/gvt: Make correct handling to vreg BXT_PHY_CTL_FAMILY") Reviewed-by: He, Min <min.he@intel.com> Signed-off-by: Colin Xu <colin.xu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-09-18 10:37:29 +08:00
Bjørn Mork	922005c7f5	qmi_wwan: set DTR for modems in forced USB2 mode Recent firmware revisions have added the ability to force these modems to USB2 mode, hiding their SuperSpeed capabilities from the host. The driver has been using the SuperSpeed capability, as shown by the bcdUSB field of the device descriptor, to detect the need to enable the DTR quirk. This method fails when the modems are forced to USB2 mode by the modem firmware. Fix by unconditionally enabling the DTR quirk for the affected device IDs. Reported-by: Fred Veldini <fred.veldini@gmail.com> Reported-by: Deshu Wen <dwen@sierrawireless.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Reported-by: Fred Veldini <fred.veldini@gmail.com> Reported-by: Deshu Wen <dwen@sierrawireless.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 19:23:27 -07:00
David S. Miller	89bfd48d67	Merge branch 'r8169-clk-fixes' Hans de Goede says: ==================== r8169 (x86) clk fixes to fix S0ix not being reached This series adds code to the r8169 ethernet driver to get and enable an external clock if present, avoiding the need for a hack in the clk-pmc-atom driver where that clock was left on continuesly causing x86 some devices to not reach deep power saving states (S0ix) when suspended causing to them to quickly drain their battery while suspended. The 3 commits in this series need to be merged in order to avoid regressions while bisecting. The clk-pmc-atom driver does not see much changes (it was last touched over a year ago). So the clk maintainers have agreed with merging all 3 patches through the net tree. All 3 patches have Stephen Boyd's Acked-by for this purpose. This v2 of the series only had some minor tweaks done to the commit messages and is ready for merging through the net tree now. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 18:47:58 -07:00
Hans de Goede	648e921888	clk: x86: Stop marking clocks as CLK_IS_CRITICAL Commit `d31fd43c0f` ("clk: x86: Do not gate clocks enabled by the firmware"), which added the code to mark clocks as CLK_IS_CRITICAL, causes all unclaimed PMC clocks on Cherry Trail devices to be on all the time, resulting on the device not being able to reach S0i3 when suspended. The reason for this commit is that on some Bay Trail / Cherry Trail devices the r8169 ethernet controller uses pmc_plt_clk_4. Now that the clk-pmc-atom driver exports an "ether_clk" alias for pmc_plt_clk_4 and the r8169 driver has been modified to get and enable this clock (if present) the marking of the clocks as CLK_IS_CRITICAL is no longer necessary. This commit removes the CLK_IS_CRITICAL marking, fixing Cherry Trail devices not being able to reach S0i3 greatly decreasing their battery drain when suspended. Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861 Cc: Johannes Stezenbach <js@sig21.net> Cc: Carlo Caione <carlo@endlessm.com> Reported-by: Johannes Stezenbach <js@sig21.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 18:47:58 -07:00
Hans de Goede	c2f6f3ee7f	r8169: Get and enable optional ether_clk clock On some boards a platform clock is used as clock for the r8169 chip, this commit adds support for getting and enabling this clock (assuming it has an "ether_clk" alias set on it). This is related to commit `d31fd43c0f` ("clk: x86: Do not gate clocks enabled by the firmware") which is a previous attempt to fix this for some x86 boards, but this causes all Cherry Trail SoC using boards to not reach there lowest power states when suspending. This commit (together with an atom-pmc-clk driver commit adding the alias) fixes things properly by making the r8169 get the clock and enable it when it needs it. Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861 Cc: Johannes Stezenbach <js@sig21.net> Cc: Carlo Caione <carlo@endlessm.com> Reported-by: Johannes Stezenbach <js@sig21.net> Acked-by: Stephen Boyd <sboyd@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 18:47:58 -07:00
Hans de Goede	b1e3454d39	clk: x86: add "ether_clk" alias for Bay Trail / Cherry Trail Commit `d31fd43c0f` ("clk: x86: Do not gate clocks enabled by the firmware") causes all unclaimed PMC clocks on Cherry Trail devices to be on all the time, resulting on the device not being able to reach S0i2 or S0i3 when suspended. The reason for this commit is that on some Bay Trail / Cherry Trail devices the ethernet controller uses pmc_plt_clk_4. This commit adds an "ether_clk" alias, so that the relevant ethernet drivers can try to (optionally) use this, without needing X86 specific code / hacks, thus fixing ethernet on these devices without breaking S0i3 support. This commit uses clkdev_hw_create() to create the alias, mirroring the code for the already existing "mclk" alias for pmc_plt_clk_3. Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861 Cc: Johannes Stezenbach <js@sig21.net> Cc: Carlo Caione <carlo@endlessm.com> Reported-by: Johannes Stezenbach <js@sig21.net> Acked-by: Stephen Boyd <sboyd@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 18:47:58 -07:00
Kai-Heng Feng	0866cd1502	r8169: enable ASPM on RTL8106E The Intel SoC was prevented from entering lower idle state because of RTL8106E's ASPM was not enabled. So enable ASPM on RTL8106E (chip version 39). Now the Intel SoC can enter lower idle state, power consumption and temperature are much lower. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 18:45:55 -07:00
Kai-Heng Feng	94235460f9	r8169: Align ASPM/CLKREQ setting function with vendor driver There's a small delay after setting ASPM in vendor drivers, r8101 and r8168. In addition, those drivers enable ASPM before ClkReq, also change that to align with vendor driver. I haven't seen anything bad becasue of this, but I think it's better to keep in sync with vendor driver. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 18:45:55 -07:00
David S. Miller	3275b4df3c	Revert "kcm: remove any offset before parsing messages" This reverts commit `072222b488`. I just read that this causes regressions. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 18:43:42 -07:00
Dominique Martinet	072222b488	kcm: remove any offset before parsing messages The current code assumes kcm users know they need to look for the strparser offset within their bpf program, which is not documented anywhere and examples laying around do not do. The actual recv function does handle the offset well, so we can create a temporary clone of the skb and pull that one up as required for parsing. The pull itself has a cost if we are pulling beyond the head data, measured to 2-3% latency in a noisy VM with a local client stressing that path. The clone's impact seemed too small to measure. This bug can be exhibited easily by implementing a "trivial" kcm parser taking the first bytes as size, and on the client sending at least two such packets in a single write(). Note that bpf sockmap has the same problem, both for parse and for recv, so it would pulling twice or a real pull within the strparser logic if anyone cares about that. Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 18:42:39 -07:00
Lyude Paul	3c499ea0c6	drm/atomic: Use drm_drv_uses_atomic_modeset() for debugfs creation As pointed out by Daniel Vetter, we should be usinng drm_drv_uses_atomic_modeset() for determining whether or not we want to make the debugfs nodes for atomic instead of checking DRIVER_ATOMIC, as the former isn't an accurate representation of whether or not the driver is actually using atomic modesetting internally (even though it might not be exposing atomic capabilities). Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Reviewed-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20180917173733.21293-1-lyude@redhat.com	2018-09-17 19:24:37 -04:00
Vaibhav Nagarnaik	83f365554e	ring-buffer: Allow for rescheduling when removing pages When reducing ring buffer size, pages are removed by scheduling a work item on each CPU for the corresponding CPU ring buffer. After the pages are removed from ring buffer linked list, the pages are free()d in a tight loop. The loop does not give up CPU until all pages are removed. In a worst case behavior, when lot of pages are to be freed, it can cause system stall. After the pages are removed from the list, the free() can happen while the work is rescheduled. Call cond_resched() in the loop to prevent the system hangup. Link: http://lkml.kernel.org/r/20180907223129.71994-1-vnagarnaik@google.com Cc: stable@vger.kernel.org Fixes: `83f40318da` ("ring-buffer: Make removal of ring buffer pages atomic") Reported-by: Jason Behmer <jbehmer@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-09-17 18:15:11 -04:00
Greg Kroah-Hartman	3918c21eac	spi: Fixes for v4.19 As well as one driver fix there's a couple of fixes here which address issues with the use of IDRs for allocation of dynamic bus numbers, ensuring that dynamic bus numbers interact well with static bus numbers assigned via DT and otherwise. -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAluf2foTHGJyb29uaWVA a2VybmVsLm9yZwAKCRAk1otyXVSH0DooB/4/c+c9nX31WeKh5EUFS3WJUQ4quHC+ ZPS0rqliJm0KQxdSCU6ig2+xbobZWAs9T4ZXPlE7PNk6DmqdtATS5jlzbwmoLvdA PvxsBwz3dRdIGR/BNbDYEZCb0WGtMHO6BR5c//lBCy+ea5oNENi0w0mFnY+AVeUt ivM55i/nNmV4DReT3rl5mRz/TQgfI9zc11DPpqDnlQML3emYHmJ+hZa8/1g68d8C lQHLIQMo6hGyOd3p6uPODGt98cDKIuYl9+fcYVzYFScNshwuMsxTDUXpkhv85Frb QV6LGlikaEQBh9X9BHjzTHMBqbTFVLX/I+jRC/vK2G21862rfOt3R6vh =rRj7 -----END PGP SIGNATURE----- Merge tag 'spi-fix-v4.19-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Mark writes: "spi: Fixes for v4.19 As well as one driver fix there's a couple of fixes here which address issues with the use of IDRs for allocation of dynamic bus numbers, ensuring that dynamic bus numbers interact well with static bus numbers assigned via DT and otherwise." * tag 'spi-fix-v4.19-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-fsl-dspi: fix broken DSPI_EOQ_MODE spi: Fix double IDR allocation with DT aliases spi: fix IDR collision on systems with both fixed and dynamic SPI bus numbers	2018-09-17 22:34:25 +02:00
Takashi Iwai	196f4eeeb7	ASoC: Fixes for v4.19 This is the usual set of small fixes scatterd around various drivers, plus one fix for DAPM and a UAPI build fix. There's not a huge amount that stands out here relative to anything else. -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAluf2AQTHGJyb29uaWVA a2VybmVsLm9yZwAKCRAk1otyXVSH0BojB/9ZpiRgSSjKTFSmGgu3OFI7Nvj63ruB hxOnwOc8Bea8tZtpzgEcx/aLZ1sbWVT4uRUYZv0Tf6UJtuOQagbJDEUkUdRitKtX 1khSMyKFlAa7cIbv19ZOMCN0pjcs7hlHCPryT8AyCWCWN8yPdlUsDqWfyfUoq56r qpdu/OQ4E9VvS8OcX1gPjcop3gE/fYEoU+mbUpr0KYUXaroEzJm85tOqpGYk4+XW GCNUR19vNRJr5G6ANqIx96JOlgF5nRZu7aOfvLceiWH5BgPdW3iNRAJkPmKCIHwb a1+X21eCC7Ec2/7bQmR5Aoxz1yqzhngrevSFNLrqXFZmMmNrEfkfdCrJ =gVzO -----END PGP SIGNATURE----- Merge tag 'asoc-v4.19-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v4.19 This is the usual set of small fixes scatterd around various drivers, plus one fix for DAPM and a UAPI build fix. There's not a huge amount that stands out here relative to anything else.	2018-09-17 18:59:21 +02:00
zhong jiang	c73480910e	net: ethernet: Fix a unused function warning. Fix the following compile warning: drivers/net/ethernet/microchip/lan743x_main.c:2964:12: warning: lan743x_pm_suspend defined but not used [-Wunused-function] static int lan743x_pm_suspend(struct device dev) drivers/net/ethernet/microchip/lan743x_main.c:2987:12: warning: lan743x_pm_resume defined but not used [-Wunused-function] static int lan743x_pm_resume(struct device dev) Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 08:24:25 -07:00
Andrew Lunn	ddca24dfcf	net: dsa: mv88e6xxx: Fix ATU Miss Violation Fix a cut/paste error and a typo which results in ATU miss violations not being reported. Fixes: `0977644c50` ("net: dsa: mv88e6xxx: Decode ATU problem interrupt") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 08:03:53 -07:00
Daniel Borkmann	50c6b58a81	tls: fix currently broken MSG_PEEK behavior In kTLS MSG_PEEK behavior is currently failing, strace example: [pid 2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3 [pid 2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4 [pid 2430] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0 [pid 2430] listen(4, 10) = 0 [pid 2430] getsockname(4, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0 [pid 2430] connect(3, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, 16) = 0 [pid 2430] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? /, [7564404], 4) = 0 [pid 2430] setsockopt(3, 0x11a / SOL_?? /, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0 [pid 2430] accept(4, {sa_family=AF_INET, sin_port=htons(49636), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5 [pid 2430] setsockopt(5, SOL_TCP, 0x1f / TCP_??? /, [7564404], 4) = 0 [pid 2430] setsockopt(5, 0x11a / SOL_?? /, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0 [pid 2430] close(4) = 0 [pid 2430] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14 [pid 2430] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11 [pid 2430] recvfrom(5, "test_read_peektest_read_peektest"..., 64, MSG_PEEK, NULL, NULL) = 64 As can be seen from strace, there are two TLS records sent, i) 'test_read_peek' and ii) '_mult_recs\0' where we end up peeking 'test_read_peektest_read_peektest'. This is clearly wrong, and what happens is that given peek cannot call into tls_sw_advance_skb() to unpause strparser and proceed with the next skb, we end up looping over the current one, copying the 'test_read_peek' over and over into the user provided buffer. Here, we can only peek into the currently held skb (current, full TLS record) as otherwise we would end up having to hold all the original skb(s) (depending on the peek depth) in a separate queue when unpausing strparser to process next records, minimally intrusive is to return only up to the current record's size (which likely was what `c46234ebb4` ("tls: RX path for ktls") originally intended as well). Thus, after patch we properly peek the first record: [pid 2046] wait4(2075, <unfinished ...> [pid 2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3 [pid 2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4 [pid 2075] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0 [pid 2075] listen(4, 10) = 0 [pid 2075] getsockname(4, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0 [pid 2075] connect(3, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, 16) = 0 [pid 2075] setsockopt(3, SOL_TCP, 0x1f / TCP_??? /, [7564404], 4) = 0 [pid 2075] setsockopt(3, 0x11a / SOL_?? /, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0 [pid 2075] accept(4, {sa_family=AF_INET, sin_port=htons(45732), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5 [pid 2075] setsockopt(5, SOL_TCP, 0x1f / TCP_??? /, [7564404], 4) = 0 [pid 2075] setsockopt(5, 0x11a / SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0 [pid 2075] close(4) = 0 [pid 2075] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14 [pid 2075] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11 [pid 2075] recvfrom(5, "test_read_peek", 64, MSG_PEEK, NULL, NULL) = 14 Fixes: `c46234ebb4` ("tls: RX path for ktls") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 08:03:09 -07:00
David S. Miller	aa079bd050	Merge branch 'hv_netvsc-associate-VF-and-PV-device-by-serial-number' Stephen Hemminger says: ==================== hv_netvsc: associate VF and PV device by serial number The Hyper-V implementation of PCI controller has concept of 32 bit serial number (not to be confused with PCI-E serial number). This value is sent in the protocol from the host to indicate SR-IOV VF device is attached to a synthetic NIC. Using the serial number (instead of MAC address) to associate the two devices avoids lots of potential problems when there are duplicate MAC addresses from tunnels or layered devices. The patch set is broken into two parts, one is for the PCI controller and the other is for the netvsc device. Normally, these go through different trees but sending them together here for better review. The PCI changes were submitted previously, but the main review comment was "why do you need this?". This is why. v2 - slot name can be shorter. remove locking when creating pci_slots; see comment for explaination ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 07:59:41 -07:00
Stephen Hemminger	00d7ddba11	hv_netvsc: pair VF based on serial number Matching network device based on MAC address is problematic since a non VF network device can be creted with a duplicate MAC address causing confusion and problems. The VMBus API does provide a serial number that is a better matching method. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 07:59:41 -07:00
Stephen Hemminger	a15f2c08c7	PCI: hv: support reporting serial number as slot information The Hyper-V host API for PCI provides a unique "serial number" which can be used as basis for sysfs PCI slot table. This can be useful for cases where userspace wants to find the PCI device based on serial number. When an SR-IOV NIC is added, the host sends an attach message with serial number. The kernel doesn't use the serial number, but it is useful when doing the same thing in a userspace driver such as the DPDK. By having /sys/bus/pci/slots/N it provides a direct way to find the matching PCI device. There maybe some cases where serial number is not unique such as when using GPU's. But the PCI slot infrastructure will handle that. This has a side effect which may also be useful. The common udev network device naming policy uses the slot information (rather than PCI address). Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 07:59:41 -07:00
Michael Chan	28ea334bd1	bnxt_en: Fix VF mac address regression. The recent commit to always forward the VF MAC address to the PF for approval may not work if the PF driver or the firmware is older. This will cause the VF driver to fail during probe: bnxt_en 0000:00:03.0 (unnamed net_device) (uninitialized): hwrm req_type 0xf seq id 0x5 error 0xffff bnxt_en 0000:00:03.0 (unnamed net_device) (uninitialized): VF MAC address 00:00:17:02:05:d0 not approved by the PF bnxt_en 0000:00:03.0: Unable to initialize mac address. bnxt_en: probe of 0000:00:03.0 failed with error -99 We fix it by treating the error as fatal only if the VF MAC address is locally generated by the VF. Fixes: `707e7e9660` ("bnxt_en: Always forward VF MAC address to the PF.") Reported-by: Seth Forshee <seth.forshee@canonical.com> Reported-by: Siwei Liu <loseweigh@gmail.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 07:56:35 -07:00
Eric Dumazet	bbd6528d28	ipv6: fix possible use-after-free in ip6_xmit() In the unlikely case ip6_xmit() has to call skb_realloc_headroom(), we need to call skb_set_owner_w() before consuming original skb, otherwise we risk a use-after-free. Bring IPv6 in line with what we do in IPv4 to fix this. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 07:56:02 -07:00
Colin Ian King	a7f38002fb	net: hp100: fix always-true check for link up state The operation ~(p100_inb(VG_LAN_CFG_1) & HP100_LINK_UP) returns a value that is always non-zero and hence the wait for the link to drop always terminates prematurely. Fix this by using a logical not operator instead of a bitwise complement. This issue has been in the driver since pre-2.6.12-rc2. Detected by CoverityScan, CID#114157 ("Logical vs. bitwise operator") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-17 07:55:19 -07:00

... 2 3 4 5 6 ...

782852 Commits