Commit Graph

826102 Commits

Author SHA1 Message Date
Mikulas Patocka
4ed319c6ac dm integrity: fix deadlock with overlapping I/O
dm-integrity will deadlock if overlapping I/O is issued to it, the bug
was introduced by commit 724376a04d ("dm integrity: implement fair
range locks").  Users rarely use overlapping I/O so this bug went
undetected until now.

Fix this bug by correcting, likely cut-n-paste, typos in
ranges_overlap() and also remove a flawed ranges_overlap() check in
remove_range_unlocked().  This condition could leave unprocessed bios
hanging on wait_list forever.

Cc: stable@vger.kernel.org # v4.19+
Fixes: 724376a04d ("dm integrity: implement fair range locks")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-04-05 18:49:08 -04:00
Andre Przywara
9cde402a59 PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller
There is a Marvell 88SE9170 PCIe SATA controller I found on a board here.
Some quick testing with the ARM SMMU enabled reveals that it suffers from
the same requester ID mixup problems as the other Marvell chips listed
already.

Add the PCI vendor/device ID to the list of chips which need the
workaround.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org
2019-04-05 16:11:16 -05:00
Marc Orr
c73f4c998e KVM: x86: nVMX: fix x2APIC VTPR read intercept
Referring to the "VIRTUALIZING MSR-BASED APIC ACCESSES" chapter of the
SDM, when "virtualize x2APIC mode" is 1 and "APIC-register
virtualization" is 0, a RDMSR of 808H should return the VTPR from the
virtual APIC page.

However, for nested, KVM currently fails to disable the read intercept
for this MSR. This means that a RDMSR exit takes precedence over
"virtualize x2APIC mode", and KVM passes through L1's TPR to L2,
instead of sourcing the value from L2's virtual APIC page.

This patch fixes the issue by disabling the read intercept, in VMCS02,
for the VTPR when "APIC-register virtualization" is 0.

The issue described above and fix prescribed here, were verified with
a related patch in kvm-unit-tests titled "Test VMX's virtualize x2APIC
mode w/ nested".

Signed-off-by: Marc Orr <marcorr@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Fixes: c992384bde ("KVM: vmx: speed up MSR bitmap merge")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-05 21:08:30 +02:00
Marc Orr
acff78477b KVM: x86: nVMX: close leak of L0's x2APIC MSRs (CVE-2019-3887)
The nested_vmx_prepare_msr_bitmap() function doesn't directly guard the
x2APIC MSR intercepts with the "virtualize x2APIC mode" MSR. As a
result, we discovered the potential for a buggy or malicious L1 to get
access to L0's x2APIC MSRs, via an L2, as follows.

1. L1 executes WRMSR(IA32_SPEC_CTRL, 1). This causes the spec_ctrl
variable, in nested_vmx_prepare_msr_bitmap() to become true.
2. L1 disables "virtualize x2APIC mode" in VMCS12.
3. L1 enables "APIC-register virtualization" in VMCS12.

Now, KVM will set VMCS02's x2APIC MSR intercepts from VMCS12, and then
set "virtualize x2APIC mode" to 0 in VMCS02. Oops.

This patch closes the leak by explicitly guarding VMCS02's x2APIC MSR
intercepts with VMCS12's "virtualize x2APIC mode" control.

The scenario outlined above and fix prescribed here, were verified with
a related patch in kvm-unit-tests titled "Add leak scenario to
virt_x2apic_mode_test".

Note, it looks like this issue may have been introduced inadvertently
during a merge---see 15303ba5d1.

Signed-off-by: Marc Orr <marcorr@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-05 21:08:22 +02:00
David Rientjes
b86bc2858b KVM: SVM: prevent DBG_DECRYPT and DBG_ENCRYPT overflow
This ensures that the address and length provided to DBG_DECRYPT and
DBG_ENCRYPT do not cause an overflow.

At the same time, pass the actual number of pages pinned in memory to
sev_unpin_memory() as a cleanup.

Reported-by: Cfir Cohen <cfir@google.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-05 20:49:42 +02:00
David Rientjes
ede885ecb2 kvm: svm: fix potential get_num_contig_pages overflow
get_num_contig_pages() could potentially overflow int so make its type
consistent with its usage.

Reported-by: Cfir Cohen <cfir@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-05 20:48:59 +02:00
Linus Torvalds
7f46774c64 MM/compaction patches for 5.1-rc4
The merge window for 5.1 introduced a number of compaction-related patches.
 with intermittent reports of corruption and functional issues. The bugs
 are due to sloopy checking of zone boundaries and a corner case where
 invalid indexes are used to access the free lists.
 
 Reports are not common but at least two users and 0-day have tripped
 over them.  There is a chance that one of the syzbot reports are related
 but it has not been confirmed properly.
 
 The normal submission path is with Andrew but there have been some delays
 and I consider them urgent enough that they should be picked up before
 RC4 to avoid duplicate reports.
 
 All of these have been successfully tested on older RC windows. This
 will make this branch look like a rebase but in fact, they've simply
 been lifted again from Andrew's tree and placed on a fresh branch. I've
 no reason to believe that this has invalidated the testing given the
 lack of change in compaction and the nature of the fixes.
 
 Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
 -----BEGIN PGP SIGNATURE-----
 
 iQJQBAABCAA6FiEECg+MacO/Fijve9AMfMb8M0SyR+IFAlynB+gcHG1nb3JtYW5A
 dGVjaHNpbmd1bGFyaXR5Lm5ldAAKCRB8xvwzRLJH4iW4EACw2vxxFTBUr1R3Yr8a
 YsRLT9nJvpebX98hyLG3Mhcv2c2haW+LC/H4T2UP1vW6zNH7Co738QcraH1xVPfr
 J3xJWSihXSAwLPHJQpJWd3pdWmmboIVUnYR/Cn/TiE2vukyFS61aRo/FEctJx2Hm
 cdkRh/j4HPtO7U4g0Lg6t1GYa40wy+1ahP+PTHpdIjadxnu+AGKrXgi+oysJJ29F
 QBBH2/5Lv8feaOyBRlWycBYHq7to/EsN+Xr0shsMT+AVcOvPZ2hi9GCdTRiDvHiJ
 c0cXftyxm8BYqKcPDeLOdQGtnDkDRCFrZI7hzHEvbUKdodVD5BZIet8HeblnUlMt
 TtgWenwo02MY3t9zSDSxBtzj8EimNQFIGJz7kZseS56leskvhXUXuDyUhT3O5I8R
 DBSX8DRJMhri2yGImC57RRtLGO/zbzHMrpB8ogpsqlo4UbMwNNOuL4yLAJ9XCcmV
 jjuakK08dr6v02NeOc/B2z5mkm4gQe0jvJyxMaWJHWORJEwrRu9ie8Af3yKBlP3Y
 dpX0R3XLtlOJdsFksDGm+24ftogN3l+0r/qVi7E8uwCvUeS9nvwY9ffNFiLHwMAg
 BewwduX2MWT3hAVzNprFmUTMaKGUjdpgwdxUt68qcmWEzlF2KcHbTpRv4+lxzx/A
 msO3x+upkskoFOBCRdvxq+AaKg==
 =WJCV
 -----END PGP SIGNATURE-----

Merge tag 'mm-compaction-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux

Pull mm/compaction fixes from Mel Gorman:
 "The merge window for 5.1 introduced a number of compaction-related
  patches. with intermittent reports of corruption and functional
  issues. The bugs are due to sloopy checking of zone boundaries and a
  corner case where invalid indexes are used to access the free lists.

  Reports are not common but at least two users and 0-day have tripped
  over them. There is a chance that one of the syzbot reports are
  related but it has not been confirmed properly.

  The normal submission path is with Andrew but there have been some
  delays and I consider them urgent enough that they should be picked up
  before RC4 to avoid duplicate reports.

  All of these have been successfully tested on older RC windows. This
  will make this branch look like a rebase but in fact, they've simply
  been lifted again from Andrew's tree and placed on a fresh branch.
  I've no reason to believe that this has invalidated the testing given
  the lack of change in compaction and the nature of the fixes"

* tag 'mm-compaction-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux:
  mm/compaction.c: abort search if isolation fails
  mm/compaction.c: correct zone boundary handling when resetting pageblock skip hints
2019-04-05 06:09:53 -10:00
Greg Kroah-Hartman
c7084edc3f tty: mark Siemens R3964 line discipline as BROKEN
The n_r3964 line discipline driver was written in a different time, when
SMP machines were rare, and users were trusted to do the right thing.
Since then, the world has moved on but not this code, it has stayed
rooted in the past with its lovely hand-crafted list structures and
loads of "interesting" race conditions all over the place.

After attempting to clean up most of the issues, I just gave up and am
now marking the driver as BROKEN so that hopefully someone who has this
hardware will show up out of the woodwork (I know you are out there!)
and will help with debugging a raft of changes that I had laying around
for the code, but was too afraid to commit as odds are they would break
things.

Many thanks to Jann and Linus for pointing out the initial problems in
this codebase, as well as many reviews of my attempts to fix the issues.
It was a case of whack-a-mole, and as you can see, the mole won.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-05 05:56:44 -10:00
Stephen Boyd
325aa19598 genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
If a child irqchip calls irq_chip_set_wake_parent() but its parent irqchip
has the IRQCHIP_SKIP_SET_WAKE flag set an error is returned.

This is inconsistent behaviour vs. set_irq_wake_real() which returns 0 when
the irqchip has the IRQCHIP_SKIP_SET_WAKE flag set. It doesn't attempt to
walk the chain of parents and set irq wake on any chips that don't have the
flag set either. If the intent is to call the .irq_set_wake() callback of
the parent irqchip, then we expect irqchip implementations to omit the
IRQCHIP_SKIP_SET_WAKE flag and implement an .irq_set_wake() function that
calls irq_chip_set_wake_parent().

The problem has been observed on a Qualcomm sdm845 device where set wake
fails on any GPIO interrupts after applying work in progress wakeup irq
patches to the GPIO driver. The chain of chips looks like this:

     QCOM GPIO -> QCOM PDC (SKIP) -> ARM GIC (SKIP)

The GPIO controllers parent is the QCOM PDC irqchip which in turn has ARM
GIC as parent.  The QCOM PDC irqchip has the IRQCHIP_SKIP_SET_WAKE flag
set, and so does the grandparent ARM GIC.

The GPIO driver doesn't know if the parent needs to set wake or not, so it
unconditionally calls irq_chip_set_wake_parent() causing this function to
return a failure because the parent irqchip (PDC) doesn't have the
.irq_set_wake() callback set. Returning 0 instead makes everything work and
irqs from the GPIO controller can be configured for wakeup.

Make it consistent by returning 0 (success) from irq_chip_set_wake_parent()
when a parent chip has IRQCHIP_SKIP_SET_WAKE set.

[ tglx: Massaged changelog ]

Fixes: 08b55e2a92 ("genirq: Add irqchip_set_wake_parent")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-gpio@vger.kernel.org
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190325181026.247796-1-swboyd@chromium.org
2019-04-05 17:41:41 +02:00
Bart Van Assche
fd9c40f64c block: Revert v5.0 blk_mq_request_issue_directly() changes
blk_mq_try_issue_directly() can return BLK_STS*_RESOURCE for requests that
have been queued. If that happens when blk_mq_try_issue_directly() is called
by the dm-mpath driver then dm-mpath will try to resubmit a request that is
already queued and a kernel crash follows. Since it is nontrivial to fix
blk_mq_request_issue_directly(), revert the blk_mq_request_issue_directly()
changes that went into kernel v5.0.

This patch reverts the following commits:
* d6a51a97c0 ("blk-mq: replace and kill blk_mq_request_issue_directly") # v5.0.
* 5b7a6f128a ("blk-mq: issue directly with bypass 'false' in blk_mq_sched_insert_requests") # v5.0.
* 7f556a44e6 ("blk-mq: refactor the code of issue request directly") # v5.0.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: James Smart <james.smart@broadcom.com>
Cc: Dongli Zhang <dongli.zhang@oracle.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: <stable@vger.kernel.org>
Reported-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Fixes: 7f556a44e6 ("blk-mq: refactor the code of issue request directly") # v5.0.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-05 09:40:46 -06:00
YueHaibing
f0d1762554 paride/pcd: Fix potential NULL pointer dereference and mem leak
Syzkaller report this:

pcd: pcd version 1.07, major 46, nice 0
pcd0: Autoprobe failed
pcd: No CD-ROM drive found
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 4525 Comm: syz-executor.0 Not tainted 5.1.0-rc3+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:pcd_init+0x95c/0x1000 [pcd]
Code: c4 ab f7 48 89 d8 48 c1 e8 03 80 3c 28 00 74 08 48 89 df e8 56 a3 da f7 4c 8b 23 49 8d bc 24 80 05 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 74 05 e8 39 a3 da f7 49 8b bc 24 80 05 00 00 e8 cc b2
RSP: 0018:ffff8881e84df880 EFLAGS: 00010202
RAX: 00000000000000b0 RBX: ffffffffc155a088 RCX: ffffffffc1508935
RDX: 0000000000040000 RSI: ffffc900014f0000 RDI: 0000000000000580
RBP: dffffc0000000000 R08: ffffed103ee658b8 R09: ffffed103ee658b8
R10: 0000000000000001 R11: ffffed103ee658b7 R12: 0000000000000000
R13: ffffffffc155a778 R14: ffffffffc155a4a8 R15: 0000000000000003
FS:  00007fe71bee3700(0000) GS:ffff8881f7300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055a7334441a8 CR3: 00000001e9674003 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 ? 0xffffffffc1508000
 ? 0xffffffffc1508000
 do_one_initcall+0xbc/0x47d init/main.c:901
 do_init_module+0x1b5/0x547 kernel/module.c:3456
 load_module+0x6405/0x8c10 kernel/module.c:3804
 __do_sys_finit_module+0x162/0x190 kernel/module.c:3898
 do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x462e99
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fe71bee2c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
RBP: 00007fe71bee2c70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe71bee36bc
R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004
Modules linked in: pcd(+) paride solos_pci atm ts_fsm rtc_mt6397 mac80211 nhc_mobility nhc_udp nhc_ipv6 nhc_hop nhc_dest nhc_fragment nhc_routing 6lowpan rtc_cros_ec memconsole intel_xhci_usb_role_switch roles rtc_wm8350 usbcore industrialio_triggered_buffer kfifo_buf industrialio asc7621 dm_era dm_persistent_data dm_bufio dm_mod tpm gnss_ubx gnss_serial serdev gnss max2165 cpufreq_dt hid_penmount hid menf21bmc_wdt rc_core n_tracesink ide_gd_mod cdns_csi2tx v4l2_fwnode videodev media pinctrl_lewisburg pinctrl_intel iptable_security iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel hsr veth netdevsim vxcan batman_adv cfg80211 rfkill chnl_net caif nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun joydev mousedev ppdev kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd
 ide_pci_generic piix input_leds cryptd glue_helper psmouse ide_core intel_agp serio_raw intel_gtt ata_generic i2c_piix4 agpgart pata_acpi parport_pc parport floppy rtc_cmos sch_fq_codel ip_tables x_tables sha1_ssse3 sha1_generic ipv6 [last unloaded: bmc150_magn]
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace d873691c3cd69f56 ]---

If alloc_disk fails in pcd_init_units, cd->disk will be
NULL, however in pcd_detect and pcd_exit, it's not check
this before free.It may result a NULL pointer dereference.

Also when register_blkdev failed, blk_cleanup_queue() and
blk_mq_free_tag_set() should be called to free resources.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 81b74ac68c ("paride/pcd: cleanup queues when detection fails")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-05 09:24:34 -06:00
Steven Rostedt (VMware)
32d9258662 syscalls: Remove start and number from syscall_set_arguments() args
After removing the start and count arguments of syscall_get_arguments() it
seems reasonable to remove them from syscall_set_arguments(). Note, as of
today, there are no users of syscall_set_arguments(). But we are told that
there will be soon. But for now, at least make it consistent with
syscall_get_arguments().

Link: http://lkml.kernel.org/r/20190327222014.GA32540@altlinux.org

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Dave Martin <dave.martin@arm.com>
Cc: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes
Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits
Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-05 09:27:23 -04:00
Steven Rostedt (Red Hat)
b35f549df1 syscalls: Remove start and number from syscall_get_arguments() args
At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
function call syscall_get_arguments() implemented in x86 was horribly
written and not optimized for the standard case of passing in 0 and 6 for
the starting index and the number of system calls to get. When looking at
all the users of this function, I discovered that all instances pass in only
0 and 6 for these arguments. Instead of having this function handle
different cases that are never used, simply rewrite it to return the first 6
arguments of a system call.

This should help out the performance of tracing system calls by ptrace,
ftrace and perf.

Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Dave Martin <dave.martin@arm.com>
Cc: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
Acked-by: Max Filippov <jcmvbkbc@gmail.com> # For xtensa changes
Acked-by: Will Deacon <will.deacon@arm.com> # For the arm64 bits
Reviewed-by: Thomas Gleixner <tglx@linutronix.de> # for x86
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-04-05 09:26:43 -04:00
Kefeng Wang
e8458e7afa genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n
When CONFIG_SPARSE_IRQ is disable, the request_mutex in struct irq_desc
is not initialized which causes malfunction.

Fixes: 9114014cf4 ("genirq: Add mutex to irq desc to serialize request/free_irq()")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190404074512.145533-1-wangkefeng.wang@huawei.com
2019-04-05 14:37:56 +02:00
Dan Carpenter
95c5c618fa irqchip/irq-ls1x: Missing error code in ls1x_intc_of_init()
Currently, when irq_domain_add_linear() fails, the error code does not get
set so it returns zero which is wrong.  Fix it by setting the appropriate
error code.

Fixes: 9e543e22e2 ("irqchip: Add driver for Loongson-1 interrupt controller")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: kernel-janitors@vger.kernel.org
Link: https://lkml.kernel.org/r/20190329062136.GQ32613@kadam
2019-04-05 14:37:56 +02:00
Zubin Mithra
212ac181c1 ALSA: seq: Fix OOB-reads from strlcpy
When ioctl calls are made with non-null-terminated userspace strings,
strlcpy causes an OOB-read from within strlen. Fix by changing to use
strscpy instead.

Signed-off-by: Zubin Mithra <zsm@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-04-05 14:33:01 +02:00
Josh Poimboeuf
4fa5ecda2b objtool: Add rewind_stack_do_exit() to the noreturn list
This fixes the following warning seen on GCC 7.3:

  arch/x86/kernel/dumpstack.o: warning: objtool: oops_end() falls through to next function show_regs()

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/3418ebf5a5a9f6ed7e80954c741c0b904b67b5dc.1554398240.git.jpoimboe@redhat.com
2019-04-05 13:30:51 +02:00
Jernej Skrabec
cd9063757a
drm/sun4i: DW HDMI: Lower max. supported rate for H6
Currently resolutions with pixel clock higher than 340 MHz don't work
with H6 HDMI controller. They just produce a blank screen.

Limit maximum pixel clock rate to 340 MHz until scrambling is supported.

Cc: stable@vger.kernel.org # 5.0
Fixes: 40bb9d3147 ("drm/sun4i: Add support for H6 DW HDMI controller")
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190324190609.32721-1-jernej.skrabec@siol.net
2019-04-05 10:44:10 +02:00
Neil Armstrong
3df1af984b
Revert "Documentation/gpu/meson: Remove link to meson_canvas.c"
This reverts commit a3f98bb22c.

Patch "Documentation/gpu/meson: Remove link to meson_canvas.c" was
incorrectly applied on the wrong branch not containing the fixed
commit 2bf6b5b0e3 ("drm/meson: exclusively use the canvas provider module")

Acked-by: Sean Paul <sean@poorly.run>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190404144342.15238-1-narmstrong@baylibre.com
2019-04-05 10:43:10 +02:00
Dan Carpenter
42d8644bd7 xen: Prevent buffer overflow in privcmd ioctl
The "call" variable comes from the user in privcmd_ioctl_hypercall().
It's an offset into the hypercall_page[] which has (PAGE_SIZE / 32)
elements.  We need to put an upper bound on it to prevent an out of
bounds access.

Cc: stable@vger.kernel.org
Fixes: 1246ae0bb9 ("xen: add variable hypercall caller")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-04-05 08:42:45 +02:00
Andrea Righi
ad94dc3a7e xen: use struct_size() helper in kzalloc()
struct privcmd_buf_vma_private has a zero-sized array at the end
(pages), use the new struct_size() helper to determine the proper
allocation size and avoid potential type mistakes.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-04-05 08:42:41 +02:00
Linus Torvalds
ea2cec24c8 drm: i915 and amdgpu fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcprrGAAoJEAx081l5xIa+WdUP/2zoJO9vyB3NxsJ35vH4HQjV
 uOb/V2GNmUKSuVnGNp1Z/dDJdXm9zjJF78a/UzMWlXs+lJzq+7zUZF6hj9scy/5Q
 29t0Ks3fzpa5q0Lz2eOMvsDQDDcRZjtCD8xU73qZRdxnT0Ov1IayT15W+Gzf7HmJ
 WX4kK31uPKaJvmOt8RmxQkhvVJSD9lfEzTgijx/joQlPlCZ7ndml2d3dqlLmUk1X
 ylv4W6NBo24ErtNQPhFAZ/AC6sgweAKD+b560H1X6Blen0TJ0Xrr1afsHUCBh/nj
 G1uDjyp3252evuJgZEuUQb/GjhspJ5FWdG2Imn06+QbqD/N3CynZT2vTChFZ/6Hs
 C0ZMHoAc6Kx/vx/Rd4oO6VbgsqMRqkSRXshgsUb4QWwZ96pozZrFvoMKWJrJprjH
 LAfEqX5Cnl7IG2a9xLTuzNY6qhCv/85YNPnJj6f22HitS+dhWdtvV7nTq7r01lds
 7DsI5xwfnCEY4b4xnccWj4zLjvH19DK+y9xtgDnWGg5zVa4MQ6GopyTPvurH8yHR
 AyYhoMvTSpfjGcIuj1/Vj4GNT+l2A0AKHRLa8/BICd1QatFomngVvvN2AnDUnm1I
 kkBnVVS/o4FKoTWMXX7vajjOKP9hot1unyQN2xSLABPffqfBzoLu4riO2bwPa1lv
 jfCKGlRXyJEtaNL1ibz2
 =tSY4
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2019-04-05' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Pretty quiet week, just some amdgpu and i915 fixes.

  i915:
   - deadlock fix
   - gvt fixes

  amdgpu:
   - PCIE dpm feature fix
   - Powerplay fixes"

* tag 'drm-fixes-2019-04-05' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/gvt: Fix kerneldoc typo for intel_vgpu_emulate_hotplug
  drm/i915/gvt: Correct the calculation of plane size
  drm/amdgpu: remove unnecessary rlc reset function on gfx9
  drm/i915: Always backoff after a drm_modeset_lock() deadlock
  drm/i915/gvt: do not let pin count of shadow mm go negative
  drm/i915/gvt: do not deliver a workload if its creation fails
  drm/amd/display: VBIOS can't be light up HDMI when restart system
  drm/amd/powerplay: fix possible hang with 3+ 4K monitors
  drm/amd/powerplay: correct data type to avoid overflow
  drm/amd/powerplay: add ECC feature bit
  drm/amd/amdgpu: fix PCIe dpm feature issue (v3)
2019-04-04 18:22:55 -10:00
Linus Torvalds
0548740e53 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Several hash table refcount fixes in batman-adv, from Sven
    Eckelmann.

 2) Use after free in bpf_evict_inode(), from Daniel Borkmann.

 3) Fix mdio bus registration in ixgbe, from Ivan Vecera.

 4) Unbounded loop in __skb_try_recv_datagram(), from Paolo Abeni.

 5) ila rhashtable corruption fix from Herbert Xu.

 6) Don't allow upper-devices to be added to vrf devices, from Sabrina
    Dubroca.

 7) Add qmi_wwan device ID for Olicard 600, from Bjørn Mork.

 8) Don't leave skb->next poisoned in __netif_receive_skb_list_ptype,
    from Alexander Lobakin.

 9) Missing IDR checks in mlx5 driver, from Aditya Pakki.

10) Fix false connection termination in ktls, from Jakub Kicinski.

11) Work around some ASPM issues with r8169 by disabling rx interrupt
    coalescing on certain chips. From Heiner Kallweit.

12) Properly use per-cpu qstat values on NOLOCK qdiscs, from Paolo
    Abeni.

13) Fully initialize sockaddr_in structures in SCTP, from Xin Long.

14) Various BPF flow dissector fixes from Stanislav Fomichev.

15) Divide by zero in act_sample, from Davide Caratti.

16) Fix bridging multicast regression introduced by rhashtable
    conversion, from Nikolay Aleksandrov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
  ibmvnic: Fix completion structure initialization
  ipv6: sit: reset ip header pointer in ipip6_rcv
  net: bridge: always clear mcast matching struct on reports and leaves
  libcxgb: fix incorrect ppmax calculation
  vlan: conditional inclusion of FCoE hooks to match netdevice.h and bnx2x
  sch_cake: Make sure we can write the IP header before changing DSCP bits
  sch_cake: Use tc_skb_protocol() helper for getting packet protocol
  tcp: Ensure DCTCP reacts to losses
  net/sched: act_sample: fix divide by zero in the traffic path
  net: thunderx: fix NULL pointer dereference in nicvf_open/nicvf_stop
  net: hns: Fix sparse: some warnings in HNS drivers
  net: hns: Fix WARNING when remove HNS driver with SMMU enabled
  net: hns: fix ICMP6 neighbor solicitation messages discard problem
  net: hns: Fix probabilistic memory overwrite when HNS driver initialized
  net: hns: Use NAPI_POLL_WEIGHT for hns driver
  net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw()
  flow_dissector: rst'ify documentation
  ipv6: Fix dangling pointer when ipv6 fragment
  net-gro: Fix GRO flush when receiving a GSO packet.
  flow_dissector: document BPF flow dissector environment
  ...
2019-04-04 18:07:12 -10:00
Ranjani Sridharan
2e05ddd2c9
ASoC: intel: skylake: add remove() callback for component driver
Topology is not unloaded in the core during unregister_component()
anymore. So, add the remove() callback that will unload the
topology.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2019-04-05 09:23:34 +07:00
Charles Keepax
47c4cc08cb
ASoC: cs35l35: Disable regulators on driver removal
The chips main power supplies VA and VP are enabled during probe but
then never disabled, this will cause warnings from the regulator
framework on driver removal. Fix this by adding a remove callback and
disabling the supplies, whilst doing so follow best practice and put the
chip back into reset as well.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
2019-04-05 09:23:13 +07:00
Max Filippov
ecae26fae1 xtensa: fix format string warning in init_pmd
Use %lu instead of %zu to fix the following warning introduced with
recent memblock refactoring:
  xtensa/mm/mmu.c:36:9: warning: format '%zu' expects argument of type
  'size_t', but argument 3 has type 'long unsigned int

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-04-04 18:45:55 -07:00
Thomas Falcon
bbd669a868 ibmvnic: Fix completion structure initialization
Fix device initialization completion handling for vNIC adapters.
Initialize the completion structure on probe and reinitialize when needed.
This also fixes a race condition during kdump where the driver can attempt
to access the completion struct before it is initialized:

Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc0000000081acbe0
Oops: Kernel access of bad area, sig: 11 [#1]
LE SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: ibmvnic(+) ibmveth sunrpc overlay squashfs loop
CPU: 19 PID: 301 Comm: systemd-udevd Not tainted 4.18.0-64.el8.ppc64le #1
NIP:  c0000000081acbe0 LR: c0000000081ad964 CTR: c0000000081ad900
REGS: c000000027f3f990 TRAP: 0300   Not tainted  (4.18.0-64.el8.ppc64le)
MSR:  800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28228288  XER: 00000006
CFAR: c000000008008934 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1
GPR00: c0000000081ad964 c000000027f3fc10 c0000000095b5800 c0000000221b4e58
GPR04: 0000000000000003 0000000000000001 000049a086918581 00000000000000d4
GPR08: 0000000000000007 0000000000000000 ffffffffffffffe8 d0000000014dde28
GPR12: c0000000081ad900 c000000009a00c00 0000000000000001 0000000000000100
GPR16: 0000000000000038 0000000000000007 c0000000095e2230 0000000000000006
GPR20: 0000000000400140 0000000000000001 c00000000910c880 0000000000000000
GPR24: 0000000000000000 0000000000000006 0000000000000000 0000000000000003
GPR28: 0000000000000001 0000000000000001 c0000000221b4e60 c0000000221b4e58
NIP [c0000000081acbe0] __wake_up_locked+0x50/0x100
LR [c0000000081ad964] complete+0x64/0xa0
Call Trace:
[c000000027f3fc10] [c000000027f3fc60] 0xc000000027f3fc60 (unreliable)
[c000000027f3fc60] [c0000000081ad964] complete+0x64/0xa0
[c000000027f3fca0] [d0000000014dad58] ibmvnic_handle_crq+0xce0/0x1160 [ibmvnic]
[c000000027f3fd50] [d0000000014db270] ibmvnic_tasklet+0x98/0x130 [ibmvnic]
[c000000027f3fda0] [c00000000813f334] tasklet_action_common.isra.3+0xc4/0x1a0
[c000000027f3fe00] [c000000008cd13f4] __do_softirq+0x164/0x400
[c000000027f3fef0] [c00000000813ed64] irq_exit+0x184/0x1c0
[c000000027f3ff20] [c0000000080188e8] __do_irq+0xb8/0x210
[c000000027f3ff90] [c00000000802d0a4] call_do_irq+0x14/0x24
[c000000026a5b010] [c000000008018adc] do_IRQ+0x9c/0x130
[c000000026a5b060] [c000000008008ce4] hardware_interrupt_common+0x114/0x120

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 18:36:07 -07:00
Lorenzo Bianconi
bb9bd814eb ipv6: sit: reset ip header pointer in ipip6_rcv
ipip6 tunnels run iptunnel_pull_header on received skbs. This can
determine the following use-after-free accessing iph pointer since
the packet will be 'uncloned' running pskb_expand_head if it is a
cloned gso skb (e.g if the packet has been sent though a veth device)

[  706.369655] BUG: KASAN: use-after-free in ipip6_rcv+0x1678/0x16e0 [sit]
[  706.449056] Read of size 1 at addr ffffe01b6bd855f5 by task ksoftirqd/1/=
[  706.669494] Hardware name: HPE ProLiant m400 Server/ProLiant m400 Server, BIOS U02 08/19/2016
[  706.771839] Call trace:
[  706.801159]  dump_backtrace+0x0/0x2f8
[  706.845079]  show_stack+0x24/0x30
[  706.884833]  dump_stack+0xe0/0x11c
[  706.925629]  print_address_description+0x68/0x260
[  706.982070]  kasan_report+0x178/0x340
[  707.025995]  __asan_report_load1_noabort+0x30/0x40
[  707.083481]  ipip6_rcv+0x1678/0x16e0 [sit]
[  707.132623]  tunnel64_rcv+0xd4/0x200 [tunnel4]
[  707.185940]  ip_local_deliver_finish+0x3b8/0x988
[  707.241338]  ip_local_deliver+0x144/0x470
[  707.289436]  ip_rcv_finish+0x43c/0x14b0
[  707.335447]  ip_rcv+0x628/0x1138
[  707.374151]  __netif_receive_skb_core+0x1670/0x2600
[  707.432680]  __netif_receive_skb+0x28/0x190
[  707.482859]  process_backlog+0x1d0/0x610
[  707.529913]  net_rx_action+0x37c/0xf68
[  707.574882]  __do_softirq+0x288/0x1018
[  707.619852]  run_ksoftirqd+0x70/0xa8
[  707.662734]  smpboot_thread_fn+0x3a4/0x9e8
[  707.711875]  kthread+0x2c8/0x350
[  707.750583]  ret_from_fork+0x10/0x18

[  707.811302] Allocated by task 16982:
[  707.854182]  kasan_kmalloc.part.1+0x40/0x108
[  707.905405]  kasan_kmalloc+0xb4/0xc8
[  707.948291]  kasan_slab_alloc+0x14/0x20
[  707.994309]  __kmalloc_node_track_caller+0x158/0x5e0
[  708.053902]  __kmalloc_reserve.isra.8+0x54/0xe0
[  708.108280]  __alloc_skb+0xd8/0x400
[  708.150139]  sk_stream_alloc_skb+0xa4/0x638
[  708.200346]  tcp_sendmsg_locked+0x818/0x2b90
[  708.251581]  tcp_sendmsg+0x40/0x60
[  708.292376]  inet_sendmsg+0xf0/0x520
[  708.335259]  sock_sendmsg+0xac/0xf8
[  708.377096]  sock_write_iter+0x1c0/0x2c0
[  708.424154]  new_sync_write+0x358/0x4a8
[  708.470162]  __vfs_write+0xc4/0xf8
[  708.510950]  vfs_write+0x12c/0x3d0
[  708.551739]  ksys_write+0xcc/0x178
[  708.592533]  __arm64_sys_write+0x70/0xa0
[  708.639593]  el0_svc_handler+0x13c/0x298
[  708.686646]  el0_svc+0x8/0xc

[  708.739019] Freed by task 17:
[  708.774597]  __kasan_slab_free+0x114/0x228
[  708.823736]  kasan_slab_free+0x10/0x18
[  708.868703]  kfree+0x100/0x3d8
[  708.905320]  skb_free_head+0x7c/0x98
[  708.948204]  skb_release_data+0x320/0x490
[  708.996301]  pskb_expand_head+0x60c/0x970
[  709.044399]  __iptunnel_pull_header+0x3b8/0x5d0
[  709.098770]  ipip6_rcv+0x41c/0x16e0 [sit]
[  709.146873]  tunnel64_rcv+0xd4/0x200 [tunnel4]
[  709.200195]  ip_local_deliver_finish+0x3b8/0x988
[  709.255596]  ip_local_deliver+0x144/0x470
[  709.303692]  ip_rcv_finish+0x43c/0x14b0
[  709.349705]  ip_rcv+0x628/0x1138
[  709.388413]  __netif_receive_skb_core+0x1670/0x2600
[  709.446943]  __netif_receive_skb+0x28/0x190
[  709.497120]  process_backlog+0x1d0/0x610
[  709.544169]  net_rx_action+0x37c/0xf68
[  709.589131]  __do_softirq+0x288/0x1018

[  709.651938] The buggy address belongs to the object at ffffe01b6bd85580
                which belongs to the cache kmalloc-1024 of size 1024
[  709.804356] The buggy address is located 117 bytes inside of
                1024-byte region [ffffe01b6bd85580, ffffe01b6bd85980)
[  709.946340] The buggy address belongs to the page:
[  710.003824] page:ffff7ff806daf600 count:1 mapcount:0 mapping:ffffe01c4001f600 index:0x0
[  710.099914] flags: 0xfffff8000000100(slab)
[  710.149059] raw: 0fffff8000000100 dead000000000100 dead000000000200 ffffe01c4001f600
[  710.242011] raw: 0000000000000000 0000000000380038 00000001ffffffff 0000000000000000
[  710.334966] page dumped because: kasan: bad access detected

Fix it resetting iph pointer after iptunnel_pull_header

Fixes: a09a4c8dd1 ("tunnels: Remove encapsulation offloads on decap")
Tested-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 18:34:23 -07:00
Linus Torvalds
8e22ba96d4 RISC-V Patches for 5.1-rc4
I dropped the ball a bit here: these patches should all probably have
 been part of rc2, but I wanted to get around to properly testing them in
 the various configurations (qemu32, qeum64, unleashed) first.
 Unfortunately I've been traveling and didn't have time to actually do
 that, but since these fix concrete bugs and pass my old set of tests I
 don't want to delay the fixes any longer.
 
 There are four independent fixes here:
 
 * A fix for the rv32 port that corrects the 64-bit user accesor's fixup
   label address.
 * A fix for a regression introduced during the merge window that broke
   medlow configurations at run time.  This patch also includes a fix
   that disables ftrace for the same set of functions, which was found by
   inspection at the same time.
 * A modification of the memory map to avoid overlapping the FIXMAP and
   VMALLOC regions on systems with small memory maps.
 * A fix to the module handling code to use the correct syntax for
   probing Kconfig entries.
 
 These have passed my standard test flow, but I didn't have time to
 expand that like I said I would.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlymeX0THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQXKFD/9rncH6fbNjb7izocCsDXNaDHWrixOX
 /qYaDTheLe45txZJytZwWcgpEHlk/W+SM6mbdk6oJ10bG4a/FVv7PltBtoJvkRxt
 5/iX0xwE73Rug5Z2FkqzGV6qNK8xD+2CfMgeYQYCOmaOgJuCjzj0x4Buh3EBHUY+
 nxueWg7/ITzWCGSzH0bYw0S9cDFz8qUhmHiHtDz6y98JPoibnz6TQ/Ie4idxfSs8
 nvV5cN1t8N8UpNAtVyVvTuWzf4E+LT02LIbD59ShOcGJ9I0hzPmNzGvVFVL6sxRs
 XA8lzwlXYSbGRFtcoqEjKws9nG1Jbx9TucdrQvSqNlZ+invjpY5KiJyucF9ODV+G
 LQosPqa4vw+hwSD958y1v2APNCLcvVx42YSHgPNabQ33/1gQiISveoQBqLUFk19s
 dsaNCrEv37EeSzNXjM3ZCjLA+udRRB4GU4lBEQSDsnDp6472cCgjpEO8mhKY7Qbq
 jIAHoVg1oXXrqqDXJoq0juiYoMMthRLtSTm+daryh5NGB8qoU3XuHc6MspJI+Shy
 tpjglSr8+xyyviDY8KnjHPmeTdUQKPhUf6/0fwMM3gX8AYdLRhXUiH/Cd2LehqFQ
 3LqvKBhfKS5eqBNQv7Dy+MlsiifVY3gDf0lsBC6jifxF/CmmRaWCauHgNUD9AHeH
 igftN7X07gwpXQ==
 =JIdq
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V fixes from Palmer Dabbelt:
 "I dropped the ball a bit here: these patches should all probably have
  been part of rc2, but I wanted to get around to properly testing them
  in the various configurations (qemu32, qeum64, unleashed) first.

  Unfortunately I've been traveling and didn't have time to actually do
  that, but since these fix concrete bugs and pass my old set of tests I
  don't want to delay the fixes any longer.

  There are four independent fixes here:

   - A fix for the rv32 port that corrects the 64-bit user accesor's
     fixup label address.

   - A fix for a regression introduced during the merge window that
     broke medlow configurations at run time. This patch also includes a
     fix that disables ftrace for the same set of functions, which was
     found by inspection at the same time.

   - A modification of the memory map to avoid overlapping the FIXMAP
     and VMALLOC regions on systems with small memory maps.

   - A fix to the module handling code to use the correct syntax for
     probing Kconfig entries.

  These have passed my standard test flow, but I didn't have time to
  expand that testing like I said I would"

* tag 'riscv-for-linus-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  RISC-V: Use IS_ENABLED(CONFIG_CMODEL_MEDLOW)
  RISC-V: Fix FIXMAP_TOP to avoid overlap with VMALLOC area
  RISC-V: Always compile mm/init.c with cmodel=medany and notrace
  riscv: fix accessing 8-byte variable from RV32
2019-04-04 15:04:00 -10:00
Nikolay Aleksandrov
1515a63fc4 net: bridge: always clear mcast matching struct on reports and leaves
We need to be careful and always zero the whole br_ip struct when it is
used for matching since the rhashtable change. This patch fixes all the
places which didn't properly clear it which in turn might've caused
mismatches.

Thanks for the great bug report with reproducing steps and bisection.

Steps to reproduce (from the bug report):
ip link add br0 type bridge mcast_querier 1
ip link set br0 up

ip link add v2 type veth peer name v3
ip link set v2 master br0
ip link set v2 up
ip link set v3 up
ip addr add 3.0.0.2/24 dev v3

ip netns add test
ip link add v1 type veth peer name v1 netns test
ip link set v1 master br0
ip link set v1 up
ip -n test link set v1 up
ip -n test addr add 3.0.0.1/24 dev v1

# Multicast receiver
ip netns exec test socat
UDP4-RECVFROM:5588,ip-add-membership=224.224.224.224:3.0.0.1,fork -

# Multicast sender
echo hello | nc -u -s 3.0.0.2 224.224.224.224 5588

Reported-by: liam.mcbirnie@boeing.com
Fixes: 19e3a9c90c ("net: bridge: convert multicast to generic rhashtable")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 17:52:40 -07:00
Linus Torvalds
20ad549488 Power management material for 5.1-rc4
- Make intel_pstate only load on Intel processors and prevent it
    from printing pointless failure messages (Borislav Petkov).
 
  - Update the turbostat utility:
    * Assorted fixes (Ben Hutchings, Len Brown, Prarit Bhargava).
    * Support for AMD Fam 17h (Zen) RAPL and package power (Calvin
      Walton).
    * Support for Intel Icelake and for systems with more than one
      die per package (Len Brown).
    * Cleanups (Len Brown).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcpmUKAAoJEILEb/54YlRxokUP/REDJUS8QMDe+jOOVySFj+LR
 +7Pmsigempfh3wjL43yWN23f2IUEJXnW27NNj3O5JpXLng6JRqwoBLcwhba2Rzu+
 n6lIWcirLRy72UdIK5oU3XP5k3FrYTuJJY+5vH7UjIWJ7bTZ+tACyG/MCNEDcTsD
 BCsEEZ/MGdxpH8hQgj61CjYiOPMhPYujFhlnu2T91vcXQ/12FUXMR17fKWM+omAo
 2TzoApQW+l2KYt7O/x4l69MDgEDY+3Pi40HSwQFGOIcY81ig1WX/DU9tKqoSa9SD
 KwJB/FkSenWP+sQgPZSUODoz9waugMDJDXouxNqxlBnqu8hDb3T5OT9s12nvJvSo
 NWRQ0XKuU3UxiOhTvvv09Ks16l8aY9ahe9JrRGFWN6PQFw6GcrlEspPGS31p7AqF
 I/j9ww8POae7/s1/MFgSWD52KxTAnQ9f4kYP6jtToCwrPPuy5fGq2zQOfgH6DwDm
 SkprpWKYkns1IjjamDTv+o609cgEK+oX3TYCxvDcT0fUfuNQxwerw4FUC0QG4E8B
 H44u+yROGBmexBDT1dkD8LYoH7lYzki1BImkxLGK1EKhhl9yvt+OPTw55yXNUfTI
 haumsRujr9pymrna9fujzlxSU4q1YSXSb4YPmGxQSQ3twZqW8d2tR2pfT5QQ/S+t
 62bxDS4a3ui7dMEfLSG1
 =RZRM
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix up the intel_pstate driver after recent changes to prevent
  it from printing pointless messages and update the turbostat utility
  (mostly fixes and new hardware support).

  Specifics:

   - Make intel_pstate only load on Intel processors and prevent it from
     printing pointless failure messages (Borislav Petkov).

   - Update the turbostat utility:
      * Assorted fixes (Ben Hutchings, Len Brown, Prarit Bhargava).
      * Support for AMD Fam 17h (Zen) RAPL and package power (Calvin
        Walton).
      * Support for Intel Icelake and for systems with more than one die
        per package (Len Brown).
      * Cleanups (Len Brown)"

* tag 'pm-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq/intel_pstate: Load only on Intel hardware
  tools/power turbostat: update version number
  tools/power turbostat: Warn on bad ACPI LPIT data
  tools/power turbostat: Add checks for failure of fgets() and fscanf()
  tools/power turbostat: Also read package power on AMD F17h (Zen)
  tools/power turbostat: Add support for AMD Fam 17h (Zen) RAPL
  tools/power turbostat: Do not display an error on systems without a cpufreq driver
  tools/power turbostat: Add Die column
  tools/power turbostat: Add Icelake support
  tools/power turbostat: Cleanup CNL-specific code
  tools/power turbostat: Cleanup CC3-skip code
  tools/power turbostat: Restore ability to execute in topology-order
2019-04-04 14:52:08 -10:00
Linus Torvalds
b512f71221 ACPI fix for 5.1-rc4
Prevent stale GPE events from triggering spurious system wakeups from
 suspend-to-idle (Furquan Shaikh).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcpmWiAAoJEILEb/54YlRxqHIP/22TfLZTQoPOd3+6K12YDETU
 OnV4+xX1k8/tVU5Xz4wTDnUYSQpZ/A9VBIp7EFkth32H71vxn080FaJ6GubdM1W1
 iBLVJKjtR7ypdmbnDSCRyDePFBNxZ1hnXd53JtChetrPFm3cNk2Tl2CZfd1ms3Va
 6OvVw+CrwFOIulgLv9AsG7N4Jr0RFsjuAmUAAYre1ZRHiruJLCTvcKTMTzKqUFuX
 0p8MIe0N77ByjdcFdZOPe7YQrCZPqRHUn3Rr7Rep//U9esUWKmvQs7al/uCrXXTu
 iqw18okpqpe6w/BTShUziwGrdj0JP82UU5me8AalOrWJ7pddDphPuN8kjuHwTqy1
 nPndWYlZ536az+DJRiiHeVVs3ANG35wB/WYzI7cHVVn0ujpHUTiml+JM/OZYi8eW
 dECmTGJopWveC6PSCBDvUnrFM5v8z1k+CIoVu+Fo6SnWuksJeRBkl+rzl8nMMQhw
 +zhM01T+R7/Yi2wyoBoRSdgdRhI1xSwUJV88kUJqXLSZ8a5xNCXOep7PJKYuv5hA
 lVSO2o6+sWuz/N3PkqfZUu25lWos9+3sMzUCZ4exavQnwu/udG6w/LjW1j0bxH8L
 plaTYdeHTFl7WFvKptD8G1UAJ85C241EGIA73DxiNjwmrw039zLX1FBs46QNKB6L
 vXa+BMxQhrwUNoGAD3hZ
 =XLS7
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Prevent stale GPE events from triggering spurious system wakeups from
  suspend-to-idle (Furquan Shaikh)"

* tag 'acpi-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPICA: Clear status of GPEs before enabling them
2019-04-04 14:48:11 -10:00
Dave Airlie
23b5f422e8 Only one fix for DSC (backoff after drm_modeset_lock deadlock)
and GVT's fixes including vGPU display plane size calculation,
 shadow mm pin count, error recovery path for workload create
 and one kerneldoc fix.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcpiyNAAoJEPpiX2QO6xPKgUEIAITZrpMNGN+n/agRByVgkTJC
 oR+7hon2nctO42kMrrc/oD6ddPp7WE88dlu7J5w3+25lL5xVM25I9mLmUzlcD1P3
 nDCRIw5UCoQUhO+5Hk8+p38mPQdkv8bNUUsA1JgXgDPtJ7fxxkh6HbwV9+JkG8zd
 f+/o2gZZVOd0EDXZqVCqFPT354LqwlX7OUBn8J2hLorL6y/2YbBNjTYGbe7+G18q
 c0fOZafTrgGqIKWUviBgsNx0FHaG0vhLBO/uqoO67yDUgBnViwujOpbkz+OhoQRS
 Qaf/62M/tC/218+CnftxdCcy45M4dTmhd2qKyjDVHPZB3NUcagaaotn8LXBbeUA=
 =2Q2R
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-fixes-2019-04-04' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Only one fix for DSC (backoff after drm_modeset_lock deadlock)
and GVT's fixes including vGPU display plane size calculation,
shadow mm pin count, error recovery path for workload create
and one kerneldoc fix.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190404161116.GA14522@intel.com
2019-04-05 10:45:32 +10:00
Linus Torvalds
9db6ce4ece - Bug Fixes
- Fix failed reads due to enabled IRQs when suspended; twl-core
    - Fix driver	registration when using	DT; sprd-sc27xx-spi
    - Fix `make allyesconfig` on	x86_64;	SUN6I_PRCM
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAlylz7oACgkQUa+KL4f8
 d2Hb9A/7ByIAS5N/N6vp0g3PZLHnIikb8hOeva6SCZHzM5wTlw5F2x+evn6tmVaG
 euXyek3EvHGCP8S92Ccj6aF01xEmA7xbCFzt22lhzmd6jplkyJhS84nt7S1E7spi
 FcZRWf2pTMyz4t5tXf4weqLnSBnianBmXBZr7xSMvEMcaRXMKn3/SiasMnmLjE83
 Uyv522tSa8s4sASOp0yGKGuPmxJHR1wbb41SIwSQnPfBohVEqBamaO6hMIxvRqwz
 Dl+ns6n1mXPvSAMS4WTmi7YUHwZTOdCwGMBhX0a2z66dqKgHBeJajCB9r+aCmnny
 WON0y9axqUdcal6XZcLfgPQQiNvKs2pAzBmNZaZEVjtyHDCM5vuFL0hmqnFrRqM7
 i1/SbVT6SIiNJAEoi4LrssDPRKwpR9Z4OvAWEGVsCWtWREMfya30MP2n5048FnXc
 88j+kuTumjgH0TTKnU7ls+Oqok3e6hAR/waEvgxG2IiT9T1d3u8UwqrWrCYA2y2i
 9k1qO9D2989wTs+9YmIk4/94SIgJvLvAOh5USEhfuAEoHHSO1kKdiW/8EWei68u2
 XfpX3kZQYny6q39Ia2dhTUaFpa7g6g142dzCK94M/iM/73UHszp5qpE50XvbFv6W
 6rAGJbk9Wo9hVWOT+tZwxTqIVEGCbkBOx9R3gbF1V2FowTBzmJM=
 =Sa3v
 -----END PGP SIGNATURE-----

Merge tag 'mfd-fixes-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull mfd fixes from Lee Jones:

 - Fix failed reads due to enabled IRQs when suspended; twl-core

 - Fix driver registration when using DT; sprd-sc27xx-spi

 - Fix `make allyesconfig` on x86_64; SUN6I_PRCM

* tag 'mfd-fixes-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  mfd: sun6i-prcm: Allow to compile with COMPILE_TEST
  mfd: sc27xx: Use SoC compatible string for PMIC devices
  mfd: twl-core: Disable IRQ while suspended
2019-04-04 14:42:47 -10:00
Dave Airlie
2ded18812b Merge branch 'drm-fixes-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Fixes for 5.1:
- Fix for pcie dpm
- Powerplay fixes for vega20
- Fix vbios display on reboot if driver display state is retained
- Gfx9 resume robustness fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190404042939.3386-1-alexander.deucher@amd.com
2019-04-05 10:42:13 +10:00
Varun Prakash
cc5a726c79 libcxgb: fix incorrect ppmax calculation
BITS_TO_LONGS() uses DIV_ROUND_UP() because of
this ppmax value can be greater than available
per cpu page pods.

This patch removes BITS_TO_LONGS() to fix this
issue.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 17:40:26 -07:00
Chris Leech
0a89eb92d8 vlan: conditional inclusion of FCoE hooks to match netdevice.h and bnx2x
Way back in 3c9c36bced the
ndo_fcoe_get_wwn pointer was switched from depending on CONFIG_FCOE to
CONFIG_LIBFCOE in order to allow building FCoE support into the bnx2x
driver and used by bnx2fc without including the generic software fcoe
module.

But, FCoE is generally used over an 802.1q VLAN, and the implementation
of ndo_fcoe_get_wwn in the 8021q module was not similarly changed.  The
result is that if CONFIG_FCOE is disabled, then bnz2fc cannot make a
call to ndo_fcoe_get_wwn through the 8021q interface to the underlying
bnx2x interface.  The bnx2fc driver then falls back to a potentially
different mapping of Ethernet MAC to Fibre Channel WWN, creating an
incompatibility with the fabric and target configurations when compared
to the WWNs used by pre-boot firmware and differently-configured
kernels.

So make the conditional inclusion of FCoE code in 8021q match the
conditional inclusion in netdevice.h

Signed-off-by: Chris Leech <cleech@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 17:18:34 -07:00
Wei Yongjun
6af1c849df aio: use kmem_cache_free() instead of kfree()
memory allocated by kmem_cache_alloc() should be freed using
kmem_cache_free(), not kfree().

Fixes: fa0ca2aee3 ("deal with get_reqs_available() in aio_get_req() itself")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-04-04 20:13:59 -04:00
Liu Jian
d9b8a67b3b mtd: cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer
In function do_write_buffer(), in the for loop, there is a case
chip_ready() returns 1 while chip_good() returns 0, so it never
break the loop.
To fix this, chip_good() is enough and it should timeout if it stay
bad for a while.

Fixes: dfeae1073583("mtd: cfi_cmdset_0002: Change write buffer to check correct value")
Signed-off-by: Yi Huaijie <yihuaijie@huawei.com>
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Tokunori Ikegami <ikegami_to@yahoo.co.jp>
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-04-05 00:39:19 +02:00
David S. Miller
5ba5780117 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-04-04

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Batch of fixes to the existing BPF flow dissector API to support
   calling BPF programs from the eth_get_headlen context (support for
   latter is planned to be added in bpf-next), from Stanislav.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 13:30:55 -07:00
Rafael J. Wysocki
b59fb7ef52 Merge branch 'acpica' into acpi
* acpica:
  ACPICA: Clear status of GPEs before enabling them
2019-04-04 22:08:47 +02:00
Rafael J. Wysocki
58b0cf8e24 Merge branch 'pm-tools'
* pm-tools:
  tools/power turbostat: update version number
  tools/power turbostat: Warn on bad ACPI LPIT data
  tools/power turbostat: Add checks for failure of fgets() and fscanf()
  tools/power turbostat: Also read package power on AMD F17h (Zen)
  tools/power turbostat: Add support for AMD Fam 17h (Zen) RAPL
  tools/power turbostat: Do not display an error on systems without a cpufreq driver
  tools/power turbostat: Add Die column
  tools/power turbostat: Add Icelake support
  tools/power turbostat: Cleanup CNL-specific code
  tools/power turbostat: Cleanup CC3-skip code
  tools/power turbostat: Restore ability to execute in topology-order
2019-04-04 21:57:45 +02:00
Mike Snitzer
bcb44433bb dm: disable DISCARD if the underlying storage no longer supports it
Storage devices which report supporting discard commands like
WRITE_SAME_16 with unmap, but reject discard commands sent to the
storage device.  This is a clear storage firmware bug but it doesn't
change the fact that should a program cause discards to be sent to a
multipath device layered on this buggy storage, all paths can end up
failed at the same time from the discards, causing possible I/O loss.

The first discard to a path will fail with Illegal Request, Invalid
field in cdb, e.g.:
 kernel: sd 8:0:8:19: [sdfn] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
 kernel: sd 8:0:8:19: [sdfn] tag#0 Sense Key : Illegal Request [current]
 kernel: sd 8:0:8:19: [sdfn] tag#0 Add. Sense: Invalid field in cdb
 kernel: sd 8:0:8:19: [sdfn] tag#0 CDB: Write same(16) 93 08 00 00 00 00 00 a0 08 00 00 00 80 00 00 00
 kernel: blk_update_request: critical target error, dev sdfn, sector 10487808

The SCSI layer converts this to the BLK_STS_TARGET error number, the sd
device disables its support for discard on this path, and because of the
BLK_STS_TARGET error multipath fails the discard without failing any
path or retrying down a different path.  But subsequent discards can
cause path failures.  Any discards sent to the path which already failed
a discard ends up failing with EIO from blk_cloned_rq_check_limits with
an "over max size limit" error since the discard limit was set to 0 by
the sd driver for the path.  As the error is EIO, this now fails the
path and multipath tries to send the discard down the next path.  This
cycle continues as discards are sent until all paths fail.

Fix this by training DM core to disable DISCARD if the underlying
storage already did so.

Also, fix branching in dm_done() and clone_endio() to reflect the
mutually exclussive nature of the IO operations in question.

Cc: stable@vger.kernel.org
Reported-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-04-04 15:33:59 -04:00
Max Filippov
ada770b1e7 xtensa: fix return_address
return_address returns the address that is one level higher in the call
stack than requested in its argument, because level 0 corresponds to its
caller's return address. Use requested level as the number of stack
frames to skip.

This fixes the address reported by might_sleep and friends.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2019-04-04 11:18:55 -07:00
Horatiu Vultur
6e3572e83d
MIPS: generic: Add switchdev, pinctrl and fit to ocelot_defconfig
Some of the configuration were not selected by default anymore, therefore
enable them again. Also remove some configs which are used for MSCC Ocelot.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: <alexandre.belloni@bootlin.com>
Cc: <UNGLinuxDriver@microchip.com>
Cc: <ralf@linux-mips.org>
Cc: <jhogan@kernel.org>
Cc: <linux-mips@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
2019-04-04 11:14:45 -07:00
David S. Miller
3baf5c2d6f Merge branch 'sch_cake-fixes'
Toke Høiland-Jørgensen says:

====================
sched: A few small fixes for sch_cake

Kevin noticed a few issues with the way CAKE reads the skb protocol and the IP
diffserv fields. This series fixes those two issues, and should probably go to
in 4.19 as well. However, the previous refactoring patch means they don't apply
as-is; I can send a follow-up directly to stable if that's OK with you?
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 10:55:59 -07:00
Toke Høiland-Jørgensen
c87b4ecdbe sch_cake: Make sure we can write the IP header before changing DSCP bits
There is not actually any guarantee that the IP headers are valid before we
access the DSCP bits of the packets. Fix this using the same approach taken
in sch_dsmark.

Reported-by: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 10:55:59 -07:00
Toke Høiland-Jørgensen
b2100cc56f sch_cake: Use tc_skb_protocol() helper for getting packet protocol
We shouldn't be using skb->protocol directly as that will miss cases with
hardware-accelerated VLAN tags. Use the helper instead to get the right
protocol number.

Reported-by: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 10:55:59 -07:00
Koen De Schepper
aecfde2310 tcp: Ensure DCTCP reacts to losses
RFC8257 §3.5 explicitly states that "A DCTCP sender MUST react to
loss episodes in the same way as conventional TCP".

Currently, Linux DCTCP performs no cwnd reduction when losses
are encountered. Optionally, the dctcp_clamp_alpha_on_loss resets
alpha to its maximal value if a RTO happens. This behavior
is sub-optimal for at least two reasons: i) it ignores losses
triggering fast retransmissions; and ii) it causes unnecessary large
cwnd reduction in the future if the loss was isolated as it resets
the historical term of DCTCP's alpha EWMA to its maximal value (i.e.,
denoting a total congestion). The second reason has an especially
noticeable effect when using DCTCP in high BDP environments, where
alpha normally stays at low values.

This patch replace the clamping of alpha by setting ssthresh to
half of cwnd for both fast retransmissions and RTOs, at most once
per RTT. Consequently, the dctcp_clamp_alpha_on_loss module parameter
has been removed.

The table below shows experimental results where we measured the
drop probability of a PIE AQM (not applying ECN marks) at a
bottleneck in the presence of a single TCP flow with either the
alpha-clamping option enabled or the cwnd halving proposed by this
patch. Results using reno or cubic are given for comparison.

                          |  Link   |   RTT    |    Drop
                 TCP CC   |  speed  | base+AQM | probability
        ==================|=========|==========|============
                    CUBIC |  40Mbps |  7+20ms  |    0.21%
                     RENO |         |          |    0.19%
        DCTCP-CLAMP-ALPHA |         |          |   25.80%
         DCTCP-HALVE-CWND |         |          |    0.22%
        ------------------|---------|----------|------------
                    CUBIC | 100Mbps |  7+20ms  |    0.03%
                     RENO |         |          |    0.02%
        DCTCP-CLAMP-ALPHA |         |          |   23.30%
         DCTCP-HALVE-CWND |         |          |    0.04%
        ------------------|---------|----------|------------
                    CUBIC | 800Mbps |   1+1ms  |    0.04%
                     RENO |         |          |    0.05%
        DCTCP-CLAMP-ALPHA |         |          |   18.70%
         DCTCP-HALVE-CWND |         |          |    0.06%

We see that, without halving its cwnd for all source of losses,
DCTCP drives the AQM to large drop probabilities in order to keep
the queue length under control (i.e., it repeatedly faces RTOs).
Instead, if DCTCP reacts to all source of losses, it can then be
controlled by the AQM using similar drop levels than cubic or reno.

Signed-off-by: Koen De Schepper <koen.de_schepper@nokia-bell-labs.com>
Signed-off-by: Olivier Tilmans <olivier.tilmans@nokia-bell-labs.com>
Cc: Bob Briscoe <research@bobbriscoe.net>
Cc: Lawrence Brakmo <brakmo@fb.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Daniel Borkmann <borkmann@iogearbox.net>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Andrew Shewmaker <agshew@gmail.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 10:51:16 -07:00
Davide Caratti
fae2708174 net/sched: act_sample: fix divide by zero in the traffic path
the control path of 'sample' action does not validate the value of 'rate'
provided by the user, but then it uses it as divisor in the traffic path.
Validate it in tcf_sample_init(), and return -EINVAL with a proper extack
message in case that value is zero, to fix a splat with the script below:

 # tc f a dev test0 egress matchall action sample rate 0 group 1 index 2
 # tc -s a s action sample
 total acts 1

         action order 0: sample rate 1/0 group 1 pipe
          index 2 ref 1 bind 1 installed 19 sec used 19 sec
         Action statistics:
         Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
         backlog 0b 0p requeues 0
 # ping 192.0.2.1 -I test0 -c1 -q

 divide error: 0000 [#1] SMP PTI
 CPU: 1 PID: 6192 Comm: ping Not tainted 5.1.0-rc2.diag2+ #591
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 RIP: 0010:tcf_sample_act+0x9e/0x1e0 [act_sample]
 Code: 6a f1 85 c0 74 0d 80 3d 83 1a 00 00 00 0f 84 9c 00 00 00 4d 85 e4 0f 84 85 00 00 00 e8 9b d7 9c f1 44 8b 8b e0 00 00 00 31 d2 <41> f7 f1 85 d2 75 70 f6 85 83 00 00 00 10 48 8b 45 10 8b 88 08 01
 RSP: 0018:ffffae320190ba30 EFLAGS: 00010246
 RAX: 00000000b0677d21 RBX: ffff8af1ed9ec000 RCX: 0000000059a9fe49
 RDX: 0000000000000000 RSI: 000000000c7e33b7 RDI: ffff8af23daa0af0
 RBP: ffff8af1ee11b200 R08: 0000000074fcaf7e R09: 0000000000000000
 R10: 0000000000000050 R11: ffffffffb3088680 R12: ffff8af232307f80
 R13: 0000000000000003 R14: ffff8af1ed9ec000 R15: 0000000000000000
 FS:  00007fe9c6d2f740(0000) GS:ffff8af23da80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fff6772f000 CR3: 00000000746a2004 CR4: 00000000001606e0
 Call Trace:
  tcf_action_exec+0x7c/0x1c0
  tcf_classify+0x57/0x160
  __dev_queue_xmit+0x3dc/0xd10
  ip_finish_output2+0x257/0x6d0
  ip_output+0x75/0x280
  ip_send_skb+0x15/0x40
  raw_sendmsg+0xae3/0x1410
  sock_sendmsg+0x36/0x40
  __sys_sendto+0x10e/0x140
  __x64_sys_sendto+0x24/0x30
  do_syscall_64+0x60/0x210
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
  [...]
  Kernel panic - not syncing: Fatal exception in interrupt

Add a TDC selftest to document that 'rate' is now being validated.

Reported-by: Matteo Croce <mcroce@redhat.com>
Fixes: 5c5670fae4 ("net/sched: Introduce sample tc action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Yotam Gigi <yotam.gi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-04 10:46:33 -07:00