Commit Graph

916773 Commits

Author SHA1 Message Date
James Morse
ef5a294be8 KVM: arm64: Add emulation for 32bit guests accessing ACTLR2
ACTLR_EL1 is a 64bit register while the 32bit ACTLR is obviously 32bit.
For 32bit software, the extra bits are accessible via ACTLR2... which
KVM doesn't emulate.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200529150656.7339-3-james.morse@arm.com
2020-06-09 09:04:42 +01:00
James Morse
7c582bf4ed KVM: arm64: Stop writing aarch32's CSSELR into ACTLR
aarch32 has pairs of registers to access the high and low parts of 64bit
registers. KVM has a union of 64bit sys_regs[] and 32bit copro[]. The
32bit accessors read the high or low part of the 64bit sys_reg[] value
through the union.

Both sys_reg_descs[] and cp15_regs[] list access_csselr() as the accessor
for CSSELR{,_EL1}. access_csselr() is only aware of the 64bit sys_regs[],
and expects r->reg to be 'CSSELR_EL1' in the enum, index 2 of the 64bit
array.

cp15_regs[] uses the 32bit copro[] alias of sys_regs[]. Here CSSELR is
c0_CSSELR which is the same location in sys_reg[]. r->reg is 'c0_CSSELR',
index 4 in the 32bit array.

access_csselr() uses the 32bit r->reg value to access the 64bit array,
so reads and write the wrong value. sys_regs[4], is ACTLR_EL1, which
is subsequently save/restored when we enter the guest.

ACTLR_EL1 is supposed to be read-only for the guest. This register
only affects execution at EL1, and the host's value is restored before
we return to host EL1.

Convert the 32bit register index back to the 64bit version.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200529150656.7339-2-james.morse@arm.com
2020-06-09 09:04:42 +01:00
Marc Zyngier
7ae2f3db61 KVM: arm64: Flush the instruction cache if not unmapping the VM on reboot
On a system with FWB, we don't need to unmap Stage-2 on reboot,
as even if userspace takes this opportunity to repaint the whole
of memory, FWB ensures that the data side stays consistent even
if the guest uses non-cacheable mappings.

However, the I-side is not necessarily coherent with the D-side
if CTR_EL0.DIC is 0. In this case, invalidate the i-cache to
preserve coherency.

Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Fixes: 892713e97c ("KVM: arm64: Sidestep stage2_unmap_vm() on vcpu reset when S2FWB is supported")
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-31 11:31:54 +01:00
Marc Zyngier
8f7f4fe756 KVM: arm64: Drop obsolete comment about sys_reg ordering
The general comment about keeping the enum order in sync
with the save/restore code has been obsolete for many years now.

Just drop it.

Note that there are other ordering requirements in the enum,
such as the PtrAuth and PMU registers, which are still valid.

Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 13:16:57 +01:00
Marc Zyngier
d9d7d84d99 KVM: arm64: Parametrize exception entry with a target EL
We currently assume that an exception is delivered to EL1, always.
Once we emulate EL2, this no longer will be the case. To prepare
for this, add a target_mode parameter.

While we're at it, merge the computing of the target PC and PSTATE in
a single function that updates both PC and CPSR after saving their
previous values in the corresponding ELR/SPSR. This ensures that they
are updated in the correct order (a pretty common source of bugs...).

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 13:16:55 +01:00
Marc Zyngier
349c330ced KVM: arm64: Don't use empty structures as CPU reset state
Keeping empty structure as the vcpu state initializer is slightly
wasteful: we only want to set pstate, and zero everything else.
Just do that.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 12:00:40 +01:00
Marc Zyngier
bb44a8dbea KVM: arm64: Move sysreg reset check to boot time
Our sysreg reset check has become a bit silly, as it only checks whether
a reset callback actually exists for a given sysreg entry, and apply the
method if available. Doing the check at each vcpu reset is pretty dumb,
as the tables never change. It is thus perfectly possible to do the same
checks at boot time.

This also allows us to introduce a sparse sys_regs[] array, something
that will be required with ARMv8.4-NV.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 11:57:10 +01:00
Marc Zyngier
7ccadf23b8 KVM: arm64: Add missing reset handlers for PMU emulation
As we're about to become a bit more harsh when it comes to the lack of
reset callbacks, let's add the missing PMU reset handlers. Note that
these only cover *CLR registers that were always covered by their *SET
counterpart, so there is no semantic change here.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 11:57:10 +01:00
Marc Zyngier
7ea90bdd70 KVM: arm64: Refactor vcpu_{read,write}_sys_reg
Extract the direct HW accessors for later reuse.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 11:57:10 +01:00
Christoffer Dall
fc5d1f1a42 KVM: arm64: vgic-v3: Take cpu_if pointer directly instead of vcpu
If we move the used_lrs field to the version-specific cpu interface
structure, the following functions only operate on the struct
vgic_v3_cpu_if and not the full vcpu:

  __vgic_v3_save_state
  __vgic_v3_restore_state
  __vgic_v3_activate_traps
  __vgic_v3_deactivate_traps
  __vgic_v3_save_aprs
  __vgic_v3_restore_aprs

This is going to be very useful for nested virt, so move the used_lrs
field and change the prototypes and implementations of these functions to
take the cpu_if parameter directly.

No functional change.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-28 11:57:10 +01:00
Andrew Scull
0a78791c0d KVM: arm64: Remove obsolete kvm_virt_to_phys abstraction
This abstraction was introduced to hide the difference between arm and
arm64 but, with the former no longer supported, this abstraction can be
removed and the canonical kernel API used directly instead.

Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
CC: Marc Zyngier <maz@kernel.org>
CC: James Morse <james.morse@arm.com>
CC: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20200519104036.259917-1-ascull@google.com
2020-05-25 16:16:27 +01:00
David Brazdil
438f711ce1 KVM: arm64: Fix incorrect comment on kvm_get_hyp_vector()
The comment used to say that kvm_get_hyp_vector is only called on VHE systems.
In fact, it is also called from the nVHE init function cpu_init_hyp_mode().
Fix the comment to stop confusing devs.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200515152550.83810-1-dbrazdil@google.com
2020-05-25 16:16:16 +01:00
David Brazdil
71b3ec5f22 KVM: arm64: Clean up cpu_init_hyp_mode()
Pull bits of code to the only place where it is used. Remove empty function
__cpu_init_stage2(). Remove redundant has_vhe() check since this function is
nVHE-only. No functional changes intended.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200515152056.83158-1-dbrazdil@google.com
2020-05-25 16:15:47 +01:00
Marc Zyngier
5107000faa KVM: arm64: Make KVM_CAP_MAX_VCPUS compatible with the selected GIC version
KVM_CAP_MAX_VCPUS always return the maximum possible number of
VCPUs, irrespective of the selected interrupt controller. This
is pretty misleading for userspace that selects a GICv2 on a GICv3
system that supports v2 compat: It always gets a maximum of 512
VCPUs, even if the effective limit is 8. The 9th VCPU will fail
to be created, which is unexpected as far as userspace is concerned.

Fortunately, we already have the right information stashed in the
kvm structure, and we can return it as requested.

Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20200427141507.284985-1-maz@kernel.org
2020-05-16 15:05:02 +01:00
Keqian Zhu
c862626e19 KVM: arm64: Support enabling dirty log gradually in small chunks
There is already support of enabling dirty log gradually in small chunks
for x86 in commit 3c9bd4006b ("KVM: x86: enable dirty log gradually in
small chunks"). This adds support for arm64.

x86 still writes protect all huge pages when DIRTY_LOG_INITIALLY_ALL_SET
is enabled. However, for arm64, both huge pages and normal pages can be
write protected gradually by userspace.

Under the Huawei Kunpeng 920 2.6GHz platform, I did some tests on 128G
Linux VMs with different page size. The memory pressure is 127G in each
case. The time taken of memory_global_dirty_log_start in QEMU is listed
below:

Page Size      Before    After Optimization
  4K            650ms         1.8ms
  2M             4ms          1.8ms
  1G             2ms          1.8ms

Besides the time reduction, the biggest improvement is that we will minimize
the performance side effect (because of dissolving huge pages and marking
memslots dirty) on guest after enabling dirty log.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200413122023.52583-1-zhukeqian1@huawei.com
2020-05-16 15:05:02 +01:00
Suzuki K Poulose
0529c90212 KVM: arm64: Unify handling THP backed host memory
We support mapping host memory backed by PMD transparent hugepages
at stage2 as huge pages. However the checks are now spread across
two different places. Let us unify the handling of the THPs to
keep the code cleaner (and future proof for PUD THP support).
This patch moves transparent_hugepage_adjust() closer to the caller
to avoid a forward declaration for fault_supports_stage2_huge_mappings().

Also, since we already handle the case where the host VA and the guest
PA may not be aligned, the explicit VM_BUG_ON() is not required.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200507123546.1875-3-yuzenghui@huawei.com
2020-05-16 15:05:02 +01:00
Suzuki K Poulose
9f2836146b KVM: arm64: Clean up the checking for huge mapping
If we are checking whether the stage2 can map PAGE_SIZE,
we don't have to do the boundary checks as both the host
VMA and the guest memslots are page aligned. Bail the case
easily.

While we're at it, fixup a typo in the comment below.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200507123546.1875-2-yuzenghui@huawei.com
2020-05-16 15:05:02 +01:00
Jiang Yi
48c963e31b KVM: arm/arm64: Release kvm->mmu_lock in loop to prevent starvation
Do cond_resched_lock() in stage2_flush_memslot() like what is done in
unmap_stage2_range() and other places holding mmu_lock while processing
a possibly large range of memory.

Signed-off-by: Jiang Yi <giangyi@amazon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20200415084229.29992-1-giangyi@amazon.com
2020-05-16 15:05:02 +01:00
Zenghui Yu
892713e97c KVM: arm64: Sidestep stage2_unmap_vm() on vcpu reset when S2FWB is supported
stage2_unmap_vm() was introduced to unmap user RAM region in the stage2
page table to make the caches coherent. E.g., a guest reboot with stage1
MMU disabled will access memory using non-cacheable attributes. If the
RAM and caches are not coherent at this stage, some evicted dirty cache
line may go and corrupt guest data in RAM.

Since ARMv8.4, S2FWB feature is mandatory and KVM will take advantage
of it to configure the stage2 page table and the attributes of memory
access. So we ensure that guests always access memory using cacheable
attributes and thus, the caches always be coherent.

So on CPUs that support S2FWB, we can safely reset the vcpu without a
heavy stage2 unmapping.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200415072835.1164-1-yuzenghui@huawei.com
2020-05-16 15:05:01 +01:00
Fuad Tabba
656012c731 KVM: Fix spelling in code comments
Fix spelling and typos (e.g., repeated words) in comments.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200401140310.29701-1-tabba@google.com
2020-05-16 15:05:01 +01:00
Marc Zyngier
ce6f8f02f9 KVM: arm64: Use cpus_have_final_cap for has_vhe()
By the time we start using the has_vhe() helper, we have long
discovered whether we are running VHE or not. It thus makes
sense to use cpus_have_final_cap() instead of cpus_have_const_cap(),
which leads to a small text size reduction.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: David Brazdil <dbrazdil@google.com>
Link: https://lore.kernel.org/r/20200513103828.74580-1-maz@kernel.org
2020-05-16 15:04:51 +01:00
Marc Zyngier
c6fe89ff8b KVM: arm64: Simplify __kvm_timer_set_cntvoff implementation
Now that this function isn't constrained by the 32bit PCS,
let's simplify it by taking a single 64bit offset instead
of two 32bit parameters.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-05-16 15:04:18 +01:00
Fuad Tabba
25357de01b KVM: arm64: Clean up kvm makefiles
Consolidate references to the CONFIG_KVM configuration item to encompass
entire folders rather than per line.

Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200505154520.194120-5-tabba@google.com
2020-05-16 15:04:18 +01:00
Will Deacon
f26133624d KVM: arm64: Change CONFIG_KVM to a menuconfig entry
Changing CONFIG_KVM to be a 'menuconfig' entry in Kconfig mean that we
can straightforwardly enumerate optional features, such as the virtual
PMU device as dependent options.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200505154520.194120-4-tabba@google.com
2020-05-16 15:04:18 +01:00
Will Deacon
bf7bc1df30 KVM: arm64: Update help text
arm64 KVM supports 16k pages since 02e0b7600f
("arm64: kvm: Add support for 16K pages"), so update the Kconfig help
text accordingly.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200505154520.194120-3-tabba@google.com
2020-05-16 15:04:18 +01:00
Will Deacon
d82755b2e7 KVM: arm64: Kill off CONFIG_KVM_ARM_HOST
CONFIG_KVM_ARM_HOST is just a proxy for CONFIG_KVM, so remove it in favour
of the latter.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200505154520.194120-2-tabba@google.com
2020-05-16 15:04:18 +01:00
Marc Zyngier
9ed24f4b71 KVM: arm64: Move virt/kvm/arm to arch/arm64
Now that the 32bit KVM/arm host is a distant memory, let's move the
whole of the KVM/arm64 code into the arm64 tree.

As they said in the song: Welcome Home (Sanitarium).

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200513104034.74741-1-maz@kernel.org
2020-05-16 15:03:59 +01:00
Linus Torvalds
2ef96a5bb1 Linux 5.7-rc5 2020-05-10 15:16:58 -07:00
Linus Torvalds
c14cab2688 A set of fixes for x86:
- Ensure that direct mapping alias is always flushed when changing page
    attributes. The optimization for small ranges failed to do so when
    the virtual address was in the vmalloc or module space.
 
  - Unbreak the trace event registration for syscalls without arguments
    caused by the refactoring of the SYSCALL_DEFINE0() macro.
 
  - Move the printk in the TSC deadline timer code to a place where it is
    guaranteed to only be called once during boot and cannot be rearmed by
    clearing warn_once after boot. If it's invoked post boot then lockdep
    rightfully complains about a potential deadlock as the calling context
    is different.
 
  - A series of fixes for objtool and the ORC unwinder addressing variety
    of small issues:
 
      Stack offset tracking for indirect CFAs in objtool ignored subsequent
      pushs and pops
 
      Repair the unwind hints in the register clearing entry ASM code
 
      Make the unwinding in the low level exit to usermode code stop after
      switching to the trampoline stack. The unwind hint is not longer valid
      and the ORC unwinder emits a warning as it can't find the registers
      anymore.
 
      Fix the unwind hints in switch_to_asm() and rewind_stack_do_exit()
      which caused objtool to generate bogus ORC data.
 
      Prevent unwinder warnings when dumping the stack of a non-current
      task as there is no way to be sure about the validity because the
      dumped stack can be a moving target.
 
      Make the ORC unwinder behave the same way as the frame pointer
      unwinder when dumping an inactive tasks stack and do not skip the
      first frame.
 
      Prevent ORC unwinding before ORC data has been initialized
 
      Immediately terminate unwinding when a unknown ORC entry type is
      found.
 
      Prevent premature stop of the unwinder caused by IRET frames.
 
      Fix another infinite loop in objtool caused by a negative offset which
      was not catched.
 
      Address a few build warnings in the ORC unwinder and add missing
      static/ro_after_init annotations
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6363QTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRJHD/4hWjzJLsUZ9xq2NrzhevoeJtxj+wVM
 66x9NM3mlFQ30BN4Aye4EnNEhR0iIvNPWWdfEmaJYfPHPwnUjjcOa426HYxP/WXA
 DWd5F20wGaaPOJ65LJpy/+pfcxAeQynt4I2cDEWHAplswfOWV/Hv8mSeKAKuq400
 lCWaTMkWcO/toexSNn8PVyWi9rHlm+76E1bHkVwuoekGBGt1VloKGlK6OPyElzL2
 w9VtrjSLlYQ0MdfCJKQeg44XQPMbf4hZRfc88x9SwDWB01q7aSvb0pWNl9AJKNXA
 7fFu5T4F4PABPgRM7eJ5yNk0De9jM1y+6eCp66f9UXoNOeSr7Boz9Xc4xWqAraIi
 9Dtx3WliO9CAxwUiD+Cj2iJO5o83AdRK/xhCth2VRnYMS6imfSidEqTC+LhEtkzw
 Yplu7sbrWQDa5JTh8vk60clDvbkU+pfdxJisY+KClRguWfQfR6MJNuQnE0NHr7cH
 H4VXFFHEE6tDdJneQ9RxA4iF20RTgSlJGK0YlsH6QsxPsRgoHVkGUao8fQhrNvRc
 MIdpm9YasWStjJ7ZXbDeStmnLFN3DCj1RC8wmvJ4i/R1sPnBvPvRUt4Lm988a951
 Vyr23VIcVrE7zykiqQZVH7bvIv6ULORqTJbIOF1rO/aIut4W8z0ojoVXC0Z7CiwF
 S5SGj+hlWciIew==
 =0rCi
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for x86:

   - Ensure that direct mapping alias is always flushed when changing
     page attributes. The optimization for small ranges failed to do so
     when the virtual address was in the vmalloc or module space.

   - Unbreak the trace event registration for syscalls without arguments
     caused by the refactoring of the SYSCALL_DEFINE0() macro.

   - Move the printk in the TSC deadline timer code to a place where it
     is guaranteed to only be called once during boot and cannot be
     rearmed by clearing warn_once after boot. If it's invoked post boot
     then lockdep rightfully complains about a potential deadlock as the
     calling context is different.

   - A series of fixes for objtool and the ORC unwinder addressing
     variety of small issues:

       - Stack offset tracking for indirect CFAs in objtool ignored
         subsequent pushs and pops

       - Repair the unwind hints in the register clearing entry ASM code

       - Make the unwinding in the low level exit to usermode code stop
         after switching to the trampoline stack. The unwind hint is no
         longer valid and the ORC unwinder emits a warning as it can't
         find the registers anymore.

       - Fix unwind hints in switch_to_asm() and rewind_stack_do_exit()
         which caused objtool to generate bogus ORC data.

       - Prevent unwinder warnings when dumping the stack of a
         non-current task as there is no way to be sure about the
         validity because the dumped stack can be a moving target.

       - Make the ORC unwinder behave the same way as the frame pointer
         unwinder when dumping an inactive tasks stack and do not skip
         the first frame.

       - Prevent ORC unwinding before ORC data has been initialized

       - Immediately terminate unwinding when a unknown ORC entry type
         is found.

       - Prevent premature stop of the unwinder caused by IRET frames.

       - Fix another infinite loop in objtool caused by a negative
         offset which was not catched.

       - Address a few build warnings in the ORC unwinder and add
         missing static/ro_after_init annotations"

* tag 'x86-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/unwind/orc: Move ORC sorting variables under !CONFIG_MODULES
  x86/apic: Move TSC deadline timer debug printk
  ftrace/x86: Fix trace event registration for syscalls without arguments
  x86/mm/cpa: Flush direct map alias during cpa
  objtool: Fix infinite loop in for_offset_range()
  x86/unwind/orc: Fix premature unwind stoppage due to IRET frames
  x86/unwind/orc: Fix error path for bad ORC entry type
  x86/unwind/orc: Prevent unwinding before ORC initialization
  x86/unwind/orc: Don't skip the first frame for inactive tasks
  x86/unwind: Prevent false warnings for non-current tasks
  x86/unwind/orc: Convert global variables to static
  x86/entry/64: Fix unwind hints in rewind_stack_do_exit()
  x86/entry/64: Fix unwind hints in __switch_to_asm()
  x86/entry/64: Fix unwind hints in kernel exit path
  x86/entry/64: Fix unwind hints in register clearing code
  objtool: Fix stack offset tracking for indirect CFAs
2020-05-10 11:59:53 -07:00
Linus Torvalds
8b00083219 A single fix for objtool to prevent an infinite loop in the jump table
search which can be triggered when building the kernel with
 -ffunction-sections.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl635X8THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYod4AEACumK3x6NTRw7o889GU+Kzj/LfpJZOr
 QoKjZgo4AzeONa/5sKoBeNhGiHkuAOhhwG/yq9QrTMTw3b6kFPnQ8Vel590JM/uj
 j/qjgbpR5v2v0ULMbONEHBN/3dRETyFomAe88hg6WM7ZuNCXPd+rbya9XxD7LdQo
 K04y+vdPACJIf1hyb91sOWfyAWDHzwencNQ0qq0CMf72JpKFcRXCqor7vvH5zNH0
 2PmTbKasOlyrgb9HQNLi6mNNoM47bc9lg8D76eDBl/Wl3yqYVwAawk4vqgwjxc0P
 MetoybQsWegi2dzmhjk61MIF8h6vw/NM6xuyiXSW7dHCN1GXzWGojuSVfzv0AEj2
 0/xbRToSRWuPAvURGqiZ8GhBgG2ybHz+sDQyh/Cg4Jj2NulOLxy0lBngh5o/Dczh
 mLpqcJUd6I76bUSk+c0vehUqOEj0yAZm5Olo7gTkyoHlFSG0b2nShC1Km4TBM04A
 h5RTp/lBp++u1h4+w2EwyP8F6+chcpsyUQG1yEpz+GYoHsjYkA1gfx1rtD7uNKUI
 NwzVsuUEUt3UL4PPnjMMPYbOZb4R3vUjGFhgG5Se3D+2mNHkrvh+HA3WvUb5CUuW
 MSQ0e0K7Su3+M0/H+PQDKDPS/a0h3HtpZwaZt4gPkxnXNc5P7+c4Vyo28nttwF2O
 Ad08s6mVY8sDtA==
 =0+2v
 -----END PGP SIGNATURE-----

Merge tag 'objtool-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Thomas Gleixner:
 "A single fix for objtool to prevent an infinite loop in the
  jump table search which can be triggered when building the
  kernel with '-ffunction-sections'"

* tag 'objtool-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix infinite loop in find_jump_table()
2020-05-10 11:42:14 -07:00
Linus Torvalds
bd2049f871 A single fix for the fallout of the recent futex uacess rework.
With those changes GCC9 fails to analyze arch_futex_atomic_op_inuser()
 correctly and emits a 'maybe unitialized' warning. While we usually ignore
 compiler stupidity the conditional store is pointless anyway because the
 correct case has to store. For the fault case the extra store does no harm.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl635IgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTQkEAChNFCVLyCNihKNDar4h0LhuChYSVow
 CRpnLFKTrxWpUemHYgOQM8FBFRjvVK3o3yhmp7qyWc1LnM4iYuGleP+FhfL5F1mk
 t0ANUMFAZOomy4348XXeVR/bq7RFpKrD68tsl0u3nC+NzykN4kCt3n8qN0CendbH
 +j9ILi2eNEbSIarC4gH228UuN0YIY5nC9ftW9oHJ+c/Z23X9RXstXhiH1TB9w99E
 97G96WOdWjA+z7KzMF1REi/goJGxeZh0GQdz4iuR6vBNd4iR2V9hT3DqklUnSZPp
 +XGvaWaUH7yVa0etUdCtlBwmZ7Xq3h/N381khq9m6NfXdS8aZ7OavWyf+3urx7xz
 6GtCIlo0QnIyqx5oe1/06zxQNgNAf0JAKIi5IDLFsr8SwfoWoG1Z6RrAYugyZurm
 9RganJhVGrTXApi/9NUafhqHv7y9OE5UodRLpnKdnjei+/sE51xaIgx7Tr59Ao8n
 G3sMZkI/8GV9cQnKrg7qcN7kiJfyofoslnOigwm3hJaTMAn0fK9+Bx5YvJgVlyf2
 SmE3saw3408/hhqkVWCW5GL8J+JEh/WDi6FCZ3Fu+L1UHalzqDGKAlhfmVxxDNmt
 tDbP4AUHbucmcWl98Ms0iKtfSwz1H0kTfkaHS0cvphIfH593S4FDJEiywiKsab7v
 8nPUV2Bi6vZHxw==
 =Va5K
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Thomas Gleixner:
 "A single fix for the fallout of the recent futex uacess rework.

  With those changes GCC9 fails to analyze arch_futex_atomic_op_inuser()
  correctly and emits a 'maybe unitialized' warning. While we usually
  ignore compiler stupidity the conditional store is pointless anyway
  because the correct case has to store. For the fault case the extra
  store does no harm"

* tag 'locking-urgent-2020-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM: futex: Address build warning
2020-05-10 11:39:31 -07:00
Linus Torvalds
27d2dcb1b9 IOMMU Fixes for Linux v5.7-rc4
Including:
 
 	- The race condition fixes for the AMD IOMMU driver. This are 5
 	  patches fixing two race conditions around
 	  increase_address_space(). The first race condition was around
 	  the non-atomic update of the domain page-table root pointer
 	  and the variable containing the page-table depth (called
 	  mode). This is fixed now be merging page-table root and mode
 	  into one 64-bit field which is read/written atomically.
 
 	  The second race condition was around updating the page-table
 	  root pointer and making it public before the hardware caches
 	  were flushed. This could cause addresses to be mapped and
 	  returned to drivers which are not reachable by IOMMU hardware
 	  yet, causing IO page-faults. This is fixed too by adding the
 	  necessary flushes before a new page-table root is published.
 
 	  Related to the race condition fixes these patches also add a
 	  missing domain_flush_complete() barrier to update_domain() and
 	  a fix to bail out of the loop which tries to increase the
 	  address space when the call to increase_address_space() fails.
 
 	  Qian was able to trigger the race conditions under high load
 	  and memory pressure within a few days of testing. He confirmed
 	  that he has seen no issues anymore with the fixes included
 	  here.
 
 	- Fix for a list-handling bug in the VirtIO IOMMU driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl638MYACgkQK/BELZcB
 GuOQxQ/5AYorgKuGkqVbob69YWZuSAEG08dlzDw4C8CDnPKXEPd0L4gJGLP7BpEh
 bPJo9QJtXW7zG6Hhk8sWk9/iONsThngoudaQrodJwaQRdCDGaDZlvBaezG2Vx4xb
 A2OrcM9lvQSODdgyf3x0O1cX7vkQ4J6nJR1Z8Fw4EufjH6TS9DR0tf8ZWHtIpHa6
 Josu3M+qhUXPsn7KK5o7GtNib7sI4whLldYaASGsuaFGzod3CgA0cgmL2HfD+DWP
 k1EIEZTCaOq0BamtpyXbSA6o0AxwKERr/KONi1pL0xN4r0yCjsxEQ6+Rw4caqvgA
 zrfv3kk4a+wFAxOe0hUEtKk8Oy587LPJvIX4FnjG8hRnBrEaQC9vy4eMj05utPid
 PpsNQ35P+SyrxTlIp7ybIVhUvKbxih8SSpRsjx16vX+r/h4SRvWHzjpHVq/4+gIT
 TeZGw1g7xCIyjzn5HqLs/nMG/Ly9QHQaWia8slJJgbzI/deUXAVTy6PmMrqHB+zv
 e0PelKsq5lEQBrFX+r/Sg5hBViKaMykXKbXXg3KIolzlutJc2Rrzh4EEKpP/ug2/
 upTXf+NvMobNxb3QLqn3IJApIirEGYQqI7lwjiUwTC5xb3EfYLUuRa5i4fbOAZIv
 krsVM4sNX1S32TblTMzDDOEEggPG1wPhVF5B+1emOolYHek3ShI=
 =gqwr
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Race condition fixes for the AMD IOMMU driver.

   These are five patches fixing two race conditions around
   increase_address_space(). The first race condition was around the
   non-atomic update of the domain page-table root pointer and the
   variable containing the page-table depth (called mode). This is fixed
   now be merging page-table root and mode into one 64-bit field which
   is read/written atomically.

   The second race condition was around updating the page-table root
   pointer and making it public before the hardware caches were flushed.
   This could cause addresses to be mapped and returned to drivers which
   are not reachable by IOMMU hardware yet, causing IO page-faults. This
   is fixed too by adding the necessary flushes before a new page-table
   root is published.

   Related to the race condition fixes these patches also add a missing
   domain_flush_complete() barrier to update_domain() and a fix to bail
   out of the loop which tries to increase the address space when the
   call to increase_address_space() fails.

   Qian was able to trigger the race conditions under high load and
   memory pressure within a few days of testing. He confirmed that he
   has seen no issues anymore with the fixes included here.

 - Fix for a list-handling bug in the VirtIO IOMMU driver.

* tag 'iommu-fixes-v5.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/virtio: Reverse arguments to list_add
  iommu/amd: Do not flush Device Table in iommu_map_page()
  iommu/amd: Update Device Table in increase_address_space()
  iommu/amd: Call domain_flush_complete() in update_domain()
  iommu/amd: Do not loop forever when trying to increase address space
  iommu/amd: Fix race in increase_address_space()/fetch_pte()
2020-05-10 11:26:23 -07:00
Linus Torvalds
0a85ed6e7f block-5.7-2020-05-09
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl63WVAQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpkXWD/9qJgqQpPkigCCwwPHZ+phthw6gHeAgBxPH
 Cw6P9QB4QCdacZjQA6QH3zdxaDsCCitQRioWPgxngs1326TKYNzBi7U3eTEwiK12
 cnRybLnkzei4yzYVUSJk637oOoQh3CiJLvYcJBppGFi7crpbvlQv68M2hu05vhwL
 R/91H62X/5UaUlc1cJV63OBk8euWzF6XNbCQQrR4ayDvz+BsV5Fs72vYa1gx7qIt
 as/67oTT6y4U4pd74nT4OGkxDIXbXfn2eTbh5sMNc4ilBkqMyNbf8aOHdWqXZIBd
 18RKpNl6h/fiDMJ0jsGliReONLjfRBcJla68Kn1AFONMcyxcXidjptOwLOt2fYWf
 YMguCVMhfgxVBslzLWoQ9AWSiNVh36ycORWlCOrnRaOaQCb9OaLZ2fwibfZ0JsMd
 0259Z5vA7MIUoobCc5akXOYHbpByA9FSYkKudgTYLpdjkn05kxQyA12GgJjW3sVw
 ZRjoUuDuZDDUct6JcLWdrlONT8st05g+qf6PCoD+Jac8HtbpqHfKJJUtYecUat75
 4hGKhuvTzpuVY0wNHo3sgqKfsejQODTN6UhejNI11Zs/nx6O0ze/qoDuWZHncnKl
 158le+K5rNS8SUNbDBTMWp3OX4SJm/Gsf30fOWkkt6z1iaEfKc5sCxBHvSOeBEvH
 M9pzy56Vtw==
 =73nU
 -----END PGP SIGNATURE-----

Merge tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - a small series fixing a use-after-free of bdi name (Christoph,Yufen)

 - NVMe fix for a regression with the smaller CQ update (Alexey)

 - NVMe fix for a hang at namespace scanning error recovery (Sagi)

 - fix race with blk-iocost iocg->abs_vdebt updates (Tejun)

* tag 'block-5.7-2020-05-09' of git://git.kernel.dk/linux-block:
  nvme: fix possible hang when ns scanning fails during error recovery
  nvme-pci: fix "slimmer CQ head update"
  bdi: add a ->dev_name field to struct backing_dev_info
  bdi: use bdi_dev_name() to get device name
  bdi: move bdi_dev_name out of line
  vboxsf: don't use the source name in the bdi name
  iocost: protect iocg->abs_vdebt with iocg->waitq.lock
2020-05-10 11:16:07 -07:00
Linus Torvalds
e99332e7b4 gcc-10: mark more functions __init to avoid section mismatch warnings
It seems that for whatever reason, gcc-10 ends up not inlining a couple
of functions that used to be inlined before.  Even if they only have one
single callsite - it looks like gcc may have decided that the code was
unlikely, and not worth inlining.

The code generation difference is harmless, but caused a few new section
mismatch errors, since the (now no longer inlined) function wasn't in
the __init section, but called other init functions:

   Section mismatch in reference from the function kexec_free_initrd() to the function .init.text:free_initrd_mem()
   Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memremap()
   Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memunmap()

So add the appropriate __init annotation to make modpost not complain.
In both cases there were trivially just a single callsite from another
__init function.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 17:50:03 -07:00
Linus Torvalds
2e28f3b13a RISC-V Fixes for 5.7-rc5
This contains a smattering of fixes and cleanups that I'd like to target for
 5.7:
 
 * Dead code removal.
 * Exporting riscv_cpuid_to_hartid_mask for modules.
 * Per-CPU tracking of ISA features.
 * Setting max_pfn correctly when probing memory.
 * Adding a note to the VDSO so glibc can check the kernel's version without a
   uname().
 * A fix to force the bootloader to initialize the boot spin tables, which still
   get used as a fallback when SBI-0.1 is enabled.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl61prsTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYia2UD/44ILoaQySVnLZ+ZzXaMXn3WwGHe8bS
 NVPQJB21ejkfbM8cDR5A8+w45FBrHquIRwhHnVkl5JU2AtvcdWh3tztmFx6Ejsu9
 FFBzcbHcXnYthkm1xLVPQASY0Pl6VOPdx47Mip9gvoLK79VetjQWNzUpFk4CBJdw
 nObgYgxE9twCQ7JOcK0VnPL9IpJ6E/lCcIyCi11NL9xRWtUyWk4hcmAFj/+tUegm
 DroT7QzKKxFS24eLaRkJgQGwAJ1jb0/b0ztl04U8NTOqVjgFXkGTC1Kuzd06Ch2U
 U34CYRL+A2sXwWnnNsIyjD7Epdalc/xx+JMEuD8dhnr0YK8WilvvG53gGwCwFgVc
 wpFhvsIuINYTw253Rv0q1oeRcDmMCKmV7bhOKSX4x0V1iGM1ognl/6zkCY4J0dQC
 7BCoeAGlpBTNbidatZ6jl5e32jes50ZRjhf3LxXe3mgrBd+diKXyOyLT01SVwqv/
 A1Sur/KquwoqT4RSx2Cel8JswPhfErhB0otL3CYoao8V7rxYGTKWKXg5SFAgwDHZ
 rib1UpYmyh2tjmoXb99ctlBpRHsYcVzXOZS9tG7B2ue7YhEwiZdV3249uwitAQgm
 NmGCH7tDe/nu5DLBoFyTjBJ64pZyn3YmE58M/uCmbXyMRVSGp2TXK83u3mfiw+gh
 kKNSRHJDAAl7Fg==
 =bGU8
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "A smattering of fixes and cleanups:

   - Dead code removal.

   - Exporting riscv_cpuid_to_hartid_mask for modules.

   - Per-CPU tracking of ISA features.

   - Setting max_pfn correctly when probing memory.

   - Adding a note to the VDSO so glibc can check the kernel's version
     without a uname().

   - A fix to force the bootloader to initialize the boot spin tables,
     which still get used as a fallback when SBI-0.1 is enabled"

* tag 'riscv-for-linus-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Remove unused code from STRICT_KERNEL_RWX
  riscv: force __cpu_up_ variables to put in data section
  riscv: add Linux note to vdso
  riscv: set max_pfn to the PFN of the last page
  RISC-V: Remove N-extension related defines
  RISC-V: Add bitmap reprensenting ISA features common across CPUs
  RISC-V: Export riscv_cpuid_to_hartid_mask() API
2020-05-09 16:24:16 -07:00
Linus Torvalds
1a263ae60b gcc-10: avoid shadowing standard library 'free()' in crypto
gcc-10 has started warning about conflicting types for a few new
built-in functions, particularly 'free()'.

This results in warnings like:

   crypto/xts.c:325:13: warning: conflicting types for built-in function ‘free’; expected ‘void(void *)’ [-Wbuiltin-declaration-mismatch]

because the crypto layer had its local freeing functions called
'free()'.

Gcc-10 is in the wrong here, since that function is marked 'static', and
thus there is no chance of confusion with any standard library function
namespace.

But the simplest thing to do is to just use a different name here, and
avoid this gcc mis-feature.

[ Side note: gcc knowing about 'free()' is in itself not the
  mis-feature: the semantics of 'free()' are special enough that a
  compiler can validly do special things when seeing it.

  So the mis-feature here is that gcc thinks that 'free()' is some
  restricted name, and you can't shadow it as a local static function.

  Making the special 'free()' semantics be a function attribute rather
  than tied to the name would be the much better model ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 15:58:04 -07:00
Linus Torvalds
adc7192096 gcc-10: disable 'restrict' warning for now
gcc-10 now warns about passing aliasing pointers to functions that take
restricted pointers.

That's actually a great warning, and if we ever start using 'restrict'
in the kernel, it might be quite useful.  But right now we don't, and it
turns out that the only thing this warns about is an idiom where we have
declared a few functions to be "printf-like" (which seems to make gcc
pick up the restricted pointer thing), and then we print to the same
buffer that we also use as an input.

And people do that as an odd concatenation pattern, with code like this:

    #define sysfs_show_gen_prop(buffer, fmt, ...) \
        snprintf(buffer, PAGE_SIZE, "%s"fmt, buffer, __VA_ARGS__)

where we have 'buffer' as both the destination of the final result, and
as the initial argument.

Yes, it's a bit questionable.  And outside of the kernel, people do have
standard declarations like

    int snprintf( char *restrict buffer, size_t bufsz,
                  const char *restrict format, ... );

where that output buffer is marked as a restrict pointer that cannot
alias with any other arguments.

But in the context of the kernel, that 'use snprintf() to concatenate to
the end result' does work, and the pattern shows up in multiple places.
And we have not marked our own version of snprintf() as taking restrict
pointers, so the warning is incorrect for now, and gcc picks it up on
its own.

If we do start using 'restrict' in the kernel (and it might be a good
idea if people find places where it matters), we'll need to figure out
how to avoid this issue for snprintf and friends.  But in the meantime,
this warning is not useful.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 15:45:21 -07:00
Linus Torvalds
5a76021c2e gcc-10: disable 'stringop-overflow' warning for now
This is the final array bounds warning removal for gcc-10 for now.

Again, the warning is good, and we should re-enable all these warnings
when we have converted all the legacy array declaration cases to
flexible arrays. But in the meantime, it's just noise.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 15:40:52 -07:00
Sagi Grimberg
59c7c3caaa nvme: fix possible hang when ns scanning fails during error recovery
When the controller is reconnecting, the host fails I/O and admin
commands as the host cannot reach the controller. ns scanning may
revalidate namespaces during that period and it is wrong to remove
namespaces due to these failures as we may hang (see 205da24343).

One command that may fail is nvme_identify_ns_descs. Since we return
success due to having ns identify descriptor list optional, we continue
to compare ns identifiers in nvme_revalidate_disk, obviously fail and
return -ENODEV to nvme_validate_ns, which will remove the namespace.

Exactly what we don't want to happen.

Fixes: 22802bf742 ("nvme: Namepace identification descriptor list is optional")
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:58 -06:00
Alexey Dobriyan
a8de663916 nvme-pci: fix "slimmer CQ head update"
Pre-incrementing ->cq_head can't be done in memory because OOB value
can be observed by another context.

This devalues space savings compared to original code :-\

	$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux
	add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-32 (-32)
	Function                                     old     new   delta
	nvme_poll_irqdisable                         464     456      -8
	nvme_poll                                    455     447      -8
	nvme_irq                                     388     380      -8
	nvme_dev_disable                             955     947      -8

But the code is minimal now: one read for head, one read for q_depth,
one increment, one comparison, single instruction phase bit update and
one write for new head.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: John Garry <john.garry@huawei.com>
Tested-by: John Garry <john.garry@huawei.com>
Fixes: e2a366a4b0 ("nvme-pci: slimmer CQ head update")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:58 -06:00
Christoph Hellwig
6bd87eec23 bdi: add a ->dev_name field to struct backing_dev_info
Cache a copy of the name for the life time of the backing_dev_info
structure so that we can reference it even after unregistering.

Fixes: 68f23b8906 ("memcg: fix a crash in wb_workfn when a device disappears")
Reported-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:57 -06:00
Yufen Yu
d51cfc53ad bdi: use bdi_dev_name() to get device name
Use the common interface bdi_dev_name() to get device name.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Add missing <linux/backing-dev.h> include BFQ

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:39 -06:00
Linus Torvalds
44720996e2 gcc-10: disable 'array-bounds' warning for now
This is another fine warning, related to the 'zero-length-bounds' one,
but hitting the same historical code in the kernel.

Because C didn't historically support flexible array members, we have
code that instead uses a one-sized array, the same way we have cases of
zero-sized arrays.

The one-sized arrays come from either not wanting to use the gcc
zero-sized array extension, or from a slight convenience-feature, where
particularly for strings, the size of the structure now includes the
allocation for the final NUL character.

So with a "char name[1];" at the end of a structure, you can do things
like

       v = my_malloc(sizeof(struct vendor) + strlen(name));

and avoid the "+1" for the terminator.

Yes, the modern way to do that is with a flexible array, and using
'offsetof()' instead of 'sizeof()', and adding the "+1" by hand.  That
also technically gets the size "more correct" in that it avoids any
alignment (and thus padding) issues, but this is another long-term
cleanup thing that will not happen for 5.7.

So disable the warning for now, even though it's potentially quite
useful.  Having a slew of warnings that then hide more urgent new issues
is not an improvement.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 14:52:44 -07:00
Linus Torvalds
5c45de21a2 gcc-10: disable 'zero-length-bounds' warning for now
This is a fine warning, but we still have a number of zero-length arrays
in the kernel that come from the traditional gcc extension.  Yes, they
are getting converted to flexible arrays, but in the meantime the gcc-10
warning about zero-length bounds is very verbose, and is hiding other
issues.

I missed one actual build failure because it was hidden among hundreds
of lines of warning.  Thankfully I caught it on the second go before
pushing things out, but it convinced me that I really need to disable
the new warnings for now.

We'll hopefully be all done with our conversion to flexible arrays in
the not too distant future, and we can then re-enable this warning.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 14:30:29 -07:00
Linus Torvalds
78a5255ffb Stop the ad-hoc games with -Wno-maybe-initialized
We have some rather random rules about when we accept the
"maybe-initialized" warnings, and when we don't.

For example, we consider it unreliable for gcc versions < 4.9, but also
if -O3 is enabled, or if optimizing for size.  And then various kernel
config options disabled it, because they know that they trigger that
warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES).

And now gcc-10 seems to be introducing a lot of those warnings too, so
it falls under the same heading as 4.9 did.

At the same time, we have a very straightforward way to _enable_ that
warning when wanted: use "W=2" to enable more warnings.

So stop playing these ad-hoc games, and just disable that warning by
default, with the known and straight-forward "if you want to work on the
extra compiler warnings, use W=123".

Would it be great to have code that is always so obvious that it never
confuses the compiler whether a variable is used initialized or not?
Yes, it would.  In a perfect world, the compilers would be smarter, and
our source code would be simpler.

That's currently not the world we live in, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-09 13:57:10 -07:00
Linus Torvalds
1d3962ae3b io_uring-5.7-2020-05-08
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl62HvYQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgptEAEACbuLfgFok0Vw8j7KNW0WNNKlS2o6nXQlW5
 cl95JsqYdSL+toiDPQnJFtdoaxMhzL90kbWZzvPTBj+yTpLzRX0YnwFqXwFfmrga
 gd/7SOM5C97F1LCPL+luhbgp5HUq+ZVH882KjMiOVLvjjAb4SeKSexQGoxeKvtcV
 Pg3xm+zsbKKvclRDEqhnZB1X93WFAIrufuKBuV5xMZar7lkeRS9zwBUHySXa00xF
 i7lbvDqtNn3itgNQd7VGSNCF5u4JxCUm73SumY3nDMFXBfvSNk0nUpFBpTYLjb7G
 0XY71tfWrBlbk1sssqr1Dbs+pRuxJRj9FgtfNAMid7gcK0L9k6n7v08cFxkIz4Sv
 XPHisD6QCOz7pZ5JwfdAp9Ea5g9z+QsN0G1Owr18fSgWwlgvhJ9rdd4H0Of7rWVj
 mGyF5f+ZqoLD2UhaEmLgjQoSvzPlb6rsAUL9SxgpZkg/mk5l0j5tk32JS5bJL8h5
 RTj0oeyqoVGKqnRy8heV/0z6TqcEtuNn/nOsht8adCgIUVpk95bkjTGBM900IK/X
 HhdJMqPlTEDXQic+ZxVYNHDTZFhq4UOVJkoDfEwIN971LZfUaiz8XZ6uG5m4rFqj
 iRmLN5XJNVNK52hNT1dLQyeQ4j3a5OnVGsvjZ33QLy2P6rCZd7yU6jKfsoL8JDEU
 uAzkaWqLjA==
 =YeXV
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - Fix finish_wait() balancing in file cancelation (Xiaoguang)

 - Ensure early cleanup of resources in ring map failure (Xiaoguang)

 - Ensure IORING_OP_SLICE does the right file mode checks (Pavel)

 - Remove file opening from openat/openat2/statx, it's not needed and
   messes with O_PATH

* tag 'io_uring-5.7-2020-05-08' of git://git.kernel.dk/linux-block:
  io_uring: don't use 'fd' for openat/openat2/statx
  splice: move f_mode checks to do_{splice,tee}()
  io_uring: handle -EFAULT properly in io_uring_setup()
  io_uring: fix mismatched finish_wait() calls in io_uring_cancel_files()
2020-05-09 12:02:09 -07:00
Linus Torvalds
d5eeab8d7e SCSI fixes on 20200508
Four minor fixes, all in drivers (qla2xxx, ibmvfc, ibmvscsi)
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXrWUpiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaQbAQCus5pe
 D+e8jb8VzwbT+tr6HgPvaUOoSgrBJpXHy1oVFAEAwWuT9h4yHU8rsis5UR1MMh8F
 aTW9+wWDpdnTqoZN3iY=
 =fl/G
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four minor fixes, all in drivers (qla2xxx, ibmvfc, ibmvscsi)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ibmvscsi: Fix WARN_ON during event pool release
  scsi: ibmvfc: Don't send implicit logouts prior to NPIV login
  scsi: qla2xxx: Delete all sessions before unregister local nvme port
  scsi: qla2xxx: Fix hang when issuing nvme disconnect-all in NPIV
2020-05-08 10:36:56 -07:00
Linus Torvalds
eb24fdd8e6 Fixes for an endianness handling bug that prevented mounts on
big-endian arches, a spammy log message and a couple error paths.
 Also included a MAINTAINERS update.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAl61ktUTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi3yKB/9s0kZ7fLYtGzqtuoIjualsaM0lsBBS
 rWAN4BkIVsxp3eOd5Hdb+ngIY5ykLLcUd+4gKqUNHkB7/1upDq9ZURKlyTwel5Wy
 889YEYESCVQQxPVY9KNvafaPeuR++2r9Thlp9hWyczrtvXtz80sFIrtO9TwDrj1P
 ZXPN3lxppGlxQiVNQfKIw2Cs78OxaNu9BthXZ7jN2OGaMQ0NU6sZ4LRXz8rbY+od
 AbfLEfwz4dPHQ/44k3rQg2IWNuOxRK+CNayxhuN0KWzock3MzGVYoYkPx0wNLiDx
 rntMscBqh3kppILZPEIeIA5Nv0yDAf4tf2hcUDf7GoJT/L/f9v7Q2SHa
 =75Ca
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Fixes for an endianness handling bug that prevented mounts on
  big-endian arches, a spammy log message and a couple error paths.

  Also included a MAINTAINERS update"

* tag 'ceph-for-5.7-rc5' of git://github.com/ceph/ceph-client:
  ceph: demote quotarealm lookup warning to a debug message
  MAINTAINERS: remove myself as ceph co-maintainer
  ceph: fix double unlock in handle_cap_export()
  ceph: fix special error code in ceph_try_get_caps()
  ceph: fix endianness bug when handling MDS session feature bits
2020-05-08 10:27:00 -07:00
Luis Henriques
12ae44a40a ceph: demote quotarealm lookup warning to a debug message
A misconfigured cephx can easily result in having the kernel client
flooding the logs with:

  ceph: Can't lookup inode 1 (err: -13)

Change this message to debug level.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/44546
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-05-08 18:44:40 +02:00
Linus Torvalds
4334f30ebf Char/Misc driver fixes for 5.7-rc5
Here are some small driver fixes for 5.7-rc5 that resolve a number of
 minor reported issues:
 	- mhi bus driver fixes found as people actually use the code
 	- phy driver fixes and compat string additions
 	- most driver fix due to link order changing when the core moved
 	  out of staging
 	- mei driver fix
 	- interconnect build warning fix
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXrVocw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynQ3wCaA6rSYBcPTAHNrGo0iuzan0sbAfsAnAkfjZPk
 s369btDipPKIBv2hoVwt
 =KzSN
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small driver fixes for 5.7-rc5 that resolve a number of
  minor reported issues:

   - mhi bus driver fixes found as people actually use the code

   - phy driver fixes and compat string additions

   - most driver fix due to link order changing when the core moved out
     of staging

   - mei driver fix

   - interconnect build warning fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  bus: mhi: core: Fix channel device name conflict
  bus: mhi: core: Fix typo in comment
  bus: mhi: core: Offload register accesses to the controller
  bus: mhi: core: Remove link_status() callback
  bus: mhi: core: Make sure to powerdown if mhi_sync_power_up fails
  bus: mhi: Fix parsing of mhi_flags
  mei: me: disable mei interface on LBG servers.
  phy: qualcomm: usb-hs-28nm: Prepare clocks in init
  MAINTAINERS: Add Vinod Koul as Generic PHY co-maintainer
  interconnect: qcom: Move the static keyword to the front of declaration
  most: core: use function subsys_initcall()
  bus: mhi: core: Fix a NULL vs IS_ERR check in mhi_create_devices()
  phy: qcom-qusb2: Re add "qcom,sdm845-qusb2-phy" compat string
  phy: tegra: Select USB_COMMON for usb_get_maximum_speed()
2020-05-08 09:11:53 -07:00