Commit Graph

7304 Commits

Author SHA1 Message Date
Linus Torvalds
22230cd2c5 Merge branch 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat mount cleanups from Al Viro:
 "The last remnants of mount(2) compat buried by Christoph.

  Buried into NFS, that is.

  Generally I'm less enthusiastic about "let's use in_compat_syscall()
  deep in call chain" kind of approach than Christoph seems to be, but
  in this case it's warranted - that had been an NFS-specific wart,
  hopefully not to be repeated in any other filesystems (read: any new
  filesystem introducing non-text mount options will get NAKed even if
  it doesn't mess the layout up).

  IOW, not worth trying to grow an infrastructure that would avoid that
  use of in_compat_syscall()..."

* 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: remove compat_sys_mount
  fs,nfs: lift compat nfs4 mount data handling into the nfs code
  nfs: simplify nfs4_parse_monolithic
2020-10-12 16:44:57 -07:00
Linus Torvalds
e18afa5bfa Merge branch 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat quotactl cleanups from Al Viro:
 "More Christoph's compat cleanups: quotactl(2)"

* 'work.quota-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  quota: simplify the quotactl compat handling
  compat: add a compat_need_64bit_alignment_fixup() helper
  compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
2020-10-12 16:37:13 -07:00
Linus Torvalds
85ed13e78d Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat iovec cleanups from Al Viro:
 "Christoph's series around import_iovec() and compat variant thereof"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  security/keys: remove compat_keyctl_instantiate_key_iov
  mm: remove compat_process_vm_{readv,writev}
  fs: remove compat_sys_vmsplice
  fs: remove the compat readv/writev syscalls
  fs: remove various compat readv/writev helpers
  iov_iter: transparently handle compat iovecs in import_iovec
  iov_iter: refactor rw_copy_check_uvector and import_iovec
  iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
  compat.h: fix a spelling error in <linux/compat.h>
2020-10-12 16:35:51 -07:00
Linus Torvalds
c90578360c Merge branch 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull copy_and_csum cleanups from Al Viro:
 "Saner calling conventions for csum_and_copy_..._user() and friends"

[ Removing 800+ lines of code and cleaning stuff up is good  - Linus ]

* 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ppc: propagate the calling conventions change down to csum_partial_copy_generic()
  amd64: switch csum_partial_copy_generic() to new calling conventions
  sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
  xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
  mips: propagate the calling convention change down into __csum_partial_copy_..._user()
  mips: __csum_partial_copy_kernel() has no users left
  mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
  sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
  i386: propagate the calling conventions change down to csum_partial_copy_generic()
  sh: propage the calling conventions change down to csum_partial_copy_generic()
  m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
  arm: propagate the calling convention changes down to csum_partial_copy_from_user()
  alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
  saner calling conventions for csum_and_copy_..._user()
  csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  csum_partial_copy_nocheck(): drop the last argument
  unify generic instances of csum_partial_copy_nocheck()
  icmp_push_reply(): reorder adding the checksum up
  skb_copy_and_csum_bits(): don't bother with the last argument
2020-10-12 16:24:13 -07:00
Linus Torvalds
1c6890707e This tree prepares to unify the kretprobe trampoline handler and make
kretprobe lockless. (Those patches are still work in progress.)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+EgmMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hKQg//WrYVMc+lLG+QP4IuKfolZVGNeS60crZE
 mTvs4iX8gBrU5omrgatrjUiDhrln6MiTf6H0ec072BAho91lom/AlyDUQbta5sls
 uXKzIjHe9J7ca+myXGDiXkGmWXgcBYHBHifyzf04xhPyFXH869HLxFXCHeV1S3m7
 Tga1Lceths425t8nnYb9yao9k26l22BSklzPqEM/XNNnktrMiaiYlfgUxi1g3hMj
 v9IbZy43qpzljyrnfRk/tRGMnZ/BtZpj7swQEjUVOKgmcymX6bQoxqYvpAH5mYX7
 jqKcTLsw/Jm4YhZdeBpjZc2JNQkNJSLjiXMMtQTmncPKx2shuU1s4KhgRtYEEeyI
 BO37k3RwplED7/yBJtojNt0WWYfd7X2ee8SPuSW/VPL6jSDgJii3Um0AldPZ0J3g
 72OT4rJkyqFER0ZKSf8uIym2Zi7F5IvtzK2xJAzquOQlYdCaKSNrWurckOzWHMm9
 JKqUqq3nV4mFUKEE7Kf0Nu3UgQZNKpxUNepWBoJb3j6baK32Qgb6qpNLLPTTi2qJ
 AwxicRlr7jzdyP2cwvU5z2FuilPypOob8ZnowhhIyV+4xQY9CymJ3uluXattDC74
 ZNgydTyyYCo0PwYZGUDeE8o87apYd3+sEOErLtw4CjaoiadxDaMBmfsHzU7W29Rc
 Fow4+FQCK/Y=
 =2jY/
 -----END PGP SIGNATURE-----

Merge tag 'perf-kprobes-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf/kprobes updates from Ingo Molnar:
 "This prepares to unify the kretprobe trampoline handler and make
  kretprobe lockless (those patches are still work in progress)"

* tag 'perf-kprobes-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()
  kprobes: Make local functions static
  kprobes: Free kretprobe_instance with RCU callback
  kprobes: Remove NMI context check
  sparc: kprobes: Use generic kretprobe trampoline handler
  sh: kprobes: Use generic kretprobe trampoline handler
  s390: kprobes: Use generic kretprobe trampoline handler
  powerpc: kprobes: Use generic kretprobe trampoline handler
  parisc: kprobes: Use generic kretprobe trampoline handler
  mips: kprobes: Use generic kretprobe trampoline handler
  ia64: kprobes: Use generic kretprobe trampoline handler
  csky: kprobes: Use generic kretprobe trampoline handler
  arc: kprobes: Use generic kretprobe trampoline handler
  arm64: kprobes: Use generic kretprobe trampoline handler
  arm: kprobes: Use generic kretprobe trampoline handler
  x86/kprobes: Use generic kretprobe trampoline handler
  kprobes: Add generic kretprobe trampoline handler
2020-10-12 14:21:15 -07:00
Linus Torvalds
34eb62d868 Orphan link sections were a long-standing source of obscure bugs,
because the heuristics that various linkers & compilers use to handle them
 (include these bits into the output image vs discarding them silently)
 are both highly idiosyncratic and also version dependent.
 
 Instead of this historically problematic mess, this tree by Kees Cook (et al)
 adds build time asserts and build time warnings if there's any orphan section
 in the kernel or if a section is not sized as expected.
 
 And because we relied on so many silent assumptions in this area, fix a metric
 ton of dependencies and some outright bugs related to this, before we can
 finally enable the checks on the x86, ARM and ARM64 platforms.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+Edv4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hiKBAApdJEOaK7hMc3013DYNctklIxEPJL2mFJ
 11YJRIh4pUJTF0TE+EHT/D+rSIuRsyuoSmOQBQ61/wVSnyG067GjjVJRqh/eYaJ1
 fDhJi2FuHOjXl+CiN0KxzBjjp+V4NhF7jHT59tpQSvfZeg7FjteoxfztxaCp5ek3
 S3wHB3CC4c4jE3lfjHem1E9/PwT4kwPYx1c3gAUdEqJdjkihjX9fWusfjLeqW6/d
 Y5VkApi6bL9XiZUZj5l0dEIweLJJ86+PkKJqpo3spxxEak1LSn1MEix+lcJ8e1Kg
 sb/bEEivDcmFlFWOJnn0QLquCR0Cx5bz1pwsL0tuf0yAd4+sXX5IMuGUysZlEdKM
 BHL9h5HbevGF4BScwZwZH7lyEg7q67s5KnRu4hxy0Swfcj7y0oT/9lXqpbpZ2DqO
 Hd+bRRQKIbqnTMp0hcit9LfpLp93vj0dBlaV5ocAJJlu62u9VnwGG5HQuZ5giLUr
 kA1SLw63Y1wopFRxgFyER8les7eLsu0zxHeK44rRVlVnfI99OMTOgVNicmDFy3Fm
 AfcnfJG0BqBEJGQz5es34uQQKKBwFPtC9NztopI62KiwOspYYZyrO1BNxdOc6DlS
 mIHrmO89HMXuid5eolvLaFqUWirHoWO8TlycgZxUWVHc2txVPjAEU/axouU/dSSU
 w/6GpzAa+7g=
 =fXAw
 -----END PGP SIGNATURE-----

Merge tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull orphan section checking from Ingo Molnar:
 "Orphan link sections were a long-standing source of obscure bugs,
  because the heuristics that various linkers & compilers use to handle
  them (include these bits into the output image vs discarding them
  silently) are both highly idiosyncratic and also version dependent.

  Instead of this historically problematic mess, this tree by Kees Cook
  (et al) adds build time asserts and build time warnings if there's any
  orphan section in the kernel or if a section is not sized as expected.

  And because we relied on so many silent assumptions in this area, fix
  a metric ton of dependencies and some outright bugs related to this,
  before we can finally enable the checks on the x86, ARM and ARM64
  platforms"

* tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/boot/compressed: Warn on orphan section placement
  x86/build: Warn on orphan section placement
  arm/boot: Warn on orphan section placement
  arm/build: Warn on orphan section placement
  arm64/build: Warn on orphan section placement
  x86/boot/compressed: Add missing debugging sections to output
  x86/boot/compressed: Remove, discard, or assert for unwanted sections
  x86/boot/compressed: Reorganize zero-size section asserts
  x86/build: Add asserts for unwanted sections
  x86/build: Enforce an empty .got.plt section
  x86/asm: Avoid generating unused kprobe sections
  arm/boot: Handle all sections explicitly
  arm/build: Assert for unwanted sections
  arm/build: Add missing sections
  arm/build: Explicitly keep .ARM.attributes sections
  arm/build: Refactor linker script headers
  arm64/build: Assert for unwanted sections
  arm64/build: Add missing DWARF sections
  arm64/build: Use common DISCARDS in linker script
  arm64/build: Remove .eh_frame* sections due to unwind tables
  ...
2020-10-12 13:39:19 -07:00
Linus Torvalds
cc7343724e Surgery of the MSI interrupt handling to prepare the support of upcoming
devices which require non-PCI based MSI handling.
 
   - Cleanup historical leftovers all over the place
 
   - Rework the code to utilize more core functionality
 
   - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain
     assignment to PCI devices possible.
 
   - Assign irqdomains to PCI devices at initialization time which allows
     to utilize the full functionality of hierarchical irqdomains.
 
   - Remove arch_.*_msi_irq() functions from X86 and utilize the irqdomain
     which is assigned to the device for interrupt management.
 
   - Make the arch_.*_msi_irq() support conditional on a config switch and
     let the last few users select it.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+EUxcTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoagLEACGp5U7a4mk24GsOZJDhrua1PHR/fhb
 enn/5yOPpxDXdYmtFHIjV5qrNjDTV/WqDlI96KOi+oinG1Eoj0O/MA1AcSRhp6nf
 jVdAuK1X0DHDUTEeTAP0JFwqd2j0KlIOphBrIMgeWIf1CRKlYiJaO+ioF9fKgwZ/
 /HigOTSykGYMPggm3JXnWTWtJkKSGFxeADBvVHt5RpVmbWtrI4YoSBxKEMtvjyeM
 5+GsqbCad1CnFYTN74N+QWVGmgGnUWGEzWsPYnJ9hW+yyjad1kWx3n6NcCWhssaC
 E4vAXl6JuCPntL7jBFkbfUkQsgq12ThMZYWpCq8pShJA9O2tDKkxIGasHWrIt4cz
 nYrESiv6hM7edjtOvBc086Gd0A2EyGOM879goHyaNVaTO4rI6jfZG7PlW1HHWibS
 mf/bdTXBtULGNgEt7T8Qnb8sZ+D01WqzLrq/wm645jIrTzvNHUEpOhT1aH/g4TFQ
 cNHD5PcM9OTmiBir9srNd47+1s2mpfwdMYHKBt2QgiXMO8fRgdtr6WLQE4vJjmG8
 sA0yGGsgdTKeg2wW1ERF1pWL0Lt05Iaa42Skm0D3BwcOG2n5ltkBHzVllto9cTUh
 kIldAOgxGE6QeCnnlrnbHz5mvzt/3Ih/PIKqPSUAC94Kx1yvVHRYuOvDExeO8DFB
 P+f0TkrscZObSg==
 =JlqV
 -----END PGP SIGNATURE-----

Merge tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 irq updates from Thomas Gleixner:
 "Surgery of the MSI interrupt handling to prepare the support of
  upcoming devices which require non-PCI based MSI handling:

   - Cleanup historical leftovers all over the place

   - Rework the code to utilize more core functionality

   - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain
     assignment to PCI devices possible.

   - Assign irqdomains to PCI devices at initialization time which
     allows to utilize the full functionality of hierarchical
     irqdomains.

   - Remove arch_.*_msi_irq() functions from X86 and utilize the
     irqdomain which is assigned to the device for interrupt management.

   - Make the arch_.*_msi_irq() support conditional on a config switch
     and let the last few users select it"

* tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS
  x86/apic/msi: Unbreak DMAR and HPET MSI
  iommu/amd: Remove domain search for PCI/MSI
  iommu/vt-d: Remove domain search for PCI/MSI[X]
  x86/irq: Make most MSI ops XEN private
  x86/irq: Cleanup the arch_*_msi_irqs() leftovers
  PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
  x86/pci: Set default irq domain in pcibios_add_device()
  iommm/amd: Store irq domain in struct device
  iommm/vt-d: Store irq domain in struct device
  x86/xen: Wrap XEN MSI management into irqdomain
  irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
  x86/xen: Consolidate XEN-MSI init
  x86/xen: Rework MSI teardown
  x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
  PCI/MSI: Provide pci_dev_has_special_msi_domain() helper
  PCI_vmd_Mark_VMD_irqdomain_with_DOMAIN_BUS_VMD_MSI
  irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
  x86/irq: Initialize PCI/MSI domain at PCI init time
  x86/pci: Reducde #ifdeffery in PCI init code
  ...
2020-10-12 11:40:41 -07:00
Linus Torvalds
6734e20e39 arm64 updates for 5.10
- Userspace support for the Memory Tagging Extension introduced by Armv8.5.
   Kernel support (via KASAN) is likely to follow in 5.11.
 
 - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
   switching.
 
 - Fix and subsequent rewrite of our Spectre mitigations, including the
   addition of support for PR_SPEC_DISABLE_NOEXEC.
 
 - Support for the Armv8.3 Pointer Authentication enhancements.
 
 - Support for ASID pinning, which is required when sharing page-tables with
   the SMMU.
 
 - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op.
 
 - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and
   also support to handle CPU PMU IRQs as NMIs.
 
 - Allow prefetchable PCI BARs to be exposed to userspace using normal
   non-cacheable mappings.
 
 - Implementation of ARCH_STACKWALK for unwinding.
 
 - Improve reporting of unexpected kernel traps due to BPF JIT failure.
 
 - Improve robustness of user-visible HWCAP strings and their corresponding
   numerical constants.
 
 - Removal of TEXT_OFFSET.
 
 - Removal of some unused functions, parameters and prototypes.
 
 - Removal of MPIDR-based topology detection in favour of firmware
   description.
 
 - Cleanups to handling of SVE and FPSIMD register state in preparation
   for potential future optimisation of handling across syscalls.
 
 - Cleanups to the SDEI driver in preparation for support in KVM.
 
 - Miscellaneous cleanups and refactoring work.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+AUXMQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNFc1B/4q2Kabe+pPu7s1f58Q+OTaEfqcr3F1qh27
 F1YpFZUYxg0GPfPsFrnbJpo5WKo7wdR9ceI9yF/GHjs7A/MSoQJis3pG6SlAd9c0
 nMU5tCwhg9wfq6asJtl0/IPWem6cqqhdzC6m808DjeHuyi2CCJTt0vFWH3OeHEhG
 cfmLfaSNXOXa/MjEkT8y1AXJ/8IpIpzkJeCRA1G5s18PXV9Kl5bafIo9iqyfKPLP
 0rJljBmoWbzuCSMc81HmGUQI4+8KRp6HHhyZC/k0WEVgj3LiumT7am02bdjZlTnK
 BeNDKQsv2Jk8pXP2SlrI3hIUTz0bM6I567FzJEokepvTUzZ+CVBi
 =9J8H
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's quite a lot of code here, but much of it is due to the
  addition of a new PMU driver as well as some arm64-specific selftests
  which is an area where we've traditionally been lagging a bit.

  In terms of exciting features, this includes support for the Memory
  Tagging Extension which narrowly missed 5.9, hopefully allowing
  userspace to run with use-after-free detection in production on CPUs
  that support it. Work is ongoing to integrate the feature with KASAN
  for 5.11.

  Another change that I'm excited about (assuming they get the hardware
  right) is preparing the ASID allocator for sharing the CPU page-table
  with the SMMU. Those changes will also come in via Joerg with the
  IOMMU pull.

  We do stray outside of our usual directories in a few places, mostly
  due to core changes required by MTE. Although much of this has been
  Acked, there were a couple of places where we unfortunately didn't get
  any review feedback.

  Other than that, we ran into a handful of minor conflicts in -next,
  but nothing that should post any issues.

  Summary:

   - Userspace support for the Memory Tagging Extension introduced by
     Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11.

   - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
     switching.

   - Fix and subsequent rewrite of our Spectre mitigations, including
     the addition of support for PR_SPEC_DISABLE_NOEXEC.

   - Support for the Armv8.3 Pointer Authentication enhancements.

   - Support for ASID pinning, which is required when sharing
     page-tables with the SMMU.

   - MM updates, including treating flush_tlb_fix_spurious_fault() as a
     no-op.

   - Perf/PMU driver updates, including addition of the ARM CMN PMU
     driver and also support to handle CPU PMU IRQs as NMIs.

   - Allow prefetchable PCI BARs to be exposed to userspace using normal
     non-cacheable mappings.

   - Implementation of ARCH_STACKWALK for unwinding.

   - Improve reporting of unexpected kernel traps due to BPF JIT
     failure.

   - Improve robustness of user-visible HWCAP strings and their
     corresponding numerical constants.

   - Removal of TEXT_OFFSET.

   - Removal of some unused functions, parameters and prototypes.

   - Removal of MPIDR-based topology detection in favour of firmware
     description.

   - Cleanups to handling of SVE and FPSIMD register state in
     preparation for potential future optimisation of handling across
     syscalls.

   - Cleanups to the SDEI driver in preparation for support in KVM.

   - Miscellaneous cleanups and refactoring work"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
  Revert "arm64: initialize per-cpu offsets earlier"
  arm64: random: Remove no longer needed prototypes
  arm64: initialize per-cpu offsets earlier
  kselftest/arm64: Check mte tagged user address in kernel
  kselftest/arm64: Verify KSM page merge for MTE pages
  kselftest/arm64: Verify all different mmap MTE options
  kselftest/arm64: Check forked child mte memory accessibility
  kselftest/arm64: Verify mte tag inclusion via prctl
  kselftest/arm64: Add utilities and a test to validate mte memory
  perf: arm-cmn: Fix conversion specifiers for node type
  perf: arm-cmn: Fix unsigned comparison to less than zero
  arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
  arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
  arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
  arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
  KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
  arm64: Get rid of arm64_ssbd_state
  KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
  KVM: arm64: Get rid of kvm_arm_have_ssbd()
  KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
  ...
2020-10-12 10:00:51 -07:00
Christoph Hellwig
c3973b401e mm: remove compat_process_vm_{readv,writev}
Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:15 -04:00
Christoph Hellwig
598b3cec83 fs: remove compat_sys_vmsplice
Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:15 -04:00
Christoph Hellwig
5f764d624a fs: remove the compat readv/writev syscalls
Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:14 -04:00
Thomas Gleixner
981aa1d366 PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS
The unconditional selection of PCI_MSI_ARCH_FALLBACKS has an unmet
dependency because PCI_MSI_ARCH_FALLBACKS is defined in a 'if PCI' clause.

As it is only relevant when PCI_MSI is enabled, update the affected
architecture Kconfigs to make the selection of PCI_MSI_ARCH_FALLBACKS
depend on 'if PCI_MSI'.

Fixes: 077ee78e39 ("PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable")
Reported-by: Qian Cai <cai@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Links: https://lore.kernel.org/r/cdfd63305caa57785b0925dd24c0711ea02c8527.camel@redhat.com
2020-09-28 16:03:10 +02:00
Vasily Gorbik
d3f7b1bb20 mm/gup: fix gup_fast with dynamic page table folding
Currently to make sure that every page table entry is read just once
gup_fast walks perform READ_ONCE and pass pXd value down to the next
gup_pXd_range function by value e.g.:

  static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                           unsigned int flags, struct page **pages, int *nr)
  ...
          pudp = pud_offset(&p4d, addr);

This function passes a reference on that local value copy to pXd_offset,
and might get the very same pointer in return.  This happens when the
level is folded (on most arches), and that pointer should not be
iterated.

On s390 due to the fact that each task might have different 5,4 or
3-level address translation and hence different levels folded the logic
is more complex and non-iteratable pointer to a local copy leads to
severe problems.

Here is an example of what happens with gup_fast on s390, for a task
with 3-level paging, crossing a 2 GB pud boundary:

  // addr = 0x1007ffff000, end = 0x10080001000
  static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                           unsigned int flags, struct page **pages, int *nr)
  {
        unsigned long next;
        pud_t *pudp;

        // pud_offset returns &p4d itself (a pointer to a value on stack)
        pudp = pud_offset(&p4d, addr);
        do {
                // on second iteratation reading "random" stack value
                pud_t pud = READ_ONCE(*pudp);

                // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
                next = pud_addr_end(addr, end);
                ...
        } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack

        return 1;
  }

This happens since s390 moved to common gup code with commit
d1874a0c28 ("s390/mm: make the pxd_offset functions more robust") and
commit 1a42010cdc ("s390/mm: convert to the generic
get_user_pages_fast code").

s390 tried to mimic static level folding by changing pXd_offset
primitives to always calculate top level page table offset in pgd_offset
and just return the value passed when pXd_offset has to act as folded.

What is crucial for gup_fast and what has been overlooked is that
PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly.
And the latter is not possible with dynamic folding.

To fix the issue in addition to pXd values pass original pXdp pointers
down to gup_pXd_range functions.  And introduce pXd_offset_lockless
helpers, which take an additional pXd entry value parameter.  This has
already been discussed in

  https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1

Fixes: 1a42010cdc ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: <stable@vger.kernel.org>	[5.2+]
Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hours
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-26 10:33:57 -07:00
Christoph Hellwig
028abd9222 fs: remove compat_sys_mount
compat_sys_mount is identical to the regular sys_mount now, so remove it
and use the native version everywhere.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-22 23:45:57 -04:00
Mark Brown
264c03a245 stacktrace: Remove reliable argument from arch_stack_walk() callback
Currently the callback passed to arch_stack_walk() has an argument called
reliable passed to it to indicate if the stack entry is reliable, a comment
says that this is used by some printk() consumers. However in the current
kernel none of the arch_stack_walk() implementations ever set this flag to
true and the only callback implementation we have is in the generic
stacktrace code which ignores the flag. It therefore appears that this
flag is redundant so we can simplify and clarify things by removing it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20200914153409.25097-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18 14:24:16 +01:00
Christoph Hellwig
cc7886d25b compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
lift the compat_s64 and compat_u64 definitions into common code using the
COMPAT_FOR_U64_ALIGNMENT symbol for the x86 special case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-17 13:00:46 -04:00
Thomas Gleixner
077ee78e39 PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture
requires them or not. Architectures which are fully utilizing hierarchical
irq domains should never call into that code.

It's not only architectures which depend on that by implementing one or
more of the weak functions, there is also a bunch of drivers which relies
on the weak functions which invoke msi_controller::setup_irq[s] and
msi_controller::teardown_irq.

Make the architectures and drivers which rely on them select them in Kconfig
and if not selected replace them by stub functions which emit a warning and
fail the PCI/MSI interrupt allocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200826112333.992429909@linutronix.de
2020-09-16 16:52:37 +02:00
Janosch Frank
cd4d3d5f21 s390: add 3f program exception handler
Program exception 3f (secure storage violation) can only be detected
when the CPU is running in SIE with a format 4 state description,
e.g. running a protected guest. Because of this and because user
space partly controls the guest memory mapping and can trigger this
exception, we want to send a SIGSEGV to the process running the guest
and not panic the kernel.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Cc: <stable@vger.kernel.org> # 5.7
Fixes: 084ea4d611 ("s390/mm: add (non)secure page access exceptions handlers")
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Niklas Schnelle
afdf9550e5 s390/pci: fix leak of DMA tables on hard unplug
commit f606b3ef47 ("s390/pci: adapt events for zbus") removed the
zpci_disable_device() call for a zPCI event with PEC 0x0304 because
the device is already deconfigured by the platform.
This however skips the Linux side of the disable in particular it leads
to leaking the DMA tables and bitmaps because zpci_dma_exit_device() is
never called on the device.

If the device transitions to the Reserved state we call zpci_zdev_put()
but zpci_release_device() will not call zpci_disable_device() because
the state of the zPCI function is already ZPCI_FN_STATE_STANDBY.

If the device is put into the Standby state, zpci_disable_device() is
not called and the device is assumed to have been put in Standby through
platform action.
At this point the device may be removed by a subsequent event with PEC
0x0308 or 0x0306 which calls zpci_zdev_put() with the same problem
as above or the device may be configured again in which case
zpci_disable_device() is also not called.

Fix this by calling zpci_disable_device() explicitly for PEC 0x0304 as
before. To make it more clear that zpci_disable_device() may be called,
even if the lower level device has already been disabled by the
platform, add a comment to zpci_disable_device().

Cc: <stable@vger.kernel.org> # 5.8
Fixes: f606b3ef47 ("s390/pci: adapt events for zbus")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Ilya Leoshkevich
fcb2b70cdb s390/init: add missing __init annotations
Add __init to reserve_memory_end, reserve_oldmem and remove_oldmem.
Sometimes these functions are not inlined, and then the build
complains about section mismatch.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Peter Zijlstra
ca589ea8d1 s390/idle: fix suspicious RCU usage
After commit eb1f00237a ("lockdep,trace: Expose tracepoints") the
lock tracepoints are visible to lockdep and RCU-lockdep is finding a
bunch more RCU violations that were previously hidden.

Switch the idle->seqcount over to using raw_write_*() to avoid the
lockdep annotation and thus the lock tracepoints.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Masami Hiramatsu
26a24a6b43 s390: kprobes: Use generic kretprobe trampoline handler
Use the generic kretprobe trampoline handler. Don't use
framepointer verification.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/159870612453.1229682.15950927742606892302.stgit@devnote2
2020-09-08 11:52:34 +02:00
Heiko Carstens
5c60ed283e s390: update defconfigs
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-02 13:17:05 +02:00
Eric Farman
114b9df419 s390: fix GENERIC_LOCKBREAK dependency typo in Kconfig
Commit fa68645305 ("sched/rt, s390: Use CONFIG_PREEMPTION")
changed a bunch of uses of CONFIG_PREEMPT to _PREEMPTION.
Except in the Kconfig it used two T's. That's the only place
in the system where that spelling exists, so let's fix that.

Fixes: fa68645305 ("sched/rt, s390: Use CONFIG_PREEMPTION")
Cc: <stable@vger.kernel.org> # 5.6
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-02 13:17:05 +02:00
Kees Cook
c604abc3f6 vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG
The .comment section doesn't belong in STABS_DEBUG. Split it out into a
new macro named ELF_DETAILS. This will gain other non-debug sections
that need to be accounted for when linking with --orphan-handling=warn.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20200821194310.3089815-5-keescook@chromium.org
2020-09-01 09:50:35 +02:00
Linus Torvalds
b69bea8a65 A set of fixes for lockdep, tracing and RCU:
- Prevent recursion by using raw_cpu_* operations
 
   - Fixup the interrupt state in the cpu idle code to be consistent
 
   - Push rcu_idle_enter/exit() invocations deeper into the idle path so
     that the lock operations are inside the RCU watching sections
 
   - Move trace_cpu_idle() into generic code so it's called before RCU goes
     idle.
 
   - Handle raw_local_irq* vs. local_irq* operations correctly
 
   - Move the tracepoints out from under the lockdep recursion handling
     which turned out to be fragile and inconsistent.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl9L5qETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoV/NEADG+h02tj2I4gP7IQ3nVodEzS1+odPI
 orabY5ggH0kn4YIhPB4UtOd5zKZjr3FJs9wEhyhQpV6ZhvFfgaIKiYqfg+Q81aMO
 /BXrfh6jBD2Hu7gaPBnVdkKeh1ehl+w0PhTeJhPBHEEvbGeLUYWwyPNlaKz//VQl
 XCWl7e7o/Uw2UyJ469SCx3z+M2DMNqwdMys/zcqvTLiBdLNCwp4TW5ACzEA0rfHh
 Pepu3eIKnMURyt82QanrOATvT2io9pOOaUh59zeKi2WM8ikwKd/Eho2kXYng6GvM
 GzX4Kn13MsNobZXf9BhqEGICdRkaJqLsXlmBNmbJdSTCn5W2lLZqu2wCEp5VZHCc
 XwMbey8ek+BRskJMqAV4oq2GA8Om9KEYWOOdixyOG0UJCiW5qDowuDYBXTLV7FWj
 XhzLGuHpUF9eKLKokJ7ideLaDcpzwYjHr58pFLQrqPwmjVKWguLeYMg5BhhTiEuV
 wNfiLIGdMNsCpYKhnce3o9paV8+hy1ZveWhNy+/4HaDLoEwI2T62i8R7xxbrcWMg
 sgdAiQG+kVLwSJ13bN+Cz79uLYTIbqGaZHtOXmeIT3jSxBjx5RlXfzocwTHSYrNk
 GuLYHd7+QaemN49Rrf4bPR16Db7ifL32QkUtLBTBLcnos9jM+fcl+BWyqYRxhgDv
 xzDS+vfK8DvRiA==
 =Hgt6
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "A set of fixes for lockdep, tracing and RCU:

   - Prevent recursion by using raw_cpu_* operations

   - Fixup the interrupt state in the cpu idle code to be consistent

   - Push rcu_idle_enter/exit() invocations deeper into the idle path so
     that the lock operations are inside the RCU watching sections

   - Move trace_cpu_idle() into generic code so it's called before RCU
     goes idle.

   - Handle raw_local_irq* vs. local_irq* operations correctly

   - Move the tracepoints out from under the lockdep recursion handling
     which turned out to be fragile and inconsistent"

* tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep,trace: Expose tracepoints
  lockdep: Only trace IRQ edges
  mips: Implement arch_irqs_disabled()
  arm64: Implement arch_irqs_disabled()
  nds32: Implement arch_irqs_disabled()
  locking/lockdep: Cleanup
  x86/entry: Remove unused THUNKs
  cpuidle: Move trace_cpu_idle() into generic code
  cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic
  sched,idle,rcu: Push rcu_idle deeper into the idle path
  cpuidle: Fixup IRQ state
  lockdep: Use raw_cpu_*() for per-cpu variables
2020-08-30 11:43:50 -07:00
Vasily Gorbik
bffc2f7aa9 s390/vmem: fix vmem_add_range for 4-level paging
The kernel currently crashes if 4-level paging is used. Add missing
p4d_populate for just allocated pud entry.

Fixes: 3e0d3e408e ("s390/vmem: consolidate vmem_add_range() and vmem_remove_range()")
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:07:05 +02:00
Sven Schnelle
1196f12a2c s390: don't trace preemption in percpu macros
Since commit a21ee6055c ("lockdep: Change hardirq{s_enabled,_context}
to per-cpu variables") the lockdep code itself uses percpu variables. This
leads to recursions because the percpu macros are calling preempt_enable()
which might call trace_preempt_on().

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:07:04 +02:00
Peter Zijlstra
9864f5b594 cpuidle: Move trace_cpu_idle() into generic code
Remove trace_cpu_idle() from the arch_cpu_idle() implementations and
put it in the generic code, right before disabling RCU. Gets rid of
more trace_*_rcuidle() users.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200821085348.428433395@infradead.org
2020-08-26 12:41:54 +02:00
Al Viro
6e41c585e3 unify generic instances of csum_partial_copy_nocheck()
quite a few architectures have the same csum_partial_copy_nocheck() -
simply memcpy() the data and then return the csum of the copy.

hexagon, parisc, ia64, s390, um: explicitly spelled out that way.

arc, arm64, csky, h8300, m68k/nommu, microblaze, mips/GENERIC_CSUM, nds32,
nios2, openrisc, riscv, unicore32: end up picking the same thing spelled
out in lib/checksum.h (with varying amounts of perversions along the way).

everybody else (alpha, arm, c6x, m68k/mmu, mips/!GENERIC_CSUM, powerpc,
sh, sparc, x86, xtensa) have non-generic variants.  For all except c6x
the declaration is in their asm/checksum.h.  c6x uses the wrapper
from asm-generic/checksum.h that would normally lead to the lib/checksum.h
instance, but in case of c6x we end up using an asm function from arch/c6x
instead.

Screw that mess - have architectures with private instances define
_HAVE_ARCH_CSUM_AND_COPY in their asm/checksum.h and have the default
one right in net/checksum.h conditional on _HAVE_ARCH_CSUM_AND_COPY
*not* defined.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-20 15:45:14 -04:00
Niklas Schnelle
b97bf44f99 s390/pci: fix PF/VF linking on hot plug
Currently there are four places in which a PCI function is scanned
and made available to drivers:
 1. In pci_scan_root_bus() as part of the initial zbus
    creation.
 2. In zpci_bus_add_devices() when registering
    a device in configured state on a zbus that has already been
    scanned.
 3. When a function is already known to zPCI (in reserved/standby state)
    and configuration is triggered through firmware by PEC 0x301.
 4. When a device is already known to zPCI (in standby/reserved state)
    and configuration is triggered from within Linux using
    enable_slot().

The PF/VF linking step and setting of pdev->is_virtfn introduced with
commit e5794cf1a2 ("s390/pci: create links between PFs and VFs") was
only triggered for the second case, which is where VFs created through
sriov_numvfs usually land. However unlike some other platforms but like
POWER VFs can be individually enabled/disabled through
/sys/bus/pci/slots.

Fix this by doing VF setup as part of pcibios_bus_add_device() which is
called in all of the above cases.

Finally to remove the PF/VF links call the common code
pci_iov_remove_virtfn() function to remove linked VFs.
This takes care of the necessary sysfs cleanup.

Fixes: e5794cf1a2 ("s390/pci: create links between PFs and VFs")
Cc: <stable@vger.kernel.org> # 5.8: 2f0230b2f2: s390/pci: re-introduce zpci_remove_device()
Cc: <stable@vger.kernel.org> # 5.8
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:34 +02:00
Niklas Schnelle
2f0230b2f2 s390/pci: re-introduce zpci_remove_device()
For fixing the PF to VF link removal we need to perform some action on
every removal of a zdev from the common PCI subsystem.
So in preparation re-introduce zpci_remove_device() and use that instead
of directly calling the common code functions. This  was actually still
declared from earlier code but no longer implemented.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:25 +02:00
Niklas Schnelle
3cddb79afc s390/pci: fix zpci_bus_link_virtfn()
We were missing the pci_dev_put() for candidate PFs.  Furhtermore in
discussion with upstream it turns out that somewhat counterintuitively
some common code, in particular the vfio-pci driver, assumes that
pdev->is_virtfn always implies that pdev->physfn is set, i.e. that VFs
are always linked.
While POWER does seem to set pdev->is_virtfn even for unlinked functions
(see comments in arch/powerpc/kernel/eeh.c:eeh_debugfs_break_device())
for now just be safe and only set pdev->is_virtfn on linking.
Also make sure that we only search for parent PFs if the zbus is
multifunction and we thus know the devfn values supplied by firmware
come from the RID.

Fixes: e5794cf1a2 ("s390/pci: create links between PFs and VFs")
Cc: <stable@vger.kernel.org> # 5.8
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:18 +02:00
Heiko Carstens
fd78c59446 s390/ptrace: fix storage key handling
The key member of the runtime instrumentation control block contains
only the access key, not the complete storage key. Therefore the value
must be shifted by four bits. Since existing user space does not
necessarily query and set the access key correctly, just ignore the
user space provided key and use the correct one.
Note: this is only relevant for debugging purposes in case somebody
compiles a kernel with a default storage access key set to a value not
equal to zero.

Fixes: 262832bc5a ("s390/ptrace: add runtime instrumention register get/set")
Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:14 +02:00
Heiko Carstens
9eaba29c79 s390/runtime_instrumentation: fix storage key handling
The key member of the runtime instrumentation control block contains
only the access key, not the complete storage key. Therefore the value
must be shifted by four bits.
Note: this is only relevant for debugging purposes in case somebody
compiles a kernel with a default storage access key set to a value not
equal to zero.

Fixes: e4b8b3f33f ("s390: add support for runtime instrumentation")
Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:10 +02:00
Niklas Schnelle
b76fee1bc5 s390/pci: ignore stale configuration request event
A configuration request event may be stale, that is the event
may reference a zdev which was already configured.
This can happen when a hotplug happens during boot such that
the device is discovered and configured in the initial clp_list_pci(),
then after initialization we enable events and process
the original configuration request which additionally still contains
the old disabled function handle leading to a failure during device
enablement and subsequent I/O lockout.

Fix this by restoring the check that the device to be configured is in
standby which was removed in commit f606b3ef47 ("s390/pci: adapt events
for zbus").

This check does not need serialization as we only enable the events after
zPCI has fully initialized, which includes the initial clp_list_pci(),
rescan only does updates and events are serialized with respect to each
other.

Fixes: f606b3ef47 ("s390/pci: adapt events for zbus")
Cc: <stable@vger.kernel.org> # 5.8
Reported-by: Shalini Chellathurai Saroja <shalini@linux.ibm.com>
Tested-by: Shalini Chellathurai Saroja <shalini@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:05 +02:00
Xiaoming Ni
88db0aa242 all arch: remove system call sys_sysctl
Since commit 61a47c1ad3 ("sysctl: Remove the sysctl system call"),
sys_sysctl is actually unavailable: any input can only return an error.

We have been warning about people using the sysctl system call for years
and believe there are no more users.  Even if there are users of this
interface if they have not complained or fixed their code by now they
probably are not going to, so there is no point in warning them any
longer.

So completely remove sys_sysctl on all architectures.

[nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu]
 Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.com

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will@kernel.org>		[arm/arm64]
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bin Meng <bin.meng@windriver.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: chenzefeng <chenzefeng2@huawei.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Diego Elio Pettenò <flameeyes@flameeyes.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kars de Jong <jongk@linux-m68k.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Paul Burton <paulburton@kernel.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Sven Schnelle <svens@stackframe.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com>
Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-14 19:56:56 -07:00
Linus Torvalds
990f227371 - Allow s390 debug feature to handle finally more than 256 CPU numbers, instead
of truncating the most significant bits.
 
 - Improve THP splitting required by qemu processes by making use of
   walk_page_vma() instead of calling follow_page() for every single page
   within each vma.
 
 - Add missing ZCRYPT dependency to VFIO_AP to fix potential compile problems.
 
 - Remove not required select CLOCKSOURCE_VALIDATE_LAST_CYCLE again.
 
 - Set node distance to LOCAL_DISTANCE instead of 0, since e.g. libnuma
   translates a node distance of 0 to "no NUMA support available".
 
 - Couple of other minor fixes and improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAl81bRwACgkQIg7DeRsp
 bsIT3g/7BSIfbI852VTn6hWw+LfdoBLVqXDFcg9xQhUqccbPcG8LG/BYlmuvIn1Z
 X9XujP76xAX7LKnt9x0Z9znwpgGIj0oURmcBxzGBTk+yLBf7YXCOLu9w4MV1AD57
 WRoGG7pPFZy5rMDM29frqBNQBWc7g+/wGoPVqiffzqgj2DcS4Ka8TSTTboMIH7th
 mLK5xaU504FdDGwTgNv6Q/Pt3+YBZ6P/1cIuifzYnfjJ2CsjSBLjS9IFP6/w/c1Q
 VQ2TfajuRbAgkkY01o15tKrlqwPzzWYhnh9ix1LkL0WT6VNql+fcwN665HdsSDFu
 Ctu6Sk3BZmhN1GONK4gIx5didRxBi7neAfUMenSasPT8uMGJcR4EMoNUr70sXNA/
 B1naRayfCH0dE3SVZflAeFJRSh1LnikY14uj65Gg/Wtla5N+90zuHEGDttIBNzrr
 FDzc+39GWN1wJEl7FzSIm+YRC88C/So/BNUQhmlVYNY0sbqyAszUGBOAa+5VS2WQ
 tkWzWPjJ6QA3e3t/dpnc0dvlCKLTJG3o7bdtilb71QNybGiAniP8+79hCNa5nPT3
 iLRhmLX27cQF54IUw1whVA5mkfzHSzwAlB+/laPi8MWFQXTRLTD1XzhmjrIyR57Y
 Q6wM+lxC4qreDz7yGX+eQk7OEs8j8IKTF14HZxkp/DEL09Vbmbs=
 =E0hf
 -----END PGP SIGNATURE-----

Merge tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:

 - Allow s390 debug feature to handle finally more than 256 CPU numbers,
   instead of truncating the most significant bits.

 - Improve THP splitting required by qemu processes by making use of
   walk_page_vma() instead of calling follow_page() for every single
   page within each vma.

 - Add missing ZCRYPT dependency to VFIO_AP to fix potential compile
   problems.

 - Remove not required select CLOCKSOURCE_VALIDATE_LAST_CYCLE again.

 - Set node distance to LOCAL_DISTANCE instead of 0, since e.g. libnuma
   translates a node distance of 0 to "no NUMA support available".

 - Couple of other minor fixes and improvements.

* tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/numa: move code to arch/s390/kernel
  s390/time: remove select CLOCKSOURCE_VALIDATE_LAST_CYCLE again
  s390/debug: debug feature version 3
  s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP
  s390/numa: set node distance to LOCAL_DISTANCE
  s390/pkey: remove redundant variable initialization
  s390/test_unwind: fix possible memleak in test_unwind()
  s390/gmap: improve THP splitting
  s390/atomic: circumvent gcc 10 build regression
2020-08-13 12:38:32 -07:00
Peter Xu
64019a2e46 mm/gup: remove task_struct pointer for all gup code
After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more.  Remove that parameter in the whole gup
stack.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:04 -07:00
Peter Xu
35e45f3e5a mm/s390: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-19-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:03 -07:00
Peter Xu
bce617edec mm: do page fault accounting in handle_mm_fault
Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b98270 ("mm: allow
VM_FAULT_RETRY for multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  Same to the
    perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g.  errornous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series will touch merely all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handlings, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault().  This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:02 -07:00
Christoph Hellwig
428e2976a5 uaccess: remove segment_eq
segment_eq is only used to implement uaccess_kernel.  Just open code
uaccess_kernel in the arch uaccess headers and remove one layer of
indirection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:58 -07:00
Alexander Gordeev
b450eeb0c9 s390/numa: move code to arch/s390/kernel
Move all code from arch/s390/numa/ to arch/s390/kernel/
since numa.c is the only source file and all others were
deleted with the fake NUMA support removal.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:55 +02:00
Heiko Carstens
12bbf0962a s390/time: remove select CLOCKSOURCE_VALIDATE_LAST_CYCLE again
Sven Schnelle reported that setting CLOCKSOURCE_VALIDATE_LAST_CYCLE
doesn't make sense: even if our tod clock overflows delta calculation
(now - last) with unsigned 64 bit values will still be correct.

Therefore revert commit 555701a714 ("s390/time: select
CLOCKSOURCE_VALIDATE_LAST_CYCLE").

Fixes: 555701a714 ("s390/time: select CLOCKSOURCE_VALIDATE_LAST_CYCLE")
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:48 +02:00
Mikhail Zaslonko
0990d836ce s390/debug: debug feature version 3
Change __debug_entry structure in the following way:
 - remove redundant union
 - Field containing cpuid is expanded to 16 bits. 8-bit width was not
   enough since we already support up to 512 cpus.
 - Field containing the timestamp is expanded to 60 bits. The timestamp
   itself is now stored in the absolute Unix time format in microseconds
   taking the Epoch Index into acount.
Adjust default header for debug entries by setting minimum width for cpuid
to 4 digits.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:43 +02:00
Krzysztof Kozlowski
929a343b85 s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP
The VFIO_AP uses ap_driver_register() (and deregister) functions
implemented in ap_bus.c (compiled into ap.o).  However the ap.o will be
built only if CONFIG_ZCRYPT is selected.

This was not visible before commit e93a1695d7 ("iommu: Enable compile
testing for some of drivers") because the CONFIG_VFIO_AP depends on
CONFIG_S390_AP_IOMMU which depends on the missing CONFIG_ZCRYPT.  After
adding COMPILE_TEST, it is possible to select a configuration with
VFIO_AP and S390_AP_IOMMU but without the ZCRYPT.

Add proper dependency to the VFIO_AP to fix build errors:

ERROR: modpost: "ap_driver_register" [drivers/s390/crypto/vfio_ap.ko] undefined!
ERROR: modpost: "ap_driver_unregister" [drivers/s390/crypto/vfio_ap.ko] undefined!

Reported-by: kernel test robot <lkp@intel.com>
Fixes: e93a1695d7 ("iommu: Enable compile testing for some of drivers")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:39 +02:00
Alexander Gordeev
535e4fc623 s390/numa: set node distance to LOCAL_DISTANCE
The node distance is hardcoded to 0, which causes a trouble
for some user-level applications. In particular, "libnuma"
expects the distance of a node to itself as LOCAL_DISTANCE.
This update removes the offending node distance override.

Cc: <stable@vger.kernel.org> # 4.4
Fixes: 3a368f742d ("s390/numa: add core infrastructure")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:35 +02:00
Wang Hai
75d3e7f476 s390/test_unwind: fix possible memleak in test_unwind()
test_unwind() misses to call kfree(bt) in an error path.
Add the missed function call to fix it.

Fixes: 0610154650 ("s390/test_unwind: print verbose unwinding results")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:16 +02:00
Gerald Schaefer
ba925fa350 s390/gmap: improve THP splitting
During s390_enable_sie(), we need to take care of splitting all qemu user
process THP mappings. This is currently done with follow_page(FOLL_SPLIT),
by simply iterating over all vma ranges, with PAGE_SIZE increment.

This logic is sub-optimal and can result in a lot of unnecessary overhead,
especially when using qemu and ASAN with large shadow map. Ilya reported
significant system slow-down with one CPU busy for a long time and overall
unresponsiveness.

Fix this by using walk_page_vma() and directly calling split_huge_pmd()
only for present pmds, which greatly reduces overhead.

Cc: <stable@vger.kernel.org> # v5.4+
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:13 +02:00
Vasily Gorbik
f0cbd3b83e s390/atomic: circumvent gcc 10 build regression
Circumvent the following gcc 10 allyesconfig build regression:

  CC      drivers/leds/trigger/ledtrig-cpu.o
In file included from ./arch/s390/include/asm/bitops.h:39,
                 from ./include/linux/bitops.h:29,
                 from ./include/linux/kernel.h:12,
                 from drivers/leds/trigger/ledtrig-cpu.c:18:
./arch/s390/include/asm/atomic_ops.h: In function 'ledtrig_cpu':
./arch/s390/include/asm/atomic_ops.h:46:2: warning: 'asm' operand 1 probably does not match constraints
   46 |  asm volatile(       \
      |  ^~~
./arch/s390/include/asm/atomic_ops.h:53:2: note: in expansion of macro '__ATOMIC_CONST_OP'
   53 |  __ATOMIC_CONST_OP(op_name, op_type, op_string, "\n")  \
      |  ^~~~~~~~~~~~~~~~~
./arch/s390/include/asm/atomic_ops.h:56:1: note: in expansion of macro '__ATOMIC_CONST_OPS'
   56 | __ATOMIC_CONST_OPS(__atomic_add_const, int, "asi")
      | ^~~~~~~~~~~~~~~~~~
./arch/s390/include/asm/atomic_ops.h:46:2: error: impossible constraint in 'asm'
   46 |  asm volatile(       \
      |  ^~~
./arch/s390/include/asm/atomic_ops.h:53:2: note: in expansion of macro '__ATOMIC_CONST_OP'
   53 |  __ATOMIC_CONST_OP(op_name, op_type, op_string, "\n")  \
      |  ^~~~~~~~~~~~~~~~~
./arch/s390/include/asm/atomic_ops.h:56:1: note: in expansion of macro '__ATOMIC_CONST_OPS'
   56 | __ATOMIC_CONST_OPS(__atomic_add_const, int, "asi")
      | ^~~~~~~~~~~~~~~~~~
scripts/Makefile.build:280: recipe for target 'drivers/leds/trigger/ledtrig-cpu.o' failed

By swapping conditions as proposed here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549318.html

Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:08 +02:00