Commit Graph

165229 Commits

Author SHA1 Message Date
Rafael J. Wysocki
6221403952 Merge branch 'pm-cpuidle'
* pm-cpuidle:
  cpuidle: Pass exit latency limit to cpuidle_use_deepest_state()
  cpuidle: Allow idle injection to apply exit latency limit
  cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks
  cpuidle: teo: Avoid code duplication in conditionals
  cpuidle: teo: Avoid using "early hits" incorrectly
  cpuidle: teo: Exclude cpuidle overhead from computations
  cpuidle: Use nanoseconds as the unit of time
  cpuidle: Consolidate disabled state checks
  ACPI: processor_idle: Skip dummy wait if kernel is in guest
  cpuidle: Do not unset the driver if it is there already
  cpuidle: teo: Fix "early hits" handling for disabled idle states
  cpuidle: teo: Consider hits and misses metrics of disabled states
  cpuidle: teo: Rename local variable in teo_select()
  cpuidle: teo: Ignore disabled idle states that are too deep
2019-11-26 10:26:26 +01:00
Linus Torvalds
a86f69d334 spi: Updates for v5.5
Lots of stuff going on in the core for SPI this time around, the two big
 changes both being around time in different forms:
 
  - A rework of delay times from Alexandru Ardelean which makes the ways
    in which they are specified more consistent between drivers so that
    what's available to clients is less dependent on the hardware
    implementation.
  - Support for PTP timestamping of transfers from Vladimir Oltean,
    useful for use with precision clocks with SPI control interfaces.
  - Big cleanups for the Atmel, PXA2xx and Zynq QSPI drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl3b1ogTHGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0NBpB/0Rh81T/xG0moEM6qyCqjriD8rQKfA7
 l7r7pNPCXCTPnVe2mfSLrlu6ETi8tG1iVYGPSEQmo5JnlgpoOvr0SVvQMej0ceB+
 jn3H31JgfvryMr99AFEsRN8N6GkY1dLlTV/5edKRTkut++tqlYD/G99bn3K7IlxW
 oN8le+fCDuF78mWGDEdClwuZ5ZDiZtWbRu01Q6ooVVZz3XLlLZPtgKH2qlwoVJ9r
 cuwnE2e3wce5Hq/JwAFnG2pUYIjKNd00/VxK640pU1dJ/LeqekH0Xe4ZoCr4X0zE
 8c+srra1yefAiDIPNG0v+exZePuu53tbEJylVVefOlrGCTa+nbJBioH3
 =+5en
 -----END PGP SIGNATURE-----

Merge tag 'spi-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi updates from Mark Brown:
 "Lots of stuff going on in the core for SPI this time around, the two
  big changes both being around time in different forms:

   - A rework of delay times from Alexandru Ardelean which makes the
     ways in which they are specified more consistent between drivers so
     that what's available to clients is less dependent on the hardware
     implementation.

   - Support for PTP timestamping of transfers from Vladimir Oltean,
     useful for use with precision clocks with SPI control interfaces.

   - Big cleanups for the Atmel, PXA2xx and Zynq QSPI drivers"

* tag 'spi-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (119 commits)
  dt-bindings: spi: Convert stm32 QSPI bindings to json-schema
  spi: pic32: Retire dma_request_slave_channel_compat()
  spi: Fix Kconfig indentation
  spi: mediatek: add SPI_CS_HIGH support
  spi: st-ssc4: add missed pm_runtime_disable
  spi: tegra20-slink: add missed clk_unprepare
  spi: tegra20-slink: Use dma_request_chan() directly for channel request
  spi: tegra114: Use dma_request_chan() directly for channel request
  spi: s3c64xx: Use dma_request_chan() directly for channel request
  spi: qup: Use dma_request_chan() directly for channel request
  spi: pl022: Use dma_request_chan() directly for channel request
  spi: imx: Use dma_request_chan() directly for channel request
  spi: fsl-lpspi: Use dma_request_chan() directly for channel request
  spi: atmel: Use dma_request_chan() directly for channel request
  spi: at91-usart: Use dma_request_chan() directly for channel request
  spi: fsl-cpm: Correct the free:ing
  spi: Fix regression to return zero on success instead of positive value
  spi: pxa2xx: Add missed security checks
  spi: nxp-fspi: Use devm API to fix missed unregistration of controller
  spi: omap2-mcspi: Remove redundant checks
  ...
2019-11-25 21:32:37 -08:00
Linus Torvalds
386403a115 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:
 "Another merge window, another pull full of stuff:

   1) Support alternative names for network devices, from Jiri Pirko.

   2) Introduce per-netns netdev notifiers, also from Jiri Pirko.

   3) Support MSG_PEEK in vsock/virtio, from Matias Ezequiel Vara
      Larsen.

   4) Allow compiling out the TLS TOE code, from Jakub Kicinski.

   5) Add several new tracepoints to the kTLS code, also from Jakub.

   6) Support set channels ethtool callback in ena driver, from Sameeh
      Jubran.

   7) New SCTP events SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED,
      SCTP_ADDR_MADE_PRIM, and SCTP_SEND_FAILED_EVENT. From Xin Long.

   8) Add XDP support to mvneta driver, from Lorenzo Bianconi.

   9) Lots of netfilter hw offload fixes, cleanups and enhancements,
      from Pablo Neira Ayuso.

  10) PTP support for aquantia chips, from Egor Pomozov.

  11) Add UDP segmentation offload support to igb, ixgbe, and i40e. From
      Josh Hunt.

  12) Add smart nagle to tipc, from Jon Maloy.

  13) Support L2 field rewrite by TC offloads in bnxt_en, from Venkat
      Duvvuru.

  14) Add a flow mask cache to OVS, from Tonghao Zhang.

  15) Add XDP support to ice driver, from Maciej Fijalkowski.

  16) Add AF_XDP support to ice driver, from Krzysztof Kazimierczak.

  17) Support UDP GSO offload in atlantic driver, from Igor Russkikh.

  18) Support it in stmmac driver too, from Jose Abreu.

  19) Support TIPC encryption and auth, from Tuong Lien.

  20) Introduce BPF trampolines, from Alexei Starovoitov.

  21) Make page_pool API more numa friendly, from Saeed Mahameed.

  22) Introduce route hints to ipv4 and ipv6, from Paolo Abeni.

  23) Add UDP segmentation offload to cxgb4, Rahul Lakkireddy"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1857 commits)
  libbpf: Fix usage of u32 in userspace code
  mm: Implement no-MMU variant of vmalloc_user_node_flags
  slip: Fix use-after-free Read in slip_open
  net: dsa: sja1105: fix sja1105_parse_rgmii_delays()
  macvlan: schedule bc_work even if error
  enetc: add support Credit Based Shaper(CBS) for hardware offload
  net: phy: add helpers phy_(un)lock_mdio_bus
  mdio_bus: don't use managed reset-controller
  ax88179_178a: add ethtool_op_get_ts_info()
  mlxsw: spectrum_router: Fix use of uninitialized adjacency index
  mlxsw: spectrum_router: After underlay moves, demote conflicting tunnels
  bpf: Simplify __bpf_arch_text_poke poke type handling
  bpf: Introduce BPF_TRACE_x helper for the tracing tests
  bpf: Add bpf_jit_blinding_enabled for !CONFIG_BPF_JIT
  bpf, testing: Add various tail call test cases
  bpf, x86: Emit patchable direct jump as tail call
  bpf: Constant map key tracking for prog array pokes
  bpf: Add poke dependency tracking for prog array maps
  bpf: Add initial poke descriptor table for jit images
  bpf: Move owner type, jited info into array auxiliary data
  ...
2019-11-25 20:02:57 -08:00
Linus Torvalds
642356cb5f Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Add library interfaces of certain crypto algorithms for WireGuard
   - Remove the obsolete ablkcipher and blkcipher interfaces
   - Move add_early_randomness() out of rng_mutex

  Algorithms:
   - Add blake2b shash algorithm
   - Add blake2s shash algorithm
   - Add curve25519 kpp algorithm
   - Implement 4 way interleave in arm64/gcm-ce
   - Implement ciphertext stealing in powerpc/spe-xts
   - Add Eric Biggers's scalar accelerated ChaCha code for ARM
   - Add accelerated 32r2 code from Zinc for MIPS
   - Add OpenSSL/CRYPTOGRAMS poly1305 implementation for ARM and MIPS

  Drivers:
   - Fix entropy reading failures in ks-sa
   - Add support for sam9x60 in atmel
   - Add crypto accelerator for amlogic GXL
   - Add sun8i-ce Crypto Engine
   - Add sun8i-ss cryptographic offloader
   - Add a host of algorithms to inside-secure
   - Add NPCM RNG driver
   - add HiSilicon HPRE accelerator
   - Add HiSilicon TRNG driver"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (285 commits)
  crypto: vmx - Avoid weird build failures
  crypto: lib/chacha20poly1305 - use chacha20_crypt()
  crypto: x86/chacha - only unregister algorithms if registered
  crypto: chacha_generic - remove unnecessary setkey() functions
  crypto: amlogic - enable working on big endian kernel
  crypto: sun8i-ce - enable working on big endian
  crypto: mips/chacha - select CRYPTO_SKCIPHER, not CRYPTO_BLKCIPHER
  hwrng: ks-sa - Enable COMPILE_TEST
  crypto: essiv - remove redundant null pointer check before kfree
  crypto: atmel-aes - Change data type for "lastc" buffer
  crypto: atmel-tdes - Set the IV after {en,de}crypt
  crypto: sun4i-ss - fix big endian issues
  crypto: sun4i-ss - hide the Invalid keylen message
  crypto: sun4i-ss - use crypto_ahash_digestsize
  crypto: sun4i-ss - remove dependency on not 64BIT
  crypto: sun4i-ss - Fix 64-bit size_t warnings on sun4i-ss-hash.c
  MAINTAINERS: Add maintainer for HiSilicon SEC V2 driver
  crypto: hisilicon - add DebugFS for HiSilicon SEC
  Documentation: add DebugFS doc for HiSilicon SEC
  crypto: hisilicon - add SRIOV for HiSilicon SEC
  ...
2019-11-25 19:49:58 -08:00
Linus Torvalds
f838767555 Livepatching changes for 5.5
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl3bz1gACgkQUqAMR0iA
 lPI5fw//db5dOqAvBu/fz4k38Mc30okgCjtRh0+vhFXCCUXauQv2IhI19J2IiPpy
 4t/CjaUk2QSB06NDNUxt7XsuR0yAF4E0nJHUmkDKkN8UFsi7jAjxJ/92zH3x/LE0
 YWtVoWjGduO+QfLVlP22VVYh1pX5kOxXG2WTEiJtnkWYdkqtkEy7Cw2Rlzzrrym6
 6kIVi3nEPtn/hOnlF/Ii5SJWh+jJrSf+XwXiuIiBupT49Ujoa4KscmhkiHnAccXb
 ICJpsxBIdZLxHLe/c0YO3b8r4Hvms124vlIC19Z0l9VEJ++sphDOWRrL3Zq9tFw8
 FwIKq8Ex9UFfldOnpD5q/PRhM9Xgfsw8UYYZZpXQeW6z7qzv1iVpM6BQlt3dYQaL
 pb21UXIfzaTdZIsUgnetypbqWIFNZxovrhORpjqxo6WV4lSaVx4MPE2/Le/G8xPR
 DT+a6yyzTyM0XbZ0MCVDfuZ+ZRaA1zfKEcsYEu883b7yK4z+TZbT61vnEKqq8+40
 EgOZnNjBENZLRQY0RofQsO5zGwcaanVOtBOmYDXtP/fup8/1SRZ25zmmIVhvChJG
 iQwCDw6IMqnae/FsMb+kBTDCRJXpN028iYGY7UAoiCuvzc0qm0gJXsGdZLqMvjEh
 nEdKKN2ze+03s6I9AcISKdnbUVphhb/xeDKRBkMgcooWLrfWj5E=
 =i37E
 -----END PGP SIGNATURE-----

Merge tag 'livepatching-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching updates from Petr Mladek:

 - New API to track system state changes done be livepatch callbacks. It
   helps to maintain compatibility between livepatches.

 - Update Kconfig help text. ORC is another reliable unwinder.

 - Disable generic selftest timeout. Livepatch selftests have their own
   per-operation fine-grained timeouts.

* tag 'livepatching-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  x86/stacktrace: update kconfig help text for reliable unwinders
  livepatch: Selftests of the API for tracking system state changes
  livepatch: Documentation of the new API for tracking system state changes
  livepatch: Allow to distinguish different version of system state changes
  livepatch: Basic API to track system state changes
  livepatch: Keep replaced patches until post_patch callback is called
  selftests/livepatch: Disable the timeout
2019-11-25 19:43:48 -08:00
Linus Torvalds
436b2a8039 Printk changes for 5.5
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl3bpjoACgkQUqAMR0iA
 lPJJDA/+IJT4YCRp2TwV2jvIs0QzvXZrzEsxgCLibLE85mYTJgoQBD3W1bH2eyjp
 T/9U0Zh5PGr/84cHd4qiMxzo+5Olz930weG59NcO4RJBSr671aRYs5tJqwaQAZDR
 wlwaob5S28vUmjPxKulvxv6V3FdI79ZE9xrCOCSTQvz4iCLsGOu+Dn/qtF64pImX
 M/EXzPMBrByiQ8RTM4Ege8JoBqiCZPDG9GR3KPXIXQwEeQgIoeYxwRYakxSmSzz8
 W8NduFCbWavg/yHhghHikMiyOZeQzAt+V9k9WjOBTle3TGJegRhvjgI7508q3tXe
 jQTMGATBOPkIgFaZz7eEn/iBa3jZUIIOzDY93RYBmd26aBvwKLOma/Vkg5oGYl0u
 ZK+CMe+/xXl7brQxQ6JNsQhbSTjT+746LvLJlCvPbbPK9R0HeKNhsdKpGY3ugnmz
 VAnOFIAvWUHO7qx+J+EnOo5iiPpcwXZj4AjrwVrs/x5zVhzwQ+4DSU6rbNn0O1Ak
 ELrBqCQkQzh5kqK93jgMHeWQ9EOUp1Lj6PJhTeVnOx2x8tCOi6iTQFFrfdUPlZ6K
 2DajgrFhti4LvwVsohZlzZuKRm5EuwReLRSOn7PU5qoSm5rcouqMkdlYG/viwyhf
 mTVzEfrfemrIQOqWmzPrWEXlMj2mq8oJm4JkC+jJ/+HsfK4UU8I=
 =QCEy
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk

Pull printk updates from Petr Mladek:

 - Allow to print symbolic error names via new %pe modifier.

 - Use pr_warn() instead of the remaining pr_warning() calls. Fix
   formatting of the related lines.

 - Add VSPRINTF entry to MAINTAINERS.

* tag 'printk-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (32 commits)
  checkpatch: don't warn about new vsprintf pointer extension '%pe'
  MAINTAINERS: Add VSPRINTF
  tools lib api: Renaming pr_warning to pr_warn
  ASoC: samsung: Use pr_warn instead of pr_warning
  lib: cpu_rmap: Use pr_warn instead of pr_warning
  trace: Use pr_warn instead of pr_warning
  dma-debug: Use pr_warn instead of pr_warning
  vgacon: Use pr_warn instead of pr_warning
  fs: afs: Use pr_warn instead of pr_warning
  sh/intc: Use pr_warn instead of pr_warning
  scsi: Use pr_warn instead of pr_warning
  platform/x86: intel_oaktrail: Use pr_warn instead of pr_warning
  platform/x86: asus-laptop: Use pr_warn instead of pr_warning
  platform/x86: eeepc-laptop: Use pr_warn instead of pr_warning
  oprofile: Use pr_warn instead of pr_warning
  of: Use pr_warn instead of pr_warning
  macintosh: Use pr_warn instead of pr_warning
  idsn: Use pr_warn instead of pr_warning
  ide: Use pr_warn instead of pr_warning
  crypto: n2: Use pr_warn instead of pr_warning
  ...
2019-11-25 19:40:40 -08:00
Linus Torvalds
752272f16d ARM:
- Data abort report and injection
 - Steal time support
 - GICv4 performance improvements
 - vgic ITS emulation fixes
 - Simplify FWB handling
 - Enable halt polling counters
 - Make the emulated timer PREEMPT_RT compliant
 
 s390:
 - Small fixes and cleanups
 - selftest improvements
 - yield improvements
 
 PPC:
 - Add capability to tell userspace whether we can single-step the guest.
 - Improve the allocation of XIVE virtual processor IDs
 - Rewrite interrupt synthesis code to deliver interrupts in virtual
   mode when appropriate.
 - Minor cleanups and improvements.
 
 x86:
 - XSAVES support for AMD
 - more accurate report of nested guest TSC to the nested hypervisor
 - retpoline optimizations
 - support for nested 5-level page tables
 - PMU virtualization optimizations, and improved support for nested
   PMU virtualization
 - correct latching of INITs for nested virtualization
 - IOAPIC optimization
 - TSX_CTRL virtualization for more TAA happiness
 - improved allocation and flushing of SEV ASIDs
 - many bugfixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJd27PMAAoJEL/70l94x66DspsH+gPc6YWtKJFJH58Zj8NrNh6y
 t0FwDFcvUa51+m4jaY4L5Y8+zqu1dZFnPPhFGqNWpxrjCEvE/glQJv3BiUX06Seh
 aYUHNymGoYCTJOHaaGhV+NlgQaDuZOCOkIsOLAPehyFd1KojwB+FRC0xmO6aROPw
 9yQgYrKuK1UUn5HwxBNrMS4+Xv+2iKv/9sTnq1G4W2qX2NZQg84LVPg1zIdkCh3D
 3GOvoCBEk3ivQqjmdE7rP/InPr0XvW0b6TFhchIk8J6jEIQFHsmOUefiTvTxsIHV
 OKAZwvyeYPrYHA/aDZpaBmY2aR0ydfKDUQcviNIJoF1vOktGs0hvl3VbsmG8QCg=
 =OSI1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - data abort report and injection
   - steal time support
   - GICv4 performance improvements
   - vgic ITS emulation fixes
   - simplify FWB handling
   - enable halt polling counters
   - make the emulated timer PREEMPT_RT compliant

  s390:
   - small fixes and cleanups
   - selftest improvements
   - yield improvements

  PPC:
   - add capability to tell userspace whether we can single-step the
     guest
   - improve the allocation of XIVE virtual processor IDs
   - rewrite interrupt synthesis code to deliver interrupts in virtual
     mode when appropriate.
   - minor cleanups and improvements.

  x86:
   - XSAVES support for AMD
   - more accurate report of nested guest TSC to the nested hypervisor
   - retpoline optimizations
   - support for nested 5-level page tables
   - PMU virtualization optimizations, and improved support for nested
     PMU virtualization
   - correct latching of INITs for nested virtualization
   - IOAPIC optimization
   - TSX_CTRL virtualization for more TAA happiness
   - improved allocation and flushing of SEV ASIDs
   - many bugfixes and cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  kvm: nVMX: Relax guest IA32_FEATURE_CONTROL constraints
  KVM: x86: Grab KVM's srcu lock when setting nested state
  KVM: x86: Open code shared_msr_update() in its only caller
  KVM: Fix jump label out_free_* in kvm_init()
  KVM: x86: Remove a spurious export of a static function
  KVM: x86: create mmu/ subdirectory
  KVM: nVMX: Remove unnecessary TLB flushes on L1<->L2 switches when L1 use apic-access-page
  KVM: x86: remove set but not used variable 'called'
  KVM: nVMX: Do not mark vmcs02->apic_access_page as dirty when unpinning
  KVM: vmx: use MSR_IA32_TSX_CTRL to hard-disable TSX on guest that lack it
  KVM: vmx: implement MSR_IA32_TSX_CTRL disable RTM functionality
  KVM: x86: implement MSR_IA32_TSX_CTRL effect on CPUID
  KVM: x86: do not modify masked bits of shared MSRs
  KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES
  KVM: PPC: Book3S HV: XIVE: Fix potential page leak on error path
  KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one
  KVM: nVMX: Assume TLB entries of L1 and L2 are tagged differently if L0 use EPT
  KVM: x86: Unexport kvm_vcpu_reload_apic_access_page()
  KVM: nVMX: add CR4_LA57 bit to nested CR4_FIXED1
  KVM: nVMX: Use semi-colon instead of comma for exit-handlers initialization
  ...
2019-11-25 18:02:36 -08:00
Linus Torvalds
3f3c8be973 xen: fixes for xen
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCXdtnVQAKCRCAXGG7T9hj
 vg1hAQDqG1DKZvR6BtlvETFMz7ZlrXVkpm6C74Wy4bLiO5KSSAEAneFbrDwFVa0c
 d05Z6wemjlyEd7u3gkVQBKfHkbWBRQQ=
 =aDIL
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.5a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - a small series to remove the build constraint of Xen x86 MCE handling
   to 64-bit only

 - a bunch of minor cleanups

* tag 'for-linus-5.5a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Fix Kconfig indentation
  xen/mcelog: also allow building for 32-bit kernels
  xen/mcelog: add PPIN to record when available
  xen/mcelog: drop __MC_MSR_MCGCAP
  xen/gntdev: Use select for DMA_SHARED_BUFFER
  xen: mm: make xen_mm_init static
  xen: mm: include <xen/xen-ops.h> for missing declarations
2019-11-25 17:45:31 -08:00
Linus Torvalds
2981dcf333 The main MIPS changes for 5.5:
- Atomics-related code sees some rework & cleanup, most notably allowing
   Loongson LL/SC errata workarounds to be more bulletproof & their
   correctness to be checked at build time.
 
 - Command line setup code is simplified somewhat, resolving various
   corner cases.
 
 - MIPS kernels can now be built with kcov code coverage support.
 
 - We can now build with CONFIG_FORTIFY_SOURCE=y.
 
 - Miscellaneous cleanups.
 
 And some platform specific changes:
 
 - We now disable some broken TLB functionality on certain Ingenic
   systems, and JZ4780 systems gain some devicetree nodes to support
   more devices.
 
 - Loongson support sees a number of cleanups, and we gain initial
   support for Loongson 3A R4 systems.
 
 - We gain support for MediaTek MT7688-based GARDENA Smart Gateway
   systems.
 
 - SGI IP27 (Origin 2*) see a number of fixes, cleanups &
   simplifications.
 
 - SGI IP30 (Octane) systems are now supported.
 -----BEGIN PGP SIGNATURE-----
 
 iIwEABYIADQWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCXdwj2RYccGF1bGJ1cnRv
 bkBrZXJuZWwub3JnAAoJED6nn6y1dQDd54QA/2CrWLcWCcWVwN8XwLTh3gWf8/k7
 d19Ttd0bYrsBUnaHAP9s9kc9RFOAhB3p1G0dsMZqI0YX7emLGPgW3ejJ7CXNCg==
 =/Hsa
 -----END PGP SIGNATURE-----

Merge tag 'mips_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS updates from Paul Burton:
 "The main MIPS changes for 5.5:

   - Atomics-related code sees some rework & cleanup, most notably
     allowing Loongson LL/SC errata workarounds to be more bulletproof &
     their correctness to be checked at build time.

   - Command line setup code is simplified somewhat, resolving various
     corner cases.

   - MIPS kernels can now be built with kcov code coverage support.

   - We can now build with CONFIG_FORTIFY_SOURCE=y.

   - Miscellaneous cleanups.

  And some platform specific changes:

   - We now disable some broken TLB functionality on certain Ingenic
     systems, and JZ4780 systems gain some devicetree nodes to support
     more devices.

   - Loongson support sees a number of cleanups, and we gain initial
     support for Loongson 3A R4 systems.

   - We gain support for MediaTek MT7688-based GARDENA Smart Gateway
     systems.

   - SGI IP27 (Origin 2*) see a number of fixes, cleanups &
     simplifications.

   - SGI IP30 (Octane) systems are now supported"

* tag 'mips_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (107 commits)
  MIPS: SGI-IP27: Enable ethernet phy on second Origin 200 module
  MIPS: PCI: Fix fake subdevice ID for IOC3
  MIPS: Ingenic: Disable abandoned HPTLB function.
  MIPS: PCI: remember nasid changed by set interrupt affinity
  MIPS: SGI-IP27: Fix crash, when CPUs are disabled via nr_cpus parameter
  mips: add support for folded p4d page tables
  mips: drop __pXd_offset() macros that duplicate pXd_index() ones
  mips: fix build when "48 bits virtual memory" is enabled
  MIPS: math-emu: Reuse name array in debugfs_fpuemu()
  MIPS: allow building with kcov coverage
  MIPS: Loongson64: Drop setup_pcimap
  MIPS: Loongson2ef: Convert to early_printk_8250
  MIPS: Drop CPU_SUPPORTS_UNCACHED_ACCELERATED
  MIPS: Loongson{2ef, 32, 64} convert to generic fw cmdline
  MIPS: Drop pmon.h
  MIPS: Loongson: Unify LOONGSON3/LOONGSON64 Kconfig usage
  MIPS: Loongson: Rename LOONGSON1 to LOONGSON32
  MIPS: Loongson: Fix return value of loongson_hwmon_init
  MIPS: add support for SGI Octane (IP30)
  MIPS: PCI: make phys_to_dma/dma_to_phys for pci-xtalk-bridge common
  ...
2019-11-25 17:42:56 -08:00
Linus Torvalds
5ef30d7423 m68k updates for v5.5
- Atari Falcon IDE platform driver conversion for module autoload,
   - Defconfig updates (incl. enablement of Amiga ICY I2C),
   - Small fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQQ9qaHoIs/1I4cXmEiKwlD9ZEnxcAUCXdufABUcZ2VlcnRAbGlu
 dXgtbTY4ay5vcmcACgkQisJQ/WRJ8XAhxAEAp1MN6cEMFzE2NmulbSLXSb6ceLnB
 4eRvmVi5HJACqXoBAMexfc0YUl5BB6LMi+7HRetd46QCSJ7TWko0XE9NiBsP
 =CRVE
 -----END PGP SIGNATURE-----

Merge tag 'm68k-for-v5.5-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k updates from Geert Uytterhoeven:

 - Atari Falcon IDE platform driver conversion for module autoload

 - defconfig updates (including enablement of Amiga ICY I2C)

 - small fixes and cleanups

* tag 'm68k-for-v5.5-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k/atari: Convert Falcon IDE drivers to platform drivers
  m68k: defconfig: Enable ICY I2C and LTC2990 on Amiga
  m68k: defconfig: Update defconfigs for v5.4-rc1
  m68k: q40: Fix info-leak in rtc_ioctl
  nubus: Remove cast to void pointer
2019-11-25 17:37:30 -08:00
Linus Torvalds
28fcb77b38 Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:

 - Fully reworked thermal throttling notifications, there should be no
   more spamming of dmesg (Srinivas Pandruvada and Benjamin Berg)

 - More enablement for the Intel-compatible CPUs Zhaoxin (Tony W
   Wang-oc)

 - PPIN support for Icelake (Tony Luck)

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce/therm_throt: Optimize notifications of thermal throttle
  x86/mce: Add Xeon Icelake to list of CPUs that support PPIN
  x86/mce: Lower throttling MCE messages' priority to warning
  x86/mce: Add Zhaoxin LMCE support
  x86/mce: Add Zhaoxin CMCI support
  x86/mce: Add Zhaoxin MCE support
  x86/mce/amd: Make disable_err_thresholding() static
2019-11-25 17:31:39 -08:00
Linus Torvalds
63c2291f83 Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode updates from Borislav Petkov:
 "This converts the late loading method to load the microcode in
  parallel (vs sequentially currently). The patch remained in linux-next
  for the maximum amount of time so that any potential and hard to debug
  fallout be minimized.

  Now cloud folks have their milliseconds back but all the normal people
  should use early loading anyway :-)"

* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/intel: Issue the revision updated message only on the BSP
  x86/microcode: Update late microcode in parallel
  x86/microcode/amd: Fix two -Wunused-but-set-variable warnings
2019-11-25 17:28:35 -08:00
Linus Torvalds
ea1f56fa16 s390 updates for the 5.5 merge window
- Adjust PMU device drivers registration to avoid WARN_ON and few other
   perf improvements.
 
 - Enhance tracing in vfio-ccw.
 
 - Few stack unwinder fixes and improvements, convert get_wchan custom
   stack unwinding to generic api usage.
 
 - Fixes for mm helpers issues uncovered with tests validating architecture
   page table helpers.
 
 - Fix noexec bit handling when hardware doesn't support it.
 
 - Fix memleak and unsigned value compared with zero bugs in crypto
   code. Minor code simplification.
 
 - Fix crash during kdump with kasan enabled kernel.
 
 - Switch bug and alternatives from asm to asm_inline to improve inlining
   decisions.
 
 - Use 'depends on cc-option' for MARCH and TUNE options in Kconfig,
   add z13s and z14 ZR1 to TUNE descriptions.
 
 - Minor head64.S simplification.
 
 - Fix physical to logical CPU map for SMT.
 
 - Several cleanups in qdio code.
 
 - Other minor cleanups and fixes all over the code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl3ahGYACgkQjYWKoQLX
 FBguwAgAig+FNos8zkd7Sr2wg4DPL2IlYERVP40fOLXfGuVUOnMLg8OTO6yDWDpH
 5+cKAQS1wWgyvlfjWRUJ6anXLBAsgKRD1nyFIZTpn/wArGk/duCbnl/VFriDgrST
 8KTQDJpZ9w9nXtQ7lA2QWaw5U2WG8I2T2JuQJCdLXze7RXi0bDVe8e6131NMaJ42
 LLxqOqm8d8XDnd8oDVP04LT5IfhuI2cILoGBP/GyI2fqQk9Ems6M2gxuISq1COmy
 WORDLfwWyCLeF7gWKKjxf8Vo1HYcyoFvdXnxWiHb0TDZesQZJr/LLELTP03fbCW9
 U4jbXncnnPA7kT4tlC95jT5M69yK5w==
 =+FxG
 -----END PGP SIGNATURE-----

Merge tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Adjust PMU device drivers registration to avoid WARN_ON and few other
   perf improvements.

 - Enhance tracing in vfio-ccw.

 - Few stack unwinder fixes and improvements, convert get_wchan custom
   stack unwinding to generic api usage.

 - Fixes for mm helpers issues uncovered with tests validating
   architecture page table helpers.

 - Fix noexec bit handling when hardware doesn't support it.

 - Fix memleak and unsigned value compared with zero bugs in crypto
   code. Minor code simplification.

 - Fix crash during kdump with kasan enabled kernel.

 - Switch bug and alternatives from asm to asm_inline to improve
   inlining decisions.

 - Use 'depends on cc-option' for MARCH and TUNE options in Kconfig, add
   z13s and z14 ZR1 to TUNE descriptions.

 - Minor head64.S simplification.

 - Fix physical to logical CPU map for SMT.

 - Several cleanups in qdio code.

 - Other minor cleanups and fixes all over the code.

* tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits)
  s390/cpumf: Adjust registration of s390 PMU device drivers
  s390/smp: fix physical to logical CPU map for SMT
  s390/early: move access registers setup in C code
  s390/head64: remove unnecessary vdso_per_cpu_data setup
  s390/early: move control registers setup in C code
  s390/kasan: support memcpy_real with TRACE_IRQFLAGS
  s390/crypto: Fix unsigned variable compared with zero
  s390/pkey: use memdup_user() to simplify code
  s390/pkey: fix memory leak within _copy_apqns_from_user()
  s390/disassembler: don't hide instruction addresses
  s390/cpum_sf: Assign error value to err variable
  s390/cpum_sf: Replace function name in debug statements
  s390/cpum_sf: Use consistant debug print format for sampling
  s390/unwind: drop unnecessary code around calling ftrace_graph_ret_addr()
  s390: add error handling to perf_callchain_kernel
  s390: always inline current_stack_pointer()
  s390/mm: add mm_pxd_folded() checks to pxd_free()
  s390/mm: properly clear _PAGE_NOEXEC bit when it is not supported
  s390/mm: simplify page table helpers for large entries
  s390/mm: make pmd/pud_bad() report large entries as bad
  ...
2019-11-25 17:23:53 -08:00
Linus Torvalds
4ba380f616 arm64 updates for 5.5:
- On ARMv8 CPUs without hardware updates of the access flag, avoid
   failing cow_user_page() on PFN mappings if the pte is old. The patches
   introduce an arch_faults_on_old_pte() macro, defined as false on x86.
   When true, cow_user_page() makes the pte young before attempting
   __copy_from_user_inatomic().
 
 - Covert the synchronous exception handling paths in
   arch/arm64/kernel/entry.S to C.
 
 - FTRACE_WITH_REGS support for arm64.
 
 - ZONE_DMA re-introduced on arm64 to support Raspberry Pi 4
 
 - Several kselftest cases specific to arm64, together with a MAINTAINERS
   update for these files (moved to the ARM64 PORT entry).
 
 - Workaround for a Neoverse-N1 erratum where the CPU may fetch stale
   instructions under certain conditions.
 
 - Workaround for Cortex-A57 and A72 errata where the CPU may
   speculatively execute an AT instruction and associate a VMID with the
   wrong guest page tables (corrupting the TLB).
 
 - Perf updates for arm64: additional PMU topologies on HiSilicon
   platforms, support for CCN-512 interconnect, AXI ID filtering in the
   IMX8 DDR PMU, support for the CCPI2 uncore PMU in ThunderX2.
 
 - GICv3 optimisation to avoid a heavy barrier when accessing the
   ICC_PMR_EL1 register.
 
 - ELF HWCAP documentation updates and clean-up.
 
 - SMC calling convention conduit code clean-up.
 
 - KASLR diagnostics printed during boot
 
 - NVIDIA Carmel CPU added to the KPTI whitelist
 
 - Some arm64 mm clean-ups: use generic free_initrd_mem(), remove stale
   macro, simplify calculation in __create_pgd_mapping(), typos.
 
 - Kconfig clean-ups: CMDLINE_FORCE to depend on CMDLINE, choice for
   endinanness to help with allmodconfig.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl3YJswACgkQa9axLQDI
 XvFwYg//aTGhNLew3ADgW2TYal7LyqetRROixPBrzqHLu2A8No1+QxHMaKxpZVyf
 pt25tABuLtPHql3qBzE0ltmfbLVsPj/3hULo404EJb9HLRfUnVGn7gcPkc+p4YAr
 IYkYPXJbk6OlJ84vI+4vXmDEF12bWCqamC9qZ+h99qTpMjFXFO17DSJ7xQ8Xic3A
 HHgCh4uA7gpTVOhLxaS6KIw+AZNYwvQxLXch2+wj6agbGX79uw9BeMhqVXdkPq8B
 RTDJpOdS970WOT4cHWOkmXwsqqGRqgsgyu+bRUJ0U72+0y6MX0qSHIUnVYGmNc5q
 Dtox4rryYLvkv/hbpkvjgVhv98q3J1mXt/CalChWB5dG4YwhJKN2jMiYuoAvB3WS
 6dR7Dfupgai9gq1uoKgBayS2O6iFLSa4g58vt3EqUBqmM7W7viGFPdLbuVio4ycn
 CNF2xZ8MZR6Wrh1JfggO7Hc11EJdSqESYfHO6V/pYB4pdpnqJLDoriYHXU7RsZrc
 HvnrIvQWKMwNbqBvpNbWvK5mpBMMX2pEienA3wOqKNH7MbepVsG+npOZTVTtl9tN
 FL0ePb/mKJu/2+gW8ntiqYn7EzjKprRmknOiT2FjWWo0PxgJ8lumefuhGZZbaOWt
 /aTAeD7qKd/UXLKGHF/9v3q4GEYUdCFOXP94szWVPyLv+D9h8L8=
 =TPL9
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Apart from the arm64-specific bits (core arch and perf, new arm64
  selftests), it touches the generic cow_user_page() (reviewed by
  Kirill) together with a macro for x86 to preserve the existing
  behaviour on this architecture.

  Summary:

   - On ARMv8 CPUs without hardware updates of the access flag, avoid
     failing cow_user_page() on PFN mappings if the pte is old. The
     patches introduce an arch_faults_on_old_pte() macro, defined as
     false on x86. When true, cow_user_page() makes the pte young before
     attempting __copy_from_user_inatomic().

   - Covert the synchronous exception handling paths in
     arch/arm64/kernel/entry.S to C.

   - FTRACE_WITH_REGS support for arm64.

   - ZONE_DMA re-introduced on arm64 to support Raspberry Pi 4

   - Several kselftest cases specific to arm64, together with a
     MAINTAINERS update for these files (moved to the ARM64 PORT entry).

   - Workaround for a Neoverse-N1 erratum where the CPU may fetch stale
     instructions under certain conditions.

   - Workaround for Cortex-A57 and A72 errata where the CPU may
     speculatively execute an AT instruction and associate a VMID with
     the wrong guest page tables (corrupting the TLB).

   - Perf updates for arm64: additional PMU topologies on HiSilicon
     platforms, support for CCN-512 interconnect, AXI ID filtering in
     the IMX8 DDR PMU, support for the CCPI2 uncore PMU in ThunderX2.

   - GICv3 optimisation to avoid a heavy barrier when accessing the
     ICC_PMR_EL1 register.

   - ELF HWCAP documentation updates and clean-up.

   - SMC calling convention conduit code clean-up.

   - KASLR diagnostics printed during boot

   - NVIDIA Carmel CPU added to the KPTI whitelist

   - Some arm64 mm clean-ups: use generic free_initrd_mem(), remove
     stale macro, simplify calculation in __create_pgd_mapping(), typos.

   - Kconfig clean-ups: CMDLINE_FORCE to depend on CMDLINE, choice for
     endinanness to help with allmodconfig"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits)
  arm64: Kconfig: add a choice for endianness
  kselftest: arm64: fix spelling mistake "contiguos" -> "contiguous"
  arm64: Kconfig: make CMDLINE_FORCE depend on CMDLINE
  MAINTAINERS: Add arm64 selftests to the ARM64 PORT entry
  arm64: kaslr: Check command line before looking for a seed
  arm64: kaslr: Announce KASLR status on boot
  kselftest: arm64: fake_sigreturn_misaligned_sp
  kselftest: arm64: fake_sigreturn_bad_size
  kselftest: arm64: fake_sigreturn_duplicated_fpsimd
  kselftest: arm64: fake_sigreturn_missing_fpsimd
  kselftest: arm64: fake_sigreturn_bad_size_for_magic0
  kselftest: arm64: fake_sigreturn_bad_magic
  kselftest: arm64: add helper get_current_context
  kselftest: arm64: extend test_init functionalities
  kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht]
  kselftest: arm64: mangle_pstate_invalid_daif_bits
  kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils
  kselftest: arm64: extend toplevel skeleton Makefile
  drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform
  arm64: mm: reserve CMA and crashkernel in ZONE_DMA32
  ...
2019-11-25 15:39:19 -08:00
Linus Torvalds
e25645b181 linux-kselftest-5.5-rc1-kunit
This kselftest update for Linux 5.5-rc1 adds KUnit, a lightweight unit
 testing and mocking framework for the Linux kernel from Brendan Higgins.
 
 KUnit is not an end-to-end testing framework. It is currently supported
 on UML and sub-systems can write unit tests and run them in UML env.
 KUnit documentation is included in this update.
 
 In addition, this Kunit update adds 3 new kunit tests:
 
 - kunit test for proc sysctl from Iurii Zaikin
 - kunit test for the 'list' doubly linked list from David Gow
 - ext4 kunit test for decoding extended timestamps from Iurii Zaikin
 
 In the future KUnit will be linked to Kselftest framework to provide
 a way to trigger KUnit tests from user-space.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl3YfBQACgkQCwJExA0N
 QxzTjxAAiFhaDMhlpLhn1DpIUNvfKrIgDjJgajQAyMMs6TJK3OrD6J4WbpVD7wGo
 aqF9l6o64sY18JAo3s00j6AcAmVwNH7qzEEuzIQPjJvQ8C4sCWL3esEP4JHgFb2F
 snlSn5KjSsdC1D9N7uQIhgW76xPSyDrTwWQpglvmB9TwmJVBIl9zhu+unp73ufFJ
 N+ieDg8A6W/wDGYSq5JBSkJbuI0gL+daNwUYzxEEZIskndhpovOc82WAldECRm6x
 TfI0u39zTbrEO0DHgmYpyGbTN8TB2mXjH5HMjwg+KbHfKVTKKGvTK7XFs8mWGQpO
 n2meypZuwuIsRPOPcAVs+Gt2dc0jFODJVIV1EzA0WSv6TEdPqyhM/d13tHdCqjm9
 ic5wQ/hQQNEB1Dvg5ereXBaGGaoqP95y61ZpCS9vCXFXH+28E/B63Ebfs2IBIuqS
 Jv2KcoxabyZq3uGdjnn+mD7IM8rkvscRP4Ba31nXRgJIYDHAzqe7APN7y3on4NGx
 1q7lBlA3XZZ8qgo0zpLST20ck/qaL3tk4k8E1f8emh6CuyrCWtazgrWkMIlyEX0O
 8nre3uEAF9xUzB4+gZK4YmelN9Bld3Uv7Ippt1zTCiQ0FkEABQIMUrTZygy7Wfg6
 6qi4dk8frWW4Kt63gOXsMxr9FWTqDk+Ys4GPAVDVm1d0dzERn8k=
 =0zEe
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-5.5-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest KUnit support gtom Shuah Khan:
 "This adds KUnit, a lightweight unit testing and mocking framework for
  the Linux kernel from Brendan Higgins.

  KUnit is not an end-to-end testing framework. It is currently
  supported on UML and sub-systems can write unit tests and run them in
  UML env. KUnit documentation is included in this update.

  In addition, this Kunit update adds 3 new kunit tests:

   - proc sysctl test from Iurii Zaikin

   - the 'list' doubly linked list test from David Gow

   - ext4 tests for decoding extended timestamps from Iurii Zaikin

  In the future KUnit will be linked to Kselftest framework to provide a
  way to trigger KUnit tests from user-space"

* tag 'linux-kselftest-5.5-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (23 commits)
  lib/list-test: add a test for the 'list' doubly linked list
  ext4: add kunit test for decoding extended timestamps
  Documentation: kunit: Fix verification command
  kunit: Fix '--build_dir' option
  kunit: fix failure to build without printk
  MAINTAINERS: add proc sysctl KUnit test to PROC SYSCTL section
  kernel/sysctl-test: Add null pointer test for sysctl.c:proc_dointvec()
  MAINTAINERS: add entry for KUnit the unit testing framework
  Documentation: kunit: add documentation for KUnit
  kunit: defconfig: add defconfigs for building KUnit tests
  kunit: tool: add Python wrappers for running KUnit tests
  kunit: test: add tests for KUnit managed resources
  kunit: test: add the concept of assertions
  kunit: test: add tests for kunit test abort
  kunit: test: add support for test abort
  objtool: add kunit_try_catch_throw to the noreturn list
  kunit: test: add initial tests
  lib: enable building KUnit in lib/
  kunit: test: add the concept of expectations
  kunit: test: add assertion printing library
  ...
2019-11-25 15:01:30 -08:00
Anton Ivanov
9807019a62 um: Loadable BPF "Firmware" for vector drivers
All vector drivers now allow a BPF program to be loaded and
associated with the RX socket in the host kernel.

1. The program can be loaded as an extra kernel command line
option to any of the vector drivers.

2. The program can also be loaded as "firmware", using the
ethtool flash option. It is possible to turn this facility
on or off using a command line option.

A simplistic wrapper for generating the BPF firmware for the raw
socket driver out of a tcpdump/libpcap filter expression can be
found at: https://github.com/kot-begemot-uk/uml_vector_utilities/

Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-11-25 22:43:31 +01:00
Krzysztof Kozlowski
7d8093a560 um: Fix Kconfig indentation
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
	$ sed -e 's/^        /\t/' -i */Kconfig

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-11-25 22:43:28 +01:00
Johannes Berg
bf9f80cf0c um: virtio_uml: Disallow modular build
This driver *can* be a module, but then its parameters (socket path)
are untrusted data from inside the VM, and that isn't allowed. Allow
the code to only be built-in to avoid that.

Fixes: 5d38f32499 ("um: drivers: Add virtio vhost-user driver")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-11-25 22:43:21 +01:00
Johannes Berg
7e60746005 um: virtio: Keep reading on -EAGAIN
When we get an interrupt from the socket getting readable,
and start reading, there's a possibility for a race. This
depends on the implementation of the device, but e.g. with
qemu's libvhost-user, we can see:

 device                 virtio_uml
---------------------------------------
  write header
                         get interrupt
                         read header
                         read body -> returns -EAGAIN
  write body

The -EAGAIN return is because the socket is non-blocking,
and then this leads us to abandon this message.

In fact, we've already read the header, so when the get
another signal/interrupt for the body, we again read it
as though it's a new message header, and also abandon it
for the same reason (wrong size etc.)

This essentially breaks things, and if that message was
one that required a response, it leads to a deadlock as
the device is waiting for the response but we'll never
reply.

Fix this by spinning on -EAGAIN as well when we read the
message body. We need to handle -EAGAIN as "no message"
while reading the header, since we share an interrupt.

Note that this situation is highly unlikely to occur in
normal usage, since there will be very few messages and
only in the startup phase. With the inband call feature
this does tend to happen (eventually) though.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-11-25 22:43:13 +01:00
Johannes Berg
04e5b1fb01 um: virtio: Remove device on disconnect
If the connection drops, just remove the device, we don't try
to recover from this right now.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-11-25 22:43:08 +01:00
Johannes Berg
5c1f33e2a0 um: Don't trace irqflags during shutdown
In the main() code, we eventually enable signals just before
exec() or exit(), in order to to not have signals pending and
delivered *after* the exec().

I've observed SIGSEGV loops at this point, and the reason seems
to be the irqflags tracing; this makes sense as the kernel is
no longer really functional at this point. Since there's really
no reason to use unblock_signals_trace() here (I had just done
a global search & replace), use the plain unblock_signals() in
this case to avoid going into the no longer functional kernel.

Fixes: 0dafcbe128 ("um: Implement TRACE_IRQFLAGS_SUPPORT")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2019-11-25 22:42:57 +01:00
Arnd Bergmann
af3784689e y2038: ipc: fix x32 ABI breakage
The correct type on x32 is 64-bit wide, same as for the other struct
members around it, so use  __kernel_long_t in place of the original
__kernel_time_t here, corresponding to the rest of the structure.

Fixes: caf5e32d4e ("y2038: ipc: remove __kernel_time_t reference from headers")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-11-25 21:30:12 +01:00
Nathan Chancellor
8dcd71b45d powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp
LLVM revision r374662 gives LLVM the ability to convert certain loops
into a reference to bcmp as an optimization; this breaks
prom_init_check.sh:

    CALL    arch/powerpc/kernel/prom_init_check.sh
  Error: External symbol 'bcmp' referenced from prom_init.c
  make[2]: *** [arch/powerpc/kernel/Makefile:196: prom_init_check] Error 1

bcmp is defined in lib/string.c as a wrapper for memcmp so this could
be added to the whitelist. However, commit
450e7dd400 ("powerpc/prom_init: don't use string functions from
lib/") copied memcmp as prom_memcmp to avoid KASAN instrumentation so
having bcmp be resolved to regular memcmp would break that assumption.
Furthermore, because the compiler is the one that inserted bcmp, we
cannot provide something like prom_bcmp.

To prevent LLVM from being clever with optimizations like this, use
-ffreestanding to tell LLVM we are not hosted so it is not free to
make transformations like this.

Reviewed-by: Nick Desaulneris <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191119045712.39633-4-natechancellor@gmail.com
2019-11-25 21:45:43 +11:00
Nathan Chancellor
c9029ef9c9 powerpc: Avoid clang warnings around setjmp and longjmp
Commit aea447141c ("powerpc: Disable -Wbuiltin-requires-header when
setjmp is used") disabled -Wbuiltin-requires-header because of a
warning about the setjmp and longjmp declarations.

r367387 in clang added another diagnostic around this, complaining
that there is no jmp_buf declaration.

  In file included from ../arch/powerpc/xmon/xmon.c:47:
  ../arch/powerpc/include/asm/setjmp.h:10:13: error: declaration of
  built-in function 'setjmp' requires the declaration of the 'jmp_buf'
  type, commonly provided in the header <setjmp.h>.
  [-Werror,-Wincomplete-setjmp-declaration]
  extern long setjmp(long *);
              ^
  ../arch/powerpc/include/asm/setjmp.h:11:13: error: declaration of
  built-in function 'longjmp' requires the declaration of the 'jmp_buf'
  type, commonly provided in the header <setjmp.h>.
  [-Werror,-Wincomplete-setjmp-declaration]
  extern void longjmp(long *, long);
              ^
  2 errors generated.

We are not using the standard library's longjmp/setjmp implementations
for obvious reasons; make this clear to clang by using -ffreestanding
on these files.

Cc: stable@vger.kernel.org # 4.14+
Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191119045712.39633-3-natechancellor@gmail.com
2019-11-25 21:45:43 +11:00
Nathan Chancellor
465bfd9c44 powerpc: Don't add -mabi= flags when building with Clang
When building pseries_defconfig, building vdso32 errors out:

  error: unknown target ABI 'elfv1'

This happens because -m32 in clang changes the target to 32-bit,
which does not allow the ABI to be changed.

Commit 4dc831aa88 ("powerpc: Fix compiling a BE kernel with a
powerpc64le toolchain") added these flags to fix building big endian
kernels with a little endian GCC.

Clang doesn't need -mabi because the target triple controls the
default value. -mlittle-endian and -mbig-endian manipulate the triple
into either powerpc64-* or powerpc64le-*, which properly sets the
default ABI.

Adding a debug print out in the PPC64TargetInfo constructor after line
383 above shows this:

  $ echo | ./clang -E --target=powerpc64-linux -mbig-endian -o /dev/null -
  Default ABI: elfv1

  $ echo | ./clang -E --target=powerpc64-linux -mlittle-endian -o /dev/null -
  Default ABI: elfv2

  $ echo | ./clang -E --target=powerpc64le-linux -mbig-endian -o /dev/null -
  Default ABI: elfv1

  $ echo | ./clang -E --target=powerpc64le-linux -mlittle-endian -o /dev/null -
  Default ABI: elfv2

Don't specify -mabi when building with clang to avoid the build error
with -m32 and not change any code generation.

-mcall-aixdesc is not an implemented flag in clang so it can be safely
excluded as well, see commit 238abecde8 ("powerpc: Don't use gcc
specific options on clang").

pseries_defconfig successfully builds after this patch and
powernv_defconfig and ppc44x_defconfig don't regress.

Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
[mpe: Trim clang links in change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191119045712.39633-2-natechancellor@gmail.com
2019-11-25 21:45:43 +11:00
Krzysztof Kozlowski
5f017a56aa powerpc: Fix Kconfig indentation
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
	$ sed -e 's/^        /\t/' -i */Kconfig

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1574306461-7646-1-git-send-email-krzk@kernel.org
2019-11-25 21:45:43 +11:00
Christophe Leroy
f2bb86937d powerpc/fixmap: don't clear fixmap area in paging_init()
fixmap is intended to map things permanently like the IMMR region on
FSL SOC (8xx, 83xx, ...), so don't clear it when initialising paging()

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/41c99bc06394a6bc2888631cb98a3ed2ae281ddb.1568295907.git.christophe.leroy@c-s.fr
2019-11-25 21:45:43 +11:00
Paolo Bonzini
9671024729 Second KVM PPC update for 5.5
- Two fixes from Greg Kurz to fix memory leak bugs in the XIVE code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEv0VLfXa2m9eKuaRpnZrqdyxjcZ8FAl3bJKwACgkQnZrqdyxj
 cZ92xQgAhgnARWJwh+uazayNrwB12TJA7G25RO8CUEwWaAY/io5QeO7nQCmNZ3cf
 TflQpI1dL5qFpzU7uNunHqdqyhlaD0wwkHfrN71molr5sA1uRlIyxwwkE6coZQEC
 n/LiGayoxqt2Ra06H4L4SGSjb7fcCl8eYjC3xjTx9Zdf/iXVcwYprBch5kcrToLV
 s0NvRvDgwcaqsxQyybTO0wRvME/qz9JFtNUgl6H4PNSt3l/yv+rM+BgjyNR3tyKu
 B1G4937GqBIAV4jYmK0a/LDnNfxs9EmOjuJLKCHmVxlfbsg8wasNk3kj+mdrh2O3
 ZjCdh782GyGwp/ysddOHmIhXFyQMhQ==
 =9kV2
 -----END PGP SIGNATURE-----

Merge tag 'kvm-ppc-next-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD

Second KVM PPC update for 5.5

- Two fixes from Greg Kurz to fix memory leak bugs in the XIVE code.
2019-11-25 11:29:05 +01:00
Andy Lutomirski
4a13b0e3e1 x86/entry/32: Fix FIXUP_ESPFIX_STACK with user CR3
UNWIND_ESPFIX_STACK needs to read the GDT, and the GDT mapping that
can be accessed via %fs is not mapped in the user pagetables.  Use
SGDT to find the cpu_entry_area mapping and read the espfix offset
from that instead.

Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25 09:36:47 +01:00
Will Deacon
fb041bb7c0 locking/refcount: Consolidate implementations of refcount_t
The generic implementation of refcount_t should be good enough for
everybody, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely,
leaving the generic implementation enabled unconditionally.

Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191121115902.2551-9-will@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25 09:15:32 +01:00
Ingo Molnar
ceb9e77324 Merge branch 'x86/core' into perf/core, to resolve conflicts and to pick up completed topic tree
Conflicts:
	tools/perf/check-headers.sh

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25 09:09:27 +01:00
Ingo Molnar
c494cd6469 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25 09:08:29 +01:00
Ingo Molnar
f01ec4fca8 Merge branch 'x86/build' into x86/asm, to pick up completed topic branch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25 09:05:09 +01:00
Ingo Molnar
05b042a194 x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make the CPU_ENTRY_AREA_PAGES assert precise
When two recent commits that increased the size of the 'struct cpu_entry_area'
were merged in -tip, the 32-bit defconfig build started failing on the following
build time assert:

  ./include/linux/compiler.h:391:38: error: call to ‘__compiletime_assert_189’ declared with attribute error: BUILD_BUG_ON failed: CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE
  arch/x86/mm/cpu_entry_area.c:189:2: note: in expansion of macro ‘BUILD_BUG_ON’
  In function ‘setup_cpu_entry_area_ptes’,

Which corresponds to the following build time assert:

	BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE);

The purpose of this assert is to sanity check the fixed-value definition of
CPU_ENTRY_AREA_PAGES arch/x86/include/asm/pgtable_32_types.h:

	#define CPU_ENTRY_AREA_PAGES    (NR_CPUS * 41)

The '41' is supposed to match sizeof(struct cpu_entry_area)/PAGE_SIZE, which value
we didn't want to define in such a low level header, because it would cause
dependency hell.

Every time the size of cpu_entry_area is changed, we have to adjust CPU_ENTRY_AREA_PAGES
accordingly - and this assert is checking that constraint.

But the assert is both imprecise and buggy, primarily because it doesn't
include the single readonly IDT page that is mapped at CPU_ENTRY_AREA_BASE
(which begins at a PMD boundary).

This bug was hidden by the fact that by accident CPU_ENTRY_AREA_PAGES is defined
too large upstream (v5.4-rc8):

	#define CPU_ENTRY_AREA_PAGES    (NR_CPUS * 40)

While 'struct cpu_entry_area' is 155648 bytes, or 38 pages. So we had two extra
pages, which hid the bug.

The following commit (not yet upstream) increased the size to 40 pages:

  x86/iopl: ("Restrict iopl() permission scope")

... but increased CPU_ENTRY_AREA_PAGES only 41 - i.e. shortening the gap
to just 1 extra page.

Then another not-yet-upstream commit changed the size again:

  880a98c339: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit")

Which increased the cpu_entry_area size from 38 to 39 pages, but
didn't change CPU_ENTRY_AREA_PAGES (kept it at 40). This worked
fine, because we still had a page left from the accidental 'reserve'.

But when these two commits were merged into the same tree, the
combined size of cpu_entry_area grew from 38 to 40 pages, while
CPU_ENTRY_AREA_PAGES finally caught up to 40 as well.

Which is fine in terms of functionality, but the assert broke:

	BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE);

because CPU_ENTRY_AREA_MAP_SIZE is the total size of the area,
which is 1 page larger due to the IDT page.

To fix all this, change the assert to two precise asserts:

	BUILD_BUG_ON((CPU_ENTRY_AREA_PAGES+1)*PAGE_SIZE != CPU_ENTRY_AREA_MAP_SIZE);
	BUILD_BUG_ON(CPU_ENTRY_AREA_TOTAL_SIZE != CPU_ENTRY_AREA_MAP_SIZE);

This takes the IDT page into account, and also connects the size-based
define of CPU_ENTRY_AREA_TOTAL_SIZE with the address-subtraction based
define of CPU_ENTRY_AREA_MAP_SIZE.

Also clean up some of the names which made it rather confusing:

 - 'CPU_ENTRY_AREA_TOT_SIZE' wasn't actually the 'total' size of
   the cpu-entry-area, but the per-cpu array size, so rename this
   to CPU_ENTRY_AREA_ARRAY_SIZE.

 - Introduce CPU_ENTRY_AREA_TOTAL_SIZE that _is_ the total mapping
   size, with the IDT included.

 - Add comments where '+1' denotes the IDT mapping - it wasn't
   obvious and took me about 3 hours to decode...

Finally, because this particular commit is actually applied after
this patch:

  880a98c339: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit")

Fix the CPU_ENTRY_AREA_PAGES value from 40 pages to the correct 39 pages.

All future commits that change cpu_entry_area will have to adjust
this value precisely.

As a side note, we should probably attempt to remove CPU_ENTRY_AREA_PAGES
and derive its value directly from the structure, without causing
header hell - but that is an adventure for another day! :-)

Fixes: 880a98c339: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit")
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25 08:53:33 +01:00
Daniel Borkmann
b553a6ec57 bpf: Simplify __bpf_arch_text_poke poke type handling
Given that we have BPF_MOD_NOP_TO_{CALL,JUMP}, BPF_MOD_{CALL,JUMP}_TO_NOP
and BPF_MOD_{CALL,JUMP}_TO_{CALL,JUMP} poke types and that we also pass in
old_addr as well as new_addr, it's a bit redundant and unnecessarily
complicates __bpf_arch_text_poke() itself since we can derive the same from
the *_addr that were passed in. Hence simplify and use BPF_MOD_{CALL,JUMP}
as types which also allows to clean up call-sites.

In addition to that, __bpf_arch_text_poke() currently verifies that text
matches expected old_insn before we invoke text_poke_bp(). Also add a check
on new_insn and skip rewrite if it already matches. Reason why this is rather
useful is that it avoids making any special casing in prog_array_map_poke_run()
when old and new prog were NULL and has the benefit that also for this case
we perform a check on text whether it really matches our expectations.

Suggested-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/fcb00a2b0b288d6c73de4ef58116a821c8fe8f2f.1574555798.git.daniel@iogearbox.net
2019-11-24 17:12:11 -08:00
Daniel Borkmann
428d5df1fa bpf, x86: Emit patchable direct jump as tail call
Add initial code emission for *direct* jumps for tail call maps in
order to avoid the retpoline overhead from a493a87f38 ("bpf, x64:
implement retpoline for tail call") for situations that allow for
it, meaning, for known constant keys at verification time which are
used as index into the tail call map. In case of Cilium which makes
heavy use of tail calls, constant keys are used in the vast majority,
only for a single occurrence we use a dynamic key.

High level outline is that if the target prog is NULL in the map, we
emit a 5-byte nop for the fall-through case and if not, we emit a
5-byte direct relative jmp to the target bpf_func + skipped prologue
offset. Later during runtime, we patch these 5-byte nop/jmps upon
tail call map update or deletions dynamically. Note that on x86-64
the direct jmp works as we reuse the same stack frame and skip
prologue (as opposed to some other JIT implementations).

One of the issues is that the tail call map slots can change at any
given time even during JITing. Therefore, we have two passes: i) emit
nops for all patchable locations during main JITing phase until we
declare prog->jited = 1 eventually. At this point the image is stable,
not public yet and with all jmps disabled. While JITing, we collect
additional info like poke->ip in order to remember the patch location
for later modifications. In ii) bpf_tail_call_direct_fixup() walks
over the progs poke_tab, locks the tail call maps poke_mutex to
prevent from parallel updates and patches in the right locations via
__bpf_arch_text_poke(). Note, the main bpf_arch_text_poke() cannot
be used at this point since we're not yet exposed to kallsyms. For
the update we use plain memcpy() since the image is not public and
still in read-write mode. After patching, we activate that poke entry
through poke->ip_stable. Meaning, at this point any tail call map
updates/deletions are not going to ignore that poke entry anymore.
Then, bpf_arch_text_poke() might still occur on the read-write image
until we finally locked it as read-only. Both modifications on the
given image are under text_mutex to avoid interference with each
other when update requests come in in parallel for different tail
call maps (current one we have locked in JIT and different one where
poke->ip_stable was already set).

Example prog:

  # ./bpftool p d x i 1655
   0: (b7) r3 = 0
   1: (18) r2 = map[id:526]
   3: (85) call bpf_tail_call#12
   4: (b7) r0 = 1
   5: (95) exit

Before:

  # ./bpftool p d j i 1655
  0xffffffffc076e55c:
   0:   nopl   0x0(%rax,%rax,1)
   5:   push   %rbp
   6:   mov    %rsp,%rbp
   9:   sub    $0x200,%rsp
  10:   push   %rbx
  11:   push   %r13
  13:   push   %r14
  15:   push   %r15
  17:   pushq  $0x0                      _
  19:   xor    %edx,%edx                |_ index (arg 3)
  1b:   movabs $0xffff88d95cc82600,%rsi |_ map (arg 2)
  25:   mov    %edx,%edx                |  index >= array->map.max_entries
  27:   cmp    %edx,0x24(%rsi)          |
  2a:   jbe    0x0000000000000066       |_
  2c:   mov    -0x224(%rbp),%eax        |  tail call limit check
  32:   cmp    $0x20,%eax               |
  35:   ja     0x0000000000000066       |
  37:   add    $0x1,%eax                |
  3a:   mov    %eax,-0x224(%rbp)        |_
  40:   mov    0xd0(%rsi,%rdx,8),%rax   |_ prog = array->ptrs[index]
  48:   test   %rax,%rax                |  prog == NULL check
  4b:   je     0x0000000000000066       |_
  4d:   mov    0x30(%rax),%rax          |  goto *(prog->bpf_func + prologue_size)
  51:   add    $0x19,%rax               |
  55:   callq  0x0000000000000061       |  retpoline for indirect jump
  5a:   pause                           |
  5c:   lfence                          |
  5f:   jmp    0x000000000000005a       |
  61:   mov    %rax,(%rsp)              |
  65:   retq                            |_
  66:   mov    $0x1,%eax
  6b:   pop    %rbx
  6c:   pop    %r15
  6e:   pop    %r14
  70:   pop    %r13
  72:   pop    %rbx
  73:   leaveq
  74:   retq

After; state after JIT:

  # ./bpftool p d j i 1655
  0xffffffffc08e8930:
   0:   nopl   0x0(%rax,%rax,1)
   5:   push   %rbp
   6:   mov    %rsp,%rbp
   9:   sub    $0x200,%rsp
  10:   push   %rbx
  11:   push   %r13
  13:   push   %r14
  15:   push   %r15
  17:   pushq  $0x0                      _
  19:   xor    %edx,%edx                |_ index (arg 3)
  1b:   movabs $0xffff9d8afd74c000,%rsi |_ map (arg 2)
  25:   mov    -0x224(%rbp),%eax        |  tail call limit check
  2b:   cmp    $0x20,%eax               |
  2e:   ja     0x000000000000003e       |
  30:   add    $0x1,%eax                |
  33:   mov    %eax,-0x224(%rbp)        |_
  39:   jmpq   0xfffffffffffd1785       |_ [direct] goto *(prog->bpf_func + prologue_size)
  3e:   mov    $0x1,%eax
  43:   pop    %rbx
  44:   pop    %r15
  46:   pop    %r14
  48:   pop    %r13
  4a:   pop    %rbx
  4b:   leaveq
  4c:   retq

After; state after map update (target prog):

  # ./bpftool p d j i 1655
  0xffffffffc08e8930:
   0:   nopl   0x0(%rax,%rax,1)
   5:   push   %rbp
   6:   mov    %rsp,%rbp
   9:   sub    $0x200,%rsp
  10:   push   %rbx
  11:   push   %r13
  13:   push   %r14
  15:   push   %r15
  17:   pushq  $0x0
  19:   xor    %edx,%edx
  1b:   movabs $0xffff9d8afd74c000,%rsi
  25:   mov    -0x224(%rbp),%eax
  2b:   cmp    $0x20,%eax               .
  2e:   ja     0x000000000000003e       .
  30:   add    $0x1,%eax                .
  33:   mov    %eax,-0x224(%rbp)        |_
  39:   jmpq   0xffffffffffb09f55       |_ goto *(prog->bpf_func + prologue_size)
  3e:   mov    $0x1,%eax
  43:   pop    %rbx
  44:   pop    %r15
  46:   pop    %r14
  48:   pop    %r13
  4a:   pop    %rbx
  4b:   leaveq
  4c:   retq

After; state after map update (no prog):

  # ./bpftool p d j i 1655
  0xffffffffc08e8930:
   0:   nopl   0x0(%rax,%rax,1)
   5:   push   %rbp
   6:   mov    %rsp,%rbp
   9:   sub    $0x200,%rsp
  10:   push   %rbx
  11:   push   %r13
  13:   push   %r14
  15:   push   %r15
  17:   pushq  $0x0
  19:   xor    %edx,%edx
  1b:   movabs $0xffff9d8afd74c000,%rsi
  25:   mov    -0x224(%rbp),%eax
  2b:   cmp    $0x20,%eax               .
  2e:   ja     0x000000000000003e       .
  30:   add    $0x1,%eax                .
  33:   mov    %eax,-0x224(%rbp)        |_
  39:   nopl   0x0(%rax,%rax,1)         |_ fall-through nop
  3e:   mov    $0x1,%eax
  43:   pop    %rbx
  44:   pop    %r15
  46:   pop    %r14
  48:   pop    %r13
  4a:   pop    %rbx
  4b:   leaveq
  4c:   retq

Nice bonus is that this also shrinks the code emission quite a bit
for every tail call invocation.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/6ada4c1c9d35eeb5f4ecfab94593dafa6b5c4b09.1574452833.git.daniel@iogearbox.net
2019-11-24 17:04:11 -08:00
Daniel Borkmann
4b3da77b72 bpf, x86: Generalize and extend bpf_arch_text_poke for direct jumps
Add BPF_MOD_{NOP_TO_JUMP,JUMP_TO_JUMP,JUMP_TO_NOP} patching for x86
JIT in order to be able to patch direct jumps or nop them out. We need
this facility in order to patch tail call jumps and in later work also
BPF static keys.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/aa4784196a8e5e985af4b30a4fe5336bce6e9643.1574452833.git.daniel@iogearbox.net
2019-11-24 16:58:47 -08:00
Eric Dumazet
c392bccf2c powerpc: Add const qual to local_read() parameter
A patch in net-next triggered a compile error on powerpc:

  include/linux/u64_stats_sync.h: In function 'u64_stats_read':
  include/asm-generic/local64.h:30:37: warning: passing argument 1 of 'local_read' discards 'const' qualifier from pointer target type

This seems reasonable to relax powerpc local_read() requirements.

Fixes: 316580b69d ("u64_stats: provide u64_stats_t type")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <lkp@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # build only
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-24 15:06:33 -08:00
Thomas Bogendoerfer
a8d0f11ee5
MIPS: SGI-IP27: Enable ethernet phy on second Origin 200 module
PROM only enables ethernet PHY on first Origin 200 module, so we must
do it ourselves for the second module.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-rtc@vger.kernel.org
Cc: linux-serial@vger.kernel.org
2019-11-23 14:20:30 -08:00
Thomas Bogendoerfer
29b261ff6f
MIPS: PCI: Fix fake subdevice ID for IOC3
Generation of fake subdevice ID had vendor and device ID swapped.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-rtc@vger.kernel.org
Cc: linux-serial@vger.kernel.org
2019-11-23 14:20:18 -08:00
Jim Mattson
85c9aae9ac kvm: nVMX: Relax guest IA32_FEATURE_CONTROL constraints
Commit 37e4c997da ("KVM: VMX: validate individual bits of guest
MSR_IA32_FEATURE_CONTROL") broke the KVM_SET_MSRS ABI by instituting
new constraints on the data values that kvm would accept for the guest
MSR, IA32_FEATURE_CONTROL. Perhaps these constraints should have been
opt-in via a new KVM capability, but they were applied
indiscriminately, breaking at least one existing hypervisor.

Relax the constraints to allow either or both of
FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX and
FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX to be set when nVMX is
enabled. This change is sufficient to fix the aforementioned breakage.

Fixes: 37e4c997da ("KVM: VMX: validate individual bits of guest MSR_IA32_FEATURE_CONTROL")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-23 11:34:46 +01:00
Sean Christopherson
ad5996d9a0 KVM: x86: Grab KVM's srcu lock when setting nested state
Acquire kvm->srcu for the duration of ->set_nested_state() to fix a bug
where nVMX derefences ->memslots without holding ->srcu or ->slots_lock.

The other half of nested migration, ->get_nested_state(), does not need
to acquire ->srcu as it is a purely a dump of internal KVM (and CPU)
state to userspace.

Detected as an RCU lockdep splat that is 100% reproducible by running
KVM's state_test selftest with CONFIG_PROVE_LOCKING=y.  Note that the
failing function, kvm_is_visible_gfn(), is only checking the validity of
a gfn, it's not actually accessing guest memory (which is more or less
unsupported during vmx_set_nested_state() due to incorrect MMU state),
i.e. vmx_set_nested_state() itself isn't fundamentally broken.  In any
case, setting nested state isn't a fast path so there's no reason to go
out of our way to avoid taking ->srcu.

  =============================
  WARNING: suspicious RCU usage
  5.4.0-rc7+ #94 Not tainted
  -----------------------------
  include/linux/kvm_host.h:626 suspicious rcu_dereference_check() usage!

               other info that might help us debug this:

  rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by evmcs_test/10939:
   #0: ffff88826ffcb800 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x85/0x630 [kvm]

  stack backtrace:
  CPU: 1 PID: 10939 Comm: evmcs_test Not tainted 5.4.0-rc7+ #94
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  Call Trace:
   dump_stack+0x68/0x9b
   kvm_is_visible_gfn+0x179/0x180 [kvm]
   mmu_check_root+0x11/0x30 [kvm]
   fast_cr3_switch+0x40/0x120 [kvm]
   kvm_mmu_new_cr3+0x34/0x60 [kvm]
   nested_vmx_load_cr3+0xbd/0x1f0 [kvm_intel]
   nested_vmx_enter_non_root_mode+0xab8/0x1d60 [kvm_intel]
   vmx_set_nested_state+0x256/0x340 [kvm_intel]
   kvm_arch_vcpu_ioctl+0x491/0x11a0 [kvm]
   kvm_vcpu_ioctl+0xde/0x630 [kvm]
   do_vfs_ioctl+0xa2/0x6c0
   ksys_ioctl+0x66/0x70
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x54/0x200
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7f59a2b95f47

Fixes: 8fcc4b5923 ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-23 11:30:15 +01:00
Sean Christopherson
05c19c2fe1 KVM: x86: Open code shared_msr_update() in its only caller
Fold shared_msr_update() into its sole user to eliminate its pointless
bounds check, its godawful printk, its misleading comment (it's called
under a global lock), and its woefully inaccurate name.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-23 11:29:38 +01:00
Sean Christopherson
24885d1d79 KVM: x86: Remove a spurious export of a static function
A recent change inadvertently exported a static function, which results
in modpost throwing a warning.  Fix it.

Fixes: cbbaa2727a ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-23 11:28:59 +01:00
Paul Walmsley
5ba9aa56e6 Merge branch 'next/nommu' into for-next
Conflicts:
	arch/riscv/boot/Makefile
	arch/riscv/include/asm/sbi.h
2019-11-22 18:59:09 -08:00
Paul Walmsley
4a979862dd Merge branch 'next/misc' into for-next 2019-11-22 18:58:34 -08:00
Paul Walmsley
e8cad25b7e Merge branch 'next/tlb-opt' into for-next 2019-11-22 18:58:32 -08:00
Paul Walmsley
9acfd6f538 Merge branch 'next/isa-string' into for-next 2019-11-22 18:58:29 -08:00
Paul Walmsley
69049d523f Merge branch 'next/seccomp' into for-next 2019-11-22 18:58:26 -08:00
Hassan Naveed
16c0f03f62 tracing: Enable syscall optimization for MIPS
Since MIPS architecture has a sparse syscall array, select the
HAVE_SPARSE_SYSCALL_NR to save space.

Link: http://lkml.kernel.org/r/20191115234314.21599-2-hnaveed@wavecomp.com

Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com>
Reviewed-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-22 19:47:41 -05:00
Hassan Naveed
0e24220821 tracing: Use xarray for syscall trace events
Currently, a lot of memory is wasted for architectures like MIPS when
init_ftrace_syscalls() allocates the array for syscalls using kcalloc.
This is because syscalls numbers start from 4000, 5000 or 6000 and
array elements up to that point are unused.
Fix this by using a data structure more suited to storing sparsely
populated arrays. The XARRAY data structure, implemented using radix
trees, is much more memory efficient for storing the syscalls in
question.

Link: http://lkml.kernel.org/r/20191115234314.21599-1-hnaveed@wavecomp.com

Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com>
Reviewed-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-22 19:47:41 -05:00
Jakub Kicinski
a9f852e92e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock
from commit c8183f5489 ("s390/qeth: fix potential deadlock on
workqueue flush"), removed the code which was removed by commit
9897d583b0 ("s390/qeth: consolidate some duplicated HW cmd code").

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-22 16:27:24 -08:00
Zhou Yanjie
b02efeb056
MIPS: Ingenic: Disable abandoned HPTLB function.
JZ4760/JZ4770/JZ4775/X1000/X1500 has an abandoned huge page tlb,
this mode is not compatible with the MIPS standard, it will cause
tlbmiss and into an infinite loop (line 21 in the tlb-funcs.S)
when starting the init process. write 0xa9000000 to cp0 register 5
sel 4 to disable this function to prevent getting stuck. Confirmed
by Ingenic, this operation will not adversely affect processors
without HPTLB function.

Signed-off-by: Zhou Yanjie <zhouyanjie@zoho.com>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: ralf@linux-mips.org
Cc: jhogan@kernel.org
Cc: jiaxun.yang@flygoat.com
Cc: gregkh@linuxfoundation.org
Cc: malat@debian.org
Cc: tglx@linutronix.de
Cc: chenhc@lemote.com
2019-11-22 14:00:28 -08:00
Krzysztof Kozlowski
0ecdcaa6d5 openrisc: Fix Kconfig indentation
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
	$ sed -e 's/^        /\t/' -i */Kconfig

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2019-11-23 06:49:21 +09:00
Mark Brown
ca4196aa10
Merge branch 'spi-5.5' into spi-next 2019-11-22 19:56:35 +00:00
Thomas Bogendoerfer
37640adbef
MIPS: PCI: remember nasid changed by set interrupt affinity
When changing interrupt affinity remember the possible changed nasid,
otherwise an interrupt deactivate/activate sequence will incorrectly
setup interrupt.

Fixes: e6308b6d35 ("MIPS: SGI-IP27: abstract chipset irq from bridge")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
2019-11-22 10:56:16 -08:00
Thomas Bogendoerfer
e3d765a941
MIPS: SGI-IP27: Fix crash, when CPUs are disabled via nr_cpus parameter
If number of CPUs are limited by the kernel commandline parameter nr_cpus
assignment of interrupts accourding to numa rules might not be possibe.
As a fallback use one of the online CPUs as interrupt destination.

Fixes: 69a07a41d9 ("MIPS: SGI-IP27: rework HUB interrupts")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
2019-11-22 10:56:14 -08:00
Mike Rapoport
2bee1b5848
mips: add support for folded p4d page tables
Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate, replace 5leve-fixup.h with pgtable-nop4d.h and
drop usage of __ARCH_USE_5LEVEL_HACK.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Mike Rapoport <rppt@kernel.org>
2019-11-22 10:51:22 -08:00
Mike Rapoport
31168f033e
mips: drop __pXd_offset() macros that duplicate pXd_index() ones
The __pXd_offset() macros are identical to the pXd_index() macros and there
is no point to keep both of them. All architectures define and use
pXd_index() so let's keep only those to make mips consistent with the rest
of the kernel.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Mike Rapoport <rppt@kernel.org>
2019-11-22 10:51:17 -08:00
Mike Rapoport
3ed6751bb8
mips: fix build when "48 bits virtual memory" is enabled
With CONFIG_MIPS_VA_BITS_48=y the build fails miserably:

  CC      arch/mips/kernel/asm-offsets.s
In file included from arch/mips/include/asm/pgtable.h:644,
                 from include/linux/mm.h:99,
                 from arch/mips/kernel/asm-offsets.c:15:
include/asm-generic/pgtable.h:16:2: error: #error CONFIG_PGTABLE_LEVELS is not consistent with __PAGETABLE_{P4D,PUD,PMD}_FOLDED
 #error CONFIG_PGTABLE_LEVELS is not consistent with __PAGETABLE_{P4D,PUD,PMD}_FOLDED
  ^~~~~
include/asm-generic/pgtable.h:390:28: error: unknown type name 'p4d_t'; did you mean 'pmd_t'?
 static inline int p4d_same(p4d_t p4d_a, p4d_t p4d_b)
                            ^~~~~
                            pmd_t

[ ... more such errors ... ]

scripts/Makefile.build:99: recipe for target 'arch/mips/kernel/asm-offsets.s' failed
make[2]: *** [arch/mips/kernel/asm-offsets.s] Error 1

This happens because when CONFIG_MIPS_VA_BITS_48 enables 4th level of the
page tables, but neither pgtable-nop4d.h nor 5level-fixup.h are included to
cope with the 5th level.

Replace #ifdef conditions around includes of the pgtable-nop{m,u}d.h with
explicit CONFIG_PGTABLE_LEVELS and add include of 5level-fixup.h for the
case when CONFIG_PGTABLE_LEVELS==4

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Mike Rapoport <rppt@kernel.org>
2019-11-22 10:50:56 -08:00
Eric Biggers
b62755aed3 crypto: x86/chacha - only unregister algorithms if registered
It's not valid to call crypto_unregister_skciphers() without a prior
call to crypto_register_skciphers().

Fixes: 84e03fa39f ("crypto: x86/chacha - expose SIMD ChaCha routine as library function")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-22 18:48:39 +08:00
Dexuan Cui
b96f86534f x86/hyperv: Implement hv_is_hibernation_supported()
The API will be used by the hv_balloon and hv_vmbus drivers.

Balloon up/down and hot-add of memory must not be active if the user
wants the Linux VM to support hibernation, because they are incompatible
with hibernation according to Hyper-V team, e.g. upon suspend the
balloon VSP doesn't save any info about the ballooned-out pages (if any);
so, after Linux resumes, Linux balloon VSC expects that the VSP will
return the pages if Linux is under memory pressure, but the VSP will
never do that, since the VSP thinks it never stole the pages from the VM.

So, if the user wants Linux VM to support hibernation, Linux must forbid
balloon up/down and hot-add, and the only functionality of the balloon VSC
driver is reporting the VM's memory pressure to the host.

Ideally, when Linux detects that the user wants it to support hibernation,
the balloon VSC should tell the VSP that it does not support ballooning
and hot-add. However, the current version of the VSP requires the VSC
should support these capabilities, otherwise the capability negotiation
fails and the VSC can not load at all, so with the later changes to the
VSC driver, Linux VM still reports to the VSP that the VSC supports these
capabilities, but the VSC ignores the VSP's requests of balloon up/down
and hot add, and reports an error to the VSP, when applicable. BTW, in
the future the balloon VSP driver will allow the VSC to not support the
capabilities of balloon up/down and hot add.

The ACPI S4 state is not a must for hibernation to work, because Linux is
able to hibernate as long as the system can shut down. However in practice
we decide to artificially use the presence of the virtual ACPI S4 state as
an indicator of the user's intent of using hibernation, because Linux VM
must find a way to know if the user wants to use the hibernation feature
or not.

By default, Hyper-V does not enable the virtual ACPI S4 state; on recent
Hyper-V hosts (e.g. RS5, 19H1), the administrator is able to enable the
state for a VM by WMI commands.

Once all the vmbus and VSC patches for the hibernation feature are
accepted, an extra patch will be submitted to forbid hibernation if the
virtual ACPI S4 state is absent, i.e. hv_is_hibernation_supported() is
false.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-21 20:10:45 -05:00
Himadri Pandya
fa36dcdf8b x86: hv: Add function to allocate zeroed page for Hyper-V
Hyper-V assumes page size to be 4K. While this assumption holds true on
x86 architecture, it might not  be true for ARM64 architecture. Hence
define hyper-v specific function to allocate a zeroed page which can
have a different implementation on ARM64 architecture to handle the
conflict between hyper-v's assumed page size and actual guest page size.

Signed-off-by: Himadri Pandya <himadri18.07@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-21 20:10:45 -05:00
Jisheng Zhang (syna)
1a70cf0e7e ARM: 8940/1: ftrace: remove mcount(),ftrace_caller_old() and ftrace_call_old()
Commit d3c6161956 ("ARM: 8788/1: ftrace: remove old mcount support")
removed the old mcount support, but forget to remove these three
declarations. This patch removes them.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-11-22 00:19:16 +00:00
Dmitry Golovin
29c623d64f ARM: 8939/1: kbuild: use correct nm executable
Since $(NM) variable can be easily overridden for the whole build, it's
better to use it instead of $(CROSS_COMPILE)nm. The use of $(CROSS_COMPILE)
prefixed variables where their calculated equivalents can be used is
incorrect. This fixes issues with builds where $(NM) is set to llvm-nm.

Link: https://github.com/ClangBuiltLinux/linux/issues/766

Signed-off-by: Dmitry Golovin <dima@golovin.in>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Matthias Maennich <maennich@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-11-22 00:19:11 +00:00
Linus Torvalds
81429eb8d9 arm64 fix for 5.4
- Ensure PAN is re-enabled following user fault in uaccess routines
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl3WdTEQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNE43B/9ow5C7aV1Jmb9dk3zfkDkVuviCkv9STpNW
 YosKSAHwGvsfmBd/f+kd/ipqnt/fBLbXHqRderbayDW7xundCP8Qgl51N2RwJ92O
 si3rPHW62xjbv0U3XqY9h74wH4jfox3KAT8c5EBcLuZtmV3gTb1txWDQ0PfQZMpN
 0cmfTRPHB7LKJ/hupEyq6iVPYx+mV1ajenW1uzBOQhu+c8jC5Uf9kDn/D93rCYCg
 7NL88odhP0Oyt6HDcR5LB+nl7fz+q0jXzxyF8qHDWNitV4aovU9rLMpDpn4YVvYd
 DcFEyGdwYSxExLFrBuvmevZ7+2biT3qWQcPuNQGCcqWQSu38ULFK
 =r7nI
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Will Deacon:
 "Ensure PAN is re-enabled following user fault in uaccess routines.

  After I thought we were done for 5.4, we had a report this week of a
  nasty issue that has been shown to leak data between different user
  address spaces thanks to corruption of entries in the TLB. In
  hindsight, we should have spotted this in review when the PAN code was
  merged back in v4.3, but hindsight is 20/20 and I'm trying not to beat
  myself up too much about it despite being fairly miserable.

  Anyway, the fix is "obvious" but the actual failure is more more
  subtle, and is described in the commit message. I've included a fairly
  mechanical follow-up patch here as well, which moves this checking out
  into the C wrappers which is what we do for {get,put}_user() already
  and allows us to remove these bloody assembly macros entirely. The
  patches have passed kernelci [1] [2] [3] and CKI [4] tests over night,
  as well as some targetted testing [5] for this particular issue.

  The first patch is tagged for stable and should be applied to 4.14,
  4.19 and 5.3. I have separate backports for 4.4 and 4.9, which I'll
  send out once this has landed in your tree (although the original
  patch applies cleanly, it won't build for those two trees).

  Thanks to Pavel Tatashin for reporting this and Mark Rutland for
  helping to diagnose the issue and review/test the solution"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: uaccess: Remove uaccess_*_not_uao asm macros
  arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault
2019-11-21 12:15:24 -08:00
Peter Zijlstra
8954290765 x86/entry/32: Fix NMI vs ESPFIX
When the NMI lands on an ESPFIX_SS, we are on the entry stack and must
swizzle, otherwise we'll run do_nmi() on the entry stack, which is
BAD.

Also, similar to the normal exception path, we need to correct the
ESPFIX magic before leaving the entry stack, otherwise pt_regs will
present a non-flat stack pointer.

Tested by running sigreturn_32 concurrent with perf-record.

Fixes: e5862d0515 ("x86/entry/32: Leave the kernel via trampoline stack")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@kernel.org
2019-11-21 19:37:44 +01:00
Andy Lutomirski
a1a338e5b6 x86/entry/32: Unwind the ESPFIX stack earlier on exception entry
Right now, we do some fancy parts of the exception entry path while SS
might have a nonzero base: we fill in regs->ss and regs->sp, and we
consider switching to the kernel stack. This results in regs->ss and
regs->sp referring to a non-flat stack and it may result in
overflowing the entry stack. The former issue means that we can try to
call iret_exc on a non-flat stack, which doesn't work.

Tested with selftests/x86/sigreturn_32.

Fixes: 45d7b25574 ("x86/entry/32: Enter the kernel via trampoline stack")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
2019-11-21 19:37:44 +01:00
Andy Lutomirski
82cb8a0b1d x86/entry/32: Move FIXUP_FRAME after pushing %fs in SAVE_ALL
This will allow us to get percpu access working before FIXUP_FRAME,
which will allow us to unwind ESPFIX earlier.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
2019-11-21 19:37:43 +01:00
Andy Lutomirski
4c4fd55d3d x86/entry/32: Use %ss segment where required
When re-building the IRET frame we use %eax as an destination %esp,
make sure to then also match the segment for when there is a nonzero
SS base (ESPFIX).

[peterz: Changelog and minor edits]
Fixes: 3c88c692c2 ("x86/stackframe/32: Provide consistent pt_regs")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
2019-11-21 19:37:43 +01:00
Peter Zijlstra
40ad219958 x86/entry/32: Fix IRET exception
As reported by Lai, the commit 3c88c692c2 ("x86/stackframe/32:
Provide consistent pt_regs") wrecked the IRET EXTABLE entry by making
.Lirq_return not point at IRET.

Fix this by placing IRET_FRAME in RESTORE_REGS, to mirror how
FIXUP_FRAME is part of SAVE_ALL.

Fixes: 3c88c692c2 ("x86/stackframe/32: Provide consistent pt_regs")
Reported-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@kernel.org
2019-11-21 19:37:43 +01:00
Thomas Gleixner
880a98c339 x86/cpu_entry_area: Add guard page for entry stack on 32bit
The entry stack in the cpu entry area is protected against overflow by the
readonly GDT on 64-bit, but on 32-bit the GDT needs to be writeable and
therefore does not trigger a fault on stack overflow.

Add a guard page.

Fixes: c482feefe1 ("x86/entry/64: Make cpu_entry_area.tss read-only")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
2019-11-21 19:37:43 +01:00
Thomas Gleixner
f490e07c53 x86/pti/32: Size initial_page_table correctly
Commit 945fd17ab6 ("x86/cpu_entry_area: Sync cpu_entry_area to
initial_page_table") introduced the sync for the initial page table for
32bit.

sync_initial_page_table() uses clone_pgd_range() which does the update for
the kernel page table. If PTI is enabled it also updates the user space
page table counterpart, which is assumed to be in the next page after the
target PGD.

At this point in time 32-bit did not have PTI support, so the user space
page table update was not taking place.

The support for PTI on 32-bit which was introduced later on, did not take
that into account and missed to add the user space counter part for the
initial page table.

As a consequence sync_initial_page_table() overwrites any data which is
located in the page behing initial_page_table causing random failures,
e.g. by corrupting doublefault_tss and wreckaging the doublefault handler
on 32bit.

Fix it by adding a "user" page table right after initial_page_table.

Fixes: 7757d607c6 ("x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Cc: stable@kernel.org
2019-11-21 19:37:43 +01:00
Andy Lutomirski
3580d0b29c x86/doublefault/32: Fix stack canaries in the double fault handler
The double fault TSS was missing GS setup, which is needed for stack
canaries to work.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
2019-11-21 19:37:42 +01:00
Davidlohr Bueso
7f264dab5b x86/mm/pat: Rename pat_rbtree.c to pat_interval.c
Considering the previous changes, this is a more proper name.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20191121011601.20611-5-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-21 18:48:18 +01:00
Davidlohr Bueso
511aaca834 x86/mm/pat: Drop the rbt_ prefix from external memtype calls
Drop the rbt_memtype_*() call rbt_ prefix, as we no longer use
an rbtree directly.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20191121011601.20611-4-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-21 18:48:07 +01:00
Davidlohr Bueso
6a9930b1c5 x86/mm/pat: Do not pass 'rb_root' down the memtype tree helper functions
Get rid of the passing the rb_root down the helper calls; there
is only one: &memtype_rbroot.

No change in functionality.

[ mingo: Fixed the changelog which described a different version of the patch. ]

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20191121011601.20611-3-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-21 18:47:59 +01:00
Davidlohr Bueso
8d04a5f97a x86/mm/pat: Convert the PAT tree to a generic interval tree
With some considerations, the custom pat_rbtree implementation can be
simplified to use most of the generic interval_tree machinery:

 - The tree inorder traversal can slightly differ when there are key
   ('start') collisions in the tree due to one going left and another right.
   This, however, only affects the output of debugfs' pat_memtype_list file.

 - Generic interval trees are now fully closed [a, b], for which we need
   to adjust the last endpoint (ie: end - 1).

 - Erasing logic must remain untouched as well.

 - In order for the types to remain u64, the 'memtype_interval' calls are
   introduced, as opposed to simply using struct interval_tree.

In addition, the PAT tree might potentially also benefit by the fast overlap
detection for the insertion case when looking up the first overlapping node
in the tree.

No change in behavior is intended.

Finally, I've tested this on various servers, via sanity warnings, running
side by side with the current version and so far see no differences in the
returned pointer node when doing memtype_rb_lowest_match() lookups.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20191121011601.20611-2-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-21 18:47:30 +01:00
Nicolas Saenz Julienne
a7ba70f178 dma-mapping: treat dev->bus_dma_mask as a DMA limit
Using a mask to represent bus DMA constraints has a set of limitations.
The biggest one being it can only hold a power of two (minus one). The
DMA mapping code is already aware of this and treats dev->bus_dma_mask
as a limit. This quirk is already used by some architectures although
still rare.

With the introduction of the Raspberry Pi 4 we've found a new contender
for the use of bus DMA limits, as its PCIe bus can only address the
lower 3GB of memory (of a total of 4GB). This is impossible to represent
with a mask. To make things worse the device-tree code rounds non power
of two bus DMA limits to the next power of two, which is unacceptable in
this case.

In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
over the tree and treat it as such. Note that dev->bus_dma_limit should
contain the higher accessible DMA address.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-11-21 18:14:35 +01:00
Christoph Hellwig
d7293f79ca Merge branch 'for-next/zone-dma' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into dma-mapping-for-next
Pull in a stable branch from the arm64 tree that adds the zone_dma_bits
variable to avoid creating hard to resolve conflicts with that addition.
2019-11-21 18:13:03 +01:00
Arnd Bergmann
1c11ca7a05 y2038: fix typo in powerpc vdso "LOPART"
The earlier patch introduced a typo, change LOWPART back to
LOPART.

Fixes: 176ed98c8a ("y2038: vdso: powerpc: avoid timespec references")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-11-21 15:19:49 +01:00
Kai-Heng Feng
7e8ce0e2b0 x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect
The AMD FCH USB XHCI Controller advertises support for generating PME#
while in D0.  When in D0, it does signal PME# for USB 3.0 connect events,
but not for USB 2.0 or USB 1.1 connect events, which means the controller
doesn't wake correctly for those events.

  00:10.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller [1022:7914] (rev 20) (prog-if 30 [XHCI])
        Subsystem: Dell FCH USB XHCI Controller [1028:087e]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)

Clear PCI_PM_CAP_PME_D0 in dev->pme_support to indicate the device will not
assert PME# from D0 so we don't rely on it.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203673
Link: https://lore.kernel.org/r/20190902145252.32111-1-kai.heng.feng@canonical.com
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2019-11-21 07:49:31 -06:00
Krzysztof Wilczynski
0d2f4d62ff x86/PCI: Replace deprecated EXTRA_CFLAGS with ccflags-y
Update arch/x86/pci/Makefile replacing the deprecated EXTRA_CFLAGS with the
ccflags-y matching recommendation per section 3.7 of
Documentation/kbuild/makefiles.txt.

Link: https://lore.kernel.org/r/20190819060532.17093-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-11-21 07:49:27 -06:00
Krzysztof Wilczynski
4257ac5acd x86/PCI: Add NumaChip SPDX GPL-2.0 to replace COPYING boilerplate
Add SPDX GPL-2.0 to numachip.c, which referred to the kernel default
"COPYING" file, which specifies GPL version 2.

Remove the boilerplate language referring to the GPL and "COPYING", relying
on the assertion in b24413180f ("License cleanup: add SPDX GPL-2.0
license identifier to files with no license") that the SPDX identifier may
be used instead of the full boilerplate text.

[bhelgaas: split to separate patch]
Link: https://lore.kernel.org/r/20190828135322.10370-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-11-21 07:49:22 -06:00
Paolo Bonzini
c50d8ae3a1 KVM: x86: create mmu/ subdirectory
Preparatory work for shattering mmu.c into multiple files.  Besides making it easier
to follow, this will also make it possible to write unit tests for various parts.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 12:03:50 +01:00
Liran Alon
0155b2b91b KVM: nVMX: Remove unnecessary TLB flushes on L1<->L2 switches when L1 use apic-access-page
According to Intel SDM section 28.3.3.3/28.3.3.4 Guidelines for Use
of the INVVPID/INVEPT Instruction, the hypervisor needs to execute
INVVPID/INVEPT X in case CPU executes VMEntry with VPID/EPTP X and
either: "Virtualize APIC accesses" VM-execution control was changed
from 0 to 1, OR the value of apic_access_page was changed.

In the nested case, the burden falls on L1, unless L0 enables EPT in
vmcs02 but L1 enables neither EPT nor VPID in vmcs12.  For this reason
prepare_vmcs02() and load_vmcs12_host_state() have special code to
request a TLB flush in case L1 does not use EPT but it uses
"virtualize APIC accesses".

This special case however is not necessary. On a nested vmentry the
physical TLB will already be flushed except if all the following apply:

* L0 uses VPID

* L1 uses VPID

* L0 can guarantee TLB entries populated while running L1 are tagged
differently than TLB entries populated while running L2.

If the first condition is false, the processor will flush the TLB
on vmentry to L2.  If the second or third condition are false,
prepare_vmcs02() will request KVM_REQ_TLB_FLUSH.  However, even
if both are true, no extra TLB flush is needed to handle the APIC
access page:

* if L1 doesn't use VPID, the second condition doesn't hold and the
TLB will be flushed anyway.

* if L1 uses VPID, it has to flush the TLB itself with INVVPID and
section 28.3.3.3 doesn't apply to L0.

* even INVEPT is not needed because, if L0 uses EPT, it uses different
EPTP when running L2 than L1 (because guest_mode is part of mmu-role).
In this case SDM section 28.3.3.4 doesn't apply.

Similarly, examining nested_vmx_vmexit()->load_vmcs12_host_state(),
one could note that L0 won't flush TLB only in cases where SDM sections
28.3.3.3 and 28.3.3.4 don't apply.  In particular, if L0 uses different
VPIDs for L1 and L2 (i.e. vmx->vpid != vmx->nested.vpid02), section
28.3.3.3 doesn't apply.

Thus, remove this flush from prepare_vmcs02() and nested_vmx_vmexit().

Side-note: This patch can be viewed as removing parts of commit
fb6c819843 ("kvm: vmx: Flush TLB when the APIC-access address changes”)
that is not relevant anymore since commit
1313cc2bd8 ("kvm: mmu: Add guest_mode to kvm_mmu_page_role”).
i.e. The first commit assumes that if L0 use EPT and L1 doesn’t use EPT,
then L0 will use same EPTP for both L0 and L1. Which indeed required
L0 to execute INVEPT before entering L2 guest. This assumption is
not true anymore since when guest_mode was added to mmu-role.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 12:03:50 +01:00
Mao Wenan
db5a95ec16 KVM: x86: remove set but not used variable 'called'
Fixes gcc '-Wunused-but-set-variable' warning:

arch/x86/kvm/x86.c: In function kvm_make_scan_ioapic_request_mask:
arch/x86/kvm/x86.c:7911:7: warning: variable called set but not
used [-Wunused-but-set-variable]

It is not used since commit 7ee30bc132 ("KVM: x86: deliver KVM
IOAPIC scan request to target vCPUs")

Signed-off-by: Mao Wenan <maowenan@huawei.com>
Fixes: 7ee30bc132 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 12:03:49 +01:00
Liran Alon
b11494bcab KVM: nVMX: Do not mark vmcs02->apic_access_page as dirty when unpinning
vmcs->apic_access_page is simply a token that the hypervisor puts into
the PFN of a 4KB EPTE (or PTE if using shadow-paging) that triggers
APIC-access VMExit or APIC virtualization logic whenever a CPU running
in VMX non-root mode read/write from/to this PFN.

As every write either triggers an APIC-access VMExit or write is
performed on vmcs->virtual_apic_page, the PFN pointed to by
vmcs->apic_access_page should never actually be touched by CPU.

Therefore, there is no need to mark vmcs02->apic_access_page as dirty
after unpin it on L2->L1 emulated VMExit or when L1 exit VMX operation.

Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 12:03:48 +01:00
Paolo Bonzini
46f4f0aabc Merge branch 'kvm-tsx-ctrl' into HEAD
Conflicts:
	arch/x86/kvm/vmx/vmx.c
2019-11-21 12:03:40 +01:00
Paolo Bonzini
b07a5c53d4 KVM: vmx: use MSR_IA32_TSX_CTRL to hard-disable TSX on guest that lack it
If X86_FEATURE_RTM is disabled, the guest should not be able to access
MSR_IA32_TSX_CTRL.  We can therefore use it in KVM to force all
transactions from the guest to abort.

Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 10:01:02 +01:00
Paolo Bonzini
c11f83e062 KVM: vmx: implement MSR_IA32_TSX_CTRL disable RTM functionality
The current guest mitigation of TAA is both too heavy and not really
sufficient.  It is too heavy because it will cause some affected CPUs
(those that have MDS_NO but lack TAA_NO) to fall back to VERW and
get the corresponding slowdown.  It is not really sufficient because
it will cause the MDS_NO bit to disappear upon microcode update, so
that VMs started before the microcode update will not be runnable
anymore afterwards, even with tsx=on.

Instead, if tsx=on on the host, we can emulate MSR_IA32_TSX_CTRL for
the guest and let it run without the VERW mitigation.  Even though
MSR_IA32_TSX_CTRL is quite heavyweight, and we do not want to write
it on every vmentry, we can use the shared MSR functionality because
the host kernel need not protect itself from TSX-based side-channels.

Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 10:00:59 +01:00
Paolo Bonzini
edef5c36b0 KVM: x86: implement MSR_IA32_TSX_CTRL effect on CPUID
Because KVM always emulates CPUID, the CPUID clear bit
(bit 1) of MSR_IA32_TSX_CTRL must be emulated "manually"
by the hypervisor when performing said emulation.

Right now neither kvm-intel.ko nor kvm-amd.ko implement
MSR_IA32_TSX_CTRL but this will change in the next patch.

Reviewed-by: Jim Mattson <jmattson@google.com>
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 09:59:31 +01:00
Paolo Bonzini
de1fca5d6e KVM: x86: do not modify masked bits of shared MSRs
"Shared MSRs" are guest MSRs that are written to the host MSRs but
keep their value until the next return to userspace.  They support
a mask, so that some bits keep the host value, but this mask is
only used to skip an unnecessary MSR write and the value written
to the MSR is always the guest MSR.

Fix this and, while at it, do not update smsr->values[slot].curr if
for whatever reason the wrmsr fails.  This should only happen due to
reserved bits, so the value written to smsr->values[slot].curr
will not match when the user-return notifier and the host value will
always be restored.  However, it is untidy and in rare cases this
can actually avoid spurious WRMSRs on return to userspace.

Cc: stable@vger.kernel.org
Reviewed-by: Jim Mattson <jmattson@google.com>
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 09:59:22 +01:00
Paolo Bonzini
cbbaa2727a KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES
KVM does not implement MSR_IA32_TSX_CTRL, so it must not be presented
to the guests.  It is also confusing to have !ARCH_CAP_TSX_CTRL_MSR &&
!RTM && ARCH_CAP_TAA_NO: lack of MSR_IA32_TSX_CTRL suggests TSX was not
hidden (it actually was), yet the value says that TSX is not vulnerable
to microarchitectural data sampling.  Fix both.

Cc: stable@vger.kernel.org
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 09:59:10 +01:00
Paolo Bonzini
14edff8831 KVM/arm updates for Linux 5.5:
- Allow non-ISV data aborts to be reported to userspace
 - Allow injection of data aborts from userspace
 - Expose stolen time to guests
 - GICv4 performance improvements
 - vgic ITS emulation fixes
 - Simplify FWB handling
 - Enable halt pool counters
 - Make the emulated timer PREEMPT_RT compliant
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl3VZ0EPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDwikQAJ/lQT97zEKV3dnpD/jjmEic/3QvTGljS4p+
 pbwZAzyoSMc09lAK2pkaGRRc7euPnp4uLdRS4SenToVzmUQCzuxpQEEMdV/wjp4V
 WLQ1WnTEAhYkm7k5MVo4uy3eD7nVWHWXgfQJvzL4EYZ5R/gd9NzBrnAc6LLV6hp9
 0eLXIYrGFa1GESzF6P6sBDJhYpqVUcQlTI8I43kZH3iCC4+OsBxIkhHREZYsELhW
 MZJIM9ZCskg2tvPC4UysaFiGjBYUJNJ0V+fFOrhyGzludP8i8rNRwgA60mznwvNw
 V4N6/gLlkGK7nLqP+noUcU2wTBnIu389TcWWsX47CuDUzawCd8Fb9kX2zYauQSyS
 ujE0uzoo/nhPFysh9OVVeLUZ6o/wotXoMp2t32t1c5h9N1hISEJvAWavMxTY6KzF
 NEn9hWFjNcgBoArz9GKn9p2nBQpCDvu+2SlI4nL/qgZ7lPC4O3U1uq9myCOLj/gu
 Can/u5EAwgyIBDVcEPHV+vP2GjyeERdXprGiG2VJTYlbHsdjgISTR+5Fy32KdGlP
 YygeZxJtzretr3AYsWqD6Mri30FDSoYy9rUOWBpa+ZHbJPac0M+uKOqntV1OxPX9
 QUkmNEdJcDr8fkcKxnEZ/MaZxFTGPp4vfhiT4A7dUkWTFq7ajvGo8IwN1d7PvWxS
 LFMij1Js
 =SxBH
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm updates for Linux 5.5:

- Allow non-ISV data aborts to be reported to userspace
- Allow injection of data aborts from userspace
- Expose stolen time to guests
- GICv4 performance improvements
- vgic ITS emulation fixes
- Simplify FWB handling
- Enable halt pool counters
- Make the emulated timer PREEMPT_RT compliant

Conflicts:
	include/uapi/linux/kvm.h
2019-11-21 09:58:35 +01:00
Krzysztof Wilczynski
b6378caf82 nds32: Move static keyword to the front of declaration
Move the static keyword to the front of declaration of
cpu_pmu_of_device_ids, and resolve the following compiler
warning that can be seen when building with warnings
enabled (W=1):

arch/nds32/kernel/perf_event_cpu.c:1122:1: warning:
  ‘static’ is not at beginning of declaration [-Wold-style-declaration]

Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
Signed-off-by: Greentime Hu <green.hu@gmail.com>
2019-11-21 16:05:15 +08:00
Masanari Iida
1b78375c37 nds32: Fix typo in Kconfig.cpu
This patch fixes some spelling typo in Kconfig.cpu

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
Signed-off-by: Greentime Hu <green.hu@gmail.com>
2019-11-21 16:05:07 +08:00
Masahiro Yamada
9e5183ee41 nds32: remove unneeded clean-files for DTB
These patterns are cleaned-up by the top-level Makefile

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
Signed-off-by: Greentime Hu <green.hu@gmail.com>
2019-11-21 16:03:27 +08:00
Greg Kurz
30486e7209 KVM: PPC: Book3S HV: XIVE: Fix potential page leak on error path
We need to check the host page size is big enough to accomodate the
EQ. Let's do this before taking a reference on the EQ page to avoid
a potential leak if the check fails.

Cc: stable@vger.kernel.org # v5.2
Fixes: 13ce3297c5 ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration")
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-21 16:24:41 +11:00
Greg Kurz
31a88c82b4 KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one
The EQ page is allocated by the guest and then passed to the hypervisor
with the H_INT_SET_QUEUE_CONFIG hcall. A reference is taken on the page
before handing it over to the HW. This reference is dropped either when
the guest issues the H_INT_RESET hcall or when the KVM device is released.
But, the guest can legitimately call H_INT_SET_QUEUE_CONFIG several times,
either to reset the EQ (vCPU hot unplug) or to set a new EQ (guest reboot).
In both cases the existing EQ page reference is leaked because we simply
overwrite it in the XIVE queue structure without calling put_page().

This is especially visible when the guest memory is backed with huge pages:
start a VM up to the guest userspace, either reboot it or unplug a vCPU,
quit QEMU. The leak is observed by comparing the value of HugePages_Free in
/proc/meminfo before and after the VM is run.

Ideally we'd want the XIVE code to handle the EQ page de-allocation at the
platform level. This isn't the case right now because the various XIVE
drivers have different allocation needs. It could maybe worth introducing
hooks for this purpose instead of exposing XIVE internals to the drivers,
but this is certainly a huge work to be done later.

In the meantime, for easier backport, fix both vCPU unplug and guest reboot
leaks by introducing a wrapper around xive_native_configure_queue() that
does the necessary cleanup.

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v5.2
Fixes: 13ce3297c5 ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-11-21 16:24:41 +11:00
Oliver O'Halloran
9d72dcef89 powerpc/powernv: Disable native PCIe port management
On PowerNV the PCIe topology is (currently) managed by the powernv platform
code in Linux in cooperation with the platform firmware. Linux's native
PCIe port service drivers operate independently of both and this can cause
problems.

The main issue is that the portbus driver will conflict with the platform
specific hotplug driver (pnv_php) over ownership of the MSI used to notify
the host when a hotplug event occurs. The portbus driver claims this MSI on
behalf of the individual port services because the same interrupt is used
for hotplug events, PMEs (on root ports), and link bandwidth change
notifications. The portbus driver will always claim the interrupt even if
the individual port service drivers, such as pciehp, are compiled out.

The second, bigger, problem is that the hotplug port service driver
fundamentally does not work on PowerNV. The platform assumes that all
PCI devices have a corresponding arch-specific handle derived from the DT
node for the device (pci_dn) and without one the platform will not allow
a PCI device to be enabled. This problem is largely due to historical
baggage, but it can't be resolved without significant re-factoring of the
platform PCI support.

We can fix these problems in the interim by setting the
"pcie_ports_disabled" flag during platform initialisation. The flag
indicates the platform owns the PCIe ports which stops the portbus driver
from being registered.

This does have the side effect of disabling all port services drivers
that is: AER, PME, BW notifications, hotplug, and DPC. However, this is
not a huge disadvantage on PowerNV since these services are either unused
or handled through other means.

Fixes: 66725152fb ("PCI/hotplug: PowerPC PowerNV PCI hotplug driver")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191118065553.30362-1-oohall@gmail.com
2019-11-21 15:41:38 +11:00
Christophe Leroy
793b08e2ef powerpc/kexec: Move kexec files into a dedicated subdir.
arch/powerpc/kernel/ contains 8 files dedicated to kexec.

Move them into a dedicated subdirectory.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Move to a/p/kexec, drop the 'machine' naming and use 'core' instead]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/afbef97ec6a978574a5cf91a4441000e0a9da42a.1572351221.git.christophe.leroy@c-s.fr
2019-11-21 15:41:34 +11:00
Christophe Leroy
9f7bd92015 powerpc/32: Split kexec low level code out of misc_32.S
Almost half of misc_32.S is dedicated to kexec.
That's the relocation function for kexec.

Drop it into a dedicated kexec_relocate_32.S

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e235973a1198195763afd3b6baffa548a83f4611.1572351221.git.christophe.leroy@c-s.fr
2019-11-21 15:41:34 +11:00
Christophe Leroy
8795a739e5 powerpc/sysdev: drop simple gpio
There is a config item CONFIG_SIMPLE_GPIO which
provides simple memory mapped GPIOs specific to powerpc.

However, the only platform which selects this option is
mpc5200, and this platform doesn't use it.

There are three boards calling simple_gpiochip_init(), but
as they don't select CONFIG_SIMPLE_GPIO, this is just a nop.

Simple_gpio is just redundant with the generic MMIO GPIO
driver which can be found in driver/gpio/ and selected via
CONFIG_GPIO_GENERIC_PLATFORM, so drop simple_gpio driver.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bf930402613b41b42d0441b784e0cc43fc18d1fb.1572529632.git.christophe.leroy@c-s.fr
2019-11-21 15:41:34 +11:00
David S. Miller
ee5a489fd9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-11-20

The following pull-request contains BPF updates for your *net-next* tree.

We've added 81 non-merge commits during the last 17 day(s) which contain
a total of 120 files changed, 4958 insertions(+), 1081 deletions(-).

There are 3 trivial conflicts, resolve it by always taking the chunk from
196e8ca748:

<<<<<<< HEAD
=======
void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
>>>>>>> 196e8ca748

<<<<<<< HEAD
void *bpf_map_area_alloc(u64 size, int numa_node)
=======
static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
>>>>>>> 196e8ca748

<<<<<<< HEAD
        if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
=======
        /* kmalloc()'ed memory can't be mmap()'ed */
        if (!mmapable && size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
>>>>>>> 196e8ca748

The main changes are:

1) Addition of BPF trampoline which works as a bridge between kernel functions,
   BPF programs and other BPF programs along with two new use cases: i) fentry/fexit
   BPF programs for tracing with practically zero overhead to call into BPF (as
   opposed to k[ret]probes) and ii) attachment of the former to networking related
   programs to see input/output of networking programs (covering xdpdump use case),
   from Alexei Starovoitov.

2) BPF array map mmap support and use in libbpf for global data maps; also a big
   batch of libbpf improvements, among others, support for reading bitfields in a
   relocatable manner (via libbpf's CO-RE helper API), from Andrii Nakryiko.

3) Extend s390x JIT with usage of relative long jumps and loads in order to lift
   the current 64/512k size limits on JITed BPF programs there, from Ilya Leoshkevich.

4) Add BPF audit support and emit messages upon successful prog load and unload in
   order to have a timeline of events, from Daniel Borkmann and Jiri Olsa.

5) Extension to libbpf and xdpsock sample programs to demo the shared umem mode
   (XDP_SHARED_UMEM) as well as RX-only and TX-only sockets, from Magnus Karlsson.

6) Several follow-up bug fixes for libbpf's auto-pinning code and a new API
   call named bpf_get_link_xdp_info() for retrieving the full set of prog
   IDs attached to XDP, from Toke Høiland-Jørgensen.

7) Add BTF support for array of int, array of struct and multidimensional arrays
   and enable it for skb->cb[] access in kfree_skb test, from Martin KaFai Lau.

8) Fix AF_XDP by using the correct number of channels from ethtool, from Luigi Rizzo.

9) Two fixes for BPF selftest to get rid of a hang in test_tc_tunnel and to avoid
   xdping to be run as standalone, from Jiri Benc.

10) Various BPF selftest fixes when run with latest LLVM trunk, from Yonghong Song.

11) Fix a memory leak in BPF fentry test run data, from Colin Ian King.

12) Various smaller misc cleanups and improvements mostly all over BPF selftests and
    samples, from Daniel T. Lee, Andre Guedes, Anders Roxell, Mao Wenan, Yue Haibing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 18:11:23 -08:00
Alexander Duyck
e3cb0c7102 x86/ioperm: Fix use of deprecated config option
The commit

  111e7b15cf ("x86/ioperm: Extend IOPL config to control ioperm() as well")

replaced X86_IOPL_EMULATION with X86_IOPL_IOPERM. However it appears
that there was at least one spot missed as tss_update_io_bitmap() still
had a reference to it contained in the code.

The result of this is that it exposed a NULL pointer dereference as seen
below with a linux-next next-20191120 kernel:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 5 PID: 1542 Comm: ovs-vswitchd Tainted: G        W 5.4.0-rc8-next-20191120 #125
  RIP: 0010:tss_update_io_bitmap+0x4e/0x180
  Code: 10 31 c0 65 48 03 1d 69 54 5d 6d 65 48 8b 04 25 40 8c 01 00 48 8b 10 \
	  f7 c2 00 00 40 00 0f 84 8c 00 00 00 4c 8b a0 c0 22 00 00 <49> 8b 04 \
	  24 48 39 43 68 74 2e 8b 53 70 41 39 54 24 0c 48 8d 7b 78
  RSP: 0018:ffffb8888a0ebf08 EFLAGS: 00010006
  RAX: ffff8a429811a680 RBX: ffff8a4c3f946000 RCX: 0000000000000011
  RDX: 0000000000400080 RSI: 0000000000400080 RDI: 0000000000000000
  RBP: ffffb8888a0ebf30 R08: 00007ffffb5d7ce0 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  FS:  00007f68a9635c40(0000) GS:ffff8a4c3f940000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000103572a001 CR4: 00000000001606e0
  Call Trace:
   ? syscall_slow_exit_work+0x39/0xdb
   do_syscall_64+0x1a5/0x200
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f68a7aff797

Fixes: 111e7b15cf ("x86/ioperm: Extend IOPL config to control ioperm() as well")
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191120222426.3060.18462.stgit@localhost.localdomain
2019-11-20 23:40:05 +01:00
Christoph Hellwig
68a33b1794 dma-direct: exclude dma_direct_map_resource from the min_low_pfn check
The valid memory address check in dma_capable only makes sense when mapping
normal memory, not when using dma_map_resource to map a device resource.
Add a new boolean argument to dma_capable to exclude that check for the
dma_map_resource case.

Fixes: b12d66278d ("dma-direct: check for overflows on 32 bit DMA addresses")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
2019-11-20 20:31:41 +01:00
Christoph Hellwig
cb6f6392db powerpc: remove support for NULL dev in __phys_to_dma / __dma_to_phys
Support for calling the DMA API functions without a valid device pointer
was removed a while ago, so remove the stale support for that from the
powerpc __phys_to_dma / __dma_to_phys helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
2019-11-20 20:31:40 +01:00
Christoph Hellwig
130c1ccbf5 dma-direct: unify the dma_capable definitions
Currently each architectures that wants to override dma_to_phys and
phys_to_dma also has to provide dma_capable.  But there isn't really
any good reason for that.  powerpc and mips just have copies of the
generic one minus the latests fix, and the arm one was the inspiration
for said fix, but misses the bus_dma_mask handling.
Make all architectures use the generic version instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
2019-11-20 20:31:40 +01:00
Christoph Hellwig
56e35f9c5b dma-mapping: drop the dev argument to arch_sync_dma_for_*
These are pure cache maintainance routines, so drop the unused
struct device argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-11-20 20:31:38 +01:00
Grygorii Strashko
3727d259dd arm: omap2plus_defconfig: enable new cpsw switchdev driver
Add CONFIG_TI_CPSW_SWITCHDEV option to enable new cpsw switchdev driver

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Grygorii Strashko
15b991ade4 ARM: dts: am571x-idk: enable for new cpsw switch dev driver
Add DT nodes for new cpsw switchdev driver for am571x-idk board for now to
enable testing of the new solution.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Grygorii Strashko
39331a49c4 ARM: dts: dra7: add dt nodes for new cpsw switch dev driver
Add DT nodes for new cpsw switch dev driver.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-20 11:25:24 -08:00
Pavel Tatashin
e50be648aa arm64: uaccess: Remove uaccess_*_not_uao asm macros
It is safer and simpler to drop the uaccess assembly macros in favour of
inline C functions. Although this bloats the Image size slightly, it
aligns our user copy routines with '{get,put}_user()' and generally
makes the code a lot easier to reason about.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
[will: tweaked commit message and changed temporary variable names]
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-20 18:51:54 +00:00
Pavel Tatashin
94bb804e1e arm64: uaccess: Ensure PAN is re-enabled after unhandled uaccess fault
A number of our uaccess routines ('__arch_clear_user()' and
'__arch_copy_{in,from,to}_user()') fail to re-enable PAN if they
encounter an unhandled fault whilst accessing userspace.

For CPUs implementing both hardware PAN and UAO, this bug has no effect
when both extensions are in use by the kernel.

For CPUs implementing hardware PAN but not UAO, this means that a kernel
using hardware PAN may execute portions of code with PAN inadvertently
disabled, opening us up to potential security vulnerabilities that rely
on userspace access from within the kernel which would usually be
prevented by this mechanism. In other words, parts of the kernel run the
same way as they would on a CPU without PAN implemented/emulated at all.

For CPUs not implementing hardware PAN and instead relying on software
emulation via 'CONFIG_ARM64_SW_TTBR0_PAN=y', the impact is unfortunately
much worse. Calling 'schedule()' with software PAN disabled means that
the next task will execute in the kernel using the page-table and ASID
of the previous process even after 'switch_mm()', since the actual
hardware switch is deferred until return to userspace. At this point, or
if there is a intermediate call to 'uaccess_enable()', the page-table
and ASID of the new process are installed. Sadly, due to the changes
introduced by KPTI, this is not an atomic operation and there is a very
small window (two instructions) where the CPU is configured with the
page-table of the old task and the ASID of the new task; a speculative
access in this state is disastrous because it would corrupt the TLB
entries for the new task with mappings from the previous address space.

As Pavel explains:

  | I was able to reproduce memory corruption problem on Broadcom's SoC
  | ARMv8-A like this:
  |
  | Enable software perf-events with PERF_SAMPLE_CALLCHAIN so userland's
  | stack is accessed and copied.
  |
  | The test program performed the following on every CPU and forking
  | many processes:
  |
  |	unsigned long *map = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
  |				  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  |	map[0] = getpid();
  |	sched_yield();
  |	if (map[0] != getpid()) {
  |		fprintf(stderr, "Corruption detected!");
  |	}
  |	munmap(map, PAGE_SIZE);
  |
  | From time to time I was getting map[0] to contain pid for a
  | different process.

Ensure that PAN is re-enabled when returning after an unhandled user
fault from our uaccess routines.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Fixes: 338d4f49d6 ("arm64: kernel: Add support for Privileged Access Never")
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
[will: rewrote commit message]
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-20 18:51:47 +00:00
Thomas Richter
6a82e23f45 s390/cpumf: Adjust registration of s390 PMU device drivers
Linux-next commit titled "perf/core: Optimize perf_init_event()"
changed the semantics of PMU device driver registration.
It was done to speed up the lookup/handling of PMU device driver
specific events. It also enforces that only one PMU device
driver will be registered of type PERF_EVENT_RAW.

This change added these line in function perf_pmu_register():

  ...
  +       ret = idr_alloc(&pmu_idr, pmu, max, 0, GFP_KERNEL);
  +       if (ret < 0)
                goto free_pdc;
  +
  +       WARN_ON(type >= 0 && ret != type);

The warn_on generates a message. We have 3 PMU device drivers,
each registered as type PERF_TYPE_RAW.
The cf_diag device driver (arch/s390/kernel/perf_cpumf_cf_diag.c)
always hits the WARN_ON because it is the second PMU device driver
(after sampling device driver arch/s390/kernel/perf_cpumf_sf.c)
which is registered as type 4 (PERF_TYPE_RAW).
So when the sampling device driver is registered, ret has value 4.
When cf_diag device driver is registered with type 4,
ret has value of 5 and WARN_ON fires.

Adjust the PMU device drivers for s390 to support the new
semantics required by perf_pmu_register().

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 17:16:01 +01:00
Liran Alon
992edeaefe KVM: nVMX: Assume TLB entries of L1 and L2 are tagged differently if L0 use EPT
Since commit 1313cc2bd8 ("kvm: mmu: Add guest_mode to kvm_mmu_page_role"),
guest_mode was added to mmu-role and therefore if L0 use EPT, it will
always run L1 and L2 with different EPTP. i.e. EPTP01!=EPTP02.

Because TLB entries are tagged with EP4TA, KVM can assume
TLB entries populated while running L2 are tagged differently
than TLB entries populated while running L1.

Therefore, update nested_has_guest_tlb_tag() to consider if
L0 use EPT instead of if L1 use EPT.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-20 14:23:27 +01:00
Liran Alon
5637f60b68 KVM: x86: Unexport kvm_vcpu_reload_apic_access_page()
The function is only used in kvm.ko module.

Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-20 14:23:26 +01:00
Chenyi Qiang
c79eb77554 KVM: nVMX: add CR4_LA57 bit to nested CR4_FIXED1
When L1 guest uses 5-level paging, it fails vm-entry to L2 due to
invalid host-state. It needs to add CR4_LA57 bit to nested CR4_FIXED1
MSR.

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-20 14:23:26 +01:00
Liran Alon
cc87767097 KVM: nVMX: Use semi-colon instead of comma for exit-handlers initialization
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-20 14:23:25 +01:00
Nitesh Narayan Lal
9a2ae9f6b6 KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap
Not zeroing the bitmap used for identifying the destination vCPUs for an
IOAPIC scan request in fixed delivery mode could lead to waking up unwanted
vCPUs. This patch zeroes the vCPU bitmap before passing it to
kvm_bitmap_or_dest_vcpus(), which is responsible for setting the bitmap
with the bits corresponding to the destination vCPUs.

Fixes: 7ee30bc132c6("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs")
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-20 14:23:24 +01:00
Thomas Gleixner
407e62f52a irqchip updates for Linux 5.5
- Qualcomm PDC wakeup interrupt support
 - Layerscape external IRQ support
 - Broadcom bcm7038 PM and wakeup support
 - Ingenic driver cleanup and modernization
 - GICv3 ITS preparation for GICv4.1 updates
 - GICv4 fixes
 - Various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl3VJasPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDMLEP/2U55GLmPuNqW/na4YFmcOYAkoP0WpEKDzn9
 u8lBi8CukKl1Z2JLAyP0E1e5iOVS0exvQ5V4OwxGKmeR9oWmB3Lym8UWRw7vcKEF
 HMKgtPCd3U3J5jW0P4Hr8hn6Q+B55fdMrrAaJcgsfBVB7bRB0lC0LZYGtN4VC4d8
 rTQzup5CK8Mu9k4NztLCxxoBUKFoqM+ZKsrRB2eOXB9amcPQtFvwC+5ZL2tDr3wS
 7d+pd6G4A+hsloIDUxoH9BrO/jd1jlfHyBRDFJIgpo/IQWVT6ciQECZomRR1pW30
 bGFYBf/HPFqfyH+ZOrWprSAd0Yx33WtYaMokYaJ6vGu4wedyxh/1LTRLzL0tWuyZ
 tPFvEmiiP/Hoeq1JHFRFQUO/75ckqALLeAxCjACCN8+F2Z0armk1W/iwehZNQHHV
 JdDXegRNUlMipG2kk5D3L6AK28bi+3+axc1ERMN1RO40eLm8NLogWL2TJlxLbyUe
 lMZMe43ceC0McGnQpAY8qyuC7IycQtngKNBvzG+6ADucGpFez3gYxh39RR43XMVo
 37Hsj+Ur7CFBJj6WTCzV2teC/WaXXQkJYxn6fsHNmUgdwPgGD3LppxhlWG49Ao9w
 x8ZnfyrrYmcFOJrKbT45ExMihioaGf8dyksKZNA/Z4dI0g/kf0LyYi5ujZaDDilI
 eDkMI/xI
 =uENR
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates from Marc Zyngier:

 - Qualcomm PDC wakeup interrupt support
 - Layerscape external IRQ support
 - Broadcom bcm7038 PM and wakeup support
 - Ingenic driver cleanup and modernization
 - GICv3 ITS preparation for GICv4.1 updates
 - GICv4 fixes
 - Various cleanups
2019-11-20 14:16:34 +01:00
Heiko Carstens
72a81ad9d6 s390/smp: fix physical to logical CPU map for SMT
If an SMT capable system is not IPL'ed from the first CPU the setup of
the physical to logical CPU mapping is broken: the IPL core gets CPU
number 0, but then the next core gets CPU number 1. Correct would be
that all SMT threads of CPU 0 get the subsequent logical CPU numbers.

This is important since a lot of code (like e.g. the CPU topology
code) assumes that CPU maps are setup like this. If the mapping is
broken the system will not IPL due to broken topology masks:

[    1.716341] BUG: arch topology broken
[    1.716342]      the SMT domain not a subset of the MC domain
[    1.716343] BUG: arch topology broken
[    1.716344]      the MC domain not a subset of the BOOK domain

This scenario can usually not happen since LPARs are always IPL'ed
from CPU 0 and also re-IPL is intiated from CPU 0. However older
kernels did initiate re-IPL on an arbitrary CPU. If therefore a re-IPL
from an old kernel into a new kernel is initiated this may lead to
crash.

Fix this by setting up the physical to logical CPU mapping correctly.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 12:58:13 +01:00
Vasily Gorbik
c231359421 s390/early: move access registers setup in C code
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 12:58:13 +01:00
Vasily Gorbik
b8ce1fa489 s390/head64: remove unnecessary vdso_per_cpu_data setup
vdso_per_cpu_data lowcore value is only needed for fully functional
exception handlers, which are activated in setup_lowcore_dat_off. The
same function does init vdso_per_cpu_data via vdso_alloc_boot_cpu.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 12:58:12 +01:00
Vasily Gorbik
c02ee6a16a s390/early: move control registers setup in C code
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 12:58:12 +01:00
Vasily Gorbik
13f9bae579 s390/kasan: support memcpy_real with TRACE_IRQFLAGS
Currently if the kernel is built with CONFIG_TRACE_IRQFLAGS and KASAN
and used as crash kernel it crashes itself due to
trace_hardirqs_off/trace_hardirqs_on being called with DAT off. This
happens because trace_hardirqs_off/trace_hardirqs_on are instrumented and
kasan code tries to perform access to shadow memory to validate memory
accesses. Kasan shadow memory is populated with vmemmap, so all accesses
require DAT on.

memcpy_real could be called with DAT on or off (with kasan enabled DAT
is set even before early code is executed).

Make sure that trace_hardirqs_off/trace_hardirqs_on are called with DAT
on and only actual __memcpy_real is called with DAT off.

Also annotate __memcpy_real and _memcpy_real with __no_sanitize_address
to avoid further problems due to switching DAT off.

Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 12:58:12 +01:00
YueHaibing
0398d4ab16 s390/crypto: Fix unsigned variable compared with zero
s390_crypto_shash_parmsize() return type is int, it
should not be stored in a unsigned variable, which
compared with zero.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 3c2eb6b76c ("s390/crypto: Support for SHA3 via CPACF (MSA6)")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Joerg Schmidbauer <jschmidb@linux.vnet.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 12:58:12 +01:00
Jan Beulich
922eea2ce5 x86/xen/32: Simplify ring check in xen_iret_crit_fixup()
This can be had with two instead of six insns, by just checking the high
CS.RPL bit.

Also adjust the comment - there would be no #GP in the mentioned cases, as
there's no segment limit violation or alike. Instead there'd be #PF, but
that one reports the target EIP of said branch, not the address of the
branch insn itself.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lkml.kernel.org/r/a5986837-01eb-7bf8-bf42-4d3084d6a1f5@suse.com
2019-11-19 21:58:28 +01:00
Jan Beulich
29b810f5a5 x86/xen/32: Make xen_iret_crit_fixup() independent of frame layout
Now that SS:ESP always get saved by SAVE_ALL, this also needs to be
accounted for in xen_iret_crit_fixup(). Otherwise the old_ax value gets
interpreted as EFLAGS, and hence VM86 mode appears to be active all the
time, leading to random "vm86_32: no user_vm86: BAD" log messages alongside
processes randomly crashing.

Since following the previous model (sitting after SAVE_ALL) would further
complicate the code _and_ retain the dependency of xen_iret_crit_fixup() on
frame manipulations done by entry_32.S, switch things around and do the
adjustment ahead of SAVE_ALL.

Fixes: 3c88c692c2 ("x86/stackframe/32: Provide consistent pt_regs")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Stable Team <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/32d8713d-25a7-84ab-b74b-aa3e88abce6b@suse.com
2019-11-19 21:58:28 +01:00
Jan Beulich
81ff2c37f9 x86/stackframe/32: Repair 32-bit Xen PV
Once again RPL checks have been introduced which don't account for a 32-bit
kernel living in ring 1 when running in a PV Xen domain. The case in
FIXUP_FRAME has been preventing boot.

Adjust BUG_IF_WRONG_CR3 as well to guard against future uses of the macro
on a code path reachable when running in PV mode under Xen; I have to admit
that I stopped at a certain point trying to figure out whether there are
present ones.

Fixes: 3c88c692c2 ("x86/stackframe/32: Provide consistent pt_regs")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Stable Team <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/0fad341f-b7f5-f859-d55d-f0084ee7087e@suse.com
2019-11-19 21:58:28 +01:00
Dan Williams
e755799aef libnvdimm: Move nvdimm_bus_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nvdimm_bus_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309903815.1582359.6418211876315050283.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19 09:52:12 -08:00
Dan Williams
360eba7ebd libnvdimm: Move nvdimm_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nvdimm_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309903201.1582359.10966209746585062329.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19 09:52:12 -08:00
Dan Williams
4ce79fa97e libnvdimm: Move nd_mapping_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_mapping_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309902686.1582359.6749533709859492704.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19 09:52:12 -08:00
Dan Williams
7c4fc8cde1 libnvdimm: Move nd_region_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_region_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309902169.1582359.16828508538444551337.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19 09:52:12 -08:00
Dan Williams
e2f6a0e348 libnvdimm: Move nd_numa_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_numa_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157401269537.43284.14411189404186877352.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-19 09:51:54 -08:00
Ingo Molnar
8f6ee51d77 perf/core improvements and fixes:
x86/insn:
 
   Adrian Hunter:
 
   - Add some more Intel instructions to the opcode map:
 
         cldemote, encls, enclu, enclv, enqcmd, enqcmds, movdir64b,
         movdiri, pconfig, tpause, umonitor, umwait, wbnoinvd.
 
   - The instruction decoding can be tested using the perf tools'
     "x86 instruction decoder - new instructions" test as folllows:
 
     $ perf test -v "new " 2>&1 | grep -i cldemote
     Decoded ok: 0f 1c 00                    cldemote (%eax)
     Decoded ok: 0f 1c 05 78 56 34 12        cldemote 0x12345678
     Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%eax,%ecx,8)
     Decoded ok: 0f 1c 00                    cldemote (%rax)
     Decoded ok: 41 0f 1c 00                 cldemote (%r8)
     Decoded ok: 0f 1c 04 25 78 56 34 12     cldemote 0x12345678
     Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%rax,%rcx,8)
     Decoded ok: 41 0f 1c 84 c8 78 56 34 12  cldemote 0x12345678(%r8,%rcx,8)
     $ perf test -v "new " 2>&1 | grep -i tpause
     Decoded ok: 66 0f ae f3                 tpause %ebx
     Decoded ok: 66 0f ae f3                 tpause %ebx
     Decoded ok: 66 41 0f ae f0              tpause %r8d
 
 callchains:
 
   Adrian Hunter:
 
   - Fix segfault in thread__resolve_callchain_sample().
 
 perf probe:
 
   - Line fixes to show only lines where probes can be used with 'perf probe -L',
     and when reporting them via 'perf probe -l'.
 
   - Support multiprobe events.
 
 perf scripts python:
 
   Adrian Hunter:
 
   - Fix use of TRUE with SQLite < 3.23 in exported-sql-viewer.py.
 
 perf maps:
 
   - Trim 'struct map' by removing the rb_node member for sorting
     by map name, as that is only needed for processing kernel maps,
     and only when classifying symbols by section at load time.
     Sort them by name using qsort() and do lookups using bsearch()
     when map_groups__find_by_name() is used.
 
 perf parse:
 
   Ian Rogers:
 
   - Report initial event parsing error, providing a less cryptic message
     to state that a PMU wasn't found in the system.
 
 perf vendor events:
 
   James Clark:
 
   - Fix commas so that PMU event files for arm64, power8 and power nine
     become valid JSON.
 
 libtraceevent:
 
   Konstantin Khlebnikov:
 
   - Fix parsing of event %o and %X argument types.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXdPSQwAKCRCyPKLppCJ+
 J1FQAQD4pSumHTHczHm9xoYuOxFAoNlT7VaML6YXJb82+pvcCwD9FuVBmWUcBoa6
 /Bmn3Iy4j37aZhQCDLdByaaKVF8n9g0=
 =ST42
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.5-20191119' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

x86/insn:

  Adrian Hunter:

  - Add some more Intel instructions to the opcode map:

        cldemote, encls, enclu, enclv, enqcmd, enqcmds, movdir64b,
        movdiri, pconfig, tpause, umonitor, umwait, wbnoinvd.

  - The instruction decoding can be tested using the perf tools'
    "x86 instruction decoder - new instructions" test as folllows:

    $ perf test -v "new " 2>&1 | grep -i cldemote
    Decoded ok: 0f 1c 00                    cldemote (%eax)
    Decoded ok: 0f 1c 05 78 56 34 12        cldemote 0x12345678
    Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%eax,%ecx,8)
    Decoded ok: 0f 1c 00                    cldemote (%rax)
    Decoded ok: 41 0f 1c 00                 cldemote (%r8)
    Decoded ok: 0f 1c 04 25 78 56 34 12     cldemote 0x12345678
    Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%rax,%rcx,8)
    Decoded ok: 41 0f 1c 84 c8 78 56 34 12  cldemote 0x12345678(%r8,%rcx,8)
    $ perf test -v "new " 2>&1 | grep -i tpause
    Decoded ok: 66 0f ae f3                 tpause %ebx
    Decoded ok: 66 0f ae f3                 tpause %ebx
    Decoded ok: 66 41 0f ae f0              tpause %r8d

callchains:

  Adrian Hunter:

  - Fix segfault in thread__resolve_callchain_sample().

perf probe:

  - Line fixes to show only lines where probes can be used with 'perf probe -L',
    and when reporting them via 'perf probe -l'.

  - Support multiprobe events.

perf scripts python:

  Adrian Hunter:

  - Fix use of TRUE with SQLite < 3.23 in exported-sql-viewer.py.

perf maps:

  - Trim 'struct map' by removing the rb_node member for sorting
    by map name, as that is only needed for processing kernel maps,
    and only when classifying symbols by section at load time.
    Sort them by name using qsort() and do lookups using bsearch()
    when map_groups__find_by_name() is used.

perf parse:

  Ian Rogers:

  - Report initial event parsing error, providing a less cryptic message
    to state that a PMU wasn't found in the system.

perf vendor events:

  James Clark:

  - Fix commas so that PMU event files for arm64, power8 and power nine
    become valid JSON.

libtraceevent:

  Konstantin Khlebnikov:

  - Fix parsing of event %o and %X argument types.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-19 12:59:03 +01:00
Rafael J. Wysocki
cbda56d5fe cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks
Commit 99e98d3fb1 ("cpuidle: Consolidate disabled state checks")
overlooked the fact that the imx6q and tegra20 cpuidle drivers use
the "disabled" field in struct cpuidle_state for quirks which trigger
after the initialization of cpuidle, so reading the initial value of
that field is not sufficient for those drivers.

In order to allow them to implement the quirks without using the
"disabled" field in struct cpuidle_state, introduce a new helper
function and modify them to use it.

Fixes: 99e98d3fb1 ("cpuidle: Consolidate disabled state checks")
Reported-by: Len Brown <lenb@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-19 10:35:13 +01:00
Christophe Leroy
6b7c095a51 powerpc/83xx: map IMMR with a BAT.
On mpc83xx with a QE, IMMR is 2Mbytes and aligned on 2Mbytes boundarie.
On mpc83xx without a QE, IMMR is 1Mbyte and 1Mbyte aligned.

Each driver will map a part of it to access the registers it needs.
Some drivers will map the same part of IMMR as other drivers.

In order to reduce TLB misses, map the full IMMR with a BAT. If it is
2Mbytes aligned, map 2Mbytes. If there is no QE, the upper part will
remain unused, but it doesn't harm as it is mapped as guarded memory.

When the IMMR is not aligned on a 2Mbytes boundarie, only map 1Mbyte.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/269a00951328fb6fa1be2fa3cbc76c19745019b7.1568665466.git.christophe.leroy@c-s.fr
2019-11-19 19:38:38 +11:00
Christophe Leroy
cbcaff7d27 powerpc/32s: automatically allocate BAT in setbat()
If no BAT is given to setbat(), select an available BAT.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a212bd36fbd6179e0929b6c727febc35132ac25c.1568665466.git.christophe.leroy@c-s.fr
2019-11-19 19:38:38 +11:00
Christophe Leroy
d538aadc27 powerpc/ioremap: warn on early use of ioremap()
Powerpc now has EARLY_IOREMAP.

Next step is to convert all early users of ioremap() to
early_ioremap().

Add a warning to help locate those users.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b4f03a68ee8e68773c8973d74ec35f9c82c72871.1568295907.git.christophe.leroy@c-s.fr
2019-11-19 19:38:38 +11:00
Christophe Leroy
265c3491c4 powerpc: Add support for GENERIC_EARLY_IOREMAP
Add support for GENERIC_EARLY_IOREMAP.

Let's define 16 slots of 256Kbytes each for early ioremap.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/412c7eaa6a373d8f82a3c3ee01e6a65a1a6589de.1568295907.git.christophe.leroy@c-s.fr
2019-11-19 19:38:38 +11:00
Christophe Leroy
77693a5fb5 powerpc/fixmap: Use __fix_to_virt() instead of fix_to_virt()
Modify back __set_fixmap() to using __fix_to_virt() instead
of fix_to_virt() otherwise the following happens because it
seems GCC doesn't see idx as a builtin const.

  CC      mm/early_ioremap.o
In file included from ./include/linux/kernel.h:11:0,
                 from mm/early_ioremap.c:11:
In function ‘fix_to_virt’,
    inlined from ‘__set_fixmap’ at ./arch/powerpc/include/asm/fixmap.h:87:2,
    inlined from ‘__early_ioremap’ at mm/early_ioremap.c:156:4:
./include/linux/compiler.h:350:38: error: call to ‘__compiletime_assert_32’ declared with attribute error: BUILD_BUG_ON failed: idx >= __end_of_fixed_addresses
  _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
                                      ^
./include/linux/compiler.h:331:4: note: in definition of macro ‘__compiletime_assert’
    prefix ## suffix();    \
    ^
./include/linux/compiler.h:350:2: note: in expansion of macro ‘_compiletime_assert’
  _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
  ^
./include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
 #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                     ^
./include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
  BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
  ^
./include/asm-generic/fixmap.h:32:2: note: in expansion of macro ‘BUILD_BUG_ON’
  BUILD_BUG_ON(idx >= __end_of_fixed_addresses);
  ^

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Fixes: 4cfac2f9c7 ("powerpc/mm: Simplify __set_fixmap()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f4984c615f90caa3277775a68849afeea846850d.1568295907.git.christophe.leroy@c-s.fr
2019-11-19 19:38:38 +11:00
Christophe Leroy
eafd687e68 powerpc/8xx: use the fixmapped IMMR in cpm_reset()
Since commit f86ef74ed9 ("powerpc/8xx: Fix vaddr for IMMR early
remap"), the IMMR area has been mapped at startup with fixmap.

Use that fixmap directly instead of calling ioremap(), this
avoids calling ioremap() early before the slab is available.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f816ccdbd15b97cf43c5a8c7cc8dfa8db58ff036.1568294935.git.christophe.leroy@c-s.fr
2019-11-19 19:38:35 +11:00
Christophe Leroy
132f92fdc4 powerpc/8xx: add __init to cpm1 init functions
Functions cpm1_clk_setup(), cpm1_set_pin(), cpm_pic_init() and
mpc8xx_pic_init() are only called from __init functions, so mark
them __init as well.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c27168ef054f3a52edcf0ff91652700d53b3e32d.1568294563.git.christophe.leroy@c-s.fr
2019-11-19 19:38:35 +11:00
Ingo Molnar
9f4813b531 Linux 5.4-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3RzgkeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGN18H/0JZbfIpy8/4Irol
 0va7Aj2fBi1a5oxfqYsMKN0u3GKbN3OV9tQ+7w1eBNGvL72TGadgVTzTY+Im7A9U
 UjboAc7jDPCG+YhIwXFufMiIAq5jDIj6h0LDas7ALsMfsnI/RhTwgNtLTAkyI3dH
 YV/6ljFULwueJHCxzmrYbd1x39PScj3kCNL2pOe6On7rXMKOemY/nbbYYISxY30E
 GMgKApSS+li7VuSqgrKoq5Qaox26LyR2wrXB1ij4pqEJ9xgbnKRLdHuvXZnE+/5p
 46EMirt+yeSkltW3d2/9MoCHaA76ESzWMMDijLx7tPgoTc3RB3/3ZLsm3rYVH+cR
 cRlNNSk=
 =0+Cg
 -----END PGP SIGNATURE-----

Merge tag 'v5.4-rc8' into WIP.x86/mm, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-19 09:00:45 +01:00
Ilya Leoshkevich
d1242b10ff s390/bpf: Remove JITed image size limitations
Now that jump and long displacement ranges are no longer a problem,
remove the limit on JITed image size. In practice it's still limited by
2G, but with verifier allowing "only" 1M instructions, it's not an
issue.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191118180340.68373-7-iii@linux.ibm.com
2019-11-18 19:51:16 -08:00
Ilya Leoshkevich
b25c57b6b7 s390/bpf: Use lg(f)rl when long displacement cannot be used
If literal pool grows past 524287 mark, it's no longer possible to use
long displacement to reference literal pool entries. In JIT setting
maintaining multiple literal pool registers is next to impossible, since
we operate on one instruction at a time.

Therefore, fall back to loading literal pool entry using PC-relative
addressing, and then using a register-register form of the following
machine instruction.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191118180340.68373-6-iii@linux.ibm.com
2019-11-18 19:51:16 -08:00
Ilya Leoshkevich
451e448ff4 s390/bpf: Use lgrl instead of lg where possible
lg and lgrl have the same performance characteristics, but the former
requires a base register and is subject to long displacement range
limits, while the latter does not. Therefore, lgrl is totally superior
to lg and should be used instead whenever possible.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191118180340.68373-5-iii@linux.ibm.com
2019-11-18 19:51:16 -08:00
Ilya Leoshkevich
c1aff5682d s390/bpf: Load literal pool register using larl
Currently literal pool register is loaded using basr, which makes it
point not to the beginning of the literal pool, but rather to the next
instruction. In case JITed code is larger than 512k, this renders
literal pool register absolutely useless due to long displacement range
restrictions.

The solution is to use larl to make literal pool register point to the
very beginning of the literal pool. This makes it always possible to
address 512k worth of literal pool entries using long displacement.

However, for short programs, in which the entire literal pool is covered
by basr-generated base, it is still beneficial to use basr, since it is
4 bytes shorter than larl.

Detect situations when basr-generated base does not cover the entire
literal pool, and in such cases use larl instead.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191118180340.68373-4-iii@linux.ibm.com
2019-11-18 19:51:16 -08:00
Ilya Leoshkevich
e0491f6479 s390/bpf: Align literal pool entries
When literal pool size exceeds 512k, it's no longer possible to
reference all the entries in it using a single base register and long
displacement. Therefore, PC-relative lgfrl and lgrl instructions need to
be used.

Unfortunately, they require their arguments to be aligned to 4- and
8-byte boundaries respectively. This generates certain overhead due to
necessary padding bytes. Grouping 4- and 8-byte entries together reduces
the maximum overhead to 6 bytes (2 for aligning 4-byte entries and 4 for
aligning 8-byte entries).

While in theory it is possible to detect whether or not alignment is
needed by comparing the literal pool size with 512k, in practice this
leads to having two ways of emitting constants, making the code more
complicated.

Prefer code simplicity over trivial size saving, and always group and
align literal pool entries.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191118180340.68373-3-iii@linux.ibm.com
2019-11-18 19:51:16 -08:00
Ilya Leoshkevich
4e9b4a6883 s390/bpf: Use relative long branches
Currently maximum JITed code size is limited to 64k, because JIT can
emit only relative short branches, whose range is limited by 64k in both
directions.

Teach JIT to use relative long branches. There are no compare+branch
relative long instructions, so using relative long branches consumes
more space due to having to having to emit an explicit comparison
instruction. Therefore do this only when relative short branch is not
enough.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191118180340.68373-2-iii@linux.ibm.com
2019-11-18 19:51:16 -08:00
Adrian Hunter
b980be189c x86/insn: Add some Intel instructions to the opcode map
Add to the opcode map the following instructions:
        cldemote
        tpause
        umonitor
        umwait
        movdiri
        movdir64b
        enqcmd
        enqcmds
        encls
        enclu
        enclv
        pconfig
        wbnoinvd

For information about the instructions, refer Intel SDM May 2019
(325462-070US) and Intel Architecture Instruction Set Extensions
May 2019 (319433-037).

The instruction decoding can be tested using the perf tools'
"x86 instruction decoder - new instructions" test as folllows:

  $ perf test -v "new " 2>&1 | grep -i cldemote
  Decoded ok: 0f 1c 00                    cldemote (%eax)
  Decoded ok: 0f 1c 05 78 56 34 12        cldemote 0x12345678
  Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%eax,%ecx,8)
  Decoded ok: 0f 1c 00                    cldemote (%rax)
  Decoded ok: 41 0f 1c 00                 cldemote (%r8)
  Decoded ok: 0f 1c 04 25 78 56 34 12     cldemote 0x12345678
  Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%rax,%rcx,8)
  Decoded ok: 41 0f 1c 84 c8 78 56 34 12  cldemote 0x12345678(%r8,%rcx,8)
  $ perf test -v "new " 2>&1 | grep -i tpause
  Decoded ok: 66 0f ae f3                 tpause %ebx
  Decoded ok: 66 0f ae f3                 tpause %ebx
  Decoded ok: 66 41 0f ae f0              tpause %r8d
  $ perf test -v "new " 2>&1 | grep -i umonitor
  Decoded ok: 67 f3 0f ae f0              umonitor %ax
  Decoded ok: f3 0f ae f0                 umonitor %eax
  Decoded ok: 67 f3 0f ae f0              umonitor %eax
  Decoded ok: f3 0f ae f0                 umonitor %rax
  Decoded ok: 67 f3 41 0f ae f0           umonitor %r8d
  $ perf test -v "new " 2>&1 | grep -i umwait
  Decoded ok: f2 0f ae f0                 umwait %eax
  Decoded ok: f2 0f ae f0                 umwait %eax
  Decoded ok: f2 41 0f ae f0              umwait %r8d
  $ perf test -v "new " 2>&1 | grep -i movdiri
  Decoded ok: 0f 38 f9 03                 movdiri %eax,(%ebx)
  Decoded ok: 0f 38 f9 88 78 56 34 12     movdiri %ecx,0x12345678(%eax)
  Decoded ok: 48 0f 38 f9 03              movdiri %rax,(%rbx)
  Decoded ok: 48 0f 38 f9 88 78 56 34 12  movdiri %rcx,0x12345678(%rax)
  $ perf test -v "new " 2>&1 | grep -i movdir64b
  Decoded ok: 66 0f 38 f8 18              movdir64b (%eax),%ebx
  Decoded ok: 66 0f 38 f8 88 78 56 34 12  movdir64b 0x12345678(%eax),%ecx
  Decoded ok: 67 66 0f 38 f8 1c           movdir64b (%si),%bx
  Decoded ok: 67 66 0f 38 f8 8c 34 12     movdir64b 0x1234(%si),%cx
  Decoded ok: 66 0f 38 f8 18              movdir64b (%rax),%rbx
  Decoded ok: 66 0f 38 f8 88 78 56 34 12  movdir64b 0x12345678(%rax),%rcx
  Decoded ok: 67 66 0f 38 f8 18           movdir64b (%eax),%ebx
  Decoded ok: 67 66 0f 38 f8 88 78 56 34 12       movdir64b 0x12345678(%eax),%ecx
  $ perf test -v "new " 2>&1 | grep -i enqcmd
  Decoded ok: f2 0f 38 f8 18              enqcmd (%eax),%ebx
  Decoded ok: f2 0f 38 f8 88 78 56 34 12  enqcmd 0x12345678(%eax),%ecx
  Decoded ok: 67 f2 0f 38 f8 1c           enqcmd (%si),%bx
  Decoded ok: 67 f2 0f 38 f8 8c 34 12     enqcmd 0x1234(%si),%cx
  Decoded ok: f3 0f 38 f8 18              enqcmds (%eax),%ebx
  Decoded ok: f3 0f 38 f8 88 78 56 34 12  enqcmds 0x12345678(%eax),%ecx
  Decoded ok: 67 f3 0f 38 f8 1c           enqcmds (%si),%bx
  Decoded ok: 67 f3 0f 38 f8 8c 34 12     enqcmds 0x1234(%si),%cx
  Decoded ok: f2 0f 38 f8 18              enqcmd (%rax),%rbx
  Decoded ok: f2 0f 38 f8 88 78 56 34 12  enqcmd 0x12345678(%rax),%rcx
  Decoded ok: 67 f2 0f 38 f8 18           enqcmd (%eax),%ebx
  Decoded ok: 67 f2 0f 38 f8 88 78 56 34 12       enqcmd 0x12345678(%eax),%ecx
  Decoded ok: f3 0f 38 f8 18              enqcmds (%rax),%rbx
  Decoded ok: f3 0f 38 f8 88 78 56 34 12  enqcmds 0x12345678(%rax),%rcx
  Decoded ok: 67 f3 0f 38 f8 18           enqcmds (%eax),%ebx
  Decoded ok: 67 f3 0f 38 f8 88 78 56 34 12       enqcmds 0x12345678(%eax),%ecx
  $ perf test -v "new " 2>&1 | grep -i enqcmds
  Decoded ok: f3 0f 38 f8 18              enqcmds (%eax),%ebx
  Decoded ok: f3 0f 38 f8 88 78 56 34 12  enqcmds 0x12345678(%eax),%ecx
  Decoded ok: 67 f3 0f 38 f8 1c           enqcmds (%si),%bx
  Decoded ok: 67 f3 0f 38 f8 8c 34 12     enqcmds 0x1234(%si),%cx
  Decoded ok: f3 0f 38 f8 18              enqcmds (%rax),%rbx
  Decoded ok: f3 0f 38 f8 88 78 56 34 12  enqcmds 0x12345678(%rax),%rcx
  Decoded ok: 67 f3 0f 38 f8 18           enqcmds (%eax),%ebx
  Decoded ok: 67 f3 0f 38 f8 88 78 56 34 12       enqcmds 0x12345678(%eax),%ecx
  $ perf test -v "new " 2>&1 | grep -i encls
  Decoded ok: 0f 01 cf                    encls
  Decoded ok: 0f 01 cf                    encls
  $ perf test -v "new " 2>&1 | grep -i enclu
  Decoded ok: 0f 01 d7                    enclu
  Decoded ok: 0f 01 d7                    enclu
  $ perf test -v "new " 2>&1 | grep -i enclv
  Decoded ok: 0f 01 c0                    enclv
  Decoded ok: 0f 01 c0                    enclv
  $ perf test -v "new " 2>&1 | grep -i pconfig
  Decoded ok: 0f 01 c5                    pconfig
  Decoded ok: 0f 01 c5                    pconfig
  $ perf test -v "new " 2>&1 | grep -i wbnoinvd
  Decoded ok: f3 0f 09                    wbnoinvd
  Decoded ok: f3 0f 09                    wbnoinvd

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20191115135447.6519-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18 18:54:45 -03:00
Ingo Molnar
b21feab0b8 Linux 5.4-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3RzgkeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGN18H/0JZbfIpy8/4Irol
 0va7Aj2fBi1a5oxfqYsMKN0u3GKbN3OV9tQ+7w1eBNGvL72TGadgVTzTY+Im7A9U
 UjboAc7jDPCG+YhIwXFufMiIAq5jDIj6h0LDas7ALsMfsnI/RhTwgNtLTAkyI3dH
 YV/6ljFULwueJHCxzmrYbd1x39PScj3kCNL2pOe6On7rXMKOemY/nbbYYISxY30E
 GMgKApSS+li7VuSqgrKoq5Qaox26LyR2wrXB1ij4pqEJ9xgbnKRLdHuvXZnE+/5p
 46EMirt+yeSkltW3d2/9MoCHaA76ESzWMMDijLx7tPgoTc3RB3/3ZLsm3rYVH+cR
 cRlNNSk=
 =0+Cg
 -----END PGP SIGNATURE-----

Merge tag 'v5.4-rc8' into sched/core, to pick up fixes and dependencies

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-18 14:41:02 +01:00
Paolo Bonzini
fe289ebb65 KVM: s390: small fixes and enhancements
- selftest improvements
 - yield improvements
 - cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJd0k9KAAoJEBF7vIC1phx8jecP/15y4vJABaNMCb/zzNYEncxr
 lJf8ZeW+257eiEhsmmju4eM8l9/3RzsJM9WXSj91MBRu+xlkt+cyla/TC+CEKMxW
 Z8yd3AkaIPTMDBY/n6QSqDusrUwfR01iM02mr/IKguG/HeCKgLksN03ZU00mc09q
 Ogo+Cl3AdNnIds+5vkIOQAc+CHM3SGjEfyZCqoTwjn46jsKNQeDrq3hHX9RMG4FF
 BxVcSx5rCFCYyb9eruCCK4OHrEEwdJ4l0udkblRjIl+T9Y8LgoXO1/KGIggVL5UJ
 +Smoc/soXMdkOAhefn/2fB1dBRNBaUpvB5xtAd4BHyRjPomw93sftScW06qfiZuo
 0nBiDgTyilpi8dpojyu2vUpYj7NQXTI4ZoHOMTsXOhk6cqGqm4loLb4xdJ8FCoc9
 04Yf1GCfbyEovoyLq1BkL1qD5ZUBecUfYWQGS1xf0+U6/hvn5lQOGeINNe/ho2Zl
 jU1lsFuGGyKs3G5qpk0Dz8UgbRqOYC58VlGQ1eOcNVksTf7qG+MZ3c6kall7CfXg
 MFcK/PuSxyTfrr5CApyK3Gpqu32aMV0rComd6Bv28DlsTRA9F1TJ5WQTO3HUhV9R
 iiqbMAx0s1xHZp6K/VsCvYRjdVyKU7/sQ6OxRmRTybjjKajKijQjMlE2f1Nr0liD
 PKsQjv2kTvrtMDzOhWFu
 =zHPF
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: small fixes and enhancements

- selftest improvements
- yield improvements
- cleanups
2019-11-18 13:16:46 +01:00
Christophe Leroy
b020aa9d1e powerpc: cleanup hw_irq.h
SET_MSR_EE() is just use in this file and doesn't provide
any added value compared to mtmsr(). Drop it.

Add a wrtee() inline function to use wrtee/wrteei insn.

Replace #ifdefs by IS_ENABLED()

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a28a20514d5f6df9629c1a117b667e48c4272736.1567068137.git.christophe.leroy@c-s.fr
2019-11-18 22:27:52 +11:00
Christophe Leroy
44448640dd powerpc: permanently include 8xx registers in reg.h
Most 8xx registers have specific names, so just include
reg_8xx.h all the time in reg.h in order to have them defined
even when CONFIG_PPC_8xx is not selected. This will avoid
the need for #ifdefs in C code.

Guard SPRN_ICTRL in an #ifdef CONFIG_PPC_8xx as this register
has same name but different meaning and different spr number as
another register in the mpc7450.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/dd82934ad91aab607d0eb7e626c14e6ac0d654eb.1567068137.git.christophe.leroy@c-s.fr
2019-11-18 22:27:52 +11:00
Christophe Leroy
b06174345f powerpc/reg: use ASM_FTR_IFSET() instead of opencoding fixup.
mftb() includes a feature fixup for CELL ppc.

Use ASM_FTR_IFSET() macro instead of opencoding the setup
of the fixup sections.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ac19713826fa55e9e7bfe3100c5a7b1712ab9526.1566999711.git.christophe.leroy@c-s.fr
2019-11-18 22:27:52 +11:00
Christophe Leroy
a2227a2777 powerpc/32: Don't populate page tables for block mapped pages except on the 8xx.
Commit d2f15e0979 ("powerpc/32: always populate page tables for
Abatron BDI.") wrongly sets page tables for any PPC32 for using BDI,
and does't update them after init (remove RX on init section, set
text and rodata read-only)

Only the 8xx requires page tables to be populated for using the BDI.
They also need to be populated in order to see the mappings in
/sys/kernel/debug/kernel_page_tables

On BOOK3S_32, pages that are not mapped by page tables are mapped
by BATs. The BDI knows BATs and they can be viewed in
/sys/kernel/debug/powerpc/block_address_translation

Only set pagetables for RAM and IMMR on the 8xx and properly update
them at the end of init.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c8610942203e0d93fcb02ad20c57edd3adb4c9d3.1566554029.git.christophe.leroy@c-s.fr
2019-11-18 22:27:52 +11:00
Christophe Leroy
46ddcb3950 powerpc/mm: Show if a bad page fault on data is read or write.
DSISR (or ESR on some CPUs) has a bit to tell if the fault is due to a
read or a write.

Display it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4f88d7e6fda53b5f80a71040ab400242f6c8cb93.1566400889.git.christophe.leroy@c-s.fr
2019-11-18 22:27:51 +11:00
Christophe Leroy
c4028fa2da powerpc/mm: drop #ifdef CONFIG_MMU in is_ioremap_addr()
powerpc always selects CONFIG_MMU and CONFIG_MMU is not checked
anywhere else in powerpc code.

Drop the #ifdef and the alternative part of is_ioremap_addr()

Fixes: 9bd3bb6703 ("mm/nvdimm: add is_ioremap_addr and use that to check ioremap address")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/de395e444fb8dd7a6365c3314d78e15ebb3d7d1b.1566382245.git.christophe.leroy@c-s.fr
2019-11-18 22:27:51 +11:00
Christophe Leroy
43f003bb74 powerpc: Refactor BUG/WARN macros
BUG(), WARN() and friends are using a similar inline assembly to
implement various traps with various flags.

Lets refactor via a new BUG_ENTRY() macro.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c19a82b37677ace0eebb0dc8c2120373c29c8dd1.1566219503.git.christophe.leroy@c-s.fr
2019-11-18 22:27:51 +11:00
Michael Ellerman
98ba8e8013 Merge branch 'next' of https://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Merge changes from Scott:
  Includes a couple of device tree fixes, a spelling fix, and leftover
  code cleanup.
2019-11-18 22:26:59 +11:00
Thomas Gleixner
b41d62201b x86: Remove unused asm/rio.h
The removed calgary IOMMU driver was the only user of this header file.

Reported-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
2019-11-18 10:52:11 +01:00
Michael Schmitz
5ed0794cde m68k/atari: Convert Falcon IDE drivers to platform drivers
Autoloading of Falcon IDE driver modules requires converting these
drivers to platform drivers.

Add platform device for Falcon IDE interface in Atari platform setup
code. Use this in the pata_falcon driver in place of the simple
platform device set up on the fly.

Convert falconide driver to use the same platform device that is used
by pata_falcon also. (With the introduction of a platform device for
the Atari Falcon IDE interface, the old Falcon IDE driver no longer
loads (resource already claimed by the platform device)).

Tested (as built-in driver) on my Atari Falcon.

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Link: https://lore.kernel.org/r/1573008449-8226-1-git-send-email-schmitzmic@gmail.com
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2019-11-18 10:18:59 +01:00
Cao jin
11a98f37a5 x86: Fix typos in comments
BIOSen -> BIOSes; paing -> paging. Append to 640 its proper unit "Kb".
encomapssing -> encompassing.

 [ bp: Merge into a single patch, fix one more typo, massage. ]

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191118070012.27850-1-caoj.fnst@cn.fujitsu.com
2019-11-18 10:03:26 +01:00
Christoph Hellwig
405fe7aa0d riscv: provide a flat image loader
This allows just loading the kernel at a pre-set address without
qemu going bonkers trying to map the ELF file.

Contains a contribution from Aurabindo Jayamohanan to reuse the
PAGE_OFFSET definition.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: fixed checkpatch issue; minor commit
 message fix]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-11-17 15:17:39 -08:00
Christoph Hellwig
6bd33e1ece riscv: add nommu support
The kernel runs in M-mode without using page tables, and thus can't run
bare metal without help from additional firmware.

Most of the patch is just stubbing out code not needed without page
tables, but there is an interesting detail in the signals implementation:

 - The normal RISC-V syscall ABI only implements rt_sigreturn as VDSO
   entry point, but the ELF VDSO is not supported for nommu Linux.
   We instead copy the code to call the syscall onto the stack.

In addition to enabling the nommu code a new defconfig for a small
kernel image that can run in nommu mode on qemu is also provided, to run
a kernel in qemu you can use the following command line:

qemu-system-riscv64 -smp 2 -m 64 -machine virt -nographic \
	-kernel arch/riscv/boot/loader \
	-drive file=rootfs.ext2,format=raw,id=hd0 \
	-device virtio-blk-device,drive=hd0

Contains contributions from Damien Le Moal <Damien.LeMoal@wdc.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: updated to apply; add CONFIG_MMU guards
 around PCI_IOBASE definition to fix build issues; fixed checkpatch
 issues; move the PCI_IO_* and VMEMMAP address space macros along
 with the others; resolve sparse warning]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-11-17 15:17:39 -08:00
Christoph Hellwig
9e80635619 riscv: clear the instruction cache and all registers when booting
When we get booted we want a clear slate without any leaks from previous
supervisors or the firmware.  Flush the instruction cache and then clear
all registers to known good values.  This is really important for the
upcoming nommu support that runs on M-mode, but can't really harm when
running in S-mode either.  Vaguely based on the concepts from opensbi.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-11-17 15:17:39 -08:00
Damien Le Moal
accb9dbc4a riscv: read the hart ID from mhartid on boot
When in M-Mode, we can use the mhartid CSR to get the ID of the running
HART. Doing so, direct M-Mode boot without firmware is possible.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-11-17 15:17:39 -08:00
Christoph Hellwig
fcdc653751 riscv: provide native clint access for M-mode
RISC-V has the concept of a cpu level interrupt controller.  The
interface for it is split between a standardized part that is exposed
as bits in the mstatus/sstatus register and the mie/mip/sie/sip
CRS.  But the bit to actually trigger IPIs is not standardized and
just mentioned as implementable using MMIO.

Add support for IPIs using MMIO using the SiFive clint layout (which
is also shared by Ariane, Kendryte and the Qemu virt platform).
Additionally the MMIO block also supports the time value and timer
compare registers, so they are also set up using the same OF node.
Support for other layouts should also be relatively easy to add in the
future.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: update include guard format; fix checkpatch
 issues; minor commit message cleanup]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-11-17 15:17:39 -08:00
Dan Williams
adbb68293f libnvdimm: Move nd_device_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_device_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

For regions this creates a new nd_region_attribute_groups[] added to the
per-region device-type instances.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309901138.1582359.12909354140826530394.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-17 09:17:39 -08:00
Valentin Longchamp
a76bea0287 powerpc/kmcent2: add ranges to the pci bridges
This removes the warnings about the fact that the 4 pci bridges (i.e.
the 4 pci hosts) don't have any ranges.

Signed-off-by: Valentin Longchamp <valentin@longchamp.me>
Signed-off-by: Scott Wood <oss@buserror.net>
2019-11-17 02:01:02 -06:00
Geert Uytterhoeven
3a0990ca1a powerpc/booke: Spelling s/date/data/
Caching dates is never a good idea ;-)

Fixes: e7affb1dba ("powerpc/cache: add cache flush operation for various e500")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Scott Wood <oss@buserror.net>
2019-11-17 01:56:31 -06:00
Rasmus Villemoes
3e4282e484 powerpc/85xx: remove mostly pointless mpc85xx_qe_init()
Since commit 302c059f2e (QE: use subsys_initcall to init qe),
mpc85xx_qe_init() has done nothing apart from possibly emitting a
pr_err(). As part of reducing the amount of QE-related code in
arch/powerpc/ (and eventually support QE on other architectures),
remove this low-hanging fruit.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Scott Wood <oss@buserror.net>
2019-11-17 01:55:42 -06:00
Valentin Longchamp
ea67a5519d powerpc/kmcent2: update the ethernet devices' phy properties
Change all phy-connection-type properties to phy-mode that are better
supported by the fman driver.

Use the more readable fixed-link node for the 2 sgmii links.

Change the RGMII link to rgmii-id as the clock delays are added by the
phy.

Signed-off-by: Valentin Longchamp <valentin@longchamp.me>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
2019-11-17 01:53:57 -06:00
David S. Miller
19b7e21c55 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Lots of overlapping changes and parallel additions, stuff
like that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-16 21:51:42 -08:00
Jason A. Donenfeld
d8f1308a02 crypto: arm/curve25519 - wire up NEON implementation
This ports the SUPERCOP implementation for usage in kernel space. In
addition to the usual header, macro, and style changes required for
kernel space, it makes a few small changes to the code:

  - The stack alignment is relaxed to 16 bytes.
  - Superfluous mov statements have been removed.
  - ldr for constants has been replaced with movw.
  - ldreq has been replaced with moveq.
  - The str epilogue has been made more idiomatic.
  - SIMD registers are not pushed and popped at the beginning and end.
  - The prologue and epilogue have been made idiomatic.
  - A hole has been removed from the stack, saving 32 bytes.
  - We write-back the base register whenever possible for vld1.8.
  - Some multiplications have been reordered for better A7 performance.

There are more opportunities for cleanup, since this code is from qhasm,
which doesn't always do the most opportune thing. But even prior to
extensive hand optimizations, this code delivers significant performance
improvements (given in get_cycles() per call):

		      ----------- -------------
	             | generic C | this commit |
	 ------------ ----------- -------------
	| Cortex-A7  |     49136 |       22395 |
	 ------------ ----------- -------------
	| Cortex-A17 |     17326 |        4983 |
	 ------------ ----------- -------------

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
[ardb: - move to arch/arm/crypto
       - wire into lib/crypto framework
       - implement crypto API KPP hooks ]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:44 +08:00
Jason A. Donenfeld
f0fb006b60 crypto: arm/curve25519 - import Bernstein and Schwabe's Curve25519 ARM implementation
This comes from Dan Bernstein and Peter Schwabe's public domain NEON
code, and is included here in raw form so that subsequent commits that
fix these up for the kernel can see how it has changed. This code does
have some entirely cosmetic formatting differences, adding indentation
and so forth, so that when we actually port it for use in the kernel in
the subsequent commit, it's obvious what's changed in the process.

This code originates from SUPERCOP 20180818, available at
<https://bench.cr.yp.to/supercop.html>.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:44 +08:00
Jason A. Donenfeld
bb611bdfd6 crypto: curve25519 - x86_64 library and KPP implementations
This implementation is the fastest available x86_64 implementation, and
unlike Sandy2x, it doesn't requie use of the floating point registers at
all. Instead it makes use of BMI2 and ADX, available on recent
microarchitectures. The implementation was written by Armando
Faz-Hernández with contributions (upstream) from Samuel Neves and me,
in addition to further changes in the kernel implementation from us.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
[ardb: - move to arch/x86/crypto
       - wire into lib/crypto framework
       - implement crypto API KPP hooks ]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:44 +08:00
Jason A. Donenfeld
ed0356eda1 crypto: blake2s - x86_64 SIMD implementation
These implementations from Samuel Neves support AVX and AVX-512VL.
Originally this used AVX-512F, but Skylake thermal throttling made
AVX-512VL more attractive and possible to do with negligable difference.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
[ardb: move to arch/x86/crypto, wire into lib/crypto framework]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:43 +08:00
Ard Biesheuvel
c12d3362a7 int128: move __uint128_t compiler test to Kconfig
In order to use 128-bit integer arithmetic in C code, the architecture
needs to have declared support for it by setting ARCH_SUPPORTS_INT128,
and it requires a version of the toolchain that supports this at build
time. This is why all existing tests for ARCH_SUPPORTS_INT128 also test
whether __SIZEOF_INT128__ is defined, since this is only the case for
compilers that can support 128-bit integers.

Let's fold this additional test into the Kconfig declaration of
ARCH_SUPPORTS_INT128 so that we can also use the symbol in Makefiles,
e.g., to decide whether a certain object needs to be included in the
first place.

Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:42 +08:00
Ard Biesheuvel
a11d055e7a crypto: mips/poly1305 - incorporate OpenSSL/CRYPTOGAMS optimized implementation
This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305 implementation for
MIPS authored by Andy Polyakov, a prior 64-bit only version of which has been
contributed by him to the OpenSSL project. The file 'poly1305-mips.pl' is taken
straight from this upstream GitHub repository [0] at commit
d22ade312a7af958ec955620b0d241cf42c37feb, and already contains all the changes
required to build it as part of a Linux kernel module.

[0] https://github.com/dot-asm/cryptogams

Co-developed-by: Andy Polyakov <appro@cryptogams.org>
Signed-off-by: Andy Polyakov <appro@cryptogams.org>
Co-developed-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:42 +08:00
Ard Biesheuvel
a6b803b3dd crypto: arm/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation
This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305 implementation
for NEON authored by Andy Polyakov, and contributed by him to the OpenSSL
project. The file 'poly1305-armv4.pl' is taken straight from this upstream
GitHub repository [0] at commit ec55a08dc0244ce570c4fc7cade330c60798952f,
and already contains all the changes required to build it as part of a
Linux kernel module.

[0] https://github.com/dot-asm/cryptogams

Co-developed-by: Andy Polyakov <appro@cryptogams.org>
Signed-off-by: Andy Polyakov <appro@cryptogams.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:42 +08:00
Ard Biesheuvel
f569ca1647 crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation
This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305 implementation
for NEON authored by Andy Polyakov, and contributed by him to the OpenSSL
project. The file 'poly1305-armv8.pl' is taken straight from this upstream
GitHub repository [0] at commit ec55a08dc0244ce570c4fc7cade330c60798952f,
and already contains all the changes required to build it as part of a
Linux kernel module.

[0] https://github.com/dot-asm/cryptogams

Co-developed-by: Andy Polyakov <appro@cryptogams.org>
Signed-off-by: Andy Polyakov <appro@cryptogams.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:41 +08:00
Ard Biesheuvel
f0e89bcfbb crypto: x86/poly1305 - expose existing driver as poly1305 library
Implement the arch init/update/final Poly1305 library routines in the
accelerated SIMD driver for x86 so they are accessible to users of
the Poly1305 library interface as well.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:41 +08:00
Ard Biesheuvel
1b2c6a5120 crypto: x86/poly1305 - depend on generic library not generic shash
Remove the dependency on the generic Poly1305 driver. Instead, depend
on the generic library so that we only reuse code without pulling in
the generic skcipher implementation as well.

While at it, remove the logic that prefers the non-SIMD path for short
inputs - this is no longer necessary after recent FPU handling changes
on x86.

Since this removes the last remaining user of the routines exported
by the generic shash driver, unexport them and make them static.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:41 +08:00
Ard Biesheuvel
ad8f5b8838 crypto: x86/poly1305 - unify Poly1305 state struct with generic code
In preparation of exposing a Poly1305 library interface directly from
the accelerated x86 driver, align the state descriptor of the x86 code
with the one used by the generic driver. This is needed to make the
library interface unified between all implementations.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:41 +08:00
Ard Biesheuvel
48ea8c6ebc crypto: poly1305 - move core routines into a separate library
Move the core Poly1305 routines shared between the generic Poly1305
shash driver and the Adiantum and NHPoly1305 drivers into a separate
library so that using just this pieces does not pull in the crypto
API pieces of the generic Poly1305 routine.

In a subsequent patch, we will augment this generic library with
init/update/final routines so that Poyl1305 algorithm can be used
directly without the need for using the crypto API's shash abstraction.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:41 +08:00
Ard Biesheuvel
3a2f58f3ba crypto: mips/chacha - wire up accelerated 32r2 code from Zinc
This integrates the accelerated MIPS 32r2 implementation of ChaCha
into both the API and library interfaces of the kernel crypto stack.

The significance of this is that, in addition to becoming available
as an accelerated library implementation, it can also be used by
existing crypto API code such as Adiantum (for block encryption on
ultra low performance cores) or IPsec using chacha20poly1305. These
are use cases that have already opted into using the abstract crypto
API. In order to support Adiantum, the core assembler routine has
been adapted to take the round count as a function argument rather
than hardcoding it to 20.

Co-developed-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:40 +08:00
Jason A. Donenfeld
49aa7c00ed crypto: mips/chacha - import 32r2 ChaCha code from Zinc
This imports the accelerated MIPS 32r2 ChaCha20 implementation from the
Zinc patch set.

Co-developed-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:40 +08:00
Ard Biesheuvel
a44a3430d7 crypto: arm/chacha - expose ARM ChaCha routine as library function
Expose the accelerated NEON ChaCha routine directly as a symbol
export so that users of the ChaCha library API can use it directly.

Given that calls into the library API will always go through the
routines in this module if it is enabled, switch to static keys
to select the optimal implementation available (which may be none
at all, in which case we defer to the generic implementation for
all invocations).

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:40 +08:00
Ard Biesheuvel
b36d8c09e7 crypto: arm/chacha - remove dependency on generic ChaCha driver
Instead of falling back to the generic ChaCha skcipher driver for
non-SIMD cases, use a fast scalar implementation for ARM authored
by Eric Biggers. This removes the module dependency on chacha-generic
altogether, which also simplifies things when we expose the ChaCha
library interface from this module.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:40 +08:00
Ard Biesheuvel
29621d099f crypto: arm/chacha - import Eric Biggers's scalar accelerated ChaCha code
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Ard Biesheuvel
b3aad5bad2 crypto: arm64/chacha - expose arm64 ChaCha routine as library function
Expose the accelerated NEON ChaCha routine directly as a symbol
export so that users of the ChaCha library API can use it directly.

Given that calls into the library API will always go through the
routines in this module if it is enabled, switch to static keys
to select the optimal implementation available (which may be none
at all, in which case we defer to the generic implementation for
all invocations).

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Ard Biesheuvel
c77da4867c crypto: arm64/chacha - depend on generic chacha library instead of crypto driver
Depend on the generic ChaCha library routines instead of pulling in the
generic ChaCha skcipher driver, which is more than we need, and makes
managing the dependencies between the generic library, generic driver,
accelerated library and driver more complicated.

While at it, drop the logic to prefer the scalar code on short inputs.
Turning the NEON on and off is cheap these days, and one major use case
for ChaCha20 is ChaCha20-Poly1305, which is guaranteed to hit the scalar
path upon every invocation  (when doing the Poly1305 nonce generation)

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Ard Biesheuvel
84e03fa39f crypto: x86/chacha - expose SIMD ChaCha routine as library function
Wire the existing x86 SIMD ChaCha code into the new ChaCha library
interface, so that users of the library interface will get the
accelerated version when available.

Given that calls into the library API will always go through the
routines in this module if it is enabled, switch to static keys
to select the optimal implementation available (which may be none
at all, in which case we defer to the generic implementation for
all invocations).

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Ard Biesheuvel
28e8d89b1c crypto: x86/chacha - depend on generic chacha library instead of crypto driver
In preparation of extending the x86 ChaCha driver to also expose the ChaCha
library interface, drop the dependency on the chacha_generic crypto driver
as a non-SIMD fallback, and depend on the generic ChaCha library directly.
This way, we only pull in the code we actually need, without registering
a set of ChaCha skciphers that we will never use.

Since turning the FPU on and off is cheap these days, simplify the SIMD
routine by dropping the per-page yield, which makes for a cleaner switch
to the library API as well. This also allows use to invoke the skcipher
walk routines in non-atomic mode.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Ard Biesheuvel
5fb8ef2580 crypto: chacha - move existing library code into lib/crypto
Currently, our generic ChaCha implementation consists of a permute
function in lib/chacha.c that operates on the 64-byte ChaCha state
directly [and which is always included into the core kernel since it
is used by the /dev/random driver], and the crypto API plumbing to
expose it as a skcipher.

In order to support in-kernel users that need the ChaCha streamcipher
but have no need [or tolerance] for going through the abstractions of
the crypto API, let's expose the streamcipher bits via a library API
as well, in a way that permits the implementation to be superseded by
an architecture specific one if provided.

So move the streamcipher code into a separate module in lib/crypto,
and expose the init() and crypt() routines to users of the library.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Linus Torvalds
fe30021c36 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Two fixes: disable unreliable HPET on Intel Coffe Lake platforms, and
  fix a lockdep splat in the resctrl code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix potential lockdep warning
  x86/quirks: Disable HPET on Intel Coffe Lake platforms
2019-11-16 16:10:59 -08:00