Commit Graph

904169 Commits

Author SHA1 Message Date
Arnd Bergmann
ad5d7a5513 Merge tag 'imx-fixes-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes
i.MX fixes for 5.6, round 2:

- Fix minimum voltage setting of vdd_arm and vdd_soc on i.MX6
  phycore-som board.

* tag 'imx-fixes-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx6: phycore-som: fix arm and soc minimum voltage

Link: https://lore.kernel.org/r/20200316032555.GD17221@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-03-25 14:27:18 +01:00
Maor Gottlieb
ba80013fba RDMA/mlx5: Block delay drop to unprivileged users
It has been discovered that this feature can globally block the RX port,
so it should be allowed for highly privileged users only.

Fixes: 03404e8ae652 ("IB/mlx5: Add support to dropless RQ")
Link: https://lore.kernel.org/r/20200322124906.1173790-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-25 09:56:30 -03:00
Arnd Bergmann
d2687b896d Merge tag 'sunxi-fixes-for-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes

Allwinner Fixes for 5.6 - part 2

This follows up on the previous 5.6 fixes tag with a fix for the A33
Security System (crypto offloading hardware). The hardware was found
to not be compatible with existing hardware and a new compatible was
needed.

The driver change was picked up right before the previous -rc6 and
the DT bindings and DT changes were not picked up. The goal is to have
all the changes in the same release, that is v5.6.

* tag 'sunxi-fixes-for-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  ARM: dts: sun8i: a33: add the new SS compatible
  dt-bindings: crypto: add new compatible for A33 SS

Link: https://lore.kernel.org/r/20200313060727.GA23962@wens.csie.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-03-25 13:42:11 +01:00
Arnd Bergmann
aafd017347 Merge tag 'sunxi-fixes-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes

Allwinner Fixes for v5.6

A pretty normal set of fixes for v5.6:

  - Fix reversed macros used for A83T EMAC clock and reset
  - Fix camera regulator voltage and USB OTG for TBS-A711
  - 16-bit / 8-bit mixed read fix for our RSB driver
  - Fix SPI controller base address for R40
  - Reorder device nodes based on base address for R40

* tag 'sunxi-fixes-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  ARM: dts: sun8i: r40: Move SPI device nodes based on address order
  ARM: dts: sun8i: r40: Fix register base address for SPI2 and SPI3
  ARM: dts: sun8i: r40: Move AHCI device node based on address order
  bus: sunxi-rsb: Return correct data when mixing 16-bit and 8-bit reads
  ARM: dts: sun8i-a83t-tbs-a711: Fix USB OTG mode detection
  ARM: dts: sun8i-a83t-tbs-a711: HM5065 doesn't like such a high voltage
  ARM: dts: sun8i: a83t: Fix incorrect clk and reset macros for EMAC device

Link: https://lore.kernel.org/r/20200313055233.GA19649@wens.csie.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-03-25 13:41:02 +01:00
Arnd Bergmann
8b45e9d9c0 Merge tag 'soc-fsl-fix-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux into arm/fixes

NXP/FSL soc driver fixes for v5.6

DPAA2 DPIO
- Fix a kernel hang caused by irq requested before creating dpio

* tag 'soc-fsl-fix-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux:
  soc: fsl: dpio: register dpio irq handlers after dpio create

Link: https://lore.kernel.org/r/20200312202525.16708-1-leoyang.li@nxp.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-03-25 13:40:22 +01:00
Johannes Berg
575a97acc3 ieee80211: fix HE SPR size calculation
The he_sr_control field is just a u8, so le32_to_cpu()
shouldn't be applied to it; this was evidently copied
from ieee80211_he_oper_size(). Fix it, and also adjust
the type of the local variable.
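
To illustrate why the conversion is harmful, here is a minimal user-space
sketch, not the kernel code. It assumes a big-endian host, where
le32_to_cpu() amounts to a 32-bit byte swap; the helper names and the
example flag bit are hypothetical.

```c
#include <stdint.h>

/* Stand-in for what le32_to_cpu() does on a big-endian host: a byte swap. */
static uint32_t sketch_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Hypothetical flag bit inside a one-byte control field. */
#define SKETCH_FLAG 0x04u

/* Buggy pattern: the u8 is promoted to 32 bits and then byte-swapped. */
static int sketch_flag_set_buggy_be(uint8_t ctrl)
{
	return (sketch_swab32(ctrl) & SKETCH_FLAG) != 0;
}

/* Fixed pattern: a single byte has no byte order, so test it directly. */
static int sketch_flag_set_fixed(uint8_t ctrl)
{
	return (ctrl & SKETCH_FLAG) != 0;
}
```

With the swap applied, the flag moves to the top byte and the test on the
low bits no longer sees it, so any size derived from such flags would be
wrong on that class of host.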

Fixes: ef11a931bd ("mac80211: HE: add Spatial Reuse element parsing support")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20200325090918.dfe483b49e06.Ia53622f23b2610a2ae6ea39a199866196fe946c1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-03-25 09:59:16 +01:00
Johannes Berg
0016d32017 nl80211: fix NL80211_ATTR_CHANNEL_WIDTH attribute type
The new opmode notification used this attribute with a u8, although
it is documented as a u32 and is indeed used as such in userspace.
It just happens to work on little-endian systems since userspace
isn't doing any strict size validation, and the u8 goes into the
lower byte. Fix this.
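
The byte-order dependence can be shown with a small user-space sketch. It
assumes the one-byte value is followed by zeroed padding up to four bytes
and that userspace reads the payload as a native u32; all function names
are hypothetical and this is not the nl80211 code.

```c
#include <stdint.h>
#include <string.h>

/* Model of a u8 payload: one value byte followed by zeroed padding. */
static void sketch_put_u8(uint8_t *payload, uint8_t value)
{
	memset(payload, 0, 4);
	payload[0] = value;
}

/* How a little-endian reader interprets the four payload bytes. */
static uint32_t sketch_read_u32_le(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* How a big-endian reader interprets the same four bytes. */
static uint32_t sketch_read_u32_be(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

A value of 3 written this way reads back as 3 on a little-endian host but
as 0x03000000 on a big-endian one, which is why the attribute has to be
written with its documented width.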

Cc: stable@vger.kernel.org
Fixes: 466b9936bf ("cfg80211: Add support to notify station's opmode change to userspace")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20200325090531.be124f0a11c7.Iedbf4e197a85471ebd729b186d5365c0343bf7a8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-03-25 09:58:43 +01:00
Dave Airlie
c255623812 Merge branch 'feature/staging_sm5' of git://people.freedesktop.org/~sroland/linux into drm-next
vmwgfx pull for 5.7. Needed for GL4 functionality.
Sync up device headers, add support for new commands, code
refactoring around surface definition.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: "Roland Scheidegger (VMware)" <rscheidegger.oss@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200323235434.11780-1-rscheidegger.oss@gmail.com
2020-03-25 15:45:45 +10:00
Dave Airlie
de487e432d Merge branch 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux into drm-next
- fix for potential out-of-bounds reads in the perfmon ioctl
  implementation from Christian
- override to expose proper feature flags for the GC400 found on the
  STM32MP1 SoC, also from Christian
- Guido fixed an issue where we would spuriously fail to enter
  runtime suspend due to a new GPU engine status bit on GC7000
- tree-wide change from Gustavo to get rid of zero-length arrays
- fix for missed TS cache flush on GC7000, leading to spurious
  MMU faults from me
- request pages from DMA32 zone on systems where we can't address
  all present memory from me

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/74d9c6d19099fdba6c6795204a6aa445b7930c79.camel@pengutronix.de
2020-03-25 15:40:25 +10:00
Martin K. Petersen
ea697a8bf5 scsi: sd: Fix optimal I/O size for devices that change reported values
Some USB bridge devices will return a default set of characteristics during
initialization and then, once an attached drive has spun up, substitute
the actual parameters reported by the drive. According to the SCSI spec,
the device should return a UNIT ATTENTION in case any reported parameters
change. But in this case the change is made silently after a small window
where default values are reported.

Commit a83da8a450 ("scsi: sd: Optimal I/O size should be a multiple of
physical block size") validated the reported optimal I/O size against the
physical block size to overcome problems with devices reporting nonsensical
transfer sizes. However, this validation did not account for the fact that
aforementioned devices will return default values during a brief window
during spin-up. The subsequent change in reported characteristics would
invalidate the checking that had previously been performed.

Unset a previously configured optimal I/O size should the sanity checking
fail on subsequent revalidate attempts.

Link: https://lore.kernel.org/r/33fb522e-4f61-1b76-914f-c9e6a3553c9b@gmail.com
Cc: Bryan Gurney <bgurney@redhat.com>
Cc: <stable@vger.kernel.org>
Reported-by: Bernhard Sulzer <micraft.b@gmail.com>
Tested-by: Bernhard Sulzer <micraft.b@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-24 22:53:04 -04:00
Damien Le Moal
ccf4ad7da0 zonefs: Fix handling of read-only zones
The write pointer of zones in the read-only condition is defined as
invalid by the SCSI ZBC and ATA ZAC specifications. It is thus not
possible to determine the correct size of a read-only zone file on
mount. Fix this by handling read-only zones in the same manner as
offline zones by disabling all accesses to the zone (read and write)
and initializing the inode size of the read-only zone to 0.

For zones found to be in the read-only condition at runtime, only
disable write access to the zone and keep the size of the zone file to
its last updated value to allow the user to recover previously written
data.

Also fix zonefs documentation file to reflect this change.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
2020-03-25 11:28:26 +09:00
David S. Miller
6f000f9878 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) A new selftest for nf_queue, from Florian Westphal. This test
   covers two recent fixes: 07f8e4d0fd ("tcp: also NULL skb->dev
   when copy was needed") and b738a185be ("tcp: ensure skb->dev is
   NULL before leaving TCP stack").

2) The fwd action breaks with ifb. For safety in next extensions,
   make sure the fwd action only runs from ingress until it is extended
   to be used from a different hook.

3) The pipapo set type now reports EEXIST in case of subrange overlaps.
   Update the rbtree set to validate range overlaps; so far this
   validation is done only from userspace. From Stefano Brivio.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 17:30:40 -07:00
David S. Miller
7e566df652 Merge tag 'mlx5-fixes-2020-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2020-03-24

This series introduces some fixes to mlx5 driver.

From Aya, Fixes to the RX error recovery flows
From Leon, Fix IB capability mask

Please pull and let me know if there is any problem.

For -stable v5.5
 ('net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure')

For -stable v5.4
 ('net/mlx5e: Fix ICOSQ recovery flow with Striding RQ')
 ('net/mlx5e: Do not recover from a non-fatal syndrome')
 ('net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset')
 ('net/mlx5e: Enhance ICOSQ WQE info fields')

The above patch ('net/mlx5e: Enhance ICOSQ WQE info fields')
will fail to apply cleanly on v5.4 due to a trivial contextual conflict,
but it is an important fix, do I need to do something about it or just
assume Greg will know how to handle this?
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 17:25:54 -07:00
Heiner Kallweit
f13bc68131 r8169: re-enable MSI on RTL8168c
The original change fixed an issue on RTL8168b by mimicking the vendor
driver behavior to disable MSI on chip versions before RTL8168d.
This however now caused an issue on a system with RTL8168c, see [0].
Therefore leave MSI disabled on RTL8168b, but re-enable it on RTL8168c.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1792839

Fixes: 003bd5b4a7 ("r8169: don't use MSI before RTL8168d")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 17:20:10 -07:00
Andre Przywara
c312c7818b net: phy: mdio-bcm-unimac: Fix clock handling
The DT binding for this PHY describes an *optional* clock property.
Due to a bug in the error handling logic, we are actually ignoring this
clock *all* of the time so far.

Fix this by using devm_clk_get_optional() to handle this clock properly.

Fixes: b78ac6ecd1 ("net: phy: mdio-bcm-unimac: Allow configuring MDIO clock divider")
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:45:32 -07:00
Raju Rangoju
50e0d28d38 cxgb4/ptp: pass the sign of offset delta in FW CMD
cxgb4_ptp_fineadjtime() doesn't pass the signedness of offset delta
in FW_PTP_CMD. Fix it by passing correct sign.

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:23:55 -07:00
Vladimir Oltean
e80f40cbe4 net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop
Not only did this wheel not need reinventing, but there is also
an issue with it: It doesn't remove the VLAN header in a way that
preserves the L2 payload checksum when that is being provided by the DSA
master hw.  It should recalculate checksum both for the push, before
removing the header, and for the pull afterwards. But the current
implementation is quite dizzying, with pulls followed immediately
afterwards by pushes, the memmove is done before the push, etc.  This
makes a DSA master with RX checksumming offload print stack traces
with the infamous 'hw csum failure' message.

So remove the dsa_8021q_remove_header function and replace it with
something that actually works with inet checksumming.

Fixes: d461933638 ("net: dsa: tag_8021q: Create helper function for removing VLAN header")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:19:01 -07:00
Zh-yuan Ye
961d0e5b32 net: cbs: Fix software cbs to consider packet sending time
Currently the software CBS does not consider the packet sending time
when depleting the credits. It caused the throughput to be
Idleslope[kbps] * (Port transmit rate[kbps] / |Sendslope[kbps]|) where
Idleslope * (Port transmit rate / (Idleslope + |Sendslope|)) = Idleslope
is expected. In order to fix the issue above, this patch takes the time
when the packet sending completes into account by moving the anchor time
variable "last" ahead to the send completion time upon transmission and
adding wait when the next dequeue request comes before the send
completion time of the previous packet.
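
The two throughput expressions above can be checked numerically. The
sketch below is only an arithmetic illustration, not the qdisc code; the
function names and the sample rates are assumptions, and it relies on the
usual CBS relation |Sendslope| = Port transmit rate - Idleslope.

```c
#include <stdint.h>

static int64_t sketch_abs64(int64_t v)
{
	return v < 0 ? -v : v;
}

/*
 * Throughput in kbps when the time spent sending is ignored:
 * Idleslope * (Port transmit rate / |Sendslope|).
 */
static int64_t sketch_rate_ignoring_send_time(int64_t idleslope,
					      int64_t sendslope,
					      int64_t port_rate)
{
	int64_t abs_send = sketch_abs64(sendslope);

	if (abs_send == 0)
		return 0;	/* avoid dividing by zero */
	return idleslope * port_rate / abs_send;
}

/*
 * Throughput in kbps when the sending time is accounted for:
 * Idleslope * (Port transmit rate / (Idleslope + |Sendslope|)).
 */
static int64_t sketch_rate_with_send_time(int64_t idleslope,
					  int64_t sendslope,
					  int64_t port_rate)
{
	int64_t denom = idleslope + sketch_abs64(sendslope);

	if (denom == 0)
		return 0;	/* avoid dividing by zero */
	return idleslope * port_rate / denom;
}
```

For example, with an idleslope of 20000 kbps on a 100000 kbps port
(sendslope -80000 kbps), the first expression gives 25000 kbps while the
second gives the configured 20000 kbps.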

changelog:
V2->V3:
 - remove unnecessary whitespace cleanup
 - add the checks if port_rate is 0 before division

V1->V2:
 - combine variable "send_completed" into "last"
 - add the comment for estimate of the packet sending

Fixes: 585d763af0 ("net/sched: Introduce Credit Based Shaper (CBS) qdisc")
Signed-off-by: Zh-yuan Ye <ye.zh-yuan@socionext.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 16:14:05 -07:00
Eugene Syromiatnikov
52afa505a0 Input: avoid BIT() macro usage in the serio.h UAPI header
The commit 19ba1eb15a ("Input: psmouse - add a custom serio protocol
to send extra information") introduced usage of the BIT() macro
for SERIO_* flags; this macro is not provided in UAPI headers.
Replace it with the similarly defined _BITUL() macro from
<linux/const.h>.

Fixes: 19ba1eb15a ("Input: psmouse - add a custom serio protocol to send extra information")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Cc: <stable@vger.kernel.org> # v5.0+
Link: https://lore.kernel.org/r/20200324041341.GA32335@asgard.redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-03-24 15:59:34 -07:00
Leon Romanovsky
950bf4f177 RDMA/mlx5: Fix access to wrong pointer while performing flush due to error
The main difference between send and receive SW completions is related to
separate treatment of WQ queue. For receive completions, the initial index
to be flushed is stored in "tail", while for send completions, it is in
deleted "last_poll".

  CPU: 54 PID: 53405 Comm: kworker/u161:0 Kdump: loaded Tainted: G           OE    --------- -t - 4.18.0-147.el8.ppc64le #1
  Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
  NIP:  c000003c7c00a000 LR: c00800000e586af4 CTR: c000003c7c00a000
  REGS: c0000036cc9db940 TRAP: 0400   Tainted: G           OE    --------- -t -  (4.18.0-147.el8.ppc64le)
  MSR:  9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24004488  XER: 20040000
  CFAR: c00800000e586af0 IRQMASK: 0
  GPR00: c00800000e586ab4 c0000036cc9dbbc0 c00800000e5f1a00 c0000037d8433800
  GPR04: c000003895a26800 c0000037293f2000 0000000000000201 0000000000000011
  GPR08: c000003895a26c80 c000003c7c00a000 0000000000000000 c00800000ed30438
  GPR12: c000003c7c00a000 c000003fff684b80 c00000000017c388 c00000396ec4be40
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: c00000000151e498 0000000000000010 c000003895a26848 0000000000000010
  GPR24: 0000000000000010 0000000000010000 c000003895a26800 0000000000000000
  GPR28: 0000000000000010 c0000037d8433800 c000003895a26c80 c000003895a26800
  NIP [c000003c7c00a000] 0xc000003c7c00a000
  LR [c00800000e586af4] __ib_process_cq+0xec/0x1b0 [ib_core]
  Call Trace:
  [c0000036cc9dbbc0] [c00800000e586ab4] __ib_process_cq+0xac/0x1b0 [ib_core] (unreliable)
  [c0000036cc9dbc40] [c00800000e586c88] ib_cq_poll_work+0x40/0xb0 [ib_core]
  [c0000036cc9dbc70] [c000000000171f44] process_one_work+0x2f4/0x5c0
  [c0000036cc9dbd10] [c000000000172a0c] worker_thread+0xcc/0x760
  [c0000036cc9dbdc0] [c00000000017c52c] kthread+0x1ac/0x1c0
  [c0000036cc9dbe30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80

Fixes: 8e3b688301 ("RDMA/mlx5: Delete unreachable handle_atomic code by simplifying SW completion")
Link: https://lore.kernel.org/r/20200318091640.44069-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 19:54:57 -03:00
Mike Marciniszyn
2d47fbacf2 RDMA/core: Ensure security pkey modify is not lost
The following modify sequence (loosely based on ipoib) will lose a pkey
modification:

- Modify (pkey index, port)
- Modify (new pkey index, NO port)

After the first modify, the qp_pps list will have saved the pkey and the
unit on the main list.

During the second modify, get_new_pps() will fetch the port from qp_pps
and read the new pkey index from qp_attr->pkey_index.  The state will
still be zero, or IB_PORT_PKEY_NOT_VALID. Because of the invalid state,
the new values will never replace the one in the qp pps list, losing the
new pkey.

This happens because the following if statements will never correct the
state because the first term will be false. If the code had been executed,
it would incorrectly overwrite valid values.

  if ((qp_attr_mask & IB_QP_PKEY_INDEX) && (qp_attr_mask & IB_QP_PORT))
	  new_pps->main.state = IB_PORT_PKEY_VALID;

  if (!(qp_attr_mask & (IB_QP_PKEY_INDEX | IB_QP_PORT)) && qp_pps) {
	  new_pps->main.port_num = qp_pps->main.port_num;
	  new_pps->main.pkey_index = qp_pps->main.pkey_index;
	  if (qp_pps->main.state != IB_PORT_PKEY_NOT_VALID)
		  new_pps->main.state = IB_PORT_PKEY_VALID;
  }

Fix by joining the two if statements with an or test to see if qp_pps is
non-NULL and in the correct state.
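
One possible reading of the combined condition is sketched below. The
types, names and mask values are simplified stand-ins rather than the
kernel definitions, and only the state decision is modelled.

```c
#include <stddef.h>

/* Stand-ins for the kernel's port/pkey state and attribute mask bits. */
enum sketch_pkey_state {
	SKETCH_PKEY_NOT_VALID = 0,
	SKETCH_PKEY_VALID,
	SKETCH_PKEY_LISTED,
};

#define SKETCH_QP_PKEY_INDEX	(1u << 0)
#define SKETCH_QP_PORT		(1u << 1)

struct sketch_pps {
	enum sketch_pkey_state state;
};

/*
 * The new state is valid either when both pkey index and port are given
 * in this modify, or when a previously saved entry is already valid.
 */
static enum sketch_pkey_state
sketch_new_state(unsigned int mask, const struct sketch_pps *qp_pps)
{
	if (((mask & SKETCH_QP_PKEY_INDEX) && (mask & SKETCH_QP_PORT)) ||
	    (qp_pps && qp_pps->state != SKETCH_PKEY_NOT_VALID))
		return SKETCH_PKEY_VALID;

	return SKETCH_PKEY_NOT_VALID;
}
```

With this shape, a second modify that supplies only a new pkey index still
ends up in the valid state as long as the saved entry was valid, so the new
pkey is not dropped.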

Fixes: 1dd017882e ("RDMA/core: Fix protection fault in get_pkey_idx_qp_list")
Link: https://lore.kernel.org/r/20200313124704.14982.55907.stgit@awfm-01.aw.intel.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 19:53:25 -03:00
Leon Romanovsky
1fa7077874 MAINTAINERS: Clean RXE section and add Zhu as RXE maintainer
Zhu Yanjun contributed many patches to RXE and expressed genuine interest
in improving RXE even more. Let's add him as a maintainer.

Link: https://lore.kernel.org/r/20200312083658.29603-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 19:52:17 -03:00
Andrew Duggan
e4ad153ac8 Input: synaptics-rmi4 - set reduced reporting mode only when requested
The previous patch "c5ccf2ad3d33 (Input: synaptics-rmi4 - switch to
reduced reporting mode)" enabled reduced reporting mode unintentionally
on some devices, if the firmware was configured with default Delta X/Y
threshold values. As a result, the performance of some touchpads was
unintentionally degraded.

This patch checks to see that the driver is modifying the delta X/Y
thresholds before modifying the reporting mode.

Signed-off-by: Andrew Duggan <aduggan@synaptics.com>
Fixes: c5ccf2ad3d ("Input: synaptics-rmi4 - switch to reduced reporting mode")
Link: https://lore.kernel.org/r/20200312005549.29922-1-aduggan@synaptics.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-03-24 15:45:18 -07:00
Yussuf Khalil
1369d0abe4 Input: synaptics - enable RMI on HP Envy 13-ad105ng
This laptop (and perhaps other variants of the same model) reports an
SMBus-capable Synaptics touchpad. Everything (including suspend and
resume) works fine when RMI is enabled via the kernel command line, so
let's add it to the whitelist.

Signed-off-by: Yussuf Khalil <dev@pp3345.net>
Link: https://lore.kernel.org/r/20200307213508.267187-1-dev@pp3345.net
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-03-24 15:23:52 -07:00
Aya Levin
187a9830c9 net/mlx5e: Do not recover from a non-fatal syndrome
For non-fatal syndromes like LOCAL_LENGTH_ERR, recovery shouldn't be
triggered. In these scenarios, the RQ is not actually in ERR state.
This misleads the recovery flow which assumes that the RQ is really in
error state and no more completions arrive, causing crashes on bad page
state.

Fixes: 8276ea1353 ("net/mlx5e: Report and recover from CQE with error on RQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24 14:43:07 -07:00
Aya Levin
e239c6d686 net/mlx5e: Fix ICOSQ recovery flow with Striding RQ
In striding RQ mode, the buffers of an RX WQE are first
prepared and posted to the HW using UMR WQEs via the ICOSQ.
We maintain the state of these in-progress WQEs in the RQ
SW struct.

In the flow of ICOSQ recovery, the corresponding RQ is not
in error state, hence:

- The buffers of the in-progress WQEs must be released
  and the RQ metadata should reflect it.
- Existing RX WQEs in the RQ should not be affected.

For this, wrap the dealloc of the in-progress WQEs in
a function, and use it in the ICOSQ recovery flow
instead of mlx5e_free_rx_descs().

Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24 14:43:05 -07:00
Aya Levin
39369fd536 net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset
When resetting the RQ (moving RQ state from RST to RDY), the driver
resets the WQ's SW metadata.
In striding RQ mode, we maintain a field that reflects the actual
expected WQ head (including in progress WQEs posted to the ICOSQ).
It was mistakenly not reset together with the WQ. Fix this here.

Fixes: 8276ea1353 ("net/mlx5e: Report and recover from CQE with error on RQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24 14:43:02 -07:00
Aya Levin
1de0306c3a net/mlx5e: Enhance ICOSQ WQE info fields
Add number of WQEBBs (WQE's Basic Block) to WQE info struct. Set the
number of WQEBBs on WQE post, and increment the consumer counter (cc)
on completion.

In case of error completions, the cc was mistakenly not incremented,
keeping a gap between cc and pc (producer counter). This caused the
ICOSQ recovery flow from a CQE error to fail, as it timed out waiting
for the cc and pc to meet.

Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24 14:43:00 -07:00
Leon Romanovsky
306f354c67 net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure
The cap_mask1 isn't protected by field_select and not listed among RW
fields, but it is required to be written to properly initialize ports
in IB virtualization mode.

Link: https://lore.kernel.org/linux-rdma/88bab94d2fd72f3145835b4518bc63dda587add6.camel@redhat.com
Fixes: ab118da4c1 ("net/mlx5: Don't write read-only fields in MODIFY_HCA_VPORT_CONTEXT command")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-24 14:42:58 -07:00
Kai-Heng Feng
d944b27df1 i2c: nvidia-gpu: Handle timeout correctly in gpu_i2c_check_status()
An Nvidia card may come with a "phantom" UCSI device whose driver gets
stuck in its probe routine, which prevents system PM operations such as
suspend.

There is an unaccounted-for case in gpu_i2c_check_status() where the
target time can be equal to jiffies. Solve that by using
readl_poll_timeout() instead of jiffies comparison functions.
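
The equality case can be illustrated in user space. The helpers below are
stand-ins modelled on the kernel's wraparound-safe time comparisons, with
hypothetical names; the driver's actual loop is not shown here, so this
only demonstrates that two strict comparisons leave a gap at equality.

```c
/* True when a is strictly later than b, tolerating counter wraparound. */
static int sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* True when a is strictly earlier than b. */
static int sketch_time_before(unsigned long a, unsigned long b)
{
	return sketch_time_after(b, a);
}

/* "Keep polling" test: the deadline is still strictly in the future. */
static int sketch_deadline_pending(unsigned long now, unsigned long target)
{
	return sketch_time_before(now, target);
}

/* "Timed out" test using a strict comparison: misses now == target. */
static int sketch_deadline_passed(unsigned long now, unsigned long target)
{
	return sketch_time_after(now, target);
}

/* Inclusive variant: treats now == target as expired. */
static int sketch_deadline_reached(unsigned long now, unsigned long target)
{
	return !sketch_time_before(now, target);
}
```

When now equals target, the deadline is neither pending nor passed under
the strict tests, so code that stops polling on the first and reports a
timeout on the second would silently fall through.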

Fixes: c71bcdcb42 ("i2c: add i2c bus driver for NVIDIA GPU")
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Ajay Gupta <ajayg@nvidia.com>
Tested-by: Ajay Gupta <ajayg@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2020-03-24 22:40:55 +01:00
Florian Westphal
a64d558d8c selftests: netfilter: add nfqueue test case
Add a test case to check nf queue infrastructure.
Could be extended in the future to also cover serialization of
conntrack, uid and secctx attributes in nfqueue.

For now, this checks that 'queue bypass' works, that a queue rule with
no bypass option blocks traffic and that userspace receives the expected
number of packets.
For this we add two queues and hook all of
prerouting/input/forward/output/postrouting.

Packets get queued twice with a dummy base chain in between:
This passes with current nf tree, but reverting
commit 946c0d8e6e ("netfilter: nf_queue: fix reinject verdict handling")
makes this trip (it processes 30 instead of expected 20 packets).

v2: update config file with queue and other options missing/needed for
other tests.
v3: also test with tcp, this reveals problem with commit
28f8bfd1ac ("netfilter: Support iif matches in POSTROUTING"), due to
skb->dev pointing at another skb in the retransmit rbtree (skb->dev
aliases to rbnode child).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 20:00:12 +01:00
Pablo Neira Ayuso
bcfabee1af netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
Set skb->tc_redirected to 1, otherwise the ifb driver drops the packet.
Set skb->tc_from_ingress to 1 to reinject the packet back to the ingress
path after leaving the ifb egress path.

This patch unconditionally sets these two skb fields that are
meaningful to the ifb driver. The existing forward action is guaranteed
to run from ingress path.

Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:39 +01:00
Pablo Neira Ayuso
76a109fac2 netfilter: nft_fwd_netdev: validate family and chain type
Make sure the forward action is only used from ingress.

Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:38 +01:00
Stefano Brivio
7c84d41416 netfilter: nft_set_rbtree: Detect partial overlaps on insertion
...and return -ENOTEMPTY to the front-end in this case, instead of
proceeding. Currently, nft takes care of checking for these cases
and not sending them to the kernel, but if we drop the set_overlap()
call in nft we can end up in situations like:

 # nft add table t
 # nft add set t s '{ type inet_service ; flags interval ; }'
 # nft add element t s '{ 1 - 5 }'
 # nft add element t s '{ 6 - 10 }'
 # nft add element t s '{ 4 - 7 }'
 # nft list set t s
 table ip t {
 	set s {
 		type inet_service
 		flags interval
 		elements = { 1-3, 4-5, 6-7 }
 	}
 }

The primary purpose of this change is to make the behaviour
consistent with nft_set_pipapo, but it also avoids inconsistent
behaviour if userspace sends overlapping elements for any reason.
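
The check described above can be illustrated with a small user-space
model. This is only a sketch, not the kernel's rbtree code: the function
name classify_insert() and the list-of-tuples representation are
hypothetical, intervals are treated as closed ranges, and any
non-identical intersection is assumed to count as a partial overlap.

```python
import errno


def classify_insert(existing, new):
    """Classify the insertion of a closed interval `new` = (start, end).

    Returns 0 if the interval does not touch any existing one,
    -EEXIST if an identical interval is already present, and
    -ENOTEMPTY if it intersects an existing interval without being
    identical to it (assumed here to be a "partial overlap").
    """
    new_start, new_end = new
    for start, end in existing:
        if (start, end) == (new_start, new_end):
            return -errno.EEXIST
        # Two closed intervals intersect if each starts before the other ends.
        if new_start <= end and start <= new_end:
            return -errno.ENOTEMPTY
    return 0


# The scenario from the nft session above: 1-5 and 6-10 exist, 4-7 is added.
intervals = [(1, 5), (6, 10)]
print(classify_insert(intervals, (4, 7)) == -errno.ENOTEMPTY)
```

With a check of this kind in place, the third "nft add element" in the
session above would be rejected instead of silently splitting the
existing intervals.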

v2: When we meet the same key data in the tree, as start element while
    inserting an end element, or as end element while inserting a start
    element, actually check that the existing element is active, before
    resetting the overlap flag (Pablo Neira Ayuso)

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:37 +01:00
Stefano Brivio
6f7c9caf01 netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start()
Replace negations of nft_rbtree_interval_end() with a new helper,
nft_rbtree_interval_start(), wherever this helps to visualise the
problem at hand, that is, for all the occurrences except for the
comparison against given flags in __nft_rbtree_get().

This gets especially useful in the next patch.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:30 +01:00
Stefano Brivio
0eb4b5ee33 netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion
...and return -ENOTEMPTY to the front-end on collision, -EEXIST if
an identical element already exists. Together with the previous patch,
element collision will now be returned to the user as -EEXIST.

Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:59:08 +01:00
Pablo Neira Ayuso
8c2d45b2b6 netfilter: nf_tables: Allow set back-ends to report partial overlaps on insertion
Currently, the -EEXIST return code of ->insert() callbacks is ambiguous: it
might indicate that a given element (including intervals) already exists as
such, or that the new element would clash with existing ones.

If identical elements already exist, the front-end ignores this without
returning an error if NLM_F_EXCL is not set. However, if the new element
can't be inserted due to an overlap, we should report this to the user.

To this end, allow set back-ends to return -ENOTEMPTY on collision with
existing elements, translate that to -EEXIST, and return that to userspace,
no matter if NLM_F_EXCL was set.
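
The translation rule described above can be summarised in a few lines.
This is a user-space sketch of the decision only, not the nf_tables
front-end: the function name frontend_result() and its boolean
nlm_f_excl parameter are hypothetical.

```python
import errno


def frontend_result(backend_err, nlm_f_excl):
    """Map a set back-end ->insert() result to what userspace would see.

    backend_err: 0 on success or a negative errno from the back-end.
    nlm_f_excl:  True if the request carried NLM_F_EXCL.
    """
    if backend_err == -errno.ENOTEMPTY:
        # Collision with existing elements: always reported, as -EEXIST.
        return -errno.EEXIST
    if backend_err == -errno.EEXIST and not nlm_f_excl:
        # Identical element already present: silently accepted.
        return 0
    return backend_err


print(frontend_result(-errno.ENOTEMPTY, nlm_f_excl=False) == -errno.EEXIST)
```

The point of the split is that -EEXIST from a back-end now always means
"identical element", so the front-end can safely ignore it when
NLM_F_EXCL is not set.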

Reported-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-24 19:58:57 +01:00
Thomas Hellstrom (VMware)
9431042dbc drm/vmwgfx: Hook up the helpers to align buffer objects
Start using the helpers that align buffer object user-space addresses and
buffer object vram addresses to huge page boundaries.
This is to improve the chances of allowing huge page-table entries.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
2020-03-24 18:50:35 +01:00
Thomas Hellstrom (VMware)
7546f7ffdb drm/vmwgfx: Introduce a huge page aligning TTM range manager
Using huge page-table entries requires that the physical address of the
start of a buffer object is huge page size aligned.
Make a special version of the TTM range manager that accomplishes this,
but falls back to a smaller page size alignment (PUD->PMD, PMD->NORMAL)
to avoid eviction.
If other drivers want to use it in the future, it can be made a
TTM generic helper. Note that drivers can force eviction for a certain
alignment by assigning the TTM GPU alignment correspondingly.
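
The fallback strategy described above can be sketched as follows. This
is a simplified user-space model, not the vmwgfx range manager: the
function names, the list of free ranges, and the page sizes (4 KiB,
2 MiB, 1 GiB, as on x86-64) are assumptions made for illustration.

```python
PAGE_SIZE = 4 << 10   # assumed normal page size
PMD_SIZE = 2 << 20    # assumed PMD-level huge page size
PUD_SIZE = 1 << 30    # assumed PUD-level huge page size


def align_up(value, alignment):
    """Round value up to the next multiple of a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


def alloc_aligned(free_ranges, size):
    """Find a start address for `size` bytes in half-open free ranges.

    Tries the largest alignment the size could benefit from first and
    falls back to smaller ones (PUD -> PMD -> normal page) rather than
    failing, which in the real manager would mean evicting something.
    Returns (start, alignment_used) or None if nothing fits at all.
    """
    if size >= PUD_SIZE:
        candidates = [PUD_SIZE, PMD_SIZE, PAGE_SIZE]
    elif size >= PMD_SIZE:
        candidates = [PMD_SIZE, PAGE_SIZE]
    else:
        candidates = [PAGE_SIZE]

    for alignment in candidates:
        for start, end in free_ranges:
            base = align_up(start, alignment)
            if base + size <= end:
                return base, alignment
    return None


# A 2 MiB object in a hole that is too small to hold an aligned copy
# still gets placed, just with normal page alignment.
print(alloc_aligned([(PAGE_SIZE, PAGE_SIZE + PMD_SIZE)], PMD_SIZE))
```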

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
2020-03-24 18:50:12 +01:00
Thomas Hellstrom (VMware)
b182341667 drm: Add a drm_get_unmapped_area() helper
Unaligned virtual addresses make it unlikely that huge page-table entries
can be used.
So align the huge page boundaries of virtual buffer object addresses to
the huge page boundaries of the underlying physical addresses, taking
buffer object sizes into account to determine when it might be possible
to use huge page-table entries.
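
The alignment rule can be illustrated with a little address arithmetic.
This is a sketch of the idea only, not the drm_get_unmapped_area()
implementation: the function name, the hint-based interface, and the
2 MiB huge page size are assumptions.

```python
HUGE_PMD_SIZE = 2 << 20  # assumed PMD-level huge page size


def pick_virtual_address(hint, paddr, size, huge_size=HUGE_PMD_SIZE):
    """Pick a virtual address at or above `hint` for a buffer object.

    If the object is large enough to contain a huge page, the result has
    the same offset within a huge page as the physical address `paddr`,
    so that virtual and physical huge page boundaries coincide.
    Smaller objects cannot use huge entries, so the hint is kept as-is.
    """
    if size < huge_size:
        return hint
    # Distance from the hint to the next address whose offset within a
    # huge page matches that of paddr (0 if it already matches).
    delta = (paddr - hint) % huge_size
    return hint + delta


vaddr = pick_virtual_address(0x7F0000001000, 0x40200000, 4 << 20)
print(hex(vaddr), vaddr % HUGE_PMD_SIZE == 0x40200000 % HUGE_PMD_SIZE)
```

A caller would need to reserve up to one extra huge page of address
space so that the shifted start still leaves room for the whole object.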

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
2020-03-24 18:49:26 +01:00
Thomas Hellstrom (VMware)
75390281ab drm/vmwgfx: Support huge page faults
With vmwgfx dirty-tracking we need a specialized huge_fault
callback. Implement and hook it up.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
2020-03-24 18:48:55 +01:00
Thomas Hellstrom (VMware)
314b6580ad drm/ttm, drm/vmwgfx: Support huge TTM pagefaults
Support huge (PMD-size and PUD-size) page-table entries by providing a
huge_fault() callback.
We still support private mappings and write-notify by splitting the huge
page-table entries on write-access.

Note that for huge page-faults to occur, either the kernel needs to be
compiled with trans-huge-pages always enabled, or the kernel needs to be
compiled with trans-huge-pages enabled using madvise, and the user-space
app needs to call madvise() to enable trans-huge pages on a per-mapping
basis.

Furthermore huge page-faults will not succeed unless buffer objects and
user-space addresses are aligned on huge page size boundaries.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2020-03-24 18:48:33 +01:00
Thomas Hellstrom (VMware)
9a9731b18c mm: Add vmf_insert_pfn_xxx_prot() for huge page-table entries
For graphics drivers needing to modify the page-protection, add
huge page-table entries counterparts to vmf_insert_pfn_prot().

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
2020-03-24 18:48:09 +01:00
Thomas Hellstrom (VMware)
327e9fd489 mm: Split huge pages on write-notify or COW
The functions wp_huge_pmd() and wp_huge_pud() currently rely on the
huge_fault() callback to split huge page table entries if needed.
However, for module users that requires exporting the split_huge_xxx()
functionality, which may be undesired. Instead, split pre-existing huge
page-table entries on VM_FAULT_FALLBACK return.

We currently only do COW and write-notify on the PTE level, so if the
huge_fault() handler returns VM_FAULT_FALLBACK on wp faults,
split the huge pages and page-table entries. Also do this for huge PUDs
if there is no huge_fault() handler and the vma is not anonymous, similar
to how it's done for PMDs.

Note that fs/dax.c still does the splitting in the huge_fault() handler.
A follow-up patch can remove the dax.c split_huge_pmd() if needed.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
2020-03-24 18:47:47 +01:00
Thomas Hellstrom (VMware)
2484ca9b6a mm: Introduce vma_is_special_huge
For VM_PFNMAP and VM_MIXEDMAP vmas that want to support transhuge pages
and -page table entries, introduce vma_is_special_huge() that takes the
same codepaths as vma_is_dax().

The use of "special" follows the definition in memory.c, vm_normal_page():
"Special" mappings do not wish to be associated with a "struct page"
(either it doesn't exist, or it exists but they don't want to touch it)

For PAGE_SIZE pages, "special" is determined per page table entry to be
able to deal with COW pages. But since we don't have huge COW pages,
we can classify a vma as either "special huge" or "normal huge".

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
2020-03-24 18:47:17 +01:00
Thomas Hellstrom (VMware)
f05a3849f6 fs: Constify vma argument to vma_is_dax
The function is used by upcoming vma_is_special_huge() with which we want
to use a const vma argument. Since for vma_is_dax() the vma argument is
only dereferenced for reading, constify it.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Thomas Hellstrom (VMware) <thomas_os@shipmail.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
2020-03-24 18:46:48 +01:00
Linus Torvalds
76ccd23426 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling fixes from Ingo Molnar:
 "A handful of tooling fixes all across the map, no kernel changes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools headers uapi: Update linux/in.h copy
  perf probe: Do not depend on dwfl_module_addrsym()
  perf probe: Fix to delete multiple probe event
  perf parse-events: Fix reading of invalid memory in event parsing
  perf python: Fix clang detection when using CC=clang-version
  perf map: Fix off by one in strncpy() size argument
  tools: Let O= makes handle a relative path with -C option
2020-03-24 10:03:32 -07:00
Linus Torvalds
3f3ee43a46 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "A build fix with certain Kconfig combinations"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioremap: Fix CONFIG_EFI=n build
2020-03-24 09:57:46 -07:00
Linus Torvalds
c6ac7188c1 dmaengine-fix-5.6
Late fixes in dmaengine for v5.6:
   - move .device_release missing log warning to debug
   - couple of maintainer entries for HiSilicon and IADX drivers
   - one off fix for idxd driver
   - documentation warn fixes
   - TI k3 dma error handling fix

Merge tag 'dmaengine-fix-5.6' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "Late fixes in dmaengine for v5.6:

   - move .device_release missing log warning to debug

   - couple of maintainer entries for HiSilicon and IADX drivers

   - off-by-one fix for idxd driver

   - documentation warning fixes

   - TI k3 dma error handling fix"

* tag 'dmaengine-fix-5.6' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: ti: k3-udma-glue: Fix an error handling path in 'k3_udma_glue_cfg_rx_flow()'
  MAINTAINERS: Add maintainer for HiSilicon DMA engine driver
  dmaengine: idxd: fix off by one on cdev dwq refcount
  MAINTAINERS: rectify the INTEL IADX DRIVER entry
  dmaengine: move .device_release missing log warning to debug level
  docs: dmaengine: provider.rst: get rid of some warnings
2020-03-24 09:53:12 -07:00
Wanpeng Li
94be4b85d8 KVM: LAPIC: Also cancel preemption timer when disarm LAPIC timer
The timer is disarmed when switching between TSC deadline and other modes,
so we should set everything to the disarmed state. However, the LAPIC timer
can be emulated by the preemption timer, which still works if
vmx->hv_deadline_timer is not -1. This patch also cancels the preemption
timer when disarming the LAPIC timer.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1585031530-19823-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-24 07:25:20 -04:00