Commit Graph

617943 Commits

Author SHA1 Message Date
Dave Airlie
9929c09767 Merge tag 'drm-intel-fixes-2016-09-15' of git://anongit.freedesktop.org/drm-intel into drm-fixes
i915 fixes from Jani.

* tag 'drm-intel-fixes-2016-09-15' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Ignore OpRegion panel type except on select machines
  Revert "drm/i915/psr: Make idle_frames sensible again"
  drm/i915: Restore lost "Initialized i915" welcome message
2016-09-17 07:57:21 +10:00
Linus Torvalds
dd5a477c7f Round three of 4.8 rc fixes
- Various fixes to rdmavt, ipoib, mlx5, mlx4, rxe
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJX3DdRAAoJELgmozMOVy/d2k4QAJz0HbJvS8uN/ny6zaIsIa74
 08pvzHWpPkbJ6JGyxySToxHx7gs+MMvsvovUM+QPQS4jt6ZdHY1vOUDztG7GVXZC
 SsC8kYX8o0P2zhiDeMi/9LoBjH5bLgS3L5lfwke0jgWXCU6Cdgm5InnZ8XuoBZr6
 zNQ/Zcg8epe92IhqJ9abqMveni4FstXzlj9PlhaeCkUadFarpypG2yTdcvmq7m6i
 aXvGDVWgaVTB0CyaJtXIK3g/lmW4Ay3z5RpIjPbdZTd2j46c8Z4yKrhpHuK2fChb
 4xPSEoMdBTO/FI/M0Mf6FKEtGv4bxFcfwpjw5fuWL3sk+hWVWA0yqil3fCJ4vAy5
 klUdRLE187hd0MBj2Eq8xLeblfuqAmiuWjPJ59npcspPFUaXgvw8jolxjzxE2HVM
 whAVnb5fEVu1nQ8ePfkDPNJ1osFmFwYObYiYqql258U5jBU/QXwohMQihSeIhR04
 yRyzY1ob+WCGqp/MKkkAAZvjSUGdPzuky5YHCymmoinJKXvf9eJ7LdQ1l2jFMoa6
 ZpZNppSya9v2pLhGf8MhO2DXyejhCnPgn1JS7OhoTCHbSPR5zDD8oucgW0+gmUpd
 1QAzEL2HSojvLrDJJpJOTHhL8TQPLEVnnILMq5jRGy3y+lUJ1iGgw3qBLxAhZZRZ
 r7omK4iOuut0P+Rzvgqg
 =HC3X
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
 "Round three of 4.8 rc fixes.

  This is likely the last rdma pull request this cycle.  The new rxe
  driver had a few issues (you probably saw the boot bot bug report) and
  they should be addressed now.  There are a couple other fixes here,
  mainly mlx4.  There are still two outstanding issues that need
  resolved but I don't think their fix will make this kernel cycle.

  Summary:

   - Various fixes to rdmavt, ipoib, mlx5, mlx4, rxe"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/rdmavt: Don't vfree a kzalloc'ed memory region
  IB/rxe: Fix kmem_cache leak
  IB/rxe: Fix race condition between requester and completer
  IB/rxe: Fix duplicate atomic request handling
  IB/rxe: Fix kernel panic in udp_setup_tunnel
  IB/mlx5: Set source mac address in FTE
  IB/mlx5: Enable MAD_IFC commands for IB ports only
  IB/mlx4: Diagnostic HW counters are not supported in slave mode
  IB/mlx4: Use correct subnet-prefix in QP1 mads under SR-IOV
  IB/mlx4: Fix code indentation in QP1 MAD flow
  IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV
  IB/ipoib: Don't allow MC joins during light MC flush
  IB/rxe: fix GFP_KERNEL in spinlock context
2016-09-16 13:51:42 -07:00
Linus Torvalds
008f08d64a ARM: SoC fixes
Here are a couple of bugfixes for v4.8-rc. Most of them have
 actually been around for a while this time but for some reason
 didn't get applied early on. The shmobile regulator fix is the
 only one that isn't completely obvious.
 
 device tree changes:
 - archtimer interrupts must be level triggered (multiple platforms)
 - fix for USB and MMC clocks on STiH410
 - fix split DT repository in case of raspberry-pi 3
 - A new use of skeleton.dtsi on arm64 has crept in after that
   was removed.
 
 defconfig updates:
 - xilinx vdma has a new Kconfig symbol name
 - keystone requires CONFIG_NOP_USB_XCEIV since v4.8-rc1
 
 code fixes:
 - fix regulator quirk on shmobile
 - suspend-to-ram regression on EXYNOS
 
 maintainer updates:
 - Javier Martinez Canillas is now a reviewer for Samsung EXYNOS
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAV9wHNGCrR//JCVInAQKPBhAA4HWz5YoE1FwatmrZ7LyLgl+SD7ezDuGC
 w/oo01kGYSq9vN8E7rTWqTW/lylTgt7adX3E6wNPIIfVg9dx9TnJ0HofH3TjHku4
 K7HeqapNqqqWh3VF8xFZO6YkKi09uhsX+j8NOAGlhd50A4OrOA1xh1NtpIakLX7z
 TYBpbjW1TB3SwNiq7CbC1PJUKzTfP49hQmf6dUdKajLZ2Wova4H0bonyo45DhanZ
 JiZyZlR9NnieVcftAP+kGFskM8q2hyZPqtoCar/mWrmerWMUG3n1MWj9LyDTVsVc
 zd7wBcX9dLOe26qGW88MUnbUBC/R2nZ+VDzMwyoYoIHlHALDcn2ldDotLDVIRp6A
 xGMejt06Q2qG8zX4SCjyq9hu2LeyBRWHkRTaoAD2tsT5SD4KNMi3GeYnq9Su+iYw
 hXrCOrua1pMDhWsU5RMGrfPXKbZSkkcvvt1MAoUn5h7xTqLQ1+PKLIUVOPnAR6Ns
 lHR86oo1kAoXDPbKZRnMbHSQ76kW+nWF+vDOJ7ozXNwZtcmXZiqfKxs/RDVecqFL
 kJMPJBPUGW5FAakarLb68f8XVsiHQr3ujofTyA77cUACqLBidbhxbfq+5NMWyck/
 zXPLk4HEGBlg9v8g17g1MxdttS+Na9pzNVfE23CuGKc147QIh1M3DeLjoIZ9gSfH
 p8cxVpe5gBs=
 =tYAb
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Here are a couple of bugfixes for v4.8-rc.

  Most of them have actually been around for a while this time but for
  some reason didn't get applied early on.  The shmobile regulator fix
  is the only one that isn't completely obvious.

  Device tree changes:
   - archtimer interrupts must be level triggered (multiple platforms)
   - fix for USB and MMC clocks on STiH410
   - fix split DT repository in case of raspberry-pi 3
   - a new use of skeleton.dtsi on arm64 has crept in after that was
     removed.

  defconfig updates:
   - xilinx vdma has a new Kconfig symbol name
   - keystone requires CONFIG_NOP_USB_XCEIV since v4.8-rc1

  Code fixes:
   - fix regulator quirk on shmobile
   - suspend-to-ram regression on EXYNOS

  Maintainer updates:
   - Javier Martinez Canillas is now a reviewer for Samsung EXYNOS"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: keystone: defconfig: Fix USB configuration
  arm64: dts: Fix broken architected timer interrupt trigger
  ARM: multi_v7_defconfig: update XILINX_VDMA
  ARM64: dts: bcm: Use a symlink to R-Pi dtsi files from arch=arm
  ARM: dts: Remove use of skeleton.dtsi from bcm283x.dtsi
  ARM: dts: STiH407-family: Provide interconnect clock for consumption in ST SDHCI
  ARM: dts: STiH410: Handle interconnect clock required by EHCI/OHCI (USB)
  ARM: shmobile: fix regulator quirk for Gen2
  ARM: EXYNOS: Clear OF_POPULATED flag from PMU node in IRQ init callback
  MAINTAINERS: Add myself as reviewer for Samsung Exynos support
2016-09-16 12:15:41 -07:00
Linus Torvalds
cac4662a88 Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Most of this update are fixes primarily discovered from testing on the
  older StrongARM 1110 and PXA systems, as a result of recent interest
  from several people in these platforms:

   - Locomo interrupt handling incorrectly stores the handler data in
     the chip's private data slot: when Locomo is combined with an
     interrupt controller who's chip uses the chip private data, this
     leads to an oops.

   - SA1111 was missing a call to clk_disable() to clean up after a
     failed probe.

   - SA1111 and PCMCIA suspend/resume was broken:

     The PCMCIA "ds" layer was using the legacy bus suspend/resume
     methods, which the core PM code is no longer calling as a result of
     device_pm_check_callbacks() introduced in commit aa8e54b559
     ("PM / sleep: Go direct_complete if driver has no callbacks").

     SA1111 was broken due to changes to PCMCIA which makes PCMCIA
     suspend itself later than the SA1111 code expects, and resume
     before the SA1111 code has initialised access to the pcmcia
     sub-device.

   - the default SA1111 interrupt mask polarity got messed up when it
     was converted to use a dynamic interrupt base number for its
     interrupts.

   - fix platform_get_irq() error code propagation, which was causing
     problems on platforms where the interrupt may not be available at
      probe time in DT setups.

   - fix the lack of clock to PCMCIA code on PXA platforms, which was
     omitted in conversions of PXA to CCF.

   - fix an oops in the PXA PCMCIA code caused by a previous commit not
     realising that Lubbock is different from the rest of the PXA PCMCIA
     drivers.

   - ensure that SA1111 low-level PCMCIA drivers propagate their error
     codes to the main probe function, rather than the driver silently
     accepting a failure.

   - fix the sa11xx debugfs reporting of timing information, which
     always indicated zero due to the clock being a factor of 1000 out.

   - fix the polarity of the status change signal reported from the
     sockets.

  Lastly, one ARM specific commit from Stefan Agner fixing the LPAE
  cache attributes"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: pxa/lubbock: add pcmcia clock
  ARM: locomo: fix locomo irq handling
  ARM: 8612/1: LPAE: initialize cache policy correctly
  ARM: sa1111: fix missing clk_disable()
  ARM: sa1111: fix pcmcia suspend/resume
  ARM: sa1111: fix pcmcia interrupt mask polarity
  ARM: sa1111: fix error code propagation in sa1111_probe()
  pcmcia: lubbock: fix sockets configuration
  pcmcia: sa1111: fix propagation of lowlevel board init return code
  pcmcia: soc_common: fix SS_STSCHG polarity
  pcmcia: sa11xx_base: add units to the timing information
  pcmcia: sa11xx_base: fix reporting of timing information
  pcmcia: ds: fix suspend/resume
2016-09-16 12:08:13 -07:00
Colin Ian King
e4618d40eb IB/rdmavt: Don't vfree a kzalloc'ed memory region
The userspace memory region 'mr' is allocated with kzalloc in
__rvt_alloc_mr  however it is incorrectly being freed with vfree in
__rvt_free_mr. Fix this by using kfree to free it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:23 -04:00
Yonatan Cohen
c1cc72cb6f IB/rxe: Fix kmem_cache leak
Decrement qp reference when handling error path
in completer to prevent kmem_cache leak.

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Yonatan Cohen
3050b99850 IB/rxe: Fix race condition between requester and completer
rxe_requester() is sending a pkt with rxe_xmit_packet() and
then calls rxe_update() to update the wqe and qp's psn values.
But sometimes the response is received before the requester
had time to update the wqe in which case the completer
acts on errornous wqe values.
This fix updates the wqe and qp before actually sending
the request and rolls back when xmit fails.

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Yonatan Cohen
908948877b IB/rxe: Fix duplicate atomic request handling
When handling ack for atomic opcodes like "fetch&add"
or "cmp&swp", the method send_atomic_ack() saves the ack
before sending it, in case it gets lost and never reach the
requester. In which case the method duplicate_request()
will need to find it using the duplicated request.psn.
But send_atomic_ack() used a wrong psn value and thus
the above ack was never found.
This fix uses the ack.psn to locate the ack in case
its needed.
This fix also copies the ack packet to the skb's control buffer
since duplicate_request() will need it when calling rxe_xmit_packet()

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Yonatan Cohen
dfdd6158ca IB/rxe: Fix kernel panic in udp_setup_tunnel
Disable creation of a UDP socket for ipv6 when
CONFIG_IPV6 is not enabeld. Since udp_sock_create6()
returns 0 when CONFIG_IPV6 is not set

[   46.888632] IP: [<c220705a>] setup_udp_tunnel_sock+0x6/0x4f
[   46.891355] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[   46.893918] Oops: 0002 [#1] PREEMPT
[   46.896014] CPU: 0 PID: 1 Comm: swapper Not tainted 4.7.0-rc4-00001-g8700e3e #1
[   46.900280] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[   46.904905] task: cf06c040 ti: cf05e000 task.ti: cf05e000
[   46.907854] EIP: 0060:[<c220705a>] EFLAGS: 00210246 CPU: 0
[   46.911137] EIP is at setup_udp_tunnel_sock+0x6/0x4f
[   46.914070] EAX: 00000044 EBX: 00000001 ECX: cf05fef0 EDX: ca8142e0
[   46.917236] ESI: c2c4505b EDI: cf05fef0 EBP: cf05fed0 ESP: cf05fed0
[   46.919836]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[   46.922046] CR0: 80050033 CR2: 000001fc CR3: 02cec000 CR4: 000006b0
[   46.924550] Stack:
[   46.926014]  cf05ff10 c1fd4657 ca8142e0 0000000a 00000000 00000000 0000b712 00000008
[   46.931274]  00000000 6bb5bd01 c1fd48de 00000000 00000000 cf05ff1c 00000000 00000000
[   46.936122]  cf05ff1c c1fd4bdf 00000000 cf05ff28 c2c4507b ffffffff cf05ff88 c2bf1c74
[   46.942350] Call Trace:
[   46.944403]  [<c1fd4657>] rxe_setup_udp_tunnel+0x8f/0x99
[   46.947689]  [<c1fd48de>] ? net_to_rxe+0x4e/0x4e
[   46.950567]  [<c1fd4bdf>] rxe_net_init+0xe/0xa4
[   46.953147]  [<c2c4507b>] rxe_module_init+0x20/0x4c
[   46.955448]  [<c2bf1c74>] do_one_initcall+0x89/0x113
[   46.957797]  [<c2bf15eb>] ? set_debug_rodata+0xf/0xf
[   46.959966]  [<c2bf1dbc>] ? kernel_init_freeable+0xbe/0x15b
[   46.962262]  [<c2bf1ddc>] kernel_init_freeable+0xde/0x15b
[   46.964418]  [<c232eb54>] kernel_init+0x8/0xd0
[   46.966618]  [<c2333122>] ret_from_kernel_thread+0xe/0x24
[   46.969592]  [<c232eb4c>] ? rest_init+0x6f/0x6f

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Maor Gottlieb
ee3da804ad IB/mlx5: Set source mac address in FTE
Set the source mac address in the FTE when L2 specification
is provided.

Fixes: 038d2ef875 ('IB/mlx5: Add flow steering support')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Noa Osherovich
7fae6655a0 IB/mlx5: Enable MAD_IFC commands for IB ports only
MAD_IFC command is supported only for physical functions (PF)
and when physical port is IB. The proposed fix enforces it.

Fixes: d603c809ef ("IB/mlx5: Fix decision on using MAD_IFC")
Reported-by: David Chang <dchang@suse.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Kamal Heib
69d269d389 IB/mlx4: Diagnostic HW counters are not supported in slave mode
Modify the mlx4_ib_diag_counters() to avoid the following error in the
hypervisor when the slave tries to query the hardware counters in SR-IOV
mode.

mlx4_core 0000:81:00.0: Unknown command:0x30 accepted from slave:1

Fixes: 3f85f2aaab ("IB/mlx4: Add diagnostic hardware counters")
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Jack Morgenstein
8ec07bf8a8 IB/mlx4: Use correct subnet-prefix in QP1 mads under SR-IOV
When sending QP1 MAD packets which use a GRH, the source GID
(which consists of the 64-bit subnet prefix, and the 64 bit port GUID)
must be included in the packet GRH.

For SR-IOV, a GID cache is used, since the source GID needs to be the
slave's source GID, and not the Hypervisor's GID. This cache also
included a subnet_prefix. Unfortunately, the subnet_prefix field in
the cache was never initialized (to the default subnet prefix 0xfe80::0).
As a result, this field remained all zeroes.  Therefore, when SR-IOV
was active, all QP1 packets which included a GRH had a source GID
subnet prefix of all-zeroes.

However, the subnet-prefix should initially be 0xfe80::0 (the default
subnet prefix). In addition, if OpenSM modifies a port's subnet prefix,
the new subnet prefix must be used in the GRH when sending QP1 packets.
To fix this we now initialize the subnet prefix in the SR-IOV GID cache
to the default subnet prefix. We update the cached value if/when OpenSM
modifies the port's subnet prefix. We take this cached value when sending
QP1 packets when SR-IOV is active.

Note that the value is stored as an atomic64. This eliminates any need
for locking when the subnet prefix is being updated.

Note also that we depend on the FW generating the "port management change"
event for tracking subnet-prefix changes performed by OpenSM. If running
early FW (before 2.9.4630), subnet prefix changes will not be tracked (but
the default subnet prefix still will be stored in the cache; therefore
users who do not modify the subnet prefix will not have a problem).
IF there is a need for such tracking also for early FW, we will add that
capability in a subsequent patch.

Fixes: 1ffeb2eb8b ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Jack Morgenstein
baa0be7026 IB/mlx4: Fix code indentation in QP1 MAD flow
The indentation in the QP1 GRH flow in procedure build_mlx_header is
really confusing. Fix it, in preparation for a commit which touches
this code.

Fixes: 1ffeb2eb8b ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Alex Vesker
e5ac40cd66 IB/mlx4: Fix incorrect MC join state bit-masking on SR-IOV
Because of an incorrect bit-masking done on the join state bits, when
handling a join request we failed to detect a difference between the
group join state and the request join state when joining as send only
full member (0x8). This caused the MC join request not to be sent.
This issue is relevant only when SRIOV is enabled and SM supports
send only full member.

This fix separates scope bits and join states bits a nibble each.

Fixes: b9c5d6a643 ('IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV')
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Alex Vesker
344bacca8c IB/ipoib: Don't allow MC joins during light MC flush
This fix solves a race between light flush and on the fly joins.
Light flush doesn't set the device to down and unset IPOIB_OPER_UP
flag, this means that if while flushing we have a MC join in progress
and the QP was attached to BC MGID we can have a mismatches when
re-attaching a QP to the BC MGID.

The light flush would set the broadcast group to NULL causing an on
the fly join to rejoin and reattach to the BC MCG as well as adding
the BC MGID to the multicast list. The flush process would later on
remove the BC MGID and detach it from the QP. On the next flush
the BC MGID is present in the multicast list but not found when trying
to detach it because of the previous double attach and single detach.

[18332.714265] ------------[ cut here ]------------
[18332.717775] WARNING: CPU: 6 PID: 3767 at drivers/infiniband/core/verbs.c:280 ib_dealloc_pd+0xff/0x120 [ib_core]
...
[18332.775198] Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
[18332.779411]  0000000000000000 ffff8800b50dfbb0 ffffffff813fed47 0000000000000000
[18332.784960]  0000000000000000 ffff8800b50dfbf0 ffffffff8109add1 0000011832f58300
[18332.790547]  ffff880226a596c0 ffff880032482000 ffff880032482830 ffff880226a59280
[18332.796199] Call Trace:
[18332.798015]  [<ffffffff813fed47>] dump_stack+0x63/0x8c
[18332.801831]  [<ffffffff8109add1>] __warn+0xd1/0xf0
[18332.805403]  [<ffffffff8109aebd>] warn_slowpath_null+0x1d/0x20
[18332.809706]  [<ffffffffa025d90f>] ib_dealloc_pd+0xff/0x120 [ib_core]
[18332.814384]  [<ffffffffa04f3d7c>] ipoib_transport_dev_cleanup+0xfc/0x1d0 [ib_ipoib]
[18332.820031]  [<ffffffffa04ed648>] ipoib_ib_dev_cleanup+0x98/0x110 [ib_ipoib]
[18332.825220]  [<ffffffffa04e62c8>] ipoib_dev_cleanup+0x2d8/0x550 [ib_ipoib]
[18332.830290]  [<ffffffffa04e656f>] ipoib_uninit+0x2f/0x40 [ib_ipoib]
[18332.834911]  [<ffffffff81772a8a>] rollback_registered_many+0x1aa/0x2c0
[18332.839741]  [<ffffffff81772bd1>] rollback_registered+0x31/0x40
[18332.844091]  [<ffffffff81773b18>] unregister_netdevice_queue+0x48/0x80
[18332.848880]  [<ffffffffa04f489b>] ipoib_vlan_delete+0x1fb/0x290 [ib_ipoib]
[18332.853848]  [<ffffffffa04df1cd>] delete_child+0x7d/0xf0 [ib_ipoib]
[18332.858474]  [<ffffffff81520c08>] dev_attr_store+0x18/0x30
[18332.862510]  [<ffffffff8127fe4a>] sysfs_kf_write+0x3a/0x50
[18332.866349]  [<ffffffff8127f4e0>] kernfs_fop_write+0x120/0x170
[18332.870471]  [<ffffffff81207198>] __vfs_write+0x28/0xe0
[18332.874152]  [<ffffffff810e09bf>] ? percpu_down_read+0x1f/0x50
[18332.878274]  [<ffffffff81208062>] vfs_write+0xa2/0x1a0
[18332.881896]  [<ffffffff812093a6>] SyS_write+0x46/0xa0
[18332.885632]  [<ffffffff810039b7>] do_syscall_64+0x57/0xb0
[18332.889709]  [<ffffffff81883321>] entry_SYSCALL64_slow_path+0x25/0x25
[18332.894727] ---[ end trace 09ebbe31f831ef17 ]---

Fixes: ee1e2c82c2 ("IPoIB: Refresh paths instead of flushing them on SM change events")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Alexey Khoroshilov
5e102b3b4f IB/rxe: fix GFP_KERNEL in spinlock context
There is skb_clone(skb, GFP_KERNEL) in spinlock context
in rxe_rcv_mcast_pkt().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-16 14:14:08 -04:00
Greg Kroah-Hartman
e06226e66b USB-serial fixes for v4.8-rc7
Here's another Infineon flashloader device id.
 
 Signed-off-by: Johan Hovold <johan@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJX3AccAAoJEEEN5E/e4bSVu6oP/3af03nHVtRh8q99iHvEeZIB
 YlWarRQ0CGPzV7l7yooFQDc6msc74tufGWRbtPkcjhKSLFNIiOQurKViKGbCkIQC
 ibW7l1Xa0Xmh7qF0AOwQYarlceejWazN31h3z3JIRfeIWV/T+64vQ3Zp4iC1vsVU
 UJvMRU5Ppp686EiXWd83ytcNLngTazvipG12nx7KeKIZkak7b8fLbWRcxsGwaaOv
 NzJIeSEJ5M/Tn9GsvaW8gL8lakxQZFVM4JQRYXr4VSrphNmNnDHHxZOQyH6dztYR
 VkZWRnQasUrBnyL1CF4VBNLmqjl8Xv9F3R+h/IsQPJt7qq3fs68QChGRZUh2MrYs
 VzfJT5fSwA4pEseHtUuF+z6svpOrtBzgx9nMGqYDRtEd0SBWyIli2ohfJBuxbsQr
 wk4ZKQyWI87krXD2fYxKnzVP1//8nP0ACnKpK/LcZaEozOf2pRaXwSqFXrPEJTYr
 IXWbRPlCQ33gK07HFWd6G27AET0TUyPHEcrxIy8QRqSbEy2ZERtkkC8cvZB2pmYc
 2PgGQaykQKkBqwozhpKTlNwhz2ibN0Kw+pxrNarUv2lNC6PSDlqoHZFU+QGaCEyl
 0nboa7vsxFKKF/apAU+lcFZw2hhuB9RNhkiULzyeEj5TKimrfyLqYiFvw0uuBwpt
 MhbtwvKQ/9NRyzziUISz
 =++RH
 -----END PGP SIGNATURE-----

Merge tag 'usb-serial-4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for v4.8-rc7

Here's another Infineon flashloader device id.

Signed-off-by: Johan Hovold <johan@kernel.org>
2016-09-16 17:42:10 +02:00
Arnd Bergmann
6408649115 1. A recent change in populating irqchip devices from Device Tree
broke Suspend to RAM on Exynos boards due to lack of probing of
    PMU (Power Management Unit) driver.  Multiple drivers attach to
    the PMU's DT node: irqchip, clock controller and PMU platform
    driver for handling suspend.  The new irqchip code marked the
    PMU's DT node as OF_POPULATED but we need to attach to this
    node also PMU platform driver.
 
 2. Add Javier as additional reviewer for Exynos patches.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX25oaAAoJEME3ZuaGi4PXDa4P/0KG/BAzf++1/43zP9I3PAhM
 od3QOLDrnR16ly2AL8QbJWq574ZvNa8ssZ5m3QrWgt4JEGVudc97OpByupTrgm51
 xlfceTcsyG95OV1Psm20NJrIK2o1FEDIr4Bdh0/ixjOs/zajLyzUtx0qQ2rHa8mQ
 hzJeJGbk4yu50jsaUDWBH2p9n2EHuCqWfvRroE/fEbGL6G120OkFpiP3AExEycnD
 U3gUMm+QorEWavf9cMtAzJcnFNKDjaT4iStgTAeCrvcJ4ttwX4x1z5PJV9M/zbs4
 XtGoE7mGfGdajhXXnFPPinUB+1MffJdttI6jHEuTWBc/Znfdgr796HnAUYp2bcPp
 WQT/OcnFN0WUeYY4dONBI5/mIuHcj1p5esOttjEvIJNOVdoHevnl5KUhaAajy9DV
 0alKU9DQ49Z5j+e2lxTJ749Lj1+FfnX63EDQik95hrDZc8QKbbIe3F4/OuR6xm4L
 mPUaMSP2OazSpMVNiFoWL/qyB5ukSP0fyLR9cRggUbBUoq8A0WWT0QRMYaaATnQS
 fzEeEoerHIxuqU2LKUBJJ7m3VUl0VbWt7VBU2jI73RQ13IzeH92eMh6ssMjO3k0d
 R2OAU6R8H6oaCb9qsYajzRjMxm4zEalkUqNTXB4GMM7tsGckqibdgRqF/4fI/4D4
 TtUkGN+FCEJOLzAgAntK
 =Ljgr
 -----END PGP SIGNATURE-----

Merge tag 'samsung-fixes-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into fixes

Pull "ARM: exynos: Fixes for v4.8, secound round" from Krzysztof Kozłowski:

1. A recent change in populating irqchip devices from Device Tree
   broke Suspend to RAM on Exynos boards due to lack of probing of
   PMU (Power Management Unit) driver.  Multiple drivers attach to
   the PMU's DT node: irqchip, clock controller and PMU platform
   driver for handling suspend.  The new irqchip code marked the
   PMU's DT node as OF_POPULATED but we need to attach to this
   node also PMU platform driver.

2. Add Javier as additional reviewer for Exynos patches.

* tag 'samsung-fixes-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux:
  ARM: EXYNOS: Clear OF_POPULATED flag from PMU node in IRQ init callback
  MAINTAINERS: Add myself as reviewer for Samsung Exynos support
2016-09-16 16:29:48 +02:00
Alan Stern
08c5cd3748 USB: change bInterval default to 10 ms
Some full-speed mceusb infrared transceivers contain invalid endpoint
descriptors for their interrupt endpoints, with bInterval set to 0.
In the past they have worked out okay with the mceusb driver, because
the driver sets the bInterval field in the descriptor to 1,
overwriting whatever value may have been there before.  However, this
approach was never sanctioned by the USB core, and in fact it does not
work with xHCI controllers, because they use the bInterval value that
was present when the configuration was installed.

Currently usbcore uses 32 ms as the default interval if the value in
the endpoint descriptor is invalid.  It turns out that these IR
transceivers don't work properly unless the interval is set to 10 ms
or below.  To work around this mceusb problem, this patch changes the
endpoint-descriptor parsing routine, making the default interval value
be 10 ms rather than 32 ms.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Wade Berrier <wberrier@gmail.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-16 16:29:41 +02:00
Tony Lindgren
ed3d6d0ac0 usb: musb: Fix tusb6010 compile error on blackfin
We have CONFIG_BLACKFIN ifdef redefining all musb registers in
musb_regs.h and tusb6010.h is never included causing a build
error with blackfin-allmodconfig and COMPILE_TEST.

Let's fix the issue by not building tusb6010 if CONFIG_BLACKFIN
is selected.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-16 16:29:41 +02:00
Josh Poimboeuf
81539169f2 x86/dumpstack: Remove NULL task pointer convention
show_stack_log_lvl() and friends allow a NULL pointer for the
task_struct to indicate the current task.  This creates confusion and
can cause sneaky bugs.

Instead require the caller to pass 'current' directly.

This only changes the internal workings of the dumpstack code.  The
dump_trace() and show_stack() interfaces still allow a NULL task
pointer.  Those interfaces should also probably be fixed as well.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 16:21:39 +02:00
Matt Fleming
080fe0b790 perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2
While the Intel PMU monitors the LLC when perf enables the
HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor
L1 instruction cache fetches (0x0080) and instruction cache misses
(0x0081) on the AMD PMU.

This is extremely confusing when monitoring the same workload across
Intel and AMD machines, since parameters like,

  $ perf stat -e cache-references,cache-misses

measure completely different things.

Instead, make the AMD PMU measure instruction/data cache and TLB fill
requests to the L2 and instruction/data cache and TLB misses in the L2
when HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled,
respectively. That way the events measure unified caches on both
platforms.

Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 16:19:49 +02:00
Alexander Shishkin
1155bafcb7 perf/x86/intel/pt: Do validate the size of a kernel address filter
Right now, the kernel address filters in PT are prone to integer overflow
that may happen in adding filter's size to its offset to obtain the end
of the range. Such an overflow would also throw a #GP in the PT event
configuration path.

Fix this by explicitly validating the result of this calculation.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-4-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 11:14:16 +02:00
Alexander Shishkin
ddfdad991e perf/x86/intel/pt: Fix kernel address filter's offset validation
The kernel_ip() filter is used mostly by the DS/LBR code to look at the
branch addresses, but Intel PT also uses it to validate the address
filter offsets for kernel addresses, for which it is not sufficient:
supplying something in bits 64:48 that's not a sign extension of the lower
address bits (like 0xf00d000000000000) throws a #GP.

This patch adds address validation for the user supplied kernel filters.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-3-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 11:14:16 +02:00
Alexander Shishkin
95f60084ac perf/x86/intel/pt: Fix an off-by-one in address filter configuration
PT address filter configuration requires that a range is specified by
its first and last address, but at the moment we're obtaining the end
of the range by adding user specified size to its start, which is off
by one from what it actually needs to be.

Fix this and make sure that zero-sized filters don't pass the filter
validation.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 11:14:16 +02:00
Andy Lutomirski
ac496bf48d fork: Optimize task creation by caching two thread stacks per CPU if CONFIG_VMAP_STACK=y
vmalloc() is a bit slow, and pounding vmalloc()/vfree() will eventually
force a global TLB flush.

To reduce pressure on them, if CONFIG_VMAP_STACK=y, cache two thread
stacks per CPU.  This will let us quickly allocate a hopefully
cache-hot, TLB-hot stack under heavy forking workloads (shell script style).

On my silly pthread_create() benchmark, it saves about 2 µs per
pthread_create()+join() with CONFIG_VMAP_STACK=y.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/94811d8e3994b2e962f88866290017d498eb069c.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:54 +02:00
Andy Lutomirski
68f24b08ee sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK
We currently keep every task's stack around until the task_struct
itself is freed.  This means that we keep the stack allocation alive
for longer than necessary and that, under load, we free stacks in
big batches whenever RCU drops the last task reference.  Neither of
these is good for reuse of cache-hot memory, and freeing in batches
prevents us from usefully caching small numbers of vmalloced stacks.

On architectures that have thread_info on the stack, we can't easily
change this, but on architectures that set THREAD_INFO_IN_TASK, we
can free it as soon as the task is dead.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/08ca06cde00ebed0046c5d26cbbf3fbb7ef5b812.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:54 +02:00
Andy Lutomirski
aa1f1a6396 lib/syscall: Pin the task stack in collect_syscall()
This will avoid a potential read-after-free if collect_syscall()
(e.g. /proc/PID/syscall) is called on an exiting task.

Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0bfd8e6d4729c97745d3781a29610a33d0a8091d.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:53 +02:00
Andy Lutomirski
74327a3e88 x86/process: Pin the target stack in get_wchan()
This will prevent a crash if get_wchan() runs after the task stack
is freed.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/337aeca8614024aa4d8d9c81053bbf8fcffbe4ad.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:53 +02:00
Andy Lutomirski
1959a60182 x86/dumpstack: Pin the target stack when dumping it
Specifically, pin the stack in save_stack_trace_tsk() and
show_trace_log_lvl().

This will prevent a crash if the target task dies before or while
dumping its stack once we start freeing task stacks early.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cf0082cde65d1941a996d026f2b2cdbfaca17bfa.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:53 +02:00
Oleg Nesterov
23196f2e5f kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function
get_task_struct(tsk) no longer pins tsk->stack so all users of
to_live_kthread() should do try_get_task_stack/put_task_stack to protect
"struct kthread" which lives on kthread's stack.

TODO: Kill to_live_kthread(), perhaps we can even kill "struct kthread" too,
and rework kthread_stop(), it can use task_work_add() to sync with the exiting
kernel thread.

Message-Id: <20160629180357.GA7178@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cb9b16bbc19d4aea4507ab0552e4644c1211d130.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:53 +02:00
Andy Lutomirski
c6c314a613 sched/core: Add try_get_task_stack() and put_task_stack()
There are a few places in the kernel that access stack memory
belonging to a different task.  Before we can start freeing task
stacks before the task_struct is freed, we need a way for those code
paths to pin the stack.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/17a434f50ad3d77000104f21666575e10a9c1fbd.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:53 +02:00
Andy Lutomirski
ff0071c036 x86/entry/64: Fix a minor comment rebase error
When I rebased my thread_info changes onto Brian's switch_to()
changes, I carefully checked that I fixed up all the code correctly,
but I missed a comment :(

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 15f4eae70d ("x86: Move thread_info into task_struct")
Link: http://lkml.kernel.org/r/089fe1e1cbe8b258b064fccbb1a5a5fd23861031.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-16 09:18:52 +02:00
Paul E. McKenney
778935778c PM / runtime: Use _rcuidle for runtime suspend tracepoints
Further testing with false negatives suppressed by commit 293e2421fe
("rcu: Remove superfluous versions of rcu_read_lock_sched_held()")
identified a few more unprotected uses of RCU from the idle loop.
Because RCU actively ignores idle-loop code (for energy-efficiency
reasons, among other things), using RCU from the idle loop can result
in too-short grace periods, in turn resulting in arbitrary misbehavior.

The affected function is rpm_suspend().

The resulting lockdep-RCU splat is as follows:

------------------------------------------------------------------------

Warning from omap3

===============================
[ INFO: suspicious RCU usage. ]
4.6.0-rc5-next-20160426+ #1112 Not tainted
-------------------------------
include/trace/events/rpm.h:63 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
 #0:  (&(&dev->power.lock)->rlock){-.-...}, at: [<c052ee24>] __pm_runtime_suspend+0x54/0x84

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
Hardware name: Generic OMAP36xx (Flattened Device Tree)
[<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
[<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
[<c047fec8>] (dump_stack) from [<c052d7b4>] (rpm_suspend+0x604/0x7e4)
[<c052d7b4>] (rpm_suspend) from [<c052ee34>] (__pm_runtime_suspend+0x64/0x84)
[<c052ee34>] (__pm_runtime_suspend) from [<c04bf3bc>] (omap2_gpio_prepare_for_idle+0x5c/0x70)
[<c04bf3bc>] (omap2_gpio_prepare_for_idle) from [<c01255e8>] (omap_sram_idle+0x140/0x244)
[<c01255e8>] (omap_sram_idle) from [<c0126b48>] (omap3_enter_idle_bm+0xfc/0x1ec)
[<c0126b48>] (omap3_enter_idle_bm) from [<c0601db8>] (cpuidle_enter_state+0x80/0x3d4)
[<c0601db8>] (cpuidle_enter_state) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
[<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
[<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)

------------------------------------------------------------------------

Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-09-16 02:59:58 +02:00
Jann Horn
22f6b4d34f aio: mark AIO pseudo-fs noexec
This ensures that do_mmap() won't implicitly make AIO memory mappings
executable if the READ_IMPLIES_EXEC personality flag is set.  Such
behavior is problematic because the security_mmap_file LSM hook doesn't
catch this case, potentially permitting an attacker to bypass a W^X
policy enforced by SELinux.

I have tested the patch on my machine.

To test the behavior, compile and run this:

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/personality.h>
    #include <linux/aio_abi.h>
    #include <err.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/syscall.h>

    int main(void) {
        personality(READ_IMPLIES_EXEC);
        aio_context_t ctx = 0;
        if (syscall(__NR_io_setup, 1, &ctx))
            err(1, "io_setup");

        char cmd[1000];
        sprintf(cmd, "cat /proc/%d/maps | grep -F '/[aio]'",
            (int)getpid());
        system(cmd);
        return 0;
    }

In the output, "rw-s" is good, "rwxs" is bad.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-15 15:49:28 -07:00
Linus Torvalds
024c7e3756 One fix for an x86 regression in VM migration, mostly visible with
Windows because it uses RTC periodic interrupts.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJX2xpHAAoJEL/70l94x66D9NsIAIw+9oRA86qjehVnguV3fRKA
 ITZ4OGFDiXPWuxqDaw8mHHXr0RYx8KcMTzfFNbV+YL5U0cq9xYzdaNhchKPpyF+3
 7H5wL8Ku9wkYZ930kdCf5Q+LNCfg8d/wKlibPEbX0MDx4jL99kkcxLzEkmIRqFlq
 bpXaQe/KR1xCWR6gI/a6aRJWLfGuFMV82YSnk/dCSjwotbAwjJUSt+IPhLwhx28o
 7ddcxW3CxQqelJorcu2lvRiGnCvEzDhIdOvHJqibCjo3uzqbcI4PA2gs3rozbs9s
 VMEzqZpNgK0XsyKyccSw75npViIHYPkjMzxoyHMDhgiP3eIwp/tJquxAfLjK4WE=
 =h4P4
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
 "One fix for an x86 regression in VM migration, mostly visible with
  Windows because it uses RTC periodic interrupts"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: correctly reset dest_map->vector when restoring LAPIC state
2016-09-15 15:15:41 -07:00
Darrick J. Wong
b71dbf1032 vfs: cap dedupe request structure size at PAGE_SIZE
Kirill A Shutemov reports that the kernel doesn't try to cap dest_count
in any way, and uses the number to allocate kernel memory.  This causes
high order allocation warnings in the kernel log if someone passes in a
big enough value.  We should clamp the allocation at PAGE_SIZE to avoid
stressing the VM.

The two existing users of the dedupe ioctl never send more than 120
requests, so we can safely clamp dest_range at PAGE_SIZE, because with
4k pages we can handle up to 127 dedupe candidates.  Given the max
extent length of 16MB, we can end up doing 2GB of IO which is plenty.

[ Note: the "offsetof()" can't overflow, because 'count' is just a
  16-bit integer.  That's not obvious in the limited context of the
  patch, so I'm noting it here because it made me go look.  - Linus ]

Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-15 13:29:52 -07:00
Darrick J. Wong
5297e0f0fe vfs: fix return type of ioctl_file_dedupe_range
All the VFS functions in the dedupe ioctl path return int status, so
the ioctl handler ought to as well.

Found by Coverity, CID 1350952.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-15 13:29:52 -07:00
Linus Torvalds
46626600d1 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A set of fixes for the current series in the realm of block.

  Like the previous pull request, the meat of it are fixes for the nvme
  fabrics/target code.  Outside of that, just one fix from Gabriel for
  not doing a queue suspend if we didn't get the admin queue setup in
  the first place"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvme-rdma: add back dependency on CONFIG_BLOCK
  nvme-rdma: fix null pointer dereference on req->mr
  nvme-rdma: use ib_client API to detect device removal
  nvme-rdma: add DELETING queue flag
  nvme/quirk: Add a delay before checking device ready for memblaze device
  nvme: Don't suspend admin queue that wasn't created
  nvme-rdma: destroy nvme queue rdma resources on connect failure
  nvme_rdma: keep a ref on the ctrl during delete/flush
  iw_cxgb4: block module unload until all ep resources are released
  iw_cxgb4: call dev_put() on l2t allocation failure
2016-09-15 13:22:59 -07:00
Al Viro
1c109fabbd fix minor infoleak in get_user_ex()
get_user_ex(x, ptr) should zero x on failure.  It's not a lot of a leak
(at most we are leaking uninitialized 64bit value off the kernel stack,
and in a fairly constrained situation, at that), but the fix is trivial,
so...

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ This sat in different branch from the uaccess fixes since mid-August ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-15 12:54:04 -07:00
Paolo Bonzini
b0eaf4506f kvm: x86: correctly reset dest_map->vector when restoring LAPIC state
When userspace sends KVM_SET_LAPIC, KVM schedules a check between
the vCPU's IRR and ISR and the IOAPIC redirection table, in order
to re-establish the IOAPIC's dest_map (the list of CPUs servicing
the real-time clock interrupt with the corresponding vectors).

However, __rtc_irq_eoi_tracking_restore_one was forgetting to
set dest_map->vectors.  Because of this, the IOAPIC did not process
the real-time clock interrupt EOI, ioapic->rtc_status.pending_eoi
got stuck at a non-zero value, and further RTC interrupts were
reported to userspace as coalesced.

Fixes: 9e4aabe2bb
Fixes: 4d99ba898d
Cc: stable@vger.kernel.org
Cc: Joerg Roedel <jroedel@suse.de>
Cc: David Gilbert <dgilbert@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-15 18:00:32 +02:00
Roger Quadros
a6805884e2 ARM: keystone: defconfig: Fix USB configuration
Simply enabling CONFIG_KEYSTONE_USB_PHY doesn't work anymore
as it depends on CONFIG_NOP_USB_XCEIV. We need to enable
that as well.

This fixes USB on Keystone boards from v4.8-rc1 onwards.

Signed-off-by: Roger Quadros <rogerq@ti.com>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-09-15 11:46:12 +02:00
Joerg Roedel
4bf5beef57 iommu/amd: Don't put completion-wait semaphore on stack
The semaphore used by the AMD IOMMU to signal command
completion lived on the stack until now, which was safe as
the driver busy-waited on the semaphore with IRQs disabled,
so the stack can't go away under the driver.

But the recently introduced vmap-based stacks break this as
the physical address of the semaphore can't be determinded
easily anymore. The driver used the __pa() macro, but that
only works in the direct-mapping. The result were
Completion-Wait timeout errors seen by the IOMMU driver,
breaking system boot.

Since putting the semaphore on the stack is bad design
anyway, move the semaphore into 'struct amd_iommu'. It is
protected by the per-iommu lock and now in the direct
mapping again. This fixes the Completion-Wait timeout errors
and makes AMD IOMMU systems boot again with vmap-based
stacks enabled.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 11:28:19 +02:00
Alexander Shishkin
cecf62352a perf/x86/intel: Don't disable "intel_bts" around "intel" event batching
At the moment, intel_bts events get disabled from intel PMU's disable
callback, which includes event scheduling transactions of said PMU,
which have nothing to do with intel_bts events.

We do want to keep intel_bts events off inside the PMI handler to
avoid filling up their buffer too soon.

This patch moves intel_bts enabling/disabling directly to the PMI
handler.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915082233.11065-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 11:25:26 +02:00
Michael Ellerman
ed7d9a1d7d powerpc/powernv/pci: Fix missed TCE invalidations that should fallback to OPAL
In commit f0228c4130 ("powerpc/powernv/pci: Fallback to OPAL for TCE
invalidations"), we added logic to fallback to OPAL for doing TCE
invalidations if we can't do it in Linux.

Ben sent a v2 of the patch, containing these additional call sites, but
I had already applied v1 and didn't notice. So fix them now.

Fixes: f0228c4130 ("powerpc/powernv/pci: Fallback to OPAL for TCE invalidations")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-15 17:05:11 +10:00
Andy Lutomirski
15f4eae70d x86: Move thread_info into task_struct
Now that most of the thread_info users have been cleaned up,
this is straightforward.

Most of this code was written by Linus.

Originally-from: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a50eab40abeaec9cb9a9e3cbdeafd32190206654.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:25:13 +02:00
Andy Lutomirski
c65eacbe29 sched/core: Allow putting thread_info into task_struct
If an arch opts in by setting CONFIG_THREAD_INFO_IN_TASK_STRUCT,
then thread_info is defined as a single 'u32 flags' and is the first
entry of task_struct.  thread_info::task is removed (it serves no
purpose if thread_info is embedded in task_struct), and
thread_info::cpu gets its own slot in task_struct.

This is heavily based on a patch written by Linus.

Originally-from: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a0898196f0476195ca02713691a5037a14f2aac5.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:25:13 +02:00
Linus Torvalds
d896fa20a7 um/Stop conflating task_struct::stack with thread_info
thread_info may move in the future, so use the accessors.

[ Andy Lutomirski wrote this changelog message and changed
  "task_thread_info(child)->cpu" to "task_cpu(child)". ]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3439705d9838940cc82733a7335fa8c654c37db8.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:25:12 +02:00
Linus Torvalds
97245d0058 x86/entry: Get rid of pt_regs_to_thread_info()
It was a nice optimization while it lasted, but thread_info is moving
and this optimization will no longer work.

Quoting Linus:

    Oh Gods, Andy. That pt_regs_to_thread_info() thing made me want
    to do unspeakable acts on a poor innocent wax figure that looked
    _exactly_ like you.

[ Changelog written by Andy. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6376aa81c68798cc81631673f52bd91a3e078944.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:25:12 +02:00