Commit Graph

692238 Commits

Author SHA1 Message Date
Eugenia Emantayev
0b794ffae7 net/mlx5: Fix mlx5_ifc_mtpps_reg_bits structure size
Fix miscalculation in reserved_at_1a0 field.

Fixes: ee7f12205a ('net/mlx5e: Implement 1PPS support')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27 16:40:17 +03:00
Ilan Tayari
0242f4a0bb net/mlx5e: Fix outer_header_zero() check size
outer_header_zero() routine checks if the outer_headers match of a
flow-table entry are all zero.

This function uses the size of whole fte_match_param, instead of just
the outer_headers member, causing failure to detect all-zeros if
any other members of the fte_match_param are non-zero.

Use the correct size for zero check.

Fixes: 6dc6071cfc ("net/mlx5e: Add ethtool flow steering support")
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27 16:40:16 +03:00
Alex Vesker
58569ef8f6 net/mlx5e: IPoIB, Modify add/remove underlay QPN flows
On interface remove, the clean-up was done incorrectly causing
an error in the log:
"SET_FLOW_TABLE_ROOT(0x92f) op_mod(0x0) failed...syndrome (0x7e9f14)"

This was caused by the following flow:
-ndo_uninit:
 Move QP state to RST (this disconnects the QP from FT),
 the QP cannot be attached to any FT unless it is in RTS.

-mlx5_rdma_netdev_free:
 cleanup_rx: Destroy FT
 cleanup_tx: Destroy QP and remove QPN from FT

This caused a problem when destroying current FT we tried to
re-attach the QP to the next FT which is not needed.

The correct flow is:
-mlx5_rdma_netdev_free:
	cleanup_rx: remove QPN from FT & Destroy FT
	cleanup_tx: Destroy QP

Fixes: 508541146a ("net/mlx5: Use underlay QPN from the root name space")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27 16:40:16 +03:00
Moshe Shemesh
219c81f7d1 net/mlx5: Fix command bad flow on command entry allocation failure
When driver fail to allocate an entry to send command to FW, it must
notify the calling function and release the memory allocated for
this command.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27 16:40:16 +03:00
Moshe Shemesh
061870800e net/mlx5: Fix command completion after timeout access invalid structure
Completion on timeout should not free the driver command entry structure
as it will need to access it again once real completion event from FW
will occur.

Fixes: 73dd3a4839 ('net/mlx5: Avoid using pending command interface slots')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27 16:40:16 +03:00
Aviv Heller
dc798b4cc0 net/mlx5: Consider tx_enabled in all modes on remap
The tx_enabled lag event field is used to determine whether a slave is
active.
Current logic uses this value only if the mode is active-backup.

However, LACP mode, although considered a load balancing mode, can mark
a slave as inactive in certain situations (e.g., LACP timeout).

This fix takes the tx_enabled value into account when remapping, with
no respect to the LAG mode (this should not affect the behavior in XOR
mode, since in this mode both slaves are marked as active).

Fixes: 7907f23adc (net/mlx5: Implement RoCE LAG feature)
Signed-off-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27 16:40:16 +03:00
Eran Ben Elisha
079adf0539 net/mlx5: Clean SRIOV eswitch resources upon VF creation failure
Upon sriov enable, eswitch is always enabled.
Currently, if enable hca failed over all VFs, we would skip eswitch
disable as part of sriov disable, which will lead to resources leak.

Fix it by disabling eswitch if it was enabled (use indication from
eswitch mode).

Fixes: 6b6adee3da ('net/mlx5: SRIOV core code refactoring')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27 16:40:16 +03:00
Thomas Gleixner
8397913303 genirq/cpuhotplug: Revert "Set force affinity flag on hotplug migration"
That commit was part of the changes moving x86 to the generic CPU hotplug
interrupt migration code. The force flag was required on x86 before the
hierarchical irqdomain rework, but invoking set_affinity() with force=true
stayed and had no side effects.

At some point in the past, the force flag got repurposed to support the
exynos timer interrupt affinity setting to a not yet online CPU, so the
interrupt controller callback does not verify the supplied affinity mask
against cpu_online_mask.

Setting the flag in the CPU hotplug code causes the cpu online masking to
be blocked on these irq controllers and results in potentially affining an
interrupt to the CPU which is unplugged, i.e. instead of moving it away,
it's just reassigned to it.

As the force flags is not longer needed on x86, it's safe to revert that
patch so the ARM irqchips which use the force flag work again.

Add comments to that effect, so this won't happen again.

Note: The online mask handling should be done in the generic code and the
force flag and the masking in the irq chips removed all together, but
that's not a change possible for 4.13. 

Fixes: 77f85e66aa ("genirq/cpuhotplug: Set force affinity flag on hotplug migration")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: LAK <linux-arm-kernel@lists.infradead.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707271217590.3109@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-07-27 15:40:02 +02:00
Will Deacon
a3287c41ff drivers/perf: arm_pmu: Request PMU SPIs with IRQF_PER_CPU
Since the PMU register interface is banked per CPU, CPU PMU interrrupts
cannot be handled by a CPU other than the one with the PMU asserting the
interrupt. This means that migrating PMU SPIs, as we do during a CPU
hotplug operation doesn't make any sense and can lead to the IRQ being
disabled entirely if we route a spurious IRQ to the new affinity target.

This has been observed in practice on AMD Seattle, where CPUs on the
non-boot cluster appear to take a spurious PMU IRQ when coming online,
which is routed to CPU0 where it cannot be handled.

This patch passes IRQF_PERCPU for PMU SPIs and forcefully sets their
affinity prior to requesting them, ensuring that they cannot
be migrated during hotplug events. This interacts badly with the DB8500
erratum workaround that ping-pongs the interrupt affinity from the handler,
so we avoid passing IRQF_PERCPU in that case by allowing the IRQ flags
to be overridden in the platdata.

Fixes: 3cf7ee98b8 ("drivers/perf: arm_pmu: move irq request/free into probe")
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-27 13:43:22 +01:00
Arend Van Spriel
5f5d03143d brcmfmac: fix memleak due to calling brcmf_sdiod_sgtable_alloc() twice
Due to a bugfix in wireless tree and the commit mentioned below a merge
was needed which went haywire. So the submitted change resulted in the
function brcmf_sdiod_sgtable_alloc() being called twice during the probe
thus leaking the memory of the first call.

Cc: stable@vger.kernel.org # 4.6.x
Fixes: 4d79289598 ("brcmfmac: switch to new platform data")
Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-07-27 14:03:14 +03:00
Daniel Stone
58f36b4526 brcmfmac: Don't grow SKB by negative size
The commit to rework the headroom check in start_xmit() now calls
pxskb_expand_head() unconditionally if the header is CoW. Unfortunately,
it does so with the delta between the extant headroom and the header
length, which may be negative if there is already sufficient headroom.

pskb_expand_head() does allow for size being 0, in which case it just
copies, so clamp the header delta to zero.

Opening Chrome (and all my tabs) on a PCIE device was enough to reliably
hit this.

Fixes: 270a6c1f65 ("brcmfmac: rework headroom check in .start_xmit()")
Signed-off-by: Daniel Stone <daniels@collabora.com>
Cc: Arend Van Spriel <arend.vanspriel@broadcom.com>
Cc: James Hughes <james.hughes@raspberrypi.org>
Cc: Hante Meuleman <hante.meuleman@broadcom.com>
Cc: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Cc: Franky Lin <franky.lin@broadcom.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-07-27 14:02:16 +03:00
Ville Syrjälä
d34cfebbf9 drm/i915: Fix cursor updates on some platforms
Turns out that just writing CURPOS isn't sufficient to move the cursor
on some platforms. My 830 works just fine, but eg. 945 and PNV don't.
On those platforms we need to arm even the CURPOS update with a
CURBASE write.

Even worse, a write to any of the cursor register apart from CURBASE
will cancel an already pending cursor update. So if we have armed a
CURCNTR/CURBASE update, a subsequent CURPOS write prior to vblank
would cancel that armed update. Thus we're left with a cursor that
doesn't appear to move, or even change shape.

Fix the problem by always performing the CURBASE write after a
CURPOS write. Bspec is somewhat unclear which platforms actually
require this CURBASE write and which don't. So to keep it simple
and to make sure we really fix the problem across all supported
devices, let's just perform the CURBASE write unconditionally.

Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101790
Fixes: 75343a44c9 ("drm/i915: Drop useless posting reads from cursor commit")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20170714155227.6089-1-ville.syrjala@linux.intel.com
(cherry picked from commit 8753d2bc5e)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 11:20:49 +02:00
Imre Deak
7728124af3 drm/i915: Fix user ptr check size in eb_relocate_vma()
Fix the sizeof(ptr) vs. sizeof(*ptr) typo.

Fixes: 2889caa923 ("drm/i915: Eliminate lots of iterations over the execobjects array")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170714151242.517-2-imre.deak@intel.com
(cherry picked from commit edd9003f7f)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 11:20:44 +02:00
Xin Long
6b84202c94 sctp: fix the check for _sctp_walk_params and _sctp_walk_errors
Commit b1f5bfc27a ("sctp: don't dereference ptr before leaving
_sctp_walk_{params, errors}()") tried to fix the issue that it
may overstep the chunk end for _sctp_walk_{params, errors} with
'chunk_end > offset(length) + sizeof(length)'.

But it introduced a side effect: When processing INIT, it verifies
the chunks with 'param.v == chunk_end' after iterating all params
by sctp_walk_params(). With the check 'chunk_end > offset(length)
+ sizeof(length)', it would return when the last param is not yet
accessed. Because the last param usually is fwdtsn supported param
whose size is 4 and 'chunk_end == offset(length) + sizeof(length)'

This is a badly issue even causing sctp couldn't process 4-shakes.
Client would always get abort when connecting to server, due to
the failure of INIT chunk verification on server.

The patch is to use 'chunk_end <= offset(length) + sizeof(length)'
instead of 'chunk_end < offset(length) + sizeof(length)' for both
_sctp_walk_params and _sctp_walk_errors.

Fixes: b1f5bfc27a ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:03:12 -07:00
Xin Long
e90ce2fc27 dccp: fix a memleak for dccp_feat_init err process
In dccp_feat_init, when ccid_get_builtin_ccids failsto alloc
memory for rx.val, it should free tx.val before returning an
error.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:01:05 -07:00
Xin Long
b7953d3c0e dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly
The patch "dccp: fix a memleak that dccp_ipv6 doesn't put reqsk
properly" fixed reqsk refcnt leak for dccp_ipv6. The same issue
exists on dccp_ipv4.

This patch is to fix it for dccp_ipv4.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:01:05 -07:00
Xin Long
0c2232b0a7 dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly
In dccp_v6_conn_request, after reqsk gets alloced and hashed into
ehash table, reqsk's refcnt is set 3. one is for req->rsk_timer,
one is for hlist, and the other one is for current using.

The problem is when dccp_v6_conn_request returns and finishes using
reqsk, it doesn't put reqsk. This will cause reqsk refcnt leaks and
reqsk obj never gets freed.

Jianlin found this issue when running dccp_memleak.c in a loop, the
system memory would run out.

dccp_memleak.c:
  int s1 = socket(PF_INET6, 6, IPPROTO_IP);
  bind(s1, &sa1, 0x20);
  listen(s1, 0x9);
  int s2 = socket(PF_INET6, 6, IPPROTO_IP);
  connect(s2, &sa1, 0x20);
  close(s1);
  close(s2);

This patch is to put the reqsk before dccp_v6_conn_request returns,
just as what tcp_conn_request does.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 00:01:05 -07:00
Aneesh Kumar K.V
0da12a7a81 powerpc/mm/hash: Free the subpage_prot_table correctly
Fixes: dad6f37c26 ("powerpc: subpage_protect: Increase the array size to take care of 64TB")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-27 13:05:50 +10:00
Arnd Bergmann
7e17510018 drm: exynos: mark pm functions as __maybe_unused
The rework of the exynos DRM clock handling introduced
warnings for configurations that have CONFIG_PM disabled:

drivers/gpu/drm/exynos/exynos_hdmi.c:736:13: error: 'hdmi_clk_disable_gates' defined but not used [-Werror=unused-function]
 static void hdmi_clk_disable_gates(struct hdmi_context *hdata)
             ^~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/exynos/exynos_hdmi.c:717:12: error: 'hdmi_clk_enable_gates' defined but not used [-Werror=unused-function]
 static int hdmi_clk_enable_gates(struct hdmi_context *hdata)

The problem is that the PM functions themselves are inside of
an #ifdef, but some functions they call are not.

This patch removes the #ifdef and instead marks the PM functions
as __maybe_unused, which is a more reliable way to get it right.

Link: https://patchwork.kernel.org/patch/8436281/
Fixes: 9be7e98984 ("drm/exynos/hdmi: clock code re-factoring")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:03 +09:00
Hans Verkuil
8f4e01f9f0 drm/exynos: select CEC_CORE if CEC_NOTIFIER
If the s5p-cec driver is a module and the drm exynos driver is built-in, then
the CEC core will be a module also, causing the CEC notifier to fail (will be
		compiled as empty functions).

To prevent this select CEC_CORE if CEC_NOTIFIER is set to ensure the CEC core
is also built into the kernel.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:03 +09:00
Andrzej Hajda
861b27eca7 drm/exynos/hdmi: fix disable sequence
The "Fixes" patch was incorrectly merged, as a result PHY is prematurely
powered off and for example Odroid-U3 cannot disable TV power domain
when HDMI cable is unplugged.

Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 625e63e2 ("drm/exynos/hdmi: fix pipeline disable order")
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:02 +09:00
Inki Dae
576d72fbfb drm/exynos: mic: add a bridge at probe
This patch moves drm_bridge_add call into probe.

It doesn't need to call drm_bridge_add call every time
bind callback is called.

Changelog v2
- moved drm_bridge_remove call into remove callback.
- corrected description.

Suggested-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:02 +09:00
Hoegeun Kwon
0d51a0a534 drm/exynos/dsi: Remove error handling for bridge_node DT parsing
Remove the error handling of bridge_node because the bridge_node is
optional.

For example, In case of Exynos SoC, a bridge device such as mDNIe and
MIC could be placed between Display Controller and MIPI DSI device but
the bridge device is optional.

Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:02 +09:00
Inki Dae
c9948920cf drm/exynos: dsi: do not try to find bridge
It doesn't need to try to find a bridge if bridge node doesn't exist.

Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com>
Tested-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:01 +09:00
Arvind Yadav
e3cc51ea0b drm: exynos: hdmi: make of_device_ids const.
of_device_ids are not supposed to change at runtime. All functions
working with of_device_ids provided by <linux/of.h> work with const
of_device_ids. So mark the non-const structs as const.

File size before:
   text	   data	    bss	    dec	    hex	filename
  12294	   1192	      0	  13486	   34ae	drivers/gpu/drm/exynos/exynos_hdmi.o

File size after constify hdmi_match_types.
   text	   data	    bss	    dec	    hex	filename
  13318	    176	      0	  13494	   34b6	drivers/gpu/drm/exynos/exynos_hdmi.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:01 +09:00
Arvind Yadav
5e6cc1c588 drm: exynos: constify mixer_match_types and *_mxr_drv_data.
File size before:
   text	   data	    bss	    dec	    hex	filename
   9983	   1424	      0	  11407	   2c8f	drivers/gpu/drm/exynos/exynos_mixer.o

File size after constify:
   text	   data	    bss	    dec	    hex	filename
  11231	    176	      0	  11407	   2c8f	drivers/gpu/drm/exynos/exynos_mixer.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:01 +09:00
Gabriel Krisman Bertazi
1d6bb0f9b4 exynos_drm: Clean up duplicated assignment in exynos_drm_driver
num_ioctls is already assigned when declaring the exynos_drm_driver
structure.  No need to duplicate it here.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2017-07-27 09:24:01 +09:00
Jakub Kicinski
d777b2ddbe bpf: don't zero out the info struct in bpf_obj_get_info_by_fd()
The buffer passed to bpf_obj_get_info_by_fd() should be initialized
to zeros.  Kernel will enforce that to guarantee we can safely extend
info structures in the future.

Making the bpf_obj_get_info_by_fd() call in libbpf perform the zeroing
is problematic, however, since some members of the info structures
may need to be initialized by the callers (for instance pointers
to buffers to which kernel is to dump translated and jited images).

Remove the zeroing and fix up the in-tree callers before any kernel
has been released with this code.

As Daniel points out this seems to be the intended operation anyway,
since commit 95b9afd398 ("bpf: Test for bpf ID") is itself setting
the buffer pointers before calling bpf_obj_get_info_by_fd().

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26 17:02:52 -07:00
Matthias Kaehlcke
0c3a8f8b8f netpoll: Fix device name check in netpoll_setup()
Apparently netpoll_setup() assumes that netpoll.dev_name is a pointer
when checking if the device name is set:

if (np->dev_name) {
  ...

However the field is a character array, therefore the condition always
yields true. Check instead whether the first byte of the array has a
non-zero value.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26 17:01:43 -07:00
Dave Airlie
517069ff6e Merge branch 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Three misc amd fixes.

* 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux:
  drm/amd/powerplay: fix AVFS voltage offset for Vega10
  drm/amdgpu/gfx9: simplify and fix GRBM index selection
  drm/amdgpu: Fix blocking in RCU critical section(v2)
2017-07-27 08:49:48 +10:00
Anna Schumaker
1e6f209515 NFS: Use raw NFS access mask in nfs4_opendata_access()
Commit bd8b244174 ("NFS: Store the raw NFS access mask in the inode's
access cache") changed how the access results are stored after an
access() call.  An NFS v4 OPEN might have access bits returned with the
opendata, so we should use the NFS4_ACCESS values when determining the
return value in nfs4_opendata_access().

Fixes: bd8b244174 ("NFS: Store the raw NFS access mask in the inode's
access cache")
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Tested-by: Takashi Iwai <tiwai@suse.de>
2017-07-26 16:53:57 -04:00
WANG Cong
d94708a553 bonding: commit link status change after propose
Commit de77ecd4ef ("bonding: improve link-status update in mii-monitoring")
moves link status commitment into bond_mii_monitor(), but it still relies
on the return value of bond_miimon_inspect() as the hint. We need to return
non-zero as long as we propose a link status change.

Fixes: de77ecd4ef ("bonding: improve link-status update in mii-monitoring")
Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26 13:41:21 -07:00
Vivek Goyal
273752c9ff dm, dax: Make sure dm_dax_flush() is called if device supports it
Currently dm_dax_flush() is not being called, even if underlying dax
device supports write cache, because DAXDEV_WRITE_CACHE is not being
propagated up to the DM dax device.

If the underlying dax device supports write cache, set
DAXDEV_WRITE_CACHE on the DM dax device.  This will cause dm_dax_flush()
to be called.

Fixes: abebfbe2f7 ("dm: add ->flush() dax operation support")
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-26 15:55:44 -04:00
NeilBrown
34c96507e8 dm verity fec: fix GFP flags used with mempool_alloc()
mempool_alloc() cannot fail for GFP_NOIO allocation, so there is no
point testing for failure.

One place the code tested for failure was passing "0" as the GFP
flags.  This is most unusual and is probably meant to be GFP_NOIO,
so that is changed.

Also, allocation from ->extra_pool and ->prealloc_pool are repeated
before releasing the previous allocation.  This can deadlock if the code
is servicing a write under high memory pressure.  To avoid deadlocks,
change these to use GFP_NOWAIT and leave the error handling in place.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-26 15:55:44 -04:00
Damien Le Moal
4218a95546 dm zoned: use GFP_NOIO in I/O path
Use GFP_NOIO for memory allocations in the I/O path.  Other memory
allocations in the initialization path can use GFP_KERNEL.

Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-07-26 15:55:43 -04:00
Linus Torvalds
da08f35b0f virtio: fixes, cleanups
Fixes some minor issues all over the codebase.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZd0o3AAoJECgfDbjSjVRpgK0IAKsLivWrL60y/o83bFH/59K0
 m7XGZ8OMGyl94PTubidM3QqJtYCt2fv+mzkuUZXzxklJ2v4csrJeF0KKQzYlI+/k
 fJv9D1SX96L1yOSHVv91YI8wNNi6NqqLCg8xKd5V7xo/IwufPRmWEEppx7zi1Cb1
 Ne3rRZOox4QdRUL6hQaN9MlCxQUvIL6zRM68UXtGOImLmb0O/vsic+DNjk7gItRz
 /iO3WE4Ig8sbrFjntgJ/+iotJO4/0qbKMxFVQOUOluiOiOhJgGlq+qmMdM45JBzM
 btED8OnuNcAWdDhtsux4ivC1Kl10cEGzXM3NzS2rV4/7A6rCRJvcFH0bUQGHdlk=
 =4J1q
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes and cleanups from Michael Tsirkin:
 "Fixes some minor issues all over the codebase"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-net: fix module unloading
  virtio-balloon: coding format cleanup
  virtio-balloon: deflate via a page list
  virtio_blk: Use sysfs_match_string() helper
2017-07-26 10:46:48 -07:00
Paolo Bonzini
7b5e0a4e82 Merge branch 'kvm-ppc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master
Two commits which fix host crashes.

Signed-off-by: Paolo BOnzini <pbonzini@redhat.com>
2017-07-26 19:04:56 +02:00
Wanpeng Li
1d518c6820 KVM: LAPIC: Fix reentrancy issues with preempt notifiers
Preempt can occur in the preemption timer expiration handler:

          CPU0                    CPU1

  preemption timer vmexit
  handle_preemption_timer(vCPU0)
    kvm_lapic_expired_hv_timer
      hv_timer_is_use == true
  sched_out
                           sched_in
                           kvm_arch_vcpu_load
                             kvm_lapic_restart_hv_timer
                               restart_apic_timer
                                 start_hv_timer
                                   already-expired timer or sw timer triggerd in the window
                                 start_sw_timer
                                   cancel_hv_timer
                           /* back in kvm_lapic_expired_hv_timer */
                           cancel_hv_timer
                             WARN_ON(!apic->lapic_timer.hv_timer_in_use);  ==> Oops

This can be reproduced if CONFIG_PREEMPT is enabled.

------------[ cut here ]------------
 WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm]
 CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G           OE   4.13.0-rc2+ #16
 RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm]
Call Trace:
  handle_preemption_timer+0xe/0x20 [kvm_intel]
  vmx_handle_exit+0xb8/0xd70 [kvm_intel]
  kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm]
  ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x81/0x220
  entry_SYSCALL64_slow_path+0x25/0x25
 ------------[ cut here ]------------
 WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm]
 CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0-rc2+ #16
 RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm]
Call Trace:
  kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm]
  handle_preemption_timer+0xe/0x20 [kvm_intel]
  vmx_handle_exit+0xb8/0xd70 [kvm_intel]
  kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm]
  ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x81/0x220
  entry_SYSCALL64_slow_path+0x25/0x25

This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer
and start_sw_timer be in preemption-disabled regions, which trivially
avoid any reentrancy issue with preempt notifier.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Add more WARNs. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 19:04:53 +02:00
Lin Ma
67fbcd62f5 tools/kvm_stat: add '-f help' to get the available event list
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 19:04:53 +02:00
Lin Ma
efcb521943 tools/kvm_stat: use variables instead of hard paths in help output
Using variables instead of hard paths makes the requirements information
more accurate.

Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 19:04:52 +02:00
Paolo Bonzini
cb9083eb6e KVM: s390: fixup missing srcu lock
We need to hold the srcu lock when accessing memory slots
 during migration
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJZd0O1AAoJEBF7vIC1phx8WOIQAKLLmnqmKfD5sy3hIxkRUHG7
 WlN33M9tNCjDa1gL6iurBqKbOHt7phBhZdp/Kzuy/UxwPtM5vMaajQv1mLKl7b2y
 1GKEt2aTMJ3e/ql3Z3kVOASNguU2NHjVuzu+BObLhfzBXmNYp4hkyrMSkegb4Q2v
 j6Kfhpz/ltrzCYhL1/WYquKwG/aAV0fDefbfSrZZ0yx1TCnbtAEfbqBO6LXfys2t
 M3LW2OQwavV0YYrrwF4o7Seg0YfpQVYVH1j1/0dLd/Mb1w1D5/CB+gTr6zYhZ6z2
 MzZAa5ADRJ1y+YEoB6aB8T95/4i9Aw3fYhPYfw4ijUJcHUwSbzllSt2brIUSkEZo
 rajClCRjAMaMfeg2q5SZ6Mbg/wt5uo9O0hRG9weqZQR+NCnYkcrfgTe2kYSrorAs
 Msp6CUSfP9cOL9JETD+f3hEWl4uqA1P21n6Ga/v5ERPDxqIrqU49mlgjxBKRcPK4
 wHKaBrA4h53yOjNQ5U/Bdr130W2dQnRHmYAQXuUM9OjbBOliTbtgf8ND0U15S9TY
 HZomb3kEkL1o8J5rH6NBxxntoU48SEtuppBDSIAYBnp9h02F+RvNPqMr4orFHfD/
 OJe0zJHYobK44kz59DGmN5ed5wZl+Vzvdf7KAa62MotKmDMykbkWbcb9j+gcUFHf
 Yal+Mk51xCKPUcadGD/e
 =PjH2
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: fixup missing srcu lock

We need to hold the srcu lock when accessing memory slots
during migration

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 18:59:36 +02:00
Wanpeng Li
2d6144e366 KVM: nVMX: Fix loss of L2's NMI blocking state
Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1:

Before NMI IRET test
Sending NMI to self
NMI isr running stack 0x461000
Sending nested NMI to self
After nested NMI to self
Nested NMI isr running rip=40038e
After iret
After NMI to self
FAIL: NMI

Commit 4c4a6f790e (KVM: nVMX: track NMI blocking state separately
for each VMCS) tracks NMI blocking state separately for vmcs01 and
vmcs02. However it is not enough:

 - The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault
   on IRET, so the L2 can generate #PF which can be intercepted by L0.
 - L0 walks L1's guest page table and sees the mapping is invalid, it
   resumes the L1 guest and injects the #PF into L1.  At this point the
   vmcs02 has nmi_known_unmasked=true.
 - L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field
   of vmcs12 (and fixes the shadow page table) before resuming L2 guest.
 - L1 executes VMRESUME to resume L2, causing a vmexit to L0
 - during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the
   interruptibility-state field of vmcs02, but nmi_known_unmasked is
   still true.
 - L2 immediately exits to L0 with another page fault, because L0 still has
   not updated the NGVA->HPA page tables.  However, nmi_known_unmasked is
   true so vmx_recover_nmi_blocking does not do anything.

The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 18:57:46 +02:00
Wincy Van
06a5524f09 KVM: nVMX: Fix posted intr delivery when vcpu is in guest mode
The PI vector for L0 and L1 must be different. If dest vcpu0
is in guest mode while vcpu1 is delivering a non-nested PI to
vcpu0, there wont't be any vmexit so that the non-nested interrupt
will be delayed.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 18:57:46 +02:00
Wincy Van
210f84b0ca x86: irq: Define a global vector for nested posted interrupts
We are using the same vector for nested/non-nested posted
interrupts delivery, this may cause interrupts latency in
L1 since we can't kick the L2 vcpu out of vmx-nonroot mode.

This patch introduces a new vector which is only for nested
posted interrupts to solve the problems above.

Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 18:57:45 +02:00
Paolo Bonzini
a512177ef3 KVM: x86: do mask out upper bits of PAE CR3
This reverts the change of commit f85c758dbe,
as the behavior it modified was intended.

The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual
says:

Table 4-7. Use of CR3 with PAE Paging
Bit Position(s)	Contents
4:0		Ignored
31:5		Physical address of the 32-Byte aligned
		page-directory-pointer table used for linear-address
		translation
63:32		Ignored (these bits exist only on processors supporting
		the Intel-64 architecture)

To placate the static checker, write the mask explicitly as an
unsigned long constant instead of using a 32-bit unsigned constant.

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: f85c758dbe
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 18:57:45 +02:00
Claudio Imbrenda
fdeaf7e3eb KVM: make pid available for uevents without debugfs
Simplify and improve the code so that the PID is always available in
the uevent even when debugfs is not available.

This adds a userspace_pid field to struct kvm, as per Radim's
suggestion, so that the PID can be retrieved on destruction too.

Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Fixes: 286de8f6ac ("KVM: trigger uevents when creating or destroying a VM")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 18:57:44 +02:00
Paolo Abeni
9688f9b020 udp: unbreak build lacking CONFIG_XFRM
We must use pre-processor conditional block or suitable accessors to
manipulate skb->sp elsewhere builds lacking the CONFIG_XFRM will break.

Fixes: dce4551cb2 ("udp: preserve head state for IP_CMSG_PASSSEC")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26 09:35:29 -07:00
Scott Bauer
7dd1ab163c nvme: validate admin queue before unquiesce
With a misbehaving controller it's possible we'll never
enter the live state and create an admin queue. When we
fail out of reset work it's possible we failed out early
enough without setting up the admin queue. We tear down
queues after a failed reset, but needed to do some more
sanitization.

Fixes 443bd90f2c: "nvme: host: unquiesce queue in nvme_kill_queues()"

[  189.650995] nvme nvme1: pci function 0000:0b:00.0
[  317.680055] nvme nvme0: Device not ready; aborting reset
[  317.680183] nvme nvme0: Removing after probe failure status: -19
[  317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  317.681397] general protection fault: 0000 [#1] SMP KASAN
[  317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5
[  317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
[  317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
[  317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000
[  317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0
[  317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282
[  317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3
[  317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000
[  317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000
[  317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0
[  317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0
[  317.684371] FS:  0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000
[  317.684519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0
[  317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  317.685018] Call Trace:
[  317.685084]  nvme_kill_queues+0x4d/0x170 [nvme_core]
[  317.685185]  nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme]
[  317.685289]  process_one_work+0x771/0x1170
[  317.685372]  worker_thread+0xde/0x11e0
[  317.685452]  ? pci_mmcfg_check_reserved+0x110/0x110
[  317.685550]  kthread+0x2d3/0x3d0
[  317.685617]  ? process_one_work+0x1170/0x1170
[  317.685704]  ? kthread_create_on_node+0xc0/0xc0
[  317.685785]  ret_from_fork+0x25/0x30
[  317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3
[  317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40
[  317.685908] ---[ end trace a3f8704150b1e8b4 ]---

Signed-off-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-07-26 17:41:41 +02:00
Christoph Hellwig
5b094d6dac xfs: fix multi-AG deadlock in xfs_bunmapi
Just like in the allocator we must avoid touching multiple AGs out of
order when freeing blocks, as freeing still locks the AGF and can cause
the same AB-BA deadlocks as in the allocation path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-07-26 08:20:03 -07:00
Dave Martin
d0153c7ff9 arm64: sysreg: Fix unprotected macro argmuent in write_sysreg
write_sysreg() may misparse the value argument because it is used
without parentheses to protect it.

This patch adds the ( ) in order to avoid any surprises.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
[will: same change to write_sysreg_s]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-07-26 09:28:18 +01:00