Commit Graph

900584 Commits

Author SHA1 Message Date
David S. Miller
89e960b5a9 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2020-02-12

This series contains fixes to only the ice driver.

Dave fixes logic flaws in the DCB rebuild function which is used after a
reset.  Also fixed a configuration issue when switching between firmware
and software LLDP mode where the number of TLV's configured was getting
out of sync with what lldpad thinks is configured.

Paul fixes how the driver displayed all the supported and advertised
link modes by basing it on the PHY capabilities, and in the process
cleaned up a lot of code.

Brett fixes duplicate receive tail bumps by comparing the value we are
writing to tail with the previously written tail value.  Also cleaned up
workarounds that are no longer needed with the latest NVM images.

Anirudh cleaned up unnecessary CONFIG_PCI_IOV wrappers.  Updated the
driver to use ice_pf_to_dev() instead of &pf->pdev->dev or
&vsi->back->pdev->dev.  Cleaned up the string format in print function
calls to remove newlines where applicable.

Akeem updates the link message logging to include "Full Duplex" and
"Negotiated", to help distinguish from "Requested" for FEC.

Bruce fixes and consolidates the logging of firmware/NVM information
during driver load, since the information is duplicate of what is
available via ethtool.  Fixed the checking of the Unit Load Status bits
after reset to ensure they are 0x7FF before continuing, by updating the
mask.  Cleanup up possible NULL dereferences that were created by a
previous commit.

Ben fixes the driver to use the correct netif_msg_tx/rx_error() to
determine whether to print the MDD event type.

Tony provides several trivial fixes, which include whitespace, typos,
function header comments, reverse Christmas tree issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-13 14:10:11 -08:00
Takashi Iwai
0fbb027b44 ALSA: pcm: Fix double hw_free calls
The commit 66f2d19f81 ("ALSA: pcm: Fix memory leak at closing a
stream without hw_free") tried to fix the regression wrt the missing
hw_free call at closing without SNDRV_PCM_IOCTL_HW_FREE ioctl.
However, the code change dropped mistakenly the state check, resulting
in calling hw_free twice when SNDRV_PCM_IOCTL_HW_FRE got called
beforehand.  For most drivers, this is almost harmless, but the
drivers like SOF show another regression now.

This patch adds the state condition check before calling do_hw_free()
at releasing the stream for avoiding the double hw_free calls.

Fixes: 66f2d19f81 ("ALSA: pcm: Fix memory leak at closing a stream without hw_free")
Reported-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/s5hd0ajyprg.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-02-13 16:30:22 +01:00
Alexander Tsoy
9f35a31283 ALSA: usb-audio: Add clock validity quirk for Denon MC7000/MCX8000
It should be safe to ignore clock validity check result if the following
conditions are met:
 - only one single sample rate is supported;
 - the terminal is directly connected to the clock source;
 - the clock type is internal.

This is to deal with some Denon DJ controllers that always reports that
clock is invalid.

Tested-by: Tobias Oszlanyi <toszlanyi@yahoo.de>
Signed-off-by: Alexander Tsoy <alexander@tsoy.me>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200212235450.697348-1-alexander@tsoy.me
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-02-13 07:18:58 +01:00
Randy Dunlap
0bf999f9c5 linux/pipe_fs_i.h: fix kernel-doc warnings after @wait was split
Fix kernel-doc warnings in struct pipe_inode_info after @wait was
split into @rd_wait and @wr_wait.

  include/linux/pipe_fs_i.h:66: warning: Function parameter or member 'rd_wait' not described in 'pipe_inode_info'
  include/linux/pipe_fs_i.h:66: warning: Function parameter or member 'wr_wait' not described in 'pipe_inode_info'

Fixes: 0ddad21d3e ("pipe: use exclusive waits when reading or writing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-12 11:54:08 -08:00
Tony Nguyen
4ee656bba8 ice: Trivial fixes
This is a collection of trivial fixes including fixing whitespace, typos,
function headers, reverse Christmas tree, etc.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:49:12 -08:00
Ben Shelton
1d8bd99272 ice: Use correct netif error function
Use the correct netif_msg_[tx,rx]_error() function to determine whether to
print the MDD event type.

Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:49:07 -08:00
Anirudh Venkataramanan
3306f79f42 ice: Cleanup ice_vsi_alloc_q_vectors
1. Remove local variable num_q_vectors and use vsi->num_q_vectors instead
2. Remove local variable pf and pass vsi->back to ice_pf_to_dev

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:49:04 -08:00
Anirudh Venkataramanan
19cce2c6f6 ice: Make print statements more compact
Formatting strings in print function calls (like dev_info, dev_err, etc.)
can exceed 80 columns without making checkpatch unhappy. So remove
newlines where applicable and make print statements more compact.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:49:00 -08:00
Anirudh Venkataramanan
9a946843ba ice: Use ice_pf_to_dev
Use ice_pf_to_dev(pf) instead of &pf->pdev->dev
Use ice_pf_to_dev(vsi->back) instead of &vsi->back->pdev->dev
When a pointer to the pf instance is available, use ice_pf_to_dev
instead of ice_hw_to_dev

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:55 -08:00
Tony Nguyen
0a6ea04e3b ice: Remove possible null dereference
Commit 1f45ebe0d8 ("ice: add extra check for null Rx descriptor") moved
the call to ice_construct_skb() under a null check as Coverity reported a
possible use of null skb. However, the original call was not deleted, do so
now.

Fixes: 1f45ebe0d8 ("ice: add extra check for null Rx descriptor")
Reported-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:51 -08:00
Bruce Allan
cf8fc2a086 ice: update Unit Load Status bitmask to check after reset
After a reset the Unit Load Status bits in the GLNVM_ULD register to check
for completion should be 0x7FF before continuing.  Update the mask to check
(minus the three reserved bits that are always set).

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:45 -08:00
Bruce Allan
fbf1e1f698 ice: fix and consolidate logging of NVM/firmware version information
Logging the firmware/NVM information during driver load is redundant since
that information is also available via ethtool.  Move the functionality
found in ice_nvm_version_str() directly into ice_get_drvinfo() and remove
calling the former and logging that info during driver probe.  This also
gets rid of a bug in ice_nvm_version_str() where it returns a pointer to
a buffer which is free'ed when that function exits.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:41 -08:00
Akeem G Abodunrin
b55e603252 ice: Modify link message logging
This patch modifies link message logging to include "Full Duplex" and
"Negotiated" for FEC, so as to distinguish it from "Requested" FEC.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:35 -08:00
Anirudh Venkataramanan
a8b72ce03a ice: Remove CONFIG_PCI_IOV wrap in ice_set_pf_caps
Remove unnecessary CONFIG_PCI_IOV wrapping in ice_set_pf_caps. None
of the data structures accessed within the block are wrapped with
this flag. When CONFIG_PCI_IOV is undefined, pf->num_vfs_supported
will be 0 anyway.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:31 -08:00
Brett Creeley
3d9f999080 ice: Remove ice_dev_onetime_setup()
ice_dev_onetime_setup contains driver workarounds needed for
firmware limitations. These issues have now been resolved in newer
NVMs so remove the function.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:26 -08:00
Brett Creeley
168983a8e1 ice: Don't allow same value for Rx tail to be written twice
Currently we compare the value we are about to write to the Rx tail
register with the previous value of next_to_use. The problem with this
is we only write tail on 8 descriptor boundaries, but next_to_use is
updated whenever we clean Rx descriptors. Fix this by comparing the
value we are about to write to tail with the previously written tail
value. This will prevent duplicate Rx tail bumps.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:22 -08:00
Paul Greenwalt
ad9a87bec3 ice: display supported and advertised link modes
Display all of the supported and advertised link modes based on the PHY
capability with media.

Displaying all supported modes is more informative then only displaying
the current link mode.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:18 -08:00
Dave Ertman
53977ee474 ice: Fix switch between FW and SW LLDP
When switching between FW and SW LLDP mode, the
number of configured TLV apps in the driver's
DCB configuration is getting out of synch with
what lldpad thinks is configured.  This is causing
a problem when shutting down lldpad.  The cleanup
is trying to delete TLV apps that are not defined
in the kernel.

Since the driver is keeping an accurate account
of the apps defined, use the drivers number of
apps to determine if there is an app to delete.
If the number of apps is <= 1, then do not
attempt to delete.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:14 -08:00
Dave Ertman
242b5e068b ice: Fix DCB rebuild after reset
The function ice_dcb_rebuild had some logic
flaws in it, and also didn't differentiate
between FW and SW modes needs.

For FW flow, the willing setting was being
forced to OFF and left that way.  Unwilling
in DCB FW mode is not a supported model.

Leave the config alone and use the return value
from the set command to determine if setting the
config was successful.

The SW DCB flow does not need to need to register
for MIB change events (as they are not used in
SW mode).

Use !is_sw_lldp checks to only perform FW specific
task while in FW mode.

Also adding a reapplication of the current DCB
config after a link event.  Some NVMs are not
maintaining their DCB configs across link events.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12 11:48:10 -08:00
Linus Torvalds
f2850dd5ee Kbuild fixes for v5.6
- fix memory corruption in scripts/kallsyms
 
  - fix the vmlinux link stage to correctly update compile.h
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl5EH2wVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGX/UP/0j/U3FxvLba8Ke/hIObdfZQyuhp
 kxvhq4+Mgylx9JQ84nME5HlFpoW83qyvLQGcEnj5yErMiYfOX62h8TPVKjMg5tvu
 w5Wik777pc0bAbrwBCzmO19rDzWX7TH/Gg0HrmgsZFNynF/mlFXVYtCJ0cIgmzxH
 zI9W/UpZX9Mns7gzdn0kig50KoidcdgH3W78KApCBrMK75KC3E4qSWumm11ToCGe
 Yh4n/TsqWhsMdJpnZRYIZq144bOHGvTqj62fgR3UgOIp4fgYEQZ1Eh/phMFu3YXm
 63nr5tx7lsRIb1RrFg9rljktEUAB9I9Nia9+Dpp8lDGnifciZFnyZm6mee0c8RzE
 NMD3FMjp+jSE9LfahdkrunPepI1vlsS0XQbRlSOTu26lEZiYOHiYZomE2Z1Jt5bf
 ancQJUMoQ2rRWAN816r1p2+CiRn/kRI0vWtxJmQylHx5AIN9Gaw6bF1/b67C3jVn
 QNrS7i0nsVZ45tUuzVDxwUMtsqRzS892Me0WvEua2vd53623qiVQnLeyIACdg40X
 HzgF8Q3EChyjR5248x7C4q0Vr3srwUIgHcPHM3I64dmCXAVGOH1G62V7s1BqZ3w7
 74xKy8RsCmpmODOvMSUMajBqouXR9LtXe+vcoECMjksvRbv5DQYHjM52lmBHROYq
 Qg/98qM7Lhr9MH0E
 =4lc5
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-fixes-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - fix memory corruption in scripts/kallsyms

 - fix the vmlinux link stage to correctly update compile.h

* tag 'kbuild-fixes-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: fix mismatch between .version and include/generated/compile.h
  scripts/kallsyms: fix memory corruption caused by write over-run
2020-02-12 10:28:42 -08:00
Kunihiko Hayashi
b9287f2ac3 net: ethernet: ave: Add capability of rgmii-id mode
This allows you to specify the type of rgmii-id that will enable phy
internal delay in ethernet phy-mode.

This adds all RGMII cases to all of get_pinmode() except LD11, because LD11
SoC doesn't support RGMII due to the constraint of the hardware. When RGMII
phy mode is specified in the devicetree for LD11, the driver will abort
with an error.

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-12 09:55:04 -08:00
Firo Yang
0f90522591 enic: prevent waking up stopped tx queues over watchdog reset
Recent months, our customer reported several kernel crashes all
preceding with following message:
NETDEV WATCHDOG: eth2 (enic): transmit queue 0 timed out
Error message of one of those crashes:
BUG: unable to handle kernel paging request at ffffffffa007e090

After analyzing severl vmcores, I found that most of crashes are
caused by memory corruption. And all the corrupted memory areas
are overwritten by data of network packets. Moreover, I also found
that the tx queues were enabled over watchdog reset.

After going through the source code, I found that in enic_stop(),
the tx queues stopped by netif_tx_disable() could be woken up over
a small time window between netif_tx_disable() and the
napi_disable() by the following code path:
napi_poll->
  enic_poll_msix_wq->
     vnic_cq_service->
        enic_wq_service->
           netif_wake_subqueue(enic->netdev, q_number)->
              test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state)
In turn, upper netowrk stack could queue skb to ENIC NIC though
enic_hard_start_xmit(). And this might introduce some race condition.

Our customer comfirmed that this kind of kernel crash doesn't occur over
90 days since they applied this patch.

Signed-off-by: Firo Yang <firo.yang@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-12 09:43:26 -08:00
Geert Uytterhoeven
d91771848f arm64: time: Replace <linux/clk-provider.h> by <linux/of_clk.h>
The arm64 time code is not a clock provider, and just needs to call
of_clk_init().

Hence it can include <linux/of_clk.h> instead of <linux/clk-provider.h>.

Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Will Deacon <will@kernel.org>
2020-02-12 17:26:38 +00:00
Chris Wilson
2aaaa5ee1c drm/i915: Mark the removal of the i915_request from the sched.link
Keep the rq->fence.flags consistent with the status of the
rq->sched.link, and clear the associated bits when decoupling the link
on retirement (as we may wish to inspect those flags independent of
other state).

Fixes: c3f1ed90e6 ("drm/i915/gt: Allow temporary suspension of inflight requests")
References: https://gitlab.freedesktop.org/drm/intel/issues/997
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-3-chris@chris-wilson.co.uk
(cherry picked from commit b4a9a149f9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 17:04:33 +02:00
Chris Wilson
a2f90f4ff3 drm/i915/execlists: Reclaim the hanging virtual request
If we encounter a hang on a virtual engine, as we process the hang the
request may already have been moved back to the virtual engine (we are
processing the hang on the physical engine). We need to reclaim the
request from the virtual engine so that the locking is consistent and
local to the real engine on which we will hold the request for error
state capturing.

v2: Pull the reclamation into execlists_hold() and assert that cannot be
called from outside of the reset (i.e. with the tasklet disabled).
v3: Added selftest
v4: Drop the reference owned by the virtual engine

Fixes: ad18ba7b5e ("drm/i915/execlists: Offline error capture")
Testcase: igt/gem_exec_balancer/hang
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-2-chris@chris-wilson.co.uk
(cherry picked from commit 989df3a7bd)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 17:03:53 +02:00
Chris Wilson
317e0395cc drm/i915/execlists: Take a reference while capturing the guilty request
Thanks to preempt-to-busy, we leave the request on the HW as we submit
the preemption request. This means that the request may complete at any
moment as we process HW events, and in particular the request may be
retired as we are planning to capture it for a preemption timeout.

Be more careful while obtaining the request to capture after a
preemption timeout, and check to see if it completed before we were able
to put it on the on-hold list. If we do see it did complete just before
we capture the request, proclaim the preemption-timeout a false positive
and pardon the reset as we should hit an arbitration point momentarily
and so be able to process the preemption.

Note that even after we move the request to be on hold it may be retired
(as the reset to stop the HW comes after), so we do require to hold our
own reference as we work on the request for capture (and all of the
peeking at state within the request needs to be carefully protected).

Fixes: c3f1ed90e6 ("drm/i915/gt: Allow temporary suspension of inflight requests")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/997
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200122140243.495621-1-chris@chris-wilson.co.uk
(cherry picked from commit 4ba5c086a1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 17:02:53 +02:00
Chris Wilson
ad18ba7b5e drm/i915/execlists: Offline error capture
Currently, we skip error capture upon forced preemption. We apply forced
preemption when there is a higher priority request that should be
running but is being blocked, and we skip inline error capture so that
the preemption request is not further delayed by a user controlled
capture -- extending the denial of service.

However, preemption reset is also used for heartbeats and regular GPU
hangs. By skipping the error capture, we remove the ability to debug GPU
hangs.

In order to capture the error without delaying the preemption request
further, we can do an out-of-line capture by removing the guilty request
from the execution queue and scheduling a worker to dump that request.
When removing a request, we need to remove the entire context and all
descendants from the execution queue, so that they do not jump past.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/738
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-3-chris@chris-wilson.co.uk
(cherry picked from commit 748317386a)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 16:55:58 +02:00
Chris Wilson
c3f1ed90e6 drm/i915/gt: Allow temporary suspension of inflight requests
In order to support out-of-line error capture, we need to remove the
active request from HW and put it to one side while a worker compresses
and stores all the details associated with that request. (As that
compression may take an arbitrary user-controlled amount of time, we
want to let the engine continue running on other workloads while the
hanging request is dumped.) Not only do we need to remove the active
request, but we also have to remove its context and all requests that
were dependent on it (both in flight, queued and future submission).

Finally once the capture is complete, we need to be able to resubmit the
request and its dependents and allow them to execute.

v2: Replace stack recursion with a simple list.
v3: Check all the parents, not just the first, when searching for a
stuck ancestor!

References: https://gitlab.freedesktop.org/drm/intel/issues/738
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk
(cherry picked from commit 32ff621fd7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 16:55:58 +02:00
Chris Wilson
9e2750fc80 drm/i915: Keep track of request among the scheduling lists
If we keep track of when the i915_request.sched.link is on the HW
runlist, or in the priority queue we can simplify our interactions with
the request (such as during rescheduling). This also simplifies the next
patch where we introduce a new in-between list, for requests that are
ready but neither on the run list or in the queue.

v2: Update i915_sched_node.link explanation for current usage where it
is a link on both the queue and on the runlists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-1-chris@chris-wilson.co.uk
(cherry picked from commit 672c368f93)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 16:55:58 +02:00
Jani Nikula
cc3251d8ef Merge tag 'gvt-fixes-2020-02-12' of https://github.com/intel/gvt-linux into drm-intel-next-fixes
gvt-fixes-2020-02-12

- fix possible high-order allocation fail for late load (Igor)
- fix one missed lock for ppgtt mm LRU list (Igor)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200212065912.GB4997@zhen-hp.sh.intel.com
2020-02-12 16:50:04 +02:00
Chris Wilson
2933803bdc drm/i915/gem: Tighten checks and acquiring the mmap object
Make sure we hold the rcu lock as we acquire the rcu protected reference
of the object when looking it up from the associated mmap vma.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1083
Fixes: cc662126b4 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130143931.1906301-1-chris@chris-wilson.co.uk
(cherry picked from commit 280d14a69d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 13:24:45 +02:00
José Roberto de Souza
52144db130 drm/i915: Fix preallocated barrier list append
Only the first and the last nodes were being added to
ref->preallocated_barriers.

Renaming variables to make it more easy to read.

Fixes: 8413502238 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200129232345.84512-1-jose.souza@intel.com
(cherry picked from commit d4c3c0b822)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 13:24:45 +02:00
Chris Wilson
5b92415e64 drm/i915/gt: Acquire ce->active before ce->pin_count/ce->pin_mutex
Similar to commit ac0e331a62 ("drm/i915: Tighten atomicity of
i915_active_acquire vs i915_active_release") we have the same race of
trying to pin the context underneath a mutex while allowing the
decrement to be atomic outside of that mutex. This leads to the problem
where two threads may simultaneously try to pin the context and the
second not notice that they needed to repin the context.

<2> [198.669621] kernel BUG at drivers/gpu/drm/i915/gt/intel_timeline.c:387!
<4> [198.669703] invalid opcode: 0000 [#1] PREEMPT SMP PTI
<4> [198.669712] CPU: 0 PID: 1246 Comm: gem_exec_create Tainted: G     U  W         5.5.0-rc6-CI-CI_DRM_7755+ #1
<4> [198.669723] Hardware name:  /NUC7i5BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017
<4> [198.669776] RIP: 0010:timeline_advance+0x7b/0xe0 [i915]
<4> [198.669785] Code: 00 48 c7 c2 10 f1 46 a0 48 c7 c7 70 1b 32 a0 e8 bb dd e7 e0 bf 01 00 00 00 e8 d1 af e7 e0 31 f6 bf 09 00 00 00 e8 35 ef d8 e0 <0f> 0b 48 c7 c1 48 fa 49 a0 ba 84 01 00 00 48 c7 c6 10 f1 46 a0 48
<4> [198.669803] RSP: 0018:ffffc900004c3a38 EFLAGS: 00010296
<4> [198.669810] RAX: ffff888270b35140 RBX: ffff88826f32ee00 RCX: 0000000000000006
<4> [198.669818] RDX: 00000000000017c5 RSI: 0000000000000000 RDI: 0000000000000009
<4> [198.669826] RBP: ffffc900004c3a64 R08: 0000000000000000 R09: 0000000000000000
<4> [198.669834] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88826f9b5980
<4> [198.669841] R13: 0000000000000cc0 R14: ffffc900004c3dc0 R15: ffff888253610068
<4> [198.669849] FS:  00007f63e663fe40(0000) GS:ffff888276c00000(0000) knlGS:0000000000000000
<4> [198.669857] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [198.669864] CR2: 00007f171f8e39a8 CR3: 000000026b1f6005 CR4: 00000000003606f0
<4> [198.669872] Call Trace:
<4> [198.669924]  intel_timeline_get_seqno+0x12/0x40 [i915]
<4> [198.669977]  __i915_request_create+0x76/0x5a0 [i915]
<4> [198.670024]  i915_request_create+0x86/0x1c0 [i915]
<4> [198.670068]  i915_gem_do_execbuffer+0xbf2/0x2500 [i915]
<4> [198.670082]  ? __lock_acquire+0x460/0x15d0
<4> [198.670128]  i915_gem_execbuffer2_ioctl+0x11f/0x470 [i915]
<4> [198.670171]  ? i915_gem_execbuffer_ioctl+0x300/0x300 [i915]
<4> [198.670181]  drm_ioctl_kernel+0xa7/0xf0
<4> [198.670188]  drm_ioctl+0x2e1/0x390
<4> [198.670233]  ? i915_gem_execbuffer_ioctl+0x300/0x300 [i915]

Fixes: 8413502238 ("drm/i915/gt: Drop mutex serialisation between context pin/unpin")
References: ac0e331a62 ("drm/i915: Tighten atomicity of i915_active_acquire vs i915_active_release")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200127152829.2842149-1-chris@chris-wilson.co.uk
(cherry picked from commit e5429340bf)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 13:24:45 +02:00
Chris Wilson
7c34bb0398 drm/i915: Tighten atomicity of i915_active_acquire vs i915_active_release
As we use a mutex to serialise the first acquire (as it may be a lengthy
operation), but only an atomic decrement for the release, we have to
be careful in case a second thread races and completes both
acquire/release as the first finishes its acquire.

Thread A			Thread B
i915_active_acquire		i915_active_acquire
  atomic_read() == 0		  atomic_read() == 0
  mutex_lock()			  mutex_lock()
				  atomic_read() == 0
				    ref->active();
				  atomic_inc()
				  mutex_unlock()
  atomic_read() == 1
				i915_active_release
				  atomic_dec_and_test() -> 0
				    ref->retire()
  atomic_inc() -> 1
  mutex_unlock()

So thread A has acquired the ref->active_count but since the ref was
still active at the time, it did not initialise it. By switching the
check inside the mutex to an atomic increment only if already active, we
close the race.

Fixes: c9ad602fea ("drm/i915: Split i915_active.mutex into an irq-safe spinlock for the rbtree")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200126102346.1877661-3-chris@chris-wilson.co.uk
(cherry picked from commit ac0e331a62)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 13:24:44 +02:00
Chris Wilson
9556e5c7c4 drm/i915: Stub out i915_gpu_coredump_put
i915_gpu_coreddump_put is currently only defined if
CONFIG_DRM_I915_CAPTURE_ERROR is enabled, provide a stub otherwise.

Reported-by: Mike Lothian <mike@fireburn.co.uk>
Fixes: 742379c0c4 ("drm/i915: Start chopping up the GPU error capture")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mike Lothian <mike@fireburn.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200124192255.541355-1-chris@chris-wilson.co.uk
(cherry picked from commit 7e36505d0c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-02-12 13:24:34 +02:00
Takashi Iwai
7dafba3762 ALSA: hda/realtek - Fix silent output on MSI-GL73
MSI-GL73 laptop with ALC1220 codec requires a similar workaround for
Clevo laptops to enforce the DAC/mixer connection path.  Set up a
quirk entry for that.

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=204159
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200212081047.27727-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-02-12 09:11:19 +01:00
Kailang Yang
2b3b6497c3 ALSA: hda/realtek - Add more codec supported Headset Button
Add supported Headset Button for ALC215/ALC285/ALC289.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/948f70b4488f4cc2b629a39ce4e4be33@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-02-12 08:49:11 +01:00
David S. Miller
b44beb8ae5 Merge branch 'Bug-fixes-for-ENA-Ethernet-driver'
Sameeh Jubran says:

====================
Bug fixes for ENA Ethernet driver

Difference from V1:
* Started using netdev_rss_key_fill()
* Dropped superflous changes that are not related to bug fixes as
  requested by Jakub
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:31 -08:00
Arthur Kiyanovski
c207979f5a net: ena: ena-com.c: prevent NULL pointer dereference
comp_ctx can be NULL in a very rare case when an admin command is executed
during the execution of ena_remove().

The bug scenario is as follows:

* ena_destroy_device() sets the comp_ctx to be NULL
* An admin command is executed before executing unregister_netdev(),
  this can still happen because our device can still receive callbacks
  from the netdev infrastructure such as ethtool commands.
* When attempting to access the comp_ctx, the bug occurs since it's set
  to NULL

Fix:
Added a check that comp_ctx is not NULL

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:31 -08:00
Sameeh Jubran
886d208927 net: ena: ethtool: use correct value for crc32 hash
Up till kernel 4.11 there was no enum defined for crc32 hash in ethtool,
thus the xor enum was used for supporting crc32.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:31 -08:00
Arthur Kiyanovski
470793a78c net: ena: make ena rxfh support ETH_RSS_HASH_NO_CHANGE
As the name suggests ETH_RSS_HASH_NO_CHANGE is received upon changing
the key or indirection table using ethtool while keeping the same hash
function.

Also add a function for retrieving the current hash function from
the ena-com layer.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Saeed Bshara <saeedb@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:31 -08:00
Arthur Kiyanovski
e3f89f91e9 net: ena: fix corruption of dev_idx_to_host_tbl
The function ena_com_ind_tbl_convert_from_device() has an overflow
bug as explained below. Either way, this function is not needed at
all since we don't retrieve the indirection table from the device
at any point which means that this conversion is not needed.

The bug:
The for loop iterates over all io_sq_queues, when passing the actual
number of used queues the io_sq_queues[i].idx equals 0 since they are
uninitialized which results in the following code to be executed till
the end of the loop:

dev_idx_to_host_tbl[0] = i;

This results dev_idx_to_host_tbl[0] in being equal to
ENA_TOTAL_NUM_QUEUES - 1.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:31 -08:00
Arthur Kiyanovski
92569fd27f net: ena: fix incorrectly saving queue numbers when setting RSS indirection table
The indirection table has the indices of the Rx queues. When we store it
during set indirection operation, we convert the indices to our internal
representation of the indices.

Our internal representation of the indices is: even indices for Tx and
uneven indices for Rx, where every Tx/Rx pair are in a consecutive order
starting from 0. For example if the driver has 3 queues (3 for Tx and 3
for Rx) then the indices are as follows:
0  1  2  3  4  5
Tx Rx Tx Rx Tx Rx

The BUG:
The issue is that when we satisfy a get request for the indirection
table, we don't convert the indices back to the original representation.

The FIX:
Simply apply the inverse function for the indices of the indirection
table after we set it.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:31 -08:00
Arthur Kiyanovski
4844470d47 net: ena: rss: store hash function as values and not bits
The device receives, stores and retrieves the hash function value as bits
and not as their enum value.

The bug:
* In ena_com_set_hash_function() we set
  cmd.u.flow_hash_func.selected_func to the bit value of rss->hash_func.
 (1 << rss->hash_func)
* In ena_com_get_hash_function() we retrieve the hash function and store
  it's bit value in rss->hash_func. (Now the bit value of rss->hash_func
  is stored in rss->hash_func instead of it's enum value)

The fix:
This commit fixes the issue by converting the retrieved hash function
values from the device to the matching enum value of the set bit using
ffs(). ffs() finds the first set bit's index in a word. Since the function
returns 1 for the LSB's index, we need to subtract 1 from the returned
value (note that BIT(0) is 1).

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:31 -08:00
Sameeh Jubran
0c8923c0a6 net: ena: rss: fix failure to get indirection table
On old hardware, getting / setting the hash function is not supported while
gettting / setting the indirection table is.

This commit enables us to still show the indirection table on older
hardwares by setting the hash function and key to NULL.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:30 -08:00
Sameeh Jubran
6a4f7dc82d net: ena: rss: do not allocate key when not supported
Currently we allocate the key whether the device supports setting the
key or not. This commit adds a check to the allocation function and
handles the error accordingly.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:30 -08:00
Arthur Kiyanovski
0d1c3de7b8 net: ena: fix incorrect default RSS key
Bug description:
When running "ethtool -x <if_name>" the key shows up as all zeros.

When we use "ethtool -X <if_name> hfunc toeplitz hkey <some:random:key>" to
set the key and then try to retrieve it using "ethtool -x <if_name>" then
we return the correct key because we return the one we saved.

Bug cause:
We don't fetch the key from the device but instead return the key
that we have saved internally which is by default set to zero upon
allocation.

Fix:
This commit fixes the issue by initializing the key to a random value
using netdev_rss_key_fill().

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:30 -08:00
Arthur Kiyanovski
cf6d17fde9 net: ena: add missing ethtool TX timestamping indication
Current implementation of the driver calls skb_tx_timestamp()to add a
software tx timestamp to the skb, however the software-transmit capability
is not reported in ethtool -T.

This commit updates the ethtool structure to report the software-transmit
capability in ethtool -T using the standard ethtool_op_get_ts_info().
This function reports all software timestamping capabilities (tx and rx),
as well as setting phc_index = -1. phc_index is the index of the PTP
hardware clock device that will be used for hardware timestamps. Since we
don't have such a device in ENA, using the default -1 value is the correct
setting.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Ezequiel Lara Gomez <ezegomez@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:30 -08:00
Arthur Kiyanovski
2a6e5fa2f4 net: ena: fix uses of round_jiffies()
>From the documentation of round_jiffies():
"Rounds a time delta  in the future (in jiffies) up or down to
(approximately) full seconds. This is useful for timers for which
the exact time they fire does not matter too much, as long as
they fire approximately every X seconds.
By rounding these timers to whole seconds, all such timers will fire
at the same time, rather than at various times spread out. The goal
of this is to have the CPU wake up less, which saves power."

There are 2 parts to this patch:
================================
Part 1:
-------
In our case we need timer_service to be called approximately every
X=1 seconds, and the exact time does not matter, so using round_jiffies()
is the right way to go.

Therefore we add round_jiffies() to the mod_timer() in ena_timer_service().

Part 2:
-------
round_jiffies() is used in check_for_missing_keep_alive() when
getting the jiffies of the expiration of the keep_alive timeout. Here it
is actually a mistake to use round_jiffies() because we want the exact
time when keep_alive should expire and not an approximate rounded time,
which can cause early, false positive, timeouts.

Therefore we remove round_jiffies() in the calculation of
keep_alive_expired() in check_for_missing_keep_alive().

Fixes: 82ef30f13b ("net: ena: add hardware hints capability to the driver")
Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:30 -08:00
Arthur Kiyanovski
91a65b7d3e net: ena: fix potential crash when rxfh key is NULL
When ethtool -X is called without an hkey, ena_com_fill_hash_function()
is called with key=NULL, which is passed to memcpy causing a crash.

This commit fixes this issue by checking key is not NULL.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-11 17:08:30 -08:00