Commit Graph

464 Commits

Author SHA1 Message Date
John Hubbard
2170ecfa76 drm/i915: convert get_user_pages() --> pin_user_pages()
This code was using get_user_pages*(), in a "Case 2" scenario (DMA/RDMA),
using the categorization from [1].  That means that it's time to convert
the get_user_pages*() + put_page() calls to pin_user_pages*() +
unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small part
of fixing a long-standing disconnect between pinning pages, and file
systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: "Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://lkml.kernel.org/r/20200519002124.2025955-5-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:42 -07:00
Linus Torvalds
faa392181a drm pull for 5.8-rc1
core:
 - uapi: error out EBUSY when existing master
 - uapi: rework SET/DROP MASTER permission handling
 - remove drm_pci.h
 - drm_pci* are now legacy
 - introduced managed DRM resources
 - subclassing support for drm_framebuffer
 - simple encoder helper
 - edid improvements
 - vblank + writeback documentation improved
 - drm/mm - optimise tree searches
 - port drivers to use devm_drm_dev_alloc
 
 dma-buf:
 - add flag for p2p buffer support
 
 mst:
 - ACT timeout improvements
 - remove drm_dp_mst_has_audio
 - don't use 2nd TX slot - spec recommends against it
 
 bridge:
 - dw-hdmi various improvements
 - chrontel ch7033 support
 - fix stack issues with old gcc
 
 hdmi:
 - add unpack function for drm infoframe
 
 fbdev:
 - misc fbdev driver fixes
 
 i915:
 - uapi: global sseu pinning
 - uapi: OA buffer polling
 - uapi: remove generated perf code
 - uapi: per-engine default property values in sysfs
 - Tigerlake GEN12 enabled.
 - Lots of gem refactoring
 - Tigerlake enablement patches
 - move to drm_device logging
 - Icelake gamma HW readout
 - push MST link retrain to hotplug work
 - bandwidth atomic helpers
 - ICL fixes
 - RPS/GT refactoring
 - Cherryview full-ppgtt support
 - i915 locking guidelines documented
 - require linear fb stride to be 512 multiple on gen9
 - Tigerlake SAGV support
 
 amdgpu:
 - uapi: encrypted GPU memory handling
 - uapi: add MEM_SYNC IB flag
 - p2p dma-buf support
 - export VRAM dma-bufs
 - FRU chip access support
 - RAS/SR-IOV updates
 - Powerplay locking fixes
 - VCN DPG (powergating) enablement
 - GFX10 clockgating fixes
 - DC fixes
 - GPU reset fixes
 - navi SDMA fix
 - expose FP16 for modesetting
 - DP 1.4 compliance fixes
 - gfx10 soft recovery
 - Improved Critical Thermal Faults handling
 - resizable BAR on gmc10
 
 amdkfd:
 - uapi: GWS resource management
 - track GPU memory per process
 - report PCI domain in topology
 
 radeon:
 - safe reg list generator fixes
 
 nouveau:
 - HD audio fixes on recent systems
 - vGPU detection (fail probe if we're on one, for now)
 - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it)
 - SVM improvements/fixes
 - NVIDIA format modifier support
 - Misc other fixes.
 
 adv7511:
 - HDMI SPDIF support
 
 ast:
 - allocate crtc state size
 - fix double assignment
 - fix suspend
 
 bochs:
 - drop connector register
 
 cirrus:
 - move to tiny drivers.
 
 exynos:
 - fix imported dma-buf mapping
 - enable runtime PM
 - fixes and cleanups
 
 mediatek:
 - DPI pin mode swap
 - config mipi_tx current/impedance
 
 lima:
 - devfreq + cooling device support
 - task handling improvements
 - runtime PM support
 
 pl111:
 - vexpress init improvements
 - fix module auto-load
 
 rcar-du:
 - DT bindings conversion to YAML
 - Planes zpos sanity check and fix
 - MAINTAINERS entry for LVDS panel driver
 
 mcde:
 - fix return value
 
 mgag200:
 - use managed config init
 
 stm:
 - read endpoints from DT
 
 vboxvideo:
 - use PCI managed functions
 - drop WC mtrr
 
 vkms:
 - enable cursor by default
 
 rockchip:
 - afbc support
 
 virtio:
 - various cleanups
 
 qxl:
 - fix cursor notify port
 
 hisilicon:
 - 128-byte stride alignment fix
 
 sun4i:
 - improved format handling
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJe1edsAAoJEAx081l5xIa+bKEQAJAZv/8OMM2rx+p+GyKgrNpl
 ihTX/oyToy8dw97s1kWF7V5kKU+qjF8aWlKoPS0xovzaMAzYSFz9FRNEUgqtTXMI
 zIAzSXioqP21oL9/ZTHcXDULtz8Gk3uiPomgXMWLlNBdt3X5qvCwsmPRIYSwG0GJ
 00VCvxDbVxGSM3wzcvbfyRwHCq3SrFvIusXv5jHnnxEFGH0C7Mj2/FLYMKLNjvli
 Q8VEI2wQPZj1QdA8fLFVneIQsR6YUSko9OfFMANP8VJGpPMnUkvVxTJ5ACGJspvn
 U/h6NYqJeUU2Y3BSKqtjIC3a1LY51tp5tL9q4H9TD1hqMckt6F2V7T2IeFU8i6+V
 YzUsSiT4q1xB+uiFVcgopx2hyIp8INOEyWrVdYgw2JviROeRD+pDHvJd13ZNMnTe
 GvLWQ/PfBFrcz8eligjiYjOf66ZTU+j/rivaOBFyrs9gdlsaEW2QRurFrcNX+0lZ
 kDbLsIFjhYnPXsvHP87x4BuQCKQIEh8wWuxXuJjunBPdqVrJyltZWbBiKO571b5/
 BtX6xj6ztUOffR2RdiVanzY546I2hEi7SHMUuWnMqXsOV46GBN0QvlpZad/47n9x
 ZUy8HDDD0/qWuGwvPOJGIeAnUteWge9AhWXTeN5+1h5m+QEOzYkPKqC3Hp8TW1pM
 gToTWgAhnu731fhzLWyt
 =H7IS
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Highlights:

   - Core DRM had a lot of refactoring around managed drm resources to
     make drivers simpler.

   - Intel Tigerlake support is on by default

   - amdgpu now support p2p PCI buffer sharing and encrypted GPU memory

  Details:

  core:
   - uapi: error out EBUSY when existing master
   - uapi: rework SET/DROP MASTER permission handling
   - remove drm_pci.h
   - drm_pci* are now legacy
   - introduced managed DRM resources
   - subclassing support for drm_framebuffer
   - simple encoder helper
   - edid improvements
   - vblank + writeback documentation improved
   - drm/mm - optimise tree searches
   - port drivers to use devm_drm_dev_alloc

  dma-buf:
   - add flag for p2p buffer support

  mst:
   - ACT timeout improvements
   - remove drm_dp_mst_has_audio
   - don't use 2nd TX slot - spec recommends against it

  bridge:
   - dw-hdmi various improvements
   - chrontel ch7033 support
   - fix stack issues with old gcc

  hdmi:
   - add unpack function for drm infoframe

  fbdev:
   - misc fbdev driver fixes

  i915:
   - uapi: global sseu pinning
   - uapi: OA buffer polling
   - uapi: remove generated perf code
   - uapi: per-engine default property values in sysfs
   - Tigerlake GEN12 enabled.
   - Lots of gem refactoring
   - Tigerlake enablement patches
   - move to drm_device logging
   - Icelake gamma HW readout
   - push MST link retrain to hotplug work
   - bandwidth atomic helpers
   - ICL fixes
   - RPS/GT refactoring
   - Cherryview full-ppgtt support
   - i915 locking guidelines documented
   - require linear fb stride to be 512 multiple on gen9
   - Tigerlake SAGV support

  amdgpu:
   - uapi: encrypted GPU memory handling
   - uapi: add MEM_SYNC IB flag
   - p2p dma-buf support
   - export VRAM dma-bufs
   - FRU chip access support
   - RAS/SR-IOV updates
   - Powerplay locking fixes
   - VCN DPG (powergating) enablement
   - GFX10 clockgating fixes
   - DC fixes
   - GPU reset fixes
   - navi SDMA fix
   - expose FP16 for modesetting
   - DP 1.4 compliance fixes
   - gfx10 soft recovery
   - Improved Critical Thermal Faults handling
   - resizable BAR on gmc10

  amdkfd:
   - uapi: GWS resource management
   - track GPU memory per process
   - report PCI domain in topology

  radeon:
   - safe reg list generator fixes

  nouveau:
   - HD audio fixes on recent systems
   - vGPU detection (fail probe if we're on one, for now)
   - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it)
   - SVM improvements/fixes
   - NVIDIA format modifier support
   - Misc other fixes.

  adv7511:
   - HDMI SPDIF support

  ast:
   - allocate crtc state size
   - fix double assignment
   - fix suspend

  bochs:
   - drop connector register

  cirrus:
   - move to tiny drivers.

  exynos:
   - fix imported dma-buf mapping
   - enable runtime PM
   - fixes and cleanups

  mediatek:
   - DPI pin mode swap
   - config mipi_tx current/impedance

  lima:
   - devfreq + cooling device support
   - task handling improvements
   - runtime PM support

  pl111:
   - vexpress init improvements
   - fix module auto-load

  rcar-du:
   - DT bindings conversion to YAML
   - Planes zpos sanity check and fix
   - MAINTAINERS entry for LVDS panel driver

  mcde:
   - fix return value

  mgag200:
   - use managed config init

  stm:
   - read endpoints from DT

  vboxvideo:
   - use PCI managed functions
   - drop WC mtrr

  vkms:
   - enable cursor by default

  rockchip:
   - afbc support

  virtio:
   - various cleanups

  qxl:
   - fix cursor notify port

  hisilicon:
   - 128-byte stride alignment fix

  sun4i:
   - improved format handling"

* tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm: (1401 commits)
  drm/amd/display: Fix potential integer wraparound resulting in a hang
  drm/amd/display: drop cursor position check in atomic test
  drm/amdgpu: fix device attribute node create failed with multi gpu
  drm/nouveau: use correct conflicting framebuffer API
  drm/vblank: Fix -Wformat compile warnings on some arches
  drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
  drm/amd/display: Handle GPU reset for DC block
  drm/amdgpu: add apu flags (v2)
  drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven
  drm/amdgpu: fix pm sysfs node handling (v2)
  drm/amdgpu: move gpu_info parsing after common early init
  drm/amdgpu: move discovery gfx config fetching
  drm/nouveau/dispnv50: fix runtime pm imbalance on error
  drm/nouveau: fix runtime pm imbalance on error
  drm/nouveau: fix runtime pm imbalance on error
  drm/nouveau/debugfs: fix runtime pm imbalance on error
  drm/nouveau/nouveau/hmm: fix migrate zero page to GPU
  drm/nouveau/nouveau/hmm: fix nouveau_dmem_chunk allocations
  drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST
  drm/nouveau/kms/nv50-: Move 8BPC limit for MST into nv50_mstc_get_modes()
  ...
2020-06-02 15:04:15 -07:00
Linus Torvalds
94709049fb Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:
 "A few little subsystems and a start of a lot of MM patches.

  Subsystems affected by this patch series: squashfs, ocfs2, parisc,
  vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup,
  swap, memcg, pagemap, memory-failure, vmalloc, kasan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
  kasan: move kasan_report() into report.c
  mm/mm_init.c: report kasan-tag information stored in page->flags
  ubsan: entirely disable alignment checks under UBSAN_TRAP
  kasan: fix clang compilation warning due to stack protector
  x86/mm: remove vmalloc faulting
  mm: remove vmalloc_sync_(un)mappings()
  x86/mm/32: implement arch_sync_kernel_mappings()
  x86/mm/64: implement arch_sync_kernel_mappings()
  mm/ioremap: track which page-table levels were modified
  mm/vmalloc: track which page-table levels were modified
  mm: add functions to track page directory modifications
  s390: use __vmalloc_node in stack_alloc
  powerpc: use __vmalloc_node in alloc_vm_stack
  arm64: use __vmalloc_node in arch_alloc_vmap_stack
  mm: remove vmalloc_user_node_flags
  mm: switch the test_vmalloc module to use __vmalloc_node
  mm: remove __vmalloc_node_flags_caller
  mm: remove both instances of __vmalloc_node_flags
  mm: remove the prot argument to __vmalloc_node
  mm: remove the pgprot argument to __vmalloc
  ...
2020-06-02 12:21:36 -07:00
Christoph Hellwig
d4efd79a81 mm: remove the prot argument from vm_map_ram
This is always PAGE_KERNEL - for long term mappings with other properties
vmap should be used.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200414131348.444715-19-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:11 -07:00
Linus Torvalds
17839856fd gup: document and work around "COW can break either way" issue
Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.

Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.

End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.

So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.

At the same time, some users simply don't even care.

For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.

This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.

The current semantics end up being:

 - __get_user_pages_fast(): no change. If you don't ask for a write,
   you won't break COW. You'd better know what you're doing.

 - get_user_pages_fast(): the fast-case "look it up in the page tables
   without anything getting mmap_sem" now refuses to follow a read-only
   page, since it might need COW breaking.  Which happens in the slow
   path - the fast path doesn't know if the memory might be COW or not.

 - get_user_pages() (including the slow-path fallback for gup_fast()):
   for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
   very similar semantics to FOLL_FORCE.

If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path.  So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".

Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.

But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.

[ It might be worth noting that we've always had this ambiguity, and it
  could arguably be seen as a user-space issue.

  You only get private COW mappings that could break either way in
  situations where user space is doing cooperative things (ie fork()
  before an execve() etc), but it _is_ surprising and very subtle, and
  fork() is supposed to give you independent address spaces.

  So let's treat this as a kernel issue and make the semantics of
  get_user_pages() easier to understand. Note that obviously a true
  shared mapping will still get a page that can change under us, so this
  does _not_ mean that get_user_pages() somehow returns any "stable"
  page ]

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Christoph Hellwig <hch@lst.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill Shutemov <kirill@shutemov.name>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:19:17 -07:00
Linus Torvalds
e148a8f948 Merge branch 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull uaccess/readdir updates from Al Viro:
 "Finishing the conversion of readdir.c to unsafe_... API.

  This includes the uaccess_{read,write}_begin series by Christophe
  Leroy"

* 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  readdir.c: get rid of the last __put_user(), drop now-useless access_ok()
  readdir.c: get compat_filldir() more or less in sync with filldir()
  switch readdir(2) to unsafe_copy_dirent_name()
  drm/i915/gem: Replace user_access_begin by user_write_access_begin
  uaccess: Selectively open read or write user access
  uaccess: Add user_read_access_begin/end and user_write_access_begin/end
2020-06-01 16:11:38 -07:00
Dave Airlie
6cf991611b Merge tag 'drm-intel-next-2020-05-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- drm/i915: Show per-engine default property values in sysfs

    By providing the default values configured into the kernel via sysfs, it
    is much more convenient for userspace to restore those sane defaults, or
    at least know what are considered good baseline. This is useful, for
    example, to cleanup after any failed userspace prior to commencing new
    jobs.

Cross-subsystem Changes:

- video/hdmi: Add Unpack only function for DRM infoframe
- Includes pull request gvt-next-2020-05-12

Driver Changes:

- Restore Cherryview back to full-ppgtt (Chris, Mika)
- Document locking guidelines for i915 (Chris, Daniel, Joonas)
- Fix GitLab #1746: Handle idling during i915_gem_evict_something busy loops (Chris)
- Display WA #1105: Require linear fb stride to be multiple of 512 bytes on
  gen9/glk (Ville)
- Add Wa_14010685332 for ICP/ICL (Matt R)
- Restrict w/a 1607087056 for EHL/JSL (Swathi)
- Fix interrupt handling for DP AUX transactions on Tigerlake (Imre)
- Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" (Mika)
- Fix HDC pipeline flush hardware bit on Gen12 (Mika)
- Flush L3 when flushing render on Gen12 (Mika)
- Invalidate aux table entries forcibly between BB on Gen12 (Mika)
- Add aux table invalidate for all engines on Gen12 (Mika)
- Force pte cacheline to main memory Gen8+ (Mika)
- Add and enable TGL+ SAGV support (Stanislav)
- Implement vm_ops->access on i915 mmaps for GDB (Chris, Kristian)
- Replace zero-length array with flexible-array (Gustavo)
- Improve batch buffer pool effectiveness to mitigate soft-rc6 hit (Chris)
- Remove wait priority boosting (Chris)
- Keep driver module referenced when PMU is active (Chris)
- Sanitize RPS interrupts upon resume (Chris)
- Extend pcode read timeout to 20 ms (Chris)
- Wait for ACT sent before enabling MST pipe (Ville)
- Extend support to async relocations to SNB (Chris)
- Remove CNL pre-prod workarounds (Ville)
- Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled (Sultan)
- Record the active CCID from before reset (Chris)
- Mark concurrent submissions with a weak-dependency (Chris)
- Peel dma-fence-chains for await to allow engine-to-engine sync (Lionel)
- Prevent using semaphores to chain up to external fences (Chris)
- Fix GLK watermark calculations (Ville)
- Emit await(batch) before MI_BB_START (Chris)
- Reset execlists registers before HWSP (Chris)
- Drop no-semaphore boosting in favor of fast timeslicing (Chris)
- Fix enabled infoframe states of lspcon (Gwan-gyeong)
- Program DP SDPs on pipe updates (Gwan-gyeong)
- Stop sending DP SDPs on ddi disable (Gwan-gyeong)
- Store CS timestamp frequency in Hz (Ville)

- Remove unused HAS_FWTABLE macro (Pascal)
- Use batchbuffer chaining for relocations to save ring space (Chris)
- Try different engines for relocs if MI ops not supported (Chris, Tvrtko)
- Lazily acquire the device wakeref for freeing objects (Chris)
- Streamline display code arithmetics around rounding etc. (Ville)
- Use bw state for per crtc SAGV evaluation (Stanislav)
- Track active_pipes in bw_state (Stanislav)
- Nuke mode.vrefresh usage (Ville)
- Warn if the FBC is still writing to stolen on removal (Chris)
- Added new PCode commands prepping for QGV rescricting (Stansilav)
- Stop holding onto the pinned_default_state (Chris)
- Propagate error from completed fences (Chris)
- Ignore submit-fences on the same timeline (Chris)
- Pull waiting on an external dma-fence into its routine (Chris)
- Replace the hardcoded I915_FENCE_TIMEOUT with Kconfig (Chris)
- Mark up the racy read of execlists->context_tag (Chris)
- Tidy up the return handling for completed dma-fences (Chris)
- Introduce skl_plane_wm_level accessor (Stanislav)
- Extract SKL SAGV checking (Stanislav)
- Make active_pipes check skl specific (Stanislav)
- Suspend tasklets before resume sanitization (Chris)
- Remove redundant exec_fence (Chris)
- Mark the addition of the initial-breadcrumb in the request (Chris)
- Transfer old virtual breadcrumbs to irq_worker (Chris)
- Read the DP SDPs from the video DIP (Gwan-gyeong)
- Program DP SDPs with computed configs (Gwan-gyeong)
- Add state readout for DP VSC and DP HDR Metadata Infoframe SDP
  (Gwan-gyeong)
- Add compute routine for DP PSR VSC SDP (Gwan-gyeong)
- Use new DP VSC SDP compute routine on PSR (Gwan-gyeong)
- Restrict qgv points which don't have enough bandwidth. (Stanislav)
- Nuke pointless div by 64bit (Ville)

- Static checker code fixes (Nathan, Mika, Chris)
- Add logging function for DP VSC SDP (Gwan-gyeong)
- Include HDMI DRM infoframe, DP HDR metadata and DP VSC SDP in the
  crtc state dump (Gwan-gyeong)
- Make timeslicing explicit engine property (Chris, Tvrtko)
- Selftest and debugging improvements (Chris)
- Align variable names with BSpec (Ville)
- Tidy up gen8+ breadcrumb emission code (Chris)
- Turn intel_digital_port_connected() in a vfunc (Ville)
- Use stashed away hpd isr bits in intel_digital_port_connected() (Ville)
- Extract i915_cs_timestamp_{ns_to_ticks,tick_to_ns}() (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515160703.GA19043@jlahtine-desk.ger.corp.intel.com
2020-05-20 13:36:45 +10:00
Chris Wilson
18e4af04d2 drm/i915: Drop no-semaphore boosting
Now that we have fast timeslicing on semaphores, we no longer need to
prioritise none-semaphore work as we will yield any work blocked on a
semaphore to the next in the queue. Previously with no timeslicing,
blocking on the semaphore caused extremely bad scheduling with multiple
clients utilising multiple rings. Now, there is no impact and we can
remove the complication.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513173504.28322-1-chris@chris-wilson.co.uk
2020-05-14 06:14:33 +01:00
Dave Airlie
a1fb548962 Merge tag 'drm-intel-next-2020-04-30' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Driver Changes:

- Fix GitLab #1698: Performance regression with Linux 5.7-rc1 on
  Iris Plus 655 and 4K screen (Chris)
- Add Wa_14011059788 for Tigerlake (Matt A)
- Add per ctx batchbuffer wa for timestamp for Gen12 (Mika)
- Use indirect ctx bb to load cmd buffer control value
  from context image to avoid corruption (Mika)
- Enable DP Display Audio WA (Uma, Jani)
- Update forcewake firmware ranges for Icelake (Radhakrishna)
- Add missing deinitialization cases of load failure for display (Jose)
- Implement TC cold sequences for Icelake and Tigerlake (Jose)
- Unbreak enable_dpcd_backlight modparam (Lyude)
- Move the late flush_submission in retire to the end (Chris)
- Demote "Reducing compressed framebufer size" message to info (Peter)
- Push MST link retraining to the hotplug work (Ville)
- Hold obj->vma.lock over for_each_ggtt_vma() (Chris)
- Fix timeout handling during TypeC AUX power well enabling for ICL (Imre)
- Fix skl+ non-scaled pfit modes (Ville)
- Prefer soft-rc6 over RPS DOWN_TIMEOUT (Chris)
- Sanitize GT first before poisoning HWSP (Chris)
- Fix up clock RPS frequency readout (Chris)
- Avoid reusing the same logical CCID (Chris)
- Avoid dereferencing a dead context (Chris)
- Always enable busy-stats for execlists (Chris)
- Apply the aggressive downclocking to parking (Chris)
- Restore aggressive post-boost downclocking (Chris)

- Scrub execlists state on resume (Chris)
- Add debugfs attributes for LPSP (Ansuman)
- Improvements to kernel selftests (Chris, Mika)
- Add tiled blits selftest (Zbigniew)
- Fix error handling in __live_lrc_indirect_ctx_bb() (Dan)
- Add pre/post plane updates for SAGV (Stanislav)
- Add ICL PG3 PW ID for EHL (Anshuman)
- Fix Sphinx build duplicate label warning (Jani)
- Error log non-zero audio power refcount after unbind (Jani)
- Remove object_is_locked assertion from unpin_from_display_plane (Chris)
- Use single set of AUX powerwell ops for gen11+ (Matt R)
- Prefer drm_WARN_ON over WARN_ON (Pankaj)
- Poison residual state [HWSP] across resume (Chris, Tvrtko)
- Convert request-before-CS assertion to debug (Chris)
- Carefully order virtual_submission_tasklet (Chris)
- Check carefully for an idle engine in wait-for-idle (Chris)
- Only close vma we open (Chris)
- Trace RPS events (Chris)
- Use the RPM config register to determine clk frequencies (Chris)
- Drop rq->ring->vma peeking from error capture (Chris)
- Check preempt-timeout target before submit_ports (Chris)
- Check HWSP cacheline is valid before acquiring (Chris)
- Use proper fault mask in interrupt postinstall too (Matt R)
- Keep a no-frills swappable copy of the default context state (Chris)

- Add atomic helpers for bandwidth (Stanislav)
- Refactor setting dma info to a common helper from device info (Michael)
- Refactor DDI transcoder code for clairty (Ville)
- Extend PG3 power well ID to ICL (Anshuman)
- Refactor PFIT code for readability and future extensibility (Ville)
- Clarify code split between intel_ddi.c and intel_dp.c (Ville)
- Move out code to return the digital_port of the aux ch (Jose)
- Move rps.enabled/active  and use of RPS interrupts to flags (Chris)
- Remove superfluous inlines and dead code (Jani)
- Re-disable -Wframe-address from top-level Makefile (Nick)
- Static checker and spelling fixes (Colin, Nathan)
- Split long lines (Ville)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430124904.GA100924@jlahtine-desk.ger.corp.intel.com
2020-05-14 11:33:10 +10:00
Chris Wilson
889333c772 drm/i915/gem: Remove redundant exec_fence
Since there can only be one of in_fence/exec_fence, just use the single
in_fence local.

v2: Consolidate lookup

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200513180937.28992-1-chris@chris-wilson.co.uk
2020-05-13 20:02:24 +01:00
Chris Wilson
9bad40a27d drm/i915/selftests: Always flush before unpining after writing
Be consistent, and even when we know we had used a WC, flush the mapped
object after writing into it. The flush understands the mapping type and
will only clflush if !I915_MAP_WC, but will always insert a wmb [sfence]
so that we can be sure that all writes are visible.

v2: Add the unconditional wmb so we are know that we always flush the
writes to memory/HW at that point.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200511141304.599-1-chris@chris-wilson.co.uk
2020-05-11 16:50:04 +01:00
Chris Wilson
b0a997ae52 drm/i915: Emit await(batch) before MI_BB_START
Be consistent and ensure that we always emit the asynchronous waits
prior to issuing instructions that use the address. This ensures that if
we do emit GPU commands to do the await, they are before our use!

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200510102431.21959-1-chris@chris-wilson.co.uk
2020-05-11 16:50:04 +01:00
Chris Wilson
16dc224f1c drm/i915: Replace the hardcoded I915_FENCE_TIMEOUT
Expose the hardcoded timeout for unsignaled foreign fences as a Kconfig
option, primarily to allow brave systems to disable the timeout and
solely rely on correct signaling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200509105021.12542-1-chris@chris-wilson.co.uk
2020-05-09 12:57:57 +01:00
Chris Wilson
eec39e441c drm/i915: Remove wait priority boosting
Upon waiting a request (when asked), we gave that request a small
priority boost, not enough for it to cause preemption, but enough for it
to be scheduled next before all equals. We also used that bit to give
new clients a small priority boost, similar to FQ_CODEL, such that we
favoured short interactive tasks ahead of long running streams.

However, this is causing lots of complications with timeslicing where we
both want to honour the boost and yet ignore it. Those complications
cause unexpected user behaviour (tasks not being timesliced and run
concurrently as epxected), and the easiest way to resolve that is to
remove the boost. Hopefully, we can find a compromise again if we need
to, but in theory timeslicing itself and future more advanced schedulers
should give us the interactivity boost we seek.

Testcase: igt/gem_exec_schedule/lateslice
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk
2020-05-07 20:08:58 +01:00
Chris Wilson
47bf7b7a71 drm/i915/gem: Remove object_is_locked assertion from unpin_from_display_plane
Since moving the obj->vma.list to a spin_lock, and the vm->bound_list to
its vm->mutex, along with tracking shrinkable status under its own
spinlock, we no long require the object to be locked by the caller.

This is fortunate as it appears we can be called with the lock along an
error path in flipping:

<4> [139.942851] WARN_ON(debug_locks && !lock_is_held(&(&((obj)->base.resv)->lock.base)->dep_map))
<4> [139.943242] WARNING: CPU: 0 PID: 1203 at drivers/gpu/drm/i915/gem/i915_gem_domain.c:405 i915_gem_object_unpin_from_display_plane+0x70/0x130 [i915]
<4> [139.943263] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_realtek snd_hda_codec_generic coretemp snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core r8169 lpc_ich snd_pcm realtek prime_numbers [last unloaded: i915]
<4> [139.943347] CPU: 0 PID: 1203 Comm: kms_flip Tainted: G     U            5.6.0-gd0fda5c2cf3f1-drmtip_474+ #1
<4> [139.943363] Hardware name:  /D510MO, BIOS MOPNV10J.86A.0311.2010.0802.2346 08/02/2010
<4> [139.943589] RIP: 0010:i915_gem_object_unpin_from_display_plane+0x70/0x130 [i915]
<4> [139.943589] Code: 85 28 01 00 00 be ff ff ff ff 48 8d 78 60 e8 d7 9b f0 e2 85 c0 75 b9 48 c7 c6 50 b9 38 c0 48 c7 c7 e9 48 3c c0 e8 20 d4 e9 e2 <0f> 0b eb a2 48 c7 c1 08 bb 38 c0 ba 0a 01 00 00 48 c7 c6 88 a3 35
<4> [139.943589] RSP: 0018:ffffb774c0603b48 EFLAGS: 00010282
<4> [139.943589] RAX: 0000000000000000 RBX: ffff9a142fa36e80 RCX: 0000000000000006
<4> [139.943589] RDX: 000000000000160d RSI: ffff9a142c1a88f8 RDI: ffffffffa434a64d
<4> [139.943589] RBP: ffff9a1410a513c0 R08: ffff9a142c1a88f8 R09: 0000000000000000
<4> [139.943589] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a1436ee94b8
<4> [139.943589] R13: 0000000000000001 R14: 00000000ffffffff R15: ffff9a1410960000
<4> [139.943589] FS:  00007fc73a744e40(0000) GS:ffff9a143da00000(0000) knlGS:0000000000000000
<4> [139.943589] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [139.943589] CR2: 00007fc73997e098 CR3: 000000002f5fe000 CR4: 00000000000006f0
<4> [139.943589] Call Trace:
<4> [139.943589]  intel_pin_and_fence_fb_obj+0x1c9/0x1f0 [i915]
<4> [139.943589]  intel_plane_pin_fb+0x3f/0xd0 [i915]
<4> [139.943589]  intel_prepare_plane_fb+0x13b/0x5c0 [i915]
<4> [139.943589]  drm_atomic_helper_prepare_planes+0x85/0x110
<4> [139.943589]  intel_atomic_commit+0xda/0x390 [i915]
<4> [139.943589]  drm_atomic_helper_page_flip+0x9c/0xd0
<4> [139.943589]  ? drm_event_reserve_init+0x46/0x60
<4> [139.943589]  drm_mode_page_flip_ioctl+0x587/0x5d0

This completes the symmetry lost in commit 8b1c78e06e ("drm/i915: Avoid
calling i915_gem_object_unbind holding object lock").

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1743
Fixes: 8b1c78e06e ("drm/i915: Avoid calling i915_gem_object_unbind holding object lock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420125356.26614-1-chris@chris-wilson.co.uk
(cherry picked from commit a95f3ac21d)
(cherry picked from commit 2208b85fa1766ee4821a9435d548578b67090531)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-05-06 15:37:31 -07:00
Chris Wilson
e3d291301f drm/i915/gem: Implement legacy MI_STORE_DATA_IMM
The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but
left them writing to a physical address. The notes suggest that the
primary reason would be so that the writes were cache coherent, as the
CPU cache uses physical tagging. As such we did not implement the
legacy variant of MI_STORE_DATA_IMM and so left all the relocations
synchronous -- but with a small function to convert from the vma address
into the physical address, we can implement asynchronous relocs on these
older arches, fixing up a few tests that require them.

In order to be able to test the legacy paths, refactor the gpu
relocations so that we can hook them up to a selftest.

v2: Use an array of offsets not enum labels for the selftest
v3: Refactor the common igt_hexdump()

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
2020-05-04 15:15:04 +01:00
Chris Wilson
f5b62bdbb6 drm/i915/gem: Specify address type for chained reloc batches
It is required that a chained batch be in the same address domain as its
parent, and also that must be specified in the command for earlier gen
as it is not inferred from the chaining until gen6.

Fixes: 964a9b0f61 ("drm/i915/gem: Use chained reloc batches")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk
2020-05-04 14:28:48 +01:00
Chris Wilson
6983dafa31 drm/i915/gem: Lazily acquire the device wakeref for freeing objects
We only need the device wakeref on freeing the objects if we have to
unbind the object from the global GTT, or otherwise update device
information. If the objects are clean, we never need the wakeref, so
avoid taking until required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200503171513.18704-1-chris@chris-wilson.co.uk
2020-05-04 11:12:37 +01:00
Chris Wilson
6f576d6277 drm/i915/gem: Try an alternate engine for relocations
If at first we don't succeed, try try again.

Not all engines may support the MI ops we need to perform asynchronous
relocation patching, and so we end up falling back to a synchronous
operation that has a liability of blocking. However, Tvrtko pointed out
we don't need to use the same engine to perform the relocations as we
are planning to execute the execbuf on, and so if we switch over to a
working engine, we can perform the relocation asynchronously. The user
execbuf will be queued after the relocations by virtue of fencing.

This patch creates a new context per execbuf requiring asynchronous
relocations on an unusable engines. This is perhaps a bit excessive and
can be ameliorated by a small context cache, but for the moment we only
need it for working around a little used engine on Sandybridge, and only
if relocations are actually required to an active batch buffer.

Now we just need to teach the relocation code to handle physical
addressing for gen2/3, and we should then have universal support!

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_exec_reloc/basic-spin # snb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk
2020-05-01 22:56:16 +01:00
Chris Wilson
0e97fbb080 drm/i915/gem: Use a single chained reloc batches for a single execbuf
As we can now keep chaining together a relocation batch to process any
number of relocations, we can keep building that relocation batch for
all of the target vma. This avoiding emitting a new request into the
ring for each target, consuming precious ring space and a potential
stall.

v2: Propagate the failure from submitting the relocation batch.

Testcase: igt/gem_exec_reloc/basic-wide-active
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk
2020-05-01 22:56:15 +01:00
Chris Wilson
964a9b0f61 drm/i915/gem: Use chained reloc batches
The ring is a precious resource: we anticipate to only use a few hundred
bytes for a request, and only try to reserve that before we start. If we
go beyond our guess in building the request, then instead of waiting at
the start of execbuf before we hold any locks or other resources, we
may trigger a wait inside a critical region. One example is in using gpu
relocations, where currently we emit a new MI_BB_START from the ring
every time we overflow a page of relocation entries. However, instead of
insert the command into the precious ring, we can chain the next page of
relocation entries as MI_BB_START from the end of the previous.

v2: Delay the emit_bb_start until after all the chained vma
synchronisation is complete. Since the buffer pool batches are idle, this
_should_ be a no-op, but one day we may some fancy async GPU bindings
for new vma!

v3: Use pool/batch consitently, once we start thinking in terms of the
batch vma, use batch->obj.
v4: Explain the magic number 4.

Tvrtko spotted that we lose propagation of the error for failing to
submit the relocation request; that's easier to fix up in the next
patch.

Testcase: igt/gem_exec_reloc/basic-many-active
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk
2020-05-01 22:56:15 +01:00
Chris Wilson
9f909e215f drm/i915: Implement vm_ops->access for gdb access into mmaps
gdb uses ptrace() to peek and poke bytes of the target's address space.
The driver must implement an vm_ops->access() handler or else gdb will
be unable to inspect the pointer and report it as out-of-bounds.
Worse than useless as it causes immediate suspicion of the valid GTT
pointer, distracting the poor programmer trying to find his bug.

v2: Write-protect readonly objects (Matthew).

Testcase: igt/gem_mmap_gtt/ptrace
Testcase: igt/gem_mmap_offset/ptrace
Suggested-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maciej Patelczyk <maciej.patelczyk@intel.com>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200501145120.18830-1-chris@chris-wilson.co.uk
2020-05-01 17:30:47 +01:00
Christophe Leroy
b44f687386 drm/i915/gem: Replace user_access_begin by user_write_access_begin
When i915_gem_execbuffer2_ioctl() is using user_access_begin(),
that's only to perform unsafe_put_user() so use
user_write_access_begin() in order to only open write access.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ebf1250b6d4f351469fb339e5399d8b92aa8a1c1.1585898438.git.christophe.leroy@c-s.fr
2020-05-01 12:35:22 +10:00
Chris Wilson
16e8745967 drm/i915/gt: Move the batch buffer pool from the engine to the gt
Since the introduction of 'soft-rc6', we aim to park the device quickly
and that results in frequent idling of the whole device. Currently upon
idling we free the batch buffer pool, and so this renders the cache
ineffective for many workloads. If we want to have an effective cache of
recently allocated buffers available for reuse, we need to decouple that
cache from the engine powermanagement and make it timer based. As there
is no reason then to keep it within the engine (where it once made
retirement order easier to track), we can move it up the hierarchy to the
owner of the memory allocations.

v2: Hook up to debugfs/drop_caches to clear the cache on demand.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
2020-04-30 19:12:02 +01:00
Zbigniew Kempczyński
79eb8c7f01 drm/i915/selftests: Add tiled blits selftest
Extend coverage of the blitter client by exercising conversion to and
from tiled sources. In the process we perform spot checks to verify that
the tiling/detiling is being applied correctly, along with position
invariance of the tiling parameters.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430064957.14942-1-chris@chris-wilson.co.uk
2020-04-30 08:31:12 +01:00
Chris Wilson
f524a774a4 drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma()
While the ggtt vma are protected by their object lifetime, the list
continues until it hits a non-ggtt vma, and that vma is not protected
and may be freed as we inspect it. Hence, we require the obj->vma.lock
to protect the list as we iterate.

An example of forgetting to hold the obj->vma.lock is

[1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI
[1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1
[1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017
[1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915]
[1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10
[1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282
[1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000
[1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578
[1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000
[1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8
[1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00
[1642834.465034] FS:  00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000
[1642834.465036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0
[1642834.465038] Call Trace:
[1642834.465083]  i915_gem_set_tiling_ioctl+0x122/0x230 [i915]
[1642834.465121]  ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465151]  drm_ioctl_kernel+0x86/0xd0 [drm]
[1642834.465156]  ? avc_has_perm+0x3b/0x160
[1642834.465178]  drm_ioctl+0x206/0x390 [drm]
[1642834.465216]  ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465221]  ? selinux_file_ioctl+0x122/0x1c0
[1642834.465226]  ? __do_munmap+0x24b/0x4d0
[1642834.465231]  ksys_ioctl+0x82/0xc0
[1642834.465235]  __x64_sys_ioctl+0x16/0x20
[1642834.465238]  do_syscall_64+0x5b/0xf0
[1642834.465243]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1642834.465245] RIP: 0033:0x7f6a3d7b047b
[1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48
[1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b
[1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e
[1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002
[1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080
[1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30

Now to take the spinlock during the list iteration, we need to break it
down into two phases. In the first phase under the lock, we cannot sleep
and so must defer the actual work to a second list, protected by the
ggtt->mutex.

We also need to hold the spinlock during creation of a new vma to
serialise with updates of the tiling on the object.

Reported-by: Dave Airlie <airlied@redhat.com>
Fixes: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk
(cherry picked from commit cb593e5d2b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-27 09:47:37 -07:00
Xiyu Yang
5d5e100a20 drm/i915/selftests: Fix i915_address_space refcnt leak
igt_ppgtt_pin_update() invokes i915_gem_context_get_vm_rcu(), which
returns a reference of the i915_address_space object to "vm" with
increased refcount.

When igt_ppgtt_pin_update() returns, "vm" becomes invalid, so the
refcount should be decreased to keep refcount balanced.

The reference counting issue happens in two exception handling paths of
igt_ppgtt_pin_update(). When i915_gem_object_create_internal() returns
IS_ERR, the refcnt increased by i915_gem_context_get_vm_rcu() is not
decreased, causing a refcnt leak.

Fix this issue by jumping to "out_vm" label when
i915_gem_object_create_internal() returns IS_ERR.

Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1587361342-83494-1-git-send-email-xiyuyang19@fudan.edu.cn
(cherry picked from commit e07c7606a0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-27 09:47:33 -07:00
Chris Wilson
50689771c8 drm/i915: Only close vma we open
The history of i915_vma_close() is confusing, as is its use. As the
lifetime of the i915_vma is currently bounded by the object it is
attached to, we needed a means of identify when a vma was no longer in
use by userspace (via the user's fd). This is further complicated by
that only ppgtt vma should be closed at the user's behest, as the ggtt
were always shared.

Now that we attach the vma to a lut on the user's context, the open
count does indicate how many unique and open context/vm are referencing
this vma from the user. As such, we can and should just use the
open_count to track when the vma is still in use by userspace.

It's a poor man's replacement for reference counting.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk
2020-04-24 11:24:45 +01:00
Chris Wilson
cb593e5d2b drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma()
While the ggtt vma are protected by their object lifetime, the list
continues until it hits a non-ggtt vma, and that vma is not protected
and may be freed as we inspect it. Hence, we require the obj->vma.lock
to protect the list as we iterate.

An example of forgetting to hold the obj->vma.lock is

[1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI
[1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1
[1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017
[1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915]
[1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10
[1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282
[1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000
[1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578
[1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000
[1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8
[1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00
[1642834.465034] FS:  00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000
[1642834.465036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0
[1642834.465038] Call Trace:
[1642834.465083]  i915_gem_set_tiling_ioctl+0x122/0x230 [i915]
[1642834.465121]  ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465151]  drm_ioctl_kernel+0x86/0xd0 [drm]
[1642834.465156]  ? avc_has_perm+0x3b/0x160
[1642834.465178]  drm_ioctl+0x206/0x390 [drm]
[1642834.465216]  ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465221]  ? selinux_file_ioctl+0x122/0x1c0
[1642834.465226]  ? __do_munmap+0x24b/0x4d0
[1642834.465231]  ksys_ioctl+0x82/0xc0
[1642834.465235]  __x64_sys_ioctl+0x16/0x20
[1642834.465238]  do_syscall_64+0x5b/0xf0
[1642834.465243]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1642834.465245] RIP: 0033:0x7f6a3d7b047b
[1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48
[1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b
[1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e
[1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002
[1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080
[1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30

Now to take the spinlock during the list iteration, we need to break it
down into two phases. In the first phase under the lock, we cannot sleep
and so must defer the actual work to a second list, protected by the
ggtt->mutex.

We also need to hold the spinlock during creation of a new vma to
serialise with updates of the tiling on the object.

Reported-by: Dave Airlie <airlied@redhat.com>
Fixes: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk
2020-04-22 15:43:56 +01:00
Dave Airlie
1aa63ddf72 drm-misc-next for 5.8:
UAPI Changes:
 
   - drm: error out with EBUSY when device has existing master
   - drm: rework SET_MASTER and DROP_MASTER perm handling
 
 Cross-subsystem Changes:
 
   - fbdev: savage: fix -Wextra build warning
   - video: omap2: Use scnprintf() for avoiding potential buffer overflow
 
 Core Changes:
 
   - Remove drm_pci.h
   - drm_pci_{alloc/free)() are now legacy
   - Introduce managed DRM resourcesA
   - Allow drivers to subclass struct drm_framebuffer
   - Introduce struct drm_afbc_framebuffer and helpers
   - fbdev: remove return value from generic fbdev setup
   - Introduce simple-encoder helper
   - vram-helpers: set fence on plane
   - dp_mst: ACT timeout improvements
   - dp_mst: Remove drm_dp_mst_has_audio()
   - TTM: ttm_trace_dma_{map/unmap}() cleanups
   - dma-buf: add flag for PCIP2P support
   - EDID: Various improvements
   - Encoder: cleanup semantics of possible_clones and possible_crtcs
   - VBLANK documentation updates
   - Writeback documentation updates
 
 Driver Changes:
 
   - Convert several drivers to i2c_new_client_device()
   - Drop explicit drm_mode_config_cleanup() calls from drivers
   - Auto-release device structures with drmm_add_final_kfree()
   - Init bfdev console after registering DRM device
   - Make various .debugfs functions return 0 unconditionally; ignore errors
   - video: Use scnprintf() to avoid buffer overflows
   - Convert drivers to simple encoders
 
   - drm/amdgpu: note that we can handle peer2peer DMA-buf
   - drm/amdgpu: add support for exporting VRAM using DMA-buf v3
   - drm/kirin: Revert change to register connectors
   - drm/lima: Add optional devfreq and cooling device support
   - drm/lima: Various improvements wrt. task handling
   - drm/panel: nt39016: Support multiple modes and 50Hz
   - drm/panel: Support Leadtek LTK050H3146W
   - drm/rockchip: Add support for afbc
   - drm/virtio: Various cleanups
   - drm/hisilicon/hibmc: Enforce 128-byte stride alignment
   - drm/qxl: Fix notify port address of cursor ring buffer
   - drm/sun4i: Improvements to format handling
   - drm/bridge: dw-hdmi: Various improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAl6VfAAACgkQaA3BHVML
 eiNjBwgAtzRaqrKX3c4aL4NCBmfWzqxvKN0fVcx8tHtjhmrPTLITsHCM+wfcD2qC
 lkr/RMYJT02pNPGnX3jamQk0q/2GKGagChVZgORRsdYOOf5IqGIjvllhkg+U+7YV
 X0pHAfvGk2VyriHYj3s/cnwi9OwZ2UFjdS+f/u2Qp9jQYG/k8u9CCSnzgratY99I
 bI4jZi6JIoRkwuBpBEc9NbrduenKhcYNgPLDiYXY2TFmVz89NwITPnLyf5FWG5zd
 HsQ+dfIS9eoIxL3DTRgBZrPMvrqgiUjztB7cM4bdE0ttwTS7MW6M50/iV553qb9k
 DZ1+/pWFFyZLOPUYc3EK/QYdu8R3QA==
 =MQkd
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-04-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 5.8:

UAPI Changes:

  - drm: error out with EBUSY when device has existing master
  - drm: rework SET_MASTER and DROP_MASTER perm handling

Cross-subsystem Changes:

  - mm: export two symbols from slub/slob
  - fbdev: savage: fix -Wextra build warning
  - video: omap2: Use scnprintf() for avoiding potential buffer overflow

Core Changes:

  - Remove drm_pci.h
  - drm_pci_{alloc/free)() are now legacy
  - Introduce managed DRM resourcesA
  - Allow drivers to subclass struct drm_framebuffer
  - Introduce struct drm_afbc_framebuffer and helpers
  - fbdev: remove return value from generic fbdev setup
  - Introduce simple-encoder helper
  - vram-helpers: set fence on plane
  - dp_mst: ACT timeout improvements
  - dp_mst: Remove drm_dp_mst_has_audio()
  - TTM: ttm_trace_dma_{map/unmap}() cleanups
  - dma-buf: add flag for PCIP2P support
  - EDID: Various improvements
  - Encoder: cleanup semantics of possible_clones and possible_crtcs
  - VBLANK documentation updates
  - Writeback documentation updates

Driver Changes:

  - Convert several drivers to i2c_new_client_device()
  - Drop explicit drm_mode_config_cleanup() calls from drivers
  - Auto-release device structures with drmm_add_final_kfree()
  - Init bfdev console after registering DRM device
  - Make various .debugfs functions return 0 unconditionally; ignore errors
  - video: Use scnprintf() to avoid buffer overflows
  - Convert drivers to simple encoders

  - drm/amdgpu: note that we can handle peer2peer DMA-buf
  - drm/amdgpu: add support for exporting VRAM using DMA-buf v3
  - drm/kirin: Revert change to register connectors
  - drm/lima: Add optional devfreq and cooling device support
  - drm/lima: Various improvements wrt. task handling
  - drm/panel: nt39016: Support multiple modes and 50Hz
  - drm/panel: Support Leadtek LTK050H3146W
  - drm/rockchip: Add support for afbc
  - drm/virtio: Various cleanups
  - drm/hisilicon/hibmc: Enforce 128-byte stride alignment
  - drm/qxl: Fix notify port address of cursor ring buffer
  - drm/sun4i: Improvements to format handling
  - drm/bridge: dw-hdmi: Various improvements

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200414090738.GA16827@linux-uq9g
2020-04-22 10:41:35 +10:00
Xiyu Yang
e07c7606a0 drm/i915/selftests: Fix i915_address_space refcnt leak
igt_ppgtt_pin_update() invokes i915_gem_context_get_vm_rcu(), which
returns a reference of the i915_address_space object to "vm" with
increased refcount.

When igt_ppgtt_pin_update() returns, "vm" becomes invalid, so the
refcount should be decreased to keep refcount balanced.

The reference counting issue happens in two exception handling paths of
igt_ppgtt_pin_update(). When i915_gem_object_create_internal() returns
IS_ERR, the refcnt increased by i915_gem_context_get_vm_rcu() is not
decreased, causing a refcnt leak.

Fix this issue by jumping to "out_vm" label when
i915_gem_object_create_internal() returns IS_ERR.

Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1587361342-83494-1-git-send-email-xiyuyang19@fudan.edu.cn
2020-04-20 20:17:50 +01:00
Chris Wilson
a95f3ac21d drm/i915/gem: Remove object_is_locked assertion from unpin_from_display_plane
Since moving the obj->vma.list to a spin_lock, and the vm->bound_list to
its vm->mutex, along with tracking shrinkable status under its own
spinlock, we no long require the object to be locked by the caller.

This is fortunate as it appears we can be called with the lock along an
error path in flipping:

<4> [139.942851] WARN_ON(debug_locks && !lock_is_held(&(&((obj)->base.resv)->lock.base)->dep_map))
<4> [139.943242] WARNING: CPU: 0 PID: 1203 at drivers/gpu/drm/i915/gem/i915_gem_domain.c:405 i915_gem_object_unpin_from_display_plane+0x70/0x130 [i915]
<4> [139.943263] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_realtek snd_hda_codec_generic coretemp snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core r8169 lpc_ich snd_pcm realtek prime_numbers [last unloaded: i915]
<4> [139.943347] CPU: 0 PID: 1203 Comm: kms_flip Tainted: G     U            5.6.0-gd0fda5c2cf3f1-drmtip_474+ #1
<4> [139.943363] Hardware name:  /D510MO, BIOS MOPNV10J.86A.0311.2010.0802.2346 08/02/2010
<4> [139.943589] RIP: 0010:i915_gem_object_unpin_from_display_plane+0x70/0x130 [i915]
<4> [139.943589] Code: 85 28 01 00 00 be ff ff ff ff 48 8d 78 60 e8 d7 9b f0 e2 85 c0 75 b9 48 c7 c6 50 b9 38 c0 48 c7 c7 e9 48 3c c0 e8 20 d4 e9 e2 <0f> 0b eb a2 48 c7 c1 08 bb 38 c0 ba 0a 01 00 00 48 c7 c6 88 a3 35
<4> [139.943589] RSP: 0018:ffffb774c0603b48 EFLAGS: 00010282
<4> [139.943589] RAX: 0000000000000000 RBX: ffff9a142fa36e80 RCX: 0000000000000006
<4> [139.943589] RDX: 000000000000160d RSI: ffff9a142c1a88f8 RDI: ffffffffa434a64d
<4> [139.943589] RBP: ffff9a1410a513c0 R08: ffff9a142c1a88f8 R09: 0000000000000000
<4> [139.943589] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a1436ee94b8
<4> [139.943589] R13: 0000000000000001 R14: 00000000ffffffff R15: ffff9a1410960000
<4> [139.943589] FS:  00007fc73a744e40(0000) GS:ffff9a143da00000(0000) knlGS:0000000000000000
<4> [139.943589] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [139.943589] CR2: 00007fc73997e098 CR3: 000000002f5fe000 CR4: 00000000000006f0
<4> [139.943589] Call Trace:
<4> [139.943589]  intel_pin_and_fence_fb_obj+0x1c9/0x1f0 [i915]
<4> [139.943589]  intel_plane_pin_fb+0x3f/0xd0 [i915]
<4> [139.943589]  intel_prepare_plane_fb+0x13b/0x5c0 [i915]
<4> [139.943589]  drm_atomic_helper_prepare_planes+0x85/0x110
<4> [139.943589]  intel_atomic_commit+0xda/0x390 [i915]
<4> [139.943589]  drm_atomic_helper_page_flip+0x9c/0xd0
<4> [139.943589]  ? drm_event_reserve_init+0x46/0x60
<4> [139.943589]  drm_mode_page_flip_ioctl+0x587/0x5d0

This completes the symmetry lost in commit 8b1c78e06e ("drm/i915: Avoid
calling i915_gem_object_unbind holding object lock").

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1743
Fixes: 8b1c78e06e ("drm/i915: Avoid calling i915_gem_object_unbind holding object lock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200420125356.26614-1-chris@chris-wilson.co.uk
2020-04-20 16:23:24 +01:00
Colin Ian King
538c329f7f drm/i915: remove redundant assignment to variable err
The variable err is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200409133107.415812-1-colin.king@canonical.com
2020-04-09 20:19:31 +01:00
Jani Nikula
dd1ba6ba09 drm/i915/stolen: prefer struct drm_device based logging
Prefer struct drm_device based logging over struct device based logging.

No functional changes.

Cc: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-15-jani.nikula@intel.com
2020-04-08 13:49:35 +03:00
Chris Wilson
e94f785642 drm/i915/gem: Promote 'remain' to unsigned long
Tidy the code by casting remain to unsigned long once for the duration
of eb_relocate_vma()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200407085930.19421-1-chris@chris-wilson.co.uk
2020-04-07 14:43:58 +01:00
Chris Wilson
e68296259c drm/i915/gem: Wait until the context is finally retired before releasing engines
If we want to percolate information back from the HW, up through the GEM
context, we need to wait until the intel_context is scheduled out for
the last time. This is handled by the retirement of the intel_context's
barrier, i.e. by listening to the pulse after the notional unpin. So
wait until the intel_context is finally retired before releasing the
engine, so that we can inspect the final context state and pass it on.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200406155840.1728-3-chris@chris-wilson.co.uk
2020-04-06 19:48:06 +01:00
Chris Wilson
1aaea8476d drm/i915/gem: Flush all the reloc_gpu batch
__i915_gem_object_flush_map() takes a byte range, so feed it the written
bytes and do not mistake the u32 index as bytes!

Fixes: a679f58d05 ("drm/i915: Flush pages on acquisition")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200406114821.10949-1-chris@chris-wilson.co.uk
(cherry picked from commit 30c88a47f1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-06 10:31:38 -07:00
Chris Wilson
721017cf4b drm/i915/gem: Ignore readonly failures when updating relocs
If the user passes in a readonly reloc[], by the time we notice we have
already committed to modifying the execobjects, or have indeed done so
already. Reporting the failure just compounds the issue as we have no
second pass to fall back to anymore.

"Be damned if you do, and damned if you don't."

Testcase: igt/gem_exec_reloc/readonly
Fixes: 7dc8f11437 ("drm/i915/gem: Drop relocation slowpath")
References: fddcd00a49 ("drm/i915: Force the slow path after a user-write error")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331162150.3635-1-chris@chris-wilson.co.uk
(cherry picked from commit 97a37c919f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-04-06 10:31:23 -07:00
Chris Wilson
39d571d172 drm/i915/gem: Take DBG_FORCE_RELOC into account prior to using reloc_gpu
If we set the debug flag to force ourselves not to relocate via the gpu,
do not relocate via the gpu.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200406123616.7334-1-chris@chris-wilson.co.uk
2020-04-06 15:58:33 +01:00
Chris Wilson
30c88a47f1 drm/i915/gem: Flush all the reloc_gpu batch
__i915_gem_object_flush_map() takes a byte range, so feed it the written
bytes and do not mistake the u32 index as bytes!

Fixes: a679f58d05 ("drm/i915: Flush pages on acquisition")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200406114821.10949-1-chris@chris-wilson.co.uk
2020-04-06 15:58:33 +01:00
Daniel Vetter
625c18d706 drm: delete drm_pci.h
It's empty!

After more than 20 years of OS abstraction layer for pci devices, it's
kinda gone now.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200403110610.2344842-2-daniel.vetter@ffwll.ch
2020-04-03 17:11:41 +02:00
Chris Wilson
89ff76bf9b drm/i915/gem: Utilize rcu iteration of context engines
Now that we can peek at GEM->engines[] and obtain a reference to them
using RCU, do so for instances where we can safely iterate the
potentially old copy of the engines. For setting, we can do this when we
know the engine properties are copied over before swapping, so we know
the new engines already have the global property and we update the old
before they are discarded. For reading, we only need to be safe; as we
do so on behalf of the user, their races are their own problem.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402124218.6375-1-chris@chris-wilson.co.uk
2020-04-02 21:43:53 +01:00
Chris Wilson
9da0ea0963 drm/i915/gem: Drop cached obj->bind_count
We cached the number of vma bound to the object in order to speed up
shrinker decisions. This has been superseded by being more proactive in
removing objects we cannot shrink from the shrinker lists, and so we can
drop the clumsy attempt at atomically counting the bind count and
comparing it to the number of pinned mappings of the object. This will
only get more clumsier with asynchronous binding and unbinding.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401223924.16667-1-chris@chris-wilson.co.uk
2020-04-02 01:17:39 +01:00
Chris Wilson
8a338f4bf6 drm/i915/gem: Try allocating va from free space
If the current node/entry location is occupied, and the object is not
pinned, try assigning it some free space. We cannot wait here, so if in
doubt, we unreserve and try to grab all at once.

v2: Use the final pin_flags so that we won't have to move the object if
we find the wrong free space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401194135.5442-1-chris@chris-wilson.co.uk
2020-04-01 21:57:48 +01:00
Chris Wilson
97a37c919f drm/i915/gem: Ignore readonly failures when updating relocs
If the user passes in a readonly reloc[], by the time we notice we have
already committed to modifying the execobjects, or have indeed done so
already. Reporting the failure just compounds the issue as we have no
second pass to fall back to anymore.

"Be damned if you do, and damned if you don't."

Testcase: igt/gem_exec_reloc/readonly
Fixes: 7dc8f11437 ("drm/i915/gem: Drop relocation slowpath")
References: fddcd00a49 ("drm/i915: Force the slow path after a user-write error")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331162150.3635-1-chris@chris-wilson.co.uk
2020-03-31 22:40:39 +01:00
Chris Wilson
0f1dd02295 drm/i915/gem: Split eb_vma into its own allocation
Use a separate array allocation for the execbuf vma, so that we can
track their lifetime independently from the copy of the user arguments.
With luck, this has a secondary benefit of splitting the malloc size to
within reason and avoid vmalloc. The downside is that we might require
two separate vmallocs -- but much less likely.

In the process, this prevents a memory leak on the ww_mutex error
unwind.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1390
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200330133710.14385-1-chris@chris-wilson.co.uk
2020-03-30 20:08:13 +01:00
Nathan Chancellor
7bf03e7504 drm/i915: Cast remain to unsigned long in eb_relocate_vma
A recent commit in clang added -Wtautological-compare to -Wall, which is
enabled for i915 after -Wtautological-compare is disabled for the rest
of the kernel so we see the following warning on x86_64:

 ../drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1433:22: warning:
 result of comparison of constant 576460752303423487 with expression of
 type 'unsigned int' is always false
 [-Wtautological-constant-out-of-range-compare]
         if (unlikely(remain > N_RELOC(ULONG_MAX)))
            ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
 ../include/linux/compiler.h:78:42: note: expanded from macro 'unlikely'
 # define unlikely(x)    __builtin_expect(!!(x), 0)
                                            ^
 1 warning generated.

It is not wrong in the case where ULONG_MAX > UINT_MAX but it does not
account for the case where this file is built for 32-bit x86, where
ULONG_MAX == UINT_MAX and this check is still relevant.

Cast remain to unsigned long, which keeps the generated code the same
(verified with clang-11 on x86_64 and GCC 9.2.0 on x86 and x86_64) and
the warning is silenced so we can catch more potential issues in the
future.

Closes: https://github.com/ClangBuiltLinux/linux/issues/778
Suggested-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200214054706.33870-1-natechancellor@gmail.com
2020-03-27 00:12:42 +02:00
Chris Wilson
2e46a2a0b0 drm/i915: Use explicit flag to mark unreachable intel_context
I need to keep the GEM context around a bit longer so adding an explicit
flag for syncing execbuf with closed/abandonded contexts.

v2:
 * Use already available context flags. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-1-chris@chris-wilson.co.uk
(cherry picked from commit 207e4a71fb)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-03-26 10:21:04 -07:00
Chris Wilson
73c8bfb7fe drm/i915: Drop final few uses of drm_i915_private.engine
We've migrated all the heavy users over to the intel_gt, and can finally
drop the last few users and with that the mirror in dev_priv->engine[].

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325234803.6175-1-chris@chris-wilson.co.uk
2020-03-26 10:50:17 +00:00
Chris Wilson
92581f9fb9 drm/i915: Immediately execute the fenced work
If the caller allows and we do not have to wait for any signals,
immediately execute the work within the caller's process. By doing so we
avoid the overhead of scheduling a new task, and the latency in
executing it, at the cost of pulling that work back into the immediate
context. (Sometimes we still prefer to offload the task to another cpu,
especially if we plan on executing many such tasks in parallel for this
client.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325120227.8044-2-chris@chris-wilson.co.uk
2020-03-25 13:05:04 +00:00