linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 07:55:25 +07:00

Author	SHA1	Message	Date
Chris Wilson	d82b4b2621	drm/i915: Report all objects with allocated pages to the shrinker Currently, we try to report to the shrinker the precise number of objects (pages) that are available to be reaped at this moment. This requires searching all objects with allocated pages to see if they fulfill the search criteria, and this count is performed quite frequently. (The shrinker tries to free ~128 pages on each invocation, before which we count all the objects; counting takes longer than unbinding the objects!) If we take the pragmatic view that with sufficient desire, all objects are eventually reapable (they become inactive, or no longer used as framebuffer etc), we can simply return the count of pinned pages maintained during get_pages/put_pages rather than walk the lists every time. The downside is that we may (slightly) over-report the number of objects/pages we could shrink and so penalize ourselves by shrinking more than required. This is mitigated by keeping the order in which we shrink objects such that we avoid penalizing active and frequently used objects, and if memory is so tight that we need to free them we would need to anyway. v2: Only expose shrinkable objects to the shrinker; a small reduction in not considering stolen and foreign objects. v3: Restore the tracking from a "backup" copy from before the gem/ split Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-2-chris@chris-wilson.co.uk	2019-05-31 21:23:51 +01:00
Chris Wilson	3b4fa9640c	drm/i915: Track the purgeable objects on a separate eviction list Currently the purgeable objects, I915_MADV_DONTNEED, are mixed in the normal bound/unbound lists. Every shrinker pass starts with an attempt to purge from this set of unneeded objects, which entails us doing a walk over both lists looking for any candidates. If there are none, and since we are shrinking we can reasonably assume that the lists are full!, this becomes a very slow futile walk. If we separate out the purgeable objects into own list, this search then becomes its own phase that is preferentially handled during shrinking. Instead the cost becomes that we then need to filter the purgeable list if we want to distinguish between bound and unbound objects. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-1-chris@chris-wilson.co.uk	2019-05-31 21:23:51 +01:00
Chris Wilson	c017cf6b1a	drm/i915: Drop the deferred active reference An old optimisation to reduce the number of atomics per batch sadly relies on struct_mutex for coordination. In order to remove struct_mutex from serialising object/context closing, always taking and releasing an active reference on first use / last use greatly simplifies the locking. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-15-chris@chris-wilson.co.uk	2019-05-28 12:45:29 +01:00
Chris Wilson	6951e5893b	drm/i915: Move GEM object domain management from struct_mutex to local Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk	2019-05-28 12:45:29 +01:00
Dave Airlie	14ee642c2a	Features: - Engine discovery query (Tvrtko) - Support for DP YCbCr4:2:0 outputs (Gwan-gyeong) - HDCP revocation support, refactoring (Ramalingam) - Remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW (Christian König) - Asynchronous display power disabling (Imre) - Perma-pin uC firmware and re-enable global reset (Fernando) - GTT remapping for display, for bigger fb size and stride (Ville) - Enable pipe HDR mode on ICL if only HDR planes are used (Ville) - Kconfig to tweak the busyspin durations for i915_wait_request (Chris) - Allow multiple user handles to the same VM (Chris) - GT/GEM runtime pm improvements using wakerefs (Chris) - Gen 4&5 render context support (Chris) - Allow userspace to clone contexts on creation (Chris) - SINGLE_TIMELINE flags for context creation (Chris) - Allow specification of parallel execbuf (Chris) Refactoring: - Header refactoring (Jani) - Move GraphicsTechnology files under gt/ (Chris) - Sideband code refactoring (Chris) Fixes: - ICL DSI state readout and checker fixes (Vandita) - GLK DSI picture corruption fix (Stanislav) - HDMI deep color fixes (Clinton, Aditya) - Fix driver unbinding from a device in use (Janusz) - Fix clock gating with pipe scaling (Radhakrishna) - Disable broken FBC on GLK (Daniel Drake) - Miscellaneous GuC fixes (Michal) - Fix MG PHY DP register programming (Imre) - Add missing combo PHY lane power setup (Imre) - Workarounds for early ICL VBT issues (Imre) - Fix fastset vs. pfit on/off on HSW EDP transcoder (Ville) - Add readout and state check for pch_pfit.force_thru (Ville) - Miscellaneous display fixes and refactoring (Ville) - Display workaround fixes (Ville) - Enable audio even if ELD is bogus (Ville) - Fix use-after-free in reporting create.size (Chris) - Sideband fixes to avoid BYT hard lockups (Chris) - Workaround fixes and improvements (Chris) Maintainer shortcomings: - Failure to adequately describe and give credit for all changes (Jani) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEFWWmW3ewYy4RJOWc05gHnSar7m8FAlzoK8oACgkQ05gHnSar 7m8jZg//UuIkz4bIu7A0YfN/VH3/h3fthxboejj27HpO4OO9eFqLVqaEUFEngGvf 66fnFKNwtLdW7Dsx9iQsKNsVTcdsEE5PvSA6FZ3rVtYOwBdZ9OKYRxci2KcSnjqz F0/8Jxgz2G0gu9TV6dgTLrfdJiuJrCbidRV3G5id0XHNEGbpABtmVxYfsbj/w9mU luckCgKyRDZNzfhyGIPV763bNGZWLQPcbP99yrZf4+EcsiQ2MfjHJdwe5Ko+iGDk sO3lFg/1iEf41gqaD4LPokOtUKZfXI1Sujs1w/0djDbqs9USq0eY1L5C3ZBq5Si1 woz7ATXO71FfBcNRxLTejNqCVlQMLix/185/ItkDA4gDlHwWZPYaT5VTNgRtEEy6 XNtscZyM6Z1ghqRqahWWu40g80sOdfYuiTFEAYonVbDAUootgF46uWO/2ib0Hya+ tYlm60M097eMealzaXEyHPHlW1OeUUJTKxl9j7nHmqVn542OI8gn7xvIXX2VsYDY 7D4gVPoFg0UpGXM2uuSHVgvxwtg4t083Wu+utYu76RjmwNye4LkHewWGFjmOkYRf BraHoA+gKPFtAJjjtkyE/ZnlT4c3tDoQ0a6+gRKVurXzu/Y6JVzquhJvH5mShyZ7 oTv+erupcz7JEnEeKzgMCyon/Drumiut5I6zr29GNQ3eelpf4jQ= =U/nc -----END PGP SIGNATURE----- Merge tag 'drm-intel-next-2019-05-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-next Features: - Engine discovery query (Tvrtko) - Support for DP YCbCr4:2:0 outputs (Gwan-gyeong) - HDCP revocation support, refactoring (Ramalingam) - Remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW (Christian König) - Asynchronous display power disabling (Imre) - Perma-pin uC firmware and re-enable global reset (Fernando) - GTT remapping for display, for bigger fb size and stride (Ville) - Enable pipe HDR mode on ICL if only HDR planes are used (Ville) - Kconfig to tweak the busyspin durations for i915_wait_request (Chris) - Allow multiple user handles to the same VM (Chris) - GT/GEM runtime pm improvements using wakerefs (Chris) - Gen 4&5 render context support (Chris) - Allow userspace to clone contexts on creation (Chris) - SINGLE_TIMELINE flags for context creation (Chris) - Allow specification of parallel execbuf (Chris) Refactoring: - Header refactoring (Jani) - Move GraphicsTechnology files under gt/ (Chris) - Sideband code refactoring (Chris) Fixes: - ICL DSI state readout and checker fixes (Vandita) - GLK DSI picture corruption fix (Stanislav) - HDMI deep color fixes (Clinton, Aditya) - Fix driver unbinding from a device in use (Janusz) - Fix clock gating with pipe scaling (Radhakrishna) - Disable broken FBC on GLK (Daniel Drake) - Miscellaneous GuC fixes (Michal) - Fix MG PHY DP register programming (Imre) - Add missing combo PHY lane power setup (Imre) - Workarounds for early ICL VBT issues (Imre) - Fix fastset vs. pfit on/off on HSW EDP transcoder (Ville) - Add readout and state check for pch_pfit.force_thru (Ville) - Miscellaneous display fixes and refactoring (Ville) - Display workaround fixes (Ville) - Enable audio even if ELD is bogus (Ville) - Fix use-after-free in reporting create.size (Chris) - Sideband fixes to avoid BYT hard lockups (Chris) - Workaround fixes and improvements (Chris) Maintainer shortcomings: - Failure to adequately describe and give credit for all changes (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87sgt3n45z.fsf@intel.com	2019-05-28 09:26:52 +10:00
Ville Syrjälä	bb211c3d0c	drm/i915/selftests: Add live vma selftest Add a live selftest to excercise rotated/remapped vmas. We simply write through the rotated/remapped vma, and confirm that the data appears in the right page when read through the normal vma. Not sure what the fallout of making all rotated/remapped vmas mappable/fenceable would be, hence I just hacked it in the test. v2: Grab rpm reference (Chris) GEM_BUG_ON(view.type not as expected) (Chris) Allow CAN_FENCE for rotated/remapped vmas (Chris) Update intel_plane_uses_fence() to ask for a fence only for normal vmas on gen4+ v3: Deal with intel_wakeref_t v4: Rebase Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-4-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:47 +03:00
Ville Syrjälä	1a74fc0b3f	drm/i915: Add a new "remapped" gtt_view To overcome display engine stride limits we'll want to remap the pages in the GTT. To that end we need a new gtt_view type which is just like the "rotated" type except not rotated. v2: Use intel_remapped_plane_info base type s/unused/unused_mbz/ (Chris) Separate BUILD_BUG_ON()s (Chris) Use I915_GTT_PAGE_SIZE (Chris) v3: Use i915_gem_object_get_dma_address() (Chris) Trim the sg (Tvrtko) v4: Actually trim this time. Limit the max length to one row of pages to keep things simple Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-20 18:04:47 +03:00
Linus Torvalds	a2d635decb	drm pull request for 5.2 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJc04M6AAoJEAx081l5xIa+SJgP/0uIgIOM53vPpydgmr+2IEHF jbDqrd+mipgNriRVHjDsWdUHCUNtyhB7YEBCMrj3mY0rRFI7FlQQf4lOwYGoHiKP 4JZg4kwC37997lFXl1uabGj3DmJLtxKL2/D15zCH/uLe+2EDzWznP6NVdFT3WK0P YKZQCWT19PWSsLoBRPutWxkmop4AYvkqE0a6vXUlJlFYZK3Bbytx6/179uWKfiX5 ZkKEEtx1XiDAvcp5gBb6PISurycrBY0e/bkPBnK3ES5vawMbTU5IrmWOrQ4D8yOd z9qOVZawZ6+b2XBDgBWjQ9bM7I5R7Il1q/LglYEaFI9+wHUnlUdDSm6ft5/5BiCZ fqgkh5Bj2iEsajbSsacoljMOpxpYPqj63mqc+7fAGXF34V+B+9U1bpt8kCbMKowf 7Abb7IuiCR6vLDapjP6VqTMvdQ4O466OEAN83ULGFTdmMqYYH4AxaIwc+xcAk/aP RNq7/RHhh4FRynRAj9fCkGlF3ArnM88gLINwWuEQq4SClWGcvdw7eaHpwWo77c4g iccCnTLqSIg5pDVu07AQzzBlW6KulWxh5o72x+Xx+EXWdYUDHQ1SlNs11bSNUBV1 5MkrzY2GuD+NFEjsXJEDIPOr40mQOyJCXnxq8nXPsz/hD9kHeJPvWn3J3eVKyb5B Z6/knNqM0BDn3SaYR/rD =YFiQ -----END PGP SIGNATURE----- Merge tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "This has two exciting community drivers for ARM Mali accelerators. Since ARM has never been open source friendly on the GPU side of the house, the community has had to create open source drivers for the Mali GPUs. Lima covers the older t4xx and panfrost the newer 6xx/7xx series. Well done to all involved and hopefully this will help ARM head in the right direction. There is also now the ability if you don't have any of the legacy drivers enabled (pre-KMS) to remove all the pre-KMS support code from the core drm, this saves 10% or so in codesize on my machine. i915 also enable Icelake/Elkhart Lake Gen11 GPUs by default, vboxvideo moves out of staging. There are also some rcar-du patches which crossover with media tree but all should be acked by Mauro. Summary: uapi changes: - Colorspace connector property - fourcc - new YUV formts - timeline sync objects initially merged - expose FB_DAMAGE_CLIPS to atomic userspace new drivers: - vboxvideo: moved out of staging - aspeed: ASPEED SoC BMC chip display support - lima: ARM Mali4xx GPU acceleration driver support - panfrost: ARM Mali6xx/7xx Midgard/Bitfrost acceleration driver support core: - component helper docs - unplugging fixes - devm device init - MIPI/DSI rate control - shmem backed gem objects - connector, display_info, edid_quirks cleanups - dma_buf fence chain support - 64-bit dma-fence seqno comparison fixes - move initial fb config code to core - gem fence array helpers for Lima - ability to remove legacy support code if no drivers requires it (removes 10% of drm.ko size) - lease fixes ttm: - unified DRM_FILE_PAGE_OFFSET handling - Account for kernel allocations in kernel zone only panel: - OSD070T1718-19TS panel support - panel-tpo-td028ttec1 backlight support - Ronbo RB070D30 MIPI/DSI - Feiyang FY07024DI26A30-D MIPI-DSI panel - Rocktech jh057n00900 MIPI-DSI panel i915: - Comet Lake (Gen9) PCI IDs - Updated Icelake PCI IDs - Elkhartlake (Gen11) support - DP MST property addtions - plane and watermark fixes - Icelake port sync and VEBOX disable fixes - struct_mutex usage reduction - Icelake gamma fix - GuC reset fixes - make mmap more asynchronous - sound display power well race fixes - DDI/MIPI-DSI clocks for Icelake - Icelake RPS frequency changing support - Icelake workarounds amdgpu: - Use HMM for userptr - vega20 experimental smu11 support - RAS support for vega20 - BACO support for vega12 + fixes for vega20 - reworked IH interrupt handling - amdkfd RAS support - Freesync improvements - initial timeline sync object support - DC Z ordering fixes - NV12 planes support - colorspace properties for planes= - eDP opts if eDP already initialized nouveau: - misc fixes etnaviv: - misc fixes msm: - GPU zap shader support expansion - robustness ABI addition exynos: - Logging cleanups tegra: - Shared reset fix - CPU cache maintenance fix cirrus: - driver rewritten using simple helpers meson: - G12A support vmwgfx: - Resource dirtying management improvements - Userspace logging improvements virtio: - PRIME fixes rockchip: - rk3066 hdmi support sun4i: - DSI burst mode support vc4: - load tracker to detect underflow v3d: - v3d v4.2 support malidp: - initial Mali D71 support in komeda driver tfp410: - omap related improvement omapdrm: - drm bridge/panel support - drop some omap specific panels rcar-du: - Display writeback support" * tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm: (1507 commits) drm/msm/a6xx: No zap shader is not an error drm/cma-helper: Fix drm_gem_cma_free_object() drm: Fix timestamp docs for variable refresh properties. drm/komeda: Mark the local functions as static drm/komeda: Fixed warning: Function parameter or member not described drm/komeda: Expose bus_width to Komeda-CORE drm/komeda: Add sysfs attribute: core_id and config_id drm: add non-desktop quirk for Valve HMDs drm/panfrost: Show stored feature registers drm/panfrost: Don't scream about deferred probe drm/panfrost: Disable PM on probe failure drm/panfrost: Set DMA masks earlier drm/panfrost: Add sanity checks to submit IOCTL drm/etnaviv: initialize idle mask before querying the HW db drm: introduce a capability flag for syncobj timeline support drm: report consistent errors when checking syncobj capibility drm/nouveau/nouveau: forward error generated while resuming objects tree drm/nouveau/fb/ramgk104: fix spelling mistake "sucessfully" -> "successfully" drm/nouveau/i2c: Disable i2c bus access after ->fini() drm/nouveau: Remove duplicate ACPI_VIDEO_NOTIFY_PROBE definition ...	2019-05-08 21:35:19 -07:00
Thomas Gleixner	487f3c7fb1	drm: Simplify stacktrace handling Replace the indirection through struct stack_trace by using the storage array based interfaces. The original code in all printing functions is really wrong. It allocates a storage array on stack which is unused because depot_fetch_stack() does not store anything in it. It overwrites the entries pointer in the stack_trace struct so it points to the depot storage. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Daniel Vetter <daniel@ffwll.ch> Cc: Andy Lutomirski <luto@kernel.org> Cc: intel-gfx@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: linux-mm@kvack.org Cc: David Rientjes <rientjes@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: iommu@lists.linux-foundation.org Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: linux-btrfs@vger.kernel.org Cc: dm-devel@redhat.com Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-arch@vger.kernel.org Link: https://lkml.kernel.org/r/20190425094802.622094226@linutronix.de	2019-04-29 12:37:53 +02:00
Chris Wilson	112ed2d31a	drm/i915: Move GraphicsTechnology files under gt/ Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk	2019-04-24 21:01:46 +01:00
Chris Wilson	103b76eeff	drm/i915: Use i915_global_register() Rather than manually add every new global into each hook, use i915_global_register() function and keep a list of registered globals to invoke instead. However, I haven't found a way for random drivers to add an .init table to avoid having to manually add ourselves to i915_globals_init() each time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190305213830.18094-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2019-03-06 10:00:50 +00:00
Chris Wilson	13f1bfd3b3	drm/i915: Make object/vma allocation caches global As our allocations are not device specific, we can move our slab caches to a global scope. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk	2019-02-28 11:08:02 +00:00
Chris Wilson	21950ee7cc	drm/i915: Pull i915_gem_active into the i915_active family Looking forward, we need to break the struct_mutex dependency on i915_gem_active. In the meantime, external use of i915_gem_active is quite beguiling, little do new users suspect that it implies a barrier as each request it tracks must be ordered wrt the previous one. As one of many, it can be used to track activity across multiple timelines, a shared fence, which fits our unordered request submission much better. We need to steer external users away from the singular, exclusive fence imposed by i915_gem_active to i915_active instead. As part of that process, we move i915_gem_active out of i915_request.c into i915_active.c to start separating the two concepts, and rename it to i915_active_request (both to tie it to the concept of tracking just one request, and to give it a longer, less appealing name). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk	2019-02-05 17:20:11 +00:00
Chris Wilson	64d6c500a3	drm/i915: Generalise GPU activity tracking We currently track GPU memory usage inside VMA, such that we never release memory used by the GPU until after it has finished accessing it. However, we may want to track other resources aside from VMA, or we may want to split a VMA into multiple independent regions and track each separately. For this purpose, generalise our request tracking (akin to struct reservation_object) so that we can embed it into other objects. v2: Tweak error handling during selftest setup. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk	2019-02-05 17:12:00 +00:00
Chris Wilson	528cbd17ce	drm/i915: Move vma lookup to its own lock Remove the struct_mutex requirement for looking up the vma for an object. v2: Highlight how the race for duplicate vma creation is resolved on reacquiring the lock with a short comment. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk	2019-01-28 16:24:16 +00:00
Chris Wilson	09d7e46b97	drm/i915: Pull VM lists under the VM mutex. A starting point to counter the pervasive struct_mutex. For the goal of avoiding (or at least blocking under them!) global locks during user request submission, a simple but important step is being able to manage each clients GTT separately. For which, we want to replace using the struct_mutex as the guard for all things GTT/VM and switch instead to a specific mutex inside i915_address_space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-2-chris@chris-wilson.co.uk	2019-01-28 16:24:13 +00:00
Chris Wilson	499197dc16	drm/i915: Stop tracking MRU activity on VMA Our goal is to remove struct_mutex and replace it with fine grained locking. One of the thorny issues is our eviction logic for reclaiming space for an execbuffer (or GTT mmaping, among a few other examples). While eviction itself is easy to move under a per-VM mutex, performing the activity tracking is less agreeable. One solution is not to do any MRU tracking and do a simple coarse evaluation during eviction of active/inactive, with a loose temporal ordering of last insertion/evaluation. That keeps all the locking constrained to when we are manipulating the VM itself, neatly avoiding the tricky handling of possible recursive locking during execbuf and elsewhere. Note that discarding the MRU (currently implemented as a pair of lists, to avoid scanning the active list for a NONBLOCKING search) is unlikely to impact upon our efficiency to reclaim VM space (where we think a LRU model is best) as our current strategy is to use random idle replacement first before doing a search, and over time the use of softpinned 48b per-ppGTT is growing (thereby eliminating any need to perform any eviction searches, in theory at least) with the remaining users being found on much older devices (gen2-gen6). v2: Changelog and commentary rewritten to elaborate on the duality of a single list being both an inactive and active list. v3: Consolidate bool parameters into a single set of flags; don't comment on the duality of a single variable being a multiplicity of bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-1-chris@chris-wilson.co.uk	2019-01-28 16:24:09 +00:00
Jani Nikula	2ac5e38ea4	Merge drm/drm-next into drm-intel-next-queued Pull in v4.20-rc3 via drm-next. Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2018-11-20 13:14:08 +02:00
Christian König	ca05359f1e	dma-buf: allow reserving more than one shared fence slot Let's support simultaneous submissions to multiple engines. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.kernel.org/patch/10626149/	2018-10-25 13:45:07 +02:00
Tvrtko Ursulin	bbb8a9d7e0	drm/i915: GEM_WARN_ON considered harmful GEM_WARN_ON currently has dangerous semantics where it is completely compiled out on !GEM_DEBUG builds. This can leave users who expect it to be more like a WARN_ON, just without a warning in non-debug builds, in complete ignorance. Another gotcha with it is that it cannot be used as a statement. Which is again different from a standard kernel WARN_ON. This patch fixes both problems by making it behave as one would expect. It can now be used both as an expression and as statement, and also the condition evaluates properly in all builds - code under the conditional will therefore not unexpectedly disappear. To satisfy call sites which really want the code under the conditional to completely disappear, we add GEM_DEBUG_WARN_ON and convert some of the callers to it. This one can also be used as both expression and statement. >From the above it follows GEM_DEBUG_WARN_ON should be used in situations where we are certain the condition will be hit during development, but at a place in code where error can be handled to the benefit of not crashing the machine. GEM_WARN_ON on the other hand should be used where condition may happen in production and we just want to distinguish the level of debugging output emitted between the production and debug build. v2: * Dropped BUG_ON hunk. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181012063142.16080-1-tvrtko.ursulin@linux.intel.com	2018-10-18 10:10:12 +01:00
Chris Wilson	a4417b7b41	drm/i915: Stop holding a ref to the ppgtt from each vma The context owns both the ppgtt and the vma within it, and our activity tracking on the context ensures that we do not release active ppgtt. As the context fulfils our obligations for active memory tracking, we can relinquish the reference from the vma. This fixes a silly transient refleak from closed vma being kept alive until the entire system was idle, keeping all vm alive as well. Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Testcase: igt/gem_ctx_create/files Fixes: `3365e2268b` ("drm/i915: Lazily unbind vma on close") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Tested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180816073448.19396-1-chris@chris-wilson.co.uk	2018-08-16 13:31:37 +01:00
Chris Wilson	6a2f59e45a	drm/i915: Pull unpin map into vma release A reasonably common operation is to pin the map of the vma alongside the vma itself for the lifetime of the vma, and so release both pins at the same time as destroying the vma. It is common enough to pull into the release function, making that central function more attractive to a couple of other callsites. The continual ulterior motive is to sweep over errors on module load aborting... Testcase: igt/drv_module_reload/basic-reload-inject Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180721125037.20127-1-chris@chris-wilson.co.uk	2018-07-24 09:55:12 +01:00
Chris Wilson	46b1063f91	drm/i915: Handle recursive shrinker for vma->last_active allocation If we call into the shrinker for direct relcaim inside kmalloc, it will retire the requests. If we retire the vma->last_active while processing a new i915_vma_move_to_active() we can upset the delicate bookkeeping required for the cache. After the possible invocation of the shrinker, we need to double check the vma->last_active is still valid. Fixes: `8b293eb53a` ("drm/i915: Track the last-active inside the i915_vma") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105600#c39 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180719072206.16015-1-chris@chris-wilson.co.uk	2018-07-19 12:27:46 +01:00
Chris Wilson	8b293eb53a	drm/i915: Track the last-active inside the i915_vma Using a VMA on more than one timeline concurrently is the exception rather than the rule (using it concurrently on multiple engines). As we expect to only use one active tracker, store the most recently used tracker inside the i915_vma itself and only fallback to the rbtree if we need a second or more concurrent active trackers. v2: Comments on how we overwrite any existing last_active cache. v3: __list_del_entry() before list_replace_init() is confusing and, much more important, entirely redundant. v4: Note that both last_active and the rbtree may be simultaneously tracking this timeline, albeit with different requests, and so the vma may be retired twice for the same timeline. v5: No, that list_del is required! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706123157.9645-1-chris@chris-wilson.co.uk	2018-07-06 18:23:07 +01:00
Chris Wilson	5c3f8c221c	drm/i915: Track vma activity per fence.context, not per engine In the next patch, we will want to be able to use more flexible request timelines that can hop between engines. From the vma pov, we can then not rely on the binding of this request to an engine and so can not ensure that different requests are ordered through a per-engine timeline, and so we must track activity of all timelines. (We track activity on the vma itself to prevent unbinding from HW before the HW has finished accessing it.) v2: Switch to a rbtree for 32b safety (since using u64 as a radixtree index is fraught with aliasing of unsigned longs). v3: s/lookup_active/active_instance/ because we can never agree on names Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-5-chris@chris-wilson.co.uk	2018-07-06 18:22:43 +01:00
Chris Wilson	e6bb1d7f1a	drm/i915: Move i915_vma_move_to_active() to i915_vma.c i915_vma_move_to_active() has grown beyond its execbuf origins, and should take its rightful place in i915_vma.c as a method for i915_vma! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706103947.15919-4-chris@chris-wilson.co.uk	2018-07-06 18:22:41 +01:00
Chris Wilson	1eca65d922	drm/i915: Squelch very verbose error logging Having found the error causing the IGT test to fail, downgrade the verbose logging so that we stop flooding the syslogs as we deliberately provoke it many thousands of time during selftests. References: `10195b1e44` ("drm/i915: Show vma allocator stack when in doubt") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180706065332.15214-1-chris@chris-wilson.co.uk	2018-07-06 11:23:47 +01:00
Chris Wilson	7e7367d3bc	drm/i915: Try GGTT mmapping whole object as partial If the whole object is already pinned by HW for use as scanout, we will fail to move it to the mappable region and so must resort to using a partial VMA covering the whole object. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104513 Fixes: `aa136d9d72` ("drm/i915: Convert partial ggtt vma to full ggtt if it spans the entire object") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180630090509.469-1-chris@chris-wilson.co.uk	2018-07-02 17:36:09 +01:00
Chris Wilson	10195b1e44	drm/i915: Show vma allocator stack when in doubt At the moment, gem_exec_gttfill fails with a sporadic EBUSY due to us wanting to unbind a pinned batch. Let's dump who first bound that vma to see if that helps us identify who still unexpectedly has it pinned. v2: We cannot allocate inside the printer (as it may be on an fs-reclaim path), so hope for the best and build the string on the stack v3: stack depth of 16 routinely overflows a 512 character string, limit it to 12 to avoid unsightly truncation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180628132206.8329-1-chris@chris-wilson.co.uk	2018-06-28 20:56:35 +01:00
Chris Wilson	93f2cde2a4	drm/i915: Decouple vma vfuncs from vm To allow for future non-object backed vma, we need to be able to specialise the callbacks for binding, et al, the vma. For example, instead of calling vma->vm->bind_vma(), we now call vma->ops->bind_vma(). This gives us the opportunity to later override the operation for a custom vma. v2: flip order of unbind/bind Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-2-chris@chris-wilson.co.uk	2018-06-07 21:53:11 +01:00
Chris Wilson	520ea7c581	drm/i915: Prepare for non-object vma In order to allow ourselves to use VMA to wrap other entities other than GEM objects, we need to allow for the vma->obj backpointer to be NULL. In most cases, we know we are operating on a GEM object and its vma, but we need the core code (such as i915_vma_pin/insert/bind/unbind) to work regardless of the innards. The remaining eyesore here is vma->obj->cache_level and related (but less of an issue) vma->obj->gt_ro. With a bit of care we should mirror those on the vma itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180607154047.9171-1-chris@chris-wilson.co.uk	2018-06-07 21:53:10 +01:00
Chris Wilson	82ad6443a5	drm/i915/gtt: Rename i915_hw_ppgtt base member In the near future, I want to subclass gen6_hw_ppgtt as it contains a few specialised members and I wish to add more. To avoid the ugliness of using ppgtt->base.base, rename the i915_hw_ppgtt base member (i915_address_space) as vm, which is our common shorthand for an i915_address_space local. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605153758.18422-1-chris@chris-wilson.co.uk	2018-06-05 21:11:20 +01:00
Chris Wilson	83d317adfb	drm/i915/vma: Move the bind_count vs pin_count assertion to a helper To spare ourselves a long line later, refactor the repeated check of bind_count vs pin_count to a helper. v2: Fix up the commentary! Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180605094107.31367-1-chris@chris-wilson.co.uk	2018-06-05 15:16:07 +01:00
Chris Wilson	3365e2268b	drm/i915: Lazily unbind vma on close When userspace is passing around swapbuffers using DRI, we frequently have to open and close the same object in the foreign address space. This shows itself as the same object being rebound at roughly 30fps (with a second object also being rebound at 30fps), which involves us having to rewrite the page tables and maintain the drm_mm range manager every time. However, since the object still exists and it is only the local handle that disappears, if we are lazy and do not unbind the VMA immediately when the local user closes the object but defer it until the GPU is idle, then we can reuse the same VMA binding. We still have to be careful to mark the handle and lookup tables as closed to maintain the uABI, just allowing the underlying VMA to be resurrected if the user is able to access the same object from the same context again. If the object itself is destroyed (neither userspace keeping a handle to it), the VMA will be reaped immediately as usual. In the future, this will be even more useful as instantiating a new VMA for use on the GPU will become heavier. A nuisance indeed, so nip it in the bud. v2: s/__i915_vma_final_close/i915_vma_destroy/ etc. v3: Leave a hint as to why we deferred the unbind on close. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180503195115.22309-1-chris@chris-wilson.co.uk	2018-05-04 07:26:56 +01:00
Chris Wilson	e61e0f51ba	drm/i915: Rename drm_i915_gem_request to i915_request We want to de-emphasize the link between the request (dependency, execution and fence tracking) from GEM and so rename the struct from drm_i915_gem_request to i915_request. That is we may implement the GEM user interface on top of requests, but they are an abstraction for tracking execution rather than an implementation detail of GEM. (Since they are not tied to HW, we keep the i915 prefix as opposed to intel.) In short, the spatch: @@ @@ - struct drm_i915_gem_request + struct i915_request A corollary to contracting the type name, we also harmonise on using 'rq' shorthand for local variables where space if of the essence and repetition makes 'request' unwieldy. For globals and struct members, 'request' is still much preferred for its clarity. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2018-02-21 20:57:22 +00:00
Matthew Auld	73ebd50303	drm/i915: make mappable struct resource centric Now that we are using struct resource to track the stolen region, it is more convenient if we track the mappable region in a resource as well. v2: prefer iomap and gmadr naming scheme prefer DEFINE_RES_MEM Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-8-matthew.auld@intel.com	2017-12-12 12:30:21 +02:00
Chris Wilson	e2189dd078	drm/i915: Refactor common list iteration over GGTT vma In quite a few places, we have a list iteration over the vma on an object that only want to inspect GGTT vma. By construction, these are placed at the start of the list, so we have copied that knowledge into many callsites. Pull that knowledge back to i915_vma.h and provide a for_each_ggtt_vma() to tidy up the code. v2: Add a backreference from vma_create() to remind ourselves why we put ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail). v3: Fixup s/vma/V/ Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171207211407.31549-1-chris@chris-wilson.co.uk	2017-12-07 23:26:55 +00:00
Chris Wilson	7125397b82	drm/i915: Track GGTT writes on the vma As writes through the GTT and GGTT PTE updates do not share the same path, they are not strictly ordered and so we must explicitly flush the indirect writes prior to modifying the PTE. We do track outstanding GGTT writes on the object itself, but since the object may have multiple GGTT vma, that is overly coarse as we can track and flush individual vma as required. Whilst here, update the GGTT flushing behaviour for Cannonlake. v2: Hard-code ring offset to allow use during unload (after RCS may have been freed, or never existed!) References: https://bugs.freedesktop.org/show_bug.cgi?id=104002 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-2-chris@chris-wilson.co.uk	2017-12-07 14:01:59 +00:00
Chris Wilson	010e3e68cd	drm/i915: Remove vma from object on destroy, not close Originally we translated from the object to the vma by walking obj->vma_list to find the matching vm (for user lookups). Now we process user lookups using the rbtree, and we only use obj->vma_list itself for maintaining state (e.g. ensuring that all vma are flushed or rebound). As such maintenance needs to go on beyond the user's awareness of the vma, defer removal of the vma from the obj->vma_list from i915_vma_close() to i915_vma_destroy() Fixes: `5888fc9eac` ("drm/i915: Flush pending GTT writes before unbinding") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104155 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-12-07 14:00:20 +00:00
Chris Wilson	7f017b19fb	drm/i915: Mark up i915_vma_unbind() as a potential sleeper Whenever we want to unbind a vma, we must wait on all GPU activity to complete first. (This is what gives us the ability to do fine grained eviction and purging by only having to wait on the VMA that we need to unbind to proceed; though if pushed we can make it a rule that we are only allowed to unbind already idle VMA and move the burden of the work and organising the sleep onto the caller.) Currently, we might only sleep if the vma is still active on the GPU, but in principle i915_vma_unbind() always implies a sleep, so mark it up with a might_sleep(). Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> References: https://bugs.freedesktop.org/show_bug.cgi?id=103638 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171109213450.13875-2-chris@chris-wilson.co.uk	2017-11-09 22:06:09 +00:00
Chris Wilson	1ab22356b3	drm/i915: Prune the reservation shared fence array The shared fence array is not autopruning and may continue to grow as an object is shared between new timelines. Take the opportunity when we think the object is idle (we have to confirm that any external fence is also signaled) to decouple all the fences. We apply a similar trick after waiting on an object, see commit `e54ca97747` ("drm/i915: Remove completed fences after a wait") v2: No longer need to handle the batch pool as a special case. v3: Need to trylock from within i915_vma_retire as this may be called form the shrinker - and we may later try to allocate underneath the reservation lock, so a deadlock is possible. References: https://bugs.freedesktop.org/show_bug.cgi?id=102936 Fixes: `d07f0e59b2` ("drm/i915: Move GEM activity tracking into a common struct reservation_object") Fixes: `80b204bce8` ("drm/i915: Enable multiple timelines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171107220656.5020-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-11-08 15:40:31 +00:00
Chris Wilson	d36caeea4b	drm/i915: Assert vma->flags are updated correctly during binding As we bind, and unbind on error, we want to be sure that the vma->flags are updated to reflect the binding state so that on the next invocation all is well. v2: Take two. v3: Take three; vma-misplaced is checking map-and-fenceable so keep it last! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171105124550.32715-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>	2017-11-06 09:30:36 +00:00
Chris Wilson	f2123818ff	drm/i915: Move dev_priv->mm.[un]bound_list to its own lock Remove the struct_mutex requirement around dev_priv->mm.bound_list and dev_priv->mm.unbound_list by giving it its own spinlock. This reduces one more requirement for struct_mutex and in the process gives us slightly more accurate unbound_list tracking, which should improve the shrinker - but the drawback is that we drop the retirement before counting so i915_gem_object_is_active() may be stale and lead us to underestimate the number of objects that may be shrunk (see commit `bed50aea61` ("drm/i915/shrinker: Flush active on objects before counting")). v2: Crosslink the spinlock to the lists it protects, and btw this changes s/obj->global_link/obj->mm.link/ v3: Fix decoupling of old links in i915_gem_object_attach_phys() v3.1: Fix the fix, only unlink if it was linked v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk	2017-10-16 20:44:19 +01:00
Chris Wilson	a65adaf8a8	drm/i915: Track user GTT faulting per-vma We don't wish to refault the entire object (other vma) when unbinding one partial vma. To do this track which vma have been faulted into the user's address space. v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-10-09 17:07:29 +01:00
Chris Wilson	3bd4073524	drm/i915: Consolidate get_fence with pin_fence Following the pattern now used for obj->mm.pages, use just pin_fence and unpin_fence to control access to the fence registers. I.e. instead of calling get_fence(); pin_fence(), we now just need to call pin_fence(). This will make it easier to reduce the locking requirements around fence registers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-2-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-10-09 17:07:29 +01:00
Chris Wilson	b4563f595e	drm/i915: Pin fence for iomap Acquire the fence register for the iomap in i915_vma_pin_iomap() on behalf of the caller. We probably want for the caller to specify whether the fence should be pinned for their usage, but at the moment all callers do want the associated fence, or none, so take it on their behalf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-1-chris@chris-wilson.co.uk	2017-10-09 17:07:29 +01:00
Chris Wilson	bef27bdb6c	drm/i915: Assert we do not try to expand VMA for hugepage inside GGTT We only apply the hugepage PD redirection inside the ppGTT, so during i915_vma_insert() we want to exclude the GGTT from the additional alignment constraints (thereby avoiding the extra GTT pressure from fragmentation). Add an assert to document that intention alongside the comment. v2: After discussion with Matthew, make it a blanket GGTT ban (previously we allowed the expansion for appgtt, and so indirectly ggtt). There are issues we need to fix before allowing the current appgtt to be used with hugepages, and if we do, we probably want more care over when to expand/align, as the mappable aperture inside the ggtt is precious. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171009092019.20747-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-10-09 17:07:28 +01:00
Matthew Auld	855822be74	drm/i915: align 64K objects to 2M We can't mix 64K and 4K pte's in the same page-table, so for now we align 64K objects to 2M to avoid any potential mixing. This is potentially wasteful but in reality shouldn't be too bad since this only applies to the virtual address space of a 48b PPGTT. v2: don't separate logically connected ops Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-10-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-9-chris@chris-wilson.co.uk	2017-10-07 10:11:53 +01:00
Matthew Auld	7464284b35	drm/i915: align the vma start to the largest gtt page size For the 48b PPGTT try to align the vma start address to the required page size boundary to guarantee we use said page size in the gtt. If we are dealing with multiple page sizes, we can't guarantee anything and just align to the largest. For soft pinning and objects which need to be tightly packed into the lower 32bits we don't force any alignment. v2: various improvements suggested by Chris v3: use set_pages and better placement of page_sizes v4: prefer upper_32_bits() v5: assign vma->page_sizes = vma->obj->page_sizes directly prefer sizeof(vma->page_sizes) v6: fixup checking of end to exclude GGTT (which are assumed to be limited to 4G). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-9-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-8-chris@chris-wilson.co.uk	2017-10-07 10:11:52 +01:00
Matthew Auld	fa3f46afd3	drm/i915: introduce vm set_pages/clear_pages Move the setting/clearing of the vma->pages to a vm operation. Doing so neatens things up a little, but more importantly gives us a sane place to also set/clear the vma->pages_sizes, which we introduce later in preparation for supporting huge-pages. v2: remove redundant vma->pages check v3: GEM_BUG_ON(vma->pages) following i915_vma_remove Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-8-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-7-chris@chris-wilson.co.uk	2017-10-07 10:11:50 +01:00

1 2

94 Commits