linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-26 03:55:32 +07:00

Author	SHA1	Message	Date
Dave Airlie	399403c7ce	Merge tag 'drm-intel-next-2013-03-23' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: Highlights: - Imre's for_each_sg_pages rework (now also with the stolen mem backed case fixed with a hack) plus the drm prime sg list coalescing patch from Rahul Sharma. I have some follow-up cleanups pending, already acked by Andrew Morton. - Some prep-work for the crazy no-pch/display-less platform by Ben. - Some vlv patches, by far not all (Jesse et al). - Clean up the HDMI/SDVO #define confusion (Paulo) - gen2-4 vblank fixes from Ville. - Unclaimed register warning fixes for hsw (Paulo). More still to come ... - Complete pageflips which have been stuck in a gpu hang, should prevent stuck gl compositors (Ville). - pm patches for vt-switchless resume (Jesse). Note that the i915 enabling is not (yet) included, that took a bit longer to settle. PM patches are acked by Rafael Wysocki. - Minor fixlets all over from various people. * tag 'drm-intel-next-2013-03-23' of git://people.freedesktop.org/~danvet/drm-intel: (79 commits) drm/i915: Implement WaSwitchSolVfFArbitrationPriority drm/i915: Set the VIC in AVI infoframe for SDVO drm/i915: Kill a strange comment about DPMS functions drm/i915: Correct sandybrige overclocking drm/i915: Introduce GEN7_FEATURES for device info drm/i915: Move num_pipes to intel info drm/i915: fixup pd vs pt confusion in gen6 ppgtt code style nit: Align function parameter continuation properly. drm/i915: VLV doesn't have HDMI on port C drm/i915: DSPFW and BLC regs are in the display offset range drm/i915: set conservative clock gating values on VLV v2 drm/i915: fix WaDisablePSDDualDispatchEnable on VLV v2 drm/i915: add more VLV IDs drm/i915: use VLV DIP routines on VLV v2 drm/i915: add media well to VLV force wake routines v2 drm/i915: don't use plane pipe select on VLV drm: modify pages_to_sg prime helper to create optimized SG table drm/i915: use for_each_sg_page for setting up the gtt ptes drm/i915: create compact dma scatter lists for gem objects drm/i915: handle walking compact dma scatter lists ...	2013-04-05 10:18:13 +10:00
Lauri Kasanen	27b7c63a7c	drm/i915: Fix build failure ERROR: "__build_bug_on_failed" [drivers/gpu/drm/i915/i915.ko] undefined! Originally reported at http://www.gossamer-threads.com/lists/linux/kernel/1631803 FDO bug #62775 This needs to be backported to both 3.7 and 3.8 stable trees. Doesn't apply straight, but it's a quick change. Signed-off-by: Lauri Kasanen <cand@gmx.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62775 Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-27 15:05:42 +01:00
Ben Widawsky	41fda59682	drm/i915: Remove unneeded dev argument Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-18 03:03:19 +01:00
Ben Widawsky	cf144969d5	drm/i915: Remove unused file arg from execbuf Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-18 03:02:53 +01:00
Kees Cook	3118a4f652	drm/i915: bounds check execbuffer relocation count It is possible to wrap the counter used to allocate the buffer for relocation copies. This could lead to heap writing overflows. CVE-2013-0913 v3: collapse test, improve comment v2: move check into validate_exec_list Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Pinkie Pie Cc: stable@vger.kernel.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-13 21:31:03 +01:00
Kees Cook	3058753583	drm/i915: clarify reasoning for the access_ok call This clarifies the comment above the access_ok check so a missing VERIFY_READ doesn't alarm anyone. v2: - rewrote comment, thanks to Chris Wilson Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> [danvet: add patch history log to commit message.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-13 21:17:28 +01:00
Ville Syrjälä	2bb4629add	drm/i915: Add to_user_ptr() to_user_ptr() simply casts a pointer passed as u64 from user space to void __user * correctly. Using this lets us get rid of all the tiresome casts. The idea came from Chris Wilson <chris@chris-wilson.co.uk>. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-03 19:49:11 +01:00
Dave Airlie	6dc1c49da6	Merge branch 'fbcon-locking-fixes' of ssh://people.freedesktop.org/~airlied/linux into drm-next This pulls in most of Linus tree up to -rc6, this fixes the worst lockdep reported issues and re-enables fbcon lockdep. (not the fbcon maintainer) * 'fbcon-locking-fixes' of ssh://people.freedesktop.org/~airlied/linux: (529 commits) Revert "Revert "console: implement lockdep support for console_lock"" fbcon: fix locking harder fb: Yet another band-aid for fixing lockdep mess fb: rework locking to fix lock ordering on takeover	2013-02-08 12:10:18 +10:00
Ben Widawsky	5d4545aef5	drm/i915: Create a gtt structure The purpose of the gtt structure is to help isolate our gtt specific properties from the rest of the code (in doing so it help us finish the isolation from the AGP connection). The following members are pulled out (and renamed): gtt_start gtt_total gtt_mappable_end gtt_mappable gtt_base_addr gsm The gtt structure will serve as a nice place to put gen specific gtt routines in upcoming patches. As far as what else I feel belongs in this structure: it is meant to encapsulate the GTT's physical properties. This is why I've not added fields which track various drm_mm properties, or things like gtt_mtrr (which is itself a pretty transient field). Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [Ben modified commit messages] Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:33:56 +01:00
Chris Wilson	eef90ccb8a	drm/i915: Use the reloc.handle as an index into the execbuffer array Using copywinwin10 as an example that is dependent upon emitting a lot of relocations (2 per operation), we see improvements of: c2d/gm45: 618000.0/sec to 623000.0/sec. i3-330m: 748000.0/sec to 789000.0/sec. (measured relative to a baseline with neither optimisations applied). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:23:47 +01:00
Daniel Vetter	ed5982e6ce	drm/i915: Allow userspace to hint that the relocations were known Userspace is able to hint to the kernel that its command stream and auxiliary state buffers already hold the correct presumed addresses and so the relocation process may be skipped if the kernel does not need to move any buffers in preparation for the execbuffer. Thus for the common case where the allotment of buffers is static between batches, we can avoid the overhead of individually checking the relocation entries. Note that this requires userspace to supply the domain tracking and requests for workarounds itself that would otherwise be computed based upon the relocation entries. Using copywinwin10 as an example that is dependent upon emitting a lot of relocations (2 per operation), we see improvements of: c2d/gm45: 618000.0/sec to 632000.0/sec. i3-330m: 748000.0/sec to 830000.0/sec. (measured relative to a baseline with neither optimisations applied). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Fixup merge conflict in userspace header due to different baseline trees.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:23:36 +01:00
Chris Wilson	bcffc3faa6	drm/i915: Move the execbuffer objects list from the stack into the tracker Instead of passing around the eb-objects hashtable and a separate object list, we can include the object list into the eb-objects structure for convenience. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:08:02 +01:00
Chris Wilson	3b96eff447	drm/i915: Take the handle idr spinlock once for looking up the exec objects Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:08:01 +01:00
Chris Wilson	419fa72a19	drm/i915: Mark a temporary allocation for copy-from-user as such The difference is that the kernel will then know that this memory will be reclaimable in the near future. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:08:00 +01:00
Dave Airlie	b5cc6c0387	Merge tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: - seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris Wilson - some leftover kill-agp on gen6+ patches from Ben - hotplug improvements from Damien - clear fb when allocated from stolen, avoids dirt on the fbcon (Chris) - Stolen mem support from Chris Wilson, one of the many steps to get to real fastboot support. - Some DDI code cleanups from Paulo. - Some refactorings around lvds and dp code. - some random little bits&pieces * tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits) drm/i915: Return the real error code from intel_set_mode() drm/i915: Make GSM void drm/i915: Move GSM mapping into dev_priv drm/i915: Move even more gtt code to i915_gem_gtt drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno drm/i915: Introduce i915_gem_set_seqno() drm/i915: Always clear semaphore mboxes on seqno wrap drm/i915: Initialize hardware semaphore state on ring init drm/i915: Introduce ring set_seqno drm/i915: Missed conversion to gtt_pte_t drm/i915: Bug on unsupported swizzled platforms drm/i915: BUG() if fences are used on unsupported platform drm/i915: fixup overlay stolen memory leak drm/i915: clean up PIPECONF bpc #defines drm/i915: add intel_dp_set_signal_levels drm/i915: remove leftover display.update_wm assignment drm/i915: check for the PCH when setting pch_transcoder drm/i915: Clear the stolen fb before enabling drm/i915: Access to snooped system memory through the GTT is incoherent drm/i915: Remove stale comment about intel_dp_detect() ... Conflicts: drivers/gpu/drm/i915/intel_display.c	2013-01-17 20:34:08 +10:00
Chris Wilson	262b6d363f	drm/i915: Invalidate the relocation presumed_offsets along the slow path In the slow path, we are forced to copy the relocations prior to acquiring the struct mutex in order to handle pagefaults. We forgo copying the new offsets back into the relocation entries in order to prevent a recursive locking bug should we trigger a pagefault whilst holding the mutex for the reservations of the execbuffer. Therefore, we need to reset the presumed_offsets just in case the objects are rebound back into their old locations after relocating for this exexbuffer - if that were to happen we would assume the relocations were valid and leave the actual pointers to the kernels dangling, instant hang. Fixes regression from commit `bcf50e2775` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sun Nov 21 22:07:12 2010 +0000 drm/i915: Handle pagefaults in execbuffer user relocations Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55984 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@fwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-16 10:53:38 +01:00
Daniel Vetter	b45305fce5	drm/i915: Implement workaround for broken CS tlb on i830/845 Now that Chris Wilson demonstrated that the key for stability on early gen 2 is to simple _never_ exchange the physical backing storage of batch buffers I've tried a stab at a kernel solution. Doesn't look too nefarious imho, now that I don't try to be too clever for my own good any more. v2: After discussing the various techniques, we've decided to always blit batches on the suspect devices, but allow userspace to opt out of the kernel workaround assume full responsibility for providing coherent batches. The principal reason is that avoiding the blit does improve performance in a few key microbenchmarks and also in cairo-trace replays. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring wrap w/a. Suggested by Chris Wilson. - Also add the ACTHD check from Chris Wilson for the error state dumping, so that we still catch batches when userspace opts out of the w/a.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-17 17:27:02 +01:00
Chris Wilson	c1f093e09c	drm/i915: Remove check for conflicting relocation write-domains Simply use the last write-domain set for the object in the batch, trusting userspace to have correctly flushed the caches between usage as a write target. This check dates back from the golden age of having only a single operation per batch with the kernel repeating it for each cliprect, and conflicts both with userspace trying to efficiently batch multiple operations and with reducing the kernel overhead of relocation processing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 20:13:16 +01:00
Ville Syrjälä	ca9c46c5c7	drm/i915: Kill i915_gem_execbuffer_wait_for_flips() As per Chris Wilson's suggestion make i915_gem_execbuffer_wait_for_flips() go away. This was used to stall the GPU ring while there are pending page flips involving the relevant BO. Ie. while the BO is still being scanned out by the display controller. The recommended alternative is to use the page flip events to wait for the page flips to fully complete before reusing the BO of the old front buffer. Or use more buffers. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: don't remove obj->pending_flips, still required due to reorder patches.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:58:15 +01:00
Chris Wilson	9d7730914f	drm/i915: Preallocate next seqno before touching the ring Based on the work by Mika Kuoppala, we realised that we need to handle seqno wraparound prior to committing our changes to the ring. The most obvious point then is to grab the seqno inside intel_ring_begin(), and then to reuse that seqno for all ring operations until the next request. As intel_ring_begin() can fail, the callers must already be prepared to handle such failure and so we can safely add further checks. This patch looks like it should be split up into the interface changes and the tweaks to move seqno wrapping from the execbuffer into the core seqno increment. However, I found no easy way to break it into incremental steps without introducing further broken behaviour. v2: Mika found a silly mistake and a subtle error in the existing code; inside i915_gem_retire_requests() we were resetting the sync_seqno of the target ring based on the seqno from this ring - which are only related by the order of their allocation, not retirement. Hence we were applying the optimisation that the rings were synchronised too early, fortunately the only real casualty there is the handling of seqno wrapping. v3: Do not forget to reset the sync_seqno upon module reinitialisation, ala resume. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861 Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:52 +01:00
Chris Wilson	be7cb6347e	drm/i915: Remove bogus test for a present execbuffer The intention of checking obj->gtt_offset!=0 is to verify that the target object was listed in the execbuffer and had been bound into the GTT. This is guarranteed by the earlier rearrangement to split the execbuffer operation into reserve and relocation phases and then verified by the check that the target handle had been processed during the reservation phase. However, the actual checking of obj->gtt_offset==0 is bogus as we can indeed reference an object at offset 0. For instance, the framebuffer installed by the BIOS often resides at offset 0 - causing EINVAL as we legimately try to render using the stolen fb. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:45:03 +01:00
Ben Widawsky	e76e9aebcd	drm/i915: Stop using AGP layer for GEN6+ As a quick hack we make the old intel_gtt structure mutable so we can fool a bunch of the existing code which depends on elements in that data structure. We can/should try to remove this in a subsequent patch. This should preserve the old gtt init behavior which upon writing these patches seems incorrect. The next patch will fix these things. The one exception is VLV which doesn't have the preserved flush control write behavior. Since we want to do that for all GEN6+ stuff, we'll handle that in a later patch. Mainstream VLV support doesn't actually exist yet anyway. v2: Update the comment to remove the "voodoo" Check that the last pte written matches what we readback v3: actually kill cache_level_to_agp_type since most of the flags will disappear in an upcoming patch v4: v3 was actually not what we wanted (Daniel) Make the ggtt bind assertions better and stricter (Chris) Fix some uncaught errors at gtt init (Chris) Some other random stuff that Chris wanted v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a tad more robust by mapping everything != CACHE_NONE to the cached agp flag - we have a 1:1 uncached mapping, but different modes of cacheable (at least on later generations). Suggested by Chris Wilson.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:42 +01:00
Daniel Vetter	c2fb791692	Linux 3.7-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJQgvdwAAoJEHm+PkMAQRiG+3AH/i2XsqqN3VctL0nnbWfvds+Q aKulfIdJTjKiVAsawPUtRqReZ8ijiebrgA/53lZLlrFOoPPQ5+LHmnSyQF6gErOY NuAE1lijXDRM1pwBlhvOBbAj26wUobGjqONFJ9OkKr758Ue8ds/Q7UdxyEgmYgmg tvVMzfRcICzryUV3PcqL+3cNPpCUdT6wGGRJ9DCv/jvGiWKExWhOle5oltrmxk+D NsqRcws5pEubfHE4J8BvNWr8lE1kHfYVhrJETiLJUiN2XAJcbI4Jy7rU/3EGteNS 0HMZdaPPjV874lohdM70X2225SbYrCVkAYB5hnZCTeC3tYyCawBBPMQoyAiOcmU= =+861 -----END PGP SIGNATURE----- Merge tag 'v3.7-rc2' into drm-intel-next-queued Linux 3.7-rc2 Backmerge to solve two ugly conflicts: - uapi. We've already added new ioctl definitions for -next. Do I need to say more? - wc support gtt ptes. We've had to revert this for snb+ for 3.7 and also fix a few other things in the code. Now we know how to make it work on snb+, but to avoid losing the other fixes do the backmerge first before re-enabling wc gtt ptes on snb+. And a few other minor things, among them git getting confused in intel_dp.c and seemingly causing a conflict out of nothing ... Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c drivers/gpu/drm/i915/intel_modes.c include/drm/i915_drm.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-22 14:34:51 +02:00
Chris Wilson	d7d4eeddb8	drm/i915: Allow DRM_ROOT_ONLY\|DRM_MASTER to submit privileged batchbuffers With the introduction of per-process GTT space, the hardware designers thought it wise to also limit the ability to write to MMIO space to only a "secure" batch buffer. The ability to rewrite registers is the only way to program the hardware to perform certain operations like scanline waits (required for tear-free windowed updates). So we either have a choice of adding an interface to perform those synchronized updates inside the kernel, or we permit certain processes the ability to write to the "safe" registers from within its command stream. This patch exposes the ability to submit a SECURE batch buffer to DRM_ROOT_ONLY\|DRM_MASTER processes. v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security bit (bit 13, accidentally not set). Also add a comment explaining why secure batches need a global gtt binding. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) [danvet: added hsw fixup.] Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-17 21:06:59 +02:00
Linus Torvalds	612a9aab56	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm merge (part 1) from Dave Airlie: "So first of all my tree and uapi stuff has a conflict mess, its my fault as the nouveau stuff didn't hit -next as were trying to rebase regressions out of it before we merged. Highlights: - SH mobile modesetting driver and associated helpers - some DRM core documentation - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write combined pte writing, ilk rc6 support, - nouveau: major driver rework into a hw core driver, makes features like SLI a lot saner to implement, - psb: add eDP/DP support for Cedarview - radeon: 2 layer page tables, async VM pte updates, better PLL selection for > 2 screens, better ACPI interactions The rest is general grab bag of fixes. So why part 1? well I have the exynos pull req which came in a bit late but was waiting for me to do something they shouldn't have and it looks fairly safe, and David Howells has some more header cleanups he'd like me to pull, that seem like a good idea, but I'd like to get this merge out of the way so -next dosen't get blocked." Tons of conflicts mostly due to silly include line changes, but mostly mindless. A few other small semantic conflicts too, noted from Dave's pre-merged branch. * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits) drm/nv98/crypt: fix fuc build with latest envyas drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering drm/nv41/vm: fix and enable use of "real" pciegart drm/nv44/vm: fix and enable use of "real" pciegart drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie drm/nouveau: store supported dma mask in vmmgr drm/nvc0/ibus: initial implementation of subdev drm/nouveau/therm: add support for fan-control modes drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules drm/nouveau/therm: calculate the pwm divisor on nv50+ drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster drm/nouveau/therm: move thermal-related functions to the therm subdev drm/nouveau/bios: parse the pwm divisor from the perf table drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices drm/nouveau/therm: rework thermal table parsing drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table drm/nouveau: fix pm initialization order drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it drm/nouveau: log channel debug/error messages from client object rather than drm client drm/nouveau: have drm debugging macros build on top of core macros ...	2012-10-03 23:29:23 -07:00
David Howells	760285e7e7	UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/ Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:07 +01:00
David Howells	4126d5d61f	UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding patch. Without this patch and the patch to make include the UAPI headers from the core headers, after the UAPI split, the DRM C sources cannot find these UAPI headers because the DRM code relies on specific -I flags to make #include "..." work on headers in include/drm/ - but that does not work after the UAPI split without adding more -I flags. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:05 +01:00
Chris Wilson	41783eea1a	drm/i915: Assert that the exec object lookup table is a power-of-two As we make the simplification of using a power-of-two size for the execbuffer handle-to-object TLB, we should validate that this is actually true and so clarify that premise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:23:11 +02:00
Chris Wilson	ba7a64587b	drm/i915: Drop the misleading cast to the wrong user pointer type The exec_list is of type drm_i915_gem_exec_object2 and so casting it to a drm_i915_gem_relocation_entry is very confusing! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:23:05 +02:00
Chris Wilson	9da3da660d	drm/i915: Replace the array of pages with a scatterlist Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Chris Wilson	7788a76520	drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer If we need to stall in order to complete the pin_and_fence operation during execbuffer reservation, there is a high likelihood that the operation will be interrupted by a signal (thanks X!). In order to simplify the cleanup along that error path, the object was unconditionally unbound and the error propagated. However, being interrupted here is far more common than I would like and so we can strive to avoid the extra work by eliminating the forced unbind. v2: In discussion over the indecent colour of the new functions and unwind path, we realised that we can use the new unreserve function to clean up the code even further. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 21:02:51 +02:00
Chris Wilson	504c7267a1	drm/i915: Use cpu relocations if the object is in the GTT but not mappable This prevents the case of unbinding the object in order to process the relocations through the GTT and then rebinding it only to then proceed to use cpu relocations as the object is now in the CPU write domain. By choosing to use cpu relocations up front, we can therefore avoid the rebind penalty. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:29:16 +02:00
Chris Wilson	86a1ee26bb	drm/i915: Only pwrite through the GTT if there is space in the aperture Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:03:33 +02:00
Chris Wilson	6c085a728c	drm/i915: Track unbound pages When dealing with a working set larger than the GATT, or even the mappable aperture when touching through the GTT, we end up with evicting objects only to rebind them at a new offset again later. Moving an object into and out of the GTT requires clflushing the pages, thus causing a double-clflush penalty for rebinding. To avoid having to clflush on rebinding, we can track the pages as they are evicted from the GTT and only relinquish those pages on memory pressure. As usual, if it were not for the handling of out-of-memory condition and having to manually shrink our own bo caches, it would be a net reduction of code. Alas. Note: The patch also contains a few changes to the last-hope evict_everything logic in i916_gem_execbuffer.c - we no longer try to only evict the purgeable stuff in a first try (since that's superflous and only helps in OOM corner-cases, not fragmented-gtt trashing situations). Also, the extraction of the get_pages retry loop from bind_to_gtt (and other callsites) to get_pages should imo have been a separate patch. v2: Ditch the newly added put_pages (for unbound objects only) in i915_gem_reset. A quick irc discussion hasn't revealed any important reason for this, so if we need this, I'd like to have a git blame'able explanation for it. v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Split out code movements and rant a bit in the commit message with a few Notes. Done v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:11 +02:00
Daniel Vetter	a22ddff8be	Linux 3.6-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJQLWtvAAoJEHm+PkMAQRiG/DYH+wd0FqfEuYkYk4KPyAPuhKpX zX7HYfLvyJE/ZYIdrhjq1E6Xm2KNr7gtX7/Rdzi2W38M9sjbYzwG1UGIw51qnxWy yZJH9BGkfyQgQPeuDGohfB6DkDy2JWr2eqMDvakjOwgBsIzji0PQD/f3UvndhtUa c+tTj/kjavHE1Yr2Wy6OnRZz3Uc0hIMn/Q0JqtbCs3LUgEV1KA4OEAe56XNz4Ku4 WE+FFaGFPvtriQsQON+ohPS5IC8jzQGK/0vbrJ4lWjFnZy4gvZXnborTOwD0WSQG fbsNuxp1AaM2/pqfMwXm1w0ADvwOITHNiwwXf9id6DoK81QwTFpUdvKpn6yB6gQ= =rurr -----END PGP SIGNATURE----- Merge tag 'v3.6-rc2' into drm-intel-next Backmerge Linux 3.6-rc2 to resolve a few funny conflicts before we put even more madness on top: - drivers/gpu/drm/i915/i915_irq.c: Just a spurious WARN removed in -fixes, that has been changed in a variable-rename in -next, too. - drivers/gpu/drm/i915/intel_ringbuffer.c: -next remove scratch_addr (since all their users have been extracted in another fucntion), -fixes added another user for a hw workaroudn. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-17 09:01:08 +02:00
Eric Anholt	e844b990b1	drm/i915: Don't forget to apply SNB PIPE_CONTROL GTT workaround. If a buffer that was the target of a PIPE_CONTROL from userland was a reused one that hadn't been evicted which had not previously had this workaround applied, then the early return for a correct presumed_offset in this function meant we would not bind it into the GTT and the write would land somewhere else. Fixes reproducible failures with GL_EXT_timer_query usage in apitrace, and I also expect it to fix the intermittent OQ issues on snb that danvet's been working on. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48019 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52932 Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Carl Worth <cworth@cworth.org> Tested-by: Carl Worth <cworth@cworth.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-05 21:45:01 +02:00
Chris Wilson	f047e395dd	drm/i915: Avoid concurrent access when marking the device as idle/busy As suggested by Daniel, rip out the independent timers for device and crtc busyness and integrate the manual powermanagement of the display engine into the GEM core and its request tracking. The benefits are that the code is a lot smaller, fewer moving parts and should fit more neatly into the overall activity tracking of the driver. v2: Complete overhaul and removal of the racy timers and workers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:56 +02:00
Chris Wilson	a7b9761d0a	drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcs By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:55 +02:00
Chris Wilson	016fd0c1ae	drm/i915: Clear the pending_gpu_fenced_access flag at the start of execbuffer Otherwise once we use the buffer with a BLT command on gen2/3, we will always regard future command submissions as continuing the fenced access. However, now that we flush/invalidate between every batch we can drop this pessimism. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:55 +02:00
Daniel Vetter	6ac42f4148	drm/i915: Replace the complex flushing logic with simple invalidate/flush all Now that we unconditionally flush and invalidate between every batch buffer, we no longer need the complex logic to decide which domains require flushing. Remove it and rejoice. v2 (danvet): Keep around the flip waiting logic. It's gross and broken, I know, but we can't just kill that thing ... even if we just keep it around as a reminder that things are broken. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:54 +02:00
Chris Wilson	69c2fc8913	drm/i915: Remove the per-ring write list This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:53 +02:00
Chris Wilson	0201f1ecf4	drm/i915: Replace the pending_gpu_write flag with an explicit seqno As we always flush the GPU cache prior to emitting the breadcrumb, we no longer have to worry about the deferred flush causing the pending_gpu_write to be delayed. So we can instead utilize the known last_write_seqno to hopefully minimise the wait times. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:52 +02:00
Chris Wilson	3bb73aba1e	drm/i915: Allow late allocation of request for i915_add_request() Request preallocation was added to i915_add_request() in order to support the overlay. However, not all users care and can quite happily ignore the failure to allocate the request as they will simply repeat the request in the future. By pushing the allocation down into i915_add_request(), we can then remove some rather ugly error handling in the callers. v2: Nullify request->file_priv otherwise we chase a garbage pointer when retiring requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:51 +02:00
Eric Anholt	0da5cec1de	drm/i915: Set the context before setting up regs for the context. Fixes failures in transform feedback on gen7 because our SOL_RESET flag was setting the transform feedback offsets in the old context (occasionally happened to be ours) instead of the new context. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 10:39:59 +02:00
Chris Wilson	09cf7c9a12	drm/i915: Insert a flush between batches if the breadcrumb was dropped If we drop the breadcrumb request after a batch due to a signal for example we aim to fix it up at the next opportunity. In this case we emit a second batchbuffer with no waits upon the first and so no opportunity to insert the missing request, so we need to emit the missing flush for coherency. (Note that that invalidating the render cache is the same as flushing it, so there should have been no observable corruption.) Note that beside simply adding the missing flush, avoiding potential render corruption, this will also fix at least parts of the problem introduced by some funny interaction of these two commits: commit `de2b998552` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jul 4 22:52:50 2012 +0200 drm/i915: don't return a spurious -EIO from intel_ring_begin which allowed intel_ring_begin to return -ERESTARTSYS and commit `cc889e0f6c` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jun 13 20:45:19 2012 +0200 drm/i915: disable flushing_list/gpu_write_list which essentially disabled the flushing list. The issue happens when we submit a batch & emit it, but get interrupted (thanks to the first patch) while trying to emit the flush. On the next batch we still assume that the full gpu domain handling is in effect and hence compute the invalidate&flushing domains. But thanks to the 2nd patch we totally ignore these and only invalidate all gpu domains, presuming that any required flushes have been issued already. Which is wrong and eventually results in us updating the new write_domain values with the computed pending_write_domain values, which leaves an object with write_domain == 0 on the gpu_write_list. As soon as we try to unbind that object, things blow up. Fix this by emitting the missing flush according to the new ring->gpu_caches_dirty flag. Note that this does _not_ fix all the current cases where we end up with an object on the flushing_list that can't be flushed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52040 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add bug explanation to commit message.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-20 12:21:40 +02:00
Daniel Vetter	cc889e0f6c	drm/i915: disable flushing_list/gpu_write_list This is just the minimal patch to disable all this code so that we can do decent amounts of QA before we rip it all out. The complicating thing is that we need to flush the gpu caches after the batchbuffer is emitted. Which is past the point of no return where execbuffer can't fail any more (otherwise we risk submitting the same batch multiple times). Hence we need to add a flag to track whether any caches associated with that ring are dirty. And emit the flush in add_request if that's the case. Note that this has a quite a few behaviour changes: - Caches get flushed/invalidated unconditionally. - Invalidation now happens after potential inter-ring sync. I've bantered around a bit with Chris on irc whether this fixes anything, and it might or might not. The only thing clear is that with these changes it's much easier to reason about correctness. Also rip out a lone get_next_request_seqno in the execbuffer retire_commands function. I've dug around and I couldn't figure out why that is still there, with the outstanding lazy request stuff it shouldn't be necessary. v2: Chris Wilson complained that I also invalidate the read caches when flushing after a batchbuffer. Now optimized. v3: Added some comments to explain the new flushing behaviour. Cc: Eric Anholt <eric@anholt.net> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-20 13:54:28 +02:00
Ben Widawsky	6e0a69dbc8	drm/i915/context: switch contexts with execbuf2 Use the rsvd1 field in execbuf2 to specify the context ID associated with the workload. This will allow the driver to do the proper context switch when/if needed. v2: Add checks for context switches on rings not supporting contexts. Before the code would silently ignore such requests. Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:21 +02:00
Chris Wilson	a15817cf16	drm/i915: Check whether the ring is initialised prior to dispatch Rather than use the magic feature tests HAS_BLT/HAS_BSD just check whether the ring we are about to dispatch the execbuffer on is initialised. v2: Use intel_ring_initialized() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-19 22:39:53 +02:00
Chris Wilson	acb87dfb4b	drm/i915: Limit calling mark-busy only for potential scanouts The principle of intel_mark_busy() is that we want to spot the transition of when the display engine is being used in order to bump powersaving modes and increase display clocks. As such it is only important when the display is changing, i.e. when rendering to the scanout or other sprite/plane, and these are characterised by being pinned. v2: Mark the whole device as busy on execbuffer and pageflips as well and rebase against dinq for the minor bug fix to be immediately applicable. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: fix compile fail.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-08 15:10:34 +02:00
Daniel Vetter	5e13a0c5ec	Merge remote-tracking branch 'airlied/drm-core-next' into drm-intel-next-queued Backmerge of drm-next to resolve a few ugly conflicts and to get a few fixes from 3.4-rc6 (which drm-next has already merged). Note that this merge also restricts the stencil cache lra evict policy workaround to snb (as it should) - I had to frob the code anyway because the CM0_MASK_SHIFT define died in the masked bit cleanups. We need the backmerge to get Paulo Zanoni's infoframe regression fix for gm45 - further bugfixes from him touch the same area and would needlessly conflict. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-08 13:39:59 +02:00

1 2 3

112 Commits