With the possibility of many more rings being added in the future, the
drm_i915_private structure could bloat, as an array of type
intel_engine_cs is embedded inside it:
struct intel_engine_cs engine[I915_NUM_ENGINES];
Though this is still fine, as generally only a single instance of the
drm_i915_private structure is used, not all of the possible rings are
enabled or active on most platforms. Some memory can be saved by
allocating the intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.
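A minimal sketch of the iteration this enables (the macro body is
illustrative, and assumes a for_each_if() helper that skips iterations
where the condition is NULL/false; init_one_engine() is a hypothetical
per-engine hook used only for the example):

/* Walk only the allocated engines; disabled slots stay NULL and are skipped. */
#define for_each_engine(engine__, dev_priv__, id__) \
	for ((id__) = 0; (id__) < I915_NUM_ENGINES; (id__)++) \
		for_each_if ((engine__) = (dev_priv__)->engine[(id__)])

enum intel_engine_id id;
struct intel_engine_cs *engine;

for_each_engine(engine, dev_priv, id)
	init_one_engine(engine);	/* no intel_engine_initialized() check needed */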
There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
the i915.o file (though for the i915.ko file the text size remains the same,
1193131 bytes).
v2:
- Remove the engine iterator field added in drm_i915_private structure,
instead pass a local iterator variable to the for_each_engine**
macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
NULL pointer check on engine pointer. (Chris)
v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
can be used in place of it. (Chris)
- Protect the access to the Render engine Fault register with a NULL check, as
engine-specific init is done later in the driver load sequence.
v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().
v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
allocation of intel_engine_cs structure. (Chris)
v6:
- Rebase.
v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.
v8: Rebase.
v9: Rebase.
v10:
- For index calculation use the engine ID instead of pointer-based arithmetic in
intel_engine_sync_index(), as engine pointers are not contiguous now. (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
check for NULL engine pointer in cleanup() routines. (Joonas)
v11: Rebase.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
A significant proportion of the command-parsing time for some batches is
the cost of finding the register in the mmio table. We ensure that those
tables are in ascending order so that we could do a binary search if it
were ever merited. It is.
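Ascending order makes a lookup like the following possible (a sketch: the
descriptor type and i915_mmio_reg_offset() follow the driver's naming, but
this helper is illustrative, not the merged code):

static const struct drm_i915_reg_descriptor *
find_reg(const struct drm_i915_reg_descriptor *table, int count, u32 addr)
{
	int lo = 0, hi = count - 1;

	while (lo <= hi) {
		int mid = lo + (hi - lo) / 2;
		u32 entry = i915_mmio_reg_offset(table[mid].addr);

		if (entry < addr)
			lo = mid + 1;
		else if (entry > addr)
			hi = mid - 1;
		else
			return &table[mid];
	}

	return NULL;	/* not whitelisted */
}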
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-38-chris@chris-wilson.co.uk
On the blitter (and in test code), we see long sequences of repeated
commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these,
we can skip the hashtable lookup by remembering the previous command
descriptor and doing a straightforward compare of the command header.
The corollary is that we need to do one extra comparison before looking
up new commands.
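Roughly, the fast path becomes a single masked compare before falling back
to the hashtable (a sketch; the cmd.mask/cmd.value descriptor fields follow
the driver's layout, but the helpers around them are illustrative):

static const struct drm_i915_cmd_descriptor *
find_cmd(struct intel_engine_cs *engine, u32 header,
	 const struct drm_i915_cmd_descriptor **prev)
{
	/* Repeated command? One compare instead of a hashtable lookup. */
	if (*prev && (header & (*prev)->cmd.mask) == (*prev)->cmd.value)
		return *prev;

	*prev = lookup_cmd_in_hashtable(engine, header); /* hypothetical slow path */
	return *prev;
}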
v2: Less magic mask (ok, it is still magic, but now you cannot see!)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-36-chris@chris-wilson.co.uk
The existing code's hash function is very suboptimal (most 3D commands
use the same bucket, degrading the hash to a long list). The code even
acknowledges that the issue was known and the fix simple:
/*
* If we attempt to generate a perfect hash, we should be able to look at bits
* 31:29 of a command from a batch buffer and use the full mask for that
* client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this.
*/
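In other words, the key should incorporate the client field so that MI, 2D
and 3D commands no longer collide. A sketch of that idea (the INSTR_* defines
follow the driver; the exact shift counts here are illustrative):

static u32 cmd_hash_key(u32 header)
{
	switch (header >> INSTR_CLIENT_SHIFT) {		/* bits 31:29 */
	case INSTR_MI_CLIENT:
		return header >> (32 - 9);		/* client + MI opcode */
	case INSTR_BC_CLIENT:
		return header >> (32 - 10);		/* client + 2D opcode */
	default:
		return header >> (32 - 16);		/* 3D/media: wider opcode field */
	}
}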
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-35-chris@chris-wilson.co.uk
For simplicity, we want to continue using a contiguous mapping of the
command buffer, but we can reduce the number of vmappings we hold by
switching over to a page-by-page copy from the user batch buffer to the
shadow. The cost for saving one linear mapping is about 5% in trivial
workloads - which is more or less the overhead in calling kmap_atomic().
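A sketch of the per-page copy (the scatterlist page-iterator and
kmap_atomic() calls are the standard kernel helpers; the function itself is
illustrative and assumes the source pages are already pinned):

static void copy_batch(void *dst, struct drm_i915_gem_object *src_obj,
		       unsigned long offset, unsigned long length)
{
	unsigned long in_page = offset_in_page(offset);
	unsigned long copied = 0;
	struct sg_page_iter sg_iter;

	for_each_sg_page(src_obj->pages->sgl, &sg_iter,
			 src_obj->pages->nents, offset >> PAGE_SHIFT) {
		void *vaddr = kmap_atomic(sg_page_iter_page(&sg_iter));
		unsigned long chunk = min(length - copied, PAGE_SIZE - in_page);

		memcpy(dst + copied, vaddr + in_page, chunk);
		kunmap_atomic(vaddr);

		copied += chunk;
		in_page = 0;			/* later pages start at offset 0 */
		if (copied == length)
			break;
	}
}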
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-34-chris@chris-wilson.co.uk
The single largest factor in the overhead of parsing the commands is the
setup of the virtual mapping to provide a continuous block for the batch
buffer. If we keep those vmappings around (against the better judgement
of mm/vmalloc.c, which we offset by handwaving and looking suggestively
at the shrinker) we can dramatically improve the performance of the
parser for small batches (such as media workloads). Furthermore, we can
use the prepare shmem read/write functions to determine how best we
need to clflush the range (rather than every page of the object).
The impact of caching both src/dst vmaps is +80% on ivb and +140% on byt
for the throughput on small batches. (Caching just the dst vmap and
iterating over the src, doing a page by page copy is roughly 5% slower
on both platforms. That may be an acceptable trade-off to eliminate one
cached vmapping, and we may be able to reduce the per-page copying overhead
further.) For *this* simple test case, the cmdparser is now within a
factor of 2 of ideal performance.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-33-chris@chris-wilson.co.uk
Since I have been using the BCS_TIMESTAMP to measure latency of
execution upon the blitter ring, allow regular userspace to also read
from that register. They are already allowed RCS_TIMESTAMP!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-32-chris@chris-wilson.co.uk
If the developer adds a register in the wrong order, we BUG during boot.
That makes development and testing very difficult. Let's be a bit more
friendly and disable the command parser with a big warning if the tables
are invalid.
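A sketch of the friendlier failure mode (illustrative; the flag used to
disable the parser for the engine is an assumption about the driver's
bookkeeping):

static void validate_regs(struct intel_engine_cs *engine,
			  const struct drm_i915_reg_descriptor *table, int count)
{
	int i;

	for (i = 1; i < count; i++) {
		if (i915_mmio_reg_offset(table[i].addr) <=
		    i915_mmio_reg_offset(table[i - 1].addr)) {
			DRM_ERROR("CMD: register table not sorted at entry %d, disabling parser\n", i);
			engine->needs_cmd_parser = false;	/* assumed flag name */
			return;
		}
	}
}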
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-31-chris@chris-wilson.co.uk
This is a companion to i915_gem_obj_prepare_shmem_read() that prepares
the backing storage for direct writes. It first serialises with the GPU,
pins the backing storage and then indicates what clflushes are required in
order for the writes to be coherent.
Whilst here, fix support for ancient CPUs without clflush for which we
cannot do the GTT+clflush tricks.
v2: Add i915_gem_obj_finish_shmem_access() for symmetry
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-8-chris@chris-wilson.co.uk
   text    data    bss      dec    hex filename
6309351 3578714 696320 10584385 a18141 vmlinux
6308391 3578714 696320 10583425 a17d81 vmlinux
Almost 1KiB of code reduction.
v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions
   text    data    bss      dec    hex filename
6304579 3578778 696320 10579677 a16edd vmlinux
6303427 3578778 696320 10578525 a16a5d vmlinux
Now over 1KiB!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk
Allowing register copies where the source and destination are both
whitelisted should be safe, and is useful. For example, Mesa uses
this to load the command streamer math registers with data from the
pipeline statistics counters.
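A sketch of the check (using a lookup such as the find_reg() sketch earlier;
the MI_LOAD_REGISTER_REG layout of header, source offset, destination offset
is the documented one, but this helper is illustrative):

static bool check_reg_copy(const u32 *cmd,
			   const struct drm_i915_reg_descriptor *table, int count)
{
	/* cmd[0] = header, cmd[1] = source offset, cmd[2] = destination offset */
	return find_reg(table, count, cmd[1]) &&
	       find_reg(table, count, cmd[2]);
}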
v2: Reject writes to OACONTROL (and reads as well :(
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462521014-13595-1-git-send-email-chris@chris-wilson.co.uk
If the command parser is not active, then it is appropriate to report it
as operating at version 0 as no higher mode is supported. This greatly
simplifies userspace querying for the command parser as we then do not
need to second guess when it will be active (a mixture of module
parameters and generational support, which may change over time).
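A sketch of the simplified query (the enable check and the version number
shown are illustrative, not the exact conditions the driver uses):

int i915_cmd_parser_get_version(void)
{
	if (!i915.enable_cmd_parser)	/* illustrative "parser inactive" test */
		return 0;		/* version 0: no higher mode supported */

	return 7;			/* illustrative current version */
}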
v2: s/comand/command/ misspelling in comment
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462368336-21230-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Now that we can whitelist registers only on Haswell, move HSW_SCRATCH1
and HSW_ROW_CHICKEN3 into a separate Haswell only table.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-4-git-send-email-jordan.l.justen@intel.com
For Haswell, we will want another table of registers while retaining
the large common table of whitelisted registers shared by all gen7
devices.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
[danvet: Pipe patch through sed -e 's/\<ring\>/engine/g' to make it
apply.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Make I915_READ and I915_WRITE more type safe by wrapping the register
offset in a struct. This should eliminate most of the fumbles we've had
with misplaced parens.
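The core of the idea, sketched (this mirrors the helpers named in v2 below;
exact spellings in the tree may differ):

typedef struct {
	u32 reg;
} i915_reg_t;

#define _MMIO(r) ((const i915_reg_t){ .reg = (r) })

static inline u32 i915_mmio_reg_offset(i915_reg_t reg)
{
	return reg.reg;
}

/* I915_READ()/I915_WRITE() now take an i915_reg_t, so passing a bare
 * integer (or a register with misplaced parens) fails to compile. */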
This only takes care of normal mmio registers. We could extend the idea
to other register types and define each with its own struct. That way
you wouldn't be able to accidentally pass the wrong thing to a specific
register access function.
The gpio_reg setup is probably the ugliest thing left. But I figure I'd
just leave it for now, and wait for some divine inspiration to strike
before making it nice.
As for the generated code, it's actually a bit better sometimes. E.g.
looking at i915_irq_handler(), we can see the following change:
lea 0x70024(%rdx,%rax,1),%r9d
mov $0x1,%edx
- movslq %r9d,%r9
- mov %r9,%rsi
- mov %r9,-0x58(%rbp)
- callq *0xd8(%rbx)
+ mov %r9d,%esi
+ mov %r9d,-0x48(%rbp)
callq *0xd8(%rbx)
So previously gcc thought the register offset might be signed and
decided to sign extend it, just in case. The rest appears to be
mostly just minor shuffling of instructions.
v2: i915_mmio_reg_{offset,equal,valid}() helpers added
s/_REG/_MMIO/ in the register defines
no more switch statements left to worry about
ring_emit stuff got sorted in a prep patch
cmd parser, lrc context and w/a batch buildup also in prep patch
vgpu stuff cleaned up and moved to a prep patch
all other unrelated changes split out
v3: Rebased due to BXT DSI/BLC, MOCS, etc.
v4: Rebased due to churn, s/i915_mmio_reg_t/i915_reg_t/
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1447853606-2751-1-git-send-email-ville.syrjala@linux.intel.com
This is required to support glDispatchComputeIndirect for gen7.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes regression from
commit f1afe24f0e
Author: Arun Siluvery <arun.siluvery@linux.intel.com>
Date: Tue Aug 4 16:22:20 2015 +0100
drm/i915: Change SRM, LRM instructions to use correct length
which forgot to account for the length bias when declaring the fixed
length.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91844
Reported-by: Andreas Reis <andreas.reis@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was forgotten in
commit d351f6d948
Author: Francisco Jerez <currojerez@riseup.net>
Date: Fri May 29 16:44:15 2015 +0300
drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
MI_STORE_REGISTER_MEM and MI_LOAD_REGISTER_MEM are not really
variable-length instructions, unlike MI_LOAD_REGISTER_IMM which expects
(reg, addr) pairs, so use a fixed length for these instructions.
v2: rebase
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Appease checkpatch as Mika spotted in i915_reg.h - it seems
terminally unhappy about i915_cmd_parser.c so that would be a separate
patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As we may like to use a bisection search on the tables in future, we
need them to be ordered. For convenience we expect the compiled tables
to be ordered and check this on initialisation. However, the validator
used the wrong iterators, failed to spot the misordered MI tables, and
instead walked off into the unknown (as spotted by kasan).
Signed-off-by: Hanno Boeck <hanno@hboeck.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Again hand-assemble patch ...]
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
In the future, we may want to speed up command/register searching using
a bisection and so we require them to be in ascending order respectively
by command value or register address. However, this was not true for one
pair in the MI table; make it so.
Signed-off-by: Hanno Boeck <hanno@hboeck.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Hand-assemble patch from raw patch from Hanno and commit message from Chris.]
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after the PIPE_CONTROL
instruction, but there is a slight complication as this is applied in a WA batch
where the values are only initialized once.
Dave identified an issue with the current implementation where the register value
is read once at the beginning and then reused; this patch corrects that by saving
the register value to memory, updating the register with the bit of interest, and
restoring the original value afterwards.
This implementation uses MI_LOAD_REGISTER_MEM, which is currently only used
by the command parser and was using a default length of 0. This is now updated
with the correct length and moved to the appropriate place.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Only bit 27 of SCRATCH1 and bit 6 of ROW_CHICKEN3 are allowed to be
set because of security-sensitive bits we don't want userspace to mess
with. On HSW hardware the whitelisted bits control whether atomic
read-modify-write operations are performed on L3 or on GTI, and when
set to L3 (which can be 10x-30x better performing than on GTI,
depending on the application) require great care to avoid a system
hang, so we currently program them to be handled on GTI by default.
Beignet can immediately start taking advantage of this change to
enable L3 atomics. Mesa should eventually switch to L3 atomics too,
but a number of non-trivial changes are still required so it will
continue using GTI atomics for now.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In some cases it might be unnecessary or dangerous to give userspace
the right to write arbitrary values to some register, even though it
might be desirable to give it control of some of its bits. This patch
extends the register whitelist entries to contain a mask/value pair in
addition to the register offset. For registers with non-zero mask,
any LRM writes and LRI writes where the bits of the immediate given by
the mask don't match the specified value will be rejected.
This will be used in my next patch to grant userspace partial write
access to some sensitive registers.
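A sketch of the extended entry and the corresponding check (the struct name
follows the driver; the exact field layout and the helper are illustrative):

struct drm_i915_reg_descriptor {
	u32 addr;
	u32 mask;	/* 0 means the whole register is writable */
	u32 value;	/* required value of the masked bits */
};

static bool check_masked_write(const struct drm_i915_reg_descriptor *reg,
			       u32 immediate)
{
	if (reg->mask == 0)
		return true;				/* unrestricted */

	/* LRI: the masked bits of the immediate must match exactly.
	 * (LRM writes to such a register are rejected outright, since the
	 * value to be written cannot be inspected at parse time.) */
	return (immediate & reg->mask) == reg->value;
}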
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Until now the software command checker assumed that commands could
read or write at most a single register per packet. This is not
necessarily the case, MI_LOAD_REGISTER_IMM expects a variable-length
list of offset/value pairs and writes them in sequence. The previous
code would only check whether the first entry was valid, effectively
allowing userspace to write unrestricted registers of the MMIO space
by sending a multi-register write with a legal first register, with
potential security implications on Gen6 and 7 hardware.
Fix it by extending the drm_i915_cmd_descriptor table to represent
multi-register access and making validate_cmd() iterate for all
register offsets present in the command packet.
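Schematically, the iteration looks like this (a sketch reusing a
find_reg()-style lookup; the step of 2 reflects the (offset, value) pair
layout of MI_LOAD_REGISTER_IMM):

static bool check_lri(const u32 *cmd, u32 length,
		      const struct drm_i915_reg_descriptor *table, int count)
{
	u32 i;

	/* cmd[0] is the header; (register offset, immediate) pairs follow. */
	for (i = 1; i + 1 < length; i += 2) {
		if (!find_reg(table, count, cmd[i]))
			return false;	/* every offset must be whitelisted */
	}

	return true;
}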
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Move the madvise logic out of the execbuffer main path into the
relatively rare allocation path, making the execbuffer manipulation less
fragile.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
vmap_batch() calculates the number of pages needed for the mapping
we are going to create, and it uses this page count as an
argument for the for_each_sg_page() macro. The macro takes the number
of sg list entities as an argument, not the page count. So we ended
up iterating through all the pages of the mapped object, corrupting
memory past the smaller pages[] array.
Fix this by bailing out when we have enough pages.
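A sketch of the bail-out (for_each_sg_page() and sg_page_iter_page() are the
standard scatterlist helpers; the surrounding function is a simplified,
illustrative rendering of vmap_batch()'s page gathering):

static struct page **gather_pages(struct drm_i915_gem_object *obj,
				  unsigned int first_page, unsigned int npages)
{
	struct page **pages;
	struct sg_page_iter sg_iter;
	unsigned int i = 0;

	pages = kmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return NULL;

	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, first_page) {
		pages[i++] = sg_page_iter_page(&sg_iter);
		if (i == npages)	/* stop once we have what we asked for */
			break;
	}

	return pages;
}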
This regression has been introduced in
commit 17cabf571e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jan 14 11:20:57 2015 +0000
drm/i915: Trim the command parser allocations
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently, the command parser tries to create a secondary batch exactly
as large as the original, and vmap both. This is open to abuse by
userspace using extremely large batch objects, but only executing very
short batches. For example, this would be if userspace were to implement
a command submission ringbuffer. However, we only need to allocate pages
for just the contents of the command sequence in the batch - all
relocations copied to the secondary batch will reference the original
batch and so there can be no access to the secondary batch outside of
the explicit execution region.
Testcase: igt/gem_exec_big #ivb,byt,hsw
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88308
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will allow us to read the number of dispatched compute threads
for GL_ARB_pipeline_statistics_query.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Move it to a separate function since the main do_execbuffer function
already has so much going on.
v2:
- Move pin/unpin calls inside i915_parse_cmds() (Chris W, v4 7/7
feedback)
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Previously we couldn't trust the user-supplied batch length because
it came directly from userspace (i.e. untrusted code). It would have
affected what commands software parsed without regard to what hardware
would actually execute, leaving a potential hole.
With the parser now copying the user supplied batch buffer and writing
MI_NOP commands to any space after the copied region, we can safely use
the batch length input. This should be a performance win as the actual
batch length is frequently much smaller than the allocated object size.
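A sketch of why the declared length can now be trusted (MI_NOOP is the
all-zeroes no-op; the function and its parameters are illustrative):

static void copy_and_pad(void *shadow, const void *user_batch,
			 u32 batch_start_offset, u32 batch_len, u32 shadow_size)
{
	/* Copy exactly the declared command sequence... */
	memcpy(shadow, (const u8 *)user_batch + batch_start_offset, batch_len);
	/* ...and neutralise the rest of the shadow with MI_NOOP (zero),
	 * so nothing beyond batch_len can ever be executed. */
	memset((u8 *)shadow + batch_len, 0, shadow_size - batch_len);
}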
v2: Fix handling of non-zero batch_start_offset
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch sets up all of the tracking and copying necessary to
use batch pools with the command parser and dispatches the copied
(shadow) batch to the hardware.
After this patch, the parser is in 'enabling' mode.
Note that performance takes a hit from the copy in some cases
and will likely need some work. At a rough pass, the memcpy
appears to be the bottleneck. Without having done a deeper
analysis, two ideas that come to mind are:
1) Copy sections of the batch at a time, as they are reached
by parsing. Might improve cache locality.
2) Copy only up to the userspace-supplied batch length and
memset the rest of the buffer. Reduces the number of reads.
v2:
- Remove setting the capacity of the pool
- One global pool instead of per-ring pools
- Replace batch_obj with shadow_batch_obj and hook into eb->vmas
- Memset any space in the shadow batch beyond what gets copied
- Rebased on execlist prep refactoring
v3:
- Rebase on chained batch handling
- Squash in setting the secure dispatch flag
- Add a note about the interaction w/secure dispatch pinning
- Check for request->batch_obj == NULL in i915_gem_free_request
v4:
- Fix read domains for shadow_batch_obj
- Remove the set_to_gtt_domain call from i915_parse_cmds
- ggtt_pin/unpin in the parser block to simplify error handling
- Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag
- Remove i915_gem_batch_pool_put calls
v5:
- Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after
the parser (danvet, from v4 0/7 feedback)
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Was missing.
Issue: VIZ-4701
Signed-off-by: Michael H. Nguyen <michael.h.nguyen@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that sanity prevails and we have the clean split between software
init and starting the engines we can drop all the "have we allocated
this struct already?" nonsense.
Execlist code could benefit quite a bit more still, but that's for
another patch.
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
The predicate source registers are needed to implement conditional
rendering without stalling. The two source registers are used to load
the previous values of the PS_DEPTH_COUNT register saved from
PIPE_CONTROL commands. These can then be compared and used to set the
predicate enable bit via the MI_PREDICATE command.
The command parser version number is increased to 2 to make it easier
to detect the new functionality in user space.
Signed-off-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Backmerge drm-next so that I can keep merging patches. Specifically I
want:
- atomic stuff, yay!
- eld parsing patch from Jani.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Just various stuff all over from a bunch of people. Shortlog gives a better
overview, it's really all misc drm patches.
* tag 'topic/core-stuff-2014-11-05' of git://anongit.freedesktop.org/drm-intel:
drm/edid: add #defines and helpers for ELD
drm/dp: Add counters in the drm_dp_aux struct for I2C NACKs and DEFERs
drm: Remove compiler BUG_ON() test
drm: Fix DRM_FORCE_ON_DIGITAL use
drm/gma500: Don't destroy DRM properties in the driver
drm/i915: Don't destroy DRM properties in the driver
drm: Add a note to drm_property_create() about property lifetime
gpu: drm: Fix warning caused by a parameter description in drm_crtc.c
drm/dp-helper: Move the legacy helpers to gma500
drm/crtc: Remove duplicated ioctl code
drm/crtc: Fix two typos
gpu:drm: Fix typo in Documentation/DocBook/drm.xml
gpu: drm: drm_dp_mst_topology.c: Fix improper use of strncat
drm: drm_err: Remove unnecessary __func__ argument
drm: Implement O_NONBLOCK support on /dev/dri/cardN
libva uses chained batch buffers in a way that the command parser
can't generally handle. Fortunately, libva doesn't need to write
registers from batch buffers in the way that mesa does, so this
patch causes the driver to fall back to non-secure dispatch if
the parser detects a chained batch buffer.
Note: The 2nd hunk to munge the error code of the parser looks a bit
superfluous. At least until we have the batch copy code ready and can
run the cmd parser in granting mode. But it isn't since we still need
to let existing libva buffers pass (though not with elevated privs
ofc!).
Testcase: igt/gem_exec_parse/chained-batch
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
[danvet: Add note - this confused me in review and Brad clarified
things (after a few mails ...).]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ok, new attempt, this time around with full ppgtt disabled again.
drm-intel-next-2014-10-03:
- first batch of skl stage 1 enabling
- fixes from Rodrigo to the PSR, fbc and sink crc code
- kerneldoc for the frontbuffer tracking code, runtime pm code and the basic
interrupt enable/disable functions
- smaller stuff all over
drm-intel-next-2014-09-19:
- bunch more i830M fixes from Ville
- full ppgtt now again enabled by default
- more ppgtt fixes from Michel Thierry and Chris Wilson
- plane config work from Gustavo Padovan
- spinlock clarifications
- piles of smaller improvements all over, as usual
* tag 'drm-intel-next-2014-10-03-no-ppgtt' of git://anongit.freedesktop.org/drm-intel: (114 commits)
Revert "drm/i915: Enable full PPGTT on gen7"
drm/i915: Update DRIVER_DATE to 20141003
drm/i915: Remove the duplicated logic between the two shrink phases
drm/i915: kerneldoc for interrupt enable/disable functions
drm/i915: Use dev_priv instead of dev in irq setup functions
drm/i915: s/pm._irqs_disabled/pm.irqs_enabled/
drm/i915: Clear TX FIFO reset master override bits on chv
drm/i915: Make sure hardware uses the correct swing margin/deemph bits on chv
drm/i915: make sink_crc return -EIO on aux read/write failure
drm/i915: Constify send buffer for intel_dp_aux_ch
drm/i915: De-magic the PSR AUX message
drm/i915: Reinstate error level message for non-simulated gpu hangs
drm/i915: Kerneldoc for intel_runtime_pm.c
drm/i915: Call runtime_pm_disable directly
drm/i915: Move intel_display_set_init_power to intel_runtime_pm.c
drm/i915: Bikeshed rpm functions name a bit.
drm/i915: Extract intel_runtime_pm.c
drm/i915: Remove intel_modeset_suspend_hw
drm/i915: spelling fixes for frontbuffer tracking kerneldoc
drm/i915: Tighting frontbuffer tracking around flips
...
This patch fixes spelling typos found in drm.xml.
Because the file is generated from comments in the
source code, I have to fix the typos within the source files.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>