/*
 * Copyright © 2010 Daniel Vetter
 * Copyright © 2011-2014 Intel Corporation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice (including the next
 * paragraph) shall be included in all copies or substantial portions of the
 * Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 * IN THE SOFTWARE.
 *
 */

#include <linux/seq_file.h>
#include <drm/drmP.h>
#include <drm/i915_drm.h>
#include "i915_drv.h"
#include "i915_trace.h"
#include "intel_drv.h"

static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv);
static void chv_setup_private_ppat(struct drm_i915_private *dev_priv);

drm/i915: Disable full ppgtt by default

There are too many outstanding issues:
- Fence handling in the current code is broken. There's a patch series
  from me, but it's blocked on an extended review (which includes
  writing the testcases).
- IOMMU mapping handling is broken, we need to properly refcount it -
  currently it gets destroyed when the first vma is unbound, so way
  too early.
- There's a pending reset issue on snb. Since Mika's reset work and
  full ppgtt have been pulled in in separate branches and ended up
  intermittently breaking each other it's unclear who's the exact
  culprit here.
- We still have persistent evidence of crazy recursion bugs through
  vma_unbind and ppgtt_release, e.g.
  https://bugs.freedesktop.org/show_bug.cgi?id=73383
  This issue (and a few others meanwhile resolved) has blocked our
  performance measuring/tuning group for 3 months.
- Secure batch dispatching is broken. This has been blocking Brad
  Volkin's command checker work for 3 months.

All these issues are confirmed to only happen when full ppgtt is
enabled, falling back to aliasing ppgtt resolves them. But even
aliasing ppgtt itself still has a regression:
- We currently unconditionally bind objects into the aliasing ppgtt,
  which means all privileged objects like ringbuffers are visible to
  unprivileged access again. On top of that this also breaks the
  command checker for aliasing ppgtt, since it can't hide the
  validated batch any more.

Furthermore topic/full-ppgtt has never been reviewed:
- Lifetime rules around vma unbinding/release are unclear, resulting
  in this awesome hack called ppgtt_release. Which seems to take the
  blame for most of the recursion fallout.
- Context/ring init works differently on gpu reset than anywhere else.
  Such differences have in the past always led to really hard to
  track down bugs.
- Aliasing ppgtt is treated in a bunch of places as a real address
  space, but it isn't - the real address space is always the global
  gtt in that case. This results in a bit of a mess between contexts
  and ppgtt object, further complicating the context/ppgtt/vma
  lifetime rules.
- We don't have any docs describing the overall concepts introduced
  with full ppgtt. A short, concise overview describing vmas and some
  of the strange bits around them (like the unbound vmas used by
  execbuf, or the new binding rules) really is needed.

Note that a lot of the post topic/full-ppgtt merge fallout has already
been addressed, this entire list here of 10 issues really only contains
the still outstanding issues.

Finally the 3.15 merge window is approaching and I think we need to
use the remaining time to ensure that our fallback option of using
aliasing ppgtt is in solid shape. Hence I think it's time to throw the
switch. While at it demote the helper from static inline status
because really.

Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

bool intel_enable_ppgtt(struct drm_device *dev, bool full)
{
	if (i915.enable_ppgtt == 0)
		return false;

	if (i915.enable_ppgtt == 1 && full)
		return false;

	return true;
}

static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
{
	if (enable_ppgtt == 0 || !HAS_ALIASING_PPGTT(dev))
		return 0;

	if (enable_ppgtt == 1)
		return 1;

	if (enable_ppgtt == 2 && HAS_PPGTT(dev))
		return 2;

#ifdef CONFIG_INTEL_IOMMU
	/* Disable ppgtt on SNB if VT-d is on. */
	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
		DRM_INFO("Disabling PPGTT because VT-d is on\n");
		return 0;
	}
#endif

	/* Early VLV doesn't have this */
	if (IS_VALLEYVIEW(dev) && !IS_CHERRYVIEW(dev) &&
	    dev->pdev->revision < 0xb) {
		DRM_DEBUG_DRIVER("disabling PPGTT on pre-B3 step VLV\n");
		return 0;
	}

	return HAS_ALIASING_PPGTT(dev) ? 1 : 0;
}

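The fallback order in sanitize_enable_ppgtt() can be modelled outside the kernel. What follows is a minimal userspace sketch, not the driver's code: struct ppgtt_caps and its fields are hypothetical stand-ins for HAS_ALIASING_PPGTT(), HAS_PPGTT(), the SNB plus VT-d check and the early VLV revision check, and the CONFIG_INTEL_IOMMU guard is ignored.

```c
#include <stdbool.h>

/* Hypothetical capability flags standing in for the driver's device checks. */
struct ppgtt_caps {
	bool has_aliasing_ppgtt; /* stands in for HAS_ALIASING_PPGTT(dev) */
	bool has_full_ppgtt;     /* stands in for HAS_PPGTT(dev) */
	bool snb_with_vtd;       /* gen == 6 && intel_iommu_gfx_mapped */
	bool early_vlv;          /* VLV (not CHV) with revision < 0xb */
};

/* Returns 0 (disabled), 1 (aliasing PPGTT) or 2 (full PPGTT). */
static int sanitize_enable_ppgtt_sketch(const struct ppgtt_caps *caps,
					int enable_ppgtt)
{
	/* An explicit "off", or hardware without PPGTT, always wins. */
	if (enable_ppgtt == 0 || !caps->has_aliasing_ppgtt)
		return 0;

	/* An explicit request for aliasing PPGTT is honoured as is. */
	if (enable_ppgtt == 1)
		return 1;

	/* Full PPGTT is only granted when the hardware supports it. */
	if (enable_ppgtt == 2 && caps->has_full_ppgtt)
		return 2;

	/* Every other request falls through to the platform default. */
	if (caps->snb_with_vtd)
		return 0;
	if (caps->early_vlv)
		return 0;

	return caps->has_aliasing_ppgtt ? 1 : 0;
}
```

Note that a request for full PPGTT (2) on hardware that only has aliasing PPGTT is not rejected; it falls through to the default and may end up as 1 or 0 depending on the platform quirks.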
drm/i915: Create bind/unbind abstraction for VMAs

To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.

What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.

drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage

drm/i915: Add bind/unbind object functions to VMA

As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.

Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.

v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)

v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.

v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)

v5: Update the comment to not suck (Chris)

v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: Use the new vm [un]bind functions

Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.

Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.

v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.

v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)

At this point the original mailing list thread diverges, i.e.

v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound BOs
Properly handle secure dispatch
Rebased on vma bind/unbind conversion

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

drm/i915: reduce vm->insert_entries() usage

FKA: drm/i915: eliminate vm->insert_entries()

With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range, however it's
not totally easy at this point, since it's still used in a couple of
places that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.

v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

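The idea described in the commit message above, letting each VMA carry its own bind/unbind hooks so callers need not distinguish PPGTT from GGTT, can be illustrated with a small userspace sketch. Everything here is hypothetical and heavily simplified (struct vma_sketch, enum sk_backend, vma_sketch_init() and the two backends are illustration-only names); the real driver writes page table entries where this sketch only records state.

```c
enum sk_cache_level { SK_CACHE_NONE, SK_CACHE_LLC };
enum sk_backend { SK_BACKEND_NONE, SK_BACKEND_GGTT, SK_BACKEND_PPGTT };

/* Hypothetical, simplified VMA: each one carries its own bind/unbind hooks. */
struct vma_sketch {
	int bound;                      /* 1 while "entries" are present */
	enum sk_cache_level level;      /* cache level of the current binding */
	enum sk_backend written_by;     /* which address space handled the bind */
	void (*bind_vma)(struct vma_sketch *vma, enum sk_cache_level level,
			 unsigned int flags);
	void (*unbind_vma)(struct vma_sketch *vma);
};

static void sk_ggtt_bind(struct vma_sketch *vma, enum sk_cache_level level,
			 unsigned int flags)
{
	(void)flags; /* unused in this sketch */
	vma->bound = 1;
	vma->level = level;
	vma->written_by = SK_BACKEND_GGTT;
}

static void sk_ppgtt_bind(struct vma_sketch *vma, enum sk_cache_level level,
			  unsigned int flags)
{
	(void)flags; /* unused in this sketch */
	vma->bound = 1;
	vma->level = level;
	vma->written_by = SK_BACKEND_PPGTT;
}

/* Both backends share one unbind here; the driver has separate ones. */
static void sk_unbind(struct vma_sketch *vma)
{
	vma->bound = 0;
	vma->written_by = SK_BACKEND_NONE;
}

/* The hooks are chosen once, when the VMA is created for an address space. */
static void vma_sketch_init(struct vma_sketch *vma, int use_ppgtt)
{
	vma->bound = 0;
	vma->level = SK_CACHE_NONE;
	vma->written_by = SK_BACKEND_NONE;
	vma->bind_vma = use_ppgtt ? sk_ppgtt_bind : sk_ggtt_bind;
	vma->unbind_vma = sk_unbind;
}
```

A caller then only writes vma->bind_vma(vma, level, flags) and vma->unbind_vma(vma); the choice of address space was made at initialisation time.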
static void ppgtt_bind_vma(struct i915_vma *vma,
			   enum i915_cache_level cache_level,
			   u32 flags);
static void ppgtt_unbind_vma(struct i915_vma *vma);
static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt);

static inline gen8_gtt_pte_t gen8_pte_encode(dma_addr_t addr,
					     enum i915_cache_level level,
					     bool valid)
{
	gen8_gtt_pte_t pte = valid ? _PAGE_PRESENT | _PAGE_RW : 0;
	pte |= addr;

	switch (level) {
	case I915_CACHE_NONE:
		pte |= PPAT_UNCACHED_INDEX;
		break;
	case I915_CACHE_WT:
		pte |= PPAT_DISPLAY_ELLC_INDEX;
		break;
	default:
		pte |= PPAT_CACHED_INDEX;
		break;
	}

	return pte;
}

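To make the bit layout produced by gen8_pte_encode() concrete, here is a userspace sketch of the same encoding. The SK_* names are hypothetical. The page bits follow the usual x86 page table layout, while the three PPAT selector values are assumptions made for illustration; the authoritative definitions live in the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page table attribute bits. */
#define SK_PAGE_PRESENT (1ULL << 0)
#define SK_PAGE_RW      (1ULL << 1)
#define SK_PAGE_PWT     (1ULL << 3)
#define SK_PAGE_PCD     (1ULL << 4)
#define SK_PAGE_PAT     (1ULL << 7)

/* PPAT selectors: assumed encodings, for illustration only. */
#define SK_PPAT_UNCACHED     (SK_PAGE_PWT | SK_PAGE_PCD)
#define SK_PPAT_CACHED       SK_PAGE_PAT
#define SK_PPAT_DISPLAY_ELLC SK_PAGE_PCD

enum sk_cache_level { SK_CACHE_NONE, SK_CACHE_LLC, SK_CACHE_WT };

static uint64_t sk_gen8_pte_encode(uint64_t addr, enum sk_cache_level level,
				   bool valid)
{
	/* An invalid entry keeps the address but drops present/writable. */
	uint64_t pte = valid ? (SK_PAGE_PRESENT | SK_PAGE_RW) : 0;

	/* addr is expected to be page aligned, so the low bits are free. */
	pte |= addr;

	switch (level) {
	case SK_CACHE_NONE:
		pte |= SK_PPAT_UNCACHED;
		break;
	case SK_CACHE_WT:
		pte |= SK_PPAT_DISPLAY_ELLC;
		break;
	default:
		pte |= SK_PPAT_CACHED;
		break;
	}

	return pte;
}
```

Under these assumed values, a valid uncached entry for address 0x1000 comes out as 0x101b: the address, present and writable (0x3), plus PWT and PCD (0x18).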
static inline gen8_ppgtt_pde_t gen8_pde_encode(struct drm_device *dev,
					     dma_addr_t addr,
					     enum i915_cache_level level)
{
	gen8_ppgtt_pde_t pde = _PAGE_PRESENT | _PAGE_RW;
	pde |= addr;
	if (level != I915_CACHE_NONE)
		pde |= PPAT_CACHED_PDE_INDEX;
	else
		pde |= PPAT_UNCACHED_INDEX;
	return pde;
}

static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr,
				     enum i915_cache_level level,
				     bool valid, u32 unused)
{
	gen6_gtt_pte_t pte = valid ? GEN6_PTE_VALID : 0;
	pte |= GEN6_PTE_ADDR_ENCODE(addr);

	switch (level) {
	case I915_CACHE_L3_LLC:
	case I915_CACHE_LLC:
		pte |= GEN6_PTE_CACHE_LLC;
		break;
	case I915_CACHE_NONE:
		pte |= GEN6_PTE_UNCACHED;
		break;
	default:
		WARN_ON(1);
	}

	return pte;
}

static gen6_gtt_pte_t ivb_pte_encode(dma_addr_t addr,
				     enum i915_cache_level level,
				     bool valid, u32 unused)
{
	gen6_gtt_pte_t pte = valid ? GEN6_PTE_VALID : 0;
	pte |= GEN6_PTE_ADDR_ENCODE(addr);

	switch (level) {
	case I915_CACHE_L3_LLC:
		pte |= GEN7_PTE_CACHE_L3_LLC;
		break;
	case I915_CACHE_LLC:
		pte |= GEN6_PTE_CACHE_LLC;
		break;
	case I915_CACHE_NONE:
		pte |= GEN6_PTE_UNCACHED;
		break;
	default:
		WARN_ON(1);
	}

	return pte;
}

static gen6_gtt_pte_t byt_pte_encode(dma_addr_t addr,
				     enum i915_cache_level level,
				     bool valid, u32 flags)
{
	gen6_gtt_pte_t pte = valid ? GEN6_PTE_VALID : 0;
	pte |= GEN6_PTE_ADDR_ENCODE(addr);

	/* Mark the page as writeable.  Other platforms don't have a
	 * setting for read-only/writable, so this matches that behavior.
	 */
	if (!(flags & PTE_READ_ONLY))
		pte |= BYT_PTE_WRITEABLE;

	if (level != I915_CACHE_NONE)
		pte |= BYT_PTE_SNOOPED_BY_CPU_CACHES;

	return pte;
}

static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
				     enum i915_cache_level level,
				     bool valid, u32 unused)
{
	gen6_gtt_pte_t pte = valid ? GEN6_PTE_VALID : 0;
	pte |= HSW_PTE_ADDR_ENCODE(addr);

	if (level != I915_CACHE_NONE)
		pte |= HSW_WB_LLC_AGE3;

	return pte;
}

static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
				      enum i915_cache_level level,
				      bool valid, u32 unused)
{
	gen6_gtt_pte_t pte = valid ? GEN6_PTE_VALID : 0;
	pte |= HSW_PTE_ADDR_ENCODE(addr);

	switch (level) {
	case I915_CACHE_NONE:
		break;
	case I915_CACHE_WT:
		pte |= HSW_WT_ELLC_LLC_AGE3;
		break;
	default:
		pte |= HSW_WB_ELLC_LLC_AGE3;
		break;
	}

	return pte;
}

2013-11-05 13:29:36 +07:00
|
|
|
/* Broadwell Page Directory Pointer Descriptors */
|
2014-05-22 20:13:33 +07:00
|
|
|
static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry,
|
2013-12-07 05:10:47 +07:00
|
|
|
uint64_t val, bool synchronous)
|
2013-11-05 13:29:36 +07:00
|
|
|
{
|
2013-12-07 05:10:47 +07:00
|
|
|
struct drm_i915_private *dev_priv = ring->dev->dev_private;
|
2013-11-05 13:29:36 +07:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
BUG_ON(entry >= 4);
|
|
|
|
|
2013-12-07 05:10:47 +07:00
|
|
|
if (synchronous) {
|
|
|
|
I915_WRITE(GEN8_RING_PDP_UDW(ring, entry), val >> 32);
|
|
|
|
I915_WRITE(GEN8_RING_PDP_LDW(ring, entry), (u32)val);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-11-05 13:29:36 +07:00
|
|
|
ret = intel_ring_begin(ring, 6);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
|
|
|
|
intel_ring_emit(ring, GEN8_RING_PDP_UDW(ring, entry));
|
|
|
|
intel_ring_emit(ring, (u32)(val >> 32));
|
|
|
|
intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
|
|
|
|
intel_ring_emit(ring, GEN8_RING_PDP_LDW(ring, entry));
|
|
|
|
intel_ring_emit(ring, (u32)(val));
|
|
|
|
intel_ring_advance(ring);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
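The non-synchronous path above reserves six dwords because each register load takes three: a command header, a register offset, and a 32-bit value, once for the upper half and once for the lower half of the 64-bit descriptor. A minimal sketch of that layout, assuming the header and register offsets are passed in by the caller as opaque placeholders (`sketch_build_pdp_writes` is a hypothetical name, not a driver function):

```c
#include <stdint.h>

/*
 * Fill out[0..5] with two "load register immediate" triplets.
 * lri_header, reg_udw and reg_ldw are caller-supplied placeholders;
 * this sketch does not know the real command encoding or offsets.
 */
static void sketch_build_pdp_writes(uint32_t out[6], uint32_t lri_header,
				    uint32_t reg_udw, uint32_t reg_ldw,
				    uint64_t val)
{
	/* Upper dword first, as in the function above. */
	out[0] = lri_header;
	out[1] = reg_udw;
	out[2] = (uint32_t)(val >> 32);

	/* Then the lower dword; the cast truncates to bits 31:0. */
	out[3] = lri_header;
	out[4] = reg_ldw;
	out[5] = (uint32_t)val;
}
```

For example, a descriptor value of `0x0000000123456000` yields `0x1` for the upper dword and `0x23456000` for the lower dword.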
|
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
|
2014-05-22 20:13:33 +07:00
|
|
|
struct intel_engine_cs *ring,
|
2013-12-07 05:11:10 +07:00
|
|
|
bool synchronous)
|
2013-11-05 13:29:36 +07:00
|
|
|
{
|
2013-12-07 05:11:10 +07:00
|
|
|
int i, ret;
|
2013-11-05 13:29:36 +07:00
|
|
|
|
|
|
|
/* bit of a hack to find the actual last used pd */
|
|
|
|
int used_pd = ppgtt->num_pd_entries / GEN8_PDES_PER_PAGE;
|
|
|
|
|
|
|
|
for (i = used_pd - 1; i >= 0; i--) {
|
|
|
|
dma_addr_t addr = ppgtt->pd_dma_addr[i];
|
2013-12-07 05:11:10 +07:00
|
|
|
ret = gen8_write_pdp(ring, i, addr, synchronous);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
2013-11-05 13:29:36 +07:00
|
|
|
}
|
2013-11-26 00:54:32 +07:00
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
return 0;
|
2013-11-05 13:29:36 +07:00
|
|
|
}
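The loop above derives the number of page directories in use from the PDE count, then writes the descriptors from the highest index down to zero and stops at the first error. A minimal sketch of that control flow, with the register write replaced by a caller-supplied callback; `sketch_switch`, `sketch_emit_fn`, `sketch_log` and `sketch_record` are hypothetical names introduced only for this illustration:

```c
#include <stdint.h>

/* Hypothetical callback standing in for the per-entry descriptor write. */
typedef int (*sketch_emit_fn)(unsigned entry, uint64_t addr, void *ctx);

static int sketch_switch(const uint64_t *pd_addr, unsigned num_pd_entries,
			 unsigned pdes_per_page, sketch_emit_fn emit,
			 void *ctx)
{
	/* One page directory per full page of PDEs. */
	unsigned used_pd = num_pd_entries / pdes_per_page;
	int i, ret;

	/* Walk from the last used directory down to entry 0. */
	for (i = (int)used_pd - 1; i >= 0; i--) {
		ret = emit((unsigned)i, pd_addr[i], ctx);
		if (ret)
			return ret; /* propagate the first failure */
	}

	return 0;
}

/* Small recorder used to observe the order of the writes. */
struct sketch_log {
	unsigned order[4];
	unsigned count;
};

static int sketch_record(unsigned entry, uint64_t addr, void *ctx)
{
	struct sketch_log *log = ctx;

	(void)addr; /* the address is not needed for ordering */
	log->order[log->count++] = entry;
	return 0;
}
```

With 2048 PDEs and 512 PDEs per page, the sketch issues four writes in the order 3, 2, 1, 0.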
|
|
|
|
|
2013-11-03 11:07:23 +07:00
|
|
|
static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
|
2014-02-21 02:50:33 +07:00
|
|
|
uint64_t start,
|
|
|
|
uint64_t length,
|
2013-11-03 11:07:23 +07:00
|
|
|
bool use_scratch)
|
|
|
|
{
|
|
|
|
struct i915_hw_ppgtt *ppgtt =
|
|
|
|
container_of(vm, struct i915_hw_ppgtt, base);
|
|
|
|
gen8_gtt_pte_t *pt_vaddr, scratch_pte;
|
drm/i915/bdw: Reorganize PT allocations
The previous allocation mechanism would get 2 contiguous allocations,
one for the page directories, and one for the page tables. As each page
table is 1 page, and there are 512 of these per page directory, this
goes to 2MB. An unfriendly request at best. Worse still, our HW now
supports 4 page directories, and a 2MB allocation is not allowed.
In order to fix this, this patch attempts to split up each page table
allocation into a single, discrete allocation. There is nothing really
fancy about the patch itself, it just has to manage an extra pointer
indirection, and have a fancier bit of logic to free up the pages.
To accommodate some of the added complexity, two new helpers are
introduced to allocate, and free the page table pages.
NOTE: I really wanted to split the way we do allocations, and the way in
which we identify the page table/page directory being used. I found
splitting this functionality up to be too unwieldy. I apologize in
advance to the reviewer. I'd recommend looking at the result, rather
than the diff.
v2/NOTE2: This patch predated commit:
6f1cc993518462ccf039e195fabd47e7aa5bfd13
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 31 15:50:31 2013 +0000
drm/i915: Avoid dereference past end of page arr
It fixed the same issue as that patch, but because of the limbo state of
PPGTT, Chris's patch was merged instead. The excess churn is a result of
my using my original patch, which has my preferred naming. Primarily
act_* is changed to which_*, but it's mostly the same otherwise. I've
kept the convention Chris used for the pte wrap (I had something
slightly different, and broken - but fixable)
v3: Rename which_p[..]e to drop which_ (Chris)
Remove BUG_ON in inner loop (Chris)
Redo the pde/pdpe wrap logic (Chris)
v4: s/1MB/2MB in commit message (Imre)
Plug leaking gen8_pt_pages in both the error path, as well as general
free case (Imre)
v5: Rename leftover "which_" variables (Imre)
Add the pde = 0 wrap that was missed from v3 (Imre)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Squash in fixup from Ben.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-21 02:51:21 +07:00
|
|
|
unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
|
|
|
|
unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
|
|
|
|
unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
|
2014-02-21 02:50:33 +07:00
|
|
|
unsigned num_entries = length >> PAGE_SHIFT;
|
2013-11-03 11:07:23 +07:00
|
|
|
unsigned last_pte, i;
|
|
|
|
|
|
|
|
scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr,
|
|
|
|
I915_CACHE_LLC, use_scratch);
|
|
|
|
|
|
|
|
while (num_entries) {
|
2014-02-21 02:51:21 +07:00
|
|
|
struct page *page_table = ppgtt->gen8_pt_pages[pdpe][pde];
|
2013-11-03 11:07:23 +07:00
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
last_pte = pte + num_entries;
|
2013-11-03 11:07:23 +07:00
|
|
|
if (last_pte > GEN8_PTES_PER_PAGE)
|
|
|
|
last_pte = GEN8_PTES_PER_PAGE;
|
|
|
|
|
|
|
|
pt_vaddr = kmap_atomic(page_table);
|
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
for (i = pte; i < last_pte; i++) {
|
2013-11-03 11:07:23 +07:00
|
|
|
pt_vaddr[i] = scratch_pte;
|
2014-02-21 02:51:21 +07:00
|
|
|
num_entries--;
|
|
|
|
}
|
2013-11-03 11:07:23 +07:00
|
|
|
|
2014-04-09 17:28:02 +07:00
|
|
|
if (!HAS_LLC(ppgtt->base.dev))
|
|
|
|
drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
|
2013-11-03 11:07:23 +07:00
|
|
|
kunmap_atomic(pt_vaddr);
|
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
pte = 0;
|
|
|
|
if (++pde == GEN8_PDES_PER_PAGE) {
|
|
|
|
pdpe++;
|
|
|
|
pde = 0;
|
|
|
|
}
|
2013-11-03 11:07:23 +07:00
|
|
|
}
|
|
|
|
}
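The function above splits the start address into three table indices and then walks page table by page table, resetting `pte` to zero and carrying into `pde` and `pdpe` on wrap. A minimal standalone sketch of that index arithmetic follows. The shift and mask values assume 4 KiB pages, 512 entries per level and 4 page directory pointers; the `SKETCH_*` constants and function names are hypothetical. Unlike the source, which decrements the count once per entry, the sketch subtracts a whole table's worth at a time.

```c
#include <stdint.h>

/* Assumed layout: 4 KiB pages, 9 index bits per level, 2 bits of PDP. */
#define SKETCH_PTE_SHIFT     12
#define SKETCH_PDE_SHIFT     21
#define SKETCH_PDPE_SHIFT    30
#define SKETCH_PTE_MASK      0x1ffu
#define SKETCH_PDE_MASK      0x1ffu
#define SKETCH_PDPE_MASK     0x3u
#define SKETCH_PTES_PER_PAGE 512u
#define SKETCH_PDES_PER_PAGE 512u

struct sketch_pt_index {
	unsigned pdpe, pde, pte;
};

/* Split an address into page-directory-pointer, directory and table index. */
static struct sketch_pt_index sketch_decode(uint64_t start)
{
	struct sketch_pt_index idx;

	idx.pdpe = (unsigned)((start >> SKETCH_PDPE_SHIFT) & SKETCH_PDPE_MASK);
	idx.pde = (unsigned)((start >> SKETCH_PDE_SHIFT) & SKETCH_PDE_MASK);
	idx.pte = (unsigned)((start >> SKETCH_PTE_SHIFT) & SKETCH_PTE_MASK);
	return idx;
}

/* Count how many page tables a [start, start + length) range touches. */
static unsigned sketch_count_tables(uint64_t start, uint64_t length)
{
	struct sketch_pt_index idx = sketch_decode(start);
	uint64_t num_entries = length >> SKETCH_PTE_SHIFT;
	unsigned tables = 0;

	while (num_entries) {
		uint64_t last_pte = idx.pte + num_entries;

		/* Clamp to the end of the current page table. */
		if (last_pte > SKETCH_PTES_PER_PAGE)
			last_pte = SKETCH_PTES_PER_PAGE;

		num_entries -= last_pte - idx.pte;
		tables++;

		/* Next table starts at entry 0; carry into pde, then pdpe. */
		idx.pte = 0;
		if (++idx.pde == SKETCH_PDES_PER_PAGE) {
			idx.pdpe++;
			idx.pde = 0;
		}
	}

	return tables;
}
```

Under these assumptions, address `0x40201000` decodes to `pdpe = 1`, `pde = 1`, `pte = 1`, and a two-page range starting at the last entry of a table (`0x1ff000`) touches two page tables.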
|
|
|
|
|
2013-11-03 11:07:24 +07:00
|
|
|
static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
|
|
|
|
struct sg_table *pages,
|
2014-02-21 02:50:33 +07:00
|
|
|
uint64_t start,
|
2014-06-17 12:29:42 +07:00
|
|
|
enum i915_cache_level cache_level, u32 unused)
|
2013-11-03 11:07:24 +07:00
|
|
|
{
|
|
|
|
struct i915_hw_ppgtt *ppgtt =
|
|
|
|
container_of(vm, struct i915_hw_ppgtt, base);
|
|
|
|
gen8_gtt_pte_t *pt_vaddr;
|
2014-02-21 02:51:21 +07:00
|
|
|
unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
|
|
|
|
unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
|
|
|
|
unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
|
2013-11-03 11:07:24 +07:00
|
|
|
struct sg_page_iter sg_iter;
|
|
|
|
|
2013-12-31 22:50:31 +07:00
|
|
|
pt_vaddr = NULL;
|
2014-02-21 02:51:21 +07:00
|
|
|
|
2013-11-03 11:07:24 +07:00
|
|
|
for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
|
2014-02-21 02:51:21 +07:00
|
|
|
if (WARN_ON(pdpe >= GEN8_LEGACY_PDPS))
|
|
|
|
break;
|
|
|
|
|
2013-12-31 22:50:31 +07:00
|
|
|
if (pt_vaddr == NULL)
|
2014-02-21 02:51:21 +07:00
|
|
|
pt_vaddr = kmap_atomic(ppgtt->gen8_pt_pages[pdpe][pde]);
|
2013-11-03 11:07:24 +07:00
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
pt_vaddr[pte] =
|
2013-12-31 22:50:31 +07:00
|
|
|
gen8_pte_encode(sg_page_iter_dma_address(&sg_iter),
|
|
|
|
cache_level, true);
|
2014-02-21 02:51:21 +07:00
|
|
|
if (++pte == GEN8_PTES_PER_PAGE) {
|
2014-04-09 17:28:02 +07:00
|
|
|
if (!HAS_LLC(ppgtt->base.dev))
|
|
|
|
drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
|
2013-11-03 11:07:24 +07:00
|
|
|
kunmap_atomic(pt_vaddr);
|
2013-12-31 22:50:31 +07:00
|
|
|
pt_vaddr = NULL;
|
2014-02-21 02:51:21 +07:00
|
|
|
if (++pde == GEN8_PDES_PER_PAGE) {
|
|
|
|
pdpe++;
|
|
|
|
pde = 0;
|
|
|
|
}
|
|
|
|
pte = 0;
|
2013-11-03 11:07:24 +07:00
|
|
|
}
|
|
|
|
}
|
2014-04-09 17:28:02 +07:00
|
|
|
if (pt_vaddr) {
|
|
|
|
if (!HAS_LLC(ppgtt->base.dev))
|
|
|
|
drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
|
2013-12-31 22:50:31 +07:00
|
|
|
kunmap_atomic(pt_vaddr);
|
2014-04-09 17:28:02 +07:00
|
|
|
}
|
2013-11-03 11:07:24 +07:00
|
|
|
}
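The pte/pde/pdpe wrap at the end of the loop above can be sketched on its own, outside the kernel. This is a minimal user-space illustration, not the driver's code: the `sketch_` names are hypothetical, and the 512-entry limits are assumptions based on 4096-byte pages holding 8-byte entries.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PTES_PER_PAGE 512u /* assumed: 4096-byte page / 8-byte PTE */
#define SKETCH_PDES_PER_PAGE 512u /* assumed: 4096-byte page / 8-byte PDE */

/* Hypothetical cursor holding the three indices the loop tracks. */
struct sketch_cursor {
	unsigned int pdpe; /* which page directory */
	unsigned int pde;  /* which page table within the directory */
	unsigned int pte;  /* which entry within the page table */
};

/*
 * Advance by one page: bump pte, and when a page table is exhausted move
 * to the next page table, wrapping into the next page directory if needed.
 */
static void sketch_advance(struct sketch_cursor *c)
{
	assert(c != NULL);

	if (++c->pte == SKETCH_PTES_PER_PAGE) {
		if (++c->pde == SKETCH_PDES_PER_PAGE) {
			c->pdpe++;
			c->pde = 0;
		}
		c->pte = 0;
	}
}
```

Resetting `pde` to 0 when `pdpe` is bumped is the step the v5 changelog says was missed in v3; without it the next lookup would index past the last page table of the directory.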
|
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
static void gen8_free_page_tables(struct page **pt_pages)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
if (pt_pages == NULL)
|
|
|
|
return;
|
|
|
|
|
|
|
|
for (i = 0; i < GEN8_PDES_PER_PAGE; i++)
|
|
|
|
if (pt_pages[i])
|
|
|
|
__free_pages(pt_pages[i], 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void gen8_ppgtt_free(const struct i915_hw_ppgtt *ppgtt)
|
2014-02-13 05:28:44 +07:00
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
for (i = 0; i < ppgtt->num_pd_pages; i++) {
|
|
|
|
gen8_free_page_tables(ppgtt->gen8_pt_pages[i]);
|
|
|
|
kfree(ppgtt->gen8_pt_pages[i]);
|
2014-02-13 05:28:44 +07:00
|
|
|
kfree(ppgtt->gen8_pt_dma_addr[i]);
|
2014-02-21 02:51:21 +07:00
|
|
|
}
|
2014-02-13 05:28:44 +07:00
|
|
|
|
|
|
|
__free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT));
|
|
|
|
}
|
|
|
|
|
|
|
|
static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
2014-02-20 13:05:42 +07:00
|
|
|
struct pci_dev *hwdev = ppgtt->base.dev->pdev;
|
2014-02-13 05:28:44 +07:00
|
|
|
int i, j;
|
|
|
|
|
|
|
|
for (i = 0; i < ppgtt->num_pd_pages; i++) {
|
|
|
|
/* TODO: In the future we'll support sparse mappings, so this
|
|
|
|
* will have to change. */
|
|
|
|
if (!ppgtt->pd_dma_addr[i])
|
|
|
|
continue;
|
|
|
|
|
2014-02-20 13:05:42 +07:00
|
|
|
pci_unmap_page(hwdev, ppgtt->pd_dma_addr[i], PAGE_SIZE,
|
|
|
|
PCI_DMA_BIDIRECTIONAL);
|
2014-02-13 05:28:44 +07:00
|
|
|
|
|
|
|
for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
|
|
|
|
dma_addr_t addr = ppgtt->gen8_pt_dma_addr[i][j];
|
|
|
|
if (addr)
|
2014-02-20 13:05:42 +07:00
|
|
|
pci_unmap_page(hwdev, addr, PAGE_SIZE,
|
|
|
|
PCI_DMA_BIDIRECTIONAL);
|
2014-02-13 05:28:44 +07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-11-05 11:47:32 +07:00
|
|
|
static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
|
|
|
|
{
|
|
|
|
struct i915_hw_ppgtt *ppgtt =
|
|
|
|
container_of(vm, struct i915_hw_ppgtt, base);
|
|
|
|
|
2013-12-07 05:11:26 +07:00
|
|
|
list_del(&vm->global_link);
|
2013-11-26 00:54:34 +07:00
|
|
|
drm_mm_takedown(&vm->mm);
|
|
|
|
|
2014-02-13 05:28:44 +07:00
|
|
|
gen8_ppgtt_unmap_pages(ppgtt);
|
|
|
|
gen8_ppgtt_free(ppgtt);
|
2013-11-05 11:47:32 +07:00
|
|
|
}
|
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
static struct page **__gen8_alloc_page_tables(void)
|
|
|
|
{
|
|
|
|
struct page **pt_pages;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
pt_pages = kcalloc(GEN8_PDES_PER_PAGE, sizeof(struct page *), GFP_KERNEL);
|
|
|
|
if (!pt_pages)
|
|
|
|
return ERR_PTR(-ENOMEM);
|
|
|
|
|
|
|
|
for (i = 0; i < GEN8_PDES_PER_PAGE; i++) {
|
|
|
|
pt_pages[i] = alloc_page(GFP_KERNEL);
|
|
|
|
if (!pt_pages[i])
|
|
|
|
goto bail;
|
|
|
|
}
|
|
|
|
|
|
|
|
return pt_pages;
|
|
|
|
|
|
|
|
bail:
|
|
|
|
gen8_free_page_tables(pt_pages);
|
|
|
|
kfree(pt_pages);
|
|
|
|
return ERR_PTR(-ENOMEM);
|
|
|
|
}
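The allocate and free helpers above replace one 2MB contiguous request with 512 single-page requests reached through an extra pointer level. A rough user-space analogue follows, with `calloc`/`malloc`/`free` standing in for `kcalloc`/`alloc_page`/`__free_pages`; every `sketch_` name is hypothetical and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u     /* assumed 4 KiB pages */
#define SKETCH_PDES_PER_PAGE 512u  /* page tables per page directory */

/* Size of the old single contiguous page-table request for one directory. */
static size_t sketch_contiguous_bytes(void)
{
	return (size_t)SKETCH_PDES_PER_PAGE * SKETCH_PAGE_SIZE;
}

/* Free every page table that was allocated; empty slots are skipped. */
static void sketch_free_page_tables(void **pt_pages)
{
	size_t i;

	if (pt_pages == NULL)
		return;

	for (i = 0; i < SKETCH_PDES_PER_PAGE; i++)
		free(pt_pages[i]); /* free(NULL) is a no-op */
}

/* Allocate the pointer array first, then one page per slot. */
static void **sketch_alloc_page_tables(void)
{
	void **pt_pages;
	size_t i;

	/* Still 2 MiB in total, but never requested as one contiguous chunk. */
	assert(sketch_contiguous_bytes() == 2u * 1024u * 1024u);

	/* calloc zeroes the array, so a partial failure can be freed safely. */
	pt_pages = calloc(SKETCH_PDES_PER_PAGE, sizeof(*pt_pages));
	if (!pt_pages)
		return NULL;

	for (i = 0; i < SKETCH_PDES_PER_PAGE; i++) {
		pt_pages[i] = malloc(SKETCH_PAGE_SIZE);
		if (!pt_pages[i])
			goto bail;
	}

	return pt_pages;

bail:
	sketch_free_page_tables(pt_pages);
	free(pt_pages);
	return NULL;
}
```

As in the kernel helper, the caller owns the pointer array separately from the pages, so tearing down takes a call to the free helper followed by a `free` of the array itself.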
|
|
|
|
|
2014-02-20 13:05:43 +07:00
|
|
|
static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt,
|
|
|
|
const int max_pdp)
|
|
|
|
{
|
2014-02-21 02:51:21 +07:00
|
|
|
struct page **pt_pages[GEN8_LEGACY_PDPS];
|
|
|
|
int i, ret;
|
2014-02-20 13:05:43 +07:00
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
for (i = 0; i < max_pdp; i++) {
|
|
|
|
pt_pages[i] = __gen8_alloc_page_tables();
|
|
|
|
if (IS_ERR(pt_pages[i])) {
|
|
|
|
ret = PTR_ERR(pt_pages[i]);
|
|
|
|
goto unwind_out;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* NB: Avoid touching gen8_pt_pages until last, to keep the allocation
|
|
|
|
* "atomic" for cleanup purposes.
|
|
|
|
*/
|
|
|
|
for (i = 0; i < max_pdp; i++)
|
|
|
|
ppgtt->gen8_pt_pages[i] = pt_pages[i];
|
2014-02-20 13:05:43 +07:00
|
|
|
|
|
|
|
return 0;
|
2014-02-21 02:51:21 +07:00
|
|
|
|
|
|
|
unwind_out:
|
|
|
|
while (i--) {
|
|
|
|
gen8_free_page_tables(pt_pages[i]);
|
|
|
|
kfree(pt_pages[i]);
|
|
|
|
}
|
|
|
|
|
|
|
|
return ret;
|
2014-02-20 13:05:43 +07:00
|
|
|
}
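The stage-then-publish pattern in the function above (fill a local array, copy it into the ppgtt only after every allocation succeeded, otherwise unwind with `while (i--)`) can be tried in isolation. The sketch below is a user-space approximation under stated assumptions: `sketch_alloc_one`, its `fail_at` fault-injection parameter, and the `live` counter are hypothetical and exist only to make the behaviour observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_PDPS 4 /* mirrors the four legacy PDPs */

/* Hypothetical allocator that fails on purpose at index fail_at. */
static void *sketch_alloc_one(int index, int fail_at, int *live)
{
	void *p;

	if (index == fail_at)
		return NULL;

	p = malloc(16);
	if (p)
		(*live)++;
	return p;
}

/*
 * Allocate count objects into a local staging array and publish them to
 * dest only once all of them exist, so dest is never left half-filled.
 */
static int sketch_allocate_all(void *dest[SKETCH_MAX_PDPS], int count,
			       int fail_at, int *live)
{
	void *staged[SKETCH_MAX_PDPS];
	int i;

	assert(count >= 0 && count <= SKETCH_MAX_PDPS);

	for (i = 0; i < count; i++) {
		staged[i] = sketch_alloc_one(i, fail_at, live);
		if (!staged[i])
			goto unwind_out;
	}

	for (i = 0; i < count; i++)
		dest[i] = staged[i];

	return 0;

unwind_out:
	/* i is the index that failed; free only the entries before it. */
	while (i--) {
		free(staged[i]);
		(*live)--;
	}

	return -ENOMEM;
}
```

Because the failing slot itself is never freed, the post-decrement in `while (i--)` starts at the last successful allocation and stops after index 0.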
|
|
|
|
|
|
|
|
static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < ppgtt->num_pd_pages; i++) {
|
|
|
|
ppgtt->gen8_pt_dma_addr[i] = kcalloc(GEN8_PDES_PER_PAGE,
|
|
|
|
sizeof(dma_addr_t),
|
|
|
|
GFP_KERNEL);
|
|
|
|
if (!ppgtt->gen8_pt_dma_addr[i])
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen8_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt,
|
|
|
|
const int max_pdp)
|
|
|
|
{
|
|
|
|
ppgtt->pd_pages = alloc_pages(GFP_KERNEL, get_order(max_pdp << PAGE_SHIFT));
|
|
|
|
if (!ppgtt->pd_pages)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
ppgtt->num_pd_pages = 1 << get_order(max_pdp << PAGE_SHIFT);
|
|
|
|
BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPS);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen8_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt,
|
|
|
|
const int max_pdp)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = gen8_ppgtt_allocate_page_directories(ppgtt, max_pdp);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ret = gen8_ppgtt_allocate_page_tables(ppgtt, max_pdp);
|
|
|
|
if (ret) {
|
|
|
|
__free_pages(ppgtt->pd_pages, get_order(max_pdp << PAGE_SHIFT));
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE;
|
|
|
|
|
|
|
|
ret = gen8_ppgtt_allocate_dma(ppgtt);
|
|
|
|
if (ret)
|
|
|
|
gen8_ppgtt_free(ppgtt);
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen8_ppgtt_setup_page_directories(struct i915_hw_ppgtt *ppgtt,
|
|
|
|
const int pd)
|
|
|
|
{
|
|
|
|
dma_addr_t pd_addr;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
pd_addr = pci_map_page(ppgtt->base.dev->pdev,
|
|
|
|
&ppgtt->pd_pages[pd], 0,
|
|
|
|
PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
|
|
|
|
|
|
|
|
ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pd_addr);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ppgtt->pd_dma_addr[pd] = pd_addr;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen8_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt,
|
|
|
|
const int pd,
|
|
|
|
const int pt)
|
|
|
|
{
|
|
|
|
dma_addr_t pt_addr;
|
|
|
|
struct page *p;
|
|
|
|
int ret;
|
|
|
|
|
2014-02-21 02:51:21 +07:00
|
|
|
p = ppgtt->gen8_pt_pages[pd][pt];
|
2014-02-20 13:05:43 +07:00
|
|
|
pt_addr = pci_map_page(ppgtt->base.dev->pdev,
|
|
|
|
p, 0, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
|
|
|
|
ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ppgtt->gen8_pt_dma_addr[pd][pt] = pt_addr;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-11-05 11:47:32 +07:00
|
|
|
/**
|
2014-02-20 13:05:42 +07:00
|
|
|
* GEN8 legacy ppgtt programming is accomplished through at most 4 PDP registers,
|
|
|
|
* with a net effect resembling a 2-level page table in normal x86 terms. Each
|
|
|
|
* PDP represents 1GB of memory, so 4 * 512 * 512 * 4096 = 4GB covers the legacy
|
|
|
|
* 32b address space.
|
2013-11-05 11:47:32 +07:00
|
|
|
*
|
2014-02-20 13:05:42 +07:00
|
|
|
* FIXME: split allocation into smaller pieces. For now we only ever do this
|
|
|
|
* once, but with full PPGTT, the multiple contiguous allocations will be bad.
|
2013-11-05 11:47:32 +07:00
|
|
|
* TODO: Do something with the size parameter
|
2014-02-20 13:05:42 +07:00
|
|
|
*/
|
2013-11-05 11:47:32 +07:00
|
|
|
static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
|
|
|
|
{
|
|
|
|
const int max_pdp = DIV_ROUND_UP(size, 1 << 30);
|
2014-02-20 13:05:43 +07:00
|
|
|
const int min_pt_pages = GEN8_PDES_PER_PAGE * max_pdp;
|
2014-02-20 13:05:42 +07:00
|
|
|
int i, j, ret;
|
2013-11-05 11:47:32 +07:00
|
|
|
|
|
|
|
if (size % (1<<30))
|
|
|
|
DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
|
|
|
|
|
2014-02-20 13:05:43 +07:00
|
|
|
/* 1. Do all our allocations for page directories and page tables. */
|
|
|
|
ret = gen8_ppgtt_alloc(ppgtt, max_pdp);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
2014-02-20 13:05:42 +07:00
|
|
|
|
2013-11-05 11:47:32 +07:00
|
|
|
/*
|
2014-02-20 13:05:43 +07:00
|
|
|
* 2. Create DMA mappings for the page directories and page tables.
|
2013-11-05 11:47:32 +07:00
|
|
|
*/
|
|
|
|
for (i = 0; i < max_pdp; i++) {
|
2014-02-20 13:05:43 +07:00
|
|
|
ret = gen8_ppgtt_setup_page_directories(ppgtt, i);
|
2014-02-20 13:05:42 +07:00
|
|
|
if (ret)
|
|
|
|
goto bail;
|
2013-11-05 11:47:32 +07:00
|
|
|
|
|
|
|
for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
|
2014-02-20 13:05:43 +07:00
|
|
|
ret = gen8_ppgtt_setup_page_tables(ppgtt, i, j);
|
2014-02-20 13:05:42 +07:00
|
|
|
if (ret)
|
|
|
|
goto bail;
|
2013-11-05 11:47:32 +07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2014-02-20 13:05:42 +07:00
|
|
|
/*
|
|
|
|
* 3. Map all the page directory entries to point to the page tables
|
|
|
|
* we've allocated.
|
|
|
|
*
|
|
|
|
* For now, the PPGTT helper functions all require that the PDEs are
|
2013-11-05 12:20:14 +07:00
|
|
|
* plugged in correctly. So we do that now/here. For aliasing PPGTT, we
|
2014-02-20 13:05:42 +07:00
|
|
|
* will never need to touch the PDEs again.
|
|
|
|
*/
|
2013-11-05 12:20:14 +07:00
|
|
|
for (i = 0; i < max_pdp; i++) {
|
|
|
|
gen8_ppgtt_pde_t *pd_vaddr;
|
|
|
|
pd_vaddr = kmap_atomic(&ppgtt->pd_pages[i]);
|
|
|
|
for (j = 0; j < GEN8_PDES_PER_PAGE; j++) {
|
|
|
|
dma_addr_t addr = ppgtt->gen8_pt_dma_addr[i][j];
|
|
|
|
pd_vaddr[j] = gen8_pde_encode(ppgtt->base.dev, addr,
|
|
|
|
I915_CACHE_LLC);
|
|
|
|
}
|
2014-04-09 17:28:02 +07:00
|
|
|
if (!HAS_LLC(ppgtt->base.dev))
|
|
|
|
drm_clflush_virt_range(pd_vaddr, PAGE_SIZE);
|
2013-11-05 12:20:14 +07:00
|
|
|
kunmap_atomic(pd_vaddr);
|
|
|
|
}
|
|
|
|
|
2014-02-20 13:05:42 +07:00
|
|
|
ppgtt->enable = gen8_ppgtt_enable;
|
|
|
|
ppgtt->switch_mm = gen8_mm_switch;
|
|
|
|
ppgtt->base.clear_range = gen8_ppgtt_clear_range;
|
|
|
|
ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
|
|
|
|
ppgtt->base.cleanup = gen8_ppgtt_cleanup;
|
|
|
|
ppgtt->base.start = 0;
|
2014-02-22 04:06:34 +07:00
|
|
|
ppgtt->base.total = ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE * PAGE_SIZE;
|
2014-02-20 13:05:42 +07:00
|
|
|
|
2014-02-22 04:06:34 +07:00
|
|
|
ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
|
2013-11-03 11:07:23 +07:00
|
|
|
|
2013-11-05 11:47:32 +07:00
|
|
|
DRM_DEBUG_DRIVER("Allocated %d pages for page directories (%d wasted)\n",
|
|
|
|
ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp);
|
|
|
|
DRM_DEBUG_DRIVER("Allocated %d pages for page tables (%lld wasted)\n",
|
2014-02-22 04:06:34 +07:00
|
|
|
ppgtt->num_pd_entries,
|
|
|
|
(ppgtt->num_pd_entries - min_pt_pages) + size % (1<<30));
|
2013-11-03 11:07:26 +07:00
|
|
|
return 0;
|
2013-11-05 11:47:32 +07:00
|
|
|
|
2014-02-20 13:05:42 +07:00
|
|
|
bail:
|
|
|
|
gen8_ppgtt_unmap_pages(ppgtt);
|
|
|
|
gen8_ppgtt_free(ppgtt);
|
2013-11-05 11:47:32 +07:00
|
|
|
return ret;
|
|
|
|
}
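The sizing arithmetic used by the init function above (1GB per PDP, `max_pdp` rounded up from the requested size, and the total derived from the number of page directory entries) can be checked with a small stand-alone sketch. The `sketch_` names and the local `SKETCH_DIV_ROUND_UP` macro are hypothetical stand-ins; the 512-entry and 4096-byte figures are taken from the comment above the function.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LEGACY_PDPS 4u
#define SKETCH_PDES_PER_PAGE 512ull
#define SKETCH_PTES_PER_PAGE 512ull
#define SKETCH_PAGE_SIZE 4096ull

/* Local stand-in for the kernel's DIV_ROUND_UP(). */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Address space covered by one PDP: 512 PDEs * 512 PTEs * 4096 bytes. */
static uint64_t sketch_bytes_per_pdp(void)
{
	return SKETCH_PDES_PER_PAGE * SKETCH_PTES_PER_PAGE * SKETCH_PAGE_SIZE;
}

/* Number of PDPs needed for size bytes, rounding partial gigabytes up. */
static unsigned int sketch_max_pdp(uint64_t size)
{
	return (unsigned int)SKETCH_DIV_ROUND_UP(size, 1ull << 30);
}

/* Total mapped range, computed as num_pd_entries * PTEs * page size. */
static uint64_t sketch_total(unsigned int max_pdp)
{
	uint64_t num_pd_entries;

	assert(max_pdp <= SKETCH_LEGACY_PDPS);

	num_pd_entries = max_pdp * SKETCH_PDES_PER_PAGE;
	return num_pd_entries * SKETCH_PTES_PER_PAGE * SKETCH_PAGE_SIZE;
}
```

A size that is not a multiple of 1GB still consumes a whole extra PDP, which is the waste the `DRM_INFO` message in the function warns about.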
|
|
|
|
|
2013-12-07 05:11:29 +07:00
|
|
|
static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
|
|
|
|
struct i915_address_space *vm = &ppgtt->base;
|
|
|
|
gen6_gtt_pte_t __iomem *pd_addr;
|
|
|
|
gen6_gtt_pte_t scratch_pte;
|
|
|
|
uint32_t pd_entry;
|
|
|
|
int pte, pde;
|
|
|
|
|
2014-06-17 12:29:42 +07:00
|
|
|
scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC, true, 0);
|
2013-12-07 05:11:29 +07:00
|
|
|
|
|
|
|
pd_addr = (gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm +
|
|
|
|
ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
|
|
|
|
|
|
|
|
seq_printf(m, " VM %p (pd_offset %x-%x):\n", vm,
|
|
|
|
ppgtt->pd_offset, ppgtt->pd_offset + ppgtt->num_pd_entries);
|
|
|
|
for (pde = 0; pde < ppgtt->num_pd_entries; pde++) {
|
|
|
|
u32 expected;
|
|
|
|
gen6_gtt_pte_t *pt_vaddr;
|
|
|
|
dma_addr_t pt_addr = ppgtt->pt_dma_addr[pde];
|
|
|
|
pd_entry = readl(pd_addr + pde);
|
|
|
|
expected = (GEN6_PDE_ADDR_ENCODE(pt_addr) | GEN6_PDE_VALID);
|
|
|
|
|
|
|
|
if (pd_entry != expected)
|
|
|
|
seq_printf(m, "\tPDE #%d mismatch: Actual PDE: %x Expected PDE: %x\n",
|
|
|
|
pde,
|
|
|
|
pd_entry,
|
|
|
|
expected);
|
|
|
|
seq_printf(m, "\tPDE: %x\n", pd_entry);
|
|
|
|
|
|
|
|
pt_vaddr = kmap_atomic(ppgtt->pt_pages[pde]);
|
|
|
|
for (pte = 0; pte < I915_PPGTT_PT_ENTRIES; pte+=4) {
|
|
|
|
unsigned long va =
|
|
|
|
(pde * PAGE_SIZE * I915_PPGTT_PT_ENTRIES) +
|
|
|
|
(pte * PAGE_SIZE);
|
|
|
|
int i;
|
|
|
|
bool found = false;
|
|
|
|
for (i = 0; i < 4; i++)
|
|
|
|
if (pt_vaddr[pte + i] != scratch_pte)
|
|
|
|
found = true;
|
|
|
|
if (!found)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
seq_printf(m, "\t\t0x%lx [%03d,%04d]: =", va, pde, pte);
|
|
|
|
for (i = 0; i < 4; i++) {
|
|
|
|
if (pt_vaddr[pte + i] != scratch_pte)
|
|
|
|
seq_printf(m, " %08x", pt_vaddr[pte + i]);
|
|
|
|
else
|
|
|
|
seq_puts(m, " SCRATCH ");
|
|
|
|
}
|
|
|
|
seq_puts(m, "\n");
|
|
|
|
}
|
|
|
|
kunmap_atomic(pt_vaddr);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2013-04-24 13:15:32 +07:00
|
|
|
static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
|
2013-04-09 08:43:54 +07:00
|
|
|
{
|
2013-07-17 06:50:05 +07:00
|
|
|
struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
|
2013-04-09 08:43:54 +07:00
|
|
|
gen6_gtt_pte_t __iomem *pd_addr;
|
|
|
|
uint32_t pd_entry;
|
|
|
|
int i;
|
|
|
|
|
2013-04-24 13:15:30 +07:00
|
|
|
WARN_ON(ppgtt->pd_offset & 0x3f);
|
2013-04-09 08:43:54 +07:00
|
|
|
pd_addr = (gen6_gtt_pte_t __iomem*)dev_priv->gtt.gsm +
|
|
|
|
ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
|
|
|
|
for (i = 0; i < ppgtt->num_pd_entries; i++) {
|
|
|
|
dma_addr_t pt_addr;
|
|
|
|
|
|
|
|
pt_addr = ppgtt->pt_dma_addr[i];
|
|
|
|
pd_entry = GEN6_PDE_ADDR_ENCODE(pt_addr);
|
|
|
|
pd_entry |= GEN6_PDE_VALID;
|
|
|
|
|
|
|
|
writel(pd_entry, pd_addr + i);
|
|
|
|
}
|
|
|
|
readl(pd_addr);
|
2013-04-24 13:15:32 +07:00
|
|
|
}
|
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
static uint32_t get_pd_offset(struct i915_hw_ppgtt *ppgtt)
|
2013-04-24 13:15:32 +07:00
|
|
|
{
|
2013-12-07 05:11:09 +07:00
|
|
|
BUG_ON(ppgtt->pd_offset & 0x3f);
|
|
|
|
|
|
|
|
return (ppgtt->pd_offset / 64) << 16;
|
|
|
|
}
|
|
|
|
|
2013-12-07 05:11:12 +07:00
|
|
|
static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
|
2014-05-22 20:13:33 +07:00
|
|
|
struct intel_engine_cs *ring,
|
2013-12-07 05:11:12 +07:00
|
|
|
bool synchronous)
|
|
|
|
{
|
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* If we're in reset, we can assume the GPU is sufficiently idle to
|
|
|
|
* manually frob these bits. Ideally we could use the ring functions,
|
|
|
|
* except our error handling makes it quite difficult (can't use
|
|
|
|
* intel_ring_begin, ring->flush, or intel_ring_advance)
|
|
|
|
*
|
|
|
|
* FIXME: We should try not to special case reset
|
|
|
|
*/
|
|
|
|
if (synchronous ||
|
|
|
|
i915_reset_in_progress(&dev_priv->gpu_error)) {
|
|
|
|
WARN_ON(ppgtt != dev_priv->mm.aliasing_ppgtt);
|
|
|
|
I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
|
|
|
|
I915_WRITE(RING_PP_DIR_BASE(ring), get_pd_offset(ppgtt));
|
|
|
|
POSTING_READ(RING_PP_DIR_BASE(ring));
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* NB: TLBs must be flushed and invalidated before a switch */
|
|
|
|
ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ret = intel_ring_begin(ring, 6);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
|
|
|
|
intel_ring_emit(ring, RING_PP_DIR_DCLV(ring));
|
|
|
|
intel_ring_emit(ring, PP_DIR_DCLV_2G);
|
|
|
|
intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
|
|
|
|
intel_ring_emit(ring, get_pd_offset(ppgtt));
|
|
|
|
intel_ring_emit(ring, MI_NOOP);
|
|
|
|
intel_ring_advance(ring);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-12-07 05:11:11 +07:00
|
|
|
static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
|
2014-05-22 20:13:33 +07:00
|
|
|
struct intel_engine_cs *ring,
|
2013-12-07 05:11:11 +07:00
|
|
|
bool synchronous)
|
|
|
|
{
|
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* If we're in reset, we can assume the GPU is sufficiently idle to
|
|
|
|
* manually frob these bits. Ideally we could use the ring functions,
|
|
|
|
* except our error handling makes it quite difficult (can't use
|
|
|
|
* intel_ring_begin, ring->flush, or intel_ring_advance)
|
|
|
|
*
|
|
|
|
* FIXME: We should try not to special case reset
|
|
|
|
*/
|
|
|
|
if (synchronous ||
|
|
|
|
i915_reset_in_progress(&dev_priv->gpu_error)) {
|
|
|
|
WARN_ON(ppgtt != dev_priv->mm.aliasing_ppgtt);
|
|
|
|
I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
|
|
|
|
I915_WRITE(RING_PP_DIR_BASE(ring), get_pd_offset(ppgtt));
|
|
|
|
POSTING_READ(RING_PP_DIR_BASE(ring));
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* NB: TLBs must be flushed and invalidated before a switch */
|
|
|
|
ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ret = intel_ring_begin(ring, 6);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
|
|
|
|
intel_ring_emit(ring, RING_PP_DIR_DCLV(ring));
|
|
|
|
intel_ring_emit(ring, PP_DIR_DCLV_2G);
|
|
|
|
intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
|
|
|
|
intel_ring_emit(ring, get_pd_offset(ppgtt));
|
|
|
|
intel_ring_emit(ring, MI_NOOP);
|
|
|
|
intel_ring_advance(ring);
|
|
|
|
|
2013-12-07 05:11:12 +07:00
|
|
|
/* XXX: RCS is the only one to auto invalidate the TLBs? */
|
|
|
|
if (ring->id != RCS) {
|
|
|
|
ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-12-07 05:11:11 +07:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
static int gen6_mm_switch(struct i915_hw_ppgtt *ppgtt,
|
2014-05-22 20:13:33 +07:00
|
|
|
struct intel_engine_cs *ring,
|
2013-12-07 05:11:10 +07:00
|
|
|
bool synchronous)
|
|
|
|
{
|
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
|
2013-12-07 05:11:11 +07:00
|
|
|
if (!synchronous)
|
|
|
|
return 0;
|
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
|
|
|
|
I915_WRITE(RING_PP_DIR_BASE(ring), get_pd_offset(ppgtt));
|
|
|
|
|
|
|
|
POSTING_READ(RING_PP_DIR_DCLV(ring));
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen8_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2014-05-22 20:13:33 +07:00
|
|
|
struct intel_engine_cs *ring;
|
2013-12-07 05:11:10 +07:00
|
|
|
int j, ret;
|
2013-04-24 13:15:32 +07:00
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
for_each_ring(ring, dev_priv, j) {
|
|
|
|
I915_WRITE(RING_MODE_GEN7(ring),
|
|
|
|
_MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
|
2013-04-24 13:15:32 +07:00
|
|
|
|
2013-12-07 05:11:27 +07:00
|
|
|
/* We promise to do a switch later with FULL PPGTT. If this is
|
|
|
|
* aliasing, this is the one and only switch we'll do */
|
|
|
|
if (USES_FULL_PPGTT(dev))
|
|
|
|
continue;
|
2013-04-09 08:43:54 +07:00
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
ret = ppgtt->switch_mm(ppgtt, ring, true);
|
|
|
|
if (ret)
|
|
|
|
goto err_out;
|
|
|
|
}
|
2013-04-09 08:43:54 +07:00
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
return 0;
|
2013-04-09 08:43:54 +07:00
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
err_out:
|
|
|
|
for_each_ring(ring, dev_priv, j)
|
|
|
|
I915_WRITE(RING_MODE_GEN7(ring),
|
|
|
|
_MASKED_BIT_DISABLE(GFX_PPGTT_ENABLE));
|
|
|
|
return ret;
|
|
|
|
}
|
2013-04-09 08:43:54 +07:00
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
static int gen7_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
|
2013-04-24 13:15:32 +07:00
|
|
|
{
|
2013-12-07 05:11:06 +07:00
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
2014-03-31 18:27:21 +07:00
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2014-05-22 20:13:33 +07:00
|
|
|
struct intel_engine_cs *ring;
|
2013-12-07 05:11:09 +07:00
|
|
|
uint32_t ecochk, ecobits;
|
2013-04-24 13:15:32 +07:00
|
|
|
int i;
|
2013-04-09 08:43:54 +07:00
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
ecobits = I915_READ(GAC_ECO_BITS);
|
|
|
|
I915_WRITE(GAC_ECO_BITS, ecobits | ECOBITS_PPGTT_CACHE64B);
|
2013-04-04 19:13:41 +07:00
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
ecochk = I915_READ(GAM_ECOCHK);
|
|
|
|
if (IS_HASWELL(dev)) {
|
|
|
|
ecochk |= ECOCHK_PPGTT_WB_HSW;
|
|
|
|
} else {
|
|
|
|
ecochk |= ECOCHK_PPGTT_LLC_IVB;
|
|
|
|
ecochk &= ~ECOCHK_PPGTT_GFDT_IVB;
|
|
|
|
}
|
|
|
|
I915_WRITE(GAM_ECOCHK, ecochk);
|
2013-04-04 19:13:41 +07:00
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
for_each_ring(ring, dev_priv, i) {
|
2013-12-07 05:11:10 +07:00
|
|
|
int ret;
|
2013-04-09 08:43:54 +07:00
|
|
|
/* GFX_MODE is per-ring on gen7+ */
|
2013-12-07 05:11:09 +07:00
|
|
|
I915_WRITE(RING_MODE_GEN7(ring),
|
|
|
|
_MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
|
2013-12-07 05:11:27 +07:00
|
|
|
|
|
|
|
/* We promise to do a switch later with FULL PPGTT. If this is
|
|
|
|
* aliasing, this is the one and only switch we'll do */
|
|
|
|
if (USES_FULL_PPGTT(dev))
|
|
|
|
continue;
|
|
|
|
|
2013-12-07 05:11:10 +07:00
|
|
|
ret = ppgtt->switch_mm(ppgtt, ring, true);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
2013-04-09 08:43:54 +07:00
|
|
|
}
|
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
return 0;
|
|
|
|
}
|
2013-04-09 08:43:54 +07:00
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
2014-03-31 18:27:21 +07:00
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2014-05-22 20:13:33 +07:00
|
|
|
struct intel_engine_cs *ring;
|
2013-12-07 05:11:09 +07:00
|
|
|
uint32_t ecochk, gab_ctl, ecobits;
|
|
|
|
int i;
|
2013-04-04 19:13:41 +07:00
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
ecobits = I915_READ(GAC_ECO_BITS);
|
|
|
|
I915_WRITE(GAC_ECO_BITS, ecobits | ECOBITS_SNB_BIT |
|
|
|
|
ECOBITS_PPGTT_CACHE64B);
|
2013-04-09 08:43:54 +07:00
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
gab_ctl = I915_READ(GAB_CTL);
|
|
|
|
I915_WRITE(GAB_CTL, gab_ctl | GAB_CTL_CONT_AFTER_PAGEFAULT);
|
|
|
|
|
|
|
|
ecochk = I915_READ(GAM_ECOCHK);
|
|
|
|
I915_WRITE(GAM_ECOCHK, ecochk | ECOCHK_SNB_BIT | ECOCHK_PPGTT_CACHE64B);
|
|
|
|
|
|
|
|
I915_WRITE(GFX_MODE, _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
|
2013-04-09 08:43:54 +07:00
|
|
|
|
2013-12-07 05:11:09 +07:00
|
|
|
for_each_ring(ring, dev_priv, i) {
|
2013-12-07 05:11:10 +07:00
|
|
|
int ret = ppgtt->switch_mm(ppgtt, ring, true);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
2013-04-09 08:43:54 +07:00
|
|
|
}
|
2013-12-07 05:11:09 +07:00
|
|
|
|
2013-04-09 08:43:56 +07:00
|
|
|
return 0;
|
2013-04-09 08:43:54 +07:00
|
|
|
}
|
|
|
|
|
2012-02-09 23:15:46 +07:00
|
|
|
/* PPGTT support for Sandybridge/Gen6 and later */
|
2013-07-17 06:50:05 +07:00
|
|
|
static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
|
2014-02-21 02:50:33 +07:00
|
|
|
uint64_t start,
|
|
|
|
uint64_t length,
|
2013-10-16 23:21:30 +07:00
|
|
|
bool use_scratch)
|
2012-02-09 23:15:46 +07:00
|
|
|
{
|
2013-07-17 06:50:05 +07:00
|
|
|
struct i915_hw_ppgtt *ppgtt =
|
|
|
|
container_of(vm, struct i915_hw_ppgtt, base);
|
2013-04-09 08:43:48 +07:00
|
|
|
gen6_gtt_pte_t *pt_vaddr, scratch_pte;
|
2014-02-21 02:50:33 +07:00
|
|
|
unsigned first_entry = start >> PAGE_SHIFT;
|
|
|
|
unsigned num_entries = length >> PAGE_SHIFT;
|
2013-03-20 05:48:39 +07:00
|
|
|
unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
|
2012-02-09 23:15:47 +07:00
|
|
|
unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
|
|
|
|
unsigned last_pte, i;
|
2012-02-09 23:15:46 +07:00
|
|
|
|
2014-06-17 12:29:42 +07:00
|
|
|
scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC, true, 0);
|
2012-02-09 23:15:46 +07:00
|
|
|
|
2012-02-09 23:15:47 +07:00
|
|
|
while (num_entries) {
|
|
|
|
last_pte = first_pte + num_entries;
|
|
|
|
if (last_pte > I915_PPGTT_PT_ENTRIES)
|
|
|
|
last_pte = I915_PPGTT_PT_ENTRIES;
|
|
|
|
|
2013-03-20 05:48:39 +07:00
|
|
|
pt_vaddr = kmap_atomic(ppgtt->pt_pages[act_pt]);
|
2012-02-09 23:15:46 +07:00
|
|
|
|
2012-02-09 23:15:47 +07:00
|
|
|
for (i = first_pte; i < last_pte; i++)
|
|
|
|
pt_vaddr[i] = scratch_pte;
|
2012-02-09 23:15:46 +07:00
|
|
|
|
|
|
|
kunmap_atomic(pt_vaddr);
|
|
|
|
|
2012-02-09 23:15:47 +07:00
|
|
|
num_entries -= last_pte - first_pte;
|
|
|
|
first_pte = 0;
|
2013-03-20 05:48:39 +07:00
|
|
|
act_pt++;
|
2012-02-09 23:15:47 +07:00
|
|
|
}
|
2012-02-09 23:15:46 +07:00
|
|
|
}
|
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
|
2013-01-25 05:44:56 +07:00
|
|
|
struct sg_table *pages,
|
2014-02-21 02:50:33 +07:00
|
|
|
uint64_t start,
|
2014-06-17 12:29:42 +07:00
|
|
|
enum i915_cache_level cache_level, u32 flags)
|
2013-01-25 05:44:56 +07:00
|
|
|
{
|
2013-07-17 06:50:05 +07:00
|
|
|
struct i915_hw_ppgtt *ppgtt =
|
|
|
|
container_of(vm, struct i915_hw_ppgtt, base);
|
2013-04-09 08:43:48 +07:00
|
|
|
gen6_gtt_pte_t *pt_vaddr;
|
2014-02-21 02:50:33 +07:00
|
|
|
unsigned first_entry = start >> PAGE_SHIFT;
|
2013-03-20 05:48:39 +07:00
|
|
|
unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
|
2013-02-19 00:28:04 +07:00
|
|
|
unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
|
|
|
|
struct sg_page_iter sg_iter;
|
|
|
|
|
2013-12-31 22:50:30 +07:00
|
|
|
pt_vaddr = NULL;
|
2013-02-19 00:28:04 +07:00
|
|
|
for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
|
2013-12-31 22:50:30 +07:00
|
|
|
if (pt_vaddr == NULL)
|
|
|
|
pt_vaddr = kmap_atomic(ppgtt->pt_pages[act_pt]);
|
2013-02-19 00:28:04 +07:00
|
|
|
|
2013-12-31 22:50:30 +07:00
|
|
|
pt_vaddr[act_pte] =
|
|
|
|
vm->pte_encode(sg_page_iter_dma_address(&sg_iter),
|
2014-06-17 12:29:42 +07:00
|
|
|
cache_level, true, flags);
|
|
|
|
|
2013-02-19 00:28:04 +07:00
|
|
|
if (++act_pte == I915_PPGTT_PT_ENTRIES) {
|
|
|
|
kunmap_atomic(pt_vaddr);
|
2013-12-31 22:50:30 +07:00
|
|
|
pt_vaddr = NULL;
|
2013-03-20 05:48:39 +07:00
|
|
|
act_pt++;
|
2013-02-19 00:28:04 +07:00
|
|
|
act_pte = 0;
|
2013-01-25 05:44:56 +07:00
|
|
|
}
|
|
|
|
}
|
2013-12-31 22:50:30 +07:00
|
|
|
if (pt_vaddr)
|
|
|
|
kunmap_atomic(pt_vaddr);
|
2013-01-25 05:44:56 +07:00
|
|
|
}
|
|
|
|
|
2014-02-20 13:05:48 +07:00
|
|
|
static void gen6_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt)
|
2012-02-09 23:15:46 +07:00
|
|
|
{
|
2013-01-25 04:49:56 +07:00
|
|
|
int i;
|
|
|
|
|
|
|
|
if (ppgtt->pt_dma_addr) {
|
|
|
|
for (i = 0; i < ppgtt->num_pd_entries; i++)
|
2013-07-17 06:50:05 +07:00
|
|
|
pci_unmap_page(ppgtt->base.dev->pdev,
|
2013-01-25 04:49:56 +07:00
|
|
|
ppgtt->pt_dma_addr[i],
|
|
|
|
4096, PCI_DMA_BIDIRECTIONAL);
|
|
|
|
}
|
2014-02-20 13:05:48 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static void gen6_ppgtt_free(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
|
|
|
int i;
|
2013-01-25 04:49:56 +07:00
|
|
|
|
|
|
|
kfree(ppgtt->pt_dma_addr);
|
|
|
|
for (i = 0; i < ppgtt->num_pd_entries; i++)
|
|
|
|
__free_page(ppgtt->pt_pages[i]);
|
|
|
|
kfree(ppgtt->pt_pages);
|
|
|
|
}
|
|
|
|
|
2014-02-20 13:05:48 +07:00
|
|
|
static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
|
|
|
|
{
|
|
|
|
struct i915_hw_ppgtt *ppgtt =
|
|
|
|
container_of(vm, struct i915_hw_ppgtt, base);
|
|
|
|
|
|
|
|
list_del(&vm->global_link);
|
|
|
|
drm_mm_takedown(&ppgtt->base.mm);
|
|
|
|
drm_mm_remove_node(&ppgtt->node);
|
|
|
|
|
|
|
|
gen6_ppgtt_unmap_pages(ppgtt);
|
|
|
|
gen6_ppgtt_free(ppgtt);
|
|
|
|
}
|
|
|
|
|
2014-02-20 13:05:49 +07:00
|
|
|
static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
|
2013-01-25 04:49:56 +07:00
|
|
|
{
|
2013-07-17 06:50:05 +07:00
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
2012-02-09 23:15:46 +07:00
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2013-12-07 05:11:08 +07:00
|
|
|
bool retried = false;
|
2014-02-20 13:05:49 +07:00
|
|
|
int ret;
|
2012-02-09 23:15:46 +07:00
|
|
|
|
drm/i915: Use drm_mm for PPGTT PDEs
When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.
Since we intend to support full PPGTT, which means more than 1, and they
can be created and destroyed ad hoc it will be required to use the
proper allocation techniques we already have.
The first step here is to make the existing single PPGTT use the
allocator.
The astute observer will notice that we are reserving space in the GGTT
for the PDEs for the lifetime of the address space, and would be right
to question whether or not this is a good idea. It does not make a
difference with this current patch only the aliasing PPGTT (indeed the
PDEs should still be hidden from the shrinker). For the future, we are
allocating from top to bottom to avoid using the precious "gtt
space". The GGTT space at that point should only be used for scanout, HW
contexts, ringbuffers, HWSP, PDEs, and a couple of other small buffers
(potentially) used by the kernel. Everything else should be mapped into
a PPGTT. To put the consumption in more tangible terms, it takes
approximately 4 sets of PDEs to equal one 19x10 framebuffer (with no
fancy stride or alignment constraints). 3/4 of the total [average] GGTT
can be used for PDEs, and hopefully never touch the 1/4 that the
framebuffer needs.
The astute, and persistent observer might ask about the page tables
which are also pinned for the address space. This waste is unfortunate.
We use 2MB of memory per address space. We leave wrapping the PDEs as a
real GEM object as a TODO.
v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)
v3: Use Chris' top down allocator
v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)
v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)
v6: Rebased and removed guard page stuff.
Added a chunk to the commit message
Allow adding a context to mappable region
v7: Undo v3, so we can make the drm patch last in the series
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v4)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
squash: drm/i915: allow PPGTT to use mappable
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-07 05:11:07 +07:00
|
|
|
/* PPGTT PDEs reside in the GGTT and consist of 512 entries. The
|
|
|
|
* allocator works in address space sizes, so it's multiplied by page
|
|
|
|
* size. We allocate at the top of the GTT to avoid fragmentation.
|
|
|
|
*/
|
|
|
|
BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
|
2013-12-07 05:11:08 +07:00
|
|
|
alloc:
|
2013-12-07 05:11:07 +07:00
|
|
|
ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
|
|
|
|
&ppgtt->node, GEN6_PD_SIZE,
|
|
|
|
GEN6_PD_ALIGN, 0,
|
|
|
|
0, dev_priv->gtt.base.total,
|
2014-05-07 12:21:30 +07:00
|
|
|
DRM_MM_TOPDOWN);
|
2013-12-07 05:11:08 +07:00
|
|
|
if (ret == -ENOSPC && !retried) {
|
|
|
|
ret = i915_gem_evict_something(dev, &dev_priv->gtt.base,
|
|
|
|
GEN6_PD_SIZE, GEN6_PD_ALIGN,
|
drm/i915: Prevent negative relocation deltas from wrapping
This is pure evil. Userspace, I'm looking at you SNA, repacks batch
buffers on the fly after generation as they are being passed to the
kernel for execution. These batches also contain self-referenced
relocations as a single buffer encompasses the state commands, kernels,
vertices and sampler. During generation the buffers are placed at known
offsets within the full batch, and then the relocation deltas (as passed
to the kernel) are tweaked as the batch is repacked into a smaller buffer.
This means that userspace is passing negative relocation deltas, which
subsequently wrap to large values if the batch is at a low address. The
GPU hangs when it then tries to use the large value as a base for its
address offsets, rather than wrapping back to the real value (as one
would hope). As the GPU uses positive offsets from the base, we can
treat the relocation address as the minimum address read by the GPU.
For the upper bound, we trust that userspace will not read beyond the
end of the buffer.
So, how do we fix negative relocations from wrapping? We can either
check that every relocation looks valid when we write it, and then
position each object such that we prevent the offset wraparound, or we
just special-case the self-referential behaviour of SNA and force all
batches to be above 256k. Daniel prefers the latter approach.
This fixes a GPU hang when it tries to use an address (relocation +
offset) greater than the GTT size. The issue would occur quite easily
with full-ppgtt as each fd gets its own VM space, so low offsets would
often be handed out. However, with the rearrangement of the low GTT due
to capturing the BIOS framebuffer, it is already affecting kernels 3.15
onwards. I think only IVB+ is susceptible to this bug, but the workaround
should only kick in rarely, so it seems sensible to always apply it.
v3: Use a bias for batch buffers to prevent small negative delta relocations
from wrapping.
v4 from Daniel:
- s/BIAS/BATCH_OFFSET_BIAS/
- Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
were growing rather cumbersome.
- Add a comment to eb_get_batch explaining why we do this.
- Apply the batch offset bias everywhere but mention that we've only
observed it on gen7 gpus.
- Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.
v5: Add static to eb_get_batch, spotted by 0-day tester.
Testcase: igt/gem_bad_reloc
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-05-23 13:48:08 +07:00
|
|
|
I915_CACHE_NONE,
|
|
|
|
0, dev_priv->gtt.base.total,
|
|
|
|
0);
|
2013-12-07 05:11:08 +07:00
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
retried = true;
|
|
|
|
goto alloc;
|
|
|
|
}
|
2013-12-07 05:11:07 +07:00
|
|
|
|
|
|
|
if (ppgtt->node.start < dev_priv->gtt.mappable_end)
|
|
|
|
DRM_DEBUG("Forced to use aperture for PDEs\n");
|
2012-02-09 23:15:46 +07:00
|
|
|
|
2013-06-28 06:30:04 +07:00
|
|
|
ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
|
2014-02-20 13:05:49 +07:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen6_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
2013-09-21 05:35:38 +07:00
|
|
|
ppgtt->pt_pages = kcalloc(ppgtt->num_pd_entries, sizeof(struct page *),
|
2012-02-09 23:15:46 +07:00
|
|
|
GFP_KERNEL);
|
2014-02-20 13:05:49 +07:00
|
|
|
|
|
|
|
if (!ppgtt->pt_pages)
|
2013-01-25 04:49:56 +07:00
|
|
|
return -ENOMEM;
|
2012-02-09 23:15:46 +07:00
|
|
|
|
|
|
|
for (i = 0; i < ppgtt->num_pd_entries; i++) {
|
|
|
|
ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
|
2014-02-20 13:05:49 +07:00
|
|
|
if (!ppgtt->pt_pages[i]) {
|
|
|
|
gen6_ppgtt_free(ppgtt);
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen6_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = gen6_ppgtt_allocate_page_directories(ppgtt);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ret = gen6_ppgtt_allocate_page_tables(ppgtt);
|
|
|
|
if (ret) {
|
|
|
|
drm_mm_remove_node(&ppgtt->node);
|
|
|
|
return ret;
|
2012-02-09 23:15:46 +07:00
|
|
|
}
|
|
|
|
|
2013-09-21 05:35:38 +07:00
|
|
|
ppgtt->pt_dma_addr = kcalloc(ppgtt->num_pd_entries, sizeof(dma_addr_t),
|
2013-01-19 03:30:33 +07:00
|
|
|
GFP_KERNEL);
|
2014-02-20 13:05:49 +07:00
|
|
|
if (!ppgtt->pt_dma_addr) {
|
|
|
|
drm_mm_remove_node(&ppgtt->node);
|
|
|
|
gen6_ppgtt_free(ppgtt);
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen6_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
|
|
|
int i;
|
2012-02-09 23:15:46 +07:00
|
|
|
|
2013-01-19 03:30:33 +07:00
|
|
|
for (i = 0; i < ppgtt->num_pd_entries; i++) {
|
|
|
|
dma_addr_t pt_addr;
|
2012-04-10 22:29:17 +07:00
|
|
|
|
2013-01-19 03:30:33 +07:00
|
|
|
pt_addr = pci_map_page(dev->pdev, ppgtt->pt_pages[i], 0, 4096,
|
|
|
|
PCI_DMA_BIDIRECTIONAL);
|
2012-02-09 23:15:46 +07:00
|
|
|
|
2013-01-19 03:30:33 +07:00
|
|
|
if (pci_dma_mapping_error(dev->pdev, pt_addr)) {
|
2014-02-20 13:05:49 +07:00
|
|
|
gen6_ppgtt_unmap_pages(ppgtt);
|
|
|
|
return -EIO;
|
2012-04-10 22:29:17 +07:00
|
|
|
}
|
2014-02-20 13:05:49 +07:00
|
|
|
|
2013-01-19 03:30:33 +07:00
|
|
|
ppgtt->pt_dma_addr[i] = pt_addr;
|
2012-02-09 23:15:46 +07:00
|
|
|
}
|
|
|
|
|
2014-02-20 13:05:49 +07:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
|
|
|
|
{
|
|
|
|
struct drm_device *dev = ppgtt->base.dev;
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ppgtt->base.pte_encode = dev_priv->gtt.base.pte_encode;
|
|
|
|
if (IS_GEN6(dev)) {
|
|
|
|
ppgtt->enable = gen6_ppgtt_enable;
|
|
|
|
ppgtt->switch_mm = gen6_mm_switch;
|
|
|
|
} else if (IS_HASWELL(dev)) {
|
|
|
|
ppgtt->enable = gen7_ppgtt_enable;
|
|
|
|
ppgtt->switch_mm = hsw_mm_switch;
|
|
|
|
} else if (IS_GEN7(dev)) {
|
|
|
|
ppgtt->enable = gen7_ppgtt_enable;
|
|
|
|
ppgtt->switch_mm = gen7_mm_switch;
|
|
|
|
} else
|
|
|
|
BUG();
|
|
|
|
|
|
|
|
ret = gen6_ppgtt_alloc(ppgtt);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ret = gen6_ppgtt_setup_page_tables(ppgtt);
|
|
|
|
if (ret) {
|
|
|
|
gen6_ppgtt_free(ppgtt);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
ppgtt->base.clear_range = gen6_ppgtt_clear_range;
|
|
|
|
ppgtt->base.insert_entries = gen6_ppgtt_insert_entries;
|
|
|
|
ppgtt->base.cleanup = gen6_ppgtt_cleanup;
|
|
|
|
ppgtt->base.start = 0;
|
2014-03-09 02:58:17 +07:00
|
|
|
ppgtt->base.total = ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
|
2013-12-07 05:11:29 +07:00
|
|
|
ppgtt->debug_dump = gen6_dump_ppgtt;
|
2012-02-09 23:15:46 +07:00
|
|
|
|
drm/i915: Use drm_mm for PPGTT PDEs
When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.
Since we intend to support full PPGTT, which means more than 1, and they
can be created and destroyed ad hoc it will be required to use the
proper allocation techniques we already have.
The first step here is to make the existing single PPGTT use the
allocator.
The astute observer will notice that we are reserving space in the GGTT
for the PDEs for the lifetime of the address space, and would be right
to question whether or not this is a good idea. It does not make a
difference with this patch, which only has the aliasing PPGTT (indeed the
PDEs should still be hidden from the shrinker). For the future, we are
allocating from top to bottom to avoid using the precious "gtt
space". The GGTT space at that point should only be used for scanout, HW
contexts, ringbuffers, HWSP, PDEs, and a couple of other small buffers
(potentially) used by the kernel. Everything else should be mapped into
a PPGTT. To put the consumption in more tangible terms, it takes
approximately 4 sets of PDEs to equal one 19x10 framebuffer (with no
fancy stride or alignment constraints). 3/4 of the total [average] GGTT
can be used for PDEs, and hopefully never touch the 1/4 that the
framebuffer needs.
The astute, and persistent observer might ask about the page tables
which are also pinned for the address space. This waste is unfortunate.
We use 2MB of memory per address space. We leave wrapping the PDEs as a
real GEM object as a TODO.
v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)
v3: Use Chris' top down allocator
v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)
v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)
v6: Rebased and removed guard page stuff.
Added a chunk to the commit message
Allow adding a context to mappable region
v7: Undo v3, so we can make the drm patch last in the series
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v4)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
squash: drm/i915: allow PPGTT to use mappable
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-07 05:11:07 +07:00
|
|
|
ppgtt->pd_offset =
|
|
|
|
ppgtt->node.start / PAGE_SIZE * sizeof(gen6_gtt_pte_t);
|
2012-02-09 23:15:46 +07:00
|
|
|
|
2014-02-20 13:05:49 +07:00
|
|
|
ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
|
2012-02-09 23:15:46 +07:00
|
|
|
|
2014-02-20 13:05:49 +07:00
|
|
|
DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
|
|
|
|
ppgtt->node.size >> 20,
|
|
|
|
ppgtt->node.start / PAGE_SIZE);
|
2013-01-25 04:49:56 +07:00
|
|
|
|
2014-02-20 13:05:49 +07:00
|
|
|
return 0;
|
2013-01-25 04:49:56 +07:00
|
|
|
}
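The size and offset arithmetic in gen6_ppgtt_init() can be checked with a small standalone sketch. It is not driver code: the constants below (4 KiB pages, 32-bit PTEs, 512 page directory entries) are assumptions chosen for illustration, since the excerpt names I915_PPGTT_PT_ENTRIES and gen6_gtt_pte_t without defining them, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the excerpt uses these constants but does not define them. */
#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_PTE_SIZE    ((unsigned)sizeof(uint32_t))  /* assumes a 32-bit PTE */
#define SKETCH_PT_ENTRIES  (SKETCH_PAGE_SIZE / SKETCH_PTE_SIZE)
#define SKETCH_PD_ENTRIES  512u

/* Bytes of address space covered: PDEs * PTEs per page table * page size. */
static uint64_t sketch_ppgtt_total(unsigned num_pd_entries)
{
	return (uint64_t)num_pd_entries * SKETCH_PT_ENTRIES * SKETCH_PAGE_SIZE;
}

/*
 * Mirrors "node.start / PAGE_SIZE * sizeof(pte)": the byte offset into an
 * array of PTEs of the entry that maps the page at node_start.
 */
static uint32_t sketch_pd_offset(uint64_t node_start)
{
	assert(node_start % SKETCH_PAGE_SIZE == 0);
	return (uint32_t)(node_start / SKETCH_PAGE_SIZE * SKETCH_PTE_SIZE);
}

/* Memory pinned for the page tables themselves: one page per PDE. */
static uint64_t sketch_pt_memory(unsigned num_pd_entries)
{
	return (uint64_t)num_pd_entries * SKETCH_PAGE_SIZE;
}
```

Under these assumptions, sketch_ppgtt_total(SKETCH_PD_ENTRIES) is 2 GiB of address space and sketch_pt_memory(SKETCH_PD_ENTRIES) is 2 MiB, which agrees with the "We use 2MB of memory per address space" remark in the commit message above.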
|
|
|
|
|
2013-12-07 05:11:14 +07:00
|
|
|
int i915_gem_init_ppgtt(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
|
2013-01-25 04:49:56 +07:00
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2013-12-07 05:11:13 +07:00
|
|
|
int ret = 0;
|
2013-01-25 04:49:56 +07:00
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
ppgtt->base.dev = dev;
|
2014-03-09 02:58:16 +07:00
|
|
|
ppgtt->base.scratch = dev_priv->gtt.base.scratch;
|
2013-01-25 04:49:56 +07:00
|
|
|
|
2013-04-09 08:43:53 +07:00
|
|
|
if (INTEL_INFO(dev)->gen < 8)
|
|
|
|
ret = gen6_ppgtt_init(ppgtt);
|
2013-11-03 11:07:01 +07:00
|
|
|
else if (IS_GEN8(dev))
|
2013-11-05 11:47:32 +07:00
|
|
|
ret = gen8_ppgtt_init(ppgtt, dev_priv->gtt.base.total);
|
2013-04-09 08:43:53 +07:00
|
|
|
else
|
|
|
|
BUG();
|
|
|
|
|
drm/i915: Add VM to context
Pretty straightforward so far except for the bit about the refcounting.
The PPGTT will potentially be shared amongst multiple contexts. Because
contexts themselves have a refcounted lifecycle, the easiest way to
manage this will be to refcount the PPGTT. To achieve this, we piggyback
on the existing context refcount, and will increment and
decrement the PPGTT refcount with context creation and destruction.
To put it more clearly, if context A, and context B both use PPGTT 0, we
can't free the PPGTT until both A, and B are destroyed.
Note that because the PPGTT is permanently pinned (for now), it really
just matters for the PPGTT destruction, as opposed to making space under
memory pressure.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-07 05:11:15 +07:00
|
|
|
if (!ret) {
|
2013-12-07 05:11:26 +07:00
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2013-12-07 05:11:15 +07:00
|
|
|
kref_init(&ppgtt->ref);
|
2013-07-17 06:50:06 +07:00
|
|
|
drm_mm_init(&ppgtt->base.mm, ppgtt->base.start,
|
|
|
|
ppgtt->base.total);
|
2013-12-07 05:11:26 +07:00
|
|
|
i915_init_vm(dev_priv, &ppgtt->base);
|
|
|
|
if (INTEL_INFO(dev)->gen < 8) {
|
2013-12-07 05:11:16 +07:00
|
|
|
gen6_write_pdes(ppgtt);
|
2013-12-07 05:11:26 +07:00
|
|
|
DRM_DEBUG("Adding PPGTT at offset %x\n",
|
|
|
|
ppgtt->pd_offset << 10);
|
|
|
|
}
|
2013-07-17 06:50:06 +07:00
|
|
|
}
|
2012-02-09 23:15:46 +07:00
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
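The refcounting rule from the "Add VM to context" commit message (a PPGTT shared by contexts A and B is freed only after both are destroyed) can be modelled in a few lines. This is a userspace sketch with hypothetical sketch_* names, not the kernel's kref API; it only mimics the behaviour that kref_init() starts the count at one and that the final put releases the object.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted PPGTT. */
struct sketch_ppgtt {
	int refcount;
	bool freed;
};

/* Like kref_init(): the creator holds the first reference. */
static void sketch_ppgtt_init(struct sketch_ppgtt *p)
{
	p->refcount = 1;
	p->freed = false;
}

/* Taken when another context starts using the same PPGTT. */
static void sketch_ppgtt_get(struct sketch_ppgtt *p)
{
	assert(!p->freed);
	p->refcount++;
}

/* Dropped on context destruction; returns true if this put freed the PPGTT. */
static bool sketch_ppgtt_put(struct sketch_ppgtt *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0) {
		p->freed = true;
		return true;
	}
	return false;
}
```

With context A calling sketch_ppgtt_init() and context B calling sketch_ppgtt_get(), the first sketch_ppgtt_put() leaves the PPGTT alive and only the second one frees it.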
|
|
|
|
|
2013-12-07 05:11:26 +07:00
|
|
|
static void
|
drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound BOs
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range; however, it's
not totally easy at this point, since it's still used in a couple of places
that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-07 05:10:56 +07:00
|
|
|
ppgtt_bind_vma(struct i915_vma *vma,
|
|
|
|
enum i915_cache_level cache_level,
|
|
|
|
u32 flags)
|
2012-02-09 23:15:46 +07:00
|
|
|
{
|
2014-06-17 12:29:42 +07:00
|
|
|
/* Currently applicable only to VLV */
|
|
|
|
if (vma->obj->gt_ro)
|
|
|
|
flags |= PTE_READ_ONLY;
|
|
|
|
|
2014-02-21 02:50:33 +07:00
|
|
|
vma->vm->insert_entries(vma->vm, vma->obj->pages, vma->node.start,
|
2014-06-17 12:29:42 +07:00
|
|
|
cache_level, flags);
|
2012-02-09 23:15:46 +07:00
|
|
|
}
|
|
|
|
|
2013-12-07 05:11:26 +07:00
|
|
|
static void ppgtt_unbind_vma(struct i915_vma *vma)
|
2012-02-09 23:15:47 +07:00
|
|
|
{
|
2013-12-07 05:10:56 +07:00
|
|
|
vma->vm->clear_range(vma->vm,
|
2014-02-21 02:50:33 +07:00
|
|
|
vma->node.start,
|
|
|
|
vma->obj->base.size,
|
2013-12-07 05:10:56 +07:00
|
|
|
true);
|
2012-02-09 23:15:47 +07:00
|
|
|
}
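ppgtt_bind_vma() and ppgtt_unbind_vma() are thin wrappers that dispatch through function pointers stored in the address space. The sketch below shows that pattern in isolation. All sketch_* names, the PTE array, the "valid" bit and the read-only flag value are hypothetical stand-ins, not the driver's real types or encodings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PTE_READ_ONLY 0x1u        /* hypothetical flag value */
#define SKETCH_PTE_VALID     0x80000000u /* hypothetical "valid" marker */
#define SKETCH_NUM_PTES      16

/* Address space with per-platform PTE writers behind function pointers. */
struct sketch_vm {
	uint32_t ptes[SKETCH_NUM_PTES];
	void (*insert_entries)(struct sketch_vm *vm, size_t first,
			       size_t count, uint32_t flags);
	void (*clear_range)(struct sketch_vm *vm, size_t first, size_t count);
};

struct sketch_vma {
	struct sketch_vm *vm;
	size_t first_pte;
	size_t num_ptes;
	bool read_only;
};

static void sketch_insert(struct sketch_vm *vm, size_t first,
			  size_t count, uint32_t flags)
{
	assert(first + count <= SKETCH_NUM_PTES);
	for (size_t i = first; i < first + count; i++)
		vm->ptes[i] = SKETCH_PTE_VALID | flags;
}

static void sketch_clear(struct sketch_vm *vm, size_t first, size_t count)
{
	assert(first + count <= SKETCH_NUM_PTES);
	for (size_t i = first; i < first + count; i++)
		vm->ptes[i] = 0;
}

static void sketch_vm_init(struct sketch_vm *vm)
{
	memset(vm->ptes, 0, sizeof(vm->ptes));
	vm->insert_entries = sketch_insert;
	vm->clear_range = sketch_clear;
}

/* Bind: fold the object's read-only state into the flags, then dispatch. */
static void sketch_bind_vma(struct sketch_vma *vma, uint32_t flags)
{
	if (vma->read_only)
		flags |= SKETCH_PTE_READ_ONLY;
	vma->vm->insert_entries(vma->vm, vma->first_pte, vma->num_ptes, flags);
}

/* Unbind: clear exactly the range the VMA occupies. */
static void sketch_unbind_vma(struct sketch_vma *vma)
{
	vma->vm->clear_range(vma->vm, vma->first_pte, vma->num_ptes);
}
```

The callers never need to know which platform backend is installed, which is the property the commit message describes as not having to distinguish PPGTT from GGTT at the call site.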
|
|
|
|
|
2013-01-19 03:30:31 +07:00
|
|
|
extern int intel_iommu_gfx_mapped;
|
|
|
|
/* Certain Gen5 chipsets require idling the GPU before
|
|
|
|
* unmapping anything from the GTT when VT-d is enabled.
|
|
|
|
*/
|
|
|
|
static inline bool needs_idle_maps(struct drm_device *dev)
|
|
|
|
{
|
|
|
|
#ifdef CONFIG_INTEL_IOMMU
|
|
|
|
/* Query intel_iommu to see if we need the workaround. Presumably that
|
|
|
|
* was loaded first.
|
|
|
|
*/
|
|
|
|
if (IS_GEN5(dev) && IS_MOBILE(dev) && intel_iommu_gfx_mapped)
|
|
|
|
return true;
|
|
|
|
#endif
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2011-10-18 05:51:55 +07:00
|
|
|
static bool do_idling(struct drm_i915_private *dev_priv)
|
|
|
|
{
|
|
|
|
bool ret = dev_priv->mm.interruptible;
|
|
|
|
|
2013-01-19 03:30:31 +07:00
|
|
|
if (unlikely(dev_priv->gtt.do_idle_maps)) {
|
2011-10-18 05:51:55 +07:00
|
|
|
dev_priv->mm.interruptible = false;
|
2012-04-27 06:02:58 +07:00
|
|
|
if (i915_gpu_idle(dev_priv->dev)) {
|
2011-10-18 05:51:55 +07:00
|
|
|
DRM_ERROR("Couldn't idle GPU\n");
|
|
|
|
/* Wait a bit, in hopes it avoids the hang */
|
|
|
|
udelay(10);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void undo_idling(struct drm_i915_private *dev_priv, bool interruptible)
|
|
|
|
{
|
2013-01-19 03:30:31 +07:00
|
|
|
if (unlikely(dev_priv->gtt.do_idle_maps))
|
2011-10-18 05:51:55 +07:00
|
|
|
dev_priv->mm.interruptible = interruptible;
|
|
|
|
}
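do_idling() and undo_idling() form a save/restore pair around the interruptible flag. The following standalone sketch models only that flag handling; the sketch_* struct and the idle_calls counter are hypothetical, and the actual GPU idle call is replaced by a counter increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two driver fields involved. */
struct sketch_dev {
	bool interruptible;
	bool do_idle_maps;
	int idle_calls;   /* counts how often the GPU would have been idled */
};

/* Returns the previous interruptible state so the caller can restore it. */
static bool sketch_do_idling(struct sketch_dev *dev)
{
	assert(dev != NULL);
	bool saved = dev->interruptible;

	if (dev->do_idle_maps) {
		dev->interruptible = false;
		dev->idle_calls++;   /* stands in for idling the GPU */
	}
	return saved;
}

/* Restores the saved state, but only if the workaround was active. */
static void sketch_undo_idling(struct sketch_dev *dev, bool saved)
{
	assert(dev != NULL);
	if (dev->do_idle_maps)
		dev->interruptible = saved;
}
```

When do_idle_maps is false, both helpers leave the flag untouched, so callers can use the pair unconditionally.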
|
|
|
|
|
2013-10-16 23:21:30 +07:00
|
|
|
void i915_check_and_clear_faults(struct drm_device *dev)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2014-05-22 20:13:33 +07:00
|
|
|
struct intel_engine_cs *ring;
|
2013-10-16 23:21:30 +07:00
|
|
|
int i;
|
|
|
|
|
|
|
|
if (INTEL_INFO(dev)->gen < 6)
|
|
|
|
return;
|
|
|
|
|
|
|
|
for_each_ring(ring, dev_priv, i) {
|
|
|
|
u32 fault_reg;
|
|
|
|
fault_reg = I915_READ(RING_FAULT_REG(ring));
|
|
|
|
if (fault_reg & RING_FAULT_VALID) {
|
|
|
|
DRM_DEBUG_DRIVER("Unexpected fault\n"
|
|
|
|
"\tAddr: 0x%08lx\n"
|
|
|
|
"\tAddress space: %s\n"
|
|
|
|
"\tSource ID: %d\n"
|
|
|
|
"\tType: %d\n",
|
|
|
|
fault_reg & PAGE_MASK,
|
|
|
|
fault_reg & RING_FAULT_GTTSEL_MASK ? "GGTT" : "PPGTT",
|
|
|
|
RING_FAULT_SRCID(fault_reg),
|
|
|
|
RING_FAULT_FAULT_TYPE(fault_reg));
|
|
|
|
I915_WRITE(RING_FAULT_REG(ring),
|
|
|
|
fault_reg & ~RING_FAULT_VALID);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
POSTING_READ(RING_FAULT_REG(&dev_priv->ring[RCS]));
|
|
|
|
}
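The decode-and-clear logic in i915_check_and_clear_faults() can be illustrated without hardware access. The bit positions below are assumptions made for the sketch (the excerpt uses RING_FAULT_VALID, RING_FAULT_GTTSEL_MASK, RING_FAULT_SRCID and RING_FAULT_FAULT_TYPE but does not show their definitions), and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed register layout, for illustration only. */
#define SKETCH_FAULT_VALID        (1u << 0)
#define SKETCH_FAULT_TYPE(x)      (((x) >> 1) & 0x3u)
#define SKETCH_FAULT_SRCID(x)     (((x) >> 3) & 0xffu)
#define SKETCH_FAULT_GTTSEL_MASK  (1u << 11)
#define SKETCH_PAGE_MASK          (~0xfffu)   /* assumes 4 KiB pages */

struct sketch_fault {
	bool valid;
	uint32_t addr;    /* page-aligned faulting address */
	bool ggtt;        /* true: GGTT, false: PPGTT */
	unsigned srcid;
	unsigned type;
};

/* Split a raw fault register value into its fields. */
static struct sketch_fault sketch_decode_fault(uint32_t reg)
{
	struct sketch_fault f;

	f.valid = (reg & SKETCH_FAULT_VALID) != 0;
	f.addr = reg & SKETCH_PAGE_MASK;
	f.ggtt = (reg & SKETCH_FAULT_GTTSEL_MASK) != 0;
	f.srcid = SKETCH_FAULT_SRCID(reg);
	f.type = SKETCH_FAULT_TYPE(reg);
	return f;
}

/* Value to write back: same contents with only the valid bit cleared. */
static uint32_t sketch_clear_fault(uint32_t reg)
{
	return reg & ~SKETCH_FAULT_VALID;
}
```

Writing back the register with only the valid bit cleared matches the shape of the I915_WRITE() call above, which masks fault_reg with ~RING_FAULT_VALID.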
|
|
|
|
|
|
|
|
void i915_gem_suspend_gtt_mappings(struct drm_device *dev)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
|
|
|
|
/* Don't bother messing with faults pre GEN6 as we have little
|
|
|
|
* documentation supporting that it's a good idea.
|
|
|
|
*/
|
|
|
|
if (INTEL_INFO(dev)->gen < 6)
|
|
|
|
return;
|
|
|
|
|
|
|
|
i915_check_and_clear_faults(dev);
|
|
|
|
|
|
|
|
dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
|
2014-02-21 02:50:33 +07:00
|
|
|
dev_priv->gtt.base.start,
|
|
|
|
dev_priv->gtt.base.total,
|
2014-03-27 02:08:20 +07:00
|
|
|
true);
|
2013-10-16 23:21:30 +07:00
|
|
|
}
|
|
|
|
|
2010-11-06 04:23:30 +07:00
|
|
|
void i915_gem_restore_gtt_mappings(struct drm_device *dev)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2010-11-09 02:18:58 +07:00
|
|
|
struct drm_i915_gem_object *obj;
|
2013-12-07 05:11:17 +07:00
|
|
|
struct i915_address_space *vm;
|
2010-11-06 04:23:30 +07:00
|
|
|
|
2013-10-16 23:21:30 +07:00
|
|
|
i915_check_and_clear_faults(dev);
|
|
|
|
|
2011-01-21 17:54:32 +07:00
|
|
|
/* First fill our portion of the GTT with scratch pages */
|
2013-07-17 06:50:05 +07:00
|
|
|
dev_priv->gtt.base.clear_range(&dev_priv->gtt.base,
|
2014-02-21 02:50:33 +07:00
|
|
|
dev_priv->gtt.base.start,
|
|
|
|
dev_priv->gtt.base.total,
|
2013-10-16 23:21:30 +07:00
|
|
|
true);
|
2011-01-21 17:54:32 +07:00
|
|
|
|
2013-06-01 01:28:48 +07:00
|
|
|
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
|
2013-12-07 05:10:56 +07:00
|
|
|
struct i915_vma *vma = i915_gem_obj_to_vma(obj,
|
|
|
|
&dev_priv->gtt.base);
|
|
|
|
if (!vma)
|
|
|
|
continue;
|
|
|
|
|
2013-08-09 18:26:45 +07:00
|
|
|
i915_gem_clflush_object(obj, obj->pin_display);
|
drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound BOs
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range; however, it's
not totally easy at this point, since it's still used in a couple of places
that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-07 05:10:56 +07:00
		/* The bind_vma code tries to be smart about tracking mappings.
		 * Unfortunately above, we've just wiped out the mappings
		 * without telling our object about it. So we need to fake it.
		 */
		obj->has_global_gtt_mapping = 0;
		vma->bind_vma(vma, obj->cache_level, GLOBAL_BIND);
2010-11-06 04:23:30 +07:00
	}

2013-12-07 05:11:17 +07:00

2014-03-19 06:09:37 +07:00
	if (INTEL_INFO(dev)->gen >= 8) {
2014-04-09 17:28:01 +07:00
		if (IS_CHERRYVIEW(dev))
			chv_setup_private_ppat(dev_priv);
		else
			bdw_setup_private_ppat(dev_priv);

2013-12-07 05:11:17 +07:00
		return;
2014-03-19 06:09:37 +07:00
	}
2013-12-07 05:11:17 +07:00

	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
		/* TODO: Perhaps it shouldn't be gen6 specific */
		if (i915_is_ggtt(vm)) {
			if (dev_priv->mm.aliasing_ppgtt)
				gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
			continue;
		}

		gen6_write_pdes(container_of(vm, struct i915_hw_ppgtt, base));
2010-11-06 04:23:30 +07:00
	}

2012-11-05 00:21:27 +07:00
	i915_gem_chipset_flush(dev);
2010-11-06 04:23:30 +07:00
}
2010-11-06 16:10:47 +07:00

2012-02-16 05:50:21 +07:00
int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
2010-11-06 16:10:47 +07:00
{
2012-06-01 21:20:22 +07:00
	if (obj->has_dma_mapping)
2012-02-16 05:50:21 +07:00
		return 0;
2012-06-01 21:20:22 +07:00

	if (!dma_map_sg(&obj->base.dev->pdev->dev,
			obj->pages->sgl, obj->pages->nents,
			PCI_DMA_BIDIRECTIONAL))
		return -ENOSPC;

	return 0;
2010-11-06 16:10:47 +07:00
}

2013-11-03 11:07:18 +07:00
static inline void gen8_set_pte(void __iomem *addr, gen8_gtt_pte_t pte)
{
#ifdef writeq
	writeq(pte, addr);
#else
	iowrite32((u32)pte, addr);
	iowrite32(pte >> 32, addr + 4);
#endif
}
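
The fallback branch of gen8_set_pte() writes the low 32 bits of the entry first and the high 32 bits at byte offset 4. The following is a minimal user-space sketch of that split, assuming an ordinary byte buffer in place of MMIO; `write32`, `set_pte_split` and `get_pte_split` are hypothetical names for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for iowrite32(): stores a 32-bit value into a byte buffer. */
static void write32(uint32_t val, uint8_t *addr)
{
	memcpy(addr, &val, sizeof(val));
}

/* Write a 64-bit entry as two 32-bit halves: low half at addr, high half at addr + 4. */
static void set_pte_split(uint8_t *addr, uint64_t pte)
{
	assert(addr != NULL);
	write32((uint32_t)pte, addr);
	write32((uint32_t)(pte >> 32), addr + 4);
}

/* Read both halves back and recombine them, so the split can be checked. */
static uint64_t get_pte_split(const uint8_t *addr)
{
	uint32_t lo, hi;

	memcpy(&lo, addr, sizeof(lo));
	memcpy(&hi, addr + 4, sizeof(hi));
	return ((uint64_t)hi << 32) | lo;
}
```

Note that two 32-bit stores are not atomic as a pair; the sketch only shows how the value is divided, not any ordering guarantee the hardware may or may not provide.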

static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
				     struct sg_table *st,
2014-02-21 02:50:33 +07:00
				     uint64_t start,
2014-06-17 12:29:42 +07:00
				     enum i915_cache_level level, u32 unused)
2013-11-03 11:07:18 +07:00
{
	struct drm_i915_private *dev_priv = vm->dev->dev_private;
2014-02-21 02:50:33 +07:00
	unsigned first_entry = start >> PAGE_SHIFT;
2013-11-03 11:07:18 +07:00
	gen8_gtt_pte_t __iomem *gtt_entries =
		(gen8_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
	int i = 0;
	struct sg_page_iter sg_iter;
2014-07-28 18:20:58 +07:00
	dma_addr_t addr = 0; /* shut up gcc */
2013-11-03 11:07:18 +07:00

	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
		addr = sg_dma_address(sg_iter.sg) +
			(sg_iter.sg_pgoffset << PAGE_SHIFT);
		gen8_set_pte(&gtt_entries[i],
			     gen8_pte_encode(addr, level, true));
		i++;
	}

	/*
	 * XXX: This serves as a posting read to make sure that the PTE has
	 * actually been updated. There is some concern that even though
	 * registers and PTEs are within the same BAR that they are potentially
	 * of NUMA access patterns. Therefore, even with the way we assume
	 * hardware should work, we must keep this posting read for paranoia.
	 */
	if (i != 0)
		WARN_ON(readq(&gtt_entries[i-1])
			!= gen8_pte_encode(addr, level, true));

	/* This next bit makes the above posting read even more important. We
	 * want to flush the TLBs only after we're certain all the PTE updates
	 * have finished.
	 */
	I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
	POSTING_READ(GFX_FLSH_CNTL_GEN6);
}

2012-11-05 00:21:27 +07:00
/*
 * Binds an object into the global gtt with the specified cache level. The object
 * will be accessible to the GPU via commands whose operands reference offsets
 * within the global GTT as well as accessible by the GPU through the GMADR
 * mapped BAR (dev_priv->mm.gtt->gtt).
 */
2013-07-17 06:50:05 +07:00
static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
2013-01-25 05:44:55 +07:00
				     struct sg_table *st,
2014-02-21 02:50:33 +07:00
				     uint64_t start,
2014-06-17 12:29:42 +07:00
				     enum i915_cache_level level, u32 flags)
2012-11-05 00:21:27 +07:00
{
2013-07-17 06:50:05 +07:00
	struct drm_i915_private *dev_priv = vm->dev->dev_private;
2014-02-21 02:50:33 +07:00
	unsigned first_entry = start >> PAGE_SHIFT;
2013-04-09 08:43:48 +07:00
	gen6_gtt_pte_t __iomem *gtt_entries =
		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
2013-02-19 00:28:04 +07:00
	int i = 0;
	struct sg_page_iter sg_iter;
2014-07-28 18:20:58 +07:00
	dma_addr_t addr = 0;
2012-11-05 00:21:27 +07:00

2013-02-19 00:28:04 +07:00
	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
2013-03-26 20:14:18 +07:00
		addr = sg_page_iter_dma_address(&sg_iter);
2014-06-17 12:29:42 +07:00
		iowrite32(vm->pte_encode(addr, level, true, flags), &gtt_entries[i]);
2013-02-19 00:28:04 +07:00
		i++;
2012-11-05 00:21:27 +07:00
	}

	/* XXX: This serves as a posting read to make sure that the PTE has
	 * actually been updated. There is some concern that even though
	 * registers and PTEs are within the same BAR that they are potentially
	 * of NUMA access patterns. Therefore, even with the way we assume
	 * hardware should work, we must keep this posting read for paranoia.
	 */
2014-07-28 18:20:58 +07:00
	if (i != 0) {
		unsigned long gtt = readl(&gtt_entries[i-1]);
		WARN_ON(gtt != vm->pte_encode(addr, level, true, flags));
	}
2012-11-05 00:21:30 +07:00

	/* This next bit makes the above posting read even more important. We
	 * want to flush the TLBs only after we're certain all the PTE updates
	 * have finished.
	 */
	I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
	POSTING_READ(GFX_FLSH_CNTL_GEN6);
2012-11-05 00:21:27 +07:00
}
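
Both insert_entries variants follow the same pattern: write one encoded entry per page, then read the last entry back and compare it with the expected encoding before flushing. The sketch below models that pattern in user space, assuming a plain array in place of the GTT; `encode_pte` and `insert_and_verify` are hypothetical helpers and the bit layout is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder: page-aligned address plus a valid bit in bit 0. */
static uint32_t encode_pte(uint32_t addr, int valid)
{
	return (addr & ~0xfffu) | (valid ? 1u : 0u);
}

/*
 * Fill table[0..n) from the given page addresses, then read the last entry
 * back and compare it with the expected encoding. Returns 1 if the read-back
 * matches (or nothing was written), 0 otherwise.
 */
static int insert_and_verify(volatile uint32_t *table,
			     const uint32_t *pages, size_t n)
{
	uint32_t addr = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		addr = pages[i];
		table[i] = encode_pte(addr, 1);
	}

	if (i != 0)
		return table[i - 1] == encode_pte(addr, 1);
	return 1;
}
```

In ordinary memory the read-back is only a consistency check. On real hardware the read is what forces earlier posted writes to complete, which is the property the comments above rely on.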

2013-11-03 11:07:18 +07:00
static void gen8_ggtt_clear_range(struct i915_address_space *vm,
2014-02-21 02:50:33 +07:00
				  uint64_t start,
				  uint64_t length,
2013-11-03 11:07:18 +07:00
				  bool use_scratch)
{
	struct drm_i915_private *dev_priv = vm->dev->dev_private;
2014-02-21 02:50:33 +07:00
	unsigned first_entry = start >> PAGE_SHIFT;
	unsigned num_entries = length >> PAGE_SHIFT;
2013-11-03 11:07:18 +07:00
	gen8_gtt_pte_t scratch_pte, __iomem *gtt_base =
		(gen8_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
	int i;

	if (WARN(num_entries > max_entries,
		 "First entry = %d; Num entries = %d (max=%d)\n",
		 first_entry, num_entries, max_entries))
		num_entries = max_entries;

	scratch_pte = gen8_pte_encode(vm->scratch.addr,
				      I915_CACHE_LLC,
				      use_scratch);
	for (i = 0; i < num_entries; i++)
		gen8_set_pte(&gtt_base[i], scratch_pte);
	readl(gtt_base);
}

2013-07-17 06:50:05 +07:00
static void gen6_ggtt_clear_range(struct i915_address_space *vm,
2014-02-21 02:50:33 +07:00
				  uint64_t start,
				  uint64_t length,
2013-10-16 23:21:30 +07:00
				  bool use_scratch)
2013-01-25 05:44:55 +07:00
{
2013-07-17 06:50:05 +07:00
	struct drm_i915_private *dev_priv = vm->dev->dev_private;
2014-02-21 02:50:33 +07:00
	unsigned first_entry = start >> PAGE_SHIFT;
	unsigned num_entries = length >> PAGE_SHIFT;
2013-04-09 08:43:48 +07:00
	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
2013-01-25 05:45:00 +07:00
	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
2013-01-25 05:44:55 +07:00
	int i;

	if (WARN(num_entries > max_entries,
		 "First entry = %d; Num entries = %d (max=%d)\n",
		 first_entry, num_entries, max_entries))
		num_entries = max_entries;

2014-06-17 12:29:42 +07:00
	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC, use_scratch, 0);
2013-10-16 23:21:30 +07:00

2013-01-25 05:44:55 +07:00
	for (i = 0; i < num_entries; i++)
		iowrite32(scratch_pte, &gtt_base[i]);
	readl(gtt_base);
}

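
The clear_range functions above convert a byte range into page-table entry indices and clamp the count so it cannot run past the end of the table. A minimal sketch of that arithmetic follows, assuming 4 KiB pages; `SKETCH_PAGE_SHIFT` and `clamp_clear_count` are hypothetical names, and the early return for a start beyond the table is an addition of this sketch rather than something the driver code does.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/*
 * Convert a byte range into a count of page-sized entries, clamped so that
 * first_entry + count never exceeds total_entries.
 */
static unsigned clamp_clear_count(uint64_t start, uint64_t length,
				  unsigned total_entries)
{
	unsigned first_entry = (unsigned)(start >> SKETCH_PAGE_SHIFT);
	unsigned num_entries = (unsigned)(length >> SKETCH_PAGE_SHIFT);
	unsigned max_entries;

	/* Sketch-only guard: a start at or past the end clears nothing. */
	if (first_entry >= total_entries)
		return 0;

	max_entries = total_entries - first_entry;
	return num_entries > max_entries ? max_entries : num_entries;
}
```

With a 16-entry table, a request starting at entry 14 for four pages is trimmed to two entries, which mirrors the `num_entries = max_entries` fallback in the functions above.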
drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound BOs
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range; however, it's
not totally easy at this point, since it's still used in a couple of places
that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-07 05:10:56 +07:00

static void i915_ggtt_bind_vma(struct i915_vma *vma,
			       enum i915_cache_level cache_level,
			       u32 unused)
2013-01-25 05:44:55 +07:00
{
2013-12-07 05:10:56 +07:00
	const unsigned long entry = vma->node.start >> PAGE_SHIFT;
2013-01-25 05:44:55 +07:00
	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;

2013-12-07 05:10:56 +07:00
	BUG_ON(!i915_is_ggtt(vma->vm));
	intel_gtt_insert_sg_entries(vma->obj->pages, entry, flags);
	vma->obj->has_global_gtt_mapping = 1;
2013-01-25 05:44:55 +07:00
}
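
The commit notes for this function describe hanging bind/unbind function pointers off the VMA so that callers do not need to distinguish GGTT from PPGTT. A minimal user-space sketch of that dispatch idea follows; every type and function here (`sketch_vma`, `sketch_vma_init`, and so on) is hypothetical and far simpler than the driver's real structures.

```c
#include <assert.h>

enum sketch_space { SKETCH_GGTT, SKETCH_PPGTT };

struct sketch_vma {
	enum sketch_space space;  /* address space this VMA lives in */
	int bound;                /* 1 while a mapping is installed */
	int global_writes;        /* counts GGTT-style updates */
	int private_writes;       /* counts PPGTT-style updates */
	void (*bind_vma)(struct sketch_vma *vma);
	void (*unbind_vma)(struct sketch_vma *vma);
};

static void sketch_ggtt_bind(struct sketch_vma *vma)
{
	vma->global_writes++;
	vma->bound = 1;
}

static void sketch_ppgtt_bind(struct sketch_vma *vma)
{
	vma->private_writes++;
	vma->bound = 1;
}

static void sketch_unbind(struct sketch_vma *vma)
{
	vma->bound = 0;
}

/* Install the hooks once, at creation; callers then just call vma->bind_vma(). */
static void sketch_vma_init(struct sketch_vma *vma, enum sketch_space space)
{
	vma->space = space;
	vma->bound = 0;
	vma->global_writes = 0;
	vma->private_writes = 0;
	vma->bind_vma = (space == SKETCH_GGTT) ? sketch_ggtt_bind
					       : sketch_ppgtt_bind;
	vma->unbind_vma = sketch_unbind;
}
```

The design point is that the choice between the two update paths is made once when the hooks are installed, so call sites stay identical for both address-space types.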

2013-07-17 06:50:05 +07:00
static void i915_ggtt_clear_range(struct i915_address_space *vm,
2014-02-21 02:50:33 +07:00
				  uint64_t start,
				  uint64_t length,
2013-10-16 23:21:30 +07:00
				  bool unused)
2013-01-25 05:44:55 +07:00
{
2014-02-21 02:50:33 +07:00
	unsigned first_entry = start >> PAGE_SHIFT;
	unsigned num_entries = length >> PAGE_SHIFT;
2013-01-25 05:44:55 +07:00
	intel_gtt_clear_range(first_entry, num_entries);
}

2013-12-07 05:10:56 +07:00
static void i915_ggtt_unbind_vma(struct i915_vma *vma)
{
	const unsigned int first = vma->node.start >> PAGE_SHIFT;
	const unsigned int size = vma->obj->base.size >> PAGE_SHIFT;
2013-01-25 05:44:55 +07:00

drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound BOs
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range; however, that
is not easy at this point, since it is still used in a couple of places
that don't deal only in objects: setup, ppgtt init, and restore gtt
mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-07 05:10:56 +07:00
	BUG_ON(!i915_is_ggtt(vma->vm));
	vma->obj->has_global_gtt_mapping = 0;
	intel_gtt_clear_range(first, size);
}
2013-01-25 05:44:55 +07:00

drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps distinguish legacy binding from
modern binding. To keep the code churn as minimal as possible, I am
leaving in insert_entries(). It basically serves as the per-platform PTE
writer. bind_vma and insert_entries do share a lot of similarities, and
I did have designs to combine the two, but as mentioned already... too
much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound BOs
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range; however, that
is not easy at this point, since it is still used in a couple of places
that don't deal only in objects: setup, ppgtt init, and restore gtt
mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
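The commit message above describes routing binds through a function pointer on the VMA, with the GGTT implementation choosing between global PTEs and the aliasing PPGTT. A minimal user-space sketch of that decision follows. Every `sketch_`-prefixed name, the `SKETCH_GLOBAL_BIND` value, and the write counters are hypothetical stand-ins for illustration, not the driver's definitions. Setting `has_aliasing_ppgtt_mapping` after the write is also an assumption, since that part of the function is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value; the driver's GLOBAL_BIND definition is not shown. */
#define SKETCH_GLOBAL_BIND (1u << 0)

enum sketch_cache_level { SKETCH_CACHE_NONE, SKETCH_CACHE_LLC };

/* Stand-in for the object state the bind function inspects. */
struct sketch_obj {
	enum sketch_cache_level cache_level;
	bool has_global_gtt_mapping;
	bool has_aliasing_ppgtt_mapping;
};

/* Stand-in for the address space; counters replace real PTE writes. */
struct sketch_vm {
	bool has_aliasing_ppgtt;
	unsigned int global_writes;
	unsigned int aliasing_writes;
};

struct sketch_vma {
	struct sketch_vm *vm;
	struct sketch_obj *obj;
	/* Callers go through this pointer instead of testing PPGTT vs GGTT. */
	void (*bind_vma)(struct sketch_vma *vma,
			 enum sketch_cache_level cache_level,
			 uint32_t flags);
};

static void sketch_ggtt_bind_vma(struct sketch_vma *vma,
				 enum sketch_cache_level cache_level,
				 uint32_t flags)
{
	struct sketch_vm *vm = vma->vm;
	struct sketch_obj *obj = vma->obj;

	/* Global PTEs: no aliasing PPGTT, or the caller asked for them. */
	if (!vm->has_aliasing_ppgtt || (flags & SKETCH_GLOBAL_BIND)) {
		if (!obj->has_global_gtt_mapping ||
		    cache_level != obj->cache_level) {
			vm->global_writes++;
			obj->has_global_gtt_mapping = true;
		}
	}

	/* Aliasing PPGTT PTEs: written when missing or when caching changed. */
	if (vm->has_aliasing_ppgtt &&
	    (!obj->has_aliasing_ppgtt_mapping ||
	     cache_level != obj->cache_level)) {
		vm->aliasing_writes++;
		obj->has_aliasing_ppgtt_mapping = true; /* assumed bookkeeping */
	}
}
```

A caller would set `vma.bind_vma = sketch_ggtt_bind_vma` once and then call `vma.bind_vma(&vma, level, flags)`. In this sketch, a bind without `SKETCH_GLOBAL_BIND` on a VM that has an aliasing PPGTT touches only the aliasing counter.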
2013-12-07 05:10:56 +07:00
static void ggtt_bind_vma(struct i915_vma *vma,
			  enum i915_cache_level cache_level,
			  u32 flags)
2011-04-14 12:48:26 +07:00
{
2013-12-07 05:10:56 +07:00
	struct drm_device *dev = vma->vm->dev;
2013-01-25 05:44:55 +07:00
	struct drm_i915_private *dev_priv = dev->dev_private;
2013-12-07 05:10:56 +07:00
	struct drm_i915_gem_object *obj = vma->obj;
2013-01-25 05:44:55 +07:00

2014-06-17 12:29:42 +07:00
	/* Currently applicable only to VLV */
	if (obj->gt_ro)
		flags |= PTE_READ_ONLY;

2013-12-07 05:10:56 +07:00
	/* If there is no aliasing PPGTT, or the caller needs a global mapping,
	 * or we have a global mapping already but the cacheability flags have
	 * changed, set the global PTEs.
	 *
	 * If there is an aliasing PPGTT it is anecdotally faster, so use that
	 * instead if none of the above hold true.
	 *
	 * NB: A global mapping should only be needed for special regions like
	 * "gtt mappable", SNB errata, or if specified via special execbuf
	 * flags. At all other times, the GPU will use the aliasing PPGTT.
	 */
	if (!dev_priv->mm.aliasing_ppgtt || flags & GLOBAL_BIND) {
		if (!obj->has_global_gtt_mapping ||
		    (cache_level != obj->cache_level)) {
2014-02-21 02:50:33 +07:00
			vma->vm->insert_entries(vma->vm, obj->pages,
						vma->node.start,
2014-06-17 12:29:42 +07:00
						cache_level, flags);
2013-12-07 05:10:56 +07:00
			obj->has_global_gtt_mapping = 1;
		}
	}
2011-04-14 12:48:26 +07:00

2013-12-07 05:10:56 +07:00
	if (dev_priv->mm.aliasing_ppgtt &&
	    (!obj->has_aliasing_ppgtt_mapping ||
	     (cache_level != obj->cache_level))) {
		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
		appgtt->base.insert_entries(&appgtt->base,
2014-02-21 02:50:33 +07:00
					    vma->obj->pages,
					    vma->node.start,
2014-06-17 12:29:42 +07:00
					    cache_level, flags);
drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound BOs
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want, to remove clear_range, however it's
not totally easy at this point. Since it's used in a couple of places
still that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
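The dispatch idea described in the commit message above (bind/unbind function pointers stored on the VMA, so callers need not distinguish PPGTT from GGTT) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `toy_vma`, `toy_vma_init`, `toy_pin` and the `toy_*_bind`/`toy_*_unbind` helpers are hypothetical names, not the i915 structures or functions, and the real bind path also handles cache levels and the aliasing PPGTT.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a VMA; not the i915 structure. */
struct toy_vma {
	void (*bind_vma)(struct toy_vma *vma, unsigned int flags);
	void (*unbind_vma)(struct toy_vma *vma);
	bool has_global_mapping;
	bool has_ppgtt_mapping;
};

/* Global-GTT flavour. The flags argument only mirrors the shape of the
 * interface and is unused in this toy. */
static void toy_ggtt_bind(struct toy_vma *vma, unsigned int flags)
{
	(void)flags;
	vma->has_global_mapping = true;
}

static void toy_ggtt_unbind(struct toy_vma *vma)
{
	vma->has_global_mapping = false;
}

/* Per-process GTT flavour. */
static void toy_ppgtt_bind(struct toy_vma *vma, unsigned int flags)
{
	(void)flags;
	vma->has_ppgtt_mapping = true;
}

static void toy_ppgtt_unbind(struct toy_vma *vma)
{
	vma->has_ppgtt_mapping = false;
}

/* The flavour is chosen once, when the VMA is created. */
static void toy_vma_init(struct toy_vma *vma, bool is_ggtt)
{
	vma->has_global_mapping = false;
	vma->has_ppgtt_mapping = false;
	vma->bind_vma = is_ggtt ? toy_ggtt_bind : toy_ppgtt_bind;
	vma->unbind_vma = is_ggtt ? toy_ggtt_unbind : toy_ppgtt_unbind;
}

/* Callers go through the pointer and never check the address space type. */
static void toy_pin(struct toy_vma *vma, unsigned int flags)
{
	assert(vma->bind_vma);
	vma->bind_vma(vma, flags);
}
```

The design point is that the PPGTT-versus-GGTT decision is made once at initialisation; every later call site is a plain indirect call.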
2013-12-07 05:10:56 +07:00
		vma->obj->has_aliasing_ppgtt_mapping = 1;
	}
2011-04-14 12:48:26 +07:00
}

2013-12-07 05:10:56 +07:00
static void ggtt_unbind_vma(struct i915_vma *vma)
2012-02-16 05:50:21 +07:00
{
2013-12-07 05:10:56 +07:00
	struct drm_device *dev = vma->vm->dev;
2013-01-25 05:44:55 +07:00
	struct drm_i915_private *dev_priv = dev->dev_private;
2013-12-07 05:10:56 +07:00
	struct drm_i915_gem_object *obj = vma->obj;

	if (obj->has_global_gtt_mapping) {
2014-02-21 02:50:33 +07:00
		vma->vm->clear_range(vma->vm,
				     vma->node.start,
				     obj->base.size,
2013-12-07 05:10:56 +07:00
				     true);
		obj->has_global_gtt_mapping = 0;
	}
2012-02-16 05:50:22 +07:00

2013-12-07 05:10:56 +07:00
	if (obj->has_aliasing_ppgtt_mapping) {
		struct i915_hw_ppgtt *appgtt = dev_priv->mm.aliasing_ppgtt;
		appgtt->base.clear_range(&appgtt->base,
2014-02-21 02:50:33 +07:00
					 vma->node.start,
					 obj->base.size,
2013-12-07 05:10:56 +07:00
					 true);
		obj->has_aliasing_ppgtt_mapping = 0;
	}
2012-02-16 05:50:21 +07:00
}
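The unbind logic that ends here follows a simple pattern: for each address space the object is mapped in, clear the range and then reset the matching flag, so that a second unbind is a no-op. Below is a minimal userspace sketch of that pattern. All names (`toy_vm`, `toy_obj`, `toy_clear_range`, `toy_unbind`, `TOY_SCRATCH_PTE`) are hypothetical, and it assumes that the final `true` argument to `clear_range` asks for the entries to be pointed at a scratch page rather than left stale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	4096u
#define TOY_NUM_PTES	16u
#define TOY_SCRATCH_PTE	0xdeadu	/* hypothetical scratch-page entry */

/* A toy address space: just a flat array of page-table entries. */
struct toy_vm {
	uint32_t pte[TOY_NUM_PTES];
};

struct toy_obj {
	size_t size;	/* bytes, a multiple of TOY_PAGE_SIZE */
	bool has_global_gtt_mapping;
	bool has_aliasing_ppgtt_mapping;
};

/* Point every entry covering [start, start + length) at the scratch page. */
static void toy_clear_range(struct toy_vm *vm, size_t start, size_t length)
{
	size_t first = start / TOY_PAGE_SIZE;
	size_t count = length / TOY_PAGE_SIZE;
	size_t i;

	assert(first + count <= TOY_NUM_PTES);
	for (i = 0; i < count; i++)
		vm->pte[first + i] = TOY_SCRATCH_PTE;
}

/* Clear only the mappings that exist, and reset each flag afterwards. */
static void toy_unbind(struct toy_vm *ggtt, struct toy_vm *appgtt,
		       struct toy_obj *obj, size_t start)
{
	if (obj->has_global_gtt_mapping) {
		toy_clear_range(ggtt, start, obj->size);
		obj->has_global_gtt_mapping = false;
	}

	if (obj->has_aliasing_ppgtt_mapping) {
		toy_clear_range(appgtt, start, obj->size);
		obj->has_aliasing_ppgtt_mapping = false;
	}
}
```

Keeping one flag per address space is what lets a single unbind routine serve objects that are mapped globally, through the aliasing PPGTT, or both.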

void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
2010-11-06 16:10:47 +07:00
{
2011-10-18 05:51:55 +07:00
	struct drm_device *dev = obj->base.dev;
	struct drm_i915_private *dev_priv = dev->dev_private;
	bool interruptible;

	interruptible = do_idling(dev_priv);

2012-06-01 21:20:22 +07:00
	if (!obj->has_dma_mapping)
		dma_unmap_sg(&dev->pdev->dev,
			     obj->pages->sgl, obj->pages->nents,
			     PCI_DMA_BIDIRECTIONAL);
2011-10-18 05:51:55 +07:00

	undo_idling(dev_priv, interruptible);
2010-11-06 16:10:47 +07:00
}
2012-03-26 14:45:40 +07:00

2012-07-26 17:49:32 +07:00
|
|
|
static void i915_gtt_color_adjust(struct drm_mm_node *node,
|
|
|
|
unsigned long color,
|
|
|
|
unsigned long *start,
|
|
|
|
unsigned long *end)
|
|
|
|
{
|
|
|
|
if (node->color != color)
|
|
|
|
*start += 4096;
|
|
|
|
|
|
|
|
if (!list_empty(&node->node_list)) {
|
|
|
|
node = list_entry(node->node_list.next,
|
|
|
|
struct drm_mm_node,
|
|
|
|
node_list);
|
|
|
|
if (node->allocated && node->color != color)
|
|
|
|
*end -= 4096;
|
|
|
|
}
|
|
|
|
}
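The color adjustment above can be tried outside the kernel. What follows is a minimal sketch, not the driver's code: `struct hole`, `color_adjust()` and `GUARD` are hypothetical names, and the neighbours' colors are passed in directly instead of being read from the `drm_mm` node list. It shows only the rule itself: a hole loses one page on each side where the neighbouring node has a different color (cache level).

```c
#include <stdbool.h>

#define GUARD 4096UL /* one page, matching the constant used above */

/* Hypothetical stand-in for a free range between two allocated nodes. */
struct hole {
	unsigned long start;
	unsigned long end;
};

/*
 * Shrink the hole by one page on every side whose neighbour has a
 * different color from the object that is about to be placed.
 */
static struct hole color_adjust(struct hole h, unsigned long color,
				unsigned long prev_color,
				bool next_allocated, unsigned long next_color)
{
	if (prev_color != color)
		h.start += GUARD;
	if (next_allocated && next_color != color)
		h.end -= GUARD;
	return h;
}
```

With both neighbours differing in color, a hole of [0x10000, 0x20000) shrinks to [0x11000, 0x1f000); with matching neighbours it is left untouched.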
|
2013-11-05 10:56:49 +07:00
|
|
|
|
2012-12-19 01:31:25 +07:00
|
|
|
void i915_gem_setup_global_gtt(struct drm_device *dev,
|
|
|
|
unsigned long start,
|
|
|
|
unsigned long mappable_end,
|
|
|
|
unsigned long end)
|
2012-03-26 14:45:40 +07:00
|
|
|
{
|
2013-01-26 07:41:04 +07:00
|
|
|
/* Let GEM Manage all of the aperture.
|
|
|
|
*
|
|
|
|
* However, leave one page at the end still bound to the scratch page.
|
|
|
|
* There are a number of places where the hardware apparently prefetches
|
|
|
|
* past the end of the object, and we've seen multiple hangs with the
|
|
|
|
* GPU head pointer stuck in a batchbuffer bound at the last page of the
|
|
|
|
* aperture. One page should be enough to keep any prefetching inside
|
|
|
|
* of the aperture.
|
|
|
|
*/
|
2013-08-01 06:59:59 +07:00
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
struct i915_address_space *ggtt_vm = &dev_priv->gtt.base;
|
2012-11-15 18:32:19 +07:00
|
|
|
struct drm_mm_node *entry;
|
|
|
|
struct drm_i915_gem_object *obj;
|
|
|
|
unsigned long hole_start, hole_end;
|
2012-03-26 14:45:40 +07:00
|
|
|
|
2013-01-18 03:45:13 +07:00
|
|
|
BUG_ON(mappable_end > end);
|
|
|
|
|
2012-11-15 18:32:19 +07:00
|
|
|
/* Subtract the guard page ... */
|
2013-08-01 06:59:59 +07:00
|
|
|
drm_mm_init(&ggtt_vm->mm, start, end - start - PAGE_SIZE);
|
2012-07-26 17:49:32 +07:00
|
|
|
if (!HAS_LLC(dev))
|
2013-07-17 06:50:06 +07:00
|
|
|
dev_priv->gtt.base.mm.color_adjust = i915_gtt_color_adjust;
|
2012-03-26 14:45:40 +07:00
|
|
|
|
2012-11-15 18:32:19 +07:00
|
|
|
/* Mark any preallocated objects as occupied */
|
2013-06-01 01:28:48 +07:00
|
|
|
list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
|
2013-08-01 06:59:59 +07:00
|
|
|
struct i915_vma *vma = i915_gem_obj_to_vma(obj, ggtt_vm);
|
2013-07-06 04:41:02 +07:00
|
|
|
int ret;
|
2013-07-06 04:41:05 +07:00
|
|
|
DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
|
2013-07-06 04:41:06 +07:00
|
|
|
i915_gem_obj_ggtt_offset(obj), obj->base.size);
|
|
|
|
|
|
|
|
WARN_ON(i915_gem_obj_ggtt_bound(obj));
|
2013-08-01 06:59:59 +07:00
|
|
|
ret = drm_mm_reserve_node(&ggtt_vm->mm, &vma->node);
|
2013-07-06 04:41:06 +07:00
|
|
|
if (ret)
|
2013-07-06 04:41:02 +07:00
|
|
|
DRM_DEBUG_KMS("Reservation failed\n");
|
2012-11-15 18:32:19 +07:00
|
|
|
obj->has_global_gtt_mapping = 1;
|
|
|
|
}
|
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
dev_priv->gtt.base.start = start;
|
|
|
|
dev_priv->gtt.base.total = end - start;
|
2012-03-26 14:45:40 +07:00
|
|
|
|
2012-11-15 18:32:19 +07:00
|
|
|
/* Clear any non-preallocated blocks */
|
2013-08-01 06:59:59 +07:00
|
|
|
drm_mm_for_each_hole(entry, &ggtt_vm->mm, hole_start, hole_end) {
|
2012-11-15 18:32:19 +07:00
|
|
|
DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
|
|
|
|
hole_start, hole_end);
|
2014-02-21 02:50:33 +07:00
|
|
|
ggtt_vm->clear_range(ggtt_vm, hole_start,
|
|
|
|
hole_end - hole_start, true);
|
2012-11-15 18:32:19 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/* And finally clear the reserved guard page */
|
2014-02-21 02:50:33 +07:00
|
|
|
ggtt_vm->clear_range(ggtt_vm, end - PAGE_SIZE, PAGE_SIZE, true);
|
2012-11-05 00:21:27 +07:00
|
|
|
}
|
|
|
|
|
2012-12-19 01:31:25 +07:00
|
|
|
void i915_gem_init_global_gtt(struct drm_device *dev)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
unsigned long gtt_size, mappable_size;
|
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
gtt_size = dev_priv->gtt.base.total;
|
2013-01-18 03:45:17 +07:00
|
|
|
mappable_size = dev_priv->gtt.mappable_end;
|
2012-12-19 01:31:25 +07:00
|
|
|
|
2013-01-26 07:41:04 +07:00
|
|
|
i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
|
2012-11-05 00:21:27 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static int setup_scratch_page(struct drm_device *dev)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
struct page *page;
|
|
|
|
dma_addr_t dma_addr;
|
|
|
|
|
|
|
|
page = alloc_page(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO);
|
|
|
|
if (page == NULL)
|
|
|
|
return -ENOMEM;
|
|
|
|
get_page(page);
|
|
|
|
set_pages_uc(page, 1);
|
|
|
|
|
|
|
|
#ifdef CONFIG_INTEL_IOMMU
|
|
|
|
dma_addr = pci_map_page(dev->pdev, page, 0, PAGE_SIZE,
|
|
|
|
PCI_DMA_BIDIRECTIONAL);
|
|
|
|
if (pci_dma_mapping_error(dev->pdev, dma_addr))
|
|
|
|
return -EINVAL;
|
|
|
|
#else
|
|
|
|
dma_addr = page_to_phys(page);
|
|
|
|
#endif
|
2013-07-17 06:50:05 +07:00
|
|
|
dev_priv->gtt.base.scratch.page = page;
|
|
|
|
dev_priv->gtt.base.scratch.addr = dma_addr;
|
2012-11-05 00:21:27 +07:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void teardown_scratch_page(struct drm_device *dev)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2013-07-17 06:50:05 +07:00
|
|
|
struct page *page = dev_priv->gtt.base.scratch.page;
|
|
|
|
|
|
|
|
set_pages_wb(page, 1);
|
|
|
|
pci_unmap_page(dev->pdev, dev_priv->gtt.base.scratch.addr,
|
2012-11-05 00:21:27 +07:00
|
|
|
PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
|
2013-07-17 06:50:05 +07:00
|
|
|
put_page(page);
|
|
|
|
__free_page(page);
|
2012-11-05 00:21:27 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
|
|
|
|
{
|
|
|
|
snb_gmch_ctl >>= SNB_GMCH_GGMS_SHIFT;
|
|
|
|
snb_gmch_ctl &= SNB_GMCH_GGMS_MASK;
|
|
|
|
return snb_gmch_ctl << 20;
|
|
|
|
}
|
|
|
|
|
2013-11-04 07:53:55 +07:00
|
|
|
static inline unsigned int gen8_get_total_gtt_size(u16 bdw_gmch_ctl)
|
|
|
|
{
|
|
|
|
bdw_gmch_ctl >>= BDW_GMCH_GGMS_SHIFT;
|
|
|
|
bdw_gmch_ctl &= BDW_GMCH_GGMS_MASK;
|
|
|
|
if (bdw_gmch_ctl)
|
|
|
|
bdw_gmch_ctl = 1 << bdw_gmch_ctl;
|
2014-05-28 06:53:08 +07:00
|
|
|
|
|
|
|
#ifdef CONFIG_X86_32
|
|
|
|
/* Limit 32b platforms to a 2GB GGTT: 4 << 20 / pte size * PAGE_SIZE */
|
|
|
|
if (bdw_gmch_ctl > 4)
|
|
|
|
bdw_gmch_ctl = 4;
|
|
|
|
#endif
|
|
|
|
|
2013-11-04 07:53:55 +07:00
|
|
|
return bdw_gmch_ctl << 20;
|
|
|
|
}
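The size arithmetic behind the "2GB GGTT" comment above can be checked with a small standalone sketch. The helper names are hypothetical, the GGMS field is assumed to be already shifted and masked, and the sketch assumes 4 KiB pages plus an 8-byte gen8 PTE and a 4-byte gen6 PTE (the widths implied by the `sizeof()` divisions in the probe functions).

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Bytes of PTE storage for an already extracted gen8 GGMS field value. */
static uint64_t gen8_pte_table_bytes(unsigned int ggms)
{
	uint64_t mb = ggms ? (1ULL << ggms) : 0; /* 0 means no GGTT */

	return mb << 20;
}

/* Address space covered by a PTE table: each PTE maps one page. */
static uint64_t ggtt_span_bytes(uint64_t table_bytes, unsigned int pte_size)
{
	return (table_bytes / pte_size) << SKETCH_PAGE_SHIFT;
}
```

Under these assumptions a field value of 2 gives a 4 MB table, which holds 512K eight-byte PTEs and therefore spans 2 GB; that is the cap applied on 32-bit builds.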
|
|
|
|
|
2014-05-09 02:19:40 +07:00
|
|
|
static inline unsigned int chv_get_total_gtt_size(u16 gmch_ctrl)
|
|
|
|
{
|
|
|
|
gmch_ctrl >>= SNB_GMCH_GGMS_SHIFT;
|
|
|
|
gmch_ctrl &= SNB_GMCH_GGMS_MASK;
|
|
|
|
|
|
|
|
if (gmch_ctrl)
|
|
|
|
return 1 << (20 + gmch_ctrl);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-01-25 04:49:57 +07:00
|
|
|
static inline size_t gen6_get_stolen_size(u16 snb_gmch_ctl)
|
2012-11-05 00:21:27 +07:00
|
|
|
{
|
|
|
|
snb_gmch_ctl >>= SNB_GMCH_GMS_SHIFT;
|
|
|
|
snb_gmch_ctl &= SNB_GMCH_GMS_MASK;
|
|
|
|
return snb_gmch_ctl << 25; /* 32 MB units */
|
|
|
|
}
|
|
|
|
|
2013-11-04 07:53:55 +07:00
|
|
|
static inline size_t gen8_get_stolen_size(u16 bdw_gmch_ctl)
|
|
|
|
{
|
|
|
|
bdw_gmch_ctl >>= BDW_GMCH_GMS_SHIFT;
|
|
|
|
bdw_gmch_ctl &= BDW_GMCH_GMS_MASK;
|
|
|
|
return bdw_gmch_ctl << 25; /* 32 MB units */
|
|
|
|
}
|
|
|
|
|
2014-05-09 02:19:40 +07:00
|
|
|
static size_t chv_get_stolen_size(u16 gmch_ctrl)
|
|
|
|
{
|
|
|
|
gmch_ctrl >>= SNB_GMCH_GMS_SHIFT;
|
|
|
|
gmch_ctrl &= SNB_GMCH_GMS_MASK;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* 0x0 to 0x10: 32MB increments starting at 0MB
|
|
|
|
* 0x11 to 0x16: 4MB increments starting at 8MB
|
|
|
|
* 0x17 to 0x1d: 4MB increments starting at 36MB
|
|
|
|
*/
|
|
|
|
if (gmch_ctrl < 0x11)
|
|
|
|
return gmch_ctrl << 25;
|
|
|
|
else if (gmch_ctrl < 0x17)
|
|
|
|
return (gmch_ctrl - 0x11 + 2) << 22;
|
|
|
|
else
|
|
|
|
return (gmch_ctrl - 0x17 + 9) << 22;
|
|
|
|
}
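The three encoding ranges in the comment above can be verified in isolation. This is a sketch under stated assumptions: `chv_stolen_bytes()` is a hypothetical helper that takes the GMS field after the shift and mask have been applied, and it returns a 64-bit value so that the result does not depend on the width of `size_t`.

```c
#include <stdint.h>

/* Decode an already extracted CHV GMS field into a stolen size in bytes. */
static uint64_t chv_stolen_bytes(unsigned int gms)
{
	if (gms < 0x11)
		return (uint64_t)gms << 25;               /* 32 MB steps from 0 MB */
	else if (gms < 0x17)
		return (uint64_t)(gms - 0x11 + 2) << 22;  /* 4 MB steps from 8 MB */
	else
		return (uint64_t)(gms - 0x17 + 9) << 22;  /* 4 MB steps from 36 MB */
}
```

The boundary values follow from the table: 0x10 decodes to 512 MB, 0x11 to 8 MB, 0x16 to 28 MB, 0x17 to 36 MB and 0x1d to 60 MB.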
|
|
|
|
|
2013-11-05 10:32:22 +07:00
|
|
|
static int ggtt_probe_common(struct drm_device *dev,
|
|
|
|
size_t gtt_size)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2013-12-22 00:52:52 +07:00
|
|
|
phys_addr_t gtt_phys_addr;
|
2013-11-05 10:32:22 +07:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* For Modern GENs the PTEs and register space are split in the BAR */
|
2013-12-22 00:52:52 +07:00
|
|
|
gtt_phys_addr = pci_resource_start(dev->pdev, 0) +
|
2013-11-05 10:32:22 +07:00
|
|
|
(pci_resource_len(dev->pdev, 0) / 2);
|
|
|
|
|
2013-12-22 00:52:52 +07:00
|
|
|
dev_priv->gtt.gsm = ioremap_wc(gtt_phys_addr, gtt_size);
|
2013-11-05 10:32:22 +07:00
|
|
|
if (!dev_priv->gtt.gsm) {
|
|
|
|
DRM_ERROR("Failed to map the gtt page table\n");
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
|
|
|
|
ret = setup_scratch_page(dev);
|
|
|
|
if (ret) {
|
|
|
|
DRM_ERROR("Scratch setup failed\n");
|
|
|
|
/* iounmap will also get called at remove, but meh */
|
|
|
|
iounmap(dev_priv->gtt.gsm);
|
|
|
|
}
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-11-05 10:56:49 +07:00
|
|
|
/* The GGTT and PPGTT need a private PPAT setup in order to handle cacheability
|
|
|
|
* bits. When using advanced contexts each context stores its own PAT, but
|
|
|
|
* writing this data shouldn't be harmful even in those cases. */
|
2014-04-09 17:28:01 +07:00
|
|
|
static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
|
2013-11-05 10:56:49 +07:00
|
|
|
{
|
|
|
|
uint64_t pat;
|
|
|
|
|
|
|
|
pat = GEN8_PPAT(0, GEN8_PPAT_WB | GEN8_PPAT_LLC) | /* for normal objects, no eLLC */
|
|
|
|
GEN8_PPAT(1, GEN8_PPAT_WC | GEN8_PPAT_LLCELLC) | /* for something pointing to ptes? */
|
|
|
|
GEN8_PPAT(2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC) | /* for scanout with eLLC */
|
|
|
|
GEN8_PPAT(3, GEN8_PPAT_UC) | /* Uncached objects, mostly for scanout */
|
|
|
|
GEN8_PPAT(4, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0)) |
|
|
|
|
GEN8_PPAT(5, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1)) |
|
|
|
|
GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
|
|
|
|
GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
|
|
|
|
|
|
|
|
/* XXX: spec defines this as 2 distinct registers. It's unclear if a 64b
|
|
|
|
* write would work. */
|
|
|
|
I915_WRITE(GEN8_PRIVATE_PAT, pat);
|
|
|
|
I915_WRITE(GEN8_PRIVATE_PAT + 4, pat >> 32);
|
|
|
|
}
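The two register writes above split one 64-bit value into halves. The sketch below shows that packing in plain C. It assumes that `GEN8_PPAT(i, x)` places an 8-bit entry at bit offset `8 * i` (the macro's definition is not shown in this file), and all helper names are hypothetical; the entry values used to exercise it are arbitrary bytes, not real PPAT flags.

```c
#include <stdint.h>

/* Assumed layout: entry i occupies bits [8*i + 7 : 8*i] of the value. */
static uint64_t ppat_entry(unsigned int i, uint8_t bits)
{
	return (uint64_t)bits << (i * 8);
}

/* OR the eight entries together into one 64-bit PAT value. */
static uint64_t ppat_pack(const uint8_t entries[8])
{
	uint64_t pat = 0;
	unsigned int i;

	for (i = 0; i < 8; i++)
		pat |= ppat_entry(i, entries[i]);
	return pat;
}

/* The two 32-bit halves, as written to the register and to register + 4. */
static uint32_t ppat_lo(uint64_t pat)
{
	return (uint32_t)pat;
}

static uint32_t ppat_hi(uint64_t pat)
{
	return (uint32_t)(pat >> 32);
}
```

Entries 0 to 3 end up in the low word and entries 4 to 7 in the high word, which is why the second write uses `pat >> 32`.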
|
|
|
|
|
2014-04-09 17:28:01 +07:00
|
|
|
static void chv_setup_private_ppat(struct drm_i915_private *dev_priv)
|
|
|
|
{
|
|
|
|
uint64_t pat;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Map WB on BDW to snooped on CHV.
|
|
|
|
*
|
|
|
|
* Only the snoop bit has meaning for CHV, the rest is
|
|
|
|
* ignored.
|
|
|
|
*
|
|
|
|
* Note that the hardware enforces snooping for all page
|
|
|
|
* table accesses. The snoop bit is actually ignored for
|
|
|
|
* PDEs.
|
|
|
|
*/
|
|
|
|
pat = GEN8_PPAT(0, CHV_PPAT_SNOOP) |
|
|
|
|
GEN8_PPAT(1, 0) |
|
|
|
|
GEN8_PPAT(2, 0) |
|
|
|
|
GEN8_PPAT(3, 0) |
|
|
|
|
GEN8_PPAT(4, CHV_PPAT_SNOOP) |
|
|
|
|
GEN8_PPAT(5, CHV_PPAT_SNOOP) |
|
|
|
|
GEN8_PPAT(6, CHV_PPAT_SNOOP) |
|
|
|
|
GEN8_PPAT(7, CHV_PPAT_SNOOP);
|
|
|
|
|
|
|
|
I915_WRITE(GEN8_PRIVATE_PAT, pat);
|
|
|
|
I915_WRITE(GEN8_PRIVATE_PAT + 4, pat >> 32);
|
|
|
|
}
|
|
|
|
|
2013-11-05 10:32:22 +07:00
|
|
|
static int gen8_gmch_probe(struct drm_device *dev,
|
|
|
|
size_t *gtt_total,
|
|
|
|
size_t *stolen,
|
|
|
|
phys_addr_t *mappable_base,
|
|
|
|
unsigned long *mappable_end)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
unsigned int gtt_size;
|
|
|
|
u16 snb_gmch_ctl;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* TODO: We're not aware of mappable constraints on gen8 yet */
|
|
|
|
*mappable_base = pci_resource_start(dev->pdev, 2);
|
|
|
|
*mappable_end = pci_resource_len(dev->pdev, 2);
|
|
|
|
|
|
|
|
if (!pci_set_dma_mask(dev->pdev, DMA_BIT_MASK(39)))
|
|
|
|
pci_set_consistent_dma_mask(dev->pdev, DMA_BIT_MASK(39));
|
|
|
|
|
|
|
|
pci_read_config_word(dev->pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
|
|
|
|
|
2014-05-09 02:19:40 +07:00
|
|
|
if (IS_CHERRYVIEW(dev)) {
|
|
|
|
*stolen = chv_get_stolen_size(snb_gmch_ctl);
|
|
|
|
gtt_size = chv_get_total_gtt_size(snb_gmch_ctl);
|
|
|
|
} else {
|
|
|
|
*stolen = gen8_get_stolen_size(snb_gmch_ctl);
|
|
|
|
gtt_size = gen8_get_total_gtt_size(snb_gmch_ctl);
|
|
|
|
}
|
2013-11-05 10:32:22 +07:00
|
|
|
|
2013-11-03 11:07:17 +07:00
|
|
|
*gtt_total = (gtt_size / sizeof(gen8_gtt_pte_t)) << PAGE_SHIFT;
|
2013-11-05 10:32:22 +07:00
|
|
|
|
2014-04-09 17:28:01 +07:00
|
|
|
if (IS_CHERRYVIEW(dev))
|
|
|
|
chv_setup_private_ppat(dev_priv);
|
|
|
|
else
|
|
|
|
bdw_setup_private_ppat(dev_priv);
|
2013-11-05 10:56:49 +07:00
|
|
|
|
2013-11-05 10:32:22 +07:00
|
|
|
ret = ggtt_probe_common(dev, gtt_size);
|
|
|
|
|
2013-11-03 11:07:18 +07:00
|
|
|
dev_priv->gtt.base.clear_range = gen8_ggtt_clear_range;
|
|
|
|
dev_priv->gtt.base.insert_entries = gen8_ggtt_insert_entries;
|
2013-11-05 10:32:22 +07:00
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-01-25 04:49:57 +07:00
|
|
|
static int gen6_gmch_probe(struct drm_device *dev,
|
|
|
|
size_t *gtt_total,
|
2013-02-09 02:32:47 +07:00
|
|
|
size_t *stolen,
|
|
|
|
phys_addr_t *mappable_base,
|
|
|
|
unsigned long *mappable_end)
|
2012-11-05 00:21:27 +07:00
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
2013-01-25 04:49:57 +07:00
|
|
|
unsigned int gtt_size;
|
2012-11-05 00:21:27 +07:00
|
|
|
u16 snb_gmch_ctl;
|
|
|
|
int ret;
|
|
|
|
|
2013-02-09 02:32:47 +07:00
|
|
|
*mappable_base = pci_resource_start(dev->pdev, 2);
|
|
|
|
*mappable_end = pci_resource_len(dev->pdev, 2);
|
|
|
|
|
2013-01-25 04:49:57 +07:00
|
|
|
/* 64/512MB is the current min/max we actually know of, but this is just
|
|
|
|
* a coarse sanity check.
|
2012-11-05 00:21:27 +07:00
|
|
|
*/
|
2013-02-09 02:32:47 +07:00
|
|
|
if ((*mappable_end < (64<<20) || (*mappable_end > (512<<20)))) {
|
2013-01-25 04:49:57 +07:00
|
|
|
DRM_ERROR("Unknown GMADR size (%lx)\n",
|
|
|
|
dev_priv->gtt.mappable_end);
|
|
|
|
return -ENXIO;
|
2012-11-05 00:21:27 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
if (!pci_set_dma_mask(dev->pdev, DMA_BIT_MASK(40)))
|
|
|
|
pci_set_consistent_dma_mask(dev->pdev, DMA_BIT_MASK(40));
|
|
|
|
pci_read_config_word(dev->pdev, SNB_GMCH_CTRL, &snb_gmch_ctl);
|
|
|
|
|
Revert "drm/i915: Calculate correct stolen size for GEN7+"
This reverts commit 03752f5b7b77b95d83479885040950fba1250850.
This revert requires a bit of explanation on how I understand things
work. Internally the architects/designers decide how the stolen encoding
works. We put it in a doc. BIOS writers take these docs and implement
it. Driver writers read the doc too, and read the value left by the BIOS
writers, and then we make magic.
The failing here is that the docs we had [1] contained two different
definitions for this register for Gen7. (We have both a PCI register
and an MMIO register, and each of these was different.) At the time [2]
of 03752f5, we asked the architects what the correct value should be, but
that unfortunately doesn't match the reality (BIOS).
So on all machines I can get my hands on, this revert is the right thing
to do. I've also worked with the product group to confirm that they
agree this revert is what we should do. People using HW made by "people"
who both write their own BIOS, and have access to our docs (Apple?).
Investigations are still ongoing about whether we need to add a list
of machines needing special handling, but this patch should be the
right thing for pretty much everyone.
[1] The docs are still wrong on this one. Now instead of two registers with
two definitions, we have one register with BOTH definitions, progress?
[2] The open source PRMs have the "wrong" definitions in chapter Volume
1 part6, section 1.1.12.
This digging was inspired by Paulo.
Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Augment the patch saying that it's still a bit unclear
whether there are any machines out there with "wrong" firmware and
whether we need to add a list to handle them specially.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-05-02 01:00:34 +07:00
|
|
|
*stolen = gen6_get_stolen_size(snb_gmch_ctl);
|
2013-04-09 08:43:47 +07:00
|
|
|
|
2013-11-05 10:32:22 +07:00
|
|
|
gtt_size = gen6_get_total_gtt_size(snb_gmch_ctl);
|
|
|
|
*gtt_total = (gtt_size / sizeof(gen6_gtt_pte_t)) << PAGE_SHIFT;
|
2012-11-05 00:21:27 +07:00
|
|
|
|
2013-11-05 10:32:22 +07:00
|
|
|
ret = ggtt_probe_common(dev, gtt_size);
|
2012-11-05 00:21:27 +07:00
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
dev_priv->gtt.base.clear_range = gen6_ggtt_clear_range;
|
|
|
|
dev_priv->gtt.base.insert_entries = gen6_ggtt_insert_entries;
|
2013-01-25 05:44:55 +07:00
|
|
|
|
2012-11-05 00:21:27 +07:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
static void gen6_gmch_remove(struct i915_address_space *vm)
|
2012-11-05 00:21:27 +07:00
|
|
|
{
|
2013-07-17 06:50:05 +07:00
|
|
|
|
|
|
|
struct i915_gtt *gtt = container_of(vm, struct i915_gtt, base);
|
2013-11-26 00:54:43 +07:00
|
|
|
|
2014-06-05 21:22:16 +07:00
|
|
|
if (drm_mm_initialized(&vm->mm)) {
|
|
|
|
drm_mm_takedown(&vm->mm);
|
|
|
|
list_del(&vm->global_link);
|
|
|
|
}
|
2013-07-17 06:50:05 +07:00
|
|
|
iounmap(gtt->gsm);
|
|
|
|
teardown_scratch_page(vm->dev);
|
2012-03-26 14:45:40 +07:00
|
|
|
}
|
2013-01-25 04:49:57 +07:00
|
|
|
|
|
|
|
static int i915_gmch_probe(struct drm_device *dev,
|
|
|
|
size_t *gtt_total,
|
2013-02-09 02:32:47 +07:00
|
|
|
size_t *stolen,
|
|
|
|
phys_addr_t *mappable_base,
|
|
|
|
unsigned long *mappable_end)
|
2013-01-25 04:49:57 +07:00
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
ret = intel_gmch_probe(dev_priv->bridge_dev, dev_priv->dev->pdev, NULL);
|
|
|
|
if (!ret) {
|
|
|
|
DRM_ERROR("failed to set up gmch\n");
|
|
|
|
return -EIO;
|
|
|
|
}
|
|
|
|
|
2013-02-09 02:32:47 +07:00
|
|
|
intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
|
2013-01-25 04:49:57 +07:00
|
|
|
|
|
|
|
dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
|
2013-07-17 06:50:05 +07:00
|
|
|
dev_priv->gtt.base.clear_range = i915_ggtt_clear_range;
|
2013-01-25 04:49:57 +07:00
|
|
|
|
2013-12-30 19:16:15 +07:00
|
|
|
if (unlikely(dev_priv->gtt.do_idle_maps))
|
|
|
|
DRM_INFO("applying Ironlake quirks for intel_iommu\n");
|
|
|
|
|
2013-01-25 04:49:57 +07:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
static void i915_gmch_remove(struct i915_address_space *vm)
|
2013-01-25 04:49:57 +07:00
|
|
|
{
|
2014-06-05 21:22:16 +07:00
|
|
|
if (drm_mm_initialized(&vm->mm)) {
|
|
|
|
drm_mm_takedown(&vm->mm);
|
|
|
|
list_del(&vm->global_link);
|
|
|
|
}
|
2013-01-25 04:49:57 +07:00
|
|
|
intel_gmch_remove();
|
|
|
|
}
|
|
|
|
|
|
|
|
int i915_gem_gtt_init(struct drm_device *dev)
|
|
|
|
{
|
|
|
|
struct drm_i915_private *dev_priv = dev->dev_private;
|
|
|
|
struct i915_gtt *gtt = &dev_priv->gtt;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (INTEL_INFO(dev)->gen <= 5) {
|
2013-06-28 06:30:20 +07:00
|
|
|
gtt->gtt_probe = i915_gmch_probe;
|
2013-07-17 06:50:05 +07:00
|
|
|
gtt->base.cleanup = i915_gmch_remove;
|
2013-11-05 10:32:22 +07:00
|
|
|
} else if (INTEL_INFO(dev)->gen < 8) {
|
2013-06-28 06:30:20 +07:00
|
|
|
gtt->gtt_probe = gen6_gmch_probe;
|
2013-07-17 06:50:05 +07:00
|
|
|
gtt->base.cleanup = gen6_gmch_remove;
|
2013-07-05 01:02:06 +07:00
|
|
|
if (IS_HASWELL(dev) && dev_priv->ellc_size)
|
2013-07-17 06:50:05 +07:00
|
|
|
gtt->base.pte_encode = iris_pte_encode;
|
2013-07-05 01:02:06 +07:00
|
|
|
else if (IS_HASWELL(dev))
|
2013-07-17 06:50:05 +07:00
|
|
|
gtt->base.pte_encode = hsw_pte_encode;
|
2013-06-28 06:30:20 +07:00
|
|
|
else if (IS_VALLEYVIEW(dev))
|
2013-07-17 06:50:05 +07:00
|
|
|
gtt->base.pte_encode = byt_pte_encode;
|
2013-08-06 19:17:02 +07:00
|
|
|
else if (INTEL_INFO(dev)->gen >= 7)
|
|
|
|
gtt->base.pte_encode = ivb_pte_encode;
|
2013-06-28 06:30:20 +07:00
|
|
|
else
|
2013-08-06 19:17:02 +07:00
|
|
|
gtt->base.pte_encode = snb_pte_encode;
|
2013-11-05 10:32:22 +07:00
|
|
|
} else {
|
|
|
|
dev_priv->gtt.gtt_probe = gen8_gmch_probe;
|
|
|
|
dev_priv->gtt.base.cleanup = gen6_gmch_remove;
|
2013-01-25 04:49:57 +07:00
|
|
|
}
|
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
|
2013-06-28 06:30:20 +07:00
|
|
|
&gtt->mappable_base, &gtt->mappable_end);
|
2013-01-25 05:45:00 +07:00
|
|
|
if (ret)
|
2013-01-25 04:49:57 +07:00
|
|
|
return ret;
|
|
|
|
|
2013-07-17 06:50:05 +07:00
|
|
|
gtt->base.dev = dev;
|
|
|
|
|
2013-01-25 04:49:57 +07:00
|
|
|
/* GMADR is the PCI mmio aperture into the global GTT. */
|
2013-07-17 06:50:05 +07:00
|
|
|
DRM_INFO("Memory usable by graphics device = %zdM\n",
|
|
|
|
gtt->base.total >> 20);
|
2013-06-28 06:30:20 +07:00
|
|
|
DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
|
|
|
|
DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
|
2014-03-31 21:23:04 +07:00
|
|
|
#ifdef CONFIG_INTEL_IOMMU
|
|
|
|
if (intel_iommu_gfx_mapped)
|
|
|
|
DRM_INFO("VT-d active for gfx access\n");
|
|
|
|
#endif
|
2014-04-29 16:53:58 +07:00
|
|
|
/*
|
|
|
|
* i915.enable_ppgtt is read-only, so do an early pass to validate the
|
|
|
|
* user's requested state against the hardware/driver capabilities. We
|
|
|
|
* do this now so that we can print out any log messages once rather
|
|
|
|
* than every time we check intel_enable_ppgtt().
|
|
|
|
*/
|
|
|
|
i915.enable_ppgtt = sanitize_enable_ppgtt(dev, i915.enable_ppgtt);
|
|
|
|
DRM_DEBUG_DRIVER("ppgtt mode: %i\n", i915.enable_ppgtt);
|
2013-01-25 04:49:57 +07:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
drm/i915: Create bind/unbind abstraction for VMAs
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initially, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound BOs
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want to, remove clear_range; however, it's
not totally easy at this point, since it's still used in a couple of
places that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-12-07 05:10:56 +07:00
|
|
|
|
|
|
|
static struct i915_vma *__i915_gem_vma_create(struct drm_i915_gem_object *obj,
|
|
|
|
struct i915_address_space *vm)
|
|
|
|
{
|
|
|
|
struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
|
|
|
|
if (vma == NULL)
|
|
|
|
return ERR_PTR(-ENOMEM);
|
|
|
|
|
|
|
|
INIT_LIST_HEAD(&vma->vma_link);
|
|
|
|
INIT_LIST_HEAD(&vma->mm_list);
|
|
|
|
INIT_LIST_HEAD(&vma->exec_list);
|
|
|
|
vma->vm = vm;
|
|
|
|
vma->obj = obj;
|
|
|
|
|
|
|
|
switch (INTEL_INFO(vm->dev)->gen) {
|
|
|
|
case 8:
|
|
|
|
case 7:
|
|
|
|
case 6:
|
2013-12-07 05:11:26 +07:00
|
|
|
if (i915_is_ggtt(vm)) {
|
|
|
|
vma->unbind_vma = ggtt_unbind_vma;
|
|
|
|
vma->bind_vma = ggtt_bind_vma;
|
|
|
|
} else {
|
|
|
|
vma->unbind_vma = ppgtt_unbind_vma;
|
|
|
|
vma->bind_vma = ppgtt_bind_vma;
|
|
|
|
}
|
2013-12-07 05:10:56 +07:00
|
|
|
break;
|
|
|
|
case 5:
|
|
|
|
case 4:
|
|
|
|
case 3:
|
|
|
|
case 2:
|
|
|
|
BUG_ON(!i915_is_ggtt(vm));
|
|
|
|
vma->unbind_vma = i915_ggtt_unbind_vma;
|
|
|
|
vma->bind_vma = i915_ggtt_bind_vma;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
BUG();
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Keep GGTT vmas first to make debug easier */
|
|
|
|
if (i915_is_ggtt(vm))
|
|
|
|
list_add(&vma->vma_link, &obj->vma_list);
|
|
|
|
else
|
|
|
|
list_add_tail(&vma->vma_link, &obj->vma_list);
|
|
|
|
|
|
|
|
return vma;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct i915_vma *
|
|
|
|
i915_gem_obj_lookup_or_create_vma(struct drm_i915_gem_object *obj,
|
|
|
|
struct i915_address_space *vm)
|
|
|
|
{
|
|
|
|
struct i915_vma *vma;
|
|
|
|
|
|
|
|
vma = i915_gem_obj_to_vma(obj, vm);
|
|
|
|
if (!vma)
|
|
|
|
vma = __i915_gem_vma_create(obj, vm);
|
|
|
|
|
|
|
|
return vma;
|
|
|
|
}
|