When we use an extern reservation object that otherwise waits for every
fence registered with it.
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
TTM BO accounting is out of sync with how memory is really allocated
in ttm[_dma]_tt_alloc_page_directory. This resulted in excessive
estimated overhead with many small allocations.
ttm_dma_tt_alloc_page_directory makes a single allocation for three
arrays: pages, DMA and CPU addresses. It uses drm_calloc_large, which
uses kmalloc internally for allocations smaller than PAGE_SIZE.
ttm_round_pot should be a good approximation of its memory usage both
above and below PAGE_SIZE.
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes the following scenario:
1. Page table bo allocated in vram and linked to man->lru.
tbo->list_kref.refcount=2
2. Page table bo is swapped out and removed from man->lru.
tbo->list_kref.refcount=1
3. Command submission from userspace. Page table bo is moved
to vram. ttm_bo_move_to_lru_tail() link it to man->lru and
don't increase the kref count.
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It tries to do fancy things with excluding agp support if ttm is
built-in, but agp isn't. Instead just express this depency like drm
does and use CONFIG_AGP everywhere.
Also use the neat Makefile magic to make the entire ttm_agp_backend
file optional.
v2: Use IS_ENABLED(CONFIG_AGP) as suggested by Ville
v3: Review from Emil.
v4: Actually get it right as spotted by 0-day.
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1459337046-25882-1-git-send-email-daniel.vetter@ffwll.ch
Pull drm updates from Dave Airlie:
"This is the main drm pull request for 4.5. I don't think I've missed
anything too major, I'm mostly back at work now but I'll probably get
some sleep in 5 years time.
Summary:
New drivers:
- etnaviv:
GPU driver for the 3D core on the Vivante core used in numerous
ARM boards.
Highlights:
Core:
- Atomic suspend/resume helpers
- Move the headers to using userspace friendlier types.
- Documentation updates
- Lots of struct_mutex removal.
- Bunch of DP MST fixes from AMD.
Panel:
- More DSI helpers
- Support for some new basic panels
i915:
- Basic Kabylake support
- DP link training and detect code refactoring
- fbc/psr fixes
- FIFO underrun fixes
- SDE interrupt handling fixes
- dma-buf/fence support in pageflip path.
- GPU side for MST audio support
radeon/amdgpu:
- Drop UMS support
- GPUVM/Scheduler optimisations
- Initial Powerplay support for Tonga/Fiji/CZ/ST
- ACP audio prerequisites
nouveau:
- GK20a instmem improvements
- PCIE link speed change support
msm:
- DSI support for msm8960/apq8064
tegra:
- Host1X support for Tegra210 SoC
vc4:
- 3D acceleration support
armada:
- Get rid of struct mutex
tda998x:
- Atomic modesetting support
- TMDS clock limitations
omapdrm:
- Atomic modesetting support
- improved TILER performance
rockchip:
- RK3036 VOP support
- Atomic modesetting support
- Synopsys DW MIPI DSI support
exynos:
- Runtime PM support
- of_graph binding for DP panels
- Cleanup of IPP code
- Configurable plane support
- Kernel panic fixes at release time"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (711 commits)
drm/fb_cma_helper: Remove implicit call to disable_unused_functions
drm/amdgpu: add missing irq.h include
drm/vmwgfx: Fix a width / pitch mismatch on framebuffer updates
drm/vmwgfx: Fix an incorrect lock check
drm: nouveau: fix nouveau_debugfs_init prototype
drm/nouveau/pci: fix check in nvkm_pcie_set_link
drm/amdgpu: validate duplicates first
drm/amdgpu: move VM page tables to the LRU end on CS v2
drm/ttm: add ttm_bo_move_to_lru_tail function v2
drm/ttm: fix adding foreign BOs to the swap LRU
drm/ttm: fix adding foreign BOs to the LRU during init v2
drm/radeon: use kobj_to_dev()
drm/amdgpu: use kobj_to_dev()
drm/amdgpu/cz: force vce clocks when sclks are forced
drm/amdgpu/cz: force uvd clocks when sclks are forced
drm/amdgpu/cz: add code to enable forcing VCE clocks
drm/amdgpu/cz: add code to enable forcing UVD clocks
drm/amdgpu: fix lost sync_to if scheduler is enabled.
drm/amd/powerplay: fix static checker warning for return meaningless value.
drm/sysfs: use kobj_to_dev()
...
Convert the raw unsigned long 'pfn' argument to pfn_t for the purpose of
evaluating the PFN_MAP and PFN_DEV flags. When both are set it triggers
_PAGE_DEVMAP to be set in the resulting pte.
There are no functional changes to the gpu drivers as a result of this
conversion.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This allows the drivers to move a BO to the end of the LRU
without removing and adding it again.
v2: Make it more robust, handle pinned and swapable BOs as well.
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It doesn't make any sense to try to swap out imported BOs.
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If we import a BO with an external reservation object we don't
reserve/unreserve it. So we never add it to the LRU causing a possible
denial of service.
v2: fix typo in commit message
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In ttm_write_lock(), the uninterruptible path should call
__ttm_write_lock() not __ttm_read_lock(). This fixes a vmwgfx hang
on F23 start up.
syeh: Extracted this from one of Thomas' internal patches.
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
In the event that TTM doesn't find a compatible memory type for the
driver's first placement choice (placement without eviction), TTM
returns -EINVAL without trying the driver's second choice.
This causes problems on vmwgfx when VRAM is disabled before first modeset
and during VT switches when fbdev is not enabled.
Fix this by also trying the driver's second choice before returning
-EINVAL.
v2: Also check that man->use_type is true for the driver's second choice.
Fixes a bug where disallowed memory types could be used.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
IS_ERR(_OR_NULL) already contain an 'unlikely' compiler flag and there
is no need to do that again from its callers. Drop it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Return proper pgprot for ARM64. This is required for objects like
Nouveau fences to be mapped with expected coherency.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Calls to set_memory_wb() incure heavy TLB flush and IPI cost. To
minimize those wait until pool grow beyond batch size before
draining the pool.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Current code never allowed the page pool to actualy fill in anyway.
This fix it, so that we only start freeing page from the pool when
we go over the pool size.
Changed since v1:
- Move the page batching optimization to its separate patch.
Changed since v2:
- Do not remove code part of the batching optimization with
this patch.
- Better commit message.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
dma_alloc_coherent() can return memory in the vmalloc range.
virt_to_page() cannot handle such addresses and crashes. This
patch detects such cases and obtains the struct page * using
vmalloc_to_page() instead.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
At present, dma_buf_export() takes a series of parameters, which
makes it difficult to add any new parameters for exporters, if required.
Make it simpler by moving all these parameters into a struct, and pass
the struct * as parameter to dma_buf_export().
While at it, unite dma_buf_export_named() with dma_buf_export(), and
change all callers accordingly.
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
We need to store device offsets in 64 bit as the device
address space may be larger than the CPU's.
Fixes GPU init failures on radeons with 4GB or more of
vram on 32 bit kernels. We put vram at the start of the
GPU's address space so the gart aperture starts at 4 GB
causing all GPU addresses in the gart aperture to get
truncated.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=89072
[airlied: fix warning on nouveau build]
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: thellstrom@vmware.com
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch adds an optional list_head parameter to ttm_eu_reserve_buffers.
If specified duplicates in the execbuf list are no longer reported as errors,
but moved to this list instead.
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Andrew Morton wrote:
> On Wed, 12 Nov 2014 13:08:55 +0900 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> > Andrew Morton wrote:
> > > Poor ttm guys - this is a bit of a trap we set for them.
> >
> > Commit a91576d791 ("drm/ttm: Pass GFP flags in order to avoid deadlock.")
> > changed to use sc->gfp_mask rather than GFP_KERNEL.
> >
> > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *),
> > - GFP_KERNEL);
> > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> >
> > But this bug is caused by sc->gfp_mask containing some flags which are not
> > in GFP_KERNEL, right? Then, I think
> >
> > - pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> > + pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp & GFP_KERNEL);
> >
> > would hide this bug.
> >
> > But I think we should use GFP_ATOMIC (or drop __GFP_WAIT flag)
>
> Well no - ttm_page_pool_free() should stop calling kmalloc altogether.
> Just do
>
> struct page *pages_to_free[16];
>
> and rework the code to free 16 pages at a time. Easy.
Well, ttm code wants to process 512 pages at a time for performance.
Memory footprint increased by 512 * sizeof(struct page *) buffer is
only 4096 bytes. What about using static buffer like below?
----------
>From d3cb5393c9c8099d6b37e769f78c31af1541fe8c Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Thu, 13 Nov 2014 22:21:54 +0900
Subject: [PATCH] drm/ttm: Avoid memory allocation from shrinker functions.
Commit a91576d791 ("drm/ttm: Pass GFP flags in order to avoid
deadlock.") caused BUG_ON() due to sc->gfp_mask containing flags
which are not in GFP_KERNEL.
https://bugzilla.kernel.org/show_bug.cgi?id=87891
Changing from sc->gfp_mask to (sc->gfp_mask & GFP_KERNEL) would
avoid the BUG_ON(), but avoiding memory allocation from shrinker
function is better and reliable fix.
Shrinker function is already serialized by global lock, and
clean up function is called after shrinker function is unregistered.
Thus, we can use static buffer when called from shrinker function
and clean up function.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [2.6.35+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
DRM_MM_SEARCH_BEST gets the smallest hole which can fit the BO. That seems
against the idea of TTM_PL_FLAG_TOPDOWN:
* The smallest hole may be in the overall bottom of the area
* If the hole isn't much larger than the BO, it doesn't make much
difference whether the BO is placed at the bottom or at the top of the
hole
Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the BO should be placed at the top of the area, we should start looking
for holes from the top.
Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The radeon driver uses placement range restrictions for several reasons,
in particular to make sure BOs in VRAM can be accessed by the CPU, e.g.
during a page fault.
Without this change, TTM could evict other BOs while trying to satisfy
the requested placement, even if the evicted BOs were outside of the
requested placement range. Doing so didn't free up any space in the
requested placement range, so the (potentially high) eviction cost was
incurred for no benefit.
Nominating for stable because radeon driver changes in 3.17 made this
much more noticeable than before.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84662
Cc: stable@vger.kernel.org
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Today, most callers of ttm_io_prot() check TTM_PL_FLAG_CACHED before
calling it since on some archs it will unconditionally create non-cached
mappings.
But not all callers do which is incorrect as far as I can tell.
Instead, move that check inside ttm_io_port() itself for all archs
and make powerpc use the same implementation as ia64 and arm
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
While zone->name is currently hard coded, the call to kobject_init_and_add()
should follow the more defensive argument list usage (as already done in
other places in ttm_memory.c) where "%s" is used instead of directly passing
in a variable as a format string.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch adds a new flag to the ttm_validate_buffer list to
add the fence as shared to the reservation object.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reorders the list to keep track of what buffers are reserved,
so previous members are always unreserved.
This gets rid of some bookkeeping that's no longer needed,
while simplifying the code some.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
No users are left, kill it off! :D
Conversion to the reservation api is next on the list, after
that the functionality can be restored with rcu.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
This is the last remaining function that doesn't use the reservation
lock completely to fence off access to a buffer.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
This allows us to more fine grained specify where to place the buffer object.
v2: rebased on drm-next, add bochs changes as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Pull nouveau drm updates from Ben Skeggs:
"Apologies for not getting this done in time for Dave's drm-next merge
window. As he mentioned, a pre-existing bug reared its head a lot
more obviously after this lot of changes. It took quite a bit of time
to track it down. In any case, Dave suggested I try my luck by
sending directly to you this time.
Overview:
- more code for Tegra GK20A from NVIDIA - probing, reclockig
- better fix for Kepler GPUs that have the graphics engine powered
off on startup, method courtesy of info provided by NVIDIA
- unhardcoding of a bunch of graphics engine setup on
Fermi/Kepler/Maxwell, will hopefully solve some issues people have
noticed on higher-end models
- support for "Zero Bandwidth Clear" on Fermi/Kepler/Maxwell, needs
userspace support in general, but some lucky apps will benefit
automagically
- reviewed/exposed the full object APIs to userspace (finally), gives
it access to perfctrs, ZBC controls, various events. More to come
in the future.
- various other fixes"
Acked-by: Dave Airlie <airlied@redhat.com>
* 'linux-3.17' of git://anongit.freedesktop.org/git/nouveau/linux-2.6: (87 commits)
drm/nouveau: expose the full object/event interfaces to userspace
drm/nouveau: fix headless mode
drm/nouveau: hide sysfs pstate file behind an option again
drm/nv50/disp: shhh compiler
drm/gf100-/gr: implement the proper SetShaderExceptions method
drm/gf100-/gr: remove some broken ltc bashing, for now
drm/gf100-/gr: unhardcode attribute cb config
drm/gf100-/gr: fetch tpcs-per-ppc info on startup
drm/gf100-/gr: unhardcode pagepool config
drm/gf100-/gr: unhardcode bundle cb config
drm/gf100-/gr: improve initial context patch list helpers
drm/gf100-/gr: add support for zero bandwidth clear
drm/nouveau/ltc: add zbc drivers
drm/nouveau/ltc: s/ltcg/ltc/ + cleanup
drm/nouveau: use ram info from nvif_device
drm/nouveau/disp: implement nvif event sources for vblank/connector notifiers
drm/nouveau/disp: allow user direct access to channel control registers
drm/nouveau/disp: audit and version display classes
drm/nouveau/disp: audit and version SCANOUTPOS method
drm/nv50-/disp: audit and version PIOR_PWR method
...
Pages allocated using the DMA API have a coherent memory mapping. Make
this mapping visible to drivers so they can decide to use it instead of
creating their own redundant one.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pull DRM updates from Dave Airlie:
"Like all good pull reqs this ends with a revert, so it must mean we
tested it,
[ Ed. That's _one_ way of looking at it ]
This pull is missing nouveau, Ben has been stuck trying to track down
a very longstanding bug that revealed itself due to some other
changes. I've asked him to send you a direct pull request for nouveau
once he cleans things up. I'm away until Monday so don't want to
delay things, you can make a decision on that when he sends it, I have
my phone so I can ack things just not really merge much.
It has one trivial conflict with your tree in armada_drv.c, and also
the pull request contains some component changes that are already in
your tree, the base tree from Russell went via Greg's tree already,
but some stuff still shows up in here that doesn't when I merge my
tree into yours.
Otherwise all pretty standard graphics fare, one new driver and
changes all over the place.
New drivers:
- sti kms driver for STMicroelectronics chipsets stih416 and stih407.
core:
- lots of cleanups to the drm core
- DP MST helper code merged
- universal cursor planes.
- render nodes enabled by default
panel:
- better panel interfaces
- new panel support
- non-continuous cock advertising ability
ttm:
- shrinker fixes
i915:
- hopefully ditched UMS support
- runtime pm fixes
- psr tracking and locking - now enabled by default
- userptr fixes
- backlight brightness fixes
- MST support merged
- runtime PM for dpms
- primary planes locking fixes
- gen8 hw semaphore support
- fbc fixes
- runtime PM on SOix sleep state hw.
- mmio base page flipping
- lots of vlv/chv fixes.
- universal cursor planes
radeon:
- Hawaii fixes
- display scalar support for non-fixed mode displays
- new firmware format support
- dpm on more asics by default
- GPUVM improvements
- uncached and wc GTT buffers
- BOs > visible VRAM
exynos:
- i80 interface support
- module auto-loading
- ipp driver consolidated.
armada:
- irq handling in crtc layer only
- crtc renumbering
- add component support
- DT interaction changes.
tegra:
- load as module fixes
- eDP bpp and sync polarity fixed
- DSI non-continuous clock mode support
- better support for importing buffers from nouveau
msm:
- mdp5/adq8084 v1.3 hw enablement
- devicetree clk changse
- ifc6410 board working
tda998x:
- component support
- DT documentation update
vmwgfx:
- fix compat shader namespace"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (551 commits)
Revert "drm: drop redundant drm_file->is_master"
drm/panel: simple: Use devm_gpiod_get_optional()
drm/dsi: Replace upcasting macro by function
drm/panel: ld9040: Replace upcasting macro by function
drm/exynos: dp: Modify driver to support drm_panel
drm/exynos: Move DP setup into commit()
drm/panel: simple: Add AUO B133HTN01 panel support
drm/panel: simple: Support delays in panel functions
drm/panel: simple: Add proper definition for prepare and unprepare
drm/panel: s6e8aa0: Add proper definition for prepare and unprepare
drm/panel: ld9040: Add proper definition for prepare and unprepare
drm/tegra: Add support for panel prepare and unprepare routines
drm/exynos: dsi: Add support for panel prepare and unprepare routines
drm/exynos: dpi: Add support for panel prepare and unprepare routines
drm/panel: simple: Add dummy prepare and unprepare routines
drm/panel: s6e8aa0: Add dummy prepare and unprepare routines
drm/panel: ld9040: Add dummy prepare and unprepare routines
drm/panel: Provide convenience wrapper for .get_modes()
drm/panel: add .prepare() and .unprepare() functions
drm/panel: simple: Remove simple-panel compatible
...
Commit 7dc19d5a "drivers: convert shrinkers to new count/scan API" added
deadlock warnings that ttm_page_pool_free() and ttm_dma_page_pool_free()
are currently doing GFP_KERNEL allocation.
But these functions did not get updated to receive gfp_t argument.
This patch explicitly passes sc->gfp_mask or GFP_KERNEL to these functions,
and removes the deadlock warning.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [2.6.35+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
While ttm_dma_pool_shrink_scan() tries to take mutex before doing GFP_KERNEL
allocation, ttm_pool_shrink_scan() does not do it. This can result in stack
overflow if kmalloc() in ttm_page_pool_free() triggered recursion due to
memory pressure.
shrink_slab()
=> ttm_pool_shrink_scan()
=> ttm_page_pool_free()
=> kmalloc(GFP_KERNEL)
=> shrink_slab()
=> ttm_pool_shrink_scan()
=> ttm_page_pool_free()
=> kmalloc(GFP_KERNEL)
Change ttm_pool_shrink_scan() to do like ttm_dma_pool_shrink_scan() does.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [2.6.35+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
I can observe that RHEL7 environment stalls with 100% CPU usage when a
certain type of memory pressure is given. While the shrinker functions
are called by shrink_slab() before the OOM killer is triggered, the stall
lasts for many minutes.
One of reasons of this stall is that
ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and
are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with
_manager->lock held causes someone (including kswapd) to deadlock when
these functions are called due to memory pressure. This patch changes
"mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to
avoid deadlock.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [3.3+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
We can use "unsigned int" instead of "atomic_t" by updating start_pool
variable under _manager->lock. This patch will make it possible to avoid
skipping when choosing a pool to shrink in round-robin style, after next
patch changes mutex_lock(_manager->lock) to !mutex_trylock(_manager->lork).
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [3.3+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
list_empty(&_manager->pools) being false before taking _manager->lock
does not guarantee that _manager->npools != 0 after taking _manager->lock
because _manager->npools is updated under _manager->lock.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@kernel.org> [3.3+]
Signed-off-by: Dave Airlie <airlied@redhat.com>
This allows reservation objects to be used in dma-buf. it's required
for implementing polling support on the fences that belong to a dma-buf.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com> #drivers/media/v4l2-core/
Acked-by: Thomas Hellstrom <thellstrom@vmware.com> #drivers/gpu/drm/ttm
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net> #drivers/gpu/drm/armada/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix a sparse warning: ttm_bo_reserve()'s last argument is a
pointer to a struct, so use NULL as nullpointer.
Signed-off-by: Martin Kepplinger <martink@posteo.de>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
bo->mem.placement is not initialized when ttm_bo_man_get_node is called,
so the flag had no effect at all.
v2: change nouveau and vmwgfx as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Nouveau can now be used on ARM, so add an ioprot handler for this
architecture.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Pull request of 2014-04-04
Currently only a single patch fixing up mixed use of the ttm_bo_reserve and
ww_mutex APIs
* tag 'ttm-next-2014-04-04' of git://people.freedesktop.org/~thomash/linux:
drm/ttm: Hide the implementation details of reservation
Clients like i915 need to segregate cache domains within the GTT which
can lead to small amounts of fragmentation. By allocating the uncached
buffers from the bottom and the cacheable buffers from the top, we can
reduce the amount of wasted space and also optimize allocation of the
mappable portion of the GTT to only those buffers that require CPU
access through the GTT.
For other drivers, allocating small bos from one end and large ones
from the other helps improve the quality of fragmentation.
Based on drm_mm work by Chris Wilson.
v3: Changed to use a TTM placement flag
v2: Updated kerneldoc
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Christian König <deathsimple@vodafone.de>
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: David Airlie <airlied@redhat.com>
A function to be used to check whether a caller has put a ref object
(opened) a struct ttm_base_object
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This is the 3rd respin of the drm-anon patches. They allow module unloading, use
the pin_fs_* helpers recommended by Al and are rebased on top of drm-next. Note
that there are minor conflicts with the "drm-minor" branch.
* 'drm-next' of git://people.freedesktop.org/~dvdhrm/linux:
drm: init TTM dev_mapping in ttm_bo_device_init()
drm: use anon-inode instead of relying on cdevs
drm: add pseudo filesystem for shared inodes
With dev->anon_inode we have a global address_space ready for operation
right from the beginning. Therefore, there is no need to do a delayed
setup with TTM. Instead, set dev_mapping during initialization in
ttm_bo_device_init() and remove any "if (dev_mapping)" conditions.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Alex Deucher <alexdeucher@gmail.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
A performance regression was introduced in TTM in linux 3.13 when we started using
VM_PFNMAP for shared mappings. In theory this should've been faster due to
less page book-keeping but it appears like VM_PFNMAP + x86 PAT + write-combine
is a particularly cpu-hungry combination, as seen by largely increased
cpu-usage on r200 GL video playback.
Until we've sorted out why, revert to always use VM_MIXEDMAP.
Reference: freedesktop.org bugzilla bug #75719
Reported-and-tested-by: <smoki00790@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: stable@vger.kernel.org
This patch fix a memory leak found by cppcheck.
[drivers/gpu/drm/ttm/ttm_agp_backend.c:129]:
(error) Memory leak: agp_be
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
These page pointers shouldn't be visible to TTM in the first place, but
until we fix that up, don't clear the page metadata because that
will upset the exporter.
Reported-and-tested-by: Cristoph Haag <haagch.christoph@googleemail.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Commit drm/ttm: ttm object security fixes for render nodes introduced a
regression where, if a TTM object was opened multiple times from the same
open file, the caller would spin uninterruptibly in the kernel.
Fix this.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
drm-intel-next-2014-01-10:
- final bits for runtime D3 on Haswell from Paul (now enabled fully)
- parse the backlight modulation freq information in the VBT from Jani
(but not yet used)
- more watermark improvements from Ville for ilk-ivb and bdw
- bugfixes for fastboot from Jesse
- watermark fix for i830M (but not yet everything)
- vlv vga hotplug w/a (Imre)
- piles of other small improvements, cleanups and fixes all over
Note that the pull request includes a backmerge of the last drm-fixes
pulled into Linus' tree - things where getting a bit too messy. So the
shortlog also contains a bunch of patches from Linus tree. Please yell if
you want me to frob it for you a bit.
* 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (609 commits)
drm/i915/bdw: make sure south port interrupts are enabled properly v2
drm/i915: Include more information in disabled hotplug interrupt warning
drm/i915: Only complain about a rogue hotplug IRQ after disabling
drm/i915: Only WARN about a stuck hotplug irq ONCE
drm/i915: s/hotplugt_status_gen4/hotplug_status_g4x/
Anyway, nothing big here, Three more code cleanup patches from Rashika
Kheria, and one TTM/vmwgfx patch from me that tightens security around TTM
objects enough for them to opened using prime objects from render nodes:
Previously any client could access a shared buffer using the "name", also
without actually opening it. Now a reference is required, and for render nodes
such a reference is intended to only be obtainable using a prime fd.
vmwgfx-next 2014-01-13 pull request
* tag 'vmwgfx-next-2014-01-13' of git://people.freedesktop.org/~thomash/linux:
drivers: gpu: Mark functions as static in vmwgfx_fence.c
drivers: gpu: Mark functions as static in vmwgfx_buffer.c
drivers: gpu: Mark functions as static in vmwgfx_kms.c
drm/ttm: ttm object security fixes for render nodes
Remove unused function ttm_write_lock_downgrade() from
drm/ttm/ttm_lock.c.
This eliminates the following warning in drm/ttm/ttm_lock.c:
drivers/gpu/drm/ttm/ttm_lock.c:189:6: warning: no previous prototype for ‘ttm_write_lock_downgrade’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Mark functions as static because they are not used outside the file
drm/ttm/ttm_bo_util.c.
This eliminates the following warnings in drm/ttm/ttm_bo_util.c:
drivers/gpu/drm/ttm/ttm_bo_util.c:190:5: warning: no previous prototype for ‘ttm_mem_reg_ioremap’ [-Wmissing-prototypes]
drivers/gpu/drm/ttm/ttm_bo_util.c:222:6: warning: no previous prototype for ‘ttm_mem_reg_iounmap’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Mark function as static because it is not used outside file
drm/ttm/ttm_bo.c.
This eliminates the following warning in drm/ttm/ttm_bo.c:
drivers/gpu/drm/ttm/ttm_bo.c:960:5: warning: no previous prototype for ‘ttm_bo_move_buffer’ [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
When a client looks up a ttm object, don't look it up through the device hash
table, but rather from the file hash table. That makes sure that the client
has indeed put a reference on the object, or in gem terms, has opened
the object; either using prime or using the global "name".
To avoid a performance loss, make sure the file hash table entries can be
looked up from under an RCU lock, and as a consequence, replace the rwlock
with a spinlock, since we never need to take it in read mode only anymore.
Finally add a ttm object lookup function for the device hash table, that is
intended to be used when we put a ref object on a base object or, in gem terms,
when we open the object.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Needed for some vm operations; most notably unmap_mapping_range() with
even_cows = 0.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This is illegal for at least two reasons:
1) While it may work on some platforms / iommus, obtaining page pointers from
mapped sg-lists is illegal, since the DMA API allows page pointer information
to be destroyed in the sg mapping process.
2) TTM has no way of determining the linear kernel map caching state of the
underlying pages. PTEs with conflicting caching state pointing to the same
pfn is not allowed.
TTM operations touching pages of imported sg-tables should be redirected through
the proper dma-buf operations.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
VM_PFNMAP is faster than VM_MIXEDMAP due to reduced page administration so
use it for shared maps where we don't have any Copy-On-Write pages. For
private maps, we continue to use VM_MIXEDMAP.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Commit "drm/ttm: Don't move non-existing data" didn't take the
swapped-out corner case into account. This patch corrects that.
Fixes blank screen after attempted suspend / hibernate on vmwgfx.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Checking directly for the right capability is simpler. Also this rids
us of a few places that use DRM_CURRENTPID.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
VMAs covering a bo but that didn't start at the same address space offset as
the bo they were mapping were incorrectly generating SEGFAULT errors in
the fault handler.
Reported-by: Joseph Dolinak <kanilo2@yahoo.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: stable@vger.kernel.org
The set_need_resched() removal fix and yet another fix in
ttm_bo_move_memcpy().
* 'ttm-fixes-3.13' of git://people.freedesktop.org/~thomash/linux:
drm/ttm: Remove set_need_resched from the ttm fault handler
drm/ttm: Don't move non-existing data
Addresses
"[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE".
In the first occurence it was used to try to be nice while releasing the
mmap_sem and retrying the fault to work around a locking inversion.
The second occurence was never used.
There has been some discussion whether we should change the locking order to
mmap_sem -> bo_reserve. This patch doesn't address that issue, and leaves
that locking order undefined. The solution that we release the mmap_sem if
tryreserve fails and wait for the buffer to become unreserved is something
we want in any case, and follows how the core vm system waits for pages
to be come unlocked while releasing the mmap_sem.
The code also outlines what needs to be changed if we want to establish the
locking order as mmap_sem -> bo::reserve.
One slight issue that remains with this code is that the fault handler might
be prone to starvation if another thread countinously reserves the buffer.
IMO that usage pattern is highly unlikely.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
If ttm_bo_move_memcpy was instructed to move a non-populated ttm to
io memory, it would first populate the ttm, then move the data and then
destroy the ttm. That's stupid. However, some drivers might have relied on
this to clear io memory from old stuff. So instead of a NOP, which would
be the most efficient, just clear the destination.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
If no reservation ticket is given to the execbuf reservation utilities,
try reservation with non-blocking semantics.
This is intended for eviction paths that use the execbuf reservation
utilities for convenience rather than for deadlock avoidance.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Fix a long-standing TTM issue where we manipulated the vma page_prot
bits while mmap_sem was taken in read mode only. We now make a local
copy of the vma structure which we pass when we set the ptes.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
- A couple of fixes that never made it into fixes-3.12
- Make NO_EVICT bo's available for shrinkers when on delayed-delete list
- Allow retrying page-faults that need to wait for GPU.
* 'ttm-next-3.13' of git://people.freedesktop.org/~thomash/linux:
drm/ttm: Fix memory type compatibility check
drm/ttm: Fix ttm_bo_move_memcpy
drm/ttm: Handle in-memory region copies
drm/ttm: Make NO_EVICT bos available to shrinkers pending destruction
drm/ttm: Allow vm fault retries
Also check the busy placements before deciding to move a buffer object.
Failing to do this may result in a completely unneccessary move within a
single memory type.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: stable@vger.kernel.org
All error paths will want to keep the mm node, so handle this at the
function exit. This fixes an ioremap failure error path.
Also add some comments to make the function a bit easier to understand.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: stable@vger.kernel.org
Fix the case where the ttm pointer may be NULL causing
a NULL pointer dereference.
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellström <thellstrom@vmware.com>
Cc: stable@vger.kernel.org
NO_EVICT bos that are not idle when all references are dropped are put on
the delayed destroy list. However, since they are not on LRU lists, they
are not available to shrinkers at that point, and buffers on the delayed
destroy list are not checked very often for idle.
So when these buffers are put on the delayed destroy list, clear the
NO_EVICT flag and put them on the right LRU list. This way they are
immediately available for eviction or shrinkers and will not cause false
OOMS.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Make use of the FAULT_FLAG_ALLOW_RETRY flag to allow dropping the
mmap_sem while waiting for bo idle.
FAULT_FLAG_ALLOW_RETRY appears to be primarily designed for disk waits
but should work just as fine for GPU waits..
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Used by the vmwgfx driver
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Pull drm radeon/nouveau/core fixes from Dave Airlie:
"Mostly radeon fixes, with some nouveau bios parser, ttm fix and a fix
for AST driver"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (42 commits)
drm/fb-helper: don't sleep for screen unblank when an oops is in progress
drm, ttm Fix uninitialized warning
drm/ttm: fix the tt_populated check in ttm_tt_destroy()
drm/nouveau/ttm: prevent double-free in nouveau_sgdma_create_ttm() failure path
drm/nouveau/bios/init: fix thinko in INIT_CONFIGURE_MEM
drm/nouveau/kms: enable for non-vga pci classes
drm/nouveau/bios/init: stub opcode 0xaa
drm/radeon: avoid UVD corruptions on AGP cards
drm/radeon: fix panel scaling with eDP and LVDS bridges
drm/radeon/dpm: rework auto performance level enable
drm/radeon: Fix hmdi typo
drm/radeon/dpm/rs780: fix force_performance state for same sclks
drm/radeon/dpm/rs780: don't enable sclk scaling if not required
drm/radeon/dpm/rs780: add some sanity checking to sclk scaling
drm/radeon/dpm/rs780: use drm_mode_vrefresh()
drm/udl: rip out set_need_resched
drm/ast: fix the ast open key function
drm/radeon/dpm: add bapm callback for kb/kv
drm/radeon/dpm: add bapm callback for trinity
drm/radeon/dpm: add infrastructure to properly handle bapm
...
Fix uninitialized warning.
drivers/gpu/drm/ttm/ttm_object.c: In function ‘ttm_base_object_lookup’:
drivers/gpu/drm/ttm/ttm_object.c:213:10: error: ‘base’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
kref_put(&base->refcount, ttm_release_base);
^
drivers/gpu/drm/ttm/ttm_object.c:221:26: note: ‘base’ was declared here
struct ttm_base_object *base;
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
After a vmalloc failure in ttm_dma_tt_alloc_page_directory(),
ttm_dma_tt_init() will call ttm_tt_destroy() to cleanup, and end up
inside the driver's unpopulate() hook when populate() has never yet
been called.
On nouveau, the first issue to be hit because of this is that
dma_address[] may be a NULL pointer. After working around this,
ttm_pool_unpopulate() may potentially hit the same issue with
the pages[] array.
It seems to make more sense to avoid calling unpopulate on already
unpopulated TTMs than to add checks to all the implementations.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: stable@vger.kernel.org
Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Convert the driver shrinkers to the new API. Most changes are compile
tested only because I either don't have the hardware or it's staging
stuff.
FWIW, the md and android code is pretty good, but the rest of it makes me
want to claw my eyes out. The amount of broken code I just encountered is
mind boggling. I've added comments explaining what is broken, but I fear
that some of the code would be best dealt with by being dragged behind the
bike shed, burying in mud up to it's neck and then run over repeatedly
with a blunt lawn mower.
Special mention goes to the zcache/zcache2 drivers. They can't co-exist
in the build at the same time, they are under different menu options in
menuconfig, they only show up when you've got the right set of mm
subsystem options configured and so even compile testing is an exercise in
pulling teeth. And that doesn't even take into account the horrible,
broken code...
[glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This helper is used only once and just wraps a call to
drm_vma_offset_add(). Remove this unneeded indirection to safe 10 lines of
code.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Instead of calling drm_mm_pre_get() in a row, we now preallocate the node
and then use the atomic insertion functions. This has the exact same
semantics and there is no reason to use the racy pre-allocations.
Note that ttm_bo_man_get_node() does not run in atomic context. Nouveau
already uses GFP_KERNEL alloc in nouveau/nouveau_ttm.c in
nouveau_gart_manager_new(). So we can do the same in
ttm_bo_man_get_node().
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add a "best_match" flag similar to the drm_mm_search_*() helpers so we
can convert TTM to use them in follow up patches. We can also inline the
non-generic helpers and move them into the header to allow compile-time
optimizations.
To make calls to drm_mm_{search,insert}_node() more readable, this
converts the boolean argument to a flagset. There are pending patches that
add additional flags for top-down allocators and more.
v2:
- use flag parameter instead of boolean "best_match"
- convert *_search_free() helpers to also use flags argument
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of unmapping the nodes in TTM and GEM users manually, we provide
a generic wrapper which does the correct thing for all vma-nodes.
v2: remove bdev->dev_mapping test in ttm_bo_unmap_virtual_unlocked() as
ttm_mem_io_free_vm() does nothing in that case (io_reserved_vm is 0).
v4: Fix docbook comments
v5: use drm_vma_node_size()
Cc: Dave Airlie <airlied@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Use the new vma-manager infrastructure. This doesn't change any
implementation details as the vma-offset-manager is nearly copied 1-to-1
from TTM.
The vm_lock is moved into the offset manager so we can drop it from TTM.
During lookup, we use the vma locking helpers to take a reference to the
found object.
In all other scenarios, locking stays the same as before. We always
guarantee that drm_vma_offset_remove() is called only during destruction.
Hence, helpers like drm_vma_node_offset_addr() are always safe as long as
the node has a valid offset.
This also drops the addr_space_offset member as it is a copy of vm_start
in vma_node objects. Use the accessor functions instead.
v4:
- remove vm_lock
- use drm_vma_offset_lock_lookup() to protect lookup (instead of vm_lock)
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Martin Peres <martin.peres@labri.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
There is no reason to return "int" as this function never fails.
Furthermore, several drivers (ast, sis) already depend on this.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Now that the code is compatible in semantics, flip the switch.
Use ww_mutex instead of the homegrown implementation.
ww_mutex uses -EDEADLK to signal that the caller has to back off,
and -EALREADY to indicate this buffer is already held by the caller.
ttm used -EAGAIN and -EDEADLK for those, respectively. So some changes
were needed to handle this correctly.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This commit converts the source of the val_seq counter to
the ww_mutex api. The reservation objects are converted later,
because there is still a lockdep splat in nouveau that has to
resolved first.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(*->vm_end - *->vm_start) >> PAGE_SHIFT operation is implemented
as a inline funcion vma_pages() in linux/mm.h, so using it.
Signed-off-by: Libin <huawei.libin@huawei.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
qxl wants to use io mapping like i915 gem does, for now
just export the symbols so the driver can implement atomic
page maps using io mapping.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
Pull drm merge from Dave Airlie:
"Highlights:
- TI LCD controller KMS driver
- TI OMAP KMS driver merged from staging
- drop gma500 stub driver
- the fbcon locking fixes
- the vgacon dirty like zebra fix.
- open firmware videomode and hdmi common code helpers
- major locking rework for kms object handling - pageflip/cursor
won't block on polling anymore!
- fbcon helper and prime helper cleanups
- i915: all over the map, haswell power well enhancements, valleyview
macro horrors cleaned up, killing lots of legacy GTT code,
- radeon: CS ioctl unification, deprecated UMS support, gpu reset
rework, VM fixes
- nouveau: reworked thermal code, external dp/tmds encoder support
(anx9805), fences sleep instead of polling,
- exynos: all over the driver fixes."
Lovely conflict in radeon/evergreen_cs.c between commit de0babd60d
("drm/radeon: enforce use of radeon_get_ib_value when reading user cmd")
and the new changes that modified that evergreen_dma_cs_parse()
function.
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (508 commits)
drm/tilcdc: only build on arm
drm/i915: Revert hdmi HDP pin checks
drm/tegra: Add list of framebuffers to debugfs
drm/tegra: Fix color expansion
drm/tegra: Split DC_CMD_STATE_CONTROL register write
drm/tegra: Implement page-flipping support
drm/tegra: Implement VBLANK support
drm/tegra: Implement .mode_set_base()
drm/tegra: Add plane support
drm/tegra: Remove bogus tegra_framebuffer structure
drm: Add consistency check for page-flipping
drm/radeon: Use generic HDMI infoframe helpers
drm/tegra: Use generic HDMI infoframe helpers
drm: Add EDID helper documentation
drm: Add HDMI infoframe helpers
video: Add generic HDMI infoframe helpers
drm: Add some missing forward declarations
drm: Move mode tables to drm_edid.c
drm: Remove duplicate drm_mode_cea_vic()
gma500: Fix n, m1 and m2 clock limits for sdvo and lvds
...
TTM reservations changes, preparing for new reservation mutex system.
* 'for-airlied' of git://people.freedesktop.org/~mlankhorst/linux:
drm/ttm: unexport ttm_bo_wait_unreserved
drm/nouveau: use ttm_bo_reserve_slowpath in validate_init, v2
drm/ttm: use ttm_bo_reserve_slowpath_nolru in ttm_eu_reserve_buffers, v2
drm/ttm: add ttm_bo_reserve_slowpath
drm/ttm: cleanup ttm_eu_reserve_buffers handling
drm/ttm: remove lru_lock around ttm_bo_reserve
drm/nouveau: increase reservation sequence every retry
drm/vmwgfx: always use ttm_bo_is_reserved
This fixes up
commit e8e89622ed
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Dec 18 22:25:11 2012 +0100
drm/ttm: fix fence locking in ttm_buffer_object_transfer
which leaves behind a might_sleep in atomic context, since the
fence_lock spinlock is held over a kmalloc(GFP_KERNEL) call. The fix
is to revert the above commit and only take the lock where we need it,
around the call to ->sync_obj_ref.
v2: Fixup things noticed by Maarten Lankhorst:
- Brown paper bag locking bug.
- No need for kzalloc if we clear the entire thing on the next line.
- check for bo->sync_obj (totally unlikely race, but still someone
else could have snuck in) and clear fbo->sync_obj if it's cleared
already.
Reported-by: Dave Airlie <airlied@gmail.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
if we have a move notify callback, when moving fails, we call move notify
the opposite way around, however this ends up with *mem containing the mm_node
from the bo, which means we double free it. This is a follow on to the previous
fix.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When we are using memcpy to move objects around, and we fail to memcpy
due to lack of memory to populate or failure to finish the copy, we don't
want to destroy the mm_node that has been copied into old_copy.
While working on a new kms driver that uses memcpy, if I overallocated bo's
up to the memory limits, and eviction failed, then machine would oops soon
after due to having an active bo with an already freed drm_mm embedded in it,
freeing it a second time didn't end well.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
All legitimate users of this function outside ttm_bo.c are gone, now
it's only an implementation detail.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
This requires re-use of the seqno, which increases fairness slightly.
Instead of spinning with a new seqno every time we keep the current one,
but still drop all other reservations we hold. Only when we succeed,
we try to get back our other reservations again.
This should increase fairness slightly as well.
Changes since v1:
- Increase val_seq before calling ttm_bo_reserve_slowpath_nolru and
retrying to take all entries to prevent a race.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Instead of dropping everything, waiting for the bo to be unreserved
and trying over, a better strategy would be to do a blocking wait.
This can be mapped a lot better to a mutex_lock-like call.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
With the lru lock no longer required for protecting reservations we
can just do a ttm_bo_reserve_nolru on -EBUSY, and handle all errors
in a single path.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
There should no longer be assumptions that reserve will always succeed
with the lru lock held, so we can safely break the whole atomic
reserve/lru thing. As a bonus this fixes most lockdep annotations for
reservations.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Noticed while reviewing the fence locking in the radeon pageflip
handler.
v2: Instead of grabbing the bdev->fence_lock in object_transfer just
move the single callsite of that function a few lines, so that it is
protected by the fence_lock. Suggested by Jerome Glisse.
v3: Fix typo in commit message.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
All items on the lru list are always reservable, so this is a stupid
thing to keep. Not only that, it is used in a way which would
guarantee deadlocks if it were ever to be set to block on reserve.
This is a lot of churn, but mostly because of the removal of the
argument which can be nested arbitrarily deeply in many places.
No change of code in this patch except removal of the no_wait_reserve
argument, the previous patch removed the use of no_wait_reserve.
v2:
- Warn if -EBUSY is returned on reservation, all objects on the list
should be reservable. Adjusted patch slightly due to conflicts.
v3:
- Focus on no_wait_reserve removal only.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Replace the goto loop with a simple for each loop, and only run the
delayed destroy cleanup if we can reserve the buffer first.
No race occurs, since lru lock is never dropped any more. An empty list
and a list full of unreservable buffers both cause -EBUSY to be returned,
which is identical to the previous situation, because previously buffers
on the lru list were always guaranteed to be reservable.
This should work since currently ttm guarantees items on the lru are
always reservable, and reserving items blockingly with some bo held
are enough to cause you to run into a deadlock.
Currently this is not a concern since removal off the lru list and
reservations are always done with atomically, but when this guarantee
no longer holds, we have to handle this situation or end up with
possible deadlocks.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Replace the while loop with a simple for each loop, and only run the
delayed destroy cleanup if we can reserve the buffer first.
No race occurs, since lru lock is never dropped any more. An empty list
and a list full of unreservable buffers both cause -EBUSY to be returned,
which is identical to the previous situation.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
By removing the unlocking of lru and retaking it immediately, a race is
removed where the bo is taken off the swap list or the lru list between
the unlock and relock. As such the cleanup_refs code can be simplified,
it will attempt to call ttm_bo_wait non-blockingly, and if it fails
it will drop the locks and perform a blocking wait, or return an error
if no_wait_gpu was set.
The need for looping is also eliminated, since swapout and evict_mem_first
will always follow the destruction path, no new fence is allowed
to be attached. As far as I can see this may already have been the case,
but the unlocking / relocking required a complicated loop to deal with
re-reservation.
Changes since v1:
- Simplify no_wait_gpu case by folding it in with empty ddestroy.
- Hold a reservation while calling ttm_bo_cleanup_memtype_use again.
Changes since v2:
- Do not remove bo from lru list while waiting
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This requires changing the order in ttm_bo_cleanup_refs_or_queue to
take the reservation first, as there is otherwise no race free way to
take lru lock before fence_lock.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex writes:
Pretty minor -next pull request. We some additional new bits waiting
internally for release. Hopefully Monday we can get at least some of
them out. The others will probably take a few more weeks.
Highlights of the current request:
- ELD registers for passing audio information to the sound hardware
- Handle GPUVM page faults more gracefully
- Misc fixes
Merge radeon test
* 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux: (483 commits)
drm/radeon: bump driver version for new info ioctl requests
drm/radeon: fix eDP clk and lane setup for scaled modes
drm/radeon: add new INFO ioctl requests
drm/radeon/dce32+: use fractional fb dividers for high clocks
drm/radeon: use cached memory when evicting for vram on non agp
drm/radeon: add a CS flag END_OF_FRAME
drm/radeon: stop page faults from hanging the system (v2)
drm/radeon/dce4/5: add registers for ELD handling
drm/radeon/dce3.2: add registers for ELD handling
radeon: fix pll/ctrc mapping on dce2 and dce3 hardware
Linux 3.7-rc7
powerpc/eeh: Do not invalidate PE properly
Revert "drm/i915: enable rc6 on ilk again"
ALSA: hda - Fix build without CONFIG_PM
of/address: sparc: Declare of_iomap as an extern function for sparc again
PM / QoS: fix wrong error-checking condition
bnx2x: remove redundant warning log
vxlan: fix command usage in its doc
8139cp: revert "set ring address before enabling receiver"
MPI: Fix compilation on MIPS with GCC 4.4 and newer
...
Conflicts:
drivers/gpu/drm/exynos/exynos_drm_encoder.c
drivers/gpu/drm/exynos/exynos_drm_fbdev.c
drivers/gpu/drm/nouveau/core/engine/disp/nv50.c
Removes the need for a write lock each time we call ttm_bo_unref().
v2: Remove an unused variable.
v3: Really remove the unused variable.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Also move a kref_init() out of spinlocked region
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is similar to other platforms that don't allow command submission
to buffers locked on the cpu.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reservation locking currently always takes place under the LRU spinlock.
Hence, strictly there is no need for an atomic_cmpxchg call; we can use
atomic_read followed by atomic_write since nobody else will ever reserve
without the lru spinlock held.
At least on Intel this should remove a locked bus cycle on successful
reserve.
Note that thit commit may be obsoleted by the cross-device reservation work.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The mostly used lookup+get put+potential_destroy path of TTM objects
is converted to use RCU locks. This will substantially decrease the amount
of locked bus cycles during normal operation.
Since we use kfree_rcu to free the objects, no rcu synchronization is needed
at module unload time.
v2: Don't touch include/linux/kref.h
v3: Adapt to kref_get_unless_zero return value change
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
vmwgfx was its only user and always sets it to the same..
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-By: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It's unused.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It's unused.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
All drivers set it to 0 and nothing uses it.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It is unnecessary to disable preemption explicitly while calling
copy_highpage(). Because copy_highpage() will do it again through
kmap_atomic/kunmap_atomic.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The TTM page can be allocated from high memory. In such case it is
wrong to use the page_address(page) as the virtual address for the high memory
page.
bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=50241
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In theory, that function could release the lru lock between
checking for bo on ddestroy list and a successful reserve if the bo
was already reserved, and the function was called with waiting reserves
allowed.
However, all current reservers of a bo on the ddestroy list would
atomically take the bo off the list after a successful reserve so this
race should not have been hit, so no need to backport for stable.
This patch also fixes a case found by Maarten Lankhorst where
ttm_mem_evict_first called with no_wait_gpu would incorrectly
spin waiting for bo idle if trying to evict a busy buffer that
also sits on the ddestroy list.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The ttm_mem_evict_first function could theoretically drop the
lru lock without retrying if a reservation from off the LRU list
ended up waiting.
However, since currently there are no users that could cause a wait
in that situation so this is not suitable for stable
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
currently it lost original meaning but still has some effects:
| effect | alternative flags
-+------------------------+---------------------------------------------
1| account as reserved_vm | VM_IO
2| skip in core dump | VM_IO, VM_DONTDUMP
3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
4| do not mlock | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
This patch removes reserved_vm counter from mm_struct. Seems like nobody
cares about it, it does not exported into userspace directly, it only
reduces total_vm showed in proc.
Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.
remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.
[akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Venkatesh Pallipadi <venki@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull drm merge (part 1) from Dave Airlie:
"So first of all my tree and uapi stuff has a conflict mess, its my
fault as the nouveau stuff didn't hit -next as were trying to rebase
regressions out of it before we merged.
Highlights:
- SH mobile modesetting driver and associated helpers
- some DRM core documentation
- i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write
combined pte writing, ilk rc6 support,
- nouveau: major driver rework into a hw core driver, makes features
like SLI a lot saner to implement,
- psb: add eDP/DP support for Cedarview
- radeon: 2 layer page tables, async VM pte updates, better PLL
selection for > 2 screens, better ACPI interactions
The rest is general grab bag of fixes.
So why part 1? well I have the exynos pull req which came in a bit
late but was waiting for me to do something they shouldn't have and it
looks fairly safe, and David Howells has some more header cleanups
he'd like me to pull, that seem like a good idea, but I'd like to get
this merge out of the way so -next dosen't get blocked."
Tons of conflicts mostly due to silly include line changes, but mostly
mindless. A few other small semantic conflicts too, noted from Dave's
pre-merged branch.
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits)
drm/nv98/crypt: fix fuc build with latest envyas
drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering
drm/nv41/vm: fix and enable use of "real" pciegart
drm/nv44/vm: fix and enable use of "real" pciegart
drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie
drm/nouveau: store supported dma mask in vmmgr
drm/nvc0/ibus: initial implementation of subdev
drm/nouveau/therm: add support for fan-control modes
drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules
drm/nouveau/therm: calculate the pwm divisor on nv50+
drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster
drm/nouveau/therm: move thermal-related functions to the therm subdev
drm/nouveau/bios: parse the pwm divisor from the perf table
drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices
drm/nouveau/therm: rework thermal table parsing
drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table
drm/nouveau: fix pm initialization order
drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it
drm/nouveau: log channel debug/error messages from client object rather than drm client
drm/nouveau: have drm debugging macros build on top of core macros
...
Convert #include "..." to #include <path/...> in drivers/gpu/.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
Remove useless kfree() and clean up code related to the removal.
The semantic patch that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r exists@
position p1,p2;
expression x;
@@
if (x@p1 == NULL) { ... kfree@p2(x); ... return ...; }
@unchanged exists@
position r.p1,r.p2;
expression e <= r.x,x,e1;
iterator I;
statement S;
@@
if (x@p1 == NULL) { ... when != I(x,...) S
when != e = e1
when != e += e1
when != e -= e1
when != ++e
when != --e
when != e++
when != e--
when != &e
kfree@p2(x); ... return ...; }
@ok depends on unchanged exists@
position any r.p1;
position r.p2;
expression x;
@@
... when != true x@p1 == NULL
kfree@p2(x);
@depends on !ok && unchanged@
position r.p2;
expression x;
@@
*kfree@p2(x);
// </smpl>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Use copy_highpage() to copy from one page to another.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
A regression was introduced in the 3.3 rc series, commit
"drm/ttm: simplify memory accounting for ttm user v2",
causing the metadata of buffer objects created using the ttm_bo_create()
function to be accounted twice.
That causes massive leaks with the vmwgfx driver running for example
SpecViewperf Catia-03 test 2, eventually killing the app.
Furthermore, the same commit introduces a regression where
metadata accounting is leaked if a buffer object is
initialized with an illegal size. This is also fixed with this commit.
v2: Fixed an error path and removed an unused variable.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>