The problem boils down to the order when the bit11
of the texture size is or'ed to the original width.
In the end each mipmap level has the same width or
height because of that 11 bit is ored to the scaled
down lod with and thus blows up the size again to the
full size or more due to the power of two rounding
afterwards.
The attached patch changes this order so that the
texture sizes are computed correct. Also the on error
the yet missing inputs to the size computation are
printed which helped me to find out where it really breaks.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
While investigating the cause of CRTC FIFO underruns, I noticed that when
converting the memory bandwidth calculation from the userspace X driver code,
an instance of '8.0' was apparently accidentally converted to '80'.
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Also add single crtc for RN50 chips.
changes in v2:
fix vblank init to respect single crtc flag
fix r100 mode bandwidth to respect single crtc flag
Signed-off-by: Dave Airlie <airlied@redhat.com>
This remove old init path and allow code cleanup, now all hw
use the new init path, see top of radeon.h for description of
this.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
New init path allow to simply asic initialization and make easier
to trace what happen on each different asic. We are removing most
callback. More cleanup should happen latter to remove even more
callback. Also cleanup register specific to R100,RV200,RV250.
Version 2 correct the placement on IGP of the VRAM inside GPU address
space to match the stollen RAM placement of IGP.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Also cleanup register specific to RS400/RS480. This patch also fix
legacy VGA register used to disable VGA access we were programming
wrong register. Now we should properly disable VGA on r100 up to
rs400 asics. Note that RS400/RS480 resume is broken, it hangs the
computer while reprogramming dynamic clock, doesn't work either
without that patch. We need to spend more time investigating this
issue. Version 2 of the patch remove dead code that was left
commented out in the previous version. Version 3 correct the
placement on IGP of the VRAM inside GPU address space to match the
stollen RAM placement of IGP.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- fix offset of NOP packet for parsing
- fix p->idx increments
- fix bad mask when updating crtc vline info
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@linux.ie>
This avoids needing to do a kmalloc > PAGE_SIZE for the main
indirect buffer chunk, it adds an accessor for all reads from
the chunk and caches a single page at a time for subsequent
reads.
changes since v1:
Use a two page pool which should be the most common case
a single packet spanning > PAGE_SIZE will be hit, but I'm
having trouble seeing anywhere we currently generate anything like that.
hopefully proper short page copying at end
added parser_error flag to set deep errors instead of having to test
every ib value fetch.
fixed bug in patch that went to list.
Signed-off-by: Dave Airlie <airlied@redhat.com>
GART static one time initialization was mixed up with GART
enabling/disabling which could happen several time for instance
during suspend/resume cycles. This patch splits all GART
handling into 4 differents function. gart_init is for one
time initialization, gart_deinit is called upon module unload
to free resources allocated by gart_init, gart_enable enable
the GART and is intented to be call after first initialization
and at each resume cycle or reset cycle. Finaly gart_disable
stop the GART and is intended to be call at suspend time or
when unloading the module.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This convert r4xx to new init path it also fix few bugs.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If module is being unloaded we should not try to handle irq especialy
we should not call into drm helper or we could hard hang the computer
free_irq will call the irq handler to make sure we behave properly.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we stop CP and that it's still processing thing GPU hang might
happen, this patch wait for CP idle (the wait can timeout) so we
can avoid shutting down CP at bad time. This is especialy usefull
when reseting the GPU as it seems GPU reset fails to properly reset
CP when the CP wasn't stop after being idle.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds the r600 KMS + CS support to the Linux kernel.
The r600 TTM support is quite basic and still needs more
work esp around using interrupts, but the polled fencing
should work okay for now.
Also currently TTM is using memcpy to do VRAM moves,
the code is here to use a 3D blit to do this, but
isn't fully debugged yet.
Authors:
Alex Deucher <alexdeucher@gmail.com>
Dave Airlie <airlied@redhat.com>
Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds the command stream checker for the RN50, R100 and R200 cards.
It stops any access to 3D registers on RN50, and does checks
on buffer sizes on the r100/r200 cards. It also fixes some texture
sizing checks on r300.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Loosely based on a patch by
Jaswinder Singh Rajput <jaswinderlinux@gmail.com>.
KMS support by Dave Airlie <airlied@redhat.com>.
For Radeon 100- to 500-series, firmware blobs look like:
struct {
__be32 datah;
__be32 datal;
} cp_ucode[256];
For Radeon 600-series, there are two separate firmware blobs:
__be32 me_ucode[PM4_UCODE_SIZE * 3];
__be32 pfp_ucode[PFP_UCODE_SIZE];
For Radeon 700-series, likewise:
__be32 me_ucode[R700_PM4_UCODE_SIZE];
__be32 pfp_ucode[R700_PFP_UCODE_SIZE];
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We really don't want to be doing all these indirects, updating
the GPU gart table is something we do often so the less overhead the
better.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fixes 3D apps timing out in the WAIT_VBLANK ioctl.
AVIVO bits compile-tested only.
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Check whether index is within bounds before grabbing the element.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If an rn50/r100/m6/m7 GPU has < 64MB RAM, i.e. 8/16/32, the
aperture used to calculate the MC_FB_LOCATION needs to be worked
out from the CONFIG_APER_SIZE register, and not the actual vram size.
TTM VRAM size was also being initialised wrong, use actual vram size
to initialise it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fix bandwidth computation and crtc priority in memory controller
so that crtc memory request are fullfill in time to avoid display
artifact.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds new set/get tiling interfaces where the pitch
and macro/micro tiling enables can be set. Along with
a flag to decide if this object should have a surface when mapped.
The only thing we need to allocate with a mapped surface should be
the frontbuffer. Note rotate scanout shouldn't require one, and
back/depth shouldn't either, though mesa needs some fixes.
It fixes the TTM interfaces along Thomas's suggestions, and I've tested
the surface stealing code with two X servers and not seen any lockdep issues.
I've stopped tiling the fbcon frontbuffer, as I don't see there being
any advantage other than testing, I've left the testing commands in there,
just flip the fb_tiled to true in radeon_fb.c
Open: Can we integrate endian swapping in with this?
Future features:
texture tiling - need to relocate texture registers TXOFFSET* with tiling info.
This also merges Michel's cleanup surfaces regs at init time patch
even though it makes sense on its own, this patch really relies on it.
Some PowerMac firmwares set up a tiling surface at the beginning of VRAM
which messes us up otherwise.
that patch is:
Signed-off-by: Michel Dänzer <daenzer@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
RN50/ES1000 is a cut-down rv100 chip used in the server market.
The 3D engine on these is either not there or unverified so refuse
any attempt to configure registers on it.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Doing this like the DDX seems like the most sure fire way to avoid
having to reinvent it slowly and painfully. At the moment we keep
getting things wrong with aper vs vram, so we know the DDX does it right.
booted on PCI r100, PCIE rv370, IGP rs400.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Userspace sends us a special relocation type to sync video/exa
to vlines to avoid tearing, this deals with the relocation
in the kernel, it picks the correct crtc and avoids issues
where crtcs are disabled.
This version also parses the wait until to make sure it isn't
trying to do anything evil.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Normally we are free to place VRAM where we want in the GPUs
memory address space, however on IGP chips the VRAM is actual RAM,
and no special translation or aperture is used inside the GPU MC.
So when you move the VRAM aperture away from the TOM register,
you actually move it into main memory and can trash things quite badly.
This commit makes the code respect the TOM location for MC_FB_LOCATION.
Signed-off-by: Dave Airlie <airlied@redhat.com>
1. rv370 can accept 40-bit addresses - also at 24-bit shift not 4 bits
2. rs480 table can be in 40-bit space. - 4 bit shift for top 8 bits
3. rs480 table entries can be in 40-bit space.
Signed-off-by: Dave Airlie <airlied@redhat.com>
For security purpose we want to make sure the userspace process doesn't
access memory beyond buffer it owns. To achieve this we need to check
states the userspace program. For color buffer and zbuffer we check that
the clipping register will discard access beyond buffers set as color
or zbuffer. For vertex buffer we check that no vertex fetch will happen
beyond buffer end. For texture we check various texture states (number
of mipmap level, texture size, texture depth, ...) to compute the amount
of memory the texture fetcher might access.
The command stream checking impact the performances so far quick benchmark
shows an average of 3% decrease in fps of various applications. It can
be optimized a bit more by caching result of checking and thus avoid a
full recheck if no states changed since last check.
Note that this patch is still incomplete on checking side as it doesn't
check 2d rendering states.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add kernel modesetting support to radeon driver, use the ttm memory
manager to manage memory and DRM/GEM to provide userspace API.
In order to avoid backward compatibility issue and to allow clean
design and code the radeon kernel modesetting use different code path
than old radeon/drm driver.
When kernel modesetting is enabled the IOCTL of radeon/drm
driver are considered as invalid and an error message is printed
in the log and they return failure.
KMS enabled userspace will use new API to talk with the radeon/drm
driver. The new API provide functions to create/destroy/share/mmap
buffer object which are then managed by the kernel memory manager
(here TTM). In order to submit command to the GPU the userspace
provide a buffer holding the command stream, along this buffer
userspace have to provide a list of buffer object used by the
command stream. The kernel radeon driver will then place buffer
in GPU accessible memory and will update command stream to reflect
the position of the different buffers.
The kernel will also perform security check on command stream
provided by the user, we want to catch and forbid any illegal use
of the GPU such as DMA into random system memory or into memory
not owned by the process supplying the command stream. This part
of the code is still incomplete and this why we propose that patch
as a staging driver addition, future security might forbid current
experimental userspace to run.
This code support the following hardware : R1XX,R2XX,R3XX,R4XX,R5XX
(radeon up to X1950). Works is underway to provide support for R6XX,
R7XX and newer hardware (radeon from HD2XXX to HD4XXX).
Authors:
Jerome Glisse <jglisse@redhat.com>
Dave Airlie <airlied@redhat.com>
Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>