The idxd driver introduces the Intel Data Stream Accelerator [1] that will
be available on future Intel Xeon CPUs. One of the kernel access
point for the driver is through the dmaengine subsystem. It will initially
provide the DMA copy service to the kernel.
Some of the main functionality introduced with this accelerator
are: shared virtual memory (SVM) support, and descriptor submission using
Intel CPU instructions movdir64b and enqcmds. There will be additional
accelerator devices that share the same driver with variations to
capabilities.
This commit introduces the probe and initialization component of the
driver.
[1]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965023991.73301.6186843973135311580.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
With the introduction of MOVDIR64B instruction, there is now an instruction
that can write 64 bytes of data atomically.
Quoting from Intel SDM:
"There is no atomicity guarantee provided for the 64-byte load operation
from source address, and processor implementations may use multiple
load operations to read the 64-bytes. The 64-byte direct-store issued
by MOVDIR64B guarantees 64-byte write-completion atomicity. This means
that the data arrives at the destination in a single undivided 64-byte
write transaction."
We have identified at least 3 different use cases for this instruction in
the format of func(dst, src, count):
1) Clear poison / Initialize MKTME memory
@dst is normal memory.
@src in normal memory. Does not increment. (Copy same line to all
targets)
@count (to clear/init multiple lines)
2) Submit command(s) to new devices
@dst is a special MMIO region for a device. Does not increment.
@src is normal memory. Increments.
@count usually is 1, but can be multiple.
3) Copy to iomem in big chunks
@dst is iomem and increments
@src in normal memory and increments
@count is number of chunks to copy
Add support for case #2 to support device that will accept commands via
this instruction. We provide a @count in order to submit a batch of
preprogrammed descriptors in virtually contiguous memory. This
allows the caller to submit multiple descriptors to a device with a single
submission. The special device requires the entire 64bytes descriptor to
be written atomically and will accept MOVDIR64B instruction.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/157965022175.73301.10174614665472962675.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The functions dma_get_slave_channel() and dma_get_any_slave_channel()
are called from DMA engine drivers only. Hence move their declarations
from the public header file <linux/dmaengine.h> to the private header
file drivers/dma/dmaengine.h.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200121093311.28639-4-geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
At its original introduction, dma_request_slave_channel_compat() used a
wrapper, to accommodate filter functions that modify the mask passed.
Filter functions can no longer modify masks, and the mask parameter was
made const in commit a53e28da57 ("dma: Make the 'mask' parameter
of __dma_request_channel const") consecutively.
Hence remove the wrapper, and rename __dma_request_slave_channel_compat()
to dma_request_slave_channel_compat(), to get rid of one more function
name starting with a double underscore.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200121093311.28639-3-geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Commit aa1e6f1a38 ("dmaengine: kill struct dma_client and
supporting infrastructure") removed the last user of the
dma_device_satisfies_mask() wrapper.
Remove the wrapper, and rename __dma_device_satisfies_mask() to
dma_device_satisfies_mask(), to get rid of one more function starting
with a double underscore.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200121093311.28639-2-geert+renesas@glider.be
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Since the dma engine expects the burst length register content as
power of 2 value, the burst length needs to be converted first.
Additionally add a burst length range check to avoid corrupting unrelated
register bits.
Signed-off-by: Matthias Fend <matthias.fend@wolfvision.net>
Link: https://lore.kernel.org/r/20200115102249.24398-1-matthias.fend@wolfvision.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Currently the cyclic transfers can be used only with normal DMAs. They
can be used by pcm_dmaengine module, which is required for implementing
sound with sun4i-hdmi encoder. This is so because the controller can
accept audio only from a dedicated DMA.
This patch enables them, following the existing style for the
scatter/gather type transfers.
Signed-off-by: Stefan Mavrodiev <stefan@olimex.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20200110141140.28527-2-stefan@olimex.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
There is duplicated argument to && in function fsl_qdma_free_chan_resources,
which looks like a typo, pointer fsl_queue->desc_pool also needs NULL check,
fix it.
Detected with coccinelle.
Fixes: b092529e0a ("dmaengine: fsl-qdma: Add qDMA controller driver for Layerscape SoCs")
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Reviewed-by: Peng Ma <peng.ma@nxp.com>
Tested-by: Peng Ma <peng.ma@nxp.com>
Link: https://lore.kernel.org/r/20200120125843.34398-1-chenzhou10@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Fixe the following warnings by making these static
drivers/dma/ti/k3-psil-j721e.c:62:16: warning: symbol 'j721e_src_ep_map' was not declared. Should it be static?
drivers/dma/ti/k3-psil-j721e.c:172:16: warning: symbol 'j721e_dst_ep_map' was not declared. Should it be static?
drivers/dma/ti/k3-psil-j721e.c:216:20: warning: symbol 'j721e_ep_map' was not declared. Should it be static?
CC drivers/dma/ti/k3-psil-j721e.o
drivers/dma/ti/k3-psil-am654.c:52:16: warning: symbol 'am654_src_ep_map' was not declared. Should it be static?
drivers/dma/ti/k3-psil-am654.c:127:16: warning: symbol 'am654_dst_ep_map' was not declared. Should it be static?
drivers/dma/ti/k3-psil-am654.c:169:20: warning: symbol 'am654_ep_map' was not declared. Should it be static?
Reported-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20200121070104.4393-1-peter.ujfalusi@ti.com
[vkoul: updated patch title]
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Certain users can not use right now the DMAengine API due to missing
features in the core. Prime example is Networking.
These users can use the glue layer interface to avoid misuse of DMAengine
API and when the core gains the needed features they can be converted to
use generic API.
The most prominent features the glue layer clients are depending on:
- most PSI-L native peripheral use extra rflow ranges on a receive channel
and depending on the peripheral's configuration packets from a single
free descriptor ring is going to be received to different receive ring
- it is also possible to have different free descriptor rings per rflow
and an rflow can also support 4 additional free descriptor ring based
on the size of the incoming packet
- out of order completion of descriptors on a channel
- when we have several queues to handle different priority packets the
descriptors will be completed 'out-of-order'
- the notion of prep_slave_sg is not matching with what the streaming type
of operation is demanding for networking
- Streaming type of operation
- Ability to fill the free descriptor ring with descriptors in
anticipation of incoming traffic and when a packet arrives UDMAP will
form a packet and gives it to the client driver
- the descriptors are not backed with exact size data buffers as we don't
know the size of the packet we will receive, but as a generic pool of
buffers to be used by the receive channel
- NAPI type of operation (polling instead of interrupt driven transfer)
- without this we can not sustain gigabit speeds and we need to support NAPI
- not to limit this to networking, but other high performance operations
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-12-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Split patch for review containing: defines, structs, io and low level
functions and interrupt callbacks.
DMA driver for
Texas Instruments K3 NAVSS Unified DMA – Peripheral Root Complex (UDMA-P)
The UDMA-P is intended to perform similar (but significantly upgraded) functions
as the packet-oriented DMA used on previous SoC devices. The UDMA-P module
supports the transmission and reception of various packet types. The UDMA-P is
architected to facilitate the segmentation and reassembly of SoC DMA data
structure compliant packets to/from smaller data blocks that are natively
compatible with the specific requirements of each connected peripheral. Multiple
Tx and Rx channels are provided within the DMA which allow multiple segmentation
or reassembly operations to be ongoing. The DMA controller maintains state
information for each of the channels which allows packet segmentation and
reassembly operations to be time division multiplexed between channels in order
to share the underlying DMA hardware. An external DMA scheduler is used to
control the ordering and rate at which this multiplexing occurs for Transmit
operations. The ordering and rate of Receive operations is indirectly controlled
by the order in which blocks are pushed into the DMA on the Rx PSI-L interface.
The UDMA-P also supports acting as both a UTC and UDMA-C for its internal
channels. Channels in the UDMA-P can be configured to be either Packet-Based or
Third-Party channels on a channel by channel basis.
The initial driver supports:
- MEM_TO_MEM (TR mode)
- DEV_TO_MEM (Packet / TR mode)
- MEM_TO_DEV (Packet / TR mode)
- Cyclic (Packet / TR mode)
- Metadata for descriptors
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-11-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
New binding document for
Texas Instruments K3 NAVSS Unified DMA – Peripheral Root Complex (UDMA-P).
UDMA-P is introduced as part of the K3 architecture and can be found in
AM654 and j721e.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-10-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In K3 architecture the DMA operates within threads. One end of the thread
is UDMAP, the other is on the peripheral side.
The UDMAP channel configuration depends on the needs of the remote
endpoint and it can be differ from peripheral to peripheral.
This patch adds database for am654 and j721e and small API to fetch the
PSI-L endpoint configuration from the database which should only used by
the DMA driver(s).
Another API is added for native peripherals to give possibility to pass new
configuration for the threads they are using, which is needed to be able to
handle changes caused by different firmware loaded for the peripheral for
example.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-9-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The K3 DMA architecture uses CPPI5 (Communications Port Programming
Interface) specified descriptors over PSI-L bus within NAVSS.
The header provides helpers, macros to work with these descriptors in a
consistent way.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-8-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
dmaengine_get_direction_text() can be useful when the direction is printed
out. The text is easier to comprehend than the number.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-7-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
A DMA hardware can have big cache or FIFO and the amount of data sitting in
the DMA fabric can be an interest for the clients.
For example in audio we want to know the delay in the data flow and in case
the DMA have significantly large FIFO/cache, it can affect the latenc/delay
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-6-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The metadata is best described as side band data or parameters traveling
alongside the data DMAd by the DMA engine. It is data
which is understood by the peripheral and the peripheral driver only, the
DMA engine see it only as data block and it is not interpreting it in any
way.
The metadata can be different per descriptor as it is a parameter for the
data being transferred.
If the DMA supports per descriptor metadata it can implement the attach,
get_ptr/set_len callbacks.
Client drivers must only use either attach or get_ptr/set_len to avoid
misconfiguration.
Client driver can check if a given metadata mode is supported by the
channel during probe time with
dmaengine_is_metadata_mode_supported(chan, DESC_METADATA_CLIENT);
dmaengine_is_metadata_mode_supported(chan, DESC_METADATA_ENGINE);
and based on this information can use either mode.
Wrappers are also added for the metadata_ops.
To be used in DESC_METADATA_CLIENT mode:
dmaengine_desc_attach_metadata()
To be used in DESC_METADATA_ENGINE mode:
dmaengine_desc_get_metadata_ptr()
dmaengine_desc_set_metadata_len()
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-5-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Update the provider and client documentation with details about the
metadata support.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20191223110458.30766-4-peter.ujfalusi@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The Ring Accelerator (RINGACC or RA) provides hardware acceleration to
enable straightforward passing of work between a producer and a consumer.
There is one RINGACC module per NAVSS on TI AM65x SoCs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJeH1VYAAoJEHJsHOdBp5c/k4cP/RHrk67mB8L6JC7G9H1XWedM
w4aG80q7u8dAydk9k3wJM1Y4ChXfflR3JYc5KBghrsQiRPF8myh4JEwi3+8jpTT5
LUOJifaEtR79VuNCHyG9Wgni3LWYnp7HmG4jM1bzcDwLgpf4I8cBwXkCfW0kmyPq
HAaFLWo1net3TkhSnUkbFUn54vESl569/D6FIfFFK2DPJH/gLUKQ6W8Xnd1uKBaI
9RpnN6eGA0ZWZkZG+Xu7XA/VfC/AR+wxbCr0l1jHB9+4CrngOEwsTnP1SFxFsD1v
L7zAT5JguNg13jbELVgTpUX8X5/XBfbOzIrNZpeXnl0fl8p2SU25n8hw8KdYjKCP
5puO6wye5NnliQs5hLlMTydF77VazAqdBhLz5i1OMa+cuA1zVzm7dyIZBAWbI8IC
TY86h9eGyGKELjljrYhW8ijARbzu/J3SpO3cSklvL/BfXtlVYen5d0mZxsKBDWOi
bni1yV37b65IWjkimwzkaVq/sN9jWTAF2a192SkKeEBqnms5jxKDeCRq9qSxpCWv
ONFROd6WXImDTk/MLgo4EdAq+ProoBFR+YDLModSAv3fZaBFUgi5mU5Xnx1+cAFq
SL9TguzvXhUv6o6ywNZSsSM/7I2iB5uOMRKjCeGg2x1JN9lyOMzrVlJ8JzzUyKV3
iiKDJjNdZD0/Rn2fl5a1
=m1rI
-----END PGP SIGNATURE-----
Merge TI ringacc driver from Santosh
This is for dependency of new TI ringacc dmaengine drivers
Merge tag 'drivers_soc_for_5.6' into topic/ti
SOC: TI Keystone Ring Accelerator driver
The Ring Accelerator (RINGACC or RA) provides hardware acceleration to
enable straightforward passing of work between a producer and a consumer.
There is one RINGACC module per NAVSS on TI AM65x SoCs.
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The Ring Accelerator (RINGACC or RA) provides hardware acceleration to
enable straightforward passing of work between a producer and a consumer.
There is one RINGACC module per NAVSS on TI AM65x SoCs.
The RINGACC converts constant-address read and write accesses to equivalent
read or write accesses to a circular data structure in memory. The RINGACC
eliminates the need for each DMA controller which needs to access ring
elements from having to know the current state of the ring (base address,
current offset). The DMA controller performs a read or write access to a
specific address range (which maps to the source interface on the RINGACC)
and the RINGACC replaces the address for the transaction with a new address
which corresponds to the head or tail element of the ring (head for reads,
tail for writes). Since the RINGACC maintains the state, multiple DMA
controllers or channels are allowed to coherently share the same rings as
applicable. The RINGACC is able to place data which is destined towards
software into cached memory directly.
Supported ring modes:
- Ring Mode
- Messaging Mode
- Credentials Mode
- Queue Manager Mode
TI-SCI integration:
Texas Instrument's System Control Interface (TI-SCI) Message Protocol now
has control over Ringacc module resources management (RM) and Rings
configuration.
The corresponding support of TI-SCI Ringacc module RM protocol
introduced as option through DT parameters:
- ti,sci: phandle on TI-SCI firmware controller DT node
- ti,sci-dev-id: TI-SCI device identifier as per TI-SCI firmware spec
if both parameters present - Ringacc driver will configure/free/reset Rings
using TI-SCI Message Ringacc RM Protocol.
The Ringacc driver manages Rings allocation by itself now and requests
TI-SCI firmware to allocate and configure specific Rings only. It's done
this way because, Linux driver implements two stage Rings allocation and
configuration (allocate ring and configure ring) while TI-SCI Message
Protocol supports only one combined operation (allocate+configure).
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Tero Kristo <t-kristo@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
The Ring Accelerator (RINGACC or RA) provides hardware acceleration to
enable straightforward passing of work between a producer and a consumer.
There is one RINGACC module per NAVSS on TI AM65x and j721e.
This patch introduces RINGACC device tree bindings.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
On prep, a spin lock is taken and the next entry in the circular buffer
is filled. On submit, the valid bit is set in the hardware descriptor
and the lock is released.
The DMA engine is started (if it's not already running) when the client
calls dma_async_issue_pending().
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20200103212021.2881-4-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Allocate DMA coherent memory for the ring of DMA descriptors and
program the appropriate hardware registers.
A tasklet is created which is triggered on an interrupt to process
all the finished requests. Additionally, any remaining descriptors
are aborted when the hardware is removed or the resources freed.
Use an RCU pointer to synchronize PCI device unbind.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20200103212021.2881-3-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Some PLX Switches can expose DMA engines via extra PCI functions
on the upstream port. Each function will have one DMA channel.
This patch is just the core PCI driver skeleton and dma
engine registration.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20200103212021.2881-2-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The current descriptor is not on any list of the virtual DMA channel.
Once sdma_terminate_all() is called when a descriptor is currently
in flight then this one is forgotten to be freed. We have to call
vchan_terminate_vdesc() on this descriptor to re-add it to the lists.
Now that we also free the currently running descriptor we can (and
actually have to) remove the current descriptor from its list also
for the cyclic case.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Robin Gong <yibin.gong@nxp.com>
Tested-by: Robin Gong <yibin.gong@nxp.com>
Link: https://lore.kernel.org/r/20191216105328.15198-10-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In sdma_tx_status() we must first find the current sdma_desc. In cyclic
mode we assume that this can always be found with vchan_find_desc().
This is true because do not remove the current descriptor from the
desc_issued list:
/*
* Do not delete the node in desc_issued list in cyclic mode, otherwise
* the desc allocated will never be freed in vchan_dma_desc_free_list
*/
if (!(sdmac->flags & IMX_DMA_SG_LOOP))
list_del(&vd->node);
We will change this in the next step, so check if the current descriptor is
the desired one also for the cyclic case.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20191216105328.15198-9-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Rename sdma_disable_channel_async() after the hook it implements, like
done for all other functions in the SDMA driver.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20191216105328.15198-8-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
vchan_dma_desc_free_list() basically open codes vchan_vdesc_fini() in its
loop body. Call it directly rather than duplicating the code.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191216105328.15198-7-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
All list operations are protected by &vc->lock. As vchan_vdesc_fini()
is called unlocked add the missing locking around the list operations.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191216105328.15198-6-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
vchan_vdesc_fini() shouldn't be called under a spin_lock. This is done
in two places, once in vchan_terminate_vdesc() and once in
vchan_synchronize(). Instead of freeing the vdesc right away, collect
the aborted vdescs on a separate list and free them along with the other
vdescs. The terminated descs are also freed in vchan_synchronize as done
before this patch.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191216105328.15198-5-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
vchan_dma_desc_free_list() basically open codes vchan_vdesc_fini() in
the loop body. One difference is an additional debug message. As this
isn't overly useful remove it.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191216105328.15198-4-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Originally freeing descriptors was split into a locked and an unlocked
part. The locked part in vchan_get_all_descriptors() collected all
descriptors on a separate list_head. This was done to allow iterating
over that new list in vchan_dma_desc_free_list() without a lock held.
This became broken in 13bb26ae88 ("dmaengine: virt-dma: don't always
free descriptor upon completion"). With this commit
vchan_dma_desc_free_list() no longer exclusively operates on the
separate list, but starts to put descriptors which can be reused back on
&vc->desc_allocated. This list operation should have been locked, but
wasn't.
In the mean time drivers started to call vchan_dma_desc_free_list() with
their lock held so that we now have the situation that
vchan_dma_desc_free_list() is called locked from some drivers and
unlocked from others.
To clean this up we have to do two things:
1. Add missing locking in vchan_dma_desc_free_list()
2. Make sure drivers call vchan_dma_desc_free_list() unlocked
This needs to be done atomically, so in this patch the locking is added
and all drivers are fixed.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Green Wan <green.wan@sifive.com>
Tested-by: Green Wan <green.wan@sifive.com>
Link: https://lore.kernel.org/r/20191216105328.15198-3-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
vchan_vdesc_fini() can't be called locked. Instead, call
vchan_terminate_vdesc() which delays the freeing of the descriptor to
vchan_synchronize().
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20191216105328.15198-2-s.hauer@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
error log for dma_channel_table_init() failure pointed a mere
"initialization failure", which is not very helpful message, so print
additional details like function name and error code.
Signed-off-by: Vinod Koul <vkoul@kernel.org>
We call dma_device_put() and module_put() after invoking
.device_free_chan_resources callback, but we should also take care of
router devices and invoke this after .route_free callback. So move it
after .route_free
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Don't allocate memory using the devm infrastructure and instead call
kfree with the new dmaengine device_release call back. This ensures
the structures are available until the last reference is dropped.
We also need to ensure we call ioat_shutdown() in ioat_remove() so
that all the channels are quiesced and further transaction fails.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20191216190120.21374-6-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Adding a reference count helps drivers to properly implement the unbind
while in use case.
References are taken and put every time a channel is allocated or freed.
Once the final reference is put, the device is removed from the
dma_device_list and a release callback function is called to signal
the driver to free the memory.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20191216190120.21374-5-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
So it can be called by a release function which is needed higher up in
the code. No functional changes intended.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20191216190120.21374-4-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The module reference is taken to ensure the callbacks still exist
when they are called. If the channel holds the last reference to the
module, the module can disappear before device_free_chan_resources() is
called and would cause a call into free'd memory.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20191216190120.21374-3-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
dma_chan_to_owner() dereferences the driver from the struct device to
obtain the owner and call module_[get|put](). However, if the backing
device is unbound before the dma_device is unregistered, the driver
will be cleared and this will cause a NULL pointer dereference.
Instead, store a pointer to the owner module in the dma_device struct
so the module reference can be properly put when the channel is put, even
if the backing device was destroyed first.
This change helps to support a safer unbind of DMA engines.
If the dma_device is unregistered in the driver's remove function,
there's no guarantee that there are no existing clients and a users
action may trigger the WARN_ONCE in dma_async_device_unregister()
which is unlikely to leave the system in a consistent state.
Instead, a better approach is to allow the backing driver to go away
and fail any subsequent requests to it.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20191216190120.21374-2-logang@deltatee.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
QORIQ LS1028A soc used fsl,vf610-edma, but it has a little bit different
from others, so add new compatible to distinguish them.
Signed-off-by: Peng Ma <peng.ma@nxp.com>
Link: https://lore.kernel.org/r/20191212033714.4090-3-peng.ma@nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 2a03c13145 ("dmaengine: ti: edma: add missed operations")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Link: https://lore.kernel.org/r/20191212114622.127322-1-weiyongjun1@huawei.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>