linux_dsm_epyc7002/Documentation/dmaengine/client.txt

223 lines
8.8 KiB
Plaintext
Raw Normal View History

DMA Engine API Guide
====================
Vinod Koul <vinod dot koul at intel.com>
NOTE: For DMA Engine usage in async_tx please see:
Documentation/crypto/async-tx-api.txt
Below is a guide to device driver writers on how to use the Slave-DMA API of the
DMA Engine. This is applicable only for slave DMA usage only.
The slave DMA usage consists of following steps:
1. Allocate a DMA slave channel
2. Set slave and controller specific parameters
3. Get a descriptor for transaction
4. Submit the transaction
5. Issue pending requests and wait for callback notification
1. Allocate a DMA slave channel
Channel allocation is slightly different in the slave DMA context,
client drivers typically need a channel from a particular DMA
controller only and even in some cases a specific channel is desired.
dmaengine: core: Introduce new, universal API to request a channel The two API function can cover most, if not all current APIs used to request a channel. With minimal effort dmaengine drivers, platforms and dmaengine user drivers can be converted to use the two function. struct dma_chan *dma_request_chan_by_mask(const dma_cap_mask_t *mask); To request any channel matching with the requested capabilities, can be used to request channel for memcpy, memset, xor, etc where no hardware synchronization is needed. struct dma_chan *dma_request_chan(struct device *dev, const char *name); To request a slave channel. The dma_request_chan() will try to find the channel via DT, ACPI or in case if the kernel booted in non DT/ACPI mode it will use a filter lookup table and retrieves the needed information from the dma_slave_map provided by the DMA drivers. This legacy mode needs changes in platform code, in dmaengine drivers and finally the dmaengine user drivers can be converted: For each dmaengine driver an array of DMA device, slave and the parameter for the filter function needs to be added: static const struct dma_slave_map da830_edma_map[] = { { "davinci-mcasp.0", "rx", EDMA_FILTER_PARAM(0, 0) }, { "davinci-mcasp.0", "tx", EDMA_FILTER_PARAM(0, 1) }, { "davinci-mcasp.1", "rx", EDMA_FILTER_PARAM(0, 2) }, { "davinci-mcasp.1", "tx", EDMA_FILTER_PARAM(0, 3) }, { "davinci-mcasp.2", "rx", EDMA_FILTER_PARAM(0, 4) }, { "davinci-mcasp.2", "tx", EDMA_FILTER_PARAM(0, 5) }, { "spi_davinci.0", "rx", EDMA_FILTER_PARAM(0, 14) }, { "spi_davinci.0", "tx", EDMA_FILTER_PARAM(0, 15) }, { "da830-mmc.0", "rx", EDMA_FILTER_PARAM(0, 16) }, { "da830-mmc.0", "tx", EDMA_FILTER_PARAM(0, 17) }, { "spi_davinci.1", "rx", EDMA_FILTER_PARAM(0, 18) }, { "spi_davinci.1", "tx", EDMA_FILTER_PARAM(0, 19) }, }; This information is going to be needed by the dmaengine driver, so modification to the platform_data is needed, and the driver map should be added to the pdata of the DMA driver: da8xx_edma0_pdata.slave_map = da830_edma_map; da8xx_edma0_pdata.slavecnt = ARRAY_SIZE(da830_edma_map); The DMA driver then needs to configure the needed device -> filter_fn mapping before it registers with dma_async_device_register() : ecc->dma_slave.filter_map.map = info->slave_map; ecc->dma_slave.filter_map.mapcnt = info->slavecnt; ecc->dma_slave.filter_map.fn = edma_filter_fn; When neither DT or ACPI lookup is available the dma_request_chan() will try to match the requester's device name with the filter_map's list of device names, when a match found it will use the information from the dma_slave_map to get the channel with the dma_get_channel() internal function. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-12-15 03:47:40 +07:00
To request a channel dma_request_chan() API is used.
Interface:
dmaengine: core: Introduce new, universal API to request a channel The two API function can cover most, if not all current APIs used to request a channel. With minimal effort dmaengine drivers, platforms and dmaengine user drivers can be converted to use the two function. struct dma_chan *dma_request_chan_by_mask(const dma_cap_mask_t *mask); To request any channel matching with the requested capabilities, can be used to request channel for memcpy, memset, xor, etc where no hardware synchronization is needed. struct dma_chan *dma_request_chan(struct device *dev, const char *name); To request a slave channel. The dma_request_chan() will try to find the channel via DT, ACPI or in case if the kernel booted in non DT/ACPI mode it will use a filter lookup table and retrieves the needed information from the dma_slave_map provided by the DMA drivers. This legacy mode needs changes in platform code, in dmaengine drivers and finally the dmaengine user drivers can be converted: For each dmaengine driver an array of DMA device, slave and the parameter for the filter function needs to be added: static const struct dma_slave_map da830_edma_map[] = { { "davinci-mcasp.0", "rx", EDMA_FILTER_PARAM(0, 0) }, { "davinci-mcasp.0", "tx", EDMA_FILTER_PARAM(0, 1) }, { "davinci-mcasp.1", "rx", EDMA_FILTER_PARAM(0, 2) }, { "davinci-mcasp.1", "tx", EDMA_FILTER_PARAM(0, 3) }, { "davinci-mcasp.2", "rx", EDMA_FILTER_PARAM(0, 4) }, { "davinci-mcasp.2", "tx", EDMA_FILTER_PARAM(0, 5) }, { "spi_davinci.0", "rx", EDMA_FILTER_PARAM(0, 14) }, { "spi_davinci.0", "tx", EDMA_FILTER_PARAM(0, 15) }, { "da830-mmc.0", "rx", EDMA_FILTER_PARAM(0, 16) }, { "da830-mmc.0", "tx", EDMA_FILTER_PARAM(0, 17) }, { "spi_davinci.1", "rx", EDMA_FILTER_PARAM(0, 18) }, { "spi_davinci.1", "tx", EDMA_FILTER_PARAM(0, 19) }, }; This information is going to be needed by the dmaengine driver, so modification to the platform_data is needed, and the driver map should be added to the pdata of the DMA driver: da8xx_edma0_pdata.slave_map = da830_edma_map; da8xx_edma0_pdata.slavecnt = ARRAY_SIZE(da830_edma_map); The DMA driver then needs to configure the needed device -> filter_fn mapping before it registers with dma_async_device_register() : ecc->dma_slave.filter_map.map = info->slave_map; ecc->dma_slave.filter_map.mapcnt = info->slavecnt; ecc->dma_slave.filter_map.fn = edma_filter_fn; When neither DT or ACPI lookup is available the dma_request_chan() will try to match the requester's device name with the filter_map's list of device names, when a match found it will use the information from the dma_slave_map to get the channel with the dma_get_channel() internal function. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-12-15 03:47:40 +07:00
struct dma_chan *dma_request_chan(struct device *dev, const char *name);
Which will find and return the 'name' DMA channel associated with the 'dev'
device. The association is done via DT, ACPI or board file based
dma_slave_map matching table.
A channel allocated via this interface is exclusive to the caller,
until dma_release_channel() is called.
2. Set slave and controller specific parameters
Next step is always to pass some specific information to the DMA
driver. Most of the generic information which a slave DMA can use
is in struct dma_slave_config. This allows the clients to specify
DMA direction, DMA addresses, bus widths, DMA burst lengths etc
for the peripheral.
If some DMA controllers have more parameters to be sent then they
should try to embed struct dma_slave_config in their controller
specific structure. That gives flexibility to client to pass more
parameters, if required.
Interface:
int dmaengine_slave_config(struct dma_chan *chan,
struct dma_slave_config *config)
Please see the dma_slave_config structure definition in dmaengine.h
for a detailed explanation of the struct members. Please note
that the 'direction' member will be going away as it duplicates the
direction given in the prepare call.
3. Get a descriptor for transaction
For slave usage the various modes of slave transfers supported by the
DMA-engine are:
slave_sg - DMA a list of scatter gather buffers from/to a peripheral
dma_cyclic - Perform a cyclic DMA operation from/to a peripheral till the
operation is explicitly stopped.
interleaved_dma - This is common to Slave as well as M2M clients. For slave
address of devices' fifo could be already known to the driver.
Various types of operations could be expressed by setting
appropriate values to the 'dma_interleaved_template' members.
A non-NULL return of this transfer API represents a "descriptor" for
the given transaction.
Interface:
struct dma_async_tx_descriptor *dmaengine_prep_slave_sg(
struct dma_chan *chan, struct scatterlist *sgl,
unsigned int sg_len, enum dma_data_direction direction,
unsigned long flags);
struct dma_async_tx_descriptor *dmaengine_prep_dma_cyclic(
struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
size_t period_len, enum dma_data_direction direction);
struct dma_async_tx_descriptor *dmaengine_prep_interleaved_dma(
struct dma_chan *chan, struct dma_interleaved_template *xt,
unsigned long flags);
The peripheral driver is expected to have mapped the scatterlist for
the DMA operation prior to calling dmaengine_prep_slave_sg(), and must
keep the scatterlist mapped until the DMA operation has completed.
The scatterlist must be mapped using the DMA struct device.
If a mapping needs to be synchronized later, dma_sync_*_for_*() must be
called using the DMA struct device, too.
So, normal setup should look like this:
nr_sg = dma_map_sg(chan->device->dev, sgl, sg_len);
if (nr_sg == 0)
/* error */
desc = dmaengine_prep_slave_sg(chan, sgl, nr_sg, direction, flags);
Once a descriptor has been obtained, the callback information can be
added and the descriptor must then be submitted. Some DMA engine
drivers may hold a spinlock between a successful preparation and
submission so it is important that these two operations are closely
paired.
Note:
Although the async_tx API specifies that completion callback
routines cannot submit any new operations, this is not the
case for slave/cyclic DMA.
For slave DMA, the subsequent transaction may not be available
for submission prior to callback function being invoked, so
slave DMA callbacks are permitted to prepare and submit a new
transaction.
For cyclic DMA, a callback function may wish to terminate the
dmaengine: Add transfer termination synchronization support The DMAengine API has a long standing race condition that is inherent to the API itself. Calling dmaengine_terminate_all() is supposed to stop and abort any pending or active transfers that have previously been submitted. Unfortunately it is possible that this operation races against a currently running (or with some drivers also scheduled) completion callback. Since the API allows dmaengine_terminate_all() to be called from atomic context as well as from within a completion callback it is not possible to synchronize to the execution of the completion callback from within dmaengine_terminate_all() itself. This means that a user of the DMAengine API does not know when it is safe to free resources used in the completion callback, which can result in a use-after-free race condition. This patch addresses the issue by introducing an explicit synchronization primitive to the DMAengine API called dmaengine_synchronize(). The existing dmaengine_terminate_all() is deprecated in favor of dmaengine_terminate_sync() and dmaengine_terminate_async(). The former aborts all pending and active transfers and synchronizes to the current context, meaning it will wait until all running completion callbacks have finished. This means it is only possible to call this function from non-atomic context. The later function does not synchronize, but can still be used in atomic context or from within a complete callback. It has to be followed up by dmaengine_synchronize() before a client can free the resources used in a completion callback. In addition to this the semantics of the device_terminate_all() callback are slightly relaxed by this patch. It is now OK for a driver to only schedule the termination of the active transfer, but does not necessarily have to wait until the DMA controller has completely stopped. The driver must ensure though that the controller has stopped and no longer accesses any memory when the device_synchronize() callback returns. This was in part done since most drivers do not pay attention to this anyway at the moment and to emphasize that this needs to be done when the device_synchronize() callback is implemented. But it also helps with implementing support for devices where stopping the controller can require operations that may sleep. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-10-20 16:46:28 +07:00
DMA via dmaengine_terminate_async().
Therefore, it is important that DMA engine drivers drop any
locks before calling the callback function which may cause a
deadlock.
Note that callbacks will always be invoked from the DMA
engines tasklet, never from interrupt context.
4. Submit the transaction
Once the descriptor has been prepared and the callback information
added, it must be placed on the DMA engine drivers pending queue.
Interface:
dma_cookie_t dmaengine_submit(struct dma_async_tx_descriptor *desc)
This returns a cookie can be used to check the progress of DMA engine
activity via other DMA engine calls not covered in this document.
dmaengine_submit() will not start the DMA operation, it merely adds
it to the pending queue. For this, see step 5, dma_async_issue_pending.
5. Issue pending DMA requests and wait for callback notification
The transactions in the pending queue can be activated by calling the
issue_pending API. If channel is idle then the first transaction in
queue is started and subsequent ones queued up.
On completion of each DMA operation, the next in queue is started and
a tasklet triggered. The tasklet will then call the client driver
completion callback routine for notification, if set.
Interface:
void dma_async_issue_pending(struct dma_chan *chan);
Further APIs:
dmaengine: Add transfer termination synchronization support The DMAengine API has a long standing race condition that is inherent to the API itself. Calling dmaengine_terminate_all() is supposed to stop and abort any pending or active transfers that have previously been submitted. Unfortunately it is possible that this operation races against a currently running (or with some drivers also scheduled) completion callback. Since the API allows dmaengine_terminate_all() to be called from atomic context as well as from within a completion callback it is not possible to synchronize to the execution of the completion callback from within dmaengine_terminate_all() itself. This means that a user of the DMAengine API does not know when it is safe to free resources used in the completion callback, which can result in a use-after-free race condition. This patch addresses the issue by introducing an explicit synchronization primitive to the DMAengine API called dmaengine_synchronize(). The existing dmaengine_terminate_all() is deprecated in favor of dmaengine_terminate_sync() and dmaengine_terminate_async(). The former aborts all pending and active transfers and synchronizes to the current context, meaning it will wait until all running completion callbacks have finished. This means it is only possible to call this function from non-atomic context. The later function does not synchronize, but can still be used in atomic context or from within a complete callback. It has to be followed up by dmaengine_synchronize() before a client can free the resources used in a completion callback. In addition to this the semantics of the device_terminate_all() callback are slightly relaxed by this patch. It is now OK for a driver to only schedule the termination of the active transfer, but does not necessarily have to wait until the DMA controller has completely stopped. The driver must ensure though that the controller has stopped and no longer accesses any memory when the device_synchronize() callback returns. This was in part done since most drivers do not pay attention to this anyway at the moment and to emphasize that this needs to be done when the device_synchronize() callback is implemented. But it also helps with implementing support for devices where stopping the controller can require operations that may sleep. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-10-20 16:46:28 +07:00
1. int dmaengine_terminate_sync(struct dma_chan *chan)
int dmaengine_terminate_async(struct dma_chan *chan)
int dmaengine_terminate_all(struct dma_chan *chan) /* DEPRECATED */
This causes all activity for the DMA channel to be stopped, and may
discard data in the DMA FIFO which hasn't been fully transferred.
No callback functions will be called for any incomplete transfers.
dmaengine: Add transfer termination synchronization support The DMAengine API has a long standing race condition that is inherent to the API itself. Calling dmaengine_terminate_all() is supposed to stop and abort any pending or active transfers that have previously been submitted. Unfortunately it is possible that this operation races against a currently running (or with some drivers also scheduled) completion callback. Since the API allows dmaengine_terminate_all() to be called from atomic context as well as from within a completion callback it is not possible to synchronize to the execution of the completion callback from within dmaengine_terminate_all() itself. This means that a user of the DMAengine API does not know when it is safe to free resources used in the completion callback, which can result in a use-after-free race condition. This patch addresses the issue by introducing an explicit synchronization primitive to the DMAengine API called dmaengine_synchronize(). The existing dmaengine_terminate_all() is deprecated in favor of dmaengine_terminate_sync() and dmaengine_terminate_async(). The former aborts all pending and active transfers and synchronizes to the current context, meaning it will wait until all running completion callbacks have finished. This means it is only possible to call this function from non-atomic context. The later function does not synchronize, but can still be used in atomic context or from within a complete callback. It has to be followed up by dmaengine_synchronize() before a client can free the resources used in a completion callback. In addition to this the semantics of the device_terminate_all() callback are slightly relaxed by this patch. It is now OK for a driver to only schedule the termination of the active transfer, but does not necessarily have to wait until the DMA controller has completely stopped. The driver must ensure though that the controller has stopped and no longer accesses any memory when the device_synchronize() callback returns. This was in part done since most drivers do not pay attention to this anyway at the moment and to emphasize that this needs to be done when the device_synchronize() callback is implemented. But it also helps with implementing support for devices where stopping the controller can require operations that may sleep. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-10-20 16:46:28 +07:00
Two variants of this function are available.
dmaengine_terminate_async() might not wait until the DMA has been fully
stopped or until any running complete callbacks have finished. But it is
possible to call dmaengine_terminate_async() from atomic context or from
within a complete callback. dmaengine_synchronize() must be called before it
is safe to free the memory accessed by the DMA transfer or free resources
accessed from within the complete callback.
dmaengine_terminate_sync() will wait for the transfer and any running
complete callbacks to finish before it returns. But the function must not be
called from atomic context or from within a complete callback.
dmaengine_terminate_all() is deprecated and should not be used in new code.
2. int dmaengine_pause(struct dma_chan *chan)
This pauses activity on the DMA channel without data loss.
3. int dmaengine_resume(struct dma_chan *chan)
Resume a previously paused DMA channel. It is invalid to resume a
channel which is not currently paused.
4. enum dma_status dma_async_is_tx_complete(struct dma_chan *chan,
dma_cookie_t cookie, dma_cookie_t *last, dma_cookie_t *used)
This can be used to check the status of the channel. Please see
the documentation in include/linux/dmaengine.h for a more complete
description of this API.
This can be used in conjunction with dma_async_is_complete() and
the cookie returned from dmaengine_submit() to check for
completion of a specific DMA transaction.
Note:
Not all DMA engine drivers can return reliable information for
a running DMA channel. It is recommended that DMA engine users
pause or stop (via dmaengine_terminate_all()) the channel before
using this API.
dmaengine: Add transfer termination synchronization support The DMAengine API has a long standing race condition that is inherent to the API itself. Calling dmaengine_terminate_all() is supposed to stop and abort any pending or active transfers that have previously been submitted. Unfortunately it is possible that this operation races against a currently running (or with some drivers also scheduled) completion callback. Since the API allows dmaengine_terminate_all() to be called from atomic context as well as from within a completion callback it is not possible to synchronize to the execution of the completion callback from within dmaengine_terminate_all() itself. This means that a user of the DMAengine API does not know when it is safe to free resources used in the completion callback, which can result in a use-after-free race condition. This patch addresses the issue by introducing an explicit synchronization primitive to the DMAengine API called dmaengine_synchronize(). The existing dmaengine_terminate_all() is deprecated in favor of dmaengine_terminate_sync() and dmaengine_terminate_async(). The former aborts all pending and active transfers and synchronizes to the current context, meaning it will wait until all running completion callbacks have finished. This means it is only possible to call this function from non-atomic context. The later function does not synchronize, but can still be used in atomic context or from within a complete callback. It has to be followed up by dmaengine_synchronize() before a client can free the resources used in a completion callback. In addition to this the semantics of the device_terminate_all() callback are slightly relaxed by this patch. It is now OK for a driver to only schedule the termination of the active transfer, but does not necessarily have to wait until the DMA controller has completely stopped. The driver must ensure though that the controller has stopped and no longer accesses any memory when the device_synchronize() callback returns. This was in part done since most drivers do not pay attention to this anyway at the moment and to emphasize that this needs to be done when the device_synchronize() callback is implemented. But it also helps with implementing support for devices where stopping the controller can require operations that may sleep. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-10-20 16:46:28 +07:00
5. void dmaengine_synchronize(struct dma_chan *chan)
Synchronize the termination of the DMA channel to the current context.
This function should be used after dmaengine_terminate_async() to synchronize
the termination of the DMA channel to the current context. The function will
wait for the transfer and any running complete callbacks to finish before it
returns.
If dmaengine_terminate_async() is used to stop the DMA channel this function
must be called before it is safe to free memory accessed by previously
submitted descriptors or to free any resources accessed within the complete
callback of previously submitted descriptors.
The behavior of this function is undefined if dma_async_issue_pending() has
been called between dmaengine_terminate_async() and this function.