Pull RCU updates from Ingo Molnar:
"The main RCU changes in this cycle were:
- Updates to use cond_resched() instead of cond_resched_rcu_qs()
where feasible (currently everywhere except in kernel/rcu and in
kernel/torture.c). Also a couple of fixes to avoid sending IPIs to
offline CPUs.
- Updates to simplify RCU's dyntick-idle handling.
- Updates to remove almost all uses of smp_read_barrier_depends() and
read_barrier_depends().
- Torture-test updates.
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
torture: Save a line in stutter_wait(): while -> for
torture: Eliminate torture_runnable and perf_runnable
torture: Make stutter less vulnerable to compilers and races
locking/locktorture: Fix num reader/writer corner cases
locking/locktorture: Fix rwsem reader_delay
torture: Place all torture-test modules in one MAINTAINERS group
rcutorture/kvm-build.sh: Skip build directory check
rcutorture: Simplify functions.sh include path
rcutorture: Simplify logging
rcutorture/kvm-recheck-*: Improve result directory readability check
rcutorture/kvm.sh: Support execution from any directory
rcutorture/kvm.sh: Use consistent help text for --qemu-args
rcutorture/kvm.sh: Remove unused variable, `alldone`
rcutorture: Remove unused script, config2frag.sh
rcutorture/configinit: Fix build directory error message
rcutorture: Preempt RCU-preempt readers more vigorously
torture: Reduce #ifdefs for preempt_schedule()
rcu: Remove have_rcu_nocb_mask from tree_plugin.h
rcu: Add comment giving debug strategy for double call_rcu()
tracing, rcu: Hide trace event rcu_nocb_wake when not used
...
The type of arg passed to dmatest_callback is struct dmatest_done.
It refers to test_done in struct dmatest_thread, not done_wait.
Fixes: 6f6a23a213 ("dmaengine: dmatest: move callback wait ...")
Signed-off-by: Yang Shunyong <shunyong.yang@hxt-semitech.com>
Acked-by: Adam Wallis <awallis@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The __memzero assembly code is almost identical to memset's except for
two orr instructions. The runtime performance of __memset(p, n) and
memset(p, 0, n) is accordingly almost identical.
However, the memset() macro used to guard against a zero length and to
call __memzero at compile time when the fill value is a constant zero
interferes with compiler optimizations.
Arnd found tha the test against a zero length brings up some new
warnings with gcc v8:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82103
And successively rremoving the test against a zero length and the call
to __memzero optimization produces the following kernel sizes for
defconfig with gcc 6:
text data bss dec hex filename
12248142 6278960 413588 18940690 1210312 vmlinux.orig
12244474 6278960 413588 18937022 120f4be vmlinux.no_zero_test
12239160 6278960 413588 18931708 120dffc vmlinux.no_memzero
So it is probably not worth keeping __memzero around given that the
compiler can do a better job at inlining trivial memset(p,0,n) on its
own. And the memset code already handles a zero length just fine.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
* pm-core: (29 commits)
dmaengine: rcar-dmac: Make DMAC reinit during system resume explicit
PM / runtime: Allow no callbacks in pm_runtime_force_suspend|resume()
PM / runtime: Check ignore_children in pm_runtime_need_not_resume()
PM / runtime: Rework pm_runtime_force_suspend/resume()
PM / wakeup: Print warn if device gets enabled as wakeup source during sleep
PM / core: Propagate wakeup_path status flag in __device_suspend_late()
PM / core: Re-structure code for clearing the direct_complete flag
PM: i2c-designware-platdrv: Optimize power management
PM: i2c-designware-platdrv: Use DPM_FLAG_SMART_PREPARE
PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND
PCI / PM: Use SMART_SUSPEND and LEAVE_SUSPENDED flags for PCIe ports
PM / wakeup: Add device_set_wakeup_path() helper to control wakeup path
PM / core: Assign the wakeup_path status flag in __device_prepare()
PM / wakeup: Do not fail dev_pm_attach_wake_irq() unnecessarily
PM / core: Direct DPM_FLAG_LEAVE_SUSPENDED handling
PM / core: Direct DPM_FLAG_SMART_SUSPEND optimization
PM / core: Add helpers for subsystem callback selection
PM / wakeup: Drop redundant check from device_init_wakeup()
PM / wakeup: Drop redundant check from device_set_wakeup_enable()
PM / wakeup: only recommend "call"ing device_init_wakeup() once
...
The current (empty) system sleep callbacks rely on the PM core to force
a runtime resume to reinitialize the DMAC registers during system
resume. Without a reinitialization, e.g. SCIF DMA will hang silently
after a system resume on R-Car Gen3.
Make this explicit by using pm_runtime_force_{suspend,resume}() as the
system sleep callbacks instead. Use SET_LATE_SYSTEM_SLEEP_PM_OPS() as
DMA engines must be initialized before all DMA slave devices.
Fixes: 17218e0092 "PM / genpd: Stop/start devices without pm_runtime_force_suspend/resume()"
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sparse warns that 'sprd_dma_prep_dma_memcpy' should be static so make it
static.
drivers/dma/sprd-dma.c:713:32: warning:
symbol'sprd_dma_prep_dma_memcpy' was not declared. Should it be static?
Reviewed-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The hidma driver open codes populating address and IRQ resources from DT.
We have standard functions of_address_to_resource and of_irq_to_resource
for this, so use them instead.
The DT binding states each child should have 2 addresses and 1 IRQ, so we
can simplify the logic and do a fixed size resource allocation. Using the
standard of_address_to_resource will also do any address translation which
was missing.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch fixes the below sparse warning in the driver
drivers/dma/xilinx/xilinx_dma.c: In function ‘xilinx_vdma_dma_prep_interleaved’:
drivers/dma/xilinx/xilinx_dma.c:1614:43: warning: variable ‘prev’ set but not used [-Wunused-but-set-variable]
struct xilinx_vdma_tx_segment *segment, *prev = NULL;
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
If the hardware is configured for Scatter Gather(SG) mode,
and hardware is idle, in the control register SG mode bit
must be set to a 0 then back to 1 by the software, to force
the CDMA SG engine to use a new value written to the CURDESC_PNTR
register, failure to do so could result errors from the dmaengine.
This patch updates the same.
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Pull RCU updates from Paul E. McKenney:
- Updates to use cond_resched() instead of cond_resched_rcu_qs()
where feasible (currently everywhere except in kernel/rcu and
in kernel/torture.c). Also a couple of fixes to avoid sending
IPIs to offline CPUs.
- Updates to simplify RCU's dyntick-idle handling.
- Updates to remove almost all uses of smp_read_barrier_depends()
and read_barrier_depends().
- Miscellaneous fixes.
- Torture-test updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Register layout of a typical TPCC_EVT_MUX_M_N register is such that the
lowest numbered event is at the lowest byte address and highest numbered
event at highest byte address. But TPCC_EVT_MUX_60_63 register layout is
different, in that the lowest numbered event is at the highest address
and highest numbered event is at the lowest address. Therefore, modify
ti_am335x_xbar_write() to handle TPCC_EVT_MUX_60_63 register
accordingly.
Signed-off-by: Vignesh R <vigneshr@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The test should be >= ARRAY_SIZE() instead of > ARRAY_SIZE().
Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This avoid the following error when using an initramfs on wandboard quad
Direct firmware load for imx/sdma/sdma-imx6q.bin failed with error -2
Signed-off-by: Nicolas Chauvet <kwizart@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
some typos is comments, so fix them up
/s/enusres/ensures
/s/descripotrs/descriptors
/s/Submited/Submitted
/s/pollling/polling
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch updates the probe banner info based on the ip probed.
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch fixes below.
ERROR: open brace '{' following function definitions go on the next line
+static int xilinx_dma_child_probe(struct xilinx_dma_device *xdev,
+ struct device_node *node) {
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch fixes the kernel doc warnings
in the driver.
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
As per axi dmaengine spec the software must not move the tail pointer
to a location that has not been updated (next descriptor field of the
h/w descriptor should always point to a valid address).
When user submits multiple descriptors on the recv side, with the
current driver flow the last buffer descriptor next descriptor field
points to a invalid location, resulting the invalid data or errors from the
axidma dmaengine.
This patch fixes this issue by creating a buffer descritpor chain during
channel allocation itself and use those buffer descriptors for the
subsequent dma operations.
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
VDMA engine default frame buffer configuration is cirular mode.
in this mode dmaengine continuously circles through h/w configured fstore
frame buffers.
When vdma h/w is configured for more than one frame.
for example h/w is configured for n number of frames, user
submits less than n number of frames and triggered the dmaengine
using issue_pending API.
since the h/w (or) driver default configuraiton is circular mode
h/w tries to write/read from an invalid frame buffer resulting
errors from the vdma dmaengine.
This patch fixes this issue by enabling the park mode as
default mode configuration for frame buffers in s/w,
so that driver can handle all cases for "k" frames where n%k==0
(n is a multiple of k) by simply replicating the frame pointers.
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add variable for checking channel idle state to ensure that dma
descriptor is not submitted when dmaengine is in progress.
This will avoid the polling for a bit in the status register to know
dma state in the driver hot path.
Reviewed-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Incase of interrupt property is not present,
Driver is trying to free an invalid irq,
This patch fixes it by adding a check before freeing the irq.
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch fixes the below issues.
--> Need to clear the channel data count register
when overflow interrupts occurs.
--> Reduce the log level from _info to _dbg when
overflow interrupt occurs.
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch fixes the below warning
drivers/dma/xilinx/zynqmp_dma.c: In function 'zynqmp_dma_handle_ovfl_int':
drivers/dma/xilinx/zynqmp_dma.c:522:6: warning: variable 'val' set but not used [-Wunused-but-set-variable]
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch fixes the below kernel doc warnings
drivers/dma/xilinx/zynqmp_dma.c:552: info: Scanning doc for
zynqmp_dma_device_config
drivers/dma/xilinx/zynqmp_dma.c:558: warning: No description found for
return value of 'zynqmp_dma_device_config'
drivers/dma/xilinx/zynqmp_dma.c:649: info: Scanning doc for
zynqmp_dma_free_descriptors
drivers/dma/xilinx/zynqmp_dma.c:653: warning: No description found for
parameter 'chan'
drivers/dma/xilinx/zynqmp_dma.c:653: warning: Excess function parameter
'dchan' description in 'zynqmp_dma_free_descriptors'
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch adds runtime pm support in the driver.
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Previously enabled clks are only disabled if clk_prepare_enable() fails.
However, there are other error paths were the previously enabled
clocks are not disabled.
To fix the problem, fsl_disable_clocks() now takes the number of clocks
that shall be disabled + unprepared. For existing calls were all clocks
were already successfully prepared + enabled, DMAMUX_NR is passed to
disable + unprepare all clocks.
In error paths were only some clocks were successfully prepared +
enabled the loop counter is passed, in order to disable + unprepare
all successfully prepared + enabled clocks.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Andreas Platschek <andreas.platschek@opentech.at>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The location for destination event channel register has been relocated from
offset 0x28 to 0x40. Update the code accordingly.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add support for probing the newer HW and also organize MSI capable hardware
into an array for maintenance reasons.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Driver is missing the interrupts if two requests are queued up at the same
time as the interrupt handler is servicing a request that was just
delivered.
The ISR clears the interrupt at the end but it could be clearing the
interrupt for an outstanding event. Therefore, second interrupt never
arrives.
Clear the interrupt first and then check for completions.
Also, make sure that request start and interrupt clear do not overlap in
time by using a spinlock.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
in error path of jz4740_dma_probe(), call clk_disable_unprepare() to clean
up.
Found by Linux Driver Verification project (linuxtesting.org).
Fixes: 25ce6c35fe MIPS: jz4740: Remove custom DMA API
Signed-off-by: Tobias Jordan <Tobias.Jordan@elektrobit.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Trivial fix to spelling mistake in dev_err error message text.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Commit adfa543e73 ("dmatest: don't use set_freezable_with_signal()")
introduced a bug (that is in fact documented by the patch commit text)
that leaves behind a dangling pointer. Since the done_wait structure is
allocated on the stack, future invocations to the DMATEST can produce
undesirable results (e.g., corrupted spinlocks).
Commit a9df21e34b ("dmaengine: dmatest: warn user when dma test times
out") attempted to WARN the user that the stack was likely corrupted but
did not fix the actual issue.
This patch fixes the issue by pushing the wait queue and callback
structs into the the thread structure. If a failure occurs due to time,
dmaengine_terminate_all will force the callback to safely call
wake_up_all() without possibility of using a freed pointer.
Cc: stable@vger.kernel.org
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=197605
Fixes: adfa543e73 ("dmatest: don't use set_freezable_with_signal()")
Reviewed-by: Sinan Kaya <okaya@codeaurora.org>
Suggested-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
Signed-off-by: Adam Wallis <awallis@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Now that READ_ONCE() implies smp_read_barrier_depends(), the
__cleanup() and ioat_abort_descs() functions no longer need their
smp_read_barrier_depends() calls, which this commit removes.
It is actually not entirely clear why this driver ever included
smp_read_barrier_depends() given that it appears to be x86-only and
given that smp_read_barrier_depends() has no effect whatsoever except
on DEC Alpha.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <dmaengine@vger.kernel.org>
Fix ptr_ret.cocci warnings:
drivers/dma/mic_x100_dma.c:483:1-3: WARNING: PTR_ERR_OR_ZERO can be used
Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
Generated by: scripts/coccinelle/api/ptr_ret.cocci
Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Implement the device_synchronize callback to make sure that the terminated
descriptor is freed.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Implement the device_synchronize callback to make sure that the terminated
descriptor is freed.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Implement the device_synchronize callback to make sure that the terminated
descriptor is freed.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Implement the device_synchronize callback to make sure that the terminated
descriptor is freed.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Implement the device_synchronize callback to make sure that the terminated
descriptor is freed.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Implement the device_synchronize callback to make sure that the terminated
descriptor is freed.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Even with the introduced vchan_synchronize() we can face race when
terminating a cyclic transfer.
If the terminate_all is called after the interrupt handler called
vchan_cyclic_callback(), but before the vchan_complete tasklet is called:
vc->cyclic is set to the cyclic descriptor, but the descriptor itself was
freed up in the driver's terminate_all() callback.
When the vhan_complete() is executed it will try to fetch the vc->cyclic
vdesc, but the pointer is pointing now to uninitialized memory leading to
(hard to reproduce) kernel crash.
In order to fix this, drivers should:
- call vchan_terminate_vdesc() from their terminate_all callback instead
calling their free_desc function to free up the descriptor.
- implement device_synchronize callback and call vchan_synchronize().
This way we can make sure that the descriptor is only going to be freed up
after the vchan_callback was executed in a safe manner.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The vchan_vdesc_fini() can be used to free or reuse a given descriptor
after it has been marked as completed.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
_xt_ is being dereferenced before it is null checked, hence there is a
potential null pointer dereference.
Fix this by moving the pointer dereference after _xt_ has been null
checked.
This issue was detected with the help of Coccinelle.
Fixes: 4483320e24 ("dmaengine: Use Pointer xt after NULL check.")
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
If the last test in 'ioat_dma_self_test()' fails, we must release all
the allocated resources and not just part of them.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
SYS/RT/Audio DMAC includes independent data buffers for reading
and writing. Therefore, the read transfer counter and write transfer
counter have different values.
TCR indicates read counter, and TCRB indicates write counter.
The relationship is like below.
TCR TCRB
[SOURCE] -> [DMAC] -> [SINK]
In the MEM_TO_DEV direction, what really matters is how much data has
been written to the device. If the DMA is interrupted between read and
write, then, the data doesn't end up in the destination, so shouldn't
be counted. TCRB is thus the register we should use in this cases.
In the DEV_TO_MEM direction, the situation is more complex. Both the
read and write side are important. What matters from a data consumer
point of view is how much data has been written to memory.
On the other hand, if the transfer is interrupted between read and
write, we'll end up losing data. It can also be important to report.
In the MEM_TO_MEM direction, what matters is of course how much data
has been written to memory from data consumer point of view.
Here, because read and write have independent data buffers, it will
take a while for TCR and TCRB to become equal. Thus we should check
TCRB in this case, too.
Thus, all cases we should check TCRB instead of TCR.
Without this patch, Sound Capture has noise after PulseAudio support
(= 07b7acb51d ("ASoC: rsnd: update pointer more accurate")), because
the recorder will use wrong residue counter which indicates transferred
from sound device, but in reality the data was not yet put to memory
and recorder will record it.
However, because DMAC is buffering data until it can be transferable
size, TCRB might not be updated.
For example, if consumer doesn't know how much data can be received,
it requests enough size to DMAC. But in reality, it might receive very
few data. In such case, DMAC just buffered it until transferable size,
and no TCRB updated.
In such case, this buffered data will be transferred if CHCR::DE bit was
cleared, and this is happen if rcar_dmac_chan_halt(). In other word, it
happen when consumer called dmaengine_terminate_all().
Because of this behavior, it need to flush buffered data when it returns
"residue" (= dmaengine_tx_status()).
Otherwise, consumer might calculate wrong things if it called
dmaengine_tx_status() and dmaengine_terminate_all() consecutively.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Tested-by: Ryo Kodama <ryo.kodama.vz@renesas.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
DMAC reads data from source device, and buffered it until transferable
size for sink device. Because of this behavior, DMAC is including
buffered data .
Now, CHCR DE bit is controlling DMA transfer enable/disable.
If DE bit was cleared during data transferring, or during buffering,
it will flush buffered data if source device was peripheral device
(The buffered data will be removed if source device was memory).
Because of this behavior, driver should ensure that DE bit is actually
0 after clearing.
This patch adds new rcar_dmac_chcr_de_barrier() and call it after CHCR
register access.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
Tested-by: Ryo Kodama <ryo.kodama.vz@renesas.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This allows DMA client to issue a non-flow controlled TX. In particular
it is needed for the fuse driver that reads fuse registers using APBDMA
to workaround a HW bug that results in hang when CPU and DMA perform
simultaneous access to fuse peripheral.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Updates for this cycle include:
- New driver for Spreadtrum dma controller, ST MDMA and DMAMUX controllers
- PM support for IMG MDC drivers
- Updates to bcm-sba-raid driver and improvements to sun6i driver
- Subsystem conversion for:
- timers to use timer_setup()
- remove usage of PCI pool API
- usage of %p format specifier
- Minor updates to bunch of drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaCn48AAoJEHwUBw8lI4NHTe8P/RpJH8tDat/joT7Hl71stEod
vKa0iSkW2fdwd6PeaRfd+UTloska1NE9rdgfh8pCVveoHjCPQBVBOC7V8DbMtlsi
/IlJjFT74wl2R1aSHcSGoLGsIEyurz+9SK88qCU54OQSjVHSnfmyGI4ycTLQGH9U
zce5JHWHB5MkdftM4eJaSE/t0Md1DBkxadFSQRkwQqqDqoLE7jgJUK0TADRukQqS
fsDYPh/OhYAizAHlmEGuLZQheN0ld5W7n1sGsEnBD88wtBMvYHzAwT17B+BobxEp
jyaoE5nV4AgqWh1mvixrmgKoj2KL3DDC+QeoHYCExdcgIrvc86xN3homx9g9y38a
b99pgDDvXjw4N7S6AmRyQlm/5D0QyjUaoHgGklsaR3ix81dFwDY15aZa8/uQ4EAT
iKH8DxAgOq6aG1MkUycQ/7QTenRbN4yWQQa+Mm5ncoNU8bpazyxf2l5L9OJWpFjX
Q6VagNim+plGeUhpJ4IEfPi7LChXFaYsb1D7A/dqpIRvaYzwsy80b/DNhobGMDF6
eTpny64AKHnozWw/KP5k3DfcYvoU/ytcSsWf8h+CPN7EdLMBqUXFgkVwtyf6WKNc
UPl+2in08GLgfGb+n2IAdaQzlJ4dK2P7f7mx0T4OvRymu35HXd8nJjmMJ5ZyBr1t
Z/0JVfcA66AL+XSt179C
=t9Ix
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-4.15-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine updates from Vinod Koul:
"Updates for this cycle include:
- new driver for Spreadtrum dma controller, ST MDMA and DMAMUX
controllers
- PM support for IMG MDC drivers
- updates to bcm-sba-raid driver and improvements to sun6i driver
- subsystem conversion for:
- timers to use timer_setup()
- remove usage of PCI pool API
- usage of %p format specifier
- minor updates to bunch of drivers"
* tag 'dmaengine-4.15-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (49 commits)
dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type
dmaengine: dmatest: warn user when dma test times out
dmaengine: Revert "rcar-dmac: use TCRB instead of TCR for residue"
dmaengine: stm32_mdma: activate pack/unpack feature
dmaengine: at_hdmac: Remove unnecessary 0x prefixes before %pad
dmaengine: coh901318: Remove unnecessary 0x prefixes before %pad
MAINTAINERS: Step down from a co-maintaner of DW DMAC driver
dmaengine: pch_dma: Replace PCI pool old API
dmaengine: Convert timers to use timer_setup()
dmaengine: sprd: Add Spreadtrum DMA driver
dt-bindings: dmaengine: Add Spreadtrum SC9860 DMA controller
dmaengine: sun6i: Retrieve channel count/max request from devicetree
dmaengine: Build bcm-sba-raid driver as loadable module for iProc SoCs
dmaengine: bcm-sba-raid: Use common GPL comment header
dmaengine: bcm-sba-raid: Use only single mailbox channel
dmaengine: bcm-sba-raid: serialize dma_cookie_complete() using reqs_lock
dmaengine: pl330: fix descriptor allocation fail
dmaengine: rcar-dmac: use TCRB instead of TCR for residue
dmaengine: sun6i: Add support for Allwinner A64 and compatibles
arm64: allwinner: a64: Add devicetree binding for DMA controller
...
The used 0x1f mask is only valid for am335x family of SoC, different family
using this type of crossbar might have different number of electable
events. In case of am43xx family 0x3f mask should have been used for
example.
Instead of trying to handle each family's mask, just use u8 type to store
the mux value since the event offsets are aligned to byte offset.
Fixes: 42dbdcc6bf ("dmaengine: ti-dma-crossbar: Add support for crossbar on AM33xx/AM43xx")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Commit adfa543e73 ("dmatest: don't use set_freezable_with_signal()")
introduced a bug (that is in fact documented by the patch commit text)
that leaves behind a dangling pointer. Since the done_wait structure is
allocated on the stack, future invocations to the DMATEST can produce
undesirable results (e.g., corrupted spinlocks). Ideally, this would be
cleaned up in the thread handler, but at the very least, the kernel
is left in a very precarious scenario that can lead to some long debug
sessions when the crash comes later.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=197605
Signed-off-by: Adam Wallis <awallis@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This reverts commit 847449f23d: ("dmaengine: rcar-dmac: use TCRB instead
of TCR for residue") as it breaks small serial console.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
If source and destination bus width differs pack/unpack MDMA
feature has to be activated for alignment.
This pack/unpack feature implies to have both source/destination address
and buffer length aligned on bus width.
Fixes: a4ffb13c89 ("dmaengine: Add STM32 MDMA driver")
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Since commit 3cab1e7112 ("lib/vsprintf: refactor duplicate code
to special_hex_number()") %pad doesn't need 0x prefix so drop that.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Since commit 3cab1e7112 ("lib/vsprintf: refactor duplicate code
to special_hex_number()") %pad doesn't need 0x prefix so drop that.
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API.
Signed-off-by: Romain Perier <romain.perier@collabora.com>
Acked-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch adds the DMA controller driver for Spreadtrum SC9860 platform.
Signed-off-by: Baolin Wang <baolin.wang@spreadtrum.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
To avoid introduction of a new compatible for each small SoC/DMA controller
variation, move the definition of the channel count to the devicetree.
The number of vchans is no longer explicit, but limited by the highest
port/DMA request number. The result is a slight overallocation for SoCs
with a sparse port mapping.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
By default, we build Broadcom SBA RAID driver as loadable module for
iProc SOCs so that kernel image is little smaller and we load SBA RAID
driver only when required.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch makes the comment header of Broadcom SBA RAID driver
similar to the GPL comment header used across Broadcom driver
sources.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Each mailbox channel used by Broadcom SBA RAID driver is
a separate HW ring.
Currently, Broadcom SBA RAID driver creates one DMA channel
using one or more mailbox channels. When we are using more
than one mailbox channels for a DMA channel, the sba_request
are distributed evenly among multiple mailbox channels which
results in sba_request being completed out-of-order.
The above described out-of-order completion of sba_request
breaks the dma_async_is_complete() API because it assumes
DMA cookies are completed in orderly fashion.
To ensure correct behaviour of dma_async_is_complete() API,
this patch updates Broadcom SBA RAID driver to use only
single mailbox channel. If additional mailbox channels are
specified in DT then those will be ignored.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
As-per documentation in driver/dma/dmaengine.h, the
dma_cookie_complete() API should be called with lock
held.
This patch ensures that Broadcom SBA RAID driver calls
the dma_cookie_complete() API with reqs_lock held.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The patch edf10919 [dmaengine: altera: fix spinlock usage] missed to
change 2 occurrences of spin_unlock_bh() to spin_unlock_irqrestore().
This patch fixes this by moving to the IRQ-safe call in the error
paths as well.
Fixes: edf10919 (dmaengine: altera: fix spinlock usage)
Signed-off-by: Stefan Roese <sr@denx.de>
Reviewed-by: Sylvain Lesne <lesne@alse-fr.com>
[add fixes tag and fix typo in log]
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
If two concurrent threads call pl330_get_desc() when DMAC descriptor
pool is empty it is possible that allocation for one of threads will fail
with message:
kernel: dma-pl330 20078000.dma-controller: pl330_get_desc:2469 ALERT!
Here how that can happen. Thread A calls pl330_get_desc() to get
descriptor. If DMAC descriptor pool is empty pl330_get_desc() allocates
new descriptor on shared pool using add_desc() and then get newly
allocated descriptor using pluck_desc(). At the same time thread B calls
pluck_desc() and take newly allocated descriptor. In that case descriptor
allocation for thread A will fail.
Using on-stack pool for new descriptor allow avoid the issue described.
The patch modify pl330_get_desc() to use on-stack pool for allocation
new descriptors.
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
SYS/RT/Audio DMAC includes independent data buffers for reading
and writing. Therefore, the read transfer counter and write transfer
counter have different values.
TCR indicates read counter, and TCRB indicates write counter.
The relationship is like below.
TCR TCRB
[SOURCE] -> [DMAC] -> [SINK]
In the MEM_TO_DEV direction, what really matters is how much data has
been written to the device. If the DMA is interrupted between read and
write, then, the data doesn't end up in the destination, so shouldn't
be counted. TCRB is thus the register we should use in this cases.
In the DEV_TO_MEM direction, the situation is more complex. Both the
read and write side are important. What matters from a data consumer
point of view is how much data has been written to memory.
On the other hand, if the transfer is interrupted between read and
write, we'll end up losing data. It can also be important to report.
In the MEM_TO_MEM direction, what matters is of course how much data
has been written to memory from data consumer point of view.
Here, because read and write have independent data buffers, it will
take a while for TCR and TCRB to become equal. Thus we should check
TCRB in this case, too.
Thus, all cases we should check TCRB instead of TCR.
Without this patch, Sound Capture has noise after PluseAudio support
(= 07b7acb51d ("ASoC: rsnd: update pointer more accurate")), because
the recorder will use wrong residue counter which indicates transferred
from sound device, but in reality the data was not yet put to memory
and recorder will record it.
Signed-off-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com>
[Kuninori: added detail information in log]
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The A64 SoC has the same dma engine as the H3 (sun8i), with a
reduced amount of physical channels. To allow future reuse of the
compatible, leave the channel count etc. in the config data blank
and retrieve it from the devicetree.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Preparatory patch: If the same compatible is used for different SoCs which
have a common register layout, but different number of channels, the
channel count can no longer be stored in the config. Store it in the
device structure instead.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The H3 supports bursts lengths of 1, 4, 8 and 16 transfers, each with
a width of 1, 2, 4 or 8 bytes.
The register value for the the width is log2-encoded, change the
conversion function to provide the correct value for width == 8.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The current code mixes three distinct operations when transforming
the slave config to register settings:
1. special handling of DMA_SLAVE_BUSWIDTH_UNDEFINED, maxburst == 0
2. range checking
3. conversion of raw to register values
As the range checks depend on the specific SoC, move these out of the
conversion to distinct operations.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
For the H3, the burst lengths field offsets in the channel configuration
register differs from earlier SoC generations.
Using the A31 register macros actually configured the H3 controller
do to bursts of length 1 always, which although working leads to higher
bus utilisation.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The H83T uses a compatible string different from the A23, but requires
the same clock autogating register setting.
The H3 also requires setting the clock autogating register, but has
the register at a different offset.
Add three suitable callbacks for the existing controller generations
and set it in the controller config structure.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add runtime PM support to disable the clock when the h/w is not in use.
The existing clock_prepare_enable is removed from probe() as the clock
is no longer permanently enabled.
Signed-off-by: Ed Blake <ed.blake@sondrel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add suspend / resume handling using suspend_late and resume_early, and
check that all channels are idle before suspending.
DMA drivers should use suspend_late / resume_early to ensure that all
DMA client devices are suspended before the DMA device itself, and that
client devices are resumed after the DMA device. This avoids suspending
the DMA device while transactions are still active.
It is the responsibility of client drivers to terminate all DMA
transactions in their suspend handlers, so there should be no active
transactions by the time suspend_late is called.
There's no need to save and restore registers for MDC during suspend /
resume, as all transactions will be terminated as a result of the
suspend, and all required registers are programmed anyway at the start
of any new transactions following resume.
Signed-off-by: Ed Blake <ed.blake@sondrel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Use the of_device_get_match_data() helper instead of open coding.
Note that when used with DT, there's always a valid match.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
the device's max_burst to 16777215 (EN is 24bit unsigned value) so
clients can take this into consideration when setting up the transfer.
During slave transfer preparation check if the requested maxburst is valid.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
the device's max_burst to 32767 (CIDX is 16bit signed value) so clients
can take this into consideration when setting up the transfer.
During slave transfer preparation check if the requested maxburst is valid.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
hwdesc is being initialized to desc->hwdesc but this is never read
as hwdesc is overwritten in a for-loop. Remove the redundant
initialization and move the declaration of hwdesc into the for-loop.
Cleans up clang warning:
Value stored to 'hwdesc' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Without CONFIG_OF we get a build warning:
warning: (STM32_MDMA) selects DMA_OF which has unmet direct dependencies (DMADEVICES && OF)
This adds a dependency on CONFIG_OF. Since this means
we no longer need to select 'DMA_OF', I'm dropping that line
as well.
Fixes: a4ffb13c89 ("dmaengine: Add STM32 MDMA driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Pointer print was using explict cast and printing as %x which causes below
warn on some arch's so print using %p format specfier.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch adds the driver for the STM32 MDMA controller.
Master Direct memory access (MDMA) is used in order to provide high-speed
data transfer between memory and memory or between peripherals and memory.
MDMA controller provides a master AXI interface for main memory and
peripheral registers access (system access port) and a master AHB
interface only for Cortex-M7 TCM memory access (TCM access port).
MDMA works in conjunction with the standard DMA controllers (DMA1 or DMA2).
It offers up to 64 channels, each dedicated to managing memory access
requests from one of the DMA stream memory buffer or other peripherals
(w/ integrated FIFO).
Signed-off-by: M'boumba Cedric Madianga <cedric.madianga@gmail.com>
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Since this lock is acquired in both process and IRQ context, failing to
to disable IRQs when trying to acquire the lock in process context can
lead to deadlocks.
Signed-off-by: Sylvain Lesne <lesne@alse-fr.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Commit 6084fc2ec4 ("dmaengine: altera: Use macros instead of structs
to describe the registers") introduced a minus sign before a register
offset.
This leads to soft-locks of the DMA controller, since reading the last
status byte is required to pop the response from the FIFO. Failing to
do so will lead to a full FIFO, which means that the DMA controller
will stop processing descriptors.
Signed-off-by: Sylvain Lesne <lesne@alse-fr.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add DMA filters for the sa11x0 DMA channels. This will allow us to
migrate away from directly using the DMA filter function in drivers.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch implements the STM32 DMAMUX driver.
The DMAMUX request multiplexer allows routing a DMA request line between
the peripherals and the DMA controllers of the product. The routing
function is ensured by a programmable multi-channel DMA request line
multiplexer. Each channel selects a unique DMA request line,
unconditionally or synchronously with events from its DMAMUX
synchronization inputs. The DMAMUX may also be used as a DMA request
generator from programmable events on its input trigger signals
Signed-off-by: M'boumba Cedric Madianga <cedric.madianga@gmail.com>
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The bam dmaengine has a circular FIFO to which we
add hw descriptors that describes the transaction.
The FIFO has space for about 4096 hw descriptors.
Currently we add one descriptor and wait for it to
complete with interrupt and then add the next pending
descriptor. In this way, the FIFO is underutilized
since only one descriptor is processed at a time, although
there is space in FIFO for the BAM to process more.
Instead keep adding descriptors to FIFO till its full,
that allows BAM to continue to work on the next descriptor
immediately after signalling completion interrupt for the
previous descriptor.
Also when the client has not set the DMA_PREP_INTERRUPT for
a descriptor, then do not configure BAM to trigger a interrupt
upon completion of that descriptor. This way we get a interrupt
only for the descriptor for which DMA_PREP_INTERRUPT was
requested and there signal completion of all the previous completed
descriptors. So we still do callbacks for all requested descriptors,
but just that the number of interrupts are reduced.
CURRENT:
------ ------- ---------------
|DES 0| |DESC 1| |DESC 2 + INT |
------ ------- ---------------
| | |
| | |
INTERRUPT: (INT) (INT) (INT)
CALLBACK: (CB) (CB) (CB)
MTD_SPEEDTEST READ PAGE: 3560 KiB/s
MTD_SPEEDTEST WRITE PAGE: 2664 KiB/s
IOZONE READ: 2456 KB/s
IOZONE WRITE: 1230 KB/s
bam dma interrupts (after tests): 96508
CHANGE:
------ ------- -------------
|DES 0| |DESC 1 |DESC 2 + INT |
------ ------- --------------
|
|
(INT)
(CB for 0, 1, 2)
MTD_SPEEDTEST READ PAGE: 3860 KiB/s
MTD_SPEEDTEST WRITE PAGE: 2837 KiB/s
IOZONE READ: 2677 KB/s
IOZONE WRITE: 1308 KB/s
bam dma interrupts (after tests): 58806
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Reviewed-by: Andy Gross <andy.gross@linaro.org>
Tested-by: Abhishek Sahu <absahu@codeaurora.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
When looking for unused xbar_out lane we should also protect the set_bit()
call with the same mutex to protect against concurrent threads picking the
same ID.
Fixes: ec9bfa1e1a ("dmaengine: ti-dma-crossbar: dra7: Use bitops instead of idr")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The usage of of_device_get_match_data reduce the code size a bit.
Furthermore, it prevents an improbable dereference when
of_match_device() return NULL.
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Memory to Memory transfers does not have any special alignment needs
regarding to acnt array size, but if one of the areas are in memory mapped
regions (like PCIe memory), we need to make sure that the acnt array size
is aligned with the mem copy parameters.
Before "dmaengine: edma: Optimize memcpy operation" change the memcpy was set
up in a different way: acnt == number of bytes in a word based on
__ffs((src | dest | len), bcnt and ccnt for looping the necessary number of
words to comlete the trasnfer.
Instead of reverting the commit we can fix it to make sure that the ACNT size
is aligned to the traswnfer.
Fixes: df6694f803 (dmaengine: edma: Optimize memcpy operation)
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The driver already supports DMA_DEV_TO_DEV in sdma_config(),
DMA_SLAVE_BUSWIDTH_2_BYTES and DMA_SLAVE_BUSWIDTH_1_BYTE in
sdma_prep_slave_sg(). So this patch adds them to the lists.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The enum xdma_ip_type is only used inside the Xilinx DMA driver and not
exported to any consumers (nor should it be). So move it from the global
header to driver file itself.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
When running in software cyclic mode the driver currently does not go back
to the first segment once the last segment has been reached. Effectively
making the transfer non-cyclic.
Fix this by going back to the first segment once the last segment has been
reached for cyclic transfers.
Special care need to be taken to avoid a segment from being submitted
multiple times concurrently, which could happen for transfers with a number
of segments that is smaller than the DMA controller's internal queue.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
In hardware cyclic mode the submitted segment is repeated. This means
hardware cyclic mode can only be used if the transfer has a single segment.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
- Removal of DMA_SG support as we have no users for this feature
- New driver for Altera / Intel mSGDMA IP core
- Support for memset in dmatest and qcom_hidma driver
- Update for non cyclic mode in k3dma, bunch of update in bam_dma, bcm sba-raid
- Constify device ids across drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZsXteAAoJEHwUBw8lI4NHPwcP/iihF1n7jOQVtUm3zxPvUV+n
GzU7+rqAEDLKBaIttK28LIjgvg0AC/4aiEsosfCzTjpkzMHteRw00YyplwF7/wdM
O0owKOIub4PriDiL6d/SWFnhcWwv0/KLbyKscQcOwwvkksG/mwMn1VfW7alCrz1w
81TOQaW9SxLxL7guJU0aQHljkudT53l8Dgsp55iC9Ccz515Iuu7dQm3DnSG3sYjJ
Ct4u4MWWzDmmKKpbDoYe/Z+fiQT0WKuGfI7QHURVnw5qLo2sDKREWGbThhRG/lZj
YlnLQnkjWwLU5dyX1MyIWipPxe83sjf/7OwJ7XUlLjD6o+lNEuQxjmNkVAh0hNRc
dgrXRuqPRJMW40uOvAMDHTkexxikWc5ggt5LN9dIYDOdaS4Ch5ewf19SRi9pSDap
FZeIWY1FWwQCAU7HQMwSYyRLBjlmEmeSkElkXCd+2wu5aH2oKOMUMbUIYcqL4fjD
qMAR7kfn6e92fDT1gR1ZKL79Cfe9zsCQA3XmecpC/HwqiE3XtfZuDY/73cXD0MeO
SbJUCv4ldPGjrTKBHvs0wiWbxi5Mj5sXglmSaD0lEhtMsOfhPHY2BGatTzSmKKwO
WwmKAvM8qElQZy2Eh25dvlE04yAOofoJb6Pf/AraQOLTUkMyF8wRWEpltjUuttM9
VzQLvh8s25naKM5mOAM2
=88SI
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-4.14-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine updates from Vinod Koul:
"This one features the usual updates to the drivers and one good part
of removing DA_SG from core as it has no users.
Summary:
- Remove DMA_SG support as we have no users for this feature
- New driver for Altera / Intel mSGDMA IP core
- Support for memset in dmatest and qcom_hidma driver
- Update for non cyclic mode in k3dma, bunch of update in bam_dma,
bcm sba-raid
- Constify device ids across drivers"
* tag 'dmaengine-4.14-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (52 commits)
dmaengine: sun6i: support V3s SoC variant
dmaengine: sun6i: make gate bit in sun8i's DMA engines a common quirk
dmaengine: rcar-dmac: document R8A77970 bindings
dmaengine: xilinx_dma: Fix error code format specifier
dmaengine: altera: Use macros instead of structs to describe the registers
dmaengine: ti-dma-crossbar: Fix dra7 reserve function
dmaengine: pl330: constify amba_id
dmaengine: pl08x: constify amba_id
dmaengine: bcm-sba-raid: Remove redundant SBA_REQUEST_STATE_COMPLETED
dmaengine: bcm-sba-raid: Explicitly ACK mailbox message after sending
dmaengine: bcm-sba-raid: Add debugfs support
dmaengine: bcm-sba-raid: Remove redundant SBA_REQUEST_STATE_RECEIVED
dmaengine: bcm-sba-raid: Re-factor sba_process_deferred_requests()
dmaengine: bcm-sba-raid: Pre-ack async tx descriptor
dmaengine: bcm-sba-raid: Peek mbox when we have no free requests
dmaengine: bcm-sba-raid: Alloc resources before registering DMA device
dmaengine: bcm-sba-raid: Improve sba_issue_pending() run duration
dmaengine: bcm-sba-raid: Increase number of free sba_request
dmaengine: bcm-sba-raid: Allow arbitrary number free sba_request
dmaengine: bcm-sba-raid: Remove reqs_free_count from sba_device
...
Allwinner V3s has a DMA engine similar to the ones from A31, but with
fewer channels and DRQs.
Add support for it.
Signed-off-by: Icenowy Zheng <icenowy@aosc.xyz>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Originally we enable a special gate bit when the compatible indicates
A23/33.
But according to BSP sources and user manuals, more SoCs will need this
gate bit.
So make it a common quirk configured in the config struct.
Signed-off-by: Icenowy Zheng <icenowy@aosc.xyz>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
'err' is a signed int and error codes are typically negative numbers, so
use '%d' instead of '%u' to format the error code in the error message.
Fixes: ba16db36b5 ("dmaengine: vdma: Add clock support")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch moves from a struct declaration for the DMA controller
registers to macros with offests to the base address. This is mainly
done to remove the sparse warnings, since the function parameter of
ioread32/iowrite32 is "void __iomem *" instead of a pointer to struct
members. With this patch applied, no sparse warning is seen anymore.
Please note that the struct for the descriptors is still kept in place,
as the code largely accesses the struct members as internal variables
before the complete struct is copied into the descriptor FIFO of the
DMA controller.
Additionally this patch also removes two warnings "variable xxx set but
not used" seen when compiling with "W=1". The registers need to be read
to flush the response FIFO, but nothing needs to be done with them. So
the code is correct here and the warning is a false one.
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
DMA crossbar uses 'xbar->dma_inuse' variable to manage allocated routes.
Each bit represents respective DMA channel. If the channel is free, bit
is set to '0', if channel is allocated, bit should be set to '1'.
In reserve function, the bits for requested DMA channels are cleared, so
they are not really reserved, but freed and become ready for allocation.
Signed-off-by: Alexander Smirnov <asmirnov@ilbers.de>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
amba_id are not supposed to change at runtime. All functions
working with const amba_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
amba_id are not supposed to change at runtime. All functions
working with const amba_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The SBA_REQUEST_STATE_COMPLETED state was added to keep track
of sba_request which got completed but cannot be freed because
underlying Async Tx descriptor was not ACKed by DMA client.
Instead of above, we can free the sba_request with non-ACKed
Async Tx descriptor and sba_alloc_request() will ensure that
it always allocates sba_request with ACKed Async Tx descriptor.
This alternate approach makes SBA_REQUEST_STATE_COMPLETED state
redundant hence this patch removes it.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
We should explicitly ACK mailbox message because after
sending message we can know the send status via error
attribute of brcm_message.
This will also help SBA-RAID to use "txdone_ack" method
whenever mailbox controller supports it.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
This patch adds debugfs support to report stats via debugfs
which in-turn will help debug hang or error situations.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The SBA_REQUEST_STATE_RECEIVED state is now redundant because
received sba_request are immediately freed or moved to completed
list in sba_process_received_request().
This patch removes redundant SBA_REQUEST_STATE_RECEIVED state.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Currently, sba_process_deferred_requests() handles both pending
and completed sba_request which is unnecessary overhead for
sba_issue_pending() because completed sba_request handling is
not required in sba_issue_pending().
This patch breaks sba_process_deferred_requests() into two parts
sba_process_received_request() and _sba_process_pending_requests().
The sba_issue_pending() will only process pending sba_request
by calling _sba_process_pending_requests(). This will improve
sba_issue_pending().
The sba_receive_message() will only process received sba_request
by calling sba_process_received_request() for each received
sba_request. The sba_process_received_request() will also call
_sba_process_pending_requests() after handling received sba_request
because we might have pending sba_request not submitted by previous
call to sba_issue_pending().
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
We should pre-ack async tx descriptor at time of
allocating sba_request (just like other RAID drivers).
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
When setting up RAID array on several NVMe disks we observed that
sba_alloc_request() start failing (due to no free requests left)
and RAID array setup becomes very slow.
To improve performance, we do mbox channel peek when we have
no free requests. This improves performance of RAID array setup
because mbox requests that were completed but not processed by
mbox completion worker will be processed immediately by mbox
channel peek.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
We should allocate DMA channel resources before registering the
DMA device in sba_probe() because we can get DMA request soon
after registering the DMA device. If DMA channel resources are
not allocated before first DMA request then SBA-RAID driver will
crash.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The pending sba_request list can become very long in real-life usage
(e.g. setting up RAID array) which can cause sba_issue_pending() to
run for long duration.
This patch adds common sba_process_deferred_requests() to process
few completed and pending requests so that it finishes in short
duration. We use this common sba_process_deferred_requests() in
both sba_issue_pending() and sba_receive_message().
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Currently, we have only 1024 free sba_request created
by sba_prealloc_channel_resources(). This is too low
and the prep_xxx() callbacks start failing more often
at time of RAID array setup over NVMe disks.
This patch sets number of free sba_request created by
sba_prealloc_channel_resources() to be:
<number_of_mailbox_channels> x 8192
Due to above, we will have sufficient number of free
sba_request and prep_xxx() callbacks failing is very
unlikely.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Currently, we cannot have any arbitrary number of free sba_request
because sba_prealloc_channel_resources() allocates an array of
sba_request using devm_kcalloc() and kcalloc() cannot provide
memory beyond certain size.
This patch removes "reqs" (sba_request array) from sba_device
and makes "cmds" as variable array (instead of pointer) in
sba_request. This helps sba_prealloc_channel_resources() to
allocate sba_request and associated SBA command in one allocation
which in-turn allows arbitrary number of free sba_request.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The reqs_free_count member of sba_device is not used anywhere
hence no point in tracking number of free sba_request.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Both resp and resp_dma are redundant in sba_request because
resp is unused and resp_dma carries same information present
in tx.phys of sba_request. This patch removes both resp and
resp_dma from sba_request.
Signed-off-by: Anup Patel <anup.patel@broadcom.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>