On some SoCs such as i.MX7ULP, there is no busfreq
driver, but cpuidle has some levels which may disable
system/bus clocks, so need to add pm_qos to prevent
cpuidle from entering low level idles and make sure
system/bus clocks are enabled when usdhc is active.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Use signal resampling tuning for the UHS and HS200 modes.
Instead of trying to get the *best* resampling setting with complex
window calculation, we just stop on the first working setting.
If the tuning setting later proves unstable, we will just continue the
tuning where we left it.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This remove all the code related to phase settings. Using the Rx phase
for tuning has not been reliable. We had several issues over the past
months, on both v2 and v3 mmc chips After discussing the issue matter
with Amlogic, They suggested to set a phase shift of 180 between Core and
Tx and use signal resampling for the tuning.
Since we won't be playing with the phase anymore, let's remove all the
clock code related to it and set the appropriate value on init.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Activating DDR in the Amlogic mmc controller, among other things, will
divide the output clock by 2. So by activating it with clock on, we are
creating a glitch on the output.
Instead, let's deal with DDR when the clock output is off, when setting
the clock.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
At the moment, all our attempts to enable HS400 on Amlogic chipsets have
been unsuccessful or unreliable. Until we can figure out how to enable this
mode safely and reliably, let's force it off.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
There is no reason for another device to request the MMC irq. It should
only be used the MMC device, so remove IRQ_SHARED and replace by
IRQ_ONESHOT as we don't the irq to fire again until the irq thread is
done
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This is merely a clean up. It makes sense to only ack raised irqs
instead of acking everything all the time.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
There is already a function available to poll a register until a
condition is met. Let's use it instead of open coding it.
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
DMA on this hardware is limited to dealing with a 4096 bytes at a
time. Previously, the driver was set up accordingly to request single-page
DMA buffers, however that had the effect of generating a large number
of small MMC requests for data I/O.
Improve the driver to accept multi-entry scatter-gather lists. The size of
each entry is already capped to 4096 bytes (AU6601_MAX_DMA_BLOCK_SIZE),
matching the hardware requirements. Existing driver code already iterates
through remaining sglist entries after each DMA transfer is complete.
Also add some comments to help clarify the situation, and clear up
some of the confusion I had regarding DMA vs PIO.
Testing with dd, this increases write performance from 2mb/sec to
10mb/sec, and increases read performance from 4mb/sec to 14mb/sec.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Link: http://lkml.kernel.org/r/CAD8Lp47JYdZzbV9F+asNwvSfLF_po_J7ir6R_Vb-Dab21_=Krw@mail.gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This reverts commit 57ebb96c293da9f0ec56aba13c5541269a5c10b1.
Usage of the DMA page iterator was problematic here because
we were not considering offset & length of entries in the scatterlist.
Also, after further discussion, the suggested revised approach is much
more similar to the driver implementation before this commit was
applied, so revert it.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Remove finish_tasklet. Requests that require DMA-unmapping or sdhci_reset
are completed either in the IRQ thread or a workqueue if the completion is
not initiated by the IRQ.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In preparation for removing finish_tasklet, call mmc_request_done() from
the IRQ handler if possible. That will alleviate the potential loss of
performance from shifting away from finish_tasklet.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In preparation for removing finish_tasklet, move some processing from
sdhci_request_done() to __sdhci_finish_mrq().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In preparation for removing finish_tasklet, move some functions.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In preparation for removing finish_tasklet, reorganize sdhci_finish_mrq()
and __sdhci_finish_mrq() to separate the tasklet scheduling from other
processing.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
'top_base' memory region is optional. Check that the resource is valid
before using it. This avoid getting a "invalid resource" error message
printed by the kernel.
Signed-off-by: Fabien Parent <fparent@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
DMA on this hardware is limited to dealing with a single page at a
time. Previously, the driver was set up accordingly to request single-page
DMA buffers, however that had the effect of generating a large number
of small MMC requests for data I/O.
Improve the driver to accept scatter-gather DMA buffers of larger sizes.
Iterate through those buffers a page at a time.
Testing with dd, this increases write performance from 2mb/sec to
10mb/sec, and increases read performance from 4mb/sec to 14mb/sec.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
According to the AM654x Data Manual[1], the setup timing in lower speed
modes can only be met if the controller uses a falling edge data launch.
To ensure this, the HIGH_SPEED_ENA (HOST_CONTROL[2]) bit should be
cleared in default speed, SD high speed, MMC high speed, SDR12 and SDR25
speed modes.
Use the sdhci writeb callback to implement this condition.
[1] http://www.ti.com/lit/gpn/am6546 Section 5.10.5.16.1
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reduce size of duplicated comments by switching to use SPDX identifier.
No functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
- spaces surrounding arithmetic operators
- utilize full line limit
- drop extra spaces / TABs in variable definitions
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
For easy grepping on debug purposes join string literals back in
the messages.
No functional change.
While here, join list of module authors as well.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The mmc pointer can't be NULL at ->remove(), drop the useless check.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Driver core sets it to NULL upon probe failure or release.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch allows to get datactrl configuration specific
at variant. This introduce more flexibility on datactlr
value.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch defines get_dctrl_cfg callback for sdmmc variant.
sdmmc variant has specific stm32 transfer modes.
sdmmc data transfer mode selection could be:
-Block data transfer ending on block count.
-SDIO multibyte data transfer.
-MMC Stream data transfer (not used).
-Block data transfer ending with STOP_TRANSMISSION command.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch defines get_dctrl_cfg callback for qcom variant.
qcom variant has a specific block size definition.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch adds get_datactrl_cfg callback in mmci_host_ops
to allow to get datactrl configuration specific at variant.
Common helper function is defined and could be call by variant.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Enable the DMA codepath for writes as well as reads.
This improves write speed from 1mb/sec to 2mb/sec (tested with dd).
The original ampe_stor vendor driver also uses DMA for writes.
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Direct commands (DCMDs) are an optional feature of eMMC 5.1's command
queue engine (CQE). The Arasan eMMC 5.1 controller uses the CQHCI,
which exposes a control register bit to enable the feature.
The current implementation sets this bit unconditionally.
This patch allows to suppress the feature activation,
by specifying the property disable-cqe-dcmd.
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 84362d79f4 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Fix sparse warning:
drivers/mmc/host/sdhci-omap.c:788:6: warning:
symbol 'sdhci_omap_reset' was not declared. Should it be static?
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Faiz Abbas <faiz_abbas@ti.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tegra CQHCI/SDHCI design prevents write access to SDHCI block size
register when CQE is enabled and unhalted.
CQHCI driver enables CQE prior to invoking sdhci_cqe_enable which
violates this Tegra specific host requirement.
This patch fixes this by configuring sdhci block registers prior
to CQE unhalt.
This patch also has a fix for retry of unhalt due to known Tegra
specific CQE resume bug where first unhalt might not succeed when
clear all tasks is performed prior to resume and need a second unhalt.
This patch also includes CQE enable fix for CMD CRC errors that
happen with the specific sandisk emmc device when status command
is sent during the transfer of last data block due to marginal timing.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch adds define for CBC field mask of the register
CQHCI_SSC1.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tegra186 CQHCI host has a known bug where CQHCI controller selects
DATA_PRESENT_SELECT bit to 1 for DCMDs with R1B response type and
since DCMD does not trigger any data transfer, DCMD task complete
happens leaving the DATA FSM of host controller in wait state for
the data.
This effects the data transfer tasks issued after the DCMDs with
R1b response type resulting in timeout.
SW WAR is to set CMD_TIMING to 1 in DCMD task descriptor. This bug
and SW WAR is applicable only for Tegra186 and not for Tegra194.
This patch implements this WAR thru NVQUIRK_CQHCI_DCMD_R1B_CMD_TIMING
for Tegra186 and also implements update_dcmd_desc of cqhci_host_ops
interface to set CMD_TIMING bit depending on the NVQUIRK.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch adds update_dcmd_desc interface to cqhci_host_ops to
allow hosts to update any of the DCMD task descriptor attributes
and parameters.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch includes below HW tuning related fixes.
configures tuning parameters as per Tegra TRM
WAR fix for manual tap change
HW auto-tuning post process
As per Tegra TRM, SDR50 mode tuning execution takes upto maximum
of 256 tuning iterations and SDR104/HS200/HS400 modes tuning
execution takes upto maximum of 128 tuning iterations.
This patch programs tuning control register with maximum tuning
iterations needed based on the timing along with the start tap,
multiplier, and step size used by the HW tuning.
Tegra210 has a known issue of glitch on trimmer output when the
tap value is changed with the trimmer input clock running and the
WAR is to disable card clock before sending tuning command and
after sending tuning command wait for 1usec and issue SW reset
followed by enabling card clock.
This WAR is applicable when changing tap value manually as well.
Tegra SDHCI driver has this implemented correctly for manual tap
change but missing SW reset before enabling card clock during
sending tuning command.
Issuing SW reset during tuning command as a part of WAR and is
applicable in cases where tuning is performed with single step size
for more iterations. This patch includes this fix.
HW auto-tuning finds the best largest passing window and sets the
tap at the middle of the window. With some devices like sandisk
eMMC driving fast edges and due to high tap to tap delay in the
Tegra chipset, auto-tuning does not detect falling tap between the
valid windows resulting in a parital window or a merged window and
the best tap is set at the signal transition which is actually the
worst tap location.
Recommended SW solution is to detect if the best passing window
picked by the HW tuning is a partial or a merged window based on
min and max tap delays found from chip characterization across
PVT and perform tuning correction to pick the best tap.
This patch has implementation of this post HW tuning process for
the tegra hosts that support HW tuning through the callback function
tegra_sdhci_execute_hw_tuning and uses the tuned tap delay.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
As per the Host Controller Standard Specification Version 4.20,
limitation of tuning iteration count is removed as PLL locking
time can be longer than UHS-1 tuning due to larger PVT fluctuation
and it will result in increase of tuning iteration to complete the
tuning.
This patch creates sdhci_host member tuning_loop_count to allow
hosts to specify maximum tuning iterations and also updates
execute_tuning to use this specified maximum tuning iteration count.
Default tuning_loop_count is set to same as existing loop count of
MAX_TUNING_LOOP which is 40 iterations.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
ddr_signaling is set to true for DDR50 and DDR52 modes but is
not set back to false for other modes. This programs incorrect
host clock when mode change happens from DDR52/DDR50 to other
SDR or HS modes like incase of mmc_retune where it switches
from HS400 to HS DDR and then from HS DDR to HS mode and then
to HS200.
This patch fixes the ddr_signaling to set properly for non DDR
modes.
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Cc: stable@vger.kernel.org # v4.20 +
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The CBSY flag should be proper before calling tmio_mmc_host_probe()
because this function will already use write16 which checks this bit.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
max_req_size is calculated by 'max_blk_size * max_blk_count' in the TMIO
core. So, specifying U32_MAX as max_blk_count will overflow this
calculation. It will cause no harm in practice because the immense high
number will overflow into another immense high number. However, it is
not good coding practice, so calculate max_blk_count so that
max_req_size will fit into unsigned int on ARM32/64.
Thanks to the Renesas BSP team for the bug report!
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
We will need it later for other calculations.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Mostly year updates, but one addition as well.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
According to the i.MX23/28 reference manuals both mmc interfaces support
the MMC_ERASE command. So enable this capability in the driver to allow
erase/discard/trim requests.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In case spi_sync_locked fails, the fix reports the error and
returns the error code upstream.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
For some controllers, in Present State Register, Data Line
Active bit is not reliable for commands (such as CMD6, CMD7,
CMD12, CMD28, CMD29, or CMD38) with busy signal. DLA affects
Command with Data Inhibit bit. Therefore, software driver
may not know the busy status in DLA/CDIHB.
Futunately MMC core driver has already polled card status
with CMD13 after sending any command with busy signal. So
we can just ignore CDIHB never released issue for such
controllers. This patch is to add a quirk to handle this.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Invalid Transfer Complete (IRQSTAT[TC]) bit could be set during
multi-write operation even when the BLK_CNT in BLKATTR register
has not reached zero. Therefore, Transfer Complete might be
reported twice due to this erratum since a valid Transfer Complete
occurs when BLK_CNT reaches zero. This erratum is to fix this issue
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>