After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.
Convert all platform drivers below drivers/dma after the previous
conversion commits apart from the wireless drivers to use .remove(),
with the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20241004062227.187726-2-u.kleine-koenig@baylibre.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Currently there are two names utilized in the driver to keep the functions
call status: ret and err. For the sake of unification convert to using the
first version only.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240802075100.6475-7-fancer.lancer@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In order to have a more coherent DW AHB DMA slave configuration method -
dwc_config() - let's simplify the source and destination channel max-burst
calculation procedure:
1. Create the max-burst verification method as it has been just done for
the memory and peripheral address widths. Thus the dwc_config() method
will turn to a set of the verification methods execution.
2. Since both the generic DW AHB DMA and Intel iDMA 32-bit engines support
the power-of-2 bursts only, then the specified by the client driver
max-burst values can be converted to being power-of-2 right in the
max-burst verification method.
3. Since max-burst encoded value is required on the CTL_LO fields
calculation stage, the encode_maxburst() callback can be easily dropped
from the dw_dma structure meanwhile the encoding procedure will be
executed right in the CTL_LO register value calculation.
Thus the update will provide the next positive effects: the internal
DMA-slave config structure will contain only the real DMA-transfer config
values, which will be encoded to the DMA-controller register fields only
when it's required on the buffer mapping; the redundant encode_maxburst()
callback will be dropped simplifying the internal HW-abstraction API;
dwc_config() will look more readable executing the verification functions
one-by-one.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240802075100.6475-6-fancer.lancer@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
As a preparatory change before dropping the encode_maxburst() callbacks
let's move dw_dma_encode_maxburst() and idma32_encode_maxburst() to being
defined above the dw_dma_prepare_ctllo() and idma32_prepare_ctllo()
methods respectively. That's required since the former methods will be
called from the later ones directly.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240802075100.6475-5-fancer.lancer@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Currently the CTL LO fields are calculated on the platform-specific basis.
It's implemented by means of the prepare_ctllo() callbacks using the
ternary operator within the local variables init block at the beginning of
the block scope. The functions code currently is relatively hard to
comprehend and isn't that optimal since implies four conditional
statements executed and two additional local variables defined. Let's
simplify the DW AHB DMA prepare_ctllo() method by unrolling the ternary
operators into the normal if-else statement, dropping redundant
master-interface ID variables and initializing the local variables based
on the singly evaluated DMA-transfer direction check. Thus the method will
look much more readable since now the fields content can be easily
inferred right from the if-else branch. Provide the same update in the
Intel DMA32 core driver for the sake of the driver code unification.
Note besides of the effects described above this update is basically a
preparation before dropping the max burst encoding callback. The dropping
will require to call the burst fields calculation methods right in the
prepare_ctllo() callbacks. It would have made the later functions code
even more complex should they were left in the original state.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240802075100.6475-4-fancer.lancer@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory
data width (single transfer width) is determined based on the buffer
length, buffer base address or DMA master-channel max address width
capability. It isn't enough in case of the channel disabling prior the
block transfer is finished. Here is what DW AHB DMA IP-core databook says
regarding the port suspension (DMA-transfer pause) implementation in the
controller:
"When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is
high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not
permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may
still be data in the channel FIFO, but not enough to form a single
transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is
disabled, the remaining data in the channel FIFO is not transferred to the
destination peripheral."
So in case if the port gets to be suspended and then disabled it's
possible to have the data silently discarded even though the controller
reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped
data already received from the source device. This looks as if the data
somehow got lost on a way from the peripheral device to memory and causes
problems for instance in the DW APB UART driver, which pauses and disables
the DMA-transfer as soon as the recv data timeout happens. Here is the way
it looks:
Memory <------- DMA FIFO <------ UART FIFO <---------------- UART
DST_TR_WIDTH -+--------| | |
| | | | No more data
Current lvl -+--------| |---------+- DMA-burst lvl
| | |---------+- Leftover data
| | |---------+- SRC_TR_WIDTH
-+--------+-------+---------+
In the example above: no more data is getting received over the UART port
and BLOCK_TS is not even close to be fully received; some data is left in
the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA
FIFO; some data is left in the DMA FIFO, but not enough to be passed
further to the system memory in a single transfer. In this situation the
8250 UART driver catches the recv timeout interrupt, pauses the
DMA-transfer and terminates it completely, after which the IRQ handler
manually fetches the leftover data from the UART FIFO into the
recv-buffer. But since the DMA-channel has been disabled with the data
left in the DMA FIFO, that data will be just discarded and the recv-buffer
will have a gap of the "current lvl" size in the recv-buffer at the tail
of the lately received data portion. So the data will be lost just due to
the misconfigured DMA transfer.
Note this is only relevant for the case of the transfer suspension and
_disabling_. No problem will happen if the transfer will be re-enabled
afterwards or the block transfer is fully completed. In the later case the
"FIFO flush mode" will be executed at the transfer final stage in order to
push out the data left in the DMA FIFO.
In order to fix the denoted problem the DW AHB DMA-engine driver needs to
make sure that the _bursted_ source transfer width is greater or equal to
the single destination transfer (note the HW databook describes more
strict constraint than actually required). Since the peripheral-device
side is prescribed by the client driver logic, the memory-side can be only
used for that. The solution can be easily implemented for the DEV_TO_MEM
transfers just by adjusting the memory-channel address width. Sadly it's
not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size
is normally dynamically determined by the controller. So the only thing
that can be done is to make sure that memory-side address width is greater
than the peripheral device address width.
Fixes: a09820043c9e ("dw_dmac: autoconfigure data_width or get it via platform data")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240802075100.6475-3-fancer.lancer@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Currently the src_addr_width and dst_addr_width fields of the
dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
properly aligned data passed to the target device. It's done just by
converting the passed peripheral bus width to the encoded value using the
__ffs() function. This implementation has several problematic sides:
1. __ffs() is undefined if no bit exist in the passed value. Thus if the
specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
unexpected value depending on the platform-specific implementation.
2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
bus-width out of that constraints scope will definitely cause unexpected
result since the destination reg will be only partly touched than the
client driver implied.
Let's fix all of that by adding the peripheral bus width verification
method and calling it in dwc_config() which is supposed to be executed
before preparing any transfer. The new method will make sure that the
passed source or destination address width is valid and if undefined then
the driver will just fallback to the 1-byte width transfer.
Fixes: 029a40e97d0d ("dmaengine: dw: provide DMA capabilities")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20240802075100.6475-2-fancer.lancer@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new() which already returns void. Eventually after all drivers
are converted, .remove_new() is renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20230919133207.1400430-12-u.kleine-koenig@pengutronix.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230718143138.1066177-1-robh@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Move check for paused channel to dwc_get_residue() and rename the latter
to dwc_get_residue_and_status().
This improves data integrity as residue and DMA channel status are set
in the same function under the same conditions.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230130151747.20704-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
New support/Core
- Remove DMA_MEMCPY_SG for lack of users
- Tegra 234 dmaengine support
- Mediatek MT8365 dma support
- Apple ADMAC driver
Updates:
- Yaml conversion for ST-Ericsson DMA40 binding and Freescale edma
- rz-dmac updates and device_synchronize support
- Bunch of typo in comments fixes in drivers
- multithread support in sf-pdma driver
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmLrWacACgkQfBQHDyUj
g0f/wRAAsGxg7IQqMKhWTiE6xN3/B4vxTD9Er4jCwjVw+ibivH9Nvhp9n4Cv5qr0
Me1eGNq6e4KMD1RRBvy2KmK44pBodrCeDpWLGonOBToWPlKBGFRjOZ0v/H3/eVOs
kjfYb73zPmleGZy2w0i6g8g5cwCwb5eDUGtztqIcYRET3jH+rWKYrHnMG/gaa1iF
9isMKUNqplv2mKSXmxsMRJPzY7NRuPJthnsQSKdEXaY9HEmEUX9wAB8K1Dy+UPK/
vAPg/Zn9XSnir4JWYxLSMI2bDrOz4xkaQ2Xac9pV1KIAMyx76RGu/Yz0JdVUsgGU
w6aI/AYDtKeQe5sZSpbt3K/Ef2s5tVRfnCO3avtva6ozO39vOxpqTyujidxF8gJW
xCsQVa8t92mKB8Y9/pwGIjYEnSoyLoxclBTMl7eVLvbHPa+maVeOnixfb/5uWD45
+6djWv3FW/D7WilsjyZe57tSjvhw3RrDQEpKwuMCpmScMqitu0pVzFBYv+vpIjxL
q5lbRK0mP9trdGHqsoD/GVjdxv+O7bwZjBNPzahxoRpN4+jktb8xfRQEZUW2Uqyf
HPLvoLNbVPK0UyHkPTAj/QnTq56M21fMIuCn1Jp6RjzRzD2w7fHFtoOF6+wsFVx6
iBYDzQRTq2lNIGFnoQ8N94XiKORfdJNv+ZstGTirWKc6xaKDw7E=
=/IFO
-----END PGP SIGNATURE-----
Merge tag 'dmaengine-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New support / Core:
- Remove DMA_MEMCPY_SG for lack of users
- Tegra 234 dmaengine support
- Mediatek MT8365 dma support
- Apple ADMAC driver
Updates:
- Yaml conversion for ST-Ericsson DMA40 binding and Freescale edma
- rz-dmac updates and device_synchronize support
- Bunch of typo in comments fixes in drivers
- multithread support in sf-pdma driver"
* tag 'dmaengine-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (50 commits)
dmaengine: mediatek: mtk-hsdma: Fix typo 'the the' in comment
dmaengine: axi-dmac: check cache coherency register
dmaengine: sh: rz-dmac: Add device_synchronize callback
dmaengine: sprd: Cleanup in .remove() after pm_runtime_get_sync() failed
dmaengine: tegra: Add terminate() for Tegra234
dt-bindings: dmaengine: Add compatible for Tegra234
dmaengine: xilinx: use strscpy to replace strlcpy
dmaengine: imx-sdma: Add FIFO stride support for multi FIFO script
dmaengine: idxd: Correct IAX operation code names
dmaengine: imx-dma: Cast of_device_get_match_data() with (uintptr_t)
dmaengine: dw-axi-dmac: ignore interrupt if no descriptor
dmaengine: dw-axi-dmac: do not print NULL LLI during error
dmaengine: altera-msgdma: Fixed some inconsistent function name descriptions
dmaengine: imx-sdma: Add missing struct documentation
dmaengine: sf-pdma: Add multithread support for a DMA channel
dt-bindings: dma: dw-axi-dmac: extend the number of interrupts
dmaengine: dmatest: use strscpy to replace strlcpy
dmaengine: ste_dma40: fix typo in comment
dmaengine: jz4780: fix typo in comment
dmaengine: s3c24xx: fix typo in comment
...
The AVR32 architecture does no longer exist in the Linux kernel, hence
remove a reference to it in comments to avoid confusion.
Signed-off-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
When built without OF support, of_match_node() expands to NULL, which
produces the following output:
>> drivers/dma/dw/rzn1-dmamux.c:105:34: warning: unused variable 'rzn1_dmac_match' [-Wunused-const-variable]
static const struct of_device_id rzn1_dmac_match[] = {
One way to silence the warning is to enclose the structure definition
with an #ifdef CONFIG_OF/#endif block.
Fixes: 134d9c52fca2 ("dmaengine: dw: dmamux: Introduce RZN1 DMA router support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220609141455.300879-2-miquel.raynal@bootlin.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
This is a tristate driver that can be built as a module, as a result,
the OF match table should be exported with MODULE_DEVICE_TABLE().
Fixes: 134d9c52fca2 ("dmaengine: dw: dmamux: Introduce RZN1 DMA router support")
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220609141455.300879-1-miquel.raynal@bootlin.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The Renesas RZN1 DMA IP is very close to the original DW DMA IP, a DMA
router has been introduced to handle the wiring options that have been
added.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-By: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20220427095653.91804-8-miquel.raynal@bootlin.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The Renesas RZN1 DMA IP is based on a DW core, with eg. an additional
dmamux register located in the system control area which can take up to
32 requests (16 per DMA controller). Each DMA channel can be wired to
two different peripherals.
We need two additional information from the 'dmas' property: the channel
(bit in the dmamux register) that must be accessed and the value of the
mux for this channel.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220427095653.91804-6-miquel.raynal@bootlin.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The wrappers in include/linux/pci-dma-compat.h should go away.
pci_set_dma_mask()/pci_set_consistent_dma_mask() should be
replaced with dma_set_mask()/dma_set_coherent_mask(),
and use dma_set_mask_and_coherent() for both.
Signed-off-by: Qing Wang <wangqing@vivo.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/1633663733-47199-6-git-send-email-wangqing@vivo.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Since we converted internal data types to match DT, there is no need to have
an intermediate conversion layer, hence drop a few conditionals and for loops
for good.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20210802184355.49879-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Users are a bit frightened of the harmless message that tells that
DT is missed on ACPI-based platforms. Remove it for good, it will
simplify the future conversion to fwnode and device property APIs.
Fixes: a9ddb575d6d6 ("dmaengine: dw_dmac: Enhance device tree support")
Depends-on: f5e84eae7956 ("dmaengine: dw: platform: Split OF helpers to separate module")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199379
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20210802184355.49879-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Intel Elkhart Lake PSE DMA implementation is integrated with crossbar IP
in order to serve more hardware than there are DMA request lines available.
Due to this, program xBAR hardware to make flexible support of PSE peripheral.
The Device-to-Device has not been tested and it's not supported by DMA Engine,
but it's left in the code for the sake of documenting hardware features.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210712113940.42753-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Some architectures do not provide devm_*() APIs. Hence make the driver
dependent on HAVE_IOMEM.
Fixes: dbde5c2934d1 ("dw_dmac: use devm_* functions to simplify code")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20210324141757.24710-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
This reverts commit 842067940a3e3fc008a60fee388e000219b32632.
For some solutions e.g. sound/soc/intel/catpt, DW DMA is part of a
compound device (in that very example, domains: ADSP, SSP0, SSP1, DMA0
and DMA1 are part of a single entity) rather than being a standalone
one. Driver for said device may enlist DMA to transfer data during
suspend or resume sequences.
Manipulating RPM explicitly in dw's DMA request and release channel
functions causes suspend() to also invoke resume() for the exact same
device. Similar situation occurs for resume() sequence. Effectively
renders device dysfunctional after first suspend() attempt. Revert the
change to address the problem.
Fixes: 842067940a3e ("dmaengine: dw: Enable runtime PM")
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20210203191924.15706-1-cezary.rojewski@intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
When consumer requests channel power on the DMA controller device
and otherwise on the freeing channel resources.
Note, in some cases consumer acquires channel at the ->probe() stage and
releases it at the ->remove() stage. It will mean that DMA controller device
will be powered during all this time if there is no assist from hardware
to idle it. The above mentioned cases should be investigated separately
and individually.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20201103183938.64752-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.
Signed-off-by: Romain Perier <romain.perier@gmail.com>
Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Link: https://lore.kernel.org/r/20200831103542.305571-6-allen.lkml@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
DW DMA IP-core provides a way to synthesize the DMA controller with
channels having different parameters like maximum burst-length,
multi-block support, maximum data width, etc. Those parameters both
explicitly and implicitly affect the channels performance. Since DMA slave
devices might be very demanding to the DMA performance, let's provide a
functionality for the slaves to be assigned with DW DMA channels, which
performance according to the platform engineer fulfill their requirements.
After this patch is applied it can be done by passing the mask of suitable
DMA-channels either directly in the dw_dma_slave structure instance or as
a fifth cell of the DMA DT-property. If mask is zero or not provided, then
there is no limitation on the channels allocation.
For instance Baikal-T1 SoC is equipped with a DW DMAC engine, which first
two channels are synthesized with max burst length of 16, while the rest
of the channels have been created with max-burst-len=4. It would seem that
the first two channels must be faster than the others and should be more
preferable for the time-critical DMA slave devices. In practice it turned
out that the situation is quite the opposite. The channels with
max-burst-len=4 demonstrated a better performance than the channels with
max-burst-len=16 even when they both had been initialized with the same
settings. The performance drop of the first two DMA-channels made them
unsuitable for the DW APB SSI slave device. No matter what settings they
are configured with, full-duplex SPI transfers occasionally experience the
Rx FIFO overflow. It means that the DMA-engine doesn't keep up with
incoming data pace even though the SPI-bus is enabled with speed of 25MHz
while the DW DMA controller is clocked with 50MHz signal. There is no such
problem has been noticed for the channels synthesized with
max-burst-len=4.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Link: https://lore.kernel.org/r/20200731200826.9292-6-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
According to the DW DMA controller Databook 2.18b (page 40 "3.5 Memory
Peripherals") memory peripherals don't have handshaking interface
connected to the controller, therefore they can never be a flow
controller. Since the CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are properties
valid only for peripherals with a handshaking interface, we can freely
zero these fields out if the memory peripheral is selected to be the
source or the destination of the DMA transfers.
Note according to the databook, length of burst transfers to memory is
always equal to the number of data items available in a channel FIFO or
data items required to complete the block transfer, whichever is smaller;
length of burst transfers from memory is always equal to the space
available in a channel FIFO or number of data items required to complete
the block transfer, whichever is smaller.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Link: https://lore.kernel.org/r/20200731200826.9292-5-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Indeed in case of the DMA_DEV_TO_MEM DMA transfers it's enough to take the
destination memory address and the destination master data width into
account to calculate the CTLx.DST_TR_WIDTH setting of the memory
peripheral. According to the DW DMAC IP-core Databook 2.18b (page 66,
Example 5) at the and of a DMA transfer when the DMA-channel internal FIFO
is left with data less than for a single destination burst transaction,
the destination peripheral will enter the Single Transaction Region where
the DW DMA controller can complete a block transfer to the destination
using single transactions (non-burst transaction of CTLx.DST_TR_WIDTH
bytes). If there is no enough data in the DMA-channel internal FIFO for
even a single non-burst transaction of CTLx.DST_TR_WIDTH bytes, then the
channel enters "FIFO flush mode". That mode is activated to empty the FIFO
and flush the leftovers out to the memory peripheral. The flushing
procedure is simple. The data is sent to the memory by means of a set of
single transaction of CTLx.SRC_TR_WIDTH bytes. To sum up it's redundant to
use the LLPs length to find out the CTLx.DST_TR_WIDTH parameter value,
since each DMA transfer will be completed with the CTLx.SRC_TR_WIDTH bytes
transaction if it is required.
We suggest to remove the LLP entry length from the statement which
calculates the memory peripheral DMA transaction width since it's
redundant due to the feature described above. By doing so we'll improve
the memory bus utilization and speed up the DMA-channel performance for
DMA_DEV_TO_MEM DMA-transfers.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200731200826.9292-4-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
CFGx.FIFO_MODE field controls a DMA-controller "FIFO readiness" criterion.
In other words it determines when to start pushing data out of a DW
DMAC channel FIFO to a destination peripheral or from a source
peripheral to the DW DMAC channel FIFO. Currently FIFO-mode is set to one
for all DW DMAC channels. It means they are tuned to flush data out of
FIFO (to a memory peripheral or by accepting the burst transaction
requests) when FIFO is at least half-full (except at the end of the block
transfer, when FIFO-flush mode is activated) and are configured to get
data to the FIFO when it's at least half-empty.
Such configuration is a good choice when there is no slave device involved
in the DMA transfers. In that case the number of bursts per block is less
than when CFGx.FIFO_MODE = 0 and, hence, the bus utilization will improve.
But the latency of DMA transfers may increase when CFGx.FIFO_MODE = 1,
since DW DMAC will wait for the channel FIFO contents to be either
half-full or half-empty depending on having the destination or the source
transfers. Such latencies might be dangerous in case if the DMA transfers
are expected to be performed from/to a slave device. Since normally
peripheral devices keep data in internal FIFOs, any latency at some
critical moment may cause one being overflown and consequently losing
data. This especially concerns a case when either a peripheral device is
relatively fast or the DW DMAC engine is relatively slow with respect to
the incoming data pace.
In order to solve problems, which might be caused by the latencies
described above, let's enable the FIFO half-full/half-empty "FIFO
readiness" criterion only for DMA transfers with no slave device involved.
Thanks to the commit 99ba8b9b0d97 ("dmaengine: dw: Initialize channel
before each transfer") we can freely do that in the generic
dw_dma_initialize_chan() method.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200731200826.9292-3-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Multi-block support provides a way to map the kernel-specific SG-table so
the DW DMA device would handle it as a whole instead of handling the
SG-list items or so called LLP block items one by one. So if true LLP
list isn't supported by the DW DMA engine, then soft-LLP mode will be
utilized to load and execute each LLP-block one by one. The soft-LLP mode
of the DMA transactions execution might not work well for some DMA
consumers like SPI due to its Tx and Rx buffers inter-dependency. Let's
initialize the max_sg_burst DMA channels capability based on the nollp
flag state. If it's true, no hardware accelerated LLP is available and
max_sg_burst should be set with 1, which means that the DMA engine
can handle only a single SG list entry at a time. If noLLP is set to
false, then hardware accelerated LLP is supported and the DMA engine
can handle infinite number of SG entries in a single DMA transaction.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200723005848.31907-11-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
IP core of the DW DMA controller may be synthesized with different
max burst length of the transfers per each channel. According to Synopsis
having the fixed maximum burst transactions length may provide some
performance gain. At the same time setting up the source and destination
multi size exceeding the max burst length limitation may cause a serious
problems. In our case the DMA transaction just hangs up. In order to fix
this lets introduce the max burst length platform config of the DW DMA
controller device and don't let the DMA channels configuration code
exceed the burst length hardware limitation.
Note the maximum burst length parameter can be detected either in runtime
from the DWC parameter registers or from the dedicated DT property.
Depending on the IP core configuration the maximum value can vary from
channel to channel so by overriding the channel slave max_burst capability
we make sure a DMA consumer will get the channel-specific max burst
length.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200723005848.31907-10-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
According to the DW APB DMAC data book the minimum burst transaction
length is 1 and it's true for any version of the controller since
isn't parametrised in the coreAssembler so can't be changed at the
IP-core synthesis stage. The maximum burst transaction can vary from
channel to channel and from controller to controller depending on a
IP-core parameter the system engineer activated during the IP-core
synthesis. Let's initialise both min_burst and max_burst members of the
DMA controller descriptor with extreme values so the DMA clients could
use them to properly optimize the DMA requests. The channels and
controller-specific max_burst length initialization will be introduced
by the follow-up patches.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200723005848.31907-9-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Maximum block size DW DMAC configuration corresponds to the max segment
size DMA parameter in the DMA core subsystem notation. Lets set it with a
value specific to the probed DW DMA controller. It shall help the DMA
clients to create size-optimized SG-list items for the controller. This in
turn will cause less dw_desc allocations, less LLP reinitializations,
better DMA device performance.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200723005848.31907-8-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Full multi-block transfers functionality is enabled in DW DMA
controller only if CHx_MULTI_BLK_EN is set. But LLP-based transfers
can be executed only if hardcode channel x LLP register feature isn't
enabled, which can be switched on at the IP core synthesis for
optimization. If it's enabled then the LLP register is hardcoded to
zero, so the blocks chaining based on the LLPs is unsupported.
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200723005848.31907-7-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In some cases DMA can be used only with a consumer which does runtime power
management and on the platforms, that have DMA auto power gating logic
(see comments in the drivers/acpi/acpi_lpss.c), may result in DMA losing
its context. Simple mitigation of this issue is to initialize channel
each time the consumer initiates a transfer.
Fixes: cfdf5b6cc598 ("dw_dmac: add support for Lynxpoint DMA controllers")
Reported-by: Tsuchiya Yuto <kitakar@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206403
Link: https://lore.kernel.org/r/20200705115620.51929-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
If PCI enumerated controller has a companion device,
register it in the ACPI DMA controllers as well.
Fixes: f7c799e950f9 ("dmaengine: dw: we do support Merrifield SoC in PCI mode")
Depends-on: b685fe26e9af ("dmaengine: dw: platform: Split ACPI helpers to separate module")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200526182416.52805-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
On some platforms the clock can be fixed rate, always running one and
there is no need to do anything with it.
In order to support those platforms, switch to use optional clock.
Fixes: f8d9ddbc2851 ("dmaengine: dw: platform: Enable iDMA 32-bit on Intel Elkhart Lake")
Depends-on: 60b8f0ddf1a9 ("clk: Add (devm_)clk_get_optional() functions")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20190924085116.83683-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Move ACPI handle check to the dw_dma_acpi_controller_register().
While here, convert it to has_acpi_companion() which is recommended way.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20190820131546.75744-9-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
There is a possibility to have registered ACPI DMA controller
while it has been gone already.
To avoid the potential crash, move to non-managed
acpi_dma_controller_register().
Fixes: 42c91ee71d6d ("dw_dmac: add ACPI support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20190820131546.75744-8-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Intel® PSE (Programmable Services Engine) provides few DMA controllers
to the host on Intel Elkhart Lake. Enable them in the ACPI glue driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20190820131546.75744-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
We are expecting some devices can be enumerated either as PCI or ACPI.
Nevertheless, they will share same information, thus, provide a generic
struct dw_dma_chip_pdata for all glue drivers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20190820131546.75744-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Intel Elkhart Lake Offload Service Engine (OSE) will be called as
Intel(R) Programmable Services Engine (Intel(R) PSE) in documentation.
Update the comment here accordingly.
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20190813080602.15376-1-jarkko.nikula@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Intel Elkhart Lake OSE (Offload Service Engine) provides few DMA controllers
to the host. Enable them in the driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
In the same way as done for ->probe(), call ->remove() based on
the type of the hardware.
While it works now due to equivalency of the two removal functions,
it might be changed in the future.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>