Add mem_value debugfs file for dumping the firmware (target) memory and also for
writing to the memory. The firmware memory is accessed through one file which
uses the file position as the firmware memory address. For example, with dd
use the skip parameter to select the address.
Because the target memory width is 32 bits, it's strongly recommended to use
a block size divisible by 4 when using this interface. For example, when using
dd use bs=4 to set the block size to 4 and remember to divide both count and
skip values by four.
To read a 4 kB chunk from address 0x400000:
dd if=mem_value bs=4 count=1024 skip=1048576 | xxd -g1
To write value 0x01020304 to address 0x400400:
echo 0x01020304 | xxd -r | dd of=mem_value bs=4 seek=1048832
To read a 4 kB chunk of memory and then write it back after editing:
dd if=mem_value of=tmp.bin bs=4 count=1024 skip=1048576
emacs tmp.bin
dd if=tmp.bin of=mem_value bs=4 count=1024 seek=1048576
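The skip and seek values in the examples above are simply the byte address
divided by the 4-byte block size, and count is the byte length divided the
same way. A minimal Python sketch of that arithmetic, assuming bs=4 as
recommended; the helper name dd_args is hypothetical and not part of ath10k
or dd:

```python
BLOCK_SIZE = 4  # target memory is 32 bits wide, so use 4-byte blocks


def dd_args(address, length, block_size=BLOCK_SIZE):
    """Return (offset_in_blocks, count_in_blocks) for a dd invocation.

    Hypothetical helper: pass the first value as skip= when reading
    and as seek= when writing.
    """
    if address % block_size or length % block_size:
        raise ValueError("address and length must be multiples of the block size")
    return address // block_size, length // block_size


if __name__ == "__main__":
    offset, count = dd_args(0x400000, 4096)
    # Reproduces the read example from the commit message.
    print(f"dd if=mem_value bs={BLOCK_SIZE} count={count} skip={offset}")
```

Rejecting unaligned values up front avoids silently rounding an address down
to the previous 32-bit word.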
Signed-off-by: Yanbo Li <yanbol@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Debugfs files reg_addr and reg_value are used for reading and writing to the
firmware (target) registers. reg_addr contains the address to be accessed,
which also needs to be set first, and reg_value is then used for reading and
writing the actual value in ASCII.
To read a value from the firmware register 0x100000:
# echo 0x100000 > reg_addr
# cat reg_value
0x00100000:0x000002d3
To write value 0x2400 to address 0x100000:
# echo 0x100000 > reg_addr
# echo 0x2400 > reg_value
#
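A script reading reg_value has to split the "address:value" line shown above.
A minimal Python sketch, assuming the output always has exactly the format of
the example; the function name parse_reg_value is hypothetical:

```python
def parse_reg_value(line):
    """Split an 'address:value' line from reg_value into two integers.

    Assumes both fields are hexadecimal with a 0x prefix, as in the
    example output in the commit message.
    """
    addr_text, value_text = line.strip().split(":")
    return int(addr_text, 16), int(value_text, 16)


if __name__ == "__main__":
    addr, value = parse_reg_value("0x00100000:0x000002d3")
    print(f"register {addr:#x} holds {value:#x}")
```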
Signed-off-by: Yanbo Li <yanbol@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Firmware was crashing when we were trying to warm reset it
after suspend. This was because target registers
can be accessed only when the hardware is awake.
This patch makes sure to wake up the device also on hif up,
not only on the probe call.
Signed-off-by: Bartosz Markowski <bartosz.markowski@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
While testing other things I've found that CE
items aren't cleared properly. This could lead to
null dereferences in BMI.
To prevent that, make sure CE revoking clears the
nbytes value (which is used as a buffer completion
indication) and memset the entire CE ring data
shared between host and target when
(re)initializing.
Also make sure to check BMI xfer pointer and print
a splat instead of crashing the kernel.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Currently hif_power_up effectively performs a
reset and hif_stop resets the chip as well, so
there's no point in resetting here.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The power up procedure was overly complex due to
warm/cold reset workarounds and issues.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
One of the problems with warm reset I've found is
that it must be guaranteed that copy engine
registers are not being accessed while being
reset. Otherwise, in the worst case, the host
may lock up.
Instead of using sleeps and hoping the device is
operational within some arbitrary timeframe, use
the firmware indication register.
As a side effect this makes driver
boot/stop/recovery faster.
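The idea of polling an indication register instead of sleeping for a fixed
time can be sketched in Python as follows. This is only an illustration under
assumptions: read_reg, ready_mask and the timing values are placeholders, not
the driver's actual register accessors or bit definitions.

```python
import time


def wait_for_flag(read_reg, ready_mask, timeout_s=1.0, poll_interval_s=0.001,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll read_reg() until (value & ready_mask) is non-zero.

    Returns True as soon as the flag is seen and False if timeout_s
    elapses first. clock and sleep are injectable to ease testing.
    """
    deadline = clock() + timeout_s
    while True:
        if read_reg() & ready_mask:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


if __name__ == "__main__":
    # Fake register that reports ready on the third read.
    reads = iter([0x0, 0x0, 0x1])
    print(wait_for_flag(lambda: next(reads), ready_mask=0x1))
```

Returning as soon as the flag is set is what makes this faster than a fixed
sleep, while the timeout still bounds the wait if the device never comes up.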
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Make ath10k_pci_init_pipes() effectively only
alter shared target-host data.
The per_transfer_context is a host-only thing.
It is necessary to preserve its contents for a
more robust ring cleanup.
This is required for future warm reset fixes.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Calling init to reinit ce pipe state would also
re-set all static structure links and settings
(which don't change over the driver lifecycle).
Make alloc link the structures and initialize
static data, and make init set up state
variables and clear stuff.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This was the final missing bit to making sure the
device doesn't assert interrupts to host.
This should fix possible race when target crashes
during driver teardown.
This also removes an early warm reset workaround
during pci probing.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
If MSI isn't configured, the device ROM program expects
legacy interrupts to be enabled before it can
fully boot. Don't forget to disable legacy
interrupts after that.
While at it re-use the legacy irq enabling helper
instead of calling ath10k_pci_write32().
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Commit 3a0861fffd ("ath10k: remove ath10k_bus") removed enum ath10k_bus
because it was not used for anything at the time. But now it's needed for
retrieving the right calibration data file, so add it back. The only new
addition is ath10k_bus_str().
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This is required if we take into account the possibility of loading the driver
from initrd (RAM disk), in other words: very early in the boot process,
before the file system is visible.
In such a case we need to have the firmware files accessible from the RAM disk
too, and this patch guarantees this.
Signed-off-by: Bartosz Markowski <bartosz.markowski@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Add three counters related to firmware crashes or resets.
Usage:
# cat /sys/kernel/debug/ieee80211/phy0/ath10k/fw_reset_stats
fw_crash_counter 2
fw_warm_reset_counter 43
fw_cold_reset_counter 0
#
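For monitoring scripts, the output above can be turned into a dictionary. A
minimal Python sketch, assuming every line is a "name value" pair as in the
usage example; the function name parse_reset_stats is hypothetical:

```python
SAMPLE = """fw_crash_counter 2
fw_warm_reset_counter 43
fw_cold_reset_counter 0
"""


def parse_reset_stats(text):
    """Parse 'name value' lines from fw_reset_stats into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # skip blank or unexpected lines
            stats[fields[0]] = int(fields[1])
    return stats


if __name__ == "__main__":
    print(parse_reset_stats(SAMPLE))
```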
kvalo: split into its own patch, add debugfs file and add locking
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
diag_read() is used for reading from firmware memory via the diagnostic window.
The first user will be the cal_data debugfs file.
To serialise diagnostic window access and make it safe to use while firmware is
running, take ce_lock both in ath10k_pci_diag_write_mem() and
ath10k_pci_diag_read_mem(). Because of that, all the CE calls had to be changed
to _nolock variants.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This makes it easier to debug the device-target
communication at a very low level.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Fixes checkpatch warnings:
ath10k/htc.c:49: WARNING: Possible unnecessary 'out of memory' message
ath10k/htc.c:810: WARNING: Possible unnecessary 'out of memory' message
ath10k/htt.h:1034: CHECK: Please use a blank line after function/struct/union/enum declarations
ath10k/htt_rx.c:135: CHECK: Unnecessary parentheses around htt->rx_ring.alloc_idx.vaddr
ath10k/htt_rx.c:173: CHECK: Unnecessary parentheses around htt->rx_ring.alloc_idx.vaddr
ath10k/pci.c:633: WARNING: macros should not use a trailing semicolon
ath10k/wmi.c:3594: WARNING: quoted string split across lines
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Remove the ugly _access functions. Being explicit
is a good thing.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Commit 5c771e7454
introduced a regression. On some systems, spurious
interrupts could schedule a tasklet while tearing
down, leading to, e.g.:
BUG: unable to handle kernel paging request at fe589030
IP: [<c1316fb0>] ioread32+0x30/0x40
...
Call Trace:
[<fe576c1b>] ath10k_pci_tasklet+0x1b/0x60 [ath10k_pci]
[<c1053fbe>] tasklet_action+0x9e/0xb0
[<c10534f1>] __do_softirq+0xf1/0x3f0
[<c1053400>] ? ftrace_raw_event_irq_handler_entry+0xa0/0xa0
[<c1004999>] do_softirq_own_stack+0x29/0x40
<IRQ>
[<c1053a76>] irq_exit+0x86/0xb0
...
[<c132d522>] do_pci_disable_device+0x52/0x60
[<c132d57f>] pci_disable_device+0x4f/0xb0
[<c132a961>] ? __pci_set_master+0x51/0x80
[<fe5740b3>] ath10k_pci_release+0x33/0x40 [ath10k_pci]
[<fe575d4b>] ath10k_pci_remove+0x7b/0x90 [ath10k_pci]
Reported-by: Kalle Valo <kvalo@qca.qualcomm.com>
Tested-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Recent changes done to start/restart sequences
broke hw recovery in some hw configurations. The
pci transport was stopped twice; however, due to a
workaround in the pci disabling code, the
disable/enable for the first msi interrupt was not
balanced. This ended up with irqs not being
properly re-enabled and the following printout
during recovery:
ath10k: failed to receive control response completion, polling..
ath10k: Service connect timeout: -110
ath10k: Could not init core: -110
Legacy interrupt mode was unaffected while msi
ranged mode would be partially crippled (it would
miss fw indication interrupts but otherwise it
worked fine).
This fixes completely broken fw recovery for a
single msi interrupt mode and fixes subsequent fw
crash reports for msi range interrupt mode.
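The imbalance described above can be pictured with a nesting counter, which is
how disable_irq()/enable_irq() behave in Linux: each disable must be matched
by one enable before the interrupt is delivered again. The Python class below
is only a toy model with hypothetical names, not driver code, and the
stop/start sequence is one assumed way such an imbalance can arise.

```python
class IrqLine:
    """Toy model of a nested irq disable counter."""

    def __init__(self):
        self.depth = 0  # 0 means the interrupt is enabled

    def disable(self):
        self.depth += 1

    def enable(self):
        if self.depth == 0:
            raise RuntimeError("unbalanced enable")
        self.depth -= 1

    @property
    def enabled(self):
        return self.depth == 0


def stop_twice_then_start():
    """Two disables followed by one enable leave the line disabled."""
    irq = IrqLine()
    irq.disable()  # first transport stop
    irq.disable()  # second transport stop
    irq.enable()   # single transport start
    return irq.enabled


if __name__ == "__main__":
    print("irq enabled after restart:", stop_twice_then_start())
```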
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Some copy engine structures are target specific
and are uploaded to the device during
init/configuration.
This also cleans up the diag_mem_read/write
implicit byteswap mess a bit, leaving only
diag_access_read/write with an implicit endianness
byteswap.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The mapping is already defined in a structure. It
makes little sense to duplicate information stored
in it within a function.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It doesn't make much sense to have copy engine
configuration structures spread across the whole
source file.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Recent crash dump patches introduced a regression.
If debugfs was disabled, upon crash the user could only
see the following:
[ 793.880000] ath10k: firmware crashed! (uuid n/a)
[ 793.890000] ath10k: qca988x hw2.0 (0x4100016c, 0x043202ff) fw 10.1.467.2-1 api 2 htt 2.1
[ 793.890000] ath10k: debug 0 debugfs 0 tracing 0 dfs 1
The report was missing the register dump. Fix it by
printing registers regardless of whether crash_data is
present or not.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This makes it a lot easier to log and debug
messages if there's more than 1 ath10k device on a
system.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
There are basically no more uses for
ar_pci->started. It is also perfectly safe to call
hif_stop without hif_start now.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Structures used by these functions are now
guaranteed to remain accessible until driver is
unregistered.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The old comment was a little out of date. HTT Rx
ring is a more relevant problem when stopping
transport layer.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It was possible on a host system running low on
memory to end up with no rx buffers on pci pipes.
This makes the driver more robust as it won't fail
to start if it can't allocate all rx buffers right
away. If it is fatal then upper layers will notice
trouble anyway.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It's not really necessary to have a dedicated irq
handler just for the sake of catching early fw
crashes anymore. It is now safe to use one handler
even during early stages of device boot up.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This fixes two corner cases.
One is a race between disabling copy engine
interrupts and unhandled pending interrupts on the
host. This could end up with a runaway tasklet and
consequently memory leak of a few copy engine
rx buffers.
The other one is an unexpected (and non-maskable
via device CSR) MSI fw indication interrupt during
teardown. This could trigger the same problem as
the first corner case.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It doesn't make much sense to overwrite send_cb
and recv_cb callbacks over and over again whenever
transport starts. Just make sure to unmask copy
engine interrupts when starting.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It doesn't make sense to re-init irqs completely
whenever transport is started/stopped. Do it just
once upon probing/removing.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Wrong register was being set up. This could
prevent firmware from booting in some rare cases
when using legacy interrupts.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Sometimes users forget to include important info like firmware version,
so better to print all the info.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Better to have a clear name for the function. While at it, clear up the title
for the register dump.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Store the firmware registers and other relevant data to a firmware crash dump
file and provide it to user-space via debugfs. Should help with figuring out
why the firmware crashed.
kvalo: remove dbglog support, rework and refactor the code to avoid ifdefs and
otherwise simplify it as well
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
ath10k_pci_diag_read32() is for reading u32 from a device and ath10k_pci_diag_read_hi()
is a helper for reading data using "host interest" table.
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to
meet kernel coding style guidelines. This issue was reported by checkpatch.
A simplified version of the semantic patch that makes this change is as
follows (http://coccinelle.lip6.fr/):
// <smpl>
@@
identifier i;
declarer name DEFINE_PCI_DEVICE_TABLE;
initializer z;
@@
- DEFINE_PCI_DEVICE_TABLE(i)
+ const struct pci_device_id i[]
= z;
// </smpl>
[bhelgaas: add semantic patch]
Signed-off-by: Benoit Taine <benoit.taine@lip6.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make probe/remove functions shorter and easier to
understand.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The ATH10K_PCI_FEATURE_MSI_X was originally
introduced to support both chips QCA988Xv1 and
QCA988Xv2. Since v1 isn't supported anymore it
doesn't make sense to keep the feature flag
around. Since this is the last one remove the
whole thing.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The soc powersave was disabled by default. It
never was fully tested. Some hw apparently had
problems with it and the implementation itself had
a possible race.
Just remove the refcounting and simply wake up the
device when probing and put to sleep when
removing.
kvalo: make ath10k_pci_wake() and _sleep() static
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Use the common convention of embedding private
structures inside parent structures. This
reduces allocations and simplifies pci probing
code.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The 10.2 firmware is a successor of 10.1 firmware
(formerly identified as 10.x). Both share a lot
but have some slight ABI differences that need to
be taken care of.
The 10.2 firmware introduces some new features but
those can be added in subsequent patches. This
patch makes ath10k boot and work with 10.2 with
comparable functionality to 10.1.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It was possible to enter an endless loop while
processing a single pci copy engine pipe. This
could effectively render ath10k incapable of
responding to any requests.
An example case when this could happen is when
firmware generates a lot of events, e.g. spectral
scan phyerr via WMI.
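The commit message does not say how the loop is bounded. One generic way to
keep a single busy pipe from starving everything else is a per-invocation
budget, sketched below in Python. This is a swapped-in illustration of that
technique, not necessarily what the patch does; drain_pipe and its parameters
are hypothetical.

```python
from collections import deque


def drain_pipe(pending, handle, budget=64):
    """Process at most `budget` queued entries and return how many ran.

    Anything left in `pending` is picked up on the next invocation,
    so a pipe that keeps filling up cannot hold the caller forever.
    """
    handled = 0
    while pending and handled < budget:
        handle(pending.popleft())
        handled += 1
    return handled


if __name__ == "__main__":
    queue = deque(range(100))
    seen = []
    print(drain_pipe(queue, seen.append, budget=10), "handled,",
          len(queue), "left")
```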
Reported-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
It was possible for tx completion not to be
processed. In that case an old stack pointer was
left on copy engine tx ring. Next bmi exchange
would immediately pop it and use complete() on the
completion struct there causing corruption.
Make sure to wait for both tx and rx completions
properly.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>