We were missing a return statement in the PSL interrupt handler in the
case of an AFU error, which would trigger an "Unhandled CXL PSL IRQ"
warning. We do actually handle these type of errors (by notifying
userspace), so add the missing return IRQ_HANDLED so we don't throw
unecessary warnings.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch adds tracepoints throughout the cxl driver, which can provide
insight into:
- Context lifetimes
- Commands sent to the PSL and AFU and their completion status
- Segment and page table misses and their resolution
- PSL and AFU interrupts
- slbia calls from the powerpc copro_fault code
These tracepoints are mostly intended to aid in debugging (particularly
for new AFU designs), and may be useful standalone or in conjunction
with hardware traces collected by the PSL (read out via the trace
interface in debugfs) and AFUs.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
hwirq has not been initialized, however it is being incremented
and also not being referenced in a loop. This error was detected with
cppcheck:
[drivers/misc/cxl/irq.c:439]: (error) Uninitialized variable: hwirq
Commit 80fa93fce3 ("cxl: Name interrupts in /proc/interrupt")
introduced this error.
This is a simple fix that removes the redundant increment.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-By: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently all interrupts generated by cxl are named "cxl". This is not very
informative as we can't distinguish between cards, AFUs, error interrupts, user
contexts and user interrupts numbers. Being able to distinguish them is useful
for setting affinity.
This patch gives each of these names in /proc/interrupts.
A two card CAPI system, with afu0.0 having 2 active contexts each with 4 user
IRQs each, will now look like this:
% grep cxl /proc/interrupts
444: 0 OPAL ICS 141312 Level cxl-card1-err
445: 0 OPAL ICS 141313 Level cxl-afu1.0-err
446: 0 OPAL ICS 141314 Level cxl-afu1.0
462: 0 OPAL ICS 2052 Level cxl-afu0.0-pe0-1
463: 75517 OPAL ICS 2053 Level cxl-afu0.0-pe0-2
468: 0 OPAL ICS 2054 Level cxl-afu0.0-pe0-3
469: 0 OPAL ICS 2055 Level cxl-afu0.0-pe0-4
470: 0 OPAL ICS 2056 Level cxl-afu0.0-pe1-1
471: 75506 OPAL ICS 2057 Level cxl-afu0.0-pe1-2
472: 0 OPAL ICS 2058 Level cxl-afu0.0-pe1-3
473: 0 OPAL ICS 2059 Level cxl-afu0.0-pe1-4
502: 1066 OPAL ICS 2050 Level cxl-afu0.0
514: 0 OPAL ICS 2048 Level cxl-card0-err
515: 0 OPAL ICS 2049 Level cxl-afu0.0-err
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If an AFU has a hardware bug that causes it to acknowledge a context
terminate or remove while that context has outstanding transactions, it
is possible for the kernel to receive an interrupt for that context
after we have removed it from the context list.
The kernel will not be able to demultiplex the interrupt (or worse - if
we have already reallocated the process handle we could mis-attribute it
to the new context), and printed a big scary warning.
It did not acknowledge the interrupt, which would effectively halt
further translation fault processing on the PSL.
This patch makes the warning clearer about the likely cause of the issue
(i.e. hardware bug) to make it obvious to future AFU designers of what
needs to be fixed. It also prints out the process handle which can then
be matched up with hardware and software traces for debugging.
It also acknowledges the interrupt to the PSL with either an address
error or acknowledge, so that the PSL can continue with other
translations.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This is the core of the cxl driver.
It adds support for using cxl cards in the powernv environment only (ie POWER8
bare metal). It allows access to cxl accelerators by userspace using the
/dev/cxl/afuM.N char devices.
The kernel driver has no knowledge of the function implemented by the
accelerator. It provides services to userspace via the /dev/cxl/afuM.N
devices. When a program opens this device and runs the start work IOCTL, the
accelerator will have coherent access to that processes memory using the same
virtual addresses. That process may mmap the device to access any MMIO space
the accelerator provides. Also, reads on the device will allow interrupts to
be received. These services are further documented in a later patch in
Documentation/powerpc/cxl.txt.
Documentation of the cxl hardware architecture and userspace API is provided in
subsequent patches.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>