Brad Love
95f408bbc4
media: cx23885: Ryzen DMA related RiSC engine stall fixes
...
This bug affects all of Hauppauge QuadHD boards when used on all Ryzen
platforms and some XEON platforms. On these platforms it is possible to
error out the RiSC engine and cause it to stall, whereafter the only
way to reset the board to a working state is to reboot.
This is the fatal condition with current driver:
[ 255.663598] cx23885: cx23885[0]: mpeg risc op code error
[ 255.663607] cx23885: cx23885[0]: TS1 B - dma channel status dump
[ 255.663612] cx23885: cx23885[0]: cmds: init risc lo : 0xffe54000
[ 255.663615] cx23885: cx23885[0]: cmds: init risc hi : 0x00000000
[ 255.663619] cx23885: cx23885[0]: cmds: cdt base : 0x00010870
[ 255.663622] cx23885: cx23885[0]: cmds: cdt size : 0x0000000a
[ 255.663625] cx23885: cx23885[0]: cmds: iq base : 0x00010630
[ 255.663629] cx23885: cx23885[0]: cmds: iq size : 0x00000010
[ 255.663632] cx23885: cx23885[0]: cmds: risc pc lo : 0xffe54018
[ 255.663636] cx23885: cx23885[0]: cmds: risc pc hi : 0x00000000
[ 255.663639] cx23885: cx23885[0]: cmds: iq wr ptr : 0x00004192
[ 255.663642] cx23885: cx23885[0]: cmds: iq rd ptr : 0x0000418c
[ 255.663645] cx23885: cx23885[0]: cmds: cdt current : 0x00010898
[ 255.663649] cx23885: cx23885[0]: cmds: pci target lo : 0xf85ca340
[ 255.663652] cx23885: cx23885[0]: cmds: pci target hi : 0x00000000
[ 255.663655] cx23885: cx23885[0]: cmds: line / byte : 0x000c0000
[ 255.663659] cx23885: cx23885[0]: risc0:
[ 255.663661] 0x1c0002f0 [ write sol eol count=752 ]
[ 255.663666] cx23885: cx23885[0]: risc1:
[ 255.663667] 0xf85ca050 [ INVALID sol 22 20 19 18 resync 13 count=80 ]
[ 255.663674] cx23885: cx23885[0]: risc2:
[ 255.663674] 0x00000000 [ INVALID count=0 ]
[ 255.663678] cx23885: cx23885[0]: risc3:
[ 255.663679] 0x1c0002f0 [ write sol eol count=752 ]
[ 255.663684] cx23885: cx23885[0]: (0x00010630) iq 0:
[ 255.663685] 0x1c0002f0 [ write sol eol count=752 ]
[ 255.663690] cx23885: cx23885[0]: iq 1: 0xf85ca630 [ arg #1 ]
[ 255.663693] cx23885: cx23885[0]: iq 2: 0x00000000 [ arg #2 ]
[ 255.663696] cx23885: cx23885[0]: (0x0001063c) iq 3:
[ 255.663697] 0x1c0002f0 [ write sol eol count=752 ]
[ 255.663702] cx23885: cx23885[0]: iq 4: 0xf85ca920 [ arg #1 ]
[ 255.663705] cx23885: cx23885[0]: iq 5: 0x00000000 [ arg #2 ]
[ 255.663709] cx23885: cx23885[0]: (0x00010648) iq 6:
[ 255.663709] 0xf85ca340 [ INVALID sol 22 20 19 18 resync 13 count=832 ]
[ 255.663716] cx23885: cx23885[0]: (0x0001064c) iq 7:
[ 255.663717] 0x00000000 [ INVALID count=0 ]
[ 255.663721] cx23885: cx23885[0]: (0x00010650) iq 8:
[ 255.663721] 0x00000000 [ INVALID count=0 ]
[ 255.663725] cx23885: cx23885[0]: (0x00010654) iq 9:
[ 255.663726] 0x1c0002f0 [ write sol eol count=752 ]
[ 255.663731] cx23885: cx23885[0]: iq a: 0xf85c9780 [ arg #1 ]
[ 255.663734] cx23885: cx23885[0]: iq b: 0x00000000 [ arg #2 ]
[ 255.663737] cx23885: cx23885[0]: (0x00010660) iq c:
[ 255.663738] 0x1c0002f0 [ write sol eol count=752 ]
[ 255.663743] cx23885: cx23885[0]: iq d: 0xf85c9a70 [ arg #1 ]
[ 255.663746] cx23885: cx23885[0]: iq e: 0x00000000 [ arg #2 ]
[ 255.663749] cx23885: cx23885[0]: (0x0001066c) iq f:
[ 255.663750] 0x1c0002f0 [ write sol eol count=752 ]
[ 255.663755] cx23885: cx23885[0]: iq 10: 0xf4fa2920 [ arg #1 ]
[ 255.663758] cx23885: cx23885[0]: iq 11: 0x00000000 [ arg #2 ]
[ 255.663759] cx23885: cx23885[0]: fifo: 0x00005000 -> 0x6000
[ 255.663760] cx23885: cx23885[0]: ctrl: 0x00010630 -> 0x10690
[ 255.663764] cx23885: cx23885[0]: ptr1_reg: 0x00005980
[ 255.663767] cx23885: cx23885[0]: ptr2_reg: 0x000108a8
[ 255.663770] cx23885: cx23885[0]: cnt1_reg: 0x0000000b
[ 255.663773] cx23885: cx23885[0]: cnt2_reg: 0x00000003
Included is checks of the TC_REQ and TC_REQ_SET registers during states
of board initialization, reset, DMA start, and DMA stop. If both registers
are set, this indicates a stall in the RiSC engine, at which point the
bridge error is cleared.
A small delay is introduced in stop_dma as well, to allow transfers in
progress to finish.
After application all models work on Ryzen, occasionally yielding:
cx23885_clear_bridge_error: dma in progress detected 0x00000001 0x00000001, clearing
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Hans Verkuil <hansverk@cisco.com>
[hansverk@cisco.com: fix compiler warning of unused 'reg' variable]
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-05-11 11:29:11 -04:00