Commit Graph

11 Commits

Author SHA1 Message Date
Vakul Garg
3862de1f6c crypto: caam - fix job ring cleanup code
The job ring init function creates a platform device for each job ring.
While the job ring is shutdown, e.g. while caam module removal, its
platform device was not being removed. This leads to failure while
reinsertion and then removal of caam module second time.

The following kernel crash dump appears when caam module is reinserted
and then removed again. This patch fixes it.

root@p4080ds:~# rmmod caam.ko
Unable to handle kernel paging request for data at address 0x00000008
Faulting instruction address: 0xf94aca18
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=8 P4080 DS
Modules linked in: caam(-) qoriq_dbg(O) [last unloaded: caam]
NIP: f94aca18 LR: f94aca18 CTR: c029f950
REGS: eac47d60 TRAP: 0300   Tainted: G           O  (3.8.4-rt2)
MSR: 00029002 <CE,EE,ME>  CR: 22022484  XER: 20000000
DEAR: 00000008, ESR: 00000000
TASK = e49dfaf0[2110] 'rmmod' THREAD: eac46000 CPU: 1
GPR00: f94ad3f4 eac47e10 e49dfaf0 00000000 00000005 ea2ac210 ffffffff 00000000
GPR08: c286de68 e4977ce0 c029b1c0 00000001 c029f950 10029738 00000000 100e0000
GPR16: 00000000 10023d00 1000cbdc 1000cb8c 1000cbb8 00000000 c07dfecc 00000000
GPR24: c07e0000 00000000 1000cbd8 f94e0000 ffffffff 00000000 ea53cd40 00000000
NIP [f94aca18] caam_reset_hw_jr+0x18/0x1c0 [caam]
LR [f94aca18] caam_reset_hw_jr+0x18/0x1c0 [caam]
Call Trace:
[eac47e10] [eac47e30] 0xeac47e30 (unreliable)
[eac47e20] [f94ad3f4] caam_jr_shutdown+0x34/0x220 [caam]
[eac47e60] [f94ac0e4] caam_remove+0x54/0xb0 [caam]
[eac47e80] [c029fb38] __device_release_driver+0x68/0x120
[eac47e90] [c02a05c8] driver_detach+0xd8/0xe0
[eac47eb0] [c029f8e0] bus_remove_driver+0xa0/0x110
[eac47ed0] [c00768e4] sys_delete_module+0x144/0x270
[eac47f40] [c000e2f0] ret_from_syscall+0x0/0x3c

Signed-off-by: Vakul Garg <vakul@freescale.com>
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Reviewed-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-04-25 21:09:07 +08:00
Kim Phillips
ce026cb9cb crypto: caam - fix possible deadlock condition
commit "crypto: caam - use non-irq versions of spinlocks for job rings"
made two bad assumptions:

(a) The caam_jr_enqueue lock isn't used in softirq context.
Not true: jr_enqueue can be interrupted by an incoming net
interrupt and the received packet may be sent for encryption,
via caam_jr_enqueue in softirq context, thereby inducing a
deadlock.

This is evidenced when running netperf over an IPSec tunnel
between two P4080's, with spinlock debugging turned on:

[  892.092569] BUG: spinlock lockup on CPU#7, netperf/10634, e8bf5f70
[  892.098747] Call Trace:
[  892.101197] [eff9fc10] [c00084c0] show_stack+0x48/0x15c (unreliable)
[  892.107563] [eff9fc50] [c0239c2c] do_raw_spin_lock+0x16c/0x174
[  892.113399] [eff9fc80] [c0596494] _raw_spin_lock+0x3c/0x50
[  892.118889] [eff9fc90] [c0445e74] caam_jr_enqueue+0xf8/0x250
[  892.124550] [eff9fcd0] [c044a644] aead_decrypt+0x6c/0xc8
[  892.129625] BUG: spinlock lockup on CPU#5, swapper/5/0, e8bf5f70
[  892.129629] Call Trace:
[  892.129637] [effa7c10] [c00084c0] show_stack+0x48/0x15c (unreliable)
[  892.129645] [effa7c50] [c0239c2c] do_raw_spin_lock+0x16c/0x174
[  892.129652] [effa7c80] [c0596494] _raw_spin_lock+0x3c/0x50
[  892.129660] [effa7c90] [c0445e74] caam_jr_enqueue+0xf8/0x250
[  892.129666] [effa7cd0] [c044a644] aead_decrypt+0x6c/0xc8
[  892.129674] [effa7d00] [c0509724] esp_input+0x178/0x334
[  892.129681] [effa7d50] [c0519778] xfrm_input+0x77c/0x818
[  892.129688] [effa7da0] [c050e344] xfrm4_rcv_encap+0x20/0x30
[  892.129697] [effa7db0] [c04b90c8] ip_local_deliver+0x190/0x408
[  892.129703] [effa7de0] [c04b966c] ip_rcv+0x32c/0x898
[  892.129709] [effa7e10] [c048b998] __netif_receive_skb+0x27c/0x4e8
[  892.129715] [effa7e80] [c048d744] netif_receive_skb+0x4c/0x13c
[  892.129726] [effa7eb0] [c03c28ac] _dpa_rx+0x1a8/0x354
[  892.129732] [effa7ef0] [c03c2ac4] ingress_rx_default_dqrr+0x6c/0x108
[  892.129742] [effa7f10] [c0467ae0] qman_poll_dqrr+0x170/0x1d4
[  892.129748] [effa7f40] [c03c153c] dpaa_eth_poll+0x20/0x94
[  892.129754] [effa7f60] [c048dbd0] net_rx_action+0x13c/0x1f4
[  892.129763] [effa7fa0] [c003d1b8] __do_softirq+0x108/0x1b0
[  892.129769] [effa7ff0] [c000df58] call_do_softirq+0x14/0x24
[  892.129775] [ebacfe70] [c0004868] do_softirq+0xd8/0x104
[  892.129780] [ebacfe90] [c003d5a4] irq_exit+0xb8/0xd8
[  892.129786] [ebacfea0] [c0004498] do_IRQ+0xa4/0x1b0
[  892.129792] [ebacfed0] [c000fad8] ret_from_except+0x0/0x18
[  892.129798] [ebacff90] [c0009010] cpu_idle+0x94/0xf0
[  892.129804] [ebacffb0] [c059ff88] start_secondary+0x42c/0x430
[  892.129809] [ebacfff0] [c0001e28] __secondary_start+0x30/0x84
[  892.281474]
[  892.282959] [eff9fd00] [c0509724] esp_input+0x178/0x334
[  892.288186] [eff9fd50] [c0519778] xfrm_input+0x77c/0x818
[  892.293499] [eff9fda0] [c050e344] xfrm4_rcv_encap+0x20/0x30
[  892.299074] [eff9fdb0] [c04b90c8] ip_local_deliver+0x190/0x408
[  892.304907] [eff9fde0] [c04b966c] ip_rcv+0x32c/0x898
[  892.309872] [eff9fe10] [c048b998] __netif_receive_skb+0x27c/0x4e8
[  892.315966] [eff9fe80] [c048d744] netif_receive_skb+0x4c/0x13c
[  892.321803] [eff9feb0] [c03c28ac] _dpa_rx+0x1a8/0x354
[  892.326855] [eff9fef0] [c03c2ac4] ingress_rx_default_dqrr+0x6c/0x108
[  892.333212] [eff9ff10] [c0467ae0] qman_poll_dqrr+0x170/0x1d4
[  892.338872] [eff9ff40] [c03c153c] dpaa_eth_poll+0x20/0x94
[  892.344271] [eff9ff60] [c048dbd0] net_rx_action+0x13c/0x1f4
[  892.349846] [eff9ffa0] [c003d1b8] __do_softirq+0x108/0x1b0
[  892.355338] [eff9fff0] [c000df58] call_do_softirq+0x14/0x24
[  892.360910] [e7169950] [c0004868] do_softirq+0xd8/0x104
[  892.366135] [e7169970] [c003d5a4] irq_exit+0xb8/0xd8
[  892.371101] [e7169980] [c0004498] do_IRQ+0xa4/0x1b0
[  892.375979] [e71699b0] [c000fad8] ret_from_except+0x0/0x18
[  892.381466] [e7169a70] [c0445e74] caam_jr_enqueue+0xf8/0x250
[  892.387127] [e7169ab0] [c044ad4c] aead_givencrypt+0x6ac/0xa70
[  892.392873] [e7169b20] [c050a0b8] esp_output+0x2b4/0x570
[  892.398186] [e7169b80] [c0519b9c] xfrm_output_resume+0x248/0x7c0
[  892.404194] [e7169bb0] [c050e89c] xfrm4_output_finish+0x18/0x28
[  892.410113] [e7169bc0] [c050e8f4] xfrm4_output+0x48/0x98
[  892.415427] [e7169bd0] [c04beac0] ip_local_out+0x48/0x98
[  892.420740] [e7169be0] [c04bec7c] ip_queue_xmit+0x16c/0x490
[  892.426314] [e7169c10] [c04d6128] tcp_transmit_skb+0x35c/0x9a4
[  892.432147] [e7169c70] [c04d6f98] tcp_write_xmit+0x200/0xa04
[  892.437808] [e7169cc0] [c04c8ccc] tcp_sendmsg+0x994/0xcec
[  892.443213] [e7169d40] [c04eebfc] inet_sendmsg+0xd0/0x164
[  892.448617] [e7169d70] [c04792f8] sock_sendmsg+0x8c/0xbc
[  892.453931] [e7169e40] [c047aecc] sys_sendto+0xc0/0xfc
[  892.459069] [e7169f10] [c047b934] sys_socketcall+0x110/0x25c
[  892.464729] [e7169f40] [c000f480] ret_from_syscall+0x0/0x3c

(b) since the caam_jr_dequeue lock is only used in bh context,
then semantically it should use _bh spin_lock types.  spin_lock_bh
semantics are to disable back-halves, and used when a lock is shared
between softirq (bh) context and process and/or h/w IRQ context.
Since the lock is only used within softirq context, and this tasklet
is atomic, there is no need to do the additional work to disable
back halves.

This patch adds back-half disabling protection to caam_jr_enqueue
spin_locks to fix (a), and drops it from caam_jr_dequeue to fix (b).

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-08-20 16:35:40 +08:00
Bharat Bhushan
1af8ea862c crypto: caam - Using alloc_coherent for caam job rings
The caam job rings (input/output job ring) are allocated using
dma_map_single(). These job rings can be visualized as the ring
buffers in which the jobs are en-queued/de-queued. The s/w enqueues
the jobs in input job ring which h/w dequeues and after processing
it copies the jobs in output job ring. Software then de-queues the
job from output ring. Using dma_map/unmap_single() is not preferred
way to allocate memory for this type of requirements because this
adds un-necessary complexity.

Example, if bounce buffer (SWIOTLB) will get used then to make any
change visible in this memory to other processing unit requires
dmap_unmap_single() or dma_sync_single_for_cpu/device(). The
dma_unmap_single() can not be used as this will free the bounce
buffer, this will require changing the job rings on running system
and I seriously doubt that it will be not possible or very complex
to implement. Also using dma_sync_single_for_cpu/device() will also
add unnecessary complexity.

The simple and preferred way is using dma_alloc_coherent() for these
type of memory requirements.

This resolves the Linux boot crash issue when "swiotlb=force" is set
in bootargs on systems which have memory more than 4G.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Acked-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-07-11 11:06:10 +08:00
Kim Phillips
a0ca6ca022 crypto: caam - one tasklet per job ring
there is no noticeable benefit for multiple cores to process one
job ring's output ring: in fact, we can benefit from cache effects
of having the back-half stay on the core that receives a particular
ring's interrupts, and further relax general contention and the
locking involved with reading outring_used, since tasklets run
atomically.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:07 +08:00
Kim Phillips
14a8e29cc2 crypto: caam - consolidate memory barriers from job ring en/dequeue
Memory barriers are implied by the i/o register write implementation
(at least on Power).  So we can remove the redundant wmb() in
caam_jr_enqueue, and, in dequeue(), hoist the h/w done notification
write up to before we need to increment the head of the ring, and
save an smp_mb.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:07 +08:00
Kim Phillips
a8ea07c21d crypto: caam - only query h/w in job ring dequeue path
Code was needlessly checking the s/w job ring when there
would be nothing to process if the h/w's output completion
ring were empty anyway.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:07 +08:00
Kim Phillips
4bba1e9f41 crypto: caam - use non-irq versions of spinlocks for job rings
The enqueue lock isn't used in any interrupt context, and
the dequeue lock isn't used in the h/w interrupt context,
only in bh context.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:06 +08:00
Kim Phillips
e13af18a3e crypto: caam - assign 40-bit masks on SEC v5.0 and above
SEC v4.x were only 36-bit, SEC v5+ are 40-bit capable.
Also set a DMA mask for any job ring devices created.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:06 +08:00
Kim Phillips
a68d259587 crypto: caam - fix input job ring element dma mapping size
SEC4 h/w gets configured in 32- vs. 36-bit physical
addressing modes depending on the size of dma_addr_t,
which is not always equal to sizeof(u32 *).

Also fixed alignment of a dma_unmap call whilst in there.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-06-27 14:42:03 +08:00
Kim Phillips
9620fd959f crypto: caam - handle interrupt lines shared across rings
- add IRQF_SHARED to request_irq flags to support parts such as
the p1023 that has one IRQ line per couple of rings.

- resetting a job ring triggers an interrupt, so move request_irq
prior to jr_reset to avoid 'got IRQ but nobody cared' messages.

- disable IRQs in h/w to avoid contention between reset and
interrupt status

- delete invalid comment - if there were incomplete jobs,
module would be in use, preventing an unload.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-05-03 09:53:31 +10:00
Kim Phillips
8e8ec596e6 crypto: caam - Add support for the Freescale SEC4/CAAM
The SEC4 supercedes the SEC2.x/3.x as Freescale's
Integrated Security Engine.  Its programming model is
incompatible with all prior versions of the SEC (talitos).

The SEC4 is also known as the Cryptographic Accelerator
and Assurance Module (CAAM); this driver is named caam.

This initial submission does not include support for Data Path
mode operation - AEAD descriptors are submitted via the job
ring interface, while the Queue Interface (QI) is enabled
for use by others.  Only AEAD algorithms are implemented
at this time, for use with IPsec.

Many thanks to the Freescale STC team for their contributions
to this driver.

Signed-off-by: Steve Cornelius <sec@pobox.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-03-27 10:45:16 +08:00