Commit Graph

662304 Commits

Author SHA1 Message Date
Marc Zyngier
6c9ae25dfc arm64: KVM: Move lr save/restore to do_el2_call
At the moment, we only save/restore lr if on VHE, as we rely only
the EL1 code to have preserved it in the non-VHE case.

As we're about to get rid of the latter, let's move the save/restore
code to the do_el2_call macro, unifying both code paths.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:49:18 -07:00
Marc Zyngier
50d912cc3e arm64: hyp-stub: Stop pointlessly clobbering lr
When entering the kernel hyp stub, we check whether or not we've
made it here through an HVC instruction, clobbering lr (aka x30)
in the process.

This is completely pointless, as HVC is the only way to get here
(all traps to EL2 are disabled, no interrupt override is applied).

So let's remove this bit of code whose only point is to corrupt
a valuable register.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:49:17 -07:00
Marc Zyngier
9d0d4d34d9 arm: KVM: Treat CP15 accessors returning false as successful
Instead of considering that a CP15 accessor has failed when
returning false, let's consider that it is *always* successful
(after all, we won't stand for an incomplete emulation).

The return value now simply indicates whether we should skip
the instruction (because it has now been emulated), or if we
should leave the PC alone if the emulation has injected an
exception.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-09 07:49:17 -07:00
Marc Zyngier
b1d4cb6983 arm: KVM: Make unexpected register accesses inject an undef
Reads from write-only system registers are generally confined to
EL1 and not propagated to EL2 (that's what the architecture
mantates). In order to be sure that we have a sane behaviour
even in the unlikely event that we have a broken system, we still
handle it in KVM. Same goes for write to RO registers.

In that case, let's inject an undef into the guest.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-09 07:49:16 -07:00
Marc Zyngier
b6b7a8069d arm64: KVM: Do not corrupt registers on failed 64bit CP read
If we fail to emulate a mrrc instruction, we:
1) deliver an exception,
2) spit a nastygram on the console,
3) write back some garbage to Rt/Rt2

While 1) and 2) are perfectly acceptable, 3) is out of the scope of
the architecture... Let's mimick the code in kvm_handle_cp_32 and
be more cautious.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:49:15 -07:00
Marc Zyngier
e70b952263 arm64: KVM: Treat sysreg accessors returning false as successful
Instead of considering that a sysreg accessor has failed when
returning false, let's consider that it is *always* successful
(after all, we won't stand for an incomplete emulation).

The return value now simply indicates whether we should skip
the instruction (because it has now been emulated), or if we
should leave the PC alone if the emulation has injected an
exception.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-09 07:49:15 -07:00
Marc Zyngier
e044323016 arm64: KVM: PMU: Inject UNDEF on read access to PMSWINC_EL0
PMSWINC_EL0 is a WO register, so let's UNDEF when reading from it
(in the highly hypothetical case where this doesn't UNDEF at EL1).

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-09 07:49:14 -07:00
Marc Zyngier
7b5b4df1a7 arm64: KVM: Make unexpected reads from WO registers inject an undef
Reads from write-only system registers are generally confined to
EL1 and not propagated to EL2 (that's what the architecture
mantates). In order to be sure that we have a sane behaviour
even in the unlikely event that we have a broken system, we still
handle it in KVM.

In that case, let's inject an undef into the guest.

Let's also remove write_to_read_only which isn't used anywhere.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-09 07:49:14 -07:00
Marc Zyngier
9008c235cb arm64: KVM: PMU: Inject UNDEF on non-privileged accesses
access_pminten() and access_pmuserenr() can only be accessed when
the CPU is in a priviledged mode. If it is not, let's inject an
UNDEF exception.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-09 07:49:13 -07:00
Marc Zyngier
24d5950f6b arm64: KVM: PMU: Inject UNDEF exception on illegal register access
Both pmu_*_el0_disabled() and pmu_counter_idx_valid() perform checks
on the validity of an access, but only return a boolean indicating
if the access is valid or not.

Let's allow these functions to also inject an UNDEF exception if
the access was illegal.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-04-09 07:49:13 -07:00
Marc Zyngier
6c0070366d arm64: KVM: PMU: Refactor pmu_*_el0_disabled
There is a lot of duplication in the pmu_*_el0_disabled helpers,
and as we're going to modify them shortly, let's move all the
common stuff in a single function.

No functional change.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:49:12 -07:00
Christoffer Dall
8ac76ef4b5 KVM: arm/arm64: vgic: Improve sync_hwstate performance
There is no need to call any functions to fold LRs when we don't use any
LRs and we don't need to mess with overflow flags, take spinlocks, or
prune the AP list if the AP list is empty.

Note: list_empty is a single atomic read (uses READ_ONCE) and can
therefore check if a list is empty or not without the need to take the
spinlock protecting the list.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:49:12 -07:00
Christoffer Dall
0b09b6e519 KVM: arm/arm64: vgic: Don't check vgic_initialized in sync/flush
Now when we do an early init of the static parts of the VGIC data
structures, we can do things like checking if the AP lists are empty
directly without having to explicitly check if the vgic is initialized
and reduce a bit of work in our critical path.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:49:11 -07:00
Christoffer Dall
966e014919 KVM: arm/arm64: vgic: Implement early VGIC init functionality
Implement early initialization for both the distributor and the CPU
interfaces.  The basic idea is that even though the VGIC is not
functional or not requested from user space, the critical path of the
run loop can still call VGIC functions that just won't do anything,
without them having to check additional initialization flags to ensure
they don't look at uninitialized data structures.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:49:11 -07:00
Christoffer Dall
096f31c436 KVM: arm/arm64: vgic: Get rid of MISR and EISR fields
We don't use these fields anymore so let's nuke them completely.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-04-09 07:49:10 -07:00
Christoffer Dall
b6095b084d KVM: arm/arm64: vgic: Get rid of unnecessary save_maint_int_state
Now when we don't look at the MISR and EISR values anymore, we can get
rid of the logic to save them in the GIC save/restore code.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-04-09 07:49:09 -07:00
Christoffer Dall
af0614991a KVM: arm/arm64: vgic: Get rid of unnecessary process_maintenance operation
Since we always read back the LRs that we wrote to the guest and the
MISR and EISR registers simply provide a summary of the configuration of
the bits in the LRs, there is really no need to read back those status
registers and process them.  We might as well just signal the
notifyfd when folding the LR state and save some cycles in the process.
We now clear the underflow bit in the fold_lr_state functions as we only
need to clear this bit if we had used all the LRs, so this is as good a
place as any to do that work.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-04-09 07:49:07 -07:00
Christoffer Dall
90cac1f52a KVM: arm/arm64: vgic: Only set underflow when actually out of LRs
We currently assume that all the interrupts in our AP list will be
queued to LRs, but that's not necessarily the case, because some of them
could have been migrated away to different VCPUs and only the VCPU
thread itself can remove interrupts from its AP list.

Therefore, slightly change the logic to only setting the underflow
interrupt when we actually run out of LRs.

As it turns out, this allows us to further simplify the handling in
vgic_sync_hwstate in later patches.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:45:32 -07:00
Christoffer Dall
00dafa0fcf KVM: arm/arm64: vgic: Get rid of live_lrs
There is no need to calculate and maintain live_lrs when we always
populate the lowest numbered LRs first on every entry and clear all LRs
on every exit.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-04-09 07:45:31 -07:00
Shih-Wei Li
f6769581e9 KVM: arm/arm64: vgic: Avoid flushing vgic state when there's no pending IRQ
We do not need to flush vgic states in each world switch unless
there is pending IRQ queued to the vgic's ap list. We can thus reduce
the overhead by not grabbing the spinlock and not making the extra
function call to vgic_flush_lr_state.

Note: list_empty is a single atomic read (uses READ_ONCE) and can
therefore check if a list is empty or not without the need to take the
spinlock protecting the list.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shih-Wei Li <shihwei@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:45:31 -07:00
Christoffer Dall
328e566479 KVM: arm/arm64: vgic: Defer touching GICH_VMCR to vcpu_load/put
We don't have to save/restore the VMCR on every entry to/from the guest,
since on GICv2 we can access the control interface from EL1 and on VHE
systems with GICv3 we can access the control interface from KVM running
in EL2.

GICv3 systems without VHE becomes the rare case, which has to
save/restore the register on each round trip.

Note that userspace accesses may see out-of-date values if the VCPU is
running while accessing the VGIC state via the KVM device API, but this
is already the case and it is up to userspace to quiesce the CPUs before
reading the CPU registers from the GIC for an up-to-date view.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:45:22 -07:00
Suzuki K Poulose
056aad67f8 kvm: arm/arm64: Rework gpa callback handlers
In order to perform an operation on a gpa range, we currently iterate
over each page in a user memory slot for the given range. This is
inefficient while dealing with a big range (e.g, a VMA), especially
while unmaping a range. At present, with stage2 unmap on a range with
a hugepage backed region, we clear the PMD when we unmap the first
page in the loop. The remaining iterations simply traverse the page table
down to the PMD level only to see that nothing is in there.

This patch reworks the code to invoke the callback handlers on the
biggest range possible within the memory slot to to reduce the number of
times the handler is called.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
2017-04-09 07:42:50 -07:00
Linus Torvalds
97da3854c5 Linux 4.11-rc3 2017-03-19 19:09:39 -07:00
Linus Torvalds
452b94b8c8 mm/swap: don't BUG_ON() due to uninitialized swap slot cache
This BUG_ON() triggered for me once at shutdown, and I don't see a
reason for the check.  The code correctly checks whether the swap slot
cache is usable or not, so an uninitialized swap slot cache is not
actually problematic afaik.

I've temporarily just switched the BUG_ON() to a WARN_ON_ONCE(), since
I'm not sure why that seemingly pointless check was there.  I suspect
the real fix is to just remove it entirely, but for now we'll warn about
it but not bring the machine down.

Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-19 19:00:47 -07:00
Linus Torvalds
a07a6e4121 powerpc fixes for 4.11 #5
- Wire up statx() syscall
  - Don't print a warning on memory hotplug when HPT resizing isn't available
 
 Thanks to:
   David Gibson, Chandan Rajendra.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYzjnLAAoJEFHr6jzI4aWAwcAQAIeOYvCaoxQg6IEFoWJVc+hv
 YfMy86TotkWFojQDV6OUddlz7S14AypjniegVq9aWOPIVwGqmDfp+W8qRfhUh3ab
 S+uorqmXzAFhIUpzVpVpX/nnK6C3eFQdaKCLOxM0Ev413AWUMu70cPmXlCR+x0Kx
 GxeV+nfIfBdyIL4yhK/aDHSItfUwNTmTPSyaUj17/cwLu7DwMjcwKWjrCYCzgBZW
 1hQeWo+yrfvJ1U4yMEGdnDfuTPnWQZsHMw3qSuPzPCnVMgV6YS3HC7/ZEUvBSSBJ
 Bwr6vVEsniLkHDgyFu3665y4abfvqw2iojXKuTpUQMp40T3RCZlT1FQoMvIoJQGC
 FSMUsqnEudjliURi92zq4ImSbPbezB2bG3EsK1jKOOQ9gGbQa2qjXc2CSJiCtZwE
 zsUCcwRFo8Pl1D/KYMTl52nQG3oOMQV/2ceX73HxajsdShXcyJxFEpMPv6aEKi1E
 YT5i2KUXq3sfFPNWe/4gtfOYX9j+tSqi+CRvnwDNZG+a3XLUk/VRptCeIyntH965
 8GYs28CxLz1qljBeAA7czo+1gpNhl1h6FNl1nhqyWRM6jdcLucMeEml0RYCYPmnk
 +jLO6CkYYpIraM0mRqAolM7fYAtm6dOphaeeezCJIMvD/NTdk+pH86JoA64GQZvE
 v1HXCc4lYyUFlvASrCNl
 =TfiW
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull more powerpc fixes from Michael Ellerman:
 "A couple of minor powerpc fixes for 4.11:

   - wire up statx() syscall

   - don't print a warning on memory hotplug when HPT resizing isn't
     available

  Thanks to: David Gibson, Chandan Rajendra"

* tag 'powerpc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Don't give a warning when HPT resizing isn't available
  powerpc: Wire up statx() syscall
2017-03-19 18:49:28 -07:00
Linus Torvalds
4571bc5abf Merge branch 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:

 - Mikulas Patocka added support for R_PARISC_SECREL32 relocations in
   modules with CONFIG_MODVERSIONS.

 - Dave Anglin optimized the cache flushing for vmap ranges.

 - Arvind Yadav provided a fix for a potential NULL pointer dereference
   in the parisc perf code (and some code cleanups).

 - I wired up the new statx system call, fixed some compiler warnings
   with the access_ok() macro and fixed shutdown code to really halt a
   system at shutdown instead of crashing & rebooting.

* 'parisc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix system shutdown halt
  parisc: perf: Fix potential NULL pointer dereference
  parisc: Avoid compiler warnings with access_ok()
  parisc: Wire up statx system call
  parisc: Optimize flush_kernel_vmap_range and invalidate_kernel_vmap_range
  parisc: support R_PARISC_SECREL32 relocation in modules
2017-03-19 18:11:13 -07:00
Linus Torvalds
8aa3417255 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "The bulk of the changes are in qla2xxx target driver code to address
  various issues found during Cavium/QLogic's internal testing (stable
  CC's included), along with a few other stability and smaller
  miscellaneous improvements.

  There are also a couple of different patch sets from Mike Christie,
  which have been a result of his work to use target-core ALUA logic
  together with tcm-user backend driver.

  Finally, a patch to address some long standing issues with
  pass-through SCSI export of TYPE_TAPE + TYPE_MEDIUM_CHANGER devices,
  which will make folks using physical (or virtual) magnetic tape happy"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (28 commits)
  qla2xxx: Update driver version to 9.00.00.00-k
  qla2xxx: Fix delayed response to command for loop mode/direct connect.
  qla2xxx: Change scsi host lookup method.
  qla2xxx: Add DebugFS node to display Port Database
  qla2xxx: Use IOCB interface to submit non-critical MBX.
  qla2xxx: Add async new target notification
  qla2xxx: Export DIF stats via debugfs
  qla2xxx: Improve T10-DIF/PI handling in driver.
  qla2xxx: Allow relogin to proceed if remote login did not finish
  qla2xxx: Fix sess_lock & hardware_lock lock order problem.
  qla2xxx: Fix inadequate lock protection for ABTS.
  qla2xxx: Fix request queue corruption.
  qla2xxx: Fix memory leak for abts processing
  qla2xxx: Allow vref count to timeout on vport delete.
  tcmu: Convert cmd_time_out into backend device attribute
  tcmu: make cmd timeout configurable
  tcmu: add helper to check if dev was configured
  target: fix race during implicit transition work flushes
  target: allow userspace to set state to transitioning
  target: fix ALUA transition timeout handling
  ...
2017-03-19 18:06:31 -07:00
Linus Torvalds
1b8df61908 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull device-dax fixes from Dan Williams:
 "The device-dax driver was not being careful to handle falling back to
  smaller fault-granularity sizes.

  The driver already fails fault attempts that are smaller than the
  device's alignment, but it also needs to handle the cases where a
  larger page mapping could be established. For simplicity of the
  immediate fix the implementation just signals VM_FAULT_FALLBACK until
  fault-size == device-alignment.

  One fix is for -stable to address pmd-to-pte fallback from the
  original implementation, another fix is for the new (introduced in
  4.11-rc1) pud-to-pmd regression, and a typo fix comes along for the
  ride.

  These have received a build success notification from the kbuild
  robot"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: fix debug output typo
  device-dax: fix pud fault fallback handling
  device-dax: fix pmd/pte fault fallback handling
2017-03-19 15:45:02 -07:00
Himanshu Madhani
6c611d18f3 qla2xxx: Update driver version to 9.00.00.00-k
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:38 -07:00
Quinn Tran
ec7193e260 qla2xxx: Fix delayed response to command for loop mode/direct connect.
Current driver wait for FW to be in the ready state before
processing in-coming commands. For Arbitrated Loop or
Point-to- Point (not switch), FW Ready state can take a while.
FW will transition to ready state after all Nports have been
logged in. In the mean time, certain initiators have completed
the login and starts IO. Driver needs to start processing all
queues if FW is already started.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:38 -07:00
Quinn Tran
482c9dc792 qla2xxx: Change scsi host lookup method.
For target mode, when new scsi command arrive, driver first performs
a look up of the SCSI Host. The current look up method is based on
the ALPA portion of the NPort ID. For Cisco switch, the ALPA can
not be used as the index. Instead, the new search method is based
on the full value of the Nport_ID via btree lib.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:37 -07:00
Himanshu Madhani
c423437e3f qla2xxx: Add DebugFS node to display Port Database
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:37 -07:00
Quinn Tran
15f30a5752 qla2xxx: Use IOCB interface to submit non-critical MBX.
The Mailbox interface is currently over subscribed. We like
to reserve the Mailbox interface for the chip managment and
link initialization. Any non essential Mailbox command will
be routed through the IOCB interface. The IOCB interface is
able to absorb more commands.

Following commands are being routed through IOCB interface

- Get ID List (007Ch)
- Get Port DB (0064h)
- Get Link Priv Stats (006Dh)

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:37 -07:00
Quinn Tran
f1443eebca qla2xxx: Add async new target notification
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:37 -07:00
Anil Gurumurthy
54b9993c8c qla2xxx: Export DIF stats via debugfs
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:36 -07:00
Quinn Tran
be25152c0d qla2xxx: Improve T10-DIF/PI handling in driver.
Add routines to support T10 DIF tag.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:36 -07:00
Quinn Tran
5b33469a05 qla2xxx: Allow relogin to proceed if remote login did not finish
If the remote port have started the login process, then the
PLOGI and PRLI should be back to back. Driver will allow
the remote port to complete the process. For the case where
the remote port decide to back off from sending PRLI, this
local port sets an expiration timer for the PRLI. Once the
expiration time passes, the relogin retry logic is allowed
to go through and perform login with the remote port.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:36 -07:00
Quinn Tran
f159b3c7cd qla2xxx: Fix sess_lock & hardware_lock lock order problem.
The main lock that needs to be held for CMD or TMR submission
to upper layer is the sess_lock. The sess_lock is used to
serialize cmd submission and session deletion. The addition
of hardware_lock being held is not necessary. This patch removes
hardware_lock dependency from CMD/TMR submission.

Use hardware_lock only for error response in this case.

Path1
       CPU0                    CPU1
       ----                    ----
  lock(&(&ha->tgt.sess_lock)->rlock);
                               lock(&(&ha->hardware_lock)->rlock);
                               lock(&(&ha->tgt.sess_lock)->rlock);
  lock(&(&ha->hardware_lock)->rlock);

Path2/deadlock
*** DEADLOCK ***
Call Trace:
dump_stack+0x85/0xc2
print_circular_bug+0x1e3/0x250
__lock_acquire+0x1425/0x1620
lock_acquire+0xbf/0x210
_raw_spin_lock_irqsave+0x53/0x70
qlt_sess_work_fn+0x21d/0x480 [qla2xxx]
process_one_work+0x1f4/0x6e0

Cc: <stable@vger.kernel.org>
Cc: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reported-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:08 -07:00
Quinn Tran
8f6fc8d4e7 qla2xxx: Fix inadequate lock protection for ABTS.
Normally, ABTS is sent to Target Core as Task MGMT command.
In the case of error, qla2xxx needs to send response, hardware_lock
is required to prevent request queue corruption.

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:08 -07:00
Quinn Tran
8b666809e1 qla2xxx: Fix request queue corruption.
When FW notify driver or driver detects low FW resource,
driver tries to send out Busy SCSI Status to tell Initiator
side to back off. During the send process, the lock was not held.

Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:08 -07:00
Quinn Tran
ae940f2c47 qla2xxx: Fix memory leak for abts processing
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:28:08 -07:00
Joe Carnuccio
c4a9b538ab qla2xxx: Allow vref count to timeout on vport delete.
Cc: <stable@vger.kernel.org>
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 17:27:56 -07:00
Nicholas Bellinger
7d7a743543 tcmu: Convert cmd_time_out into backend device attribute
Instead of putting cmd_time_out under ../target/core/user_0/foo/control,
which has historically been used by parameters needed for initial
backend device configuration, go ahead and move cmd_time_out into
a backend device attribute.

In order to do this, tcmu_module_init() has been updated to create
a local struct configfs_attribute **tcmu_attrs, that is based upon
the existing passthrough_attrib_attrs along with the new cmd_time_out
attribute.  Once **tcm_attrs has been setup, go ahead and point
it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core.

Also following MNC's previous change, ->cmd_time_out is stored in
milliseconds but exposed via configfs in seconds.  Also, note this
patch restricts the modification of ->cmd_time_out to before +
after the TCMU device has been configured, but not while it has
active fabric exports.

Cc: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 16:32:30 -07:00
Mike Christie
af980e46a2 tcmu: make cmd timeout configurable
A single daemon could implement multiple types of devices
using multuple types of real devices that may not support
restarting from crashes and/or handling tcmu timeouts. This
makes the cmd timeout configurable, so handlers that do not
support it can turn if off for now.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 16:32:27 -07:00
Mike Christie
972c7f1679 tcmu: add helper to check if dev was configured
This adds a helper to check if the dev was configured. It
will be used in the next patch to prevent updates to some
config settings after the device has been setup.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 16:32:19 -07:00
Linus Torvalds
93afaa4513 OpenRISC fixes for 4.11
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYy4GpAAoJEMOzHC1eZifkRm8P/jD5fc3XqdBy9hHX2x5p0mtz
 UNO+tHIO5wjadsf6iMcS/2Ozc8SOXdAietnCK8khodn0RvXBUEW+samSxGKNY8zO
 0Oz3EcUEN3SEJ03sfMwE0G+O+LTUIgdSQ/2galTy6WqrrHVZV5Gp/z9JjDm19kUW
 j5U2/rqVBENSPb+nhOgeq59KpUep7RacXXLDCl34vmQKtFY2fMHa2ddOdtTEBJQh
 HXi7js0EfWeU6LNhMzcuUk1gwitfgNl85up2XBZmgxqe9J7Ysbosv8yH4GyrNjZU
 7TOM4NTUM+LHfVQHCemJ0Ie4CJqMzafY4cdrXC6YMIB86bbLZSMw64FBINcqg8XF
 +RASeFbzUZO38dRccepfRxmSoOPjrN4Nsj83pzdh/4jyD1gDVuQgIPwLONZag3sU
 g9W7bBPPTPaRXn4MA1iyqyLwck2RIpjSMW867LXrgs6cmZXLpccW/psh/c8w08h8
 8c4e++6IoiEuOp4TN0Z38TVT82apfQyfUkHepL26nI5YYgilfM8/9oPAM9l7WRSR
 Zyn7i3TEDX0kmt29ImUjf1ZBGIHzP3krepDkFi+gP5Xycf4sYLXjTffoqlbx9s4U
 8FVdXdiXJGAE3tm+H0DvbS+oTC3eW175SfJ88GZiYog3eWWPnYyMysk/Iv2qAZuE
 KUR4aVx5Rb5+gJAfga2/
 =lNoR
 -----END PGP SIGNATURE-----

Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linux

Pull OpenRISC fixes from Stafford Horne:
 "OpenRISC fixes for build issues that were exposed by kbuild robots
  after 4.11 merge. All from allmodconfig builds. This includes:

   - bug in the handling of 8-byte get_user() calls

   - module build failure due to multile missing symbol exports"

* tag 'openrisc-for-linus' of git://github.com/openrisc/linux:
  openrisc: Export symbols needed by modules
  openrisc: fix issue handling 8 byte get_user calls
  openrisc: xchg: fix `computed is not used` warning
2017-03-18 15:50:39 -07:00
Mike Christie
760bf578ed target: fix race during implicit transition work flushes
This fixes the following races:

1. core_alua_do_transition_tg_pt could have read
tg_pt_gp_alua_access_state and gone into this if chunk:

if (!explicit &&
        atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) ==
           ALUA_ACCESS_STATE_TRANSITION) {

and then core_alua_do_transition_tg_pt_work could update the
state. core_alua_do_transition_tg_pt would then only set
tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would
not get updated with the second calls state.

2. core_alua_do_transition_tg_pt could be setting
tg_pt_gp_transition_complete while the tg_pt_gp_transition_work
is already completing. core_alua_do_transition_tg_pt then waits on the
completion that will never be called.

To handle these issues, we just call flush_work which will return when
core_alua_do_transition_tg_pt_work has completed so there is no need
to do the complete/wait. And, if core_alua_do_transition_tg_pt_work
was running, instead of trying to sneak in the state change, we just
schedule up another core_alua_do_transition_tg_pt_work call.

Note that this does not handle a possible race where there are multiple
threads call core_alua_do_transition_tg_pt at the same time. I think
we need a mutex in target_tg_pt_gp_alua_access_state_store.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 14:47:29 -07:00
Mike Christie
1ca4d4fa3b target: allow userspace to set state to transitioning
Userspace target_core_user handlers like tcmu-runner may want to set the
ALUA state to transitioning while it does implicit transitions. This
patch allows that state when set from configfs.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 14:47:28 -07:00
Mike Christie
d7175373f2 target: fix ALUA transition timeout handling
The implicit transition time tells initiators the min time
to wait before timing out a transition. We currently schedule
the transition to occur in tg_pt_gp_implicit_trans_secs
seconds so there is no room for delays. If
core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata
needs to write out info to a remote file, then the initiator can
easily time out the operation.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 14:47:28 -07:00
Mike Christie
207ee84133 target: Use system workqueue for ALUA transitions
If tcmu-runner is processing a STPG and needs to change the kernel's
ALUA state then we cannot use the same work queue for task management
requests and ALUA transitions, because we could deadlock. The problem
occurs when a STPG times out before tcmu-runner is able to
call into target_tg_pt_gp_alua_access_state_store->
core_alua_do_port_transition -> core_alua_do_transition_tg_pt ->
queue_work. In this case, the tmr is on the work queue waiting for
the STPG to complete, but the STPG transition is now queued behind
the waiting tmr.

Note:
This bug will also be fixed by this patch:
http://www.spinics.net/lists/target-devel/msg14560.html
which switches the tmr code to use the system workqueues.

For both, I am not sure if we need a dedicated workqueue since
it is not a performance path and I do not think we need WQ_MEM_RECLAIM
to make forward progress to free up memory like the block layer does.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 14:47:27 -07:00