linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-25 19:05:18 +07:00

Author	SHA1	Message	Date
James Bottomley	90a88d6ef8	scsi: fix soft lockup in scsi_remove_target() on module removal This softlockup is currently happening: [ 444.088002] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:29] [ 444.088002] Modules linked in: lpfc(-) qla2x00tgt(O) qla2xxx_scst(O) scst_vdisk(O) scsi_transport_fc libcrc32c scst(O) dlm configfs nfsd lockd grace nfs_acl auth_rpcgss sunrpc ed d snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device dm_mod iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic gpio_ich iTCO_vendor_support ppdev snd_hda_intel snd_hda_codec snd_hda _core snd_hwdep tg3 snd_pcm snd_timer libphy lpc_ich parport_pc ptp acpi_cpufreq snd pps_core fjes parport i2c_i801 ehci_pci tpm_tis tpm sr_mod cdrom soundcore floppy hwmon sg 8250_ fintek pcspkr i915 drm_kms_helper uhci_hcd ehci_hcd drm fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit usbcore button video usb_common fan ata_generic ata_piix libata th ermal [ 444.088002] CPU: 1 PID: 29 Comm: kworker/1:1 Tainted: G O 4.4.0-rc5-2.g1e923a3-default #1 [ 444.088002] Hardware name: FUJITSU SIEMENS ESPRIMO E /D2164-A1, BIOS 5.00 R1.10.2164.A1 05/08/2006 [ 444.088002] Workqueue: fc_wq_4 fc_rport_final_delete [scsi_transport_fc] [ 444.088002] task: f6266ec0 ti: f6268000 task.ti: f6268000 [ 444.088002] EIP: 0060:[<c07e7044>] EFLAGS: 00000286 CPU: 1 [ 444.088002] EIP is at _raw_spin_unlock_irqrestore+0x14/0x20 [ 444.088002] EAX: 00000286 EBX: f20d3800 ECX: 00000002 EDX: 00000286 [ 444.088002] ESI: f50ba800 EDI: f2146848 EBP: f6269ec8 ESP: f6269ec8 [ 444.088002] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 444.088002] CR0: 8005003b CR2: 08f96600 CR3: 363ae000 CR4: 000006d0 [ 444.088002] Stack: [ 444.088002] f6269eec c066b0f7 00000286 f2146848 f50ba808 f50ba800 f50ba800 f2146a90 [ 444.088002] f2146848 f6269f08 f8f0a4ed f3141000 f2146800 f2146a90 f619fa00 00000040 [ 444.088002] f6269f40 c026cb25 00000001 166c6392 00000061 f6757140 f6136340 00000004 [ 444.088002] Call Trace: [ 444.088002] [<c066b0f7>] scsi_remove_target+0x167/0x1c0 [ 444.088002] [<f8f0a4ed>] fc_rport_final_delete+0x9d/0x1e0 [scsi_transport_fc] [ 444.088002] [<c026cb25>] process_one_work+0x155/0x3e0 [ 444.088002] [<c026cde7>] worker_thread+0x37/0x490 [ 444.088002] [<c027214b>] kthread+0x9b/0xb0 [ 444.088002] [<c07e72c1>] ret_from_kernel_thread+0x21/0x40 What appears to be happening is that something has pinned the target so it can't go into STARGET_DEL via final release and the loop in scsi_remove_target spins endlessly until that happens. The fix for this soft lockup is to not keep looping over a device that we've called remove on but which hasn't gone into DEL state. This patch will retain a simplistic memory of the last target and not keep looping over it. Reported-by: Sebastian Herbszt <herbszt@gmx.de> Tested-by: Sebastian Herbszt <herbszt@gmx.de> Fixes: `4099819356` Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2016-02-11 21:47:32 -08:00
James Bottomley	abaee091a1	Merge branch 'jejb-scsi' into misc	2016-01-07 15:51:13 -08:00
James Bottomley	be9e2f775f	Merge branch 'mkp-fixes' into fixes	2015-12-03 09:32:33 -08:00
Hannes Reinecke	248d4fe95f	scsi: export 'wwid' to sysfs Use scsi_vpd_lun_id() to export the world-wide unique id (wwid) to sysfs. Note that this is the 'best' wwid according to the rules in scsi_vpd_lun_id(), not every possible wwid presented by the drive. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-12-02 16:42:13 -05:00
Hannes Reinecke	221255aee6	scsi: ignore errors from scsi_dh_add_device() device handler initialisation might fail due to a number of reasons. But as device_handlers are optional this shouldn't cause us to disable the device entirely. So just ignore errors from scsi_dh_add_device(). Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-12-02 16:29:46 -05:00
Hannes Reinecke	41f95dd2ef	scsi_dh: move 'dh_state' sysfs attribute to generic code As scsi_dh.c is now always compiled in we should be moving the 'dh_state' attribute to the generic code. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-12-02 16:29:19 -05:00
Bart Van Assche	e619e6cbec	Revert "scsi: Fix a bdi reregistration race" The SCSI sd driver probes SCSI devices asynchronously. The sd_remove() function, called indirectly by device_del(), waits until asynchronous probing has finished. Since the block layer queue must only be cleaned up after probing has finished, device_del() has to be called before blk_cleanup_queue(). Hence revert commit `bf2cf3baa2`. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-12-02 11:53:44 -08:00
Hannes Reinecke	09e2b0b146	scsi: rescan VPD attributes The VPD page information might change, so we need to be able to update it. This patch implements a VPD page rescan whenever the 'rescan' sysfs attribute is triggered. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Shane Seymour <shane.seymour@hpe.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-30 11:23:45 -05:00
Vitaly Kuznetsov	be821fd8e6	scsi_sysfs: protect against double execution of __scsi_remove_device() On some host errors storvsc module tries to remove sdev by scheduling a job which does the following: sdev = scsi_device_lookup(wrk->host, 0, 0, wrk->lun); if (sdev) { scsi_remove_device(sdev); scsi_device_put(sdev); } While this code seems correct the following crash is observed: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC RIP: 0010:[<ffffffff81169979>] [<ffffffff81169979>] bdi_destroy+0x39/0x220 ... [<ffffffff814aecdc>] ? _raw_spin_unlock_irq+0x2c/0x40 [<ffffffff8127b7db>] blk_cleanup_queue+0x17b/0x270 [<ffffffffa00b54c4>] __scsi_remove_device+0x54/0xd0 [scsi_mod] [<ffffffffa00b556b>] scsi_remove_device+0x2b/0x40 [scsi_mod] [<ffffffffa00ec47d>] storvsc_remove_lun+0x3d/0x60 [hv_storvsc] [<ffffffff81080791>] process_one_work+0x1b1/0x530 ... The problem comes with the fact that many such jobs (for the same device) are being scheduled simultaneously. While scsi_remove_device() uses shost->scan_mutex and scsi_device_lookup() will fail for a device in SDEV_DEL state there is no protection against someone who did scsi_device_lookup() before we actually entered __scsi_remove_device(). So the whole scenario looks like that: two callers do simultaneous (or preemption happens) calls to scsi_device_lookup() ant these calls succeed for both of them, after that they try doing scsi_remove_device(). shost->scan_mutex only serializes their calls to __scsi_remove_device() and we end up doing the cleanup path twice. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-19 12:15:09 -05:00
Linus Torvalds	d83763f4a6	SCSI misc on 20151113 Sorry for the delay in this patch which was mostly caused by getting the merger of the mpt2/mpt3sas driver, which was seen as an essential item of maintenance work to do before the drivers diverge too much. Unfortunately, this caused a compile failure (detected by linux-next), which then had to be fixed up and incubated. In addition to the mpt2/3sas rework, there are updates from pm80xx, lpfc, bnx2fc, hpsa, ipr, aacraid, megaraid_sas, storvsc and ufs plus an assortment of changes including some year 2038 issues, a fix for a remove before detach issue in some drivers and a couple of other minor issues. Signed-off-by: James Bottomley <JBottomley@Odin.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJWRnpiAAoJEDeqqVYsXL0MJd0IAIkNP1q/6ksMw/lam2UlbdxN zFsFOIhGM3xLnBFehbCx7JW3cmybA7WC5jhMjaa1OeJmYHLpwbTzvVwrLs2l/0y1 /S+G0wwxROZfIKj/1EO3oKbSPCu9N3lStxOnmNXt5PSUEzAXrTqSNTtnMf2ZGh7j bQxTWEJe+66GckgGw4ozTXJHWXqM/Zs/FsYjn2h/WzFhFv8utr7zRSzHMVjcqpLG TK/Lt03hiTGqiignKrV9W6JzdGTWf2LGIsj/njgR0dxv59cNH8PrHIjZyknSBxgn lblinsCjxDHWnP4BSfTi9MQG1lEiZiWO3Y6TKkKJTgxZ9M0Eitspc+cLOiJ1mpg= =HvQf -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull final round of SCSI updates from James Bottomley: "Sorry for the delay in this patch which was mostly caused by getting the merger of the mpt2/mpt3sas driver, which was seen as an essential item of maintenance work to do before the drivers diverge too much. Unfortunately, this caused a compile failure (detected by linux-next), which then had to be fixed up and incubated. In addition to the mpt2/3sas rework, there are updates from pm80xx, lpfc, bnx2fc, hpsa, ipr, aacraid, megaraid_sas, storvsc and ufs plus an assortment of changes including some year 2038 issues, a fix for a remove before detach issue in some drivers and a couple of other minor issues" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits) mpt3sas: fix inline markers on non inline function declarations sd: Clear PS bit before Mode Select. ibmvscsi: set max_lun to 32 ibmvscsi: display default value for max_id, max_lun and max_channel. mptfusion: don't allow negative bytes in kbuf_alloc_2_sgl() scsi: pmcraid: replace struct timeval with ktime_get_real_seconds() mvumi: 64bit value for seconds_since1970 be2iscsi: Fix bogus WARN_ON length check scsi_scan: don't dump trace when scsi_prep_async_scan() is called twice mpt3sas: Bump mpt3sas driver version to 09.102.00.00 mpt3sas: Single driver module which supports both SAS 2.0 & SAS 3.0 HBAs mpt2sas, mpt3sas: Update the driver versions mpt3sas: setpci reset kernel oops fix mpt3sas: Added OEM Gen2 PnP ID branding names mpt3sas: Refcount fw_events and fix unsafe list usage mpt3sas: Refcount sas_device objects and fix unsafe list usage mpt3sas: sysfs attribute to report Backup Rail Monitor Status mpt3sas: Ported WarpDrive product SSS6200 support mpt3sas: fix for driver fails EEH, recovery from injected pci bus error mpt3sas: Manage MSI-X vectors according to HBA device type ...	2015-11-13 20:35:54 -08:00
James Bottomley	febdfbd213	SCSI queue for 4.4. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIVAwUAVkP2pu7pYBp1xd49AQL+dg/6Atk5KH17IKE+SGIAaIqS9rhckwqOB6qK 7pT5gbvuyA8nB4eXUFPFk6VejYa6PdHL5wVf7a2w9pPJAoIogTkiYb9PvlRLbzBB OrBFfm8h00psQd8YpzEAIdPvVQGsR/OTqYVMXnrNN0pra81iFWwaiB5QcJukadYl +0d/2wJnw4887ZReO/51n9fJkPwIvs+jtCj7k36yX9NL9SRm8s/JlH3aVGRSzIBy ip7ahtdcw7ncqXCWJVzQ1HCdEiwcWkbMNI8gTFpJ4V5GR6A1ZkN+jNn88C/f5qQF 1uAsIBy9B99mU5Rz7Vrbl8710DjT2SkVgQ43rC54MzszTuj34y4GNS+sCZyTfzFG vnEVWyX7Jzg1SLbp8KxjhhCrhegG8vXnyr6RJDfEzHsUHLxbnMKNGclDOny5NG4n TmXGFTfDKBVcHwFLOwwKXsjKicyirBDIRb2eKnqC0j56kFNQp9pWhFA62Xi/AOe6 vqMj2I2t30za5X3iZv5XEvhv63fFMx2nflcYIA+rCqq1AMaC7T0X170czKb5g1v+ aSZZ0qCFhMFUWETlHjOQSbHkZ6fsWJlgaPZS0ODsiGrbxXPtrHXH1lbLAh6ECveu O8dYKqC2kJbBoolqD2e59z1fQ6cW45sUHxiDjaeqgTlexwVEy25+t8uIBPUnRNwI 1YTHKn3U+ug= =g6ZL -----END PGP SIGNATURE----- Merge tag '4.4-scsi-mkp' into misc SCSI queue for 4.4. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-12 07:06:18 -05:00
Peter Oberparleiter	5cb9b40d61	scsi_sysfs: Fix queue_ramp_up_period return code Writing a number to /sys/bus/scsi/devices/<sdev>/queue_ramp_up_period returns the value of that number instead of the number of bytes written. This behavior can confuse programs expecting POSIX write() semantics. Fix this by returning the number of bytes written instead. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-11-09 17:40:22 -08:00
Bart Van Assche	bf2cf3baa2	scsi: Fix a bdi reregistration race Unregister and reregister BDI devices in the proper order. This patch avoids that the following kernel warning can get triggered: WARNING: CPU: 7 PID: 203 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x68/0x80() sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:32' Workqueue: events_unbound async_run_entry_fn Call Trace: [<ffffffff814ff5a4>] dump_stack+0x4c/0x65 [<ffffffff810746ba>] warn_slowpath_common+0x8a/0xc0 [<ffffffff81074736>] warn_slowpath_fmt+0x46/0x50 [<ffffffff81237ca8>] sysfs_warn_dup+0x68/0x80 [<ffffffff81237d8e>] sysfs_create_dir_ns+0x7e/0x90 [<ffffffff81291f58>] kobject_add_internal+0xa8/0x320 [<ffffffff812923a0>] kobject_add+0x60/0xb0 [<ffffffff8138c937>] device_add+0x107/0x5e0 [<ffffffff8138d018>] device_create_groups_vargs+0xd8/0x100 [<ffffffff8138d05c>] device_create_vargs+0x1c/0x20 [<ffffffff8117f233>] bdi_register+0x63/0x2a0 [<ffffffff8117f497>] bdi_register_dev+0x27/0x30 [<ffffffff81281549>] add_disk+0x1a9/0x4e0 [<ffffffffa00c5739>] sd_probe_async+0x119/0x1d0 [sd_mod] [<ffffffff8109a81a>] async_run_entry_fn+0x4a/0x140 [<ffffffff81091078>] process_one_work+0x1d8/0x7c0 [<ffffffff81091774>] worker_thread+0x114/0x460 [<ffffffff81097878>] kthread+0xf8/0x110 [<ffffffff8150801f>] ret_from_fork+0x3f/0x70 See also patch "block: destroy bdi before blockdev is unregistered" (commit ID `6cd18e711d`). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-11-09 16:34:14 -08:00
Peter Oberparleiter	863e02d0e1	scsi_sysfs: Fix queue_ramp_up_period return code Writing a number to /sys/bus/scsi/devices/<sdev>/queue_ramp_up_period returns the value of that number instead of the number of bytes written. This behavior can confuse programs expecting POSIX write() semantics. Fix this by returning the number of bytes written instead. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-09 12:26:31 -05:00
Johannes Thumshirn	92e6246c8e	scsi: Export SCSI Inquiry data to sysfs Export the RAW SCSI Inquiry to sysfs as binfile. This way the data can be used by userland without the need to have and ioctl or use the sg_inq tool. Here is an example of the provided data linux:~ # hexdump /sys/class/scsi_device/1\:0\:0\:0/device/inquiry 0000000 8005 3205 001f 0000 4551 554d 2020 2020 0000010 4551 554d 4420 4456 522d 4d4f 2020 2020 0000020 2e32 2e33 0000024 Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-09 11:46:44 -05:00
Christoph Hellwig	4099819356	scsi: restart list search after unlock in scsi_remove_target When dropping a lock while iterating a list we must restart the search as other threads could have manipulated the list under us. Without this we can get stuck in an endless loop. This bug was introduced by commit `bc3f02a795` Author: Dan Williams <djbw@fb.com> Date: Tue Aug 28 22:12:10 2012 -0700 [SCSI] scsi_remove_target: fix softlockup regression on hot remove Which was itself trying to fix a reported soft lockup issue http://thread.gmane.org/gmane.linux.kernel/1348679 However, we believe even with this revert of the original patch, the soft lockup problem has been fixed by commit `f2495e228f` Author: James Bottomley <JBottomley@Parallels.com> Date: Tue Jan 21 07:01:41 2014 -0800 [SCSI] dual scan thread bug fix Thanks go to Dan Williams <dan.j.williams@intel.com> for tracking all this prior history down. Reported-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: `bc3f02a795` Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-11-05 09:24:14 -08:00
Junichi Nomura	23695e41a1	scsi_dh: fix use-after-free when removing scsi device The commit `1bab0de027` ("dm-mpath, scsi_dh: don't let dm detach device handlers") removed reference counting of attached scsi device handler. As a result, handler data is freed immediately via scsi_dh->detach() in the context of scsi_remove_device() where activation request can be still in flight. This patch moves scsi_dh_handler_detach() to sdev releasing function, scsi_device_dev_release_usercontext(), at that point the device is already in quiesced state. Fixes: `1bab0de027` ("dm-mpath, scsi_dh: don't let dm detach device handlers") Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-10-27 11:22:37 +09:00
Christoph Hellwig	086b91d052	scsi_dh: integrate into the core SCSI code Stop building scsi_dh as a separate module and integrate it fully into the core SCSI code with explicit callouts at bus scan time. For now the callouts are placed at the same point as the old bus notifiers were called, but in the future we will be able to look at ALUA INQUIRY data earlier on. Note that this also means that the device handler modules need to be loaded by the time we scan the bus. The next patches will add support for autoloading device handlers at bus scan time to make sure they are always loaded if they are enabled in the kernel config. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-08-28 13:14:56 -07:00
Jens Axboe	1278dd6809	scsi: fix host max depth checking for the 'queue_depth' sysfs interface Commit `1e6f241604` changed the scsi sysfs 'queue_depth' code to rejects depths higher than the scsi host template setting. But lots of hosts set this to 1, and update the settings in the scsi host when the controller/devices probing happens. This breaks (at least) mpt2sas and mpt3sas runtime setting of queue depth, returning EINVAL for all settings but '1'. And once it's set to 1, there's no way to go back up. Cc: stable@vger.kernel.org Fixes: `1e6f241604` "scsi: don't allow setting of queue_depth bigger than can_queue" Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>	2015-07-16 16:09:53 +03:00
Christoph Hellwig	efc3c1df5f	scsi: remove ->change_queue_type method Since we got rid of ordered tag support in 2010 the prime use case of switching on and off ordered tags has been obsolete. The other function of enabling/disabling tagging entirely has only been correctly implemented by the 53c700 driver and isn't generally useful. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-12-04 09:55:45 +01:00
Christoph Hellwig	db5ed4dfd5	scsi: drop reason argument from ->change_queue_depth Drop the now unused reason argument from the ->change_queue_depth method. Also add a return value to scsi_adjust_queue_depth, and rename it to scsi_change_queue_depth now that it can be used as the default ->change_queue_depth implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-24 14:45:27 +01:00
Christoph Hellwig	1e6f241604	scsi: don't allow setting of queue_depth bigger than can_queue We won't ever queue more commands than the host allows. Instead of letting drivers either reject or ignore this case handle it in common code. Note that various driver use internal constant or variables that are assigned to both shost->can_queue and checked in ->change_queue_depth - I did remove those checks as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-11-24 14:45:26 +01:00
Christoph Hellwig	609aa22f3b	scsi: remove ordered_tags scsi_device field Remove the ordered_tags field, we haven't been issuing ordered tags based on it since the big barrier rework in 2010. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>	2014-11-12 11:19:40 +01:00
Subhash Jadavani	6fe8c1dbef	scsi: balance out autopm get/put calls in scsi_sysfs_add_sdev() SCSI Well-known logical units generally don't have any scsi driver associated with it which means no one will call scsi_autopm_put_device() on these wlun scsi devices and this would result in keeping the corresponding scsi device always active (hence LLD can't be suspended as well). Same exact problem can be seen for other scsi device representing normal logical unit whose driver is yet to be loaded. This patch fixes the above problem with this approach: - make the scsi_autopm_put_device call at the end of scsi_sysfs_add_sdev to make it balance out the get earlier in the function. - let drivers do paired get/put calls in their probe methods. Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Dolev Raviv <draviv@codeaurora.org> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-09-15 16:02:05 -07:00
Alan Stern	50c4e96411	scsi: don't store LUN bits in CDB[1] for USB mass-storage devices The SCSI specification requires that the second Command Data Byte should contain the LUN value in its high-order bits if the recipient device reports SCSI level 2 or below. Nevertheless, some USB mass-storage devices use those bits for other purposes in vendor-specific commands. Currently Linux has no way to send such commands, because the SCSI stack always overwrites the LUN bits. Testing shows that Windows 7 and XP do not store the LUN bits in the CDB when sending commands to a USB device. This doesn't matter if the device uses the Bulk-Only or UAS transports (which virtually all modern USB mass-storage devices do), as these have a separate mechanism for sending the LUN value. Therefore this patch introduces a flag in the Scsi_Host structure to inform the SCSI midlayer that a transport does not require the LUN bits to be stored in the CDB, and it makes usb-storage set this flag for all devices using the Bulk-Only transport. (UAS is handled by a separate driver, but it doesn't really matter because no SCSI-2 or lower device is at all likely to use UAS.) The patch also cleans up the code responsible for storing the LUN value by adding a bitflag to the scsi_device structure. The test for whether to stick the LUN value in the CDB can be made when the device is probed, and stored for future use rather than being made over and over in the fast path. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Tiziano Bacocco <tiziano.bacocco@gmail.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-09-15 16:01:58 -07:00
Daniel Walter	f7c65af513	drivers/scsi: replace strict_strto calls Replace obsolete strict_strto with more appropriate kstrto calls Signed-off-by: Daniel Walter <dwalter@google.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:28 -07:00
Christoph Hellwig	d285203cf6	scsi: add support for a blk-mq based I/O path. This patch adds support for an alternate I/O path in the scsi midlayer which uses the blk-mq infrastructure instead of the legacy request code. Use of blk-mq is fully transparent to drivers, although for now a host template field is provided to opt out of blk-mq usage in case any unforseen incompatibilities arise. In general replacing the legacy request code with blk-mq is a simple and mostly mechanical transformation. The biggest exception is the new code that deals with the fact the I/O submissions in blk-mq must happen from process context, which slightly complicates the I/O completion handler. The second biggest differences is that blk-mq is build around the concept of preallocated requests that also include driver specific data, which in SCSI context means the scsi_cmnd structure. This completely avoids dynamic memory allocations for the fast path through I/O submission. Due the preallocated requests the MQ code path exclusively uses the host-wide shared tag allocator instead of a per-LUN one. This only affects drivers actually using the block layer provided tag allocator instead of their own. Unlike the old path blk-mq always provides a tag, although drivers don't have to use it. For now the blk-mq path is disable by defauly and must be enabled using the "use_blk_mq" module parameter. Once the remaining work in the block layer to make blk-mq more suitable for slow devices is complete I hope to make it the default and eventually even remove the old code path. Based on the earlier scsi-mq prototype by Nicholas Bellinger. Thanks to Bart Van Assche and Robert Elliot for testing, benchmarking and various sugestions and code contributions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 17:16:28 -04:00
Christoph Hellwig	cd9070c9c5	scsi: fix the {host,target,device}_blocked counter mess Seems like these counters are missing any sort of synchronization for updates, as a over 10 year old comment from me noted. Fix this by using atomic counters, and while we're at it also make sure they are in the same cacheline as the _busy counters and not needlessly stored to in every I/O completion. With the new model the _busy counters can temporarily go negative, so all the readers are updated to check for > 0 values. Longer term every successful I/O completion will reset the counters to zero, so the temporarily negative values will not cause any harm. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 17:15:48 -04:00
Christoph Hellwig	71e75c97f9	scsi: convert device_busy to atomic_t Avoid taking the queue_lock to check the per-device queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Unlike the host and target busy counters this doesn't allow us to avoid the queue_lock in the request_fn due to the way the interface works, but it'll allow us to prepare for using the blk-mq code, which doesn't use the queue_lock at all, and it at least avoids a queue_lock round trip in scsi_device_unbusy, which is still important given how busy the queue_lock is. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:43:45 -04:00
Christoph Hellwig	7466501608	scsi: convert host_busy to atomic_t Avoid taking the host-wide host_lock to check the per-host queue limit. Instead we do an atomic_inc_return early on to grab our slot in the queue, and if necessary decrement it after finishing all checks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Webb Scales <webbnh@hp.com> Acked-by: Jens Axboe <axboe@kernel.dk> Tested-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Robert Elliott <elliott@hp.com>	2014-07-25 07:43:43 -04:00
Hannes Reinecke	9cb78c16f5	scsi: use 64-bit LUNs The SCSI standard defines 64-bit values for LUNs, and large arrays employing large or hierarchical LUN numbers become more and more common. So update the linux SCSI stack to use 64-bit LUN numbers. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:37 +02:00
Linus Torvalds	1a0b6abaea	SCSI misc on 20140401 This patch consists of the usual driver updates (megaraid_sas, scsi_debug, qla2xxx, qla4xxx, lpfc, bnx2fc, be2iscsi, hpsa, ipr) plus an assortment of minor fixes and the first precursors of SCSI-MQ (the code path simplifications) and the bug fix for the USB oops on remove (which involves an infrastructure change, so is sent via the main tree with a delayed backport after a cycle in which it is shown to introduce no new bugs). Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJTOsP1AAoJEDeqqVYsXL0MraUIAMCHWIN791cSc/E4d6mw/6nC j5CG/wwuw3VfqJcJJ8PcItfReWPuS7aLwhAx3wNGDUe7Vcz9pmcgJU9c2/ZWhIJH D0YXnGSkkfxI9Wc5WJ/NbueS0TFt0G5B6wpIxSLpSEJ1k9I90vxe3symCwv5vS/p 3Cd2nZZCLg6ArzZJ3PJLnNG9FUp2ZBeZwfPu4CuPm+3kEq9oRATg7bS4NNtVTQLP 0zNs5rKAVWfnE5Ii8VFjA7DLduG9W1IBNnSI7EERenrLKMbHG5530Rnl71uvjjgY 0jmQ5YGpTsYcJggLdaijZdK+zuq6Jtc+0DwWJKIE3cEHx3kUrYi4UQWTTRk9ttQ= =Bp1Y -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull first round of SCSI updates from James Bottomley: "This patch consists of the usual driver updates (megaraid_sas, scsi_debug, qla2xxx, qla4xxx, lpfc, bnx2fc, be2iscsi, hpsa, ipr) plus an assortment of minor fixes and the first precursors of SCSI-MQ (the code path simplifications) and the bug fix for the USB oops on remove (which involves an infrastructure change, so is sent via the main tree with a delayed backport after a cycle in which it is shown to introduce no new bugs)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (196 commits) [SCSI] sd: Quiesce mode sense error messages [SCSI] add support for per-host cmd pools [SCSI] simplify command allocation and freeing a bit [SCSI] megaraid: simplify internal command handling [SCSI] ses: Use vpd information from scsi_device [SCSI] Add EVPD page 0x83 and 0x80 to sysfs [SCSI] Return VPD page length in scsi_vpd_inquiry() [SCSI] scsi_sysfs: Implement 'is_visible' callback [SCSI] hpsa: update driver version to 3.4.4-1 [SCSI] hpsa: fix bad endif placement in RAID 5 mapper code [SCSI] qla2xxx: Fix build errors related to invalid print fields on some architectures. [SCSI] bfa: Replace large udelay() with mdelay() [SCSI] vmw_pvscsi: Some improvements in pvscsi driver. [SCSI] vmw_pvscsi: Add support for I/O requests coalescing. [SCSI] vmw_pvscsi: Fix pvscsi_abort() function. [SCSI] remove deprecated IRQF_DISABLED from SCSI [SCSI] bfa: Updating Maintainers email ids [SCSI] ipr: Add new CCIN definition for Grand Canyon support [SCSI] ipr: Format HCAM overlay ID 0x21 [SCSI] ipr: Use pci_enable_msi_range() and pci_enable_msix_range() ...	2014-04-01 18:49:04 -07:00
Hannes Reinecke	b3ae8780b4	[SCSI] Add EVPD page 0x83 and 0x80 to sysfs EVPD page 0x83 is used to uniquely identify the device. So instead of having each and every program issue a separate SG_IO call to retrieve this information it does make far more sense to display it in sysfs. Some older devices (most notably tapes) will only report reliable information in page 0x80 (Unit Serial Number). So export this in the sysfs attribute 'vpd_pg80'. [jejb: checkpatch fix] [hare: attach after transport configure] [fengguang.wu@intel.com: spotted problems with the original now fixed] Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-27 08:25:33 -07:00
Hannes Reinecke	276b20d09b	[SCSI] scsi_sysfs: Implement 'is_visible' callback Instead of modifying attributes after the device has been created we should be using the 'is_visible' callback to avoid races. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-19 15:22:39 -07:00
Hannes Reinecke	ad469a5764	[SCSI] scsi_error: disable eh_deadline if no host_reset_handler is set When the host template doesn't declare an eh_host_reset_handler the eh_deadline mechanism is pointless and will set the device to offline. So disable eh_deadline if no eh_host_reset_handler is present. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:18:59 -07:00
James Bottomley	e63ed0d7a9	[SCSI] fix our current target reap infrastructure This patch eliminates the reap_ref and replaces it with a proper kref. On last put of this kref, the target is removed from visibility in sysfs. The final call to scsi_target_reap() for the device is done from __scsi_remove_device() and only if the device was made visible. This ensures that the target disappears as soon as the last device is gone rather than waiting until final release of the device (which is often too long). Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Cc: stable@vger.kernel.org # delay backport by 2 months for field testing Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-15 10:18:59 -07:00
Tejun Heo	ac0ece9174	scsi: use device_remove_file_self() instead of device_schedule_callback() driver-core now supports synchrnous self-deletion of attributes and the asynchrnous removal mechanism is scheduled for removal. Use it instead of device_schedule_callback(). This makes "delete" behave synchronously. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: linux-scsi@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:42:41 -08:00
Ren Mingxin	bb3b621a33	[SCSI] Set the minimum valid value of 'eh_deadline' as 0 The former minimum valid value of 'eh_deadline' is 1s, which means the earliest occasion to shorten EH is 1 second later since a command is failed or timed out. But if we want to skip EH steps ASAP, we have to wait until the first EH step is finished. If the duration of the first EH step is long, this waiting time is excruciating. So, it is necessary to accept 0 as the minimum valid value for 'eh_deadline'. According to my test, with Hannes' patchset 'New EH command timeout handler' as well, the minimum IO time is improved from 73s (eh_deadline = 1) to 43s(eh_deadline = 0) when commands are timed out by disabling RSCN and target port. Signed-off-by: Ren Mingxin <renmx@cn.fujitsu.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-12-19 07:39:02 -08:00
Hannes Reinecke	b45620229d	[SCSI] Add 'eh_deadline' to limit SCSI EH runtime This patchs adds an 'eh_deadline' sysfs attribute to the scsi host which limits the overall runtime of the SCSI EH. The 'eh_deadline' value is stored in the now obsolete field 'resetting'. When a command is failed the start time of the EH is stored in 'last_reset'. If the overall runtime of the SCSI EH is longer than last_reset + eh_deadline, the EH is short-circuited and falls through to issue a host reset only. [jejb: add comments in Scsi_Host about new fields] Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-10-25 12:17:59 +01:00
Jack Wang	522db3c9e1	[SCSI] export device_busy for sdev If you mutiple devices connect to a host, we might be interested in have an intensive I/O workload on one disk, and notice starvation on others. This give the user more hint about current infight io for scsi device. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-10-25 09:57:56 +01:00
Ewan D. Milne	279afdfe78	[SCSI] Generate uevents on certain unit attention codes Generate a uevent when the following Unit Attention ASC/ASCQ codes are received: 2A/01 MODE PARAMETERS CHANGED 2A/09 CAPACITY DATA HAS CHANGED 38/07 THIN PROVISIONING SOFT THRESHOLD REACHED 3F/03 INQUIRY DATA HAS CHANGED 3F/0E REPORTED LUNS DATA HAS CHANGED Log kernel messages when the following Unit Attention ASC/ASCQ codes are received that are not as specific as those above: 2A/xx PARAMETERS CHANGED 3F/xx TARGET OPERATING CONDITIONS HAVE CHANGED Added logic to set expecting_lun_change for other LUNs on the target after REPORTED LUNS DATA HAS CHANGED is received, so that duplicate uevents are not generated, and clear expecting_lun_change when a REPORT LUNS command completes, in accordance with the SPC-3 specification regarding reporting of the 3F 0E ASC/ASCQ UA. [jejb: remove SPC3 test in scsi_report_lun_change and some docbook fixes and unused variable fix, both reported by Fengguang Wu] Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-08-26 18:52:27 +04:00
Martin K. Petersen	0816c9251a	[SCSI] Allow error handling timeout to be specified Introduce eh_timeout which can be used for error handling purposes. This was previously hardcoded to 10 seconds in the SCSI error handling code. However, for some fast-fail scenarios it is necessary to be able to tune this as it can take several iterations (bus device, target, bus, controller) before we give up. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-06-04 11:16:24 -07:00
Sasha Levin	072f19b4be	[SCSI] prevent stack buffer overflow in host_reset store_host_reset() has tried to re-invent the wheel to compare sysfs strings. Unfortunately it did so poorly and never bothered to check the input from userspace before overwriting stack with it, so something simple as: echo "WoopsieWoopsie" > /sys/devices/pseudo_0/adapter0/host0/scsi_host/host0/host_reset would result in: [ 316.310101] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81f5bac7 [ 316.310101] [ 316.320051] Pid: 6655, comm: sh Tainted: G W 3.7.0-rc5-next-20121114-sasha-00016-g5c9d68d-dirty #129 [ 316.320051] Call Trace: [ 316.340058] pps pps0: PPS event at 1352918752.620355751 [ 316.340062] pps pps0: capture assert seq #303 [ 316.320051] [<ffffffff83b3856b>] panic+0xcd/0x1f4 [ 316.320051] [<ffffffff81f5bac7>] ? store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff8110b996>] __stack_chk_fail+0x16/0x20 [ 316.320051] [<ffffffff81f5bac7>] store_host_reset+0xd7/0x100 [ 316.320051] [<ffffffff81e55bb3>] dev_attr_store+0x13/0x30 [ 316.320051] [<ffffffff812f7db1>] sysfs_write_file+0x101/0x170 [ 316.320051] [<ffffffff8127acc8>] vfs_write+0xb8/0x180 [ 316.320051] [<ffffffff8127ae80>] sys_write+0x50/0xa0 [ 316.320051] [<ffffffff83c03418>] tracesys+0xe1/0xe6 Fix this by uninventing whatever was going on there and just use sysfs_streq. Bug introduced by `29443691` ("[SCSI] scsi: Added support for adapter and firmware reset"). [jejb: added necessary const to prevent compile warnings] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: <stable@vger.kernel.org> #3.2+ Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-30 09:08:16 +00:00
Dan Williams	bc3f02a795	[SCSI] scsi_remove_target: fix softlockup regression on hot remove John reports: BUG: soft lockup - CPU#2 stuck for 23s! [kworker/u:8:2202] [..] Call Trace: [<ffffffff8141782a>] scsi_remove_target+0xda/0x1f0 [<ffffffff81421de5>] sas_rphy_remove+0x55/0x60 [<ffffffff81421e01>] sas_rphy_delete+0x11/0x20 [<ffffffff81421e35>] sas_port_delete+0x25/0x160 [<ffffffff814549a3>] mptsas_del_end_device+0x183/0x270 ...introduced by commit `3b661a9` "[SCSI] fix hot unplug vs async scan race". Don't restart lookup of more stargets in the multi-target case, just arrange to traverse the list once, on the assumption that new targets are always added at the end. There is no guarantee that the target will change state in scsi_target_reap() so we can end up spinning if we restart. Cc: <stable@vger.kernel.org> Acked-by: Jack Wang <jack_wang@usish.com> LKML-Reference: <CAEhu1-6wq1YsNiscGMwP4ud0Q+MrViRzv=kcWCQSBNc8c68N5Q@mail.gmail.com> Reported-by: John Drescher <drescherjm@gmail.com> Tested-by: John Drescher <drescherjm@gmail.com> Signed-off-by: Dan Williams <djbw@fb.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-09-24 12:17:49 +04:00
Dan Williams	3b661a92e8	[SCSI] fix hot unplug vs async scan race The following crash results from cases where the end_device has been removed before scsi_sysfs_add_sdev has had a chance to run. BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6 ... Call Trace: [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3 [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8125e641>] kobject_add_varg+0x41/0x50 [<ffffffff8125e70b>] kobject_add+0x64/0x66 [<ffffffff8131122b>] device_add+0x12d/0x63a [<ffffffff814b65ea>] ? _raw_spin_unlock_irqrestore+0x47/0x56 [<ffffffff8107de15>] ? module_refcount+0x89/0xa0 [<ffffffff8132f348>] scsi_sysfs_add_sdev+0x4e/0x28a [<ffffffff8132dcbb>] do_scan_async+0x9c/0x145 ...teach scsi_sysfs_add_devices() to check for deleted devices() before trying to add them, and teach scsi_remove_target() how to remove targets that have not been added via device_add(). Cc: <stable@vger.kernel.org> Reported-by: Dariusz Majchrzak <dariusz.majchrzak@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:45 +01:00
Bart Van Assche	b485462aca	[SCSI] Stop accepting SCSI requests before removing a device Avoid that the code for requeueing SCSI requests triggers a crash by making sure that that code isn't scheduled anymore after a device has been removed. Also, source code inspection of __scsi_remove_device() revealed a race condition in this function: no new SCSI requests must be accepted for a SCSI device after device removal started. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:41 +01:00
Bart Van Assche	67bd941300	[SCSI] Fix device removal NULL pointer dereference Use blk_queue_dead() to test whether the queue is dead instead of !sdev. Since scsi_prep_fn() may be invoked concurrently with __scsi_remove_device(), keep the queuedata (sdev) pointer in __scsi_remove_device(). This patch fixes a kernel oops that can be triggered by USB device removal. See also http://www.spinics.net/lists/linux-scsi/msg56254.html. Other changes included in this patch: - Swap the blk_cleanup_queue() and kfree() calls in scsi_host_dev_release() to make that code easier to grasp. - Remove the queue dead check from scsi_run_queue() since the queue state can change anyway at any point in that function where the queue lock is not held. - Remove the queue dead check from the start of scsi_request_fn() since it is redundant with the scsi_device_online() check. Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: <stable@kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:40 +01:00
Mike Christie	1b8d262061	[SCSI] add new SDEV_TRANSPORT_OFFLINE state This patch adds a new state SDEV_TRANSPORT_OFFLINE. It will be used by transport classes to offline devices for cases like when the fast_io_fail/recovery_tmo fires. In those cases we want all IO to fail, and we have not yet escalated to dev_loss_tmo behavior where we are removing the devices. Currently to handle this state, transport classes are setting the scsi_device's state to running, setting their internal session/port structs state to something that indicates failed, and then failing IO from some transport check in the queuecommand. The reason for the new value is so that users can distinguish between a device failure that is a result of a transport problem vs the wide range of errors that devices get offlined for when a scsi command times out and we offline the devices there. It also fixes the confusion as to why the transport class is failing IO, but has set the device state from blocked to running. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-07-20 08:58:21 +01:00
Vikas Chaudhary	2944369144	[SCSI] scsi: Added support for adapter and firmware reset Added new sysfs attr 'host_reset' in scsi_sysfs.c to perform adapter or firmware reset as suggested by Mike Christie here: http://marc.info/?l=linux-scsi&m=127359347111167&w=2 user/application can write "adapter" or "firmware" on this attr and it will call newly added function hook in scsi_host_template to call LDD adapter or firmware reset implementation. Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2011-08-27 08:36:46 -06:00
James Bottomley	e73e079bf1	[SCSI] Fix oops caused by queue refcounting failure In certain circumstances, we can get an oops from a torn down device. Most notably this is from CD roms trying to call scsi_ioctl. The root cause of the problem is the fact that after scsi_remove_device() has been called, the queue is fully torn down. This is actually wrong since the queue can be used until the sdev release function is called. Therefore, we add an extra reference to the queue which is released in sdev->release, so the queue always exists. Reported-by: Parag Warudkar <parag.lkml@gmail.com> Cc: stable@kernel.org Signed-off-by: James Bottomley <jbottomley@parallels.com>	2011-06-02 18:34:43 +09:00

1 2 3

129 Commits