linux-stable/drivers/scsi/scsi_scan.c

1990 lines
56 KiB
C
Raw Normal View History

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
// SPDX-License-Identifier: GPL-2.0
/*
* scsi_scan.c
*
* Copyright (C) 2000 Eric Youngdale,
* Copyright (C) 2002 Patrick Mansfield
*
* The general scanning/probing algorithm is as follows, exceptions are
* made to it depending on device specific flags, compilation options, and
* global variable (boot or module load time) settings.
*
* A specific LUN is scanned via an INQUIRY command; if the LUN has a
* device attached, a scsi_device is allocated and setup for it.
*
* For every id of every channel on the given host:
*
* Scan LUN 0; if the target responds to LUN 0 (even if there is no
* device or storage attached to LUN 0):
*
* If LUN 0 has a device attached, allocate and setup a
* scsi_device for it.
*
* If target is SCSI-3 or up, issue a REPORT LUN, and scan
* all of the LUNs returned by the REPORT LUN; else,
* sequentially scan LUNs up until some maximum is reached,
* or a LUN is seen that cannot have a device attached to it.
*/
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/init.h>
#include <linux/blkdev.h>
#include <linux/delay.h>
#include <linux/kthread.h>
#include <linux/spinlock.h>
#include <linux/async.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 08:04:11 +00:00
#include <linux/slab.h>
#include <asm/unaligned.h>
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_device.h>
#include <scsi/scsi_driver.h>
#include <scsi/scsi_devinfo.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_transport.h>
#include <scsi/scsi_dh.h>
#include <scsi/scsi_eh.h>
#include "scsi_priv.h"
#include "scsi_logging.h"
#define ALLOC_FAILURE_MSG KERN_ERR "%s: Allocation failure during" \
" SCSI scanning, some SCSI devices might not be configured\n"
/*
* Default timeout
*/
#define SCSI_TIMEOUT (2*HZ)
#define SCSI_REPORT_LUNS_TIMEOUT (30*HZ)
/*
* Prefix values for the SCSI id's (stored in sysfs name field)
*/
#define SCSI_UID_SER_NUM 'S'
#define SCSI_UID_UNKNOWN 'Z'
/*
* Return values of some of the scanning functions.
*
* SCSI_SCAN_NO_RESPONSE: no valid response received from the target, this
* includes allocation or general failures preventing IO from being sent.
*
* SCSI_SCAN_TARGET_PRESENT: target responded, but no device is available
* on the given LUN.
*
* SCSI_SCAN_LUN_PRESENT: target responded, and a device is available on a
* given LUN.
*/
#define SCSI_SCAN_NO_RESPONSE 0
#define SCSI_SCAN_TARGET_PRESENT 1
#define SCSI_SCAN_LUN_PRESENT 2
static const char *scsi_null_device_strs = "nullnullnullnull";
#define MAX_SCSI_LUNS 512
static u64 max_scsi_luns = MAX_SCSI_LUNS;
module_param_named(max_luns, max_scsi_luns, ullong, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(max_luns,
"last scsi LUN (should be between 1 and 2^64-1)");
#ifdef CONFIG_SCSI_SCAN_ASYNC
#define SCSI_SCAN_TYPE_DEFAULT "async"
#else
#define SCSI_SCAN_TYPE_DEFAULT "sync"
#endif
static char scsi_scan_type[7] = SCSI_SCAN_TYPE_DEFAULT;
module_param_string(scan, scsi_scan_type, sizeof(scsi_scan_type),
S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(scan, "sync, async, manual, or none. "
"Setting to 'manual' disables automatic scanning, but allows "
"for manual device scan via the 'scan' sysfs attribute.");
static unsigned int scsi_inq_timeout = SCSI_TIMEOUT/HZ + 18;
module_param_named(inq_timeout, scsi_inq_timeout, uint, S_IRUGO|S_IWUSR);
MODULE_PARM_DESC(inq_timeout,
"Timeout (in seconds) waiting for devices to answer INQUIRY."
" Default is 20. Some devices may need more; most need less.");
/* This lock protects only this list */
static DEFINE_SPINLOCK(async_scan_lock);
static LIST_HEAD(scanning_hosts);
struct async_scan_data {
struct list_head list;
struct Scsi_Host *shost;
struct completion prev_finished;
};
/*
* scsi_enable_async_suspend - Enable async suspend and resume
*/
void scsi_enable_async_suspend(struct device *dev)
{
/*
* If a user has disabled async probing a likely reason is due to a
* storage enclosure that does not inject staggered spin-ups. For
* safety, make resume synchronous as well in that case.
*/
if (strncmp(scsi_scan_type, "async", 5) != 0)
return;
/* Enable asynchronous suspend and resume. */
device_enable_async_suspend(dev);
}
/**
* scsi_complete_async_scans - Wait for asynchronous scans to complete
*
* When this function returns, any host which started scanning before
* this function was called will have finished its scan. Hosts which
* started scanning after this function was called may or may not have
* finished.
*/
int scsi_complete_async_scans(void)
{
struct async_scan_data *data;
do {
if (list_empty(&scanning_hosts))
return 0;
/* If we can't get memory immediately, that's OK. Just
* sleep a little. Even if we never get memory, the async
* scans will finish eventually.
*/
data = kmalloc(sizeof(*data), GFP_KERNEL);
if (!data)
msleep(1);
} while (!data);
data->shost = NULL;
init_completion(&data->prev_finished);
spin_lock(&async_scan_lock);
/* Check that there's still somebody else on the list */
if (list_empty(&scanning_hosts))
goto done;
list_add_tail(&data->list, &scanning_hosts);
spin_unlock(&async_scan_lock);
printk(KERN_INFO "scsi: waiting for bus probes to complete ...\n");
wait_for_completion(&data->prev_finished);
spin_lock(&async_scan_lock);
list_del(&data->list);
if (!list_empty(&scanning_hosts)) {
struct async_scan_data *next = list_entry(scanning_hosts.next,
struct async_scan_data, list);
complete(&next->prev_finished);
}
done:
spin_unlock(&async_scan_lock);
kfree(data);
return 0;
}
/**
* scsi_unlock_floptical - unlock device via a special MODE SENSE command
* @sdev: scsi device to send command to
* @result: area to store the result of the MODE SENSE
*
* Description:
* Send a vendor specific MODE SENSE (not a MODE SELECT) command.
* Called for BLIST_KEY devices.
**/
static void scsi_unlock_floptical(struct scsi_device *sdev,
unsigned char *result)
{
unsigned char scsi_cmd[MAX_COMMAND_SIZE];
sdev_printk(KERN_NOTICE, sdev, "unlocking floptical drive\n");
scsi_cmd[0] = MODE_SENSE;
scsi_cmd[1] = 0;
scsi_cmd[2] = 0x2e;
scsi_cmd[3] = 0;
scsi_cmd[4] = 0x2a; /* size */
scsi_cmd[5] = 0;
scsi_execute_cmd(sdev, scsi_cmd, REQ_OP_DRV_IN, result, 0x2a,
SCSI_TIMEOUT, 3, NULL);
}
static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev,
unsigned int depth)
{
int new_shift = sbitmap_calculate_shift(depth);
bool need_alloc = !sdev->budget_map.map;
bool need_free = false;
int ret;
struct sbitmap sb_backup;
depth = min_t(unsigned int, depth, scsi_device_max_queue_depth(sdev));
/*
* realloc if new shift is calculated, which is caused by setting
* up one new default queue depth after calling ->slave_configure
*/
if (!need_alloc && new_shift != sdev->budget_map.shift)
need_alloc = need_free = true;
if (!need_alloc)
return 0;
/*
* Request queue has to be frozen for reallocating budget map,
* and here disk isn't added yet, so freezing is pretty fast
*/
if (need_free) {
blk_mq_freeze_queue(sdev->request_queue);
sb_backup = sdev->budget_map;
}
ret = sbitmap_init_node(&sdev->budget_map,
scsi_device_max_queue_depth(sdev),
new_shift, GFP_KERNEL,
sdev->request_queue->node, false, true);
if (!ret)
sbitmap_resize(&sdev->budget_map, depth);
if (need_free) {
if (ret)
sdev->budget_map = sb_backup;
else
sbitmap_free(&sb_backup);
ret = 0;
blk_mq_unfreeze_queue(sdev->request_queue);
}
return ret;
}
/**
* scsi_alloc_sdev - allocate and setup a scsi_Device
* @starget: which target to allocate a &scsi_device for
* @lun: which lun
* @hostdata: usually NULL and set by ->slave_alloc instead
*
* Description:
* Allocate, initialize for io, and return a pointer to a scsi_Device.
* Stores the @shost, @channel, @id, and @lun in the scsi_Device, and
* adds scsi_Device to the appropriate list.
*
* Return value:
* scsi_Device pointer, or NULL on failure.
**/
static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
u64 lun, void *hostdata)
{
scsi: core: Replace sdev->device_busy with sbitmap SCSI currently uses an atomic variable to track queue depth for each attached device. The queue depth depends on many factors such as transport type and device implementation. In addition, the SCSI device queue depth is not a static entity but changes over time as a result of congestion management. While blk-mq currently tracks queue depth for each hctx, it can't easily be changed to accommodate the SCSI per-device requirement. The current approach of using an atomic variable doesn't scale well when there are lots of CPU cores and the disk is very fast. IOPS can be substantially impacted by the atomic in the hot path. Replace the atomic variable sdev->device_busy with an sbitmap for tracking the SCSI device queue depth. It has been observed that IOPS is improved ~30% by this patchset in the following test: 1) test machine(32 logical CPU cores) Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 2 NUMA node(s): 2 Model name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz 2) setup scsi_debug: modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256 3) fio script: fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \ --numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \ --loops=10000 --name=job1 --filename=/dev/sdN [mkp: fix device_busy reference in mpt3sas] Link: https://lore.kernel.org/r/20210122023317.687987-14-ming.lei@redhat.com Link: https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/ Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 02:33:17 +00:00
unsigned int depth;
struct scsi_device *sdev;
struct request_queue *q;
int display_failure_msg = 1, ret;
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
sdev = kzalloc(sizeof(*sdev) + shost->transportt->device_size,
scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c We had a test-report where, under memory pressure, adding LUNs to the systems would fail (the tests add LUNs strictly in sequence): [ 5525.853432] scsi 0:0:1:1088045124: Direct-Access IBM 2107900 .148 PQ: 0 ANSI: 5 [ 5525.853826] scsi 0:0:1:1088045124: alua: supports implicit TPGS [ 5525.853830] scsi 0:0:1:1088045124: alua: device naa.6005076303ffd32700000000000044da port group 0 rel port 43 [ 5525.853931] sd 0:0:1:1088045124: Attached scsi generic sg10 type 0 [ 5525.854075] sd 0:0:1:1088045124: [sdk] Disabling DIF Type 1 protection [ 5525.855495] sd 0:0:1:1088045124: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB) [ 5525.855606] sd 0:0:1:1088045124: [sdk] Write Protect is off [ 5525.855609] sd 0:0:1:1088045124: [sdk] Mode Sense: ed 00 00 08 [ 5525.855795] sd 0:0:1:1088045124: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 5525.857838] sdk: sdk1 [ 5525.859468] sd 0:0:1:1088045124: [sdk] Attached SCSI disk [ 5525.865073] sd 0:0:1:1088045124: alua: transition timeout set to 60 seconds [ 5525.865078] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.015070] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.015213] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.587439] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured [ 5526.588562] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured Looking at the code of scsi_alloc_sdev(), and all the calling contexts, there seems to be no reason to use GFP_ATMOIC here. All the different call-contexts use a mutex at some point, and nothing in between that requires no sleeping, as far as I could see. Additionally, the code that later allocates the block queue for the device (scsi_mq_alloc_queue()) already uses GFP_KERNEL. There are similar allocations in two other functions: scsi_probe_and_add_lun(), and scsi_add_lun(),; that can also be done with GFP_KERNEL. Here is the contexts for the three functions so far: scsi_alloc_sdev() scsi_probe_and_add_lun() scsi_sequential_lun_scan() __scsi_scan_target() scsi_scan_target() mutex_lock() scsi_scan_channel() scsi_scan_host_selected() mutex_lock() scsi_report_lun_scan() __scsi_scan_target() ... __scsi_add_device() mutex_lock() __scsi_scan_target() ... scsi_report_lun_scan() ... scsi_get_host_dev() mutex_lock() scsi_probe_and_add_lun() ... scsi_add_lun() scsi_probe_and_add_lun() ... So replace all these, and give them a bit of a better chance to succeed, with more chances of reclaim. Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-21 09:18:00 +00:00
GFP_KERNEL);
if (!sdev)
goto out;
sdev->vendor = scsi_null_device_strs;
sdev->model = scsi_null_device_strs;
sdev->rev = scsi_null_device_strs;
sdev->host = shost;
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 22:46:33 +00:00
sdev->queue_ramp_up_period = SCSI_DEFAULT_RAMP_UP_PERIOD;
sdev->id = starget->id;
sdev->lun = lun;
sdev->channel = starget->channel;
mutex_init(&sdev->state_mutex);
sdev->sdev_state = SDEV_CREATED;
INIT_LIST_HEAD(&sdev->siblings);
INIT_LIST_HEAD(&sdev->same_target_siblings);
INIT_LIST_HEAD(&sdev->starved_entry);
INIT_LIST_HEAD(&sdev->event_list);
spin_lock_init(&sdev->list_lock);
mutex_init(&sdev->inquiry_mutex);
INIT_WORK(&sdev->event_work, scsi_evt_thread);
INIT_WORK(&sdev->requeue_work, scsi_requeue_run_queue);
sdev->sdev_gendev.parent = get_device(&starget->dev);
sdev->sdev_target = starget;
/* usually NULL and set by ->slave_alloc instead */
sdev->hostdata = hostdata;
/* if the device needs this changing, it may do so in the
* slave_configure function */
sdev->max_device_blocked = SCSI_DEFAULT_DEVICE_BLOCKED;
/*
* Some low level driver could use device->type
*/
sdev->type = -1;
/*
* Assume that the device will have handshaking problems,
* and then fix this field later if it turns out it
* doesn't
*/
sdev->borken = 1;
sdev->sg_reserved_size = INT_MAX;
q = blk_mq_init_queue(&sdev->host->tag_set);
if (IS_ERR(q)) {
/* release fn is set up in scsi_sysfs_device_initialise, so
* have to free and put manually here */
put_device(&starget->dev);
kfree(sdev);
goto out;
}
scsi: core: Fix a use-after-free There are two .exit_cmd_priv implementations. Both implementations use resources associated with the SCSI host. Make sure that these resources are still available when .exit_cmd_priv is called by waiting inside scsi_remove_host() until the tag set has been freed. This commit fixes the following use-after-free: ================================================================== BUG: KASAN: use-after-free in srp_exit_cmd_priv+0x27/0xd0 [ib_srp] Read of size 8 at addr ffff888100337000 by task multipathd/16727 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x5e/0x5db kasan_report+0xab/0x120 srp_exit_cmd_priv+0x27/0xd0 [ib_srp] scsi_mq_exit_request+0x4d/0x70 blk_mq_free_rqs+0x143/0x410 __blk_mq_free_map_and_rqs+0x6e/0x100 blk_mq_free_tag_set+0x2b/0x160 scsi_host_dev_release+0xf3/0x1a0 device_release+0x54/0xe0 kobject_put+0xa5/0x120 device_release+0x54/0xe0 kobject_put+0xa5/0x120 scsi_device_dev_release_usercontext+0x4c1/0x4e0 execute_in_process_context+0x23/0x90 device_release+0x54/0xe0 kobject_put+0xa5/0x120 scsi_disk_release+0x3f/0x50 device_release+0x54/0xe0 kobject_put+0xa5/0x120 disk_release+0x17f/0x1b0 device_release+0x54/0xe0 kobject_put+0xa5/0x120 dm_put_table_device+0xa3/0x160 [dm_mod] dm_put_device+0xd0/0x140 [dm_mod] free_priority_group+0xd8/0x110 [dm_multipath] free_multipath+0x94/0xe0 [dm_multipath] dm_table_destroy+0xa2/0x1e0 [dm_mod] __dm_destroy+0x196/0x350 [dm_mod] dev_remove+0x10c/0x160 [dm_mod] ctl_ioctl+0x2c2/0x590 [dm_mod] dm_ctl_ioctl+0x5/0x10 [dm_mod] __x64_sys_ioctl+0xb4/0xf0 dm_ctl_ioctl+0x5/0x10 [dm_mod] __x64_sys_ioctl+0xb4/0xf0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Link: https://lore.kernel.org/r/20220826002635.919423-1-bvanassche@acm.org Fixes: 65ca846a5314 ("scsi: core: Introduce {init,exit}_cmd_priv()") Cc: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Christie <michael.christie@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: Li Zhijian <lizhijian@fujitsu.com> Reported-by: Li Zhijian <lizhijian@fujitsu.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-08-26 00:26:34 +00:00
kref_get(&sdev->host->tagset_refcnt);
sdev->request_queue = q;
q->queuedata = sdev;
__scsi_init_queue(sdev->host, q);
scsi: core: Replace sdev->device_busy with sbitmap SCSI currently uses an atomic variable to track queue depth for each attached device. The queue depth depends on many factors such as transport type and device implementation. In addition, the SCSI device queue depth is not a static entity but changes over time as a result of congestion management. While blk-mq currently tracks queue depth for each hctx, it can't easily be changed to accommodate the SCSI per-device requirement. The current approach of using an atomic variable doesn't scale well when there are lots of CPU cores and the disk is very fast. IOPS can be substantially impacted by the atomic in the hot path. Replace the atomic variable sdev->device_busy with an sbitmap for tracking the SCSI device queue depth. It has been observed that IOPS is improved ~30% by this patchset in the following test: 1) test machine(32 logical CPU cores) Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 2 NUMA node(s): 2 Model name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz 2) setup scsi_debug: modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256 3) fio script: fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \ --numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \ --loops=10000 --name=job1 --filename=/dev/sdN [mkp: fix device_busy reference in mpt3sas] Link: https://lore.kernel.org/r/20210122023317.687987-14-ming.lei@redhat.com Link: https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/ Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 02:33:17 +00:00
depth = sdev->host->cmd_per_lun ?: 1;
/*
* Use .can_queue as budget map's depth because we have to
* support adjusting queue depth from sysfs. Meantime use
* default device queue depth to figure out sbitmap shift
* since we use this queue depth most of times.
*/
if (scsi_realloc_sdev_budget_map(sdev, depth)) {
scsi: core: Replace sdev->device_busy with sbitmap SCSI currently uses an atomic variable to track queue depth for each attached device. The queue depth depends on many factors such as transport type and device implementation. In addition, the SCSI device queue depth is not a static entity but changes over time as a result of congestion management. While blk-mq currently tracks queue depth for each hctx, it can't easily be changed to accommodate the SCSI per-device requirement. The current approach of using an atomic variable doesn't scale well when there are lots of CPU cores and the disk is very fast. IOPS can be substantially impacted by the atomic in the hot path. Replace the atomic variable sdev->device_busy with an sbitmap for tracking the SCSI device queue depth. It has been observed that IOPS is improved ~30% by this patchset in the following test: 1) test machine(32 logical CPU cores) Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 2 NUMA node(s): 2 Model name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz 2) setup scsi_debug: modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256 3) fio script: fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \ --numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \ --loops=10000 --name=job1 --filename=/dev/sdN [mkp: fix device_busy reference in mpt3sas] Link: https://lore.kernel.org/r/20210122023317.687987-14-ming.lei@redhat.com Link: https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/ Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 02:33:17 +00:00
put_device(&starget->dev);
kfree(sdev);
goto out;
}
scsi_change_queue_depth(sdev, depth);
scsi_sysfs_device_initialize(sdev);
if (shost->hostt->slave_alloc) {
ret = shost->hostt->slave_alloc(sdev);
if (ret) {
/*
* if LLDD reports slave not present, don't clutter
* console with alloc failure messages
*/
if (ret == -ENXIO)
display_failure_msg = 0;
goto out_device_destroy;
}
}
return sdev;
out_device_destroy:
[SCSI] fix WARNING: at drivers/scsi/scsi_lib.c:1704 On Mon, 2011-11-07 at 17:24 +1100, Stephen Rothwell wrote: > Hi all, > > Starting some time last week I am getting the following during boot on > our PPC970 blade: > > calling .ipr_init+0x0/0x68 @ 1 > ipr: IBM Power RAID SCSI Device Driver version: 2.5.2 (April 27, 2011) > ipr 0000:01:01.0: Found IOA with IRQ: 26 > ipr 0000:01:01.0: Starting IOA initialization sequence. > ipr 0000:01:01.0: Adapter firmware version: 06160039 > ipr 0000:01:01.0: IOA initialized. > scsi0 : IBM 572E Storage Adapter > ------------[ cut here ]------------ > WARNING: at drivers/scsi/scsi_lib.c:1704 > Modules linked in: > NIP: c00000000053b3d4 LR: c00000000053e5b0 CTR: c000000000541d70 > REGS: c0000000783c2f60 TRAP: 0700 Not tainted (3.1.0-autokern1) > MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 24002024 XER: 20000002 > TASK = c0000000783b8000[1] 'swapper' THREAD: c0000000783c0000 CPU: 0 > GPR00: 0000000000000001 c0000000783c31e0 c000000000cf38b0 c00000000239a9d0 > GPR04: c000000000cbe8f8 0000000000000000 c0000000783c3040 0000000000000000 > GPR08: c000000075daf488 c000000078a3b7ff c000000000bcacc8 0000000000000000 > GPR12: 0000000044002028 c000000007ffb000 0000000002e40000 000000000099b800 > GPR16: 0000000000000000 c000000000bba5fc c000000000a61db8 0000000000000000 > GPR20: 0000000001b77200 0000000000000000 c000000078990000 0000000000000001 > GPR24: c000000002396828 0000000000000000 0000000000000000 c000000078a3b938 > GPR28: fffffffffffffffa c0000000008ad2c0 c000000000c7faa8 c00000000239a9d0 > NIP [c00000000053b3d4] .scsi_free_queue+0x24/0x90 > LR [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0 > Call Trace: > [c0000000783c31e0] [c000000000c7faa8] wireless_seq_fops+0x278d0/0x2eb88 (unreliable) > [c0000000783c3270] [c00000000053e5b0] .scsi_alloc_sdev+0x280/0x2e0 > [c0000000783c3330] [c00000000053eba0] .scsi_probe_and_add_lun+0x390/0xb40 > [c0000000783c34a0] [c00000000053f7ec] .__scsi_scan_target+0x16c/0x650 > [c0000000783c35f0] [c00000000053fd90] .scsi_scan_channel+0xc0/0x100 > [c0000000783c36a0] [c00000000053fefc] .scsi_scan_host_selected+0x12c/0x1c0 > [c0000000783c3750] [c00000000083dcb4] .ipr_probe+0x2c0/0x390 > [c0000000783c3830] [c0000000003f50b4] .local_pci_probe+0x34/0x50 > [c0000000783c38a0] [c0000000003f5f78] .pci_device_probe+0x148/0x150 > [c0000000783c3950] [c0000000004e1e8c] .driver_probe_device+0xdc/0x210 > [c0000000783c39f0] [c0000000004e20cc] .__driver_attach+0x10c/0x110 > [c0000000783c3a80] [c0000000004e1228] .bus_for_each_dev+0x98/0xf0 > [c0000000783c3b30] [c0000000004e1bf8] .driver_attach+0x28/0x40 > [c0000000783c3bb0] [c0000000004e07d8] .bus_add_driver+0x218/0x340 > [c0000000783c3c60] [c0000000004e2a2c] .driver_register+0x9c/0x1b0 > [c0000000783c3d00] [c0000000003f62d4] .__pci_register_driver+0x64/0x140 > [c0000000783c3da0] [c000000000b99f88] .ipr_init+0x4c/0x68 > [c0000000783c3e20] [c00000000000ad24] .do_one_initcall+0x1a4/0x1e0 > [c0000000783c3ee0] [c000000000b512d0] .kernel_init+0x14c/0x1fc > [c0000000783c3f90] [c000000000022468] .kernel_thread+0x54/0x70 > Instruction dump: > ebe1fff8 7c0803a6 4e800020 7c0802a6 fba1ffe8 fbe1fff8 7c7f1b78 f8010010 > f821ff71 e8030398 3120ffff 7c090110 <0b000000> e86303b0 482de065 60000000 > ---[ end trace 759bed76a85e8dec ]--- > scsi 0:0:1:0: Direct-Access IBM-ESXS MAY2036RC T106 PQ: 0 ANSI: 5 > ------------[ cut here ]------------ > > I get lots more of these. The obvious commit to point the finger at > is 3308511c93e6 ("[SCSI] Make scsi_free_queue() kill pending SCSI > commands") but the root cause may be something different. Caused by commit f7c9c6bb14f3104608a3a83cadea10a6943d2804 Author: Anton Blanchard <anton@samba.org> Date: Thu Nov 3 08:56:22 2011 +1100 [SCSI] Fix block queue and elevator memory leak in scsi_alloc_sdev Doesn't completely do the teardown. The true fix is to do a proper teardown instead of hand rolling it Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: stable@kernel.org #2.6.38+ Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-11-07 14:51:24 +00:00
__scsi_remove_device(sdev);
out:
if (display_failure_msg)
printk(ALLOC_FAILURE_MSG, __func__);
return NULL;
}
static void scsi_target_destroy(struct scsi_target *starget)
{
struct device *dev = &starget->dev;
struct Scsi_Host *shost = dev_to_shost(dev->parent);
unsigned long flags;
BUG_ON(starget->state == STARGET_DEL);
starget->state = STARGET_DEL;
transport_destroy_device(dev);
spin_lock_irqsave(shost->host_lock, flags);
if (shost->hostt->target_destroy)
shost->hostt->target_destroy(starget);
list_del_init(&starget->siblings);
spin_unlock_irqrestore(shost->host_lock, flags);
put_device(dev);
}
static void scsi_target_dev_release(struct device *dev)
{
struct device *parent = dev->parent;
struct scsi_target *starget = to_scsi_target(dev);
kfree(starget);
put_device(parent);
}
static struct device_type scsi_target_type = {
.name = "scsi_target",
.release = scsi_target_dev_release,
};
int scsi_is_target_device(const struct device *dev)
{
return dev->type == &scsi_target_type;
}
EXPORT_SYMBOL(scsi_is_target_device);
static struct scsi_target *__scsi_find_target(struct device *parent,
int channel, uint id)
{
struct scsi_target *starget, *found_starget = NULL;
struct Scsi_Host *shost = dev_to_shost(parent);
/*
* Search for an existing target for this sdev.
*/
list_for_each_entry(starget, &shost->__targets, siblings) {
if (starget->id == id &&
starget->channel == channel) {
found_starget = starget;
break;
}
}
if (found_starget)
get_device(&found_starget->dev);
return found_starget;
}
/**
* scsi_target_reap_ref_release - remove target from visibility
* @kref: the reap_ref in the target being released
*
* Called on last put of reap_ref, which is the indication that no device
* under this target is visible anymore, so render the target invisible in
* sysfs. Note: we have to be in user context here because the target reaps
* should be done in places where the scsi device visibility is being removed.
*/
static void scsi_target_reap_ref_release(struct kref *kref)
{
struct scsi_target *starget
= container_of(kref, struct scsi_target, reap_ref);
/*
scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state The addition of the STARGET_REMOVE state had the side effect of introducing a race condition that can cause a crash. scsi_target_reap_ref_release() checks the starget->state to see if it still in STARGET_CREATED, and if so, skips calling transport_remove_device() and device_del(), because the starget->state is only set to STARGET_RUNNING after scsi_target_add() has called device_add() and transport_add_device(). However, if an rport loss occurs while a target is being scanned, it can happen that scsi_remove_target() will be called while the starget is still in the STARGET_CREATED state. In this case, the starget->state will be set to STARGET_REMOVE, and as a result, scsi_target_reap_ref_release() will take the wrong path. The end result is a panic: [ 1255.356653] Oops: 0000 [#1] SMP [ 1255.360154] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel ghash_clmulni_i [ 1255.393234] CPU: 5 PID: 149 Comm: kworker/u96:4 Tainted: G W 4.11.0+ #8 [ 1255.401879] Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013 [ 1255.410327] Workqueue: scsi_wq_6 fc_scsi_scan_rport [scsi_transport_fc] [ 1255.417720] task: ffff88060ca8c8c0 task.stack: ffffc900048a8000 [ 1255.424331] RIP: 0010:kernfs_find_ns+0x13/0xc0 [ 1255.429287] RSP: 0018:ffffc900048abbf0 EFLAGS: 00010246 [ 1255.435123] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 1255.443083] RDX: 0000000000000000 RSI: ffffffff8188d659 RDI: 0000000000000000 [ 1255.451043] RBP: ffffc900048abc10 R08: 0000000000000000 R09: 0000012433fe0025 [ 1255.459005] R10: 0000000025e5a4b5 R11: 0000000025e5a4b5 R12: ffffffff8188d659 [ 1255.466972] R13: 0000000000000000 R14: ffff8805f55e5088 R15: 0000000000000000 [ 1255.474931] FS: 0000000000000000(0000) GS:ffff880616b40000(0000) knlGS:0000000000000000 [ 1255.483959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1255.490370] CR2: 0000000000000068 CR3: 0000000001c09000 CR4: 00000000000406e0 [ 1255.498332] Call Trace: [ 1255.501058] kernfs_find_and_get_ns+0x31/0x60 [ 1255.505916] sysfs_unmerge_group+0x1d/0x60 [ 1255.510498] dpm_sysfs_remove+0x22/0x60 [ 1255.514783] device_del+0xf4/0x2e0 [ 1255.518577] ? device_remove_file+0x19/0x20 [ 1255.523241] attribute_container_class_device_del+0x1a/0x20 [ 1255.529457] transport_remove_classdev+0x4e/0x60 [ 1255.534607] ? transport_add_class_device+0x40/0x40 [ 1255.540046] attribute_container_device_trigger+0xb0/0xc0 [ 1255.546069] transport_remove_device+0x15/0x20 [ 1255.551025] scsi_target_reap_ref_release+0x25/0x40 [ 1255.556467] scsi_target_reap+0x2e/0x40 [ 1255.560744] __scsi_scan_target+0xaa/0x5b0 [ 1255.565312] scsi_scan_target+0xec/0x100 [ 1255.569689] fc_scsi_scan_rport+0xb1/0xc0 [scsi_transport_fc] [ 1255.576099] process_one_work+0x14b/0x390 [ 1255.580569] worker_thread+0x4b/0x390 [ 1255.584651] kthread+0x109/0x140 [ 1255.588251] ? rescuer_thread+0x330/0x330 [ 1255.592730] ? kthread_park+0x60/0x60 [ 1255.596815] ret_from_fork+0x29/0x40 [ 1255.600801] Code: 24 08 48 83 42 40 01 5b 41 5c 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 [ 1255.621876] RIP: kernfs_find_ns+0x13/0xc0 RSP: ffffc900048abbf0 [ 1255.628479] CR2: 0000000000000068 [ 1255.632756] ---[ end trace 34a69ba0477d036f ]--- Fix this by adding another scsi_target state STARGET_CREATED_REMOVE to distinguish this case. Fixes: f05795d3d771 ("scsi: Add intermediate STARGET_REMOVE state to scsi_target_state") Reported-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Cc: <stable@vger.kernel.org> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-27 18:55:58 +00:00
* if we get here and the target is still in a CREATED state that
* means it was allocated but never made visible (because a scan
* turned up no LUNs), so don't call device_del() on it.
*/
scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state The addition of the STARGET_REMOVE state had the side effect of introducing a race condition that can cause a crash. scsi_target_reap_ref_release() checks the starget->state to see if it still in STARGET_CREATED, and if so, skips calling transport_remove_device() and device_del(), because the starget->state is only set to STARGET_RUNNING after scsi_target_add() has called device_add() and transport_add_device(). However, if an rport loss occurs while a target is being scanned, it can happen that scsi_remove_target() will be called while the starget is still in the STARGET_CREATED state. In this case, the starget->state will be set to STARGET_REMOVE, and as a result, scsi_target_reap_ref_release() will take the wrong path. The end result is a panic: [ 1255.356653] Oops: 0000 [#1] SMP [ 1255.360154] Modules linked in: x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel ghash_clmulni_i [ 1255.393234] CPU: 5 PID: 149 Comm: kworker/u96:4 Tainted: G W 4.11.0+ #8 [ 1255.401879] Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013 [ 1255.410327] Workqueue: scsi_wq_6 fc_scsi_scan_rport [scsi_transport_fc] [ 1255.417720] task: ffff88060ca8c8c0 task.stack: ffffc900048a8000 [ 1255.424331] RIP: 0010:kernfs_find_ns+0x13/0xc0 [ 1255.429287] RSP: 0018:ffffc900048abbf0 EFLAGS: 00010246 [ 1255.435123] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 1255.443083] RDX: 0000000000000000 RSI: ffffffff8188d659 RDI: 0000000000000000 [ 1255.451043] RBP: ffffc900048abc10 R08: 0000000000000000 R09: 0000012433fe0025 [ 1255.459005] R10: 0000000025e5a4b5 R11: 0000000025e5a4b5 R12: ffffffff8188d659 [ 1255.466972] R13: 0000000000000000 R14: ffff8805f55e5088 R15: 0000000000000000 [ 1255.474931] FS: 0000000000000000(0000) GS:ffff880616b40000(0000) knlGS:0000000000000000 [ 1255.483959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1255.490370] CR2: 0000000000000068 CR3: 0000000001c09000 CR4: 00000000000406e0 [ 1255.498332] Call Trace: [ 1255.501058] kernfs_find_and_get_ns+0x31/0x60 [ 1255.505916] sysfs_unmerge_group+0x1d/0x60 [ 1255.510498] dpm_sysfs_remove+0x22/0x60 [ 1255.514783] device_del+0xf4/0x2e0 [ 1255.518577] ? device_remove_file+0x19/0x20 [ 1255.523241] attribute_container_class_device_del+0x1a/0x20 [ 1255.529457] transport_remove_classdev+0x4e/0x60 [ 1255.534607] ? transport_add_class_device+0x40/0x40 [ 1255.540046] attribute_container_device_trigger+0xb0/0xc0 [ 1255.546069] transport_remove_device+0x15/0x20 [ 1255.551025] scsi_target_reap_ref_release+0x25/0x40 [ 1255.556467] scsi_target_reap+0x2e/0x40 [ 1255.560744] __scsi_scan_target+0xaa/0x5b0 [ 1255.565312] scsi_scan_target+0xec/0x100 [ 1255.569689] fc_scsi_scan_rport+0xb1/0xc0 [scsi_transport_fc] [ 1255.576099] process_one_work+0x14b/0x390 [ 1255.580569] worker_thread+0x4b/0x390 [ 1255.584651] kthread+0x109/0x140 [ 1255.588251] ? rescuer_thread+0x330/0x330 [ 1255.592730] ? kthread_park+0x60/0x60 [ 1255.596815] ret_from_fork+0x29/0x40 [ 1255.600801] Code: 24 08 48 83 42 40 01 5b 41 5c 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 [ 1255.621876] RIP: kernfs_find_ns+0x13/0xc0 RSP: ffffc900048abbf0 [ 1255.628479] CR2: 0000000000000068 [ 1255.632756] ---[ end trace 34a69ba0477d036f ]--- Fix this by adding another scsi_target state STARGET_CREATED_REMOVE to distinguish this case. Fixes: f05795d3d771 ("scsi: Add intermediate STARGET_REMOVE state to scsi_target_state") Reported-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Cc: <stable@vger.kernel.org> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-27 18:55:58 +00:00
if ((starget->state != STARGET_CREATED) &&
(starget->state != STARGET_CREATED_REMOVE)) {
transport_remove_device(&starget->dev);
device_del(&starget->dev);
}
scsi_target_destroy(starget);
}
static void scsi_target_reap_ref_put(struct scsi_target *starget)
{
kref_put(&starget->reap_ref, scsi_target_reap_ref_release);
}
/**
* scsi_alloc_target - allocate a new or find an existing target
* @parent: parent of the target (need not be a scsi host)
* @channel: target channel number (zero if no channels)
* @id: target id number
*
* Return an existing target if one exists, provided it hasn't already
* gone into STARGET_DEL state, otherwise allocate a new target.
*
* The target is returned with an incremented reference, so the caller
* is responsible for both reaping and doing a last put
*/
static struct scsi_target *scsi_alloc_target(struct device *parent,
int channel, uint id)
{
struct Scsi_Host *shost = dev_to_shost(parent);
struct device *dev = NULL;
unsigned long flags;
const int size = sizeof(struct scsi_target)
+ shost->transportt->target_size;
struct scsi_target *starget;
struct scsi_target *found_target;
int error, ref_got;
starget = kzalloc(size, GFP_KERNEL);
if (!starget) {
printk(KERN_ERR "%s: allocation failure\n", __func__);
return NULL;
}
dev = &starget->dev;
device_initialize(dev);
kref_init(&starget->reap_ref);
dev->parent = get_device(parent);
dev_set_name(dev, "target%d:%d:%d", shost->host_no, channel, id);
dev->bus = &scsi_bus_type;
dev->type = &scsi_target_type;
scsi_enable_async_suspend(dev);
starget->id = id;
starget->channel = channel;
[SCSI] Add helper code so transport classes/driver can control queueing (v3) SCSI-ml manages the queueing limits for the device and host, but does not do so at the target level. However something something similar can come in userful when a driver is transitioning a transport object to the the blocked state, becuase at that time we do not want to queue io and we do not want the queuecommand to be called again. The patch adds code similar to the exisiting SCSI_ML_*BUSY handlers. You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit a transport level queueing issue like the hw cannot allocate some resource at the iscsi session/connection level, or the target has temporarily closed or shrunk the queueing window, or if we are transitioning to the blocked state. bnx2i, when they rework their firmware according to netdev developers requests, will also need to be able to limit queueing at this level. bnx2i will hook into libiscsi, but will allocate a scsi host per netdevice/hba, so unlike pure software iscsi/iser which is allocating a host per session, it cannot set the scsi_host->can_queue and return SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport. The iscsi class/driver can also set a scsi_target->can_queue value which reflects the max commands the driver/class can support. For iscsi this reflects the number of commands we can support for each session due to session/connection hw limits, driver limits, and to also reflect the session/targets's queueing window. Changes: v1 - initial patch. v2 - Fix scsi_run_queue handling of multiple blocked targets. Previously we would break from the main loop if a device was added back on the starved list. We now run over the list and check if any target is blocked. v3 - Rediff for scsi-misc. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-08-17 20:24:38 +00:00
starget->can_queue = 0;
INIT_LIST_HEAD(&starget->siblings);
INIT_LIST_HEAD(&starget->devices);
starget->state = STARGET_CREATED;
starget->scsi_level = SCSI_2;
starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
retry:
spin_lock_irqsave(shost->host_lock, flags);
found_target = __scsi_find_target(parent, channel, id);
if (found_target)
goto found;
list_add_tail(&starget->siblings, &shost->__targets);
spin_unlock_irqrestore(shost->host_lock, flags);
/* allocate and add */
transport_setup_device(dev);
if (shost->hostt->target_alloc) {
error = shost->hostt->target_alloc(starget);
if(error) {
if (error != -ENXIO)
dev_err(dev, "target allocation failed, error %d\n", error);
/* don't want scsi_target_reap to do the final
* put because it will be under the host lock */
scsi_target_destroy(starget);
return NULL;
}
}
get_device(dev);
return starget;
found:
/*
* release routine already fired if kref is zero, so if we can still
* take the reference, the target must be alive. If we can't, it must
* be dying and we need to wait for a new target
*/
ref_got = kref_get_unless_zero(&found_target->reap_ref);
spin_unlock_irqrestore(shost->host_lock, flags);
if (ref_got) {
put_device(dev);
return found_target;
}
/*
* Unfortunately, we found a dying target; need to wait until it's
* dead before we can get a new one. There is an anomaly here. We
* *should* call scsi_target_reap() to balance the kref_get() of the
* reap_ref above. However, since the target being released, it's
* already invisible and the reap_ref is irrelevant. If we call
* scsi_target_reap() we might spuriously do another device_del() on
* an already invisible target.
*/
put_device(&found_target->dev);
/*
* length of time is irrelevant here, we just want to yield the CPU
* for a tick to avoid busy waiting for the target to die.
*/
msleep(1);
goto retry;
}
/**
* scsi_target_reap - check to see if target is in use and destroy if not
* @starget: target to be checked
*
* This is used after removing a LUN or doing a last put of the target
* it checks atomically that nothing is using the target and removes
* it if so.
*/
void scsi_target_reap(struct scsi_target *starget)
{
/*
* serious problem if this triggers: STARGET_DEL is only set in the if
* the reap_ref drops to zero, so we're trying to do another final put
* on an already released kref
*/
BUG_ON(starget->state == STARGET_DEL);
scsi_target_reap_ref_put(starget);
}
/**
* scsi_sanitize_inquiry_string - remove non-graphical chars from an
* INQUIRY result string
* @s: INQUIRY result string to sanitize
* @len: length of the string
*
* Description:
* The SCSI spec says that INQUIRY vendor, product, and revision
* strings must consist entirely of graphic ASCII characters,
* padded on the right with spaces. Since not all devices obey
* this rule, we will replace non-graphic or non-ASCII characters
* with spaces. Exception: a NUL character is interpreted as a
* string terminator, so all the following characters are set to
* spaces.
**/
void scsi_sanitize_inquiry_string(unsigned char *s, int len)
{
int terminated = 0;
for (; len > 0; (--len, ++s)) {
if (*s == 0)
terminated = 1;
if (terminated || *s < 0x20 || *s > 0x7e)
*s = ' ';
}
}
EXPORT_SYMBOL(scsi_sanitize_inquiry_string);
/**
* scsi_probe_lun - probe a single LUN using a SCSI INQUIRY
* @sdev: scsi_device to probe
* @inq_result: area to store the INQUIRY result
* @result_len: len of inq_result
* @bflags: store any bflags found here
*
* Description:
* Probe the lun associated with @req using a standard SCSI INQUIRY;
*
* If the INQUIRY is successful, zero is returned and the
* INQUIRY data is in @inq_result; the scsi_level and INQUIRY length
* are copied to the scsi_device any flags value is stored in *@bflags.
**/
static int scsi_probe_lun(struct scsi_device *sdev, unsigned char *inq_result,
int result_len, blist_flags_t *bflags)
{
unsigned char scsi_cmd[MAX_COMMAND_SIZE];
int first_inquiry_len, try_inquiry_len, next_inquiry_len;
int response_len = 0;
int pass, count, result, resid;
struct scsi_sense_hdr sshdr;
const struct scsi_exec_args exec_args = {
.sshdr = &sshdr,
.resid = &resid,
};
*bflags = 0;
/* Perform up to 3 passes. The first pass uses a conservative
* transfer length of 36 unless sdev->inquiry_len specifies a
* different value. */
first_inquiry_len = sdev->inquiry_len ? sdev->inquiry_len : 36;
try_inquiry_len = first_inquiry_len;
pass = 1;
next_pass:
SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
"scsi scan: INQUIRY pass %d length %d\n",
pass, try_inquiry_len));
/* Each pass gets up to three chances to ignore Unit Attention */
for (count = 0; count < 3; ++count) {
memset(scsi_cmd, 0, 6);
scsi_cmd[0] = INQUIRY;
scsi_cmd[4] = (unsigned char) try_inquiry_len;
memset(inq_result, 0, try_inquiry_len);
result = scsi_execute_cmd(sdev, scsi_cmd, REQ_OP_DRV_IN,
inq_result, try_inquiry_len,
HZ / 2 + HZ * scsi_inq_timeout, 3,
&exec_args);
SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
"scsi scan: INQUIRY %s with code 0x%x\n",
result ? "failed" : "successful", result));
if (result > 0) {
/*
* not-ready to ready transition [asc/ascq=0x28/0x0]
* or power-on, reset [asc/ascq=0x29/0x0], continue.
* INQUIRY should not yield UNIT_ATTENTION
* but many buggy devices do so anyway.
*/
if (scsi_status_is_check_condition(result) &&
scsi_sense_valid(&sshdr)) {
if ((sshdr.sense_key == UNIT_ATTENTION) &&
((sshdr.asc == 0x28) ||
(sshdr.asc == 0x29)) &&
(sshdr.ascq == 0))
continue;
}
} else if (result == 0) {
/*
* if nothing was transferred, we try
* again. It's a workaround for some USB
* devices.
*/
if (resid == try_inquiry_len)
continue;
}
break;
}
if (result == 0) {
scsi_sanitize_inquiry_string(&inq_result[8], 8);
scsi_sanitize_inquiry_string(&inq_result[16], 16);
scsi_sanitize_inquiry_string(&inq_result[32], 4);
response_len = inq_result[4] + 5;
if (response_len > 255)
response_len = first_inquiry_len; /* sanity */
/*
* Get any flags for this device.
*
* XXX add a bflags to scsi_device, and replace the
* corresponding bit fields in scsi_device, so bflags
* need not be passed as an argument.
*/
*bflags = scsi_get_device_flags(sdev, &inq_result[8],
&inq_result[16]);
/* When the first pass succeeds we gain information about
* what larger transfer lengths might work. */
if (pass == 1) {
if (BLIST_INQUIRY_36 & *bflags)
next_inquiry_len = 36;
/*
* LLD specified a maximum sdev->inquiry_len
* but device claims it has more data. Capping
* the length only makes sense for legacy
* devices. If a device supports SPC-4 (2014)
* or newer, assume that it is safe to ask for
* as much as the device says it supports.
*/
else if (sdev->inquiry_len &&
response_len > sdev->inquiry_len &&
(inq_result[2] & 0x7) < 6) /* SPC-4 */
next_inquiry_len = sdev->inquiry_len;
else
next_inquiry_len = response_len;
/* If more data is available perform the second pass */
if (next_inquiry_len > try_inquiry_len) {
try_inquiry_len = next_inquiry_len;
pass = 2;
goto next_pass;
}
}
} else if (pass == 2) {
sdev_printk(KERN_INFO, sdev,
"scsi scan: %d byte inquiry failed. "
"Consider BLIST_INQUIRY_36 for this device\n",
try_inquiry_len);
/* If this pass failed, the third pass goes back and transfers
* the same amount as we successfully got in the first pass. */
try_inquiry_len = first_inquiry_len;
pass = 3;
goto next_pass;
}
/* If the last transfer attempt got an error, assume the
* peripheral doesn't exist or is dead. */
if (result)
return -EIO;
/* Don't report any more data than the device says is valid */
sdev->inquiry_len = min(try_inquiry_len, response_len);
/*
* XXX Abort if the response length is less than 36? If less than
* 32, the lookup of the device flags (above) could be invalid,
* and it would be possible to take an incorrect action - we do
* not want to hang because of a short INQUIRY. On the flip side,
* if the device is spun down or becoming ready (and so it gives a
* short INQUIRY), an abort here prevents any further use of the
* device, including spin up.
*
* On the whole, the best approach seems to be to assume the first
* 36 bytes are valid no matter what the device says. That's
* better than copying < 36 bytes to the inquiry-result buffer
* and displaying garbage for the Vendor, Product, or Revision
* strings.
*/
if (sdev->inquiry_len < 36) {
if (!sdev->host->short_inquiry) {
shost_printk(KERN_INFO, sdev->host,
"scsi scan: INQUIRY result too short (%d),"
" using 36\n", sdev->inquiry_len);
sdev->host->short_inquiry = 1;
}
sdev->inquiry_len = 36;
}
/*
* Related to the above issue:
*
* XXX Devices (disk or all?) should be sent a TEST UNIT READY,
* and if not ready, sent a START_STOP to start (maybe spin up) and
* then send the INQUIRY again, since the INQUIRY can change after
* a device is initialized.
*
* Ideally, start a device if explicitly asked to do so. This
* assumes that a device is spun up on power on, spun down on
* request, and then spun up on request.
*/
/*
* The scanning code needs to know the scsi_level, even if no
* device is attached at LUN 0 (SCSI_SCAN_TARGET_PRESENT) so
* non-zero LUNs can be scanned.
*/
sdev->scsi_level = inq_result[2] & 0x07;
if (sdev->scsi_level >= 2 ||
(sdev->scsi_level == 1 && (inq_result[3] & 0x0f) == 1))
sdev->scsi_level++;
sdev->sdev_target->scsi_level = sdev->scsi_level;
scsi: don't store LUN bits in CDB[1] for USB mass-storage devices The SCSI specification requires that the second Command Data Byte should contain the LUN value in its high-order bits if the recipient device reports SCSI level 2 or below. Nevertheless, some USB mass-storage devices use those bits for other purposes in vendor-specific commands. Currently Linux has no way to send such commands, because the SCSI stack always overwrites the LUN bits. Testing shows that Windows 7 and XP do not store the LUN bits in the CDB when sending commands to a USB device. This doesn't matter if the device uses the Bulk-Only or UAS transports (which virtually all modern USB mass-storage devices do), as these have a separate mechanism for sending the LUN value. Therefore this patch introduces a flag in the Scsi_Host structure to inform the SCSI midlayer that a transport does not require the LUN bits to be stored in the CDB, and it makes usb-storage set this flag for all devices using the Bulk-Only transport. (UAS is handled by a separate driver, but it doesn't really matter because no SCSI-2 or lower device is at all likely to use UAS.) The patch also cleans up the code responsible for storing the LUN value by adding a bitflag to the scsi_device structure. The test for whether to stick the LUN value in the CDB can be made when the device is probed, and stored for future use rather than being made over and over in the fast path. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Tiziano Bacocco <tiziano.bacocco@gmail.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-09-02 15:35:50 +00:00
/*
* If SCSI-2 or lower, and if the transport requires it,
* store the LUN value in CDB[1].
*/
sdev->lun_in_cdb = 0;
if (sdev->scsi_level <= SCSI_2 &&
sdev->scsi_level != SCSI_UNKNOWN &&
!sdev->host->no_scsi2_lun_in_cdb)
sdev->lun_in_cdb = 1;
return 0;
}
/**
* scsi_add_lun - allocate and fully initialze a scsi_device
* @sdev: holds information to be stored in the new scsi_device
* @inq_result: holds the result of a previous INQUIRY to the LUN
* @bflags: black/white list flag
* @async: 1 if this device is being scanned asynchronously
*
* Description:
* Initialize the scsi_device @sdev. Optionally set fields based
* on values in *@bflags.
*
* Return:
* SCSI_SCAN_NO_RESPONSE: could not allocate or setup a scsi_device
* SCSI_SCAN_LUN_PRESENT: a new scsi_device was allocated and initialized
**/
static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
blist_flags_t *bflags, int async)
{
int ret;
/*
* XXX do not save the inquiry, since it can change underneath us,
* save just vendor/model/rev.
*
* Rather than save it and have an ioctl that retrieves the saved
* value, have an ioctl that executes the same INQUIRY code used
* in scsi_probe_lun, let user level programs doing INQUIRY
* scanning run at their own risk, or supply a user level program
* that can correctly scan.
*/
/*
* Copy at least 36 bytes of INQUIRY data, so that we don't
* dereference unallocated memory when accessing the Vendor,
* Product, and Revision strings. Badly behaved devices may set
* the INQUIRY Additional Length byte to a small value, indicating
* these strings are invalid, but often they contain plausible data
* nonetheless. It doesn't matter if the device sent < 36 bytes
* total, since scsi_probe_lun() initializes inq_result with 0s.
*/
sdev->inquiry = kmemdup(inq_result,
max_t(size_t, sdev->inquiry_len, 36),
scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c We had a test-report where, under memory pressure, adding LUNs to the systems would fail (the tests add LUNs strictly in sequence): [ 5525.853432] scsi 0:0:1:1088045124: Direct-Access IBM 2107900 .148 PQ: 0 ANSI: 5 [ 5525.853826] scsi 0:0:1:1088045124: alua: supports implicit TPGS [ 5525.853830] scsi 0:0:1:1088045124: alua: device naa.6005076303ffd32700000000000044da port group 0 rel port 43 [ 5525.853931] sd 0:0:1:1088045124: Attached scsi generic sg10 type 0 [ 5525.854075] sd 0:0:1:1088045124: [sdk] Disabling DIF Type 1 protection [ 5525.855495] sd 0:0:1:1088045124: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB) [ 5525.855606] sd 0:0:1:1088045124: [sdk] Write Protect is off [ 5525.855609] sd 0:0:1:1088045124: [sdk] Mode Sense: ed 00 00 08 [ 5525.855795] sd 0:0:1:1088045124: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 5525.857838] sdk: sdk1 [ 5525.859468] sd 0:0:1:1088045124: [sdk] Attached SCSI disk [ 5525.865073] sd 0:0:1:1088045124: alua: transition timeout set to 60 seconds [ 5525.865078] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.015070] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.015213] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.587439] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured [ 5526.588562] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured Looking at the code of scsi_alloc_sdev(), and all the calling contexts, there seems to be no reason to use GFP_ATMOIC here. All the different call-contexts use a mutex at some point, and nothing in between that requires no sleeping, as far as I could see. Additionally, the code that later allocates the block queue for the device (scsi_mq_alloc_queue()) already uses GFP_KERNEL. There are similar allocations in two other functions: scsi_probe_and_add_lun(), and scsi_add_lun(),; that can also be done with GFP_KERNEL. Here is the contexts for the three functions so far: scsi_alloc_sdev() scsi_probe_and_add_lun() scsi_sequential_lun_scan() __scsi_scan_target() scsi_scan_target() mutex_lock() scsi_scan_channel() scsi_scan_host_selected() mutex_lock() scsi_report_lun_scan() __scsi_scan_target() ... __scsi_add_device() mutex_lock() __scsi_scan_target() ... scsi_report_lun_scan() ... scsi_get_host_dev() mutex_lock() scsi_probe_and_add_lun() ... scsi_add_lun() scsi_probe_and_add_lun() ... So replace all these, and give them a bit of a better chance to succeed, with more chances of reclaim. Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-21 09:18:00 +00:00
GFP_KERNEL);
if (sdev->inquiry == NULL)
return SCSI_SCAN_NO_RESPONSE;
sdev->vendor = (char *) (sdev->inquiry + 8);
sdev->model = (char *) (sdev->inquiry + 16);
sdev->rev = (char *) (sdev->inquiry + 32);
[SCSI] Fix 'Device not ready' issue on mpt2sas This is a particularly nasty SCSI ATA Translation Layer (SATL) problem. SAT-2 says (section 8.12.2) if the device is in the stopped state as the result of processing a START STOP UNIT command (see 9.11), then the SATL shall terminate the TEST UNIT READY command with CHECK CONDITION status with the sense key set to NOT READY and the additional sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED; mpt2sas internal SATL seems to implement this. The result is very confusing standby behaviour (using hdparm -y). If you suspend a drive and then send another command, usually it wakes up. However, if the next command is a TEST UNIT READY, the SATL sees that the drive is suspended and proceeds to follow the SATL rules for this, returning NOT READY to all subsequent commands. This means that the ordering of TEST UNIT READY is crucial: if you send TUR and then a command, you get a NOT READY to both back. If you send a command and then a TUR, you get GOOD status because the preceeding command woke the drive. This bit us badly because commit 85ef06d1d252f6a2e73b678591ab71caad4667bb Author: Tejun Heo <tj@kernel.org> Date: Fri Jul 1 16:17:47 2011 +0200 block: flush MEDIA_CHANGE from drivers on close(2) Changed our ordering on TEST UNIT READY commands meaning that SATA drives connected to an mpt2sas now suspend and refuse to wake (because the mpt2sas SATL sees the suspend *before* the drives get awoken by the next ATA command) resulting in lots of failed commands. The standard is completely nuts forcing this inconsistent behaviour, but we have to work around it. The fix for this is twofold: 1. Set the allow_restart flag so we wake the drive when we see it has been suspended 2. Return all TEST UNIT READY status directly to the mid layer without any further error handling which prevents us causing error handling which may offline the device just because of a media check TUR. Reported-by: Matthias Prager <linux@matthiasprager.de> Cc: stable@vger.kernel.org Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-07-25 19:55:55 +00:00
if (strncmp(sdev->vendor, "ATA ", 8) == 0) {
/*
* sata emulation layer device. This is a hack to work around
* the SATL power management specifications which state that
* when the SATL detects the device has gone into standby
* mode, it shall respond with NOT READY.
*/
sdev->allow_restart = 1;
}
if (*bflags & BLIST_ISROM) {
sdev->type = TYPE_ROM;
sdev->removable = 1;
} else {
sdev->type = (inq_result[0] & 0x1f);
sdev->removable = (inq_result[1] & 0x80) >> 7;
/*
* some devices may respond with wrong type for
* well-known logical units. Force well-known type
* to enumerate them correctly.
*/
if (scsi_is_wlun(sdev->lun) && sdev->type != TYPE_WLUN) {
sdev_printk(KERN_WARNING, sdev,
"%s: correcting incorrect peripheral device type 0x%x for W-LUN 0x%16xhN\n",
__func__, sdev->type, (unsigned int)sdev->lun);
sdev->type = TYPE_WLUN;
}
}
if (sdev->type == TYPE_RBC || sdev->type == TYPE_ROM) {
/* RBC and MMC devices can return SCSI-3 compliance and yet
* still not support REPORT LUNS, so make them act as
* BLIST_NOREPORTLUN unless BLIST_REPORTLUN2 is
* specifically set */
if ((*bflags & BLIST_REPORTLUN2) == 0)
*bflags |= BLIST_NOREPORTLUN;
}
/*
* For a peripheral qualifier (PQ) value of 1 (001b), the SCSI
* spec says: The device server is capable of supporting the
* specified peripheral device type on this logical unit. However,
* the physical device is not currently connected to this logical
* unit.
*
* The above is vague, as it implies that we could treat 001 and
* 011 the same. Stay compatible with previous code, and create a
* scsi_device for a PQ of 1
*
* Don't set the device offline here; rather let the upper
* level drivers eval the PQ to decide whether they should
* attach. So remove ((inq_result[0] >> 5) & 7) == 1 check.
*/
sdev->inq_periph_qual = (inq_result[0] >> 5) & 7;
sdev->lockable = sdev->removable;
sdev->soft_reset = (inq_result[7] & 1) && ((inq_result[3] & 7) == 2);
if (sdev->scsi_level >= SCSI_3 ||
(sdev->inquiry_len > 56 && inq_result[56] & 0x04))
sdev->ppr = 1;
if (inq_result[7] & 0x60)
sdev->wdtr = 1;
if (inq_result[7] & 0x10)
sdev->sdtr = 1;
sdev_printk(KERN_NOTICE, sdev, "%s %.8s %.16s %.4s PQ: %d "
"ANSI: %d%s\n", scsi_device_type(sdev->type),
sdev->vendor, sdev->model, sdev->rev,
sdev->inq_periph_qual, inq_result[2] & 0x07,
(inq_result[3] & 0x0f) == 1 ? " CCS" : "");
if ((sdev->scsi_level >= SCSI_2) && (inq_result[7] & 2) &&
!(*bflags & BLIST_NOTQ)) {
sdev->tagged_supported = 1;
sdev->simple_tags = 1;
}
/*
* Some devices (Texel CD ROM drives) have handshaking problems
* when used with the Seagate controllers. borken is initialized
* to 1, and then set it to 0 here.
*/
if ((*bflags & BLIST_BORKEN) == 0)
sdev->borken = 0;
if (*bflags & BLIST_NO_ULD_ATTACH)
sdev->no_uld_attach = 1;
/*
* Apparently some really broken devices (contrary to the SCSI
* standards) need to be selected without asserting ATN
*/
if (*bflags & BLIST_SELECT_NO_ATN)
sdev->select_no_atn = 1;
/*
* Maximum 512 sector transfer length
* broken RA4x00 Compaq Disk Array
*/
if (*bflags & BLIST_MAX_512)
blk_queue_max_hw_sectors(sdev->request_queue, 512);
SCSI: add 1024 max sectors black list flag This works around a issue with qnap iscsi targets not handling large IOs very well. The target returns: VPD INQUIRY: Block limits page (SBC) Maximum compare and write length: 1 blocks Optimal transfer length granularity: 1 blocks Maximum transfer length: 4294967295 blocks Optimal transfer length: 4294967295 blocks Maximum prefetch, xdread, xdwrite transfer length: 0 blocks Maximum unmap LBA count: 8388607 Maximum unmap block descriptor count: 1 Optimal unmap granularity: 16383 Unmap granularity alignment valid: 0 Unmap granularity alignment: 0 Maximum write same length: 0xffffffff blocks Maximum atomic transfer length: 0 Atomic alignment: 0 Atomic transfer length granularity: 0 and it is *sometimes* able to handle at least one IO of size up to 8 MB. We have seen in traces where it will sometimes work, but other times it looks like it fails and it looks like it returns failures if we send multiple large IOs sometimes. Also it looks like it can return 2 different errors. It will sometimes send iscsi reject errors indicating out of resources or it will send invalid cdb illegal requests check conditions. And then when it sends iscsi rejects it does not seem to handle retries when there are command sequence holes, so I could not just add code to try and gracefully handle that error code. The problem is that we do not have a good contact for the company, so we are not able to determine under what conditions it returns which error and why it sometimes works. So, this patch just adds a new black list flag to set targets like this to the old max safe sectors of 1024. The max_hw_sectors changes added in 3.19 caused this regression, so I also ccing stable. Reported-by: Christian Hesse <list@eworm.de> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-04-21 03:42:24 +00:00
/*
* Max 1024 sector transfer length for targets that report incorrect
* max/optimal lengths and relied on the old block layer safe default
*/
else if (*bflags & BLIST_MAX_1024)
blk_queue_max_hw_sectors(sdev->request_queue, 1024);
/*
* Some devices may not want to have a start command automatically
* issued when a device is added.
*/
if (*bflags & BLIST_NOSTARTONADD)
sdev->no_start_on_add = 1;
if (*bflags & BLIST_SINGLELUN)
scsi_target(sdev)->single_lun = 1;
sdev->use_10_for_rw = 1;
/* some devices don't like REPORT SUPPORTED OPERATION CODES
* and will simply timeout causing sd_mod init to take a very
* very long time */
if (*bflags & BLIST_NO_RSOC)
sdev->no_report_opcodes = 1;
/* set the device running here so that slave configure
* may do I/O */
mutex_lock(&sdev->state_mutex);
ret = scsi_device_set_state(sdev, SDEV_RUNNING);
if (ret)
ret = scsi_device_set_state(sdev, SDEV_BLOCK);
mutex_unlock(&sdev->state_mutex);
if (ret) {
sdev_printk(KERN_ERR, sdev,
"in wrong state %s to complete scan\n",
scsi_device_state_name(sdev->sdev_state));
return SCSI_SCAN_NO_RESPONSE;
}
if (*bflags & BLIST_NOT_LOCKABLE)
sdev->lockable = 0;
if (*bflags & BLIST_RETRY_HWERROR)
sdev->retry_hwerror = 1;
if (*bflags & BLIST_NO_DIF)
sdev->no_dif = 1;
if (*bflags & BLIST_UNMAP_LIMIT_WS)
sdev->unmap_limit_for_ws = 1;
if (*bflags & BLIST_IGN_MEDIA_CHANGE)
sdev->ignore_media_change = 1;
sdev->eh_timeout = SCSI_DEFAULT_EH_TIMEOUT;
if (*bflags & BLIST_TRY_VPD_PAGES)
sdev->try_vpd_pages = 1;
else if (*bflags & BLIST_SKIP_VPD_PAGES)
sdev->skip_vpd_pages = 1;
if (*bflags & BLIST_NO_VPD_SIZE)
sdev->no_vpd_size = 1;
transport_configure_device(&sdev->sdev_gendev);
if (sdev->host->hostt->slave_configure) {
ret = sdev->host->hostt->slave_configure(sdev);
if (ret) {
/*
* if LLDD reports slave not present, don't clutter
* console with alloc failure messages
*/
if (ret != -ENXIO) {
sdev_printk(KERN_ERR, sdev,
"failed to configure device\n");
}
return SCSI_SCAN_NO_RESPONSE;
}
/*
* The queue_depth is often changed in ->slave_configure.
* Set up budget map again since memory consumption of
* the map depends on actual queue depth.
*/
scsi_realloc_sdev_budget_map(sdev, sdev->queue_depth);
}
if (sdev->scsi_level >= SCSI_3)
scsi_attach_vpd(sdev);
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 22:46:33 +00:00
sdev->max_queue_depth = sdev->queue_depth;
scsi: core: Replace sdev->device_busy with sbitmap SCSI currently uses an atomic variable to track queue depth for each attached device. The queue depth depends on many factors such as transport type and device implementation. In addition, the SCSI device queue depth is not a static entity but changes over time as a result of congestion management. While blk-mq currently tracks queue depth for each hctx, it can't easily be changed to accommodate the SCSI per-device requirement. The current approach of using an atomic variable doesn't scale well when there are lots of CPU cores and the disk is very fast. IOPS can be substantially impacted by the atomic in the hot path. Replace the atomic variable sdev->device_busy with an sbitmap for tracking the SCSI device queue depth. It has been observed that IOPS is improved ~30% by this patchset in the following test: 1) test machine(32 logical CPU cores) Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 2 NUMA node(s): 2 Model name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz 2) setup scsi_debug: modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256 3) fio script: fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \ --numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \ --loops=10000 --name=job1 --filename=/dev/sdN [mkp: fix device_busy reference in mpt3sas] Link: https://lore.kernel.org/r/20210122023317.687987-14-ming.lei@redhat.com Link: https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/ Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 02:33:17 +00:00
WARN_ON_ONCE(sdev->max_queue_depth > sdev->budget_map.depth);
sdev->sdev_bflags = *bflags;
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 22:46:33 +00:00
/*
* Ok, the device is now all set up, we can
* register it and tell the rest of the kernel
* about it.
*/
if (!async && scsi_sysfs_add_sdev(sdev) != 0)
return SCSI_SCAN_NO_RESPONSE;
return SCSI_SCAN_LUN_PRESENT;
}
#ifdef CONFIG_SCSI_LOGGING
/**
* scsi_inq_str - print INQUIRY data from min to max index, strip trailing whitespace
* @buf: Output buffer with at least end-first+1 bytes of space
* @inq: Inquiry buffer (input)
* @first: Offset of string into inq
* @end: Index after last character in inq
*/
static unsigned char *scsi_inq_str(unsigned char *buf, unsigned char *inq,
unsigned first, unsigned end)
{
unsigned term = 0, idx;
for (idx = 0; idx + first < end && idx + first < inq[4] + 5; idx++) {
if (inq[idx+first] > ' ') {
buf[idx] = inq[idx+first];
term = idx+1;
} else {
buf[idx] = ' ';
}
}
buf[term] = 0;
return buf;
}
#endif
/**
* scsi_probe_and_add_lun - probe a LUN, if a LUN is found add it
* @starget: pointer to target device structure
* @lun: LUN of target device
* @bflagsp: store bflags here if not NULL
* @sdevp: probe the LUN corresponding to this scsi_device
* @rescan: if not equal to SCSI_SCAN_INITIAL skip some code only
* needed on first scan
* @hostdata: passed to scsi_alloc_sdev()
*
* Description:
* Call scsi_probe_lun, if a LUN with an attached device is found,
* allocate and set it up by calling scsi_add_lun.
*
* Return:
*
* - SCSI_SCAN_NO_RESPONSE: could not allocate or setup a scsi_device
* - SCSI_SCAN_TARGET_PRESENT: target responded, but no device is
* attached at the LUN
* - SCSI_SCAN_LUN_PRESENT: a new scsi_device was allocated and initialized
**/
static int scsi_probe_and_add_lun(struct scsi_target *starget,
u64 lun, blist_flags_t *bflagsp,
struct scsi_device **sdevp,
enum scsi_scan_mode rescan,
void *hostdata)
{
struct scsi_device *sdev;
unsigned char *result;
blist_flags_t bflags;
int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
/*
* The rescan flag is used as an optimization, the first scan of a
* host adapter calls into here with rescan == 0.
*/
sdev = scsi_device_lookup_by_target(starget, lun);
if (sdev) {
if (rescan != SCSI_SCAN_INITIAL || !scsi_device_created(sdev)) {
SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
"scsi scan: device exists on %s\n",
dev_name(&sdev->sdev_gendev)));
if (sdevp)
*sdevp = sdev;
else
scsi_device_put(sdev);
if (bflagsp)
*bflagsp = scsi_get_device_flags(sdev,
sdev->vendor,
sdev->model);
return SCSI_SCAN_LUN_PRESENT;
}
scsi_device_put(sdev);
} else
sdev = scsi_alloc_sdev(starget, lun, hostdata);
if (!sdev)
goto out;
result = kmalloc(result_len, GFP_KERNEL);
if (!result)
goto out_free_sdev;
if (scsi_probe_lun(sdev, result, result_len, &bflags))
goto out_free_result;
if (bflagsp)
*bflagsp = bflags;
/*
* result contains valid SCSI INQUIRY data.
*/
if ((result[0] >> 5) == 3) {
/*
* For a Peripheral qualifier 3 (011b), the SCSI
* spec says: The device server is not capable of
* supporting a physical device on this logical
* unit.
*
* For disks, this implies that there is no
* logical disk configured at sdev->lun, but there
* is a target id responding.
*/
SCSI_LOG_SCAN_BUS(2, sdev_printk(KERN_INFO, sdev, "scsi scan:"
" peripheral qualifier of 3, device not"
" added\n"))
if (lun == 0) {
SCSI_LOG_SCAN_BUS(1, {
unsigned char vend[9];
unsigned char mod[17];
sdev_printk(KERN_INFO, sdev,
"scsi scan: consider passing scsi_mod."
"dev_flags=%s:%s:0x240 or 0x1000240\n",
scsi_inq_str(vend, result, 8, 16),
scsi_inq_str(mod, result, 16, 32));
});
}
res = SCSI_SCAN_TARGET_PRESENT;
goto out_free_result;
}
/*
* Some targets may set slight variations of PQ and PDT to signal
* that no LUN is present, so don't add sdev in these cases.
* Two specific examples are:
* 1) NetApp targets: return PQ=1, PDT=0x1f
* 2) USB UFI: returns PDT=0x1f, with the PQ bits being "reserved"
* in the UFI 1.0 spec (we cannot rely on reserved bits).
*
* References:
* 1) SCSI SPC-3, pp. 145-146
* PQ=1: "A peripheral device having the specified peripheral
* device type is not connected to this logical unit. However, the
* device server is capable of supporting the specified peripheral
* device type on this logical unit."
* PDT=0x1f: "Unknown or no device type"
* 2) USB UFI 1.0, p. 20
* PDT=00h Direct-access device (floppy)
* PDT=1Fh none (no FDD connected to the requested logical unit)
*/
if (((result[0] >> 5) == 1 || starget->pdt_1f_for_no_lun) &&
(result[0] & 0x1f) == 0x1f &&
!scsi_is_wlun(lun)) {
SCSI_LOG_SCAN_BUS(3, sdev_printk(KERN_INFO, sdev,
"scsi scan: peripheral device type"
" of 31, no device added\n"));
res = SCSI_SCAN_TARGET_PRESENT;
goto out_free_result;
}
res = scsi_add_lun(sdev, result, &bflags, shost->async_scan);
if (res == SCSI_SCAN_LUN_PRESENT) {
if (bflags & BLIST_KEY) {
sdev->lockable = 0;
scsi_unlock_floptical(sdev, result);
}
}
out_free_result:
kfree(result);
out_free_sdev:
if (res == SCSI_SCAN_LUN_PRESENT) {
if (sdevp) {
if (scsi_device_get(sdev) == 0) {
*sdevp = sdev;
} else {
__scsi_remove_device(sdev);
res = SCSI_SCAN_NO_RESPONSE;
}
}
} else
__scsi_remove_device(sdev);
out:
return res;
}
/**
* scsi_sequential_lun_scan - sequentially scan a SCSI target
* @starget: pointer to target structure to scan
* @bflags: black/white list flag for LUN 0
* @scsi_level: Which version of the standard does this device adhere to
* @rescan: passed to scsi_probe_add_lun()
*
* Description:
* Generally, scan from LUN 1 (LUN 0 is assumed to already have been
* scanned) to some maximum lun until a LUN is found with no device
* attached. Use the bflags to figure out any oddities.
*
* Modifies sdevscan->lun.
**/
static void scsi_sequential_lun_scan(struct scsi_target *starget,
blist_flags_t bflags, int scsi_level,
enum scsi_scan_mode rescan)
{
uint max_dev_lun;
u64 sparse_lun, lun;
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
SCSI_LOG_SCAN_BUS(3, starget_printk(KERN_INFO, starget,
"scsi scan: Sequential scan\n"));
max_dev_lun = min(max_scsi_luns, shost->max_lun);
/*
* If this device is known to support sparse multiple units,
* override the other settings, and scan all of them. Normally,
* SCSI-3 devices should be scanned via the REPORT LUNS.
*/
if (bflags & BLIST_SPARSELUN) {
max_dev_lun = shost->max_lun;
sparse_lun = 1;
} else
sparse_lun = 0;
/*
* If less than SCSI_1_CCS, and no special lun scanning, stop
* scanning; this matches 2.4 behaviour, but could just be a bug
* (to continue scanning a SCSI_1_CCS device).
*
* This test is broken. We might not have any device on lun0 for
* a sparselun device, and if that's the case then how would we
* know the real scsi_level, eh? It might make sense to just not
* scan any SCSI_1 device for non-0 luns, but that check would best
* go into scsi_alloc_sdev() and just have it return null when asked
* to alloc an sdev for lun > 0 on an already found SCSI_1 device.
*
if ((sdevscan->scsi_level < SCSI_1_CCS) &&
((bflags & (BLIST_FORCELUN | BLIST_SPARSELUN | BLIST_MAX5LUN))
== 0))
return;
*/
/*
* If this device is known to support multiple units, override
* the other settings, and scan all of them.
*/
if (bflags & BLIST_FORCELUN)
max_dev_lun = shost->max_lun;
/*
* REGAL CDC-4X: avoid hang after LUN 4
*/
if (bflags & BLIST_MAX5LUN)
max_dev_lun = min(5U, max_dev_lun);
/*
* Do not scan SCSI-2 or lower device past LUN 7, unless
* BLIST_LARGELUN.
*/
if (scsi_level < SCSI_3 && !(bflags & BLIST_LARGELUN))
max_dev_lun = min(8U, max_dev_lun);
else
max_dev_lun = min(256U, max_dev_lun);
/*
* We have already scanned LUN 0, so start at LUN 1. Keep scanning
* until we reach the max, or no LUN is found and we are not
* sparse_lun.
*/
for (lun = 1; lun < max_dev_lun; ++lun)
if ((scsi_probe_and_add_lun(starget, lun, NULL, NULL, rescan,
NULL) != SCSI_SCAN_LUN_PRESENT) &&
!sparse_lun)
return;
}
/**
* scsi_report_lun_scan - Scan using SCSI REPORT LUN results
* @starget: which target
* @bflags: Zero or a mix of BLIST_NOLUN, BLIST_REPORTLUN2, or BLIST_NOREPORTLUN
* @rescan: nonzero if we can skip code only needed on first scan
*
* Description:
* Fast scanning for modern (SCSI-3) devices by sending a REPORT LUN command.
* Scan the resulting list of LUNs by calling scsi_probe_and_add_lun.
*
* If BLINK_REPORTLUN2 is set, scan a target that supports more than 8
* LUNs even if it's older than SCSI-3.
* If BLIST_NOREPORTLUN is set, return 1 always.
* If BLIST_NOLUN is set, return 0 always.
* If starget->no_report_luns is set, return 1 always.
*
* Return:
* 0: scan completed (or no memory, so further scanning is futile)
* 1: could not scan with REPORT LUN
**/
static int scsi_report_lun_scan(struct scsi_target *starget, blist_flags_t bflags,
enum scsi_scan_mode rescan)
{
unsigned char scsi_cmd[MAX_COMMAND_SIZE];
unsigned int length;
u64 lun;
unsigned int num_luns;
unsigned int retries;
int result;
struct scsi_lun *lunp, *lun_data;
struct scsi_sense_hdr sshdr;
struct scsi_device *sdev;
struct Scsi_Host *shost = dev_to_shost(&starget->dev);
const struct scsi_exec_args exec_args = {
.sshdr = &sshdr,
};
int ret = 0;
/*
* Only support SCSI-3 and up devices if BLIST_NOREPORTLUN is not set.
* Also allow SCSI-2 if BLIST_REPORTLUN2 is set and host adapter does
* support more than 8 LUNs.
* Don't attempt if the target doesn't support REPORT LUNS.
*/
if (bflags & BLIST_NOREPORTLUN)
return 1;
if (starget->scsi_level < SCSI_2 &&
starget->scsi_level != SCSI_UNKNOWN)
return 1;
if (starget->scsi_level < SCSI_3 &&
(!(bflags & BLIST_REPORTLUN2) || shost->max_lun <= 8))
return 1;
if (bflags & BLIST_NOLUN)
return 0;
if (starget->no_report_luns)
return 1;
if (!(sdev = scsi_device_lookup_by_target(starget, 0))) {
sdev = scsi_alloc_sdev(starget, 0, NULL);
if (!sdev)
return 0;
if (scsi_device_get(sdev)) {
__scsi_remove_device(sdev);
return 0;
}
}
/*
* Allocate enough to hold the header (the same size as one scsi_lun)
* plus the number of luns we are requesting. 511 was the default
* value of the now removed max_report_luns parameter.
*/
length = (511 + 1) * sizeof(struct scsi_lun);
retry:
lun_data = kmalloc(length, GFP_KERNEL);
if (!lun_data) {
printk(ALLOC_FAILURE_MSG, __func__);
goto out;
}
scsi_cmd[0] = REPORT_LUNS;
/*
* bytes 1 - 5: reserved, set to zero.
*/
memset(&scsi_cmd[1], 0, 5);
/*
* bytes 6 - 9: length of the command.
*/
put_unaligned_be32(length, &scsi_cmd[6]);
scsi_cmd[10] = 0; /* reserved */
scsi_cmd[11] = 0; /* control */
/*
* We can get a UNIT ATTENTION, for example a power on/reset, so
* retry a few times (like sd.c does for TEST UNIT READY).
* Experience shows some combinations of adapter/devices get at
* least two power on/resets.
*
* Illegal requests (for devices that do not support REPORT LUNS)
* should come through as a check condition, and will not generate
* a retry.
*/
for (retries = 0; retries < 3; retries++) {
SCSI_LOG_SCAN_BUS(3, sdev_printk (KERN_INFO, sdev,
"scsi scan: Sending REPORT LUNS to (try %d)\n",
retries));
result = scsi_execute_cmd(sdev, scsi_cmd, REQ_OP_DRV_IN,
lun_data, length,
SCSI_REPORT_LUNS_TIMEOUT, 3,
&exec_args);
SCSI_LOG_SCAN_BUS(3, sdev_printk (KERN_INFO, sdev,
"scsi scan: REPORT LUNS"
" %s (try %d) result 0x%x\n",
result ? "failed" : "successful",
retries, result));
if (result == 0)
break;
else if (scsi_sense_valid(&sshdr)) {
if (sshdr.sense_key != UNIT_ATTENTION)
break;
}
}
if (result) {
/*
* The device probably does not support a REPORT LUN command
*/
ret = 1;
goto out_err;
}
/*
* Get the length from the first four bytes of lun_data.
*/
if (get_unaligned_be32(lun_data->scsi_lun) +
sizeof(struct scsi_lun) > length) {
length = get_unaligned_be32(lun_data->scsi_lun) +
sizeof(struct scsi_lun);
kfree(lun_data);
goto retry;
}
length = get_unaligned_be32(lun_data->scsi_lun);
num_luns = (length / sizeof(struct scsi_lun));
SCSI_LOG_SCAN_BUS(3, sdev_printk (KERN_INFO, sdev,
"scsi scan: REPORT LUN scan\n"));
/*
* Scan the luns in lun_data. The entry at offset 0 is really
* the header, so start at 1 and go up to and including num_luns.
*/
for (lunp = &lun_data[1]; lunp <= &lun_data[num_luns]; lunp++) {
lun = scsilun_to_int(lunp);
if (lun > sdev->host->max_lun) {
sdev_printk(KERN_WARNING, sdev,
"lun%llu has a LUN larger than"
" allowed by the host adapter\n", lun);
} else {
int res;
res = scsi_probe_and_add_lun(starget,
lun, NULL, NULL, rescan, NULL);
if (res == SCSI_SCAN_NO_RESPONSE) {
/*
* Got some results, but now none, abort.
*/
sdev_printk(KERN_ERR, sdev,
"Unexpected response"
" from lun %llu while scanning, scan"
" aborted\n", (unsigned long long)lun);
break;
}
}
}
out_err:
kfree(lun_data);
out:
if (scsi_device_created(sdev))
/*
* the sdev we used didn't appear in the report luns scan
*/
__scsi_remove_device(sdev);
scsi: Fix use-after-free This patch fixes one use-after-free report[1] by KASAN. In __scsi_scan_target(), when a type 31 device is probed, SCSI_SCAN_TARGET_PRESENT is returned and the target will be scanned again. Inside the following scsi_report_lun_scan(), one new scsi_device instance is allocated, and scsi_probe_and_add_lun() is called again to probe the target and still see type 31 device, finally __scsi_remove_device() is called to remove & free the device at the end of scsi_probe_and_add_lun(), so cause use-after-free in scsi_report_lun_scan(). And the following SCSI log can be observed: scsi 0:0:2:0: scsi scan: INQUIRY pass 1 length 36 scsi 0:0:2:0: scsi scan: INQUIRY successful with code 0x0 scsi 0:0:2:0: scsi scan: peripheral device type of 31, no device added scsi 0:0:2:0: scsi scan: Sending REPORT LUNS to (try 0) scsi 0:0:2:0: scsi scan: REPORT LUNS successful (try 0) result 0x0 scsi 0:0:2:0: scsi scan: REPORT LUN scan scsi 0:0:2:0: scsi scan: INQUIRY pass 1 length 36 scsi 0:0:2:0: scsi scan: INQUIRY successful with code 0x0 scsi 0:0:2:0: scsi scan: peripheral device type of 31, no device added BUG: KASAN: use-after-free in __scsi_scan_target+0xbf8/0xe40 at addr ffff88007b44a104 This patch fixes the issue by moving the putting reference at the end of scsi_report_lun_scan(). [1] KASAN report ================================================================== [ 3.274597] PM: Adding info for serio:serio1 [ 3.275127] BUG: KASAN: use-after-free in __scsi_scan_target+0xd87/0xdf0 at addr ffff880254d8c304 [ 3.275653] Read of size 4 by task kworker/u10:0/27 [ 3.275903] CPU: 3 PID: 27 Comm: kworker/u10:0 Not tainted 4.8.0 #2121 [ 3.276258] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 3.276797] Workqueue: events_unbound async_run_entry_fn [ 3.277083] ffff880254d8c380 ffff880259a37870 ffffffff94bbc6c1 ffff880078402d80 [ 3.277532] ffff880254d8bb80 ffff880259a37898 ffffffff9459fec1 ffff880259a37930 [ 3.277989] ffff880254d8bb80 ffff880078402d80 ffff880259a37920 ffffffff945a0165 [ 3.278436] Call Trace: [ 3.278528] [<ffffffff94bbc6c1>] dump_stack+0x65/0x84 [ 3.278797] [<ffffffff9459fec1>] kasan_object_err+0x21/0x70 [ 3.279063] device: 'psaux': device_add [ 3.279616] [<ffffffff945a0165>] kasan_report_error+0x205/0x500 [ 3.279651] PM: Adding info for No Bus:psaux [ 3.280202] [<ffffffff944ecd22>] ? kfree_const+0x22/0x30 [ 3.280486] [<ffffffff94bc2dc9>] ? kobject_release+0x119/0x370 [ 3.280805] [<ffffffff945a0543>] __asan_report_load4_noabort+0x43/0x50 [ 3.281170] [<ffffffff9507e1f7>] ? __scsi_scan_target+0xd87/0xdf0 [ 3.281506] [<ffffffff9507e1f7>] __scsi_scan_target+0xd87/0xdf0 [ 3.281848] [<ffffffff9507d470>] ? scsi_add_device+0x30/0x30 [ 3.282156] [<ffffffff94f7f660>] ? pm_runtime_autosuspend_expiration+0x60/0x60 [ 3.282570] [<ffffffff956ddb07>] ? _raw_spin_lock+0x17/0x40 [ 3.282880] [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160 [ 3.283200] [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0 [ 3.283563] [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250 [ 3.283882] [<ffffffff9507efc1>] do_scan_async+0x41/0x450 [ 3.284173] [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610 [ 3.284492] [<ffffffff941a8954>] ? pwq_dec_nr_in_flight+0x124/0x2a0 [ 3.284876] [<ffffffff941d1770>] ? preempt_count_add+0x130/0x160 [ 3.285207] [<ffffffff941a9a84>] process_one_work+0x544/0x12d0 [ 3.285526] [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0 [ 3.285844] [<ffffffff941aa810>] ? process_one_work+0x12d0/0x12d0 [ 3.286182] [<ffffffff941bb365>] kthread+0x1c5/0x260 [ 3.286443] [<ffffffff940855cd>] ? __switch_to+0x88d/0x1430 [ 3.286745] [<ffffffff941bb1a0>] ? kthread_worker_fn+0x5a0/0x5a0 [ 3.287085] [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40 [ 3.287368] [<ffffffff941bb1a0>] ? kthread_worker_fn+0x5a0/0x5a0 [ 3.287697] Object at ffff880254d8bb80, in cache kmalloc-2048 size: 2048 [ 3.288064] Allocated: [ 3.288147] PID = 27 [ 3.288218] [<ffffffff940b27ab>] save_stack_trace+0x2b/0x50 [ 3.288531] [<ffffffff9459f246>] save_stack+0x46/0xd0 [ 3.288806] [<ffffffff9459f4bd>] kasan_kmalloc+0xad/0xe0 [ 3.289098] [<ffffffff9459c07e>] __kmalloc+0x13e/0x250 [ 3.289378] [<ffffffff95078e5a>] scsi_alloc_sdev+0xea/0xcf0 [ 3.289701] [<ffffffff9507de76>] __scsi_scan_target+0xa06/0xdf0 [ 3.290034] [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160 [ 3.290362] [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0 [ 3.290724] [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250 [ 3.291055] [<ffffffff9507efc1>] do_scan_async+0x41/0x450 [ 3.291354] [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610 [ 3.291695] [<ffffffff941a9a84>] process_one_work+0x544/0x12d0 [ 3.292022] [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0 [ 3.292325] [<ffffffff941bb365>] kthread+0x1c5/0x260 [ 3.292594] [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40 [ 3.292886] Freed: [ 3.292945] PID = 27 [ 3.293016] [<ffffffff940b27ab>] save_stack_trace+0x2b/0x50 [ 3.293327] [<ffffffff9459f246>] save_stack+0x46/0xd0 [ 3.293600] [<ffffffff9459fa61>] kasan_slab_free+0x71/0xb0 [ 3.293916] [<ffffffff9459bac2>] kfree+0xa2/0x1f0 [ 3.294168] [<ffffffff9508158a>] scsi_device_dev_release_usercontext+0x50a/0x730 [ 3.294598] [<ffffffff941ace9a>] execute_in_process_context+0xda/0x130 [ 3.294974] [<ffffffff9508107c>] scsi_device_dev_release+0x1c/0x20 [ 3.295322] [<ffffffff94f566f6>] device_release+0x76/0x1e0 [ 3.295626] [<ffffffff94bc2db7>] kobject_release+0x107/0x370 [ 3.295942] [<ffffffff94bc29ce>] kobject_put+0x4e/0xa0 [ 3.296222] [<ffffffff94f56e17>] put_device+0x17/0x20 [ 3.296497] [<ffffffff9505201c>] scsi_device_put+0x7c/0xa0 [ 3.296801] [<ffffffff9507e1bc>] __scsi_scan_target+0xd4c/0xdf0 [ 3.297132] [<ffffffff9507e505>] scsi_scan_channel+0x105/0x160 [ 3.297458] [<ffffffff9507e8a2>] scsi_scan_host_selected+0x212/0x2f0 [ 3.297829] [<ffffffff9507eb3c>] do_scsi_scan_host+0x1bc/0x250 [ 3.298156] [<ffffffff9507efc1>] do_scan_async+0x41/0x450 [ 3.298453] [<ffffffff941c1fee>] async_run_entry_fn+0xfe/0x610 [ 3.298777] [<ffffffff941a9a84>] process_one_work+0x544/0x12d0 [ 3.299105] [<ffffffff941aa8e9>] worker_thread+0xd9/0x12f0 [ 3.299408] [<ffffffff941bb365>] kthread+0x1c5/0x260 [ 3.299676] [<ffffffff956dde9f>] ret_from_fork+0x1f/0x40 [ 3.299967] Memory state around the buggy address: [ 3.300209] ffff880254d8c200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 3.300608] ffff880254d8c280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 3.300986] >ffff880254d8c300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 3.301408] ^ [ 3.301550] ffff880254d8c380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 3.301987] ffff880254d8c400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 3.302396] ================================================================== Cc: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-10-09 05:23:27 +00:00
scsi_device_put(sdev);
return ret;
}
struct scsi_device *__scsi_add_device(struct Scsi_Host *shost, uint channel,
uint id, u64 lun, void *hostdata)
{
struct scsi_device *sdev = ERR_PTR(-ENODEV);
struct device *parent = &shost->shost_gendev;
struct scsi_target *starget;
if (strncmp(scsi_scan_type, "none", 4) == 0)
return ERR_PTR(-ENODEV);
starget = scsi_alloc_target(parent, channel, id);
if (!starget)
return ERR_PTR(-ENOMEM);
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
scsi_autopm_get_target(starget);
mutex_lock(&shost->scan_mutex);
if (!shost->async_scan)
scsi_complete_async_scans();
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
if (scsi_host_scan_allowed(shost) && scsi_autopm_get_host(shost) == 0) {
scsi_probe_and_add_lun(starget, lun, NULL, &sdev,
SCSI_SCAN_RESCAN, hostdata);
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
scsi_autopm_put_host(shost);
}
mutex_unlock(&shost->scan_mutex);
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
scsi_autopm_put_target(starget);
/*
* paired with scsi_alloc_target(). Target will be destroyed unless
* scsi_probe_and_add_lun made an underlying device visible
*/
scsi_target_reap(starget);
put_device(&starget->dev);
return sdev;
}
EXPORT_SYMBOL(__scsi_add_device);
int scsi_add_device(struct Scsi_Host *host, uint channel,
uint target, u64 lun)
{
struct scsi_device *sdev =
__scsi_add_device(host, channel, target, lun, NULL);
if (IS_ERR(sdev))
return PTR_ERR(sdev);
scsi_device_put(sdev);
return 0;
}
EXPORT_SYMBOL(scsi_add_device);
void scsi_rescan_device(struct device *dev)
{
struct scsi_device *sdev = to_scsi_device(dev);
device_lock(dev);
scsi_attach_vpd(sdev);
if (sdev->handler && sdev->handler->rescan)
sdev->handler->rescan(sdev);
if (dev->driver && try_module_get(dev->driver->owner)) {
struct scsi_driver *drv = to_scsi_driver(dev->driver);
if (drv->rescan)
drv->rescan(dev);
module_put(dev->driver->owner);
}
device_unlock(dev);
}
EXPORT_SYMBOL(scsi_rescan_device);
static void __scsi_scan_target(struct device *parent, unsigned int channel,
unsigned int id, u64 lun, enum scsi_scan_mode rescan)
{
struct Scsi_Host *shost = dev_to_shost(parent);
blist_flags_t bflags = 0;
int res;
struct scsi_target *starget;
if (shost->this_id == id)
/*
* Don't scan the host adapter
*/
return;
starget = scsi_alloc_target(parent, channel, id);
if (!starget)
return;
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
scsi_autopm_get_target(starget);
if (lun != SCAN_WILD_CARD) {
/*
* Scan for a specific host/chan/id/lun.
*/
scsi_probe_and_add_lun(starget, lun, NULL, NULL, rescan, NULL);
goto out_reap;
}
/*
* Scan LUN 0, if there is some response, scan further. Ideally, we
* would not configure LUN 0 until all LUNs are scanned.
*/
res = scsi_probe_and_add_lun(starget, 0, &bflags, NULL, rescan, NULL);
if (res == SCSI_SCAN_LUN_PRESENT || res == SCSI_SCAN_TARGET_PRESENT) {
if (scsi_report_lun_scan(starget, bflags, rescan) != 0)
/*
* The REPORT LUN did not scan the target,
* do a sequential scan.
*/
scsi_sequential_lun_scan(starget, bflags,
starget->scsi_level, rescan);
}
out_reap:
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
scsi_autopm_put_target(starget);
/*
* paired with scsi_alloc_target(): determine if the target has
* any children at all and if not, nuke it
*/
scsi_target_reap(starget);
put_device(&starget->dev);
}
/**
* scsi_scan_target - scan a target id, possibly including all LUNs on the target.
* @parent: host to scan
* @channel: channel to scan
* @id: target id to scan
* @lun: Specific LUN to scan or SCAN_WILD_CARD
* @rescan: passed to LUN scanning routines; SCSI_SCAN_INITIAL for
* no rescan, SCSI_SCAN_RESCAN to rescan existing LUNs,
* and SCSI_SCAN_MANUAL to force scanning even if
* 'scan=manual' is set.
*
* Description:
* Scan the target id on @parent, @channel, and @id. Scan at least LUN 0,
* and possibly all LUNs on the target id.
*
* First try a REPORT LUN scan, if that does not scan the target, do a
* sequential scan of LUNs on the target id.
**/
void scsi_scan_target(struct device *parent, unsigned int channel,
unsigned int id, u64 lun, enum scsi_scan_mode rescan)
{
struct Scsi_Host *shost = dev_to_shost(parent);
if (strncmp(scsi_scan_type, "none", 4) == 0)
return;
if (rescan != SCSI_SCAN_MANUAL &&
strncmp(scsi_scan_type, "manual", 6) == 0)
return;
mutex_lock(&shost->scan_mutex);
if (!shost->async_scan)
scsi_complete_async_scans();
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
if (scsi_host_scan_allowed(shost) && scsi_autopm_get_host(shost) == 0) {
__scsi_scan_target(parent, channel, id, lun, rescan);
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
scsi_autopm_put_host(shost);
}
mutex_unlock(&shost->scan_mutex);
}
EXPORT_SYMBOL(scsi_scan_target);
static void scsi_scan_channel(struct Scsi_Host *shost, unsigned int channel,
unsigned int id, u64 lun,
enum scsi_scan_mode rescan)
{
uint order_id;
if (id == SCAN_WILD_CARD)
for (id = 0; id < shost->max_id; ++id) {
/*
* XXX adapter drivers when possible (FCP, iSCSI)
* could modify max_id to match the current max,
* not the absolute max.
*
* XXX add a shost id iterator, so for example,
* the FC ID can be the same as a target id
* without a huge overhead of sparse id's.
*/
if (shost->reverse_ordering)
/*
* Scan from high to low id.
*/
order_id = shost->max_id - id - 1;
else
order_id = id;
__scsi_scan_target(&shost->shost_gendev, channel,
order_id, lun, rescan);
}
else
__scsi_scan_target(&shost->shost_gendev, channel,
id, lun, rescan);
}
int scsi_scan_host_selected(struct Scsi_Host *shost, unsigned int channel,
unsigned int id, u64 lun,
enum scsi_scan_mode rescan)
{
SCSI_LOG_SCAN_BUS(3, shost_printk (KERN_INFO, shost,
"%s: <%u:%u:%llu>\n",
__func__, channel, id, lun));
if (((channel != SCAN_WILD_CARD) && (channel > shost->max_channel)) ||
((id != SCAN_WILD_CARD) && (id >= shost->max_id)) ||
((lun != SCAN_WILD_CARD) && (lun >= shost->max_lun)))
return -EINVAL;
mutex_lock(&shost->scan_mutex);
if (!shost->async_scan)
scsi_complete_async_scans();
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
if (scsi_host_scan_allowed(shost) && scsi_autopm_get_host(shost) == 0) {
if (channel == SCAN_WILD_CARD)
for (channel = 0; channel <= shost->max_channel;
channel++)
scsi_scan_channel(shost, channel, id, lun,
rescan);
else
scsi_scan_channel(shost, channel, id, lun, rescan);
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
scsi_autopm_put_host(shost);
}
mutex_unlock(&shost->scan_mutex);
return 0;
}
static void scsi_sysfs_add_devices(struct Scsi_Host *shost)
{
struct scsi_device *sdev;
shost_for_each_device(sdev, shost) {
/* target removed before the device could be added */
if (sdev->sdev_state == SDEV_DEL)
continue;
/* If device is already visible, skip adding it to sysfs */
if (sdev->is_visible)
continue;
if (!scsi_host_scan_allowed(shost) ||
scsi_sysfs_add_sdev(sdev) != 0)
__scsi_remove_device(sdev);
}
}
/**
* scsi_prep_async_scan - prepare for an async scan
* @shost: the host which will be scanned
* Returns: a cookie to be passed to scsi_finish_async_scan()
*
* Tells the midlayer this host is going to do an asynchronous scan.
* It reserves the host's position in the scanning list and ensures
* that other asynchronous scans started after this one won't affect the
* ordering of the discovered devices.
*/
static struct async_scan_data *scsi_prep_async_scan(struct Scsi_Host *shost)
{
struct async_scan_data *data = NULL;
unsigned long flags;
if (strncmp(scsi_scan_type, "sync", 4) == 0)
return NULL;
mutex_lock(&shost->scan_mutex);
if (shost->async_scan) {
shost_printk(KERN_DEBUG, shost, "%s called twice\n", __func__);
goto err;
}
data = kmalloc(sizeof(*data), GFP_KERNEL);
if (!data)
goto err;
data->shost = scsi_host_get(shost);
if (!data->shost)
goto err;
init_completion(&data->prev_finished);
spin_lock_irqsave(shost->host_lock, flags);
shost->async_scan = 1;
spin_unlock_irqrestore(shost->host_lock, flags);
mutex_unlock(&shost->scan_mutex);
spin_lock(&async_scan_lock);
if (list_empty(&scanning_hosts))
complete(&data->prev_finished);
list_add_tail(&data->list, &scanning_hosts);
spin_unlock(&async_scan_lock);
return data;
err:
mutex_unlock(&shost->scan_mutex);
kfree(data);
return NULL;
}
/**
* scsi_finish_async_scan - asynchronous scan has finished
* @data: cookie returned from earlier call to scsi_prep_async_scan()
*
* All the devices currently attached to this host have been found.
* This function announces all the devices it has found to the rest
* of the system.
*/
static void scsi_finish_async_scan(struct async_scan_data *data)
{
struct Scsi_Host *shost;
unsigned long flags;
if (!data)
return;
shost = data->shost;
mutex_lock(&shost->scan_mutex);
if (!shost->async_scan) {
shost_printk(KERN_INFO, shost, "%s called twice\n", __func__);
dump_stack();
mutex_unlock(&shost->scan_mutex);
return;
}
wait_for_completion(&data->prev_finished);
scsi_sysfs_add_devices(shost);
spin_lock_irqsave(shost->host_lock, flags);
shost->async_scan = 0;
spin_unlock_irqrestore(shost->host_lock, flags);
mutex_unlock(&shost->scan_mutex);
spin_lock(&async_scan_lock);
list_del(&data->list);
if (!list_empty(&scanning_hosts)) {
struct async_scan_data *next = list_entry(scanning_hosts.next,
struct async_scan_data, list);
complete(&next->prev_finished);
}
spin_unlock(&async_scan_lock);
[SCSI] scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost' In do_scan_async(), calling scsi_autopm_put_host(shost) may reference freed shost, and cause Posison overwitten warning. Yes, this case can happen, for example, an USB is disconnected just when do_scan_async() thread starts to run, then scsi_host_put() called in scsi_finish_async_scan() will lead to shost be freed(because the refcount of shost->shost_gendev decreases to 1 after USB disconnects), at this point, if references shost again, system will show following warning msg. To make scsi_autopm_put_host(shost) always reference a valid shost, put it just before scsi_host_put() in function scsi_finish_async_scan(). [ 299.281565] ============================================================================= [ 299.281634] BUG kmalloc-4096 (Tainted: G I ): Poison overwritten [ 299.281682] ----------------------------------------------------------------------------- [ 299.281684] [ 299.281752] INFO: 0xffff880056c305d0-0xffff880056c305d0. First byte 0x6a instead of 0x6b [ 299.281816] INFO: Allocated in scsi_host_alloc+0x4a/0x490 age=1688 cpu=1 pid=2004 [ 299.281870] __slab_alloc+0x617/0x6c1 [ 299.281901] __kmalloc+0x28c/0x2e0 [ 299.281931] scsi_host_alloc+0x4a/0x490 [ 299.281966] usb_stor_probe1+0x5b/0xc40 [usb_storage] [ 299.282010] storage_probe+0xa4/0xe0 [usb_storage] [ 299.282062] usb_probe_interface+0x172/0x330 [usbcore] [ 299.282105] driver_probe_device+0x257/0x3b0 [ 299.282138] __driver_attach+0x103/0x110 [ 299.282171] bus_for_each_dev+0x8e/0xe0 [ 299.282201] driver_attach+0x26/0x30 [ 299.282230] bus_add_driver+0x1c4/0x430 [ 299.282260] driver_register+0xb6/0x230 [ 299.282298] usb_register_driver+0xe5/0x270 [usbcore] [ 299.282337] 0xffffffffa04ab03d [ 299.282364] do_one_initcall+0x47/0x230 [ 299.282396] sys_init_module+0xa0f/0x1fe0 [ 299.282429] INFO: Freed in scsi_host_dev_release+0x18a/0x1d0 age=85 cpu=0 pid=2008 [ 299.282482] __slab_free+0x3c/0x2a1 [ 299.282510] kfree+0x296/0x310 [ 299.282536] scsi_host_dev_release+0x18a/0x1d0 [ 299.282574] device_release+0x74/0x100 [ 299.282606] kobject_release+0xc7/0x2a0 [ 299.282637] kobject_put+0x54/0xa0 [ 299.282668] put_device+0x27/0x40 [ 299.282694] scsi_host_put+0x1d/0x30 [ 299.282723] do_scan_async+0x1fc/0x2b0 [ 299.282753] kthread+0xdf/0xf0 [ 299.282782] kernel_thread_helper+0x4/0x10 [ 299.282817] INFO: Slab 0xffffea00015b0c00 objects=7 used=7 fp=0x (null) flags=0x100000000004080 [ 299.282882] INFO: Object 0xffff880056c30000 @offset=0 fp=0x (null) [ 299.282884] ... Signed-off-by: Huajun Li <huajun.li.lee@gmail.com> Cc: stable@kernel.org Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-12 11:59:14 +00:00
scsi_autopm_put_host(shost);
scsi_host_put(shost);
kfree(data);
}
static void do_scsi_scan_host(struct Scsi_Host *shost)
{
if (shost->hostt->scan_finished) {
unsigned long start = jiffies;
if (shost->hostt->scan_start)
shost->hostt->scan_start(shost);
while (!shost->hostt->scan_finished(shost, jiffies - start))
msleep(10);
} else {
scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
SCAN_WILD_CARD, SCSI_SCAN_INITIAL);
}
}
static void do_scan_async(void *_data, async_cookie_t c)
{
struct async_scan_data *data = _data;
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
struct Scsi_Host *shost = data->shost;
do_scsi_scan_host(shost);
scsi_finish_async_scan(data);
}
/**
* scsi_scan_host - scan the given adapter
* @shost: adapter to scan
**/
void scsi_scan_host(struct Scsi_Host *shost)
{
struct async_scan_data *data;
if (strncmp(scsi_scan_type, "none", 4) == 0 ||
strncmp(scsi_scan_type, "manual", 6) == 0)
return;
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
if (scsi_autopm_get_host(shost) < 0)
return;
data = scsi_prep_async_scan(shost);
if (!data) {
do_scsi_scan_host(shost);
[SCSI] implement runtime Power Management This patch (as1398b) adds runtime PM support to the SCSI layer. Only the machanism is provided; use of it is up to the various high-level drivers, and the patch doesn't change any of them. Except for sg -- the patch expicitly prevents a device from being runtime-suspended while its sg device file is open. The implementation is simplistic. In general, hosts and targets are automatically suspended when all their children are asleep, but for them the runtime-suspend code doesn't actually do anything. (A host's runtime PM status is propagated up the device tree, though, so a runtime-PM-aware lower-level driver could power down the host adapter hardware at the appropriate times.) There are comments indicating where a transport class might be notified or some other hooks added. LUNs are runtime-suspended by calling the drivers' existing suspend handlers (and likewise for runtime-resume). Somewhat arbitrarily, the implementation delays for 100 ms before suspending an eligible LUN. This is because there typically are occasions during bootup when the same device file is opened and closed several times in quick succession. The way this all works is that the SCSI core increments a device's PM-usage count when it is registered. If a high-level driver does nothing then the device will not be eligible for runtime-suspend because of the elevated usage count. If a high-level driver wants to use runtime PM then it can call scsi_autopm_put_device() in its probe routine to decrement the usage count and scsi_autopm_get_device() in its remove routine to restore the original count. Hosts, targets, and LUNs are not suspended while they are being probed or removed, or while the error handler is running. In fact, a fairly large part of the patch consists of code to make sure that things aren't suspended at such times. [jejb: fix up compile issues in PM config variations] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-06-17 14:41:42 +00:00
scsi_autopm_put_host(shost);
return;
}
/* register with the async subsystem so wait_for_device_probe()
* will flush this work
*/
async_schedule(do_scan_async, data);
[SCSI] scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost' In do_scan_async(), calling scsi_autopm_put_host(shost) may reference freed shost, and cause Posison overwitten warning. Yes, this case can happen, for example, an USB is disconnected just when do_scan_async() thread starts to run, then scsi_host_put() called in scsi_finish_async_scan() will lead to shost be freed(because the refcount of shost->shost_gendev decreases to 1 after USB disconnects), at this point, if references shost again, system will show following warning msg. To make scsi_autopm_put_host(shost) always reference a valid shost, put it just before scsi_host_put() in function scsi_finish_async_scan(). [ 299.281565] ============================================================================= [ 299.281634] BUG kmalloc-4096 (Tainted: G I ): Poison overwritten [ 299.281682] ----------------------------------------------------------------------------- [ 299.281684] [ 299.281752] INFO: 0xffff880056c305d0-0xffff880056c305d0. First byte 0x6a instead of 0x6b [ 299.281816] INFO: Allocated in scsi_host_alloc+0x4a/0x490 age=1688 cpu=1 pid=2004 [ 299.281870] __slab_alloc+0x617/0x6c1 [ 299.281901] __kmalloc+0x28c/0x2e0 [ 299.281931] scsi_host_alloc+0x4a/0x490 [ 299.281966] usb_stor_probe1+0x5b/0xc40 [usb_storage] [ 299.282010] storage_probe+0xa4/0xe0 [usb_storage] [ 299.282062] usb_probe_interface+0x172/0x330 [usbcore] [ 299.282105] driver_probe_device+0x257/0x3b0 [ 299.282138] __driver_attach+0x103/0x110 [ 299.282171] bus_for_each_dev+0x8e/0xe0 [ 299.282201] driver_attach+0x26/0x30 [ 299.282230] bus_add_driver+0x1c4/0x430 [ 299.282260] driver_register+0xb6/0x230 [ 299.282298] usb_register_driver+0xe5/0x270 [usbcore] [ 299.282337] 0xffffffffa04ab03d [ 299.282364] do_one_initcall+0x47/0x230 [ 299.282396] sys_init_module+0xa0f/0x1fe0 [ 299.282429] INFO: Freed in scsi_host_dev_release+0x18a/0x1d0 age=85 cpu=0 pid=2008 [ 299.282482] __slab_free+0x3c/0x2a1 [ 299.282510] kfree+0x296/0x310 [ 299.282536] scsi_host_dev_release+0x18a/0x1d0 [ 299.282574] device_release+0x74/0x100 [ 299.282606] kobject_release+0xc7/0x2a0 [ 299.282637] kobject_put+0x54/0xa0 [ 299.282668] put_device+0x27/0x40 [ 299.282694] scsi_host_put+0x1d/0x30 [ 299.282723] do_scan_async+0x1fc/0x2b0 [ 299.282753] kthread+0xdf/0xf0 [ 299.282782] kernel_thread_helper+0x4/0x10 [ 299.282817] INFO: Slab 0xffffea00015b0c00 objects=7 used=7 fp=0x (null) flags=0x100000000004080 [ 299.282882] INFO: Object 0xffff880056c30000 @offset=0 fp=0x (null) [ 299.282884] ... Signed-off-by: Huajun Li <huajun.li.lee@gmail.com> Cc: stable@kernel.org Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-02-12 11:59:14 +00:00
/* scsi_autopm_put_host(shost) is called in scsi_finish_async_scan() */
}
EXPORT_SYMBOL(scsi_scan_host);
void scsi_forget_host(struct Scsi_Host *shost)
{
struct scsi_device *sdev;
unsigned long flags;
restart:
spin_lock_irqsave(shost->host_lock, flags);
list_for_each_entry(sdev, &shost->__devices, siblings) {
if (sdev->sdev_state == SDEV_DEL)
continue;
spin_unlock_irqrestore(shost->host_lock, flags);
__scsi_remove_device(sdev);
goto restart;
}
spin_unlock_irqrestore(shost->host_lock, flags);
}