linux-stable/include/scsi/scsi_device.h


License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. 
- Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). 
- when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. 
Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _SCSI_SCSI_DEVICE_H
#define _SCSI_SCSI_DEVICE_H
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>
#include <linux/blk-mq.h>
#include <scsi/scsi.h>
#include <linux/atomic.h>
scsi: core: Replace sdev->device_busy with sbitmap SCSI currently uses an atomic variable to track queue depth for each attached device. The queue depth depends on many factors such as transport type and device implementation. In addition, the SCSI device queue depth is not a static entity but changes over time as a result of congestion management. While blk-mq currently tracks queue depth for each hctx, it can't easily be changed to accommodate the SCSI per-device requirement. The current approach of using an atomic variable doesn't scale well when there are lots of CPU cores and the disk is very fast. IOPS can be substantially impacted by the atomic in the hot path. Replace the atomic variable sdev->device_busy with an sbitmap for tracking the SCSI device queue depth. It has been observed that IOPS is improved ~30% by this patchset in the following test: 1) test machine(32 logical CPU cores) Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 2 NUMA node(s): 2 Model name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz 2) setup scsi_debug: modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256 3) fio script: fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \ --numjobs=32 --bs=4k --group_reporting=1 --group_reporting=1 --runtime=60 \ --loops=10000 --name=job1 --filename=/dev/sdN [mkp: fix device_busy reference in mpt3sas] Link: https://lore.kernel.org/r/20210122023317.687987-14-ming.lei@redhat.com Link: https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/ Cc: Omar Sandoval <osandov@fb.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Cc: Ewan D. Milne <emilne@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 02:33:17 +00:00
#include <linux/sbitmap.h>
struct bsg_device;
struct device;
struct request_queue;
struct scsi_cmnd;
struct scsi_lun;
struct scsi_sense_hdr;
typedef __u64 __bitwise blist_flags_t;
#define SCSI_SENSE_BUFFERSIZE 96
struct scsi_mode_data {
__u32 length;
__u16 block_descriptor_length;
__u8 medium_type;
__u8 device_specific;
__u8 header_length;
__u8 longlba:1;
};
/*
* sdev state: If you alter this, you also need to alter scsi_sysfs.c
* (for the ascii descriptions) and the state model enforcer:
* scsi_lib:scsi_device_set_state().
*/
enum scsi_device_state {
SDEV_CREATED = 1, /* device created but not added to sysfs
* Only internal commands allowed (for inq) */
SDEV_RUNNING, /* device properly configured
* All commands allowed */
SDEV_CANCEL, /* beginning to delete device
* Only error handler commands allowed */
SDEV_DEL, /* device deleted
* no commands allowed */
SDEV_QUIESCE, /* Device quiescent. No block commands
* will be accepted, only specials (which
* originate in the mid-layer) */
SDEV_OFFLINE, /* Device offlined (by error handling or
* user request) */
SDEV_TRANSPORT_OFFLINE, /* Offlined by transport class error handler */
SDEV_BLOCK, /* Device blocked by scsi lld. No
* scsi commands from user or midlayer
* should be issued to the scsi
* lld. */
SDEV_CREATED_BLOCK, /* same as above but for created devices */
};
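The comments on `enum scsi_device_state` describe which commands each state admits. A minimal user-space sketch of how such a state model can be queried follows. It keeps a local copy of the enum so that it compiles on its own. The `demo_*` helpers and the label strings are hypothetical and exist only for this illustration; they are not the kernel's sysfs names or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local copy of the enum from the header, so the sketch is self-contained. */
enum scsi_device_state {
	SDEV_CREATED = 1,
	SDEV_RUNNING,
	SDEV_CANCEL,
	SDEV_DEL,
	SDEV_QUIESCE,
	SDEV_OFFLINE,
	SDEV_TRANSPORT_OFFLINE,
	SDEV_BLOCK,
	SDEV_CREATED_BLOCK,
};

/* The enum starts at 1, so 0 can serve as "no state" in a lookup. */
static_assert(SDEV_CREATED == 1, "state numbering starts at 1");

/* Hypothetical helper: labels invented for this sketch only. */
static const char *demo_state_label(enum scsi_device_state state)
{
	switch (state) {
	case SDEV_CREATED:           return "created";
	case SDEV_RUNNING:           return "running";
	case SDEV_CANCEL:            return "cancel";
	case SDEV_DEL:               return "deleted";
	case SDEV_QUIESCE:           return "quiesce";
	case SDEV_OFFLINE:           return "offline";
	case SDEV_TRANSPORT_OFFLINE: return "transport-offline";
	case SDEV_BLOCK:             return "blocked";
	case SDEV_CREATED_BLOCK:     return "created-blocked";
	}
	return NULL; /* value outside the enum */
}

/* Per the header comments, only SDEV_RUNNING allows all commands. */
static bool demo_all_commands_allowed(enum scsi_device_state state)
{
	return state == SDEV_RUNNING;
}

/* Per the header comments, these two states mean "blocked by the LLD". */
static bool demo_state_is_blocked(enum scsi_device_state state)
{
	return state == SDEV_BLOCK || state == SDEV_CREATED_BLOCK;
}
```

Keeping the predicates next to a copy of the enum mirrors the warning in the header comment: any change to the state list has to be repeated wherever the states are described or enforced.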
enum scsi_scan_mode {
SCSI_SCAN_INITIAL = 0,
SCSI_SCAN_RESCAN,
SCSI_SCAN_MANUAL,
};
enum scsi_device_event {
SDEV_EVT_MEDIA_CHANGE = 1, /* media has changed */
SDEV_EVT_INQUIRY_CHANGE_REPORTED, /* 3F 03 UA reported */
SDEV_EVT_CAPACITY_CHANGE_REPORTED, /* 2A 09 UA reported */
SDEV_EVT_SOFT_THRESHOLD_REACHED_REPORTED, /* 38 07 UA reported */
SDEV_EVT_MODE_PARAMETER_CHANGE_REPORTED, /* 2A 01 UA reported */
SDEV_EVT_LUN_CHANGE_REPORTED, /* 3F 0E UA reported */
SDEV_EVT_ALUA_STATE_CHANGE_REPORTED, /* 2A 06 UA reported */
SDEV_EVT_POWER_ON_RESET_OCCURRED, /* 29 00 UA reported */
SDEV_EVT_FIRST = SDEV_EVT_MEDIA_CHANGE,
SDEV_EVT_LAST = SDEV_EVT_POWER_ON_RESET_OCCURRED,
SDEV_EVT_MAXBITS = SDEV_EVT_LAST + 1
};
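`SDEV_EVT_FIRST`, `SDEV_EVT_LAST` and `SDEV_EVT_MAXBITS` exist so that event values can index a bitmap and be iterated as a range. The following user-space sketch shows that pattern with a plain 32-bit word. The `demo_event_*` names and the `struct demo_event_mask` type are assumptions made for this example, not kernel interfaces. Because the first event is 1, bit 0 of the mask stays unused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the enum from the header, so the sketch is self-contained. */
enum scsi_device_event {
	SDEV_EVT_MEDIA_CHANGE = 1,
	SDEV_EVT_INQUIRY_CHANGE_REPORTED,
	SDEV_EVT_CAPACITY_CHANGE_REPORTED,
	SDEV_EVT_SOFT_THRESHOLD_REACHED_REPORTED,
	SDEV_EVT_MODE_PARAMETER_CHANGE_REPORTED,
	SDEV_EVT_LUN_CHANGE_REPORTED,
	SDEV_EVT_ALUA_STATE_CHANGE_REPORTED,
	SDEV_EVT_POWER_ON_RESET_OCCURRED,

	SDEV_EVT_FIRST = SDEV_EVT_MEDIA_CHANGE,
	SDEV_EVT_LAST = SDEV_EVT_POWER_ON_RESET_OCCURRED,

	SDEV_EVT_MAXBITS = SDEV_EVT_LAST + 1
};

/* The sketch stores the mask in one word; check that every event fits. */
static_assert(SDEV_EVT_MAXBITS <= 32, "event mask must fit in 32 bits");

/* Hypothetical container for the set of enabled events. */
struct demo_event_mask {
	uint32_t bits;
};

/* Mark one event as enabled; the event value is used as the bit index. */
static void demo_event_enable(struct demo_event_mask *mask,
			      enum scsi_device_event evt)
{
	assert(evt >= SDEV_EVT_FIRST && evt <= SDEV_EVT_LAST);
	mask->bits |= UINT32_C(1) << evt;
}

/* Report whether an event is enabled; out-of-range values are never set. */
static bool demo_event_enabled(const struct demo_event_mask *mask, int evt)
{
	if (evt < SDEV_EVT_FIRST || evt > SDEV_EVT_LAST)
		return false;
	return (mask->bits >> evt) & 1u;
}

/* Walk the FIRST..LAST range and count the enabled events. */
static int demo_event_count(const struct demo_event_mask *mask)
{
	int count = 0;

	for (int evt = SDEV_EVT_FIRST; evt <= SDEV_EVT_LAST; evt++)
		if (demo_event_enabled(mask, evt))
			count++;
	return count;
}
```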
struct scsi_event {
enum scsi_device_event evt_type;
struct list_head node;
/* put union of data structures, for non-simple event types,
* here
*/
};
/**
* struct scsi_vpd - SCSI Vital Product Data
* @rcu: For kfree_rcu().
* @len: Length in bytes of @data.
* @data: VPD data as defined in various T10 SCSI standard documents.
*/
struct scsi_vpd {
struct rcu_head rcu;
int len;
unsigned char data[];
};
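`struct scsi_vpd` ends in a flexible array member, so the header and the page bytes live in a single allocation sized by `len`. Below is a minimal user-space sketch of that layout. It assumes a stand-in `struct demo_vpd` without the `rcu_head` member and uses `malloc()` in place of kernel allocators; `demo_vpd_alloc()` and `demo_vpd_page_code()` are hypothetical helpers written for this illustration. The page-code lookup relies on the standard VPD page layout, in which byte 1 holds the page code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for struct scsi_vpd (no rcu_head in this sketch). */
struct demo_vpd {
	int len;              /* length in bytes of data[] */
	unsigned char data[]; /* VPD page bytes follow the header */
};

/*
 * Allocate header and payload in one block and copy the page bytes in.
 * Returns NULL on allocation failure; the caller releases it with free().
 */
static struct demo_vpd *demo_vpd_alloc(const unsigned char *src, int len)
{
	struct demo_vpd *vpd;

	assert(len >= 0 && (src != NULL || len == 0));

	vpd = malloc(sizeof(*vpd) + (size_t)len);
	if (!vpd)
		return NULL;

	vpd->len = len;
	if (len > 0)
		memcpy(vpd->data, src, (size_t)len);
	return vpd;
}

/* Return the page code (byte 1), or -1 if the buffer is too short. */
static int demo_vpd_page_code(const struct demo_vpd *vpd)
{
	return vpd->len >= 2 ? vpd->data[1] : -1;
}
```

A single allocation keeps the length and the data together, which is what lets the kernel version be released through one `kfree_rcu()` call, as the `@rcu` comment indicates.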
struct scsi_device {
struct Scsi_Host *host;
struct request_queue *request_queue;
/* the next two are protected by the host->host_lock */
struct list_head siblings; /* list of all devices on this host */
struct list_head same_target_siblings; /* just the devices sharing same target id */
struct sbitmap budget_map;
atomic_t device_blocked; /* Device returned QUEUE_FULL. */
atomic_t restarts;
spinlock_t list_lock;
struct list_head starved_entry;
unsigned short queue_depth; /* How deep of a queue we want */
[SCSI] add queue_depth ramp up code Current FC HBA queue_depth ramp up code depends on last queue full time. The sdev already has last_queue_full_time field to track last queue full time but stored value is truncated by last four bits. So this patch updates last_queue_full_time without truncating last 4 bits to store full value and then updates its only current usages in scsi_track_queue_full to ignore last four bits to keep current usages same while also use this field in added ramp up code. Adds scsi_handle_queue_ramp_up to ramp up queue_depth on successful completion of IO. The scsi_handle_queue_ramp_up will do ramp up on all luns of a target, just same as ramp down done on all luns on a target. The ramp up is skipped in case the change_queue_depth is not supported by LLD or already reached to added max_queue_depth. Updates added max_queue_depth on every new update to default queue_depth value. The ramp up is also skipped if lapsed time since either last queue ramp up or down is less than LLD specified queue_ramp_up_period. Adds queue_ramp_up_period to sysfs but only if change_queue_depth is supported since ramp up and queue_ramp_up_period is needed only in case change_queue_depth is supported first. Initializes queue_ramp_up_period to 120HZ jiffies as initial default value, it is same as used in existing lpfc and qla2xxx. -v2 Combined all ramp code into this single patch. -v3 Moves max_queue_depth initialization after slave_configure is called from after slave_alloc calling done. Also adjusted max_queue_depth check to skip ramp up if current queue_depth is >= max_queue_depth. -v4 Changes sdev->queue_ramp_up_period unit to ms when using sysfs i/f to store or show its value. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Christof Schmitt <christof.schmitt@de.ibm.com> Tested-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 22:46:33 +00:00
unsigned short max_queue_depth; /* max queue depth */
unsigned short last_queue_full_depth; /* These two are used by */
unsigned short last_queue_full_count; /* scsi_track_queue_full() */
unsigned long last_queue_full_time; /* last queue full time */
unsigned long queue_ramp_up_period; /* ramp up period in jiffies */
#define SCSI_DEFAULT_RAMP_UP_PERIOD (120 * HZ)
unsigned long last_queue_ramp_up; /* last queue ramp up time */
unsigned int id, channel;
u64 lun;
unsigned int manufacturer; /* Manufacturer of device, for using
* vendor-specific cmd's */
unsigned sector_size; /* size in bytes */
void *hostdata; /* available to low-level driver */
unsigned char type;
char scsi_level;
char inq_periph_qual; /* PQ from INQUIRY data */
struct mutex inquiry_mutex;
unsigned char inquiry_len; /* valid bytes in 'inquiry' */
unsigned char * inquiry; /* INQUIRY response data */
const char * vendor; /* [back_compat] point into 'inquiry' ... */
const char * model; /* ... after scan; point to static string */
const char * rev; /* ... "nullnullnullnull" before scan */
#define SCSI_DEFAULT_VPD_LEN 255 /* default SCSI VPD page size (max) */
struct scsi_vpd __rcu *vpd_pg0;
struct scsi_vpd __rcu *vpd_pg83;
struct scsi_vpd __rcu *vpd_pg80;
struct scsi_vpd __rcu *vpd_pg89;
struct scsi_vpd __rcu *vpd_pgb0;
struct scsi_vpd __rcu *vpd_pgb1;
struct scsi_vpd __rcu *vpd_pgb2;
struct scsi_vpd __rcu *vpd_pgb7;
struct scsi_target *sdev_target;
blist_flags_t sdev_bflags; /* black/white flags as also found in
* scsi_devinfo.[hc]. For now used only to
* pass settings from slave_alloc to scsi
* core. */
unsigned int eh_timeout; /* Error handling timeout */
scsi: sd: Differentiate system and runtime start/stop management The underlying device and driver of a SCSI disk may have different system and runtime power mode control requirements. This is because runtime power management affects only the SCSI disk, while system level power management affects all devices, including the controller for the SCSI disk. For instance, issuing a START STOP UNIT command when a SCSI disk is runtime suspended and resumed is fine: the command is translated to a STANDBY IMMEDIATE command to spin down the ATA disk and to a VERIFY command to wake it up. The SCSI disk runtime operations have no effect on the ATA port device used to connect the ATA disk. However, for system suspend/resume operations, the ATA port used to connect the device will also be suspended and resumed, with the resume operation requiring re-validating the device link and the device itself. In this case, issuing a VERIFY command to spin up the disk must be done before starting to revalidate the device, when the ATA port is being resumed. In such a case, we must not allow the SCSI disk driver to issue START STOP UNIT commands. Allow a low level driver to refine the SCSI disk start/stop management by differentiating system and runtime cases with two new SCSI device flags: manage_system_start_stop and manage_runtime_start_stop. These new flags replace the current manage_start_stop flag. Drivers setting the manage_start_stop flag are modified to set both new flags, thus preserving the existing start/stop management behavior. For backward compatibility, the old manage_start_stop sysfs device attribute is kept as a read-only attribute showing a value of 1 for devices enabling both new flags and 0 otherwise. Fixes: 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-15 01:02:41 +00:00
scsi: sd: Introduce manage_shutdown device flag Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") change setting the manage_system_start_stop flag to false for libata managed disks to enable libata internal management of disk suspend/resume. However, a side effect of this change is that on system shutdown, disks are no longer being stopped (set to standby mode with the heads unloaded). While this is not a critical issue, this unclean shutdown is not recommended and shows up with increased smart counters (e.g. the unexpected power loss counter "Unexpect_Power_Loss_Ct"). Instead of defining a shutdown driver method for all ATA adapter drivers (not all of them define that operation), this patch resolves this issue by further refining the sd driver start/stop control of disks using the new flag manage_shutdown. If this new flag is set to true by a low level driver, the function sd_shutdown() will issue a START STOP UNIT command with the start argument set to 0 when a disk needs to be powered off (suspended) on system power off, that is, when system_state is equal to SYSTEM_POWER_OFF. Similarly to the other manage_xxx flags, the new manage_shutdown flag is exposed through sysfs as a read-write device attribute. To avoid any confusion between manage_shutdown and manage_system_start_stop, the comments describing these flags in include/scsi/scsi.h are also improved. Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") Cc: stable@vger.kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218038 Link: https://lore.kernel.org/all/cd397c88-bf53-4768-9ab8-9d107df9e613@gmail.com/ Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-25 06:46:12 +00:00
/*
* If true, let the high-level device driver (sd) manage the device
* power state for system suspend/resume (suspend to RAM and
* hibernation) operations.
*/
unsigned manage_system_start_stop:1;
/*
* If true, let the high-level device driver (sd) manage the device
* power state for runtime device suspend and resume operations.
*/
unsigned manage_runtime_start_stop:1;
/*
* If true, let the high-level device driver (sd) manage the device
* power state for system shutdown (power off) operations.
*/
unsigned manage_shutdown:1;
/*
* If set and if the device is runtime suspended, ask the high-level
* device driver (sd) to force a runtime resume of the device.
*/
unsigned force_runtime_start_on_system_start:1;
unsigned removable:1;
unsigned changed:1; /* Data invalid due to media change */
unsigned busy:1; /* Used to prevent races */
unsigned lockable:1; /* Able to prevent media removal */
unsigned locked:1; /* Media removal disabled */
unsigned borken:1; /* Tell the Seagate driver to be
* painfully slow on this device */
unsigned disconnect:1; /* can disconnect */
unsigned soft_reset:1; /* Uses soft reset option */
unsigned sdtr:1; /* Device supports SDTR messages */
unsigned wdtr:1; /* Device supports WDTR messages */
unsigned ppr:1; /* Device supports PPR messages */
unsigned tagged_supported:1; /* Supports SCSI-II tagged queuing */
unsigned simple_tags:1; /* simple queue tag messages are enabled */
unsigned was_reset:1; /* There was a bus reset on the bus for
* this device */
unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
* because we did a bus reset. */
unsigned use_10_for_rw:1; /* first try 10-byte read / write */
unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
unsigned read_before_ms:1; /* perform a READ before MODE SENSE */
unsigned no_report_opcodes:1; /* no REPORT SUPPORTED OPERATION CODES */
unsigned no_write_same:1; /* no WRITE SAME command */
unsigned use_16_for_rw:1; /* Use read/write(16) over read/write(10) */
unsigned use_16_for_sync:1; /* Use sync (16) over sync (10) */
unsigned skip_ms_page_8:1; /* do not use MODE SENSE page 0x08 */
unsigned skip_ms_page_3f:1; /* do not use MODE SENSE page 0x3f */
unsigned skip_vpd_pages:1; /* do not read VPD pages */
unsigned try_vpd_pages:1; /* attempt to read VPD pages */
unsigned use_192_bytes_for_3f:1; /* ask for 192 bytes from page 0x3f */
unsigned no_start_on_add:1; /* do not issue start on add */
unsigned allow_restart:1; /* issue START_UNIT in error handler */
unsigned start_stop_pwr_cond:1; /* Set power cond. in START_STOP_UNIT */
unsigned no_uld_attach:1; /* disable connecting to upper level drivers */
unsigned select_no_atn:1;
unsigned fix_capacity:1; /* READ_CAPACITY is too high by 1 */
unsigned guess_capacity:1; /* READ_CAPACITY might be too high by 1 */
unsigned retry_hwerror:1; /* Retry HARDWARE_ERROR */
unsigned last_sector_bug:1; /* do not use multisector accesses on
SD_LAST_BUGGY_SECTORS */
unsigned no_read_disc_info:1; /* Avoid READ_DISC_INFO cmds */
unsigned no_read_capacity_16:1; /* Avoid READ_CAPACITY_16 cmds */
unsigned try_rc_10_first:1; /* Try READ_CAPACITY_10 first */
unsigned security_supported:1; /* Supports Security Protocols */
unsigned is_visible:1; /* is the device visible in sysfs */
unsigned wce_default_on:1; /* Cache is ON by default */
unsigned no_dif:1; /* T10 PI (DIF) should be disabled */
unsigned broken_fua:1; /* Don't set FUA bit */
unsigned lun_in_cdb:1; /* Store LUN bits in CDB[1] */
unsigned unmap_limit_for_ws:1; /* Use the UNMAP limit for WRITE SAME */
unsigned rpm_autosuspend:1; /* Enable runtime autosuspend at device
* creation time */
unsigned ignore_media_change:1; /* Ignore MEDIA CHANGE on resume */
unsigned silence_suspend:1; /* Do not print runtime PM related messages */
unsigned no_vpd_size:1; /* No VPD size reported in header */
unsigned cdl_supported:1; /* Command duration limits supported */
unsigned cdl_enable:1; /* Enable/disable Command duration limits */
unsigned int queue_stopped; /* request queue is quiesced */
bool offline_already; /* Device offline message logged */
atomic_t disk_events_disable_depth; /* disable depth for disk events */
DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */
DECLARE_BITMAP(pending_events, SDEV_EVT_MAXBITS); /* pending events */
struct list_head event_list; /* asserted events */
struct work_struct event_work;
unsigned int max_device_blocked; /* what device_blocked counts down from */
#define SCSI_DEFAULT_DEVICE_BLOCKED 3
atomic_t iorequest_cnt;
atomic_t iodone_cnt;
atomic_t ioerr_cnt;
atomic_t iotmo_cnt;
struct device sdev_gendev,
sdev_dev;
struct work_struct requeue_work;
struct scsi_device_handler *handler;
void *handler_data;
size_t dma_drain_len;
void *dma_drain_buf;
unsigned int sg_timeout;
unsigned int sg_reserved_size;
struct bsg_device *bsg_dev;
unsigned char access_state;
struct mutex state_mutex;
enum scsi_device_state sdev_state;
struct task_struct *quiesced_by;
unsigned long sdev_data[];
} __attribute__((aligned(sizeof(unsigned long))));
#define to_scsi_device(d) \
container_of(d, struct scsi_device, sdev_gendev)
#define class_to_sdev(d) \
container_of(d, struct scsi_device, sdev_dev)
#define transport_class_to_sdev(class_dev) \
to_scsi_device(class_dev->parent)
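/*
 * Illustration only, not part of this header: a minimal userspace sketch of
 * how the container_of()-based macros above recover the enclosing
 * scsi_device from a pointer to one of its embedded devices. All model_*
 * names and the trimmed-down structs are hypothetical stand-ins, and
 * model_container_of() is a simplified version of the kernel macro without
 * its type checking.
 */

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(); assumption, no type checks. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical reduced versions of struct device and struct scsi_device. */
struct model_device {
	struct model_device *parent;
};

struct model_scsi_device {
	unsigned int id;
	struct model_device sdev_gendev;	/* generic device */
	struct model_device sdev_dev;		/* class device */
};

/* Same shape as to_scsi_device(), class_to_sdev(), transport_class_to_sdev(). */
#define model_to_scsi_device(d) \
	model_container_of(d, struct model_scsi_device, sdev_gendev)
#define model_class_to_sdev(d) \
	model_container_of(d, struct model_scsi_device, sdev_dev)
#define model_transport_class_to_sdev(class_dev) \
	model_to_scsi_device((class_dev)->parent)
```

/*
 * The transport-class variant only works when the class device's parent is
 * the sdev_gendev member, since it subtracts that member's offset.
 */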
#define sdev_dbg(sdev, fmt, a...) \
dev_dbg(&(sdev)->sdev_gendev, fmt, ##a)
/*
* like scmd_printk, but the device name is passed in
* as a string pointer
*/
__printf(4, 5) void
sdev_prefix_printk(const char *, const struct scsi_device *, const char *,
const char *, ...);
#define sdev_printk(l, sdev, fmt, a...) \
sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
__printf(3, 4) void
scmd_printk(const char *, const struct scsi_cmnd *, const char *, ...);
#define scmd_dbg(scmd, fmt, a...) \
do { \
struct request *__rq = scsi_cmd_to_rq((scmd)); \
\
if (__rq->q->disk) \
sdev_dbg((scmd)->device, "[%s] " fmt, \
__rq->q->disk->disk_name, ##a); \
else \
sdev_dbg((scmd)->device, fmt, ##a); \
} while (0)
enum scsi_target_state {
STARGET_CREATED = 1,
STARGET_RUNNING,
STARGET_REMOVE,
STARGET_CREATED_REMOVE,
STARGET_DEL,
};
/*
* scsi_target: representation of a scsi target, for now, this is only
* used for single_lun devices. If no one has active IO to the target,
* starget_sdev_user is NULL, else it points to the active sdev.
*/
struct scsi_target {
struct scsi_device *starget_sdev_user;
struct list_head siblings;
struct list_head devices;
struct device dev;
struct kref reap_ref; /* last put renders target invisible */
unsigned int channel;
unsigned int id; /* target id ... replace
* scsi_device.id eventually */
unsigned int create:1; /* signal that it needs to be added */
unsigned int single_lun:1; /* Indicates we should only
* allow I/O to one of the luns
* for the device at a time. */
unsigned int pdt_1f_for_no_lun:1; /* PDT = 0x1f
* means no lun present. */
unsigned int no_report_luns:1; /* Don't use
* REPORT LUNS for scanning. */
unsigned int expecting_lun_change:1; /* A device has reported
* a 3F/0E UA, other devices on
* the same target will also. */
/* commands actually active on LLD. */
atomic_t target_busy;
atomic_t target_blocked;
/*
* LLDs should set this in the slave_alloc host template callout.
* If set to zero then there is no limit.
*/
unsigned int can_queue;
unsigned int max_target_blocked;
#define SCSI_DEFAULT_TARGET_BLOCKED 3
char scsi_level;
enum scsi_target_state state;
void *hostdata; /* available to low-level driver */
unsigned long starget_data[]; /* for the transport */
/* starget_data must be the last element!!!! */
} __attribute__((aligned(sizeof(unsigned long))));
#define to_scsi_target(d) container_of(d, struct scsi_target, dev)
static inline struct scsi_target *scsi_target(struct scsi_device *sdev)
{
return to_scsi_target(sdev->sdev_gendev.parent);
}
#define transport_class_to_starget(class_dev) \
to_scsi_target(class_dev->parent)
#define starget_printk(prefix, starget, fmt, a...) \
dev_printk(prefix, &(starget)->dev, fmt, ##a)
extern struct scsi_device *__scsi_add_device(struct Scsi_Host *,
uint, uint, u64, void *hostdata);
extern int scsi_add_device(struct Scsi_Host *host, uint channel,
uint target, u64 lun);
extern int scsi_register_device_handler(struct scsi_device_handler *scsi_dh);
extern void scsi_remove_device(struct scsi_device *);
extern int scsi_unregister_device_handler(struct scsi_device_handler *scsi_dh);
void scsi_attach_vpd(struct scsi_device *sdev);
void scsi_cdl_check(struct scsi_device *sdev);
int scsi_cdl_enable(struct scsi_device *sdev, bool enable);
extern struct scsi_device *scsi_device_from_queue(struct request_queue *q);
extern int __must_check scsi_device_get(struct scsi_device *);
extern void scsi_device_put(struct scsi_device *);
extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *,
uint, uint, u64);
extern struct scsi_device *__scsi_device_lookup(struct Scsi_Host *,
uint, uint, u64);
extern struct scsi_device *scsi_device_lookup_by_target(struct scsi_target *,
u64);
extern struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *,
u64);
extern void starget_for_each_device(struct scsi_target *, void *,
void (*fn)(struct scsi_device *, void *));
extern void __starget_for_each_device(struct scsi_target *, void *,
void (*fn)(struct scsi_device *,
void *));
/* only exposed to implement shost_for_each_device */
extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *,
struct scsi_device *);
/**
* shost_for_each_device - iterate over all devices of a host
* @sdev: the &struct scsi_device to use as a cursor
* @shost: the &struct Scsi_Host to iterate over
*
* Iterator that returns each device attached to @shost. This loop
* takes a reference on each device and releases it at the end. If
* you break out of the loop, you must call scsi_device_put(sdev).
*/
#define shost_for_each_device(sdev, shost) \
for ((sdev) = __scsi_iterate_devices((shost), NULL); \
(sdev); \
(sdev) = __scsi_iterate_devices((shost), (sdev)))
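/*
 * Illustration only, not part of this header: a userspace sketch of the
 * reference handling described for shost_for_each_device(). It assumes the
 * iterator step takes a reference on the next device and drops the one held
 * on the previous device; the real __scsi_iterate_devices() is not shown
 * here and also has to deal with locking and devices being removed, which
 * this model ignores. All model_* names are hypothetical.
 */

```c
#include <stddef.h>

struct model_sdev {
	int refcount;
	struct model_sdev *next;
};

struct model_host {
	struct model_sdev *first;
};

static struct model_sdev *model_get(struct model_sdev *sdev)
{
	if (sdev)
		sdev->refcount++;
	return sdev;
}

static void model_put(struct model_sdev *sdev)
{
	if (sdev)
		sdev->refcount--;
}

/* One iterator step: reference the next device, release the previous one. */
static struct model_sdev *model_iterate(struct model_host *shost,
					struct model_sdev *prev)
{
	struct model_sdev *next = prev ? prev->next : shost->first;

	model_get(next);
	model_put(prev);
	return next;
}

#define model_for_each_device(sdev, shost) \
	for ((sdev) = model_iterate((shost), NULL); \
	     (sdev); \
	     (sdev) = model_iterate((shost), (sdev)))

/* Running the loop to completion leaves every reference count balanced. */
static int model_count(struct model_host *shost)
{
	struct model_sdev *sdev;
	int n = 0;

	model_for_each_device(sdev, shost)
		n++;
	return n;
}

/* Breaking out early returns with a reference held; the caller must put it. */
static struct model_sdev *model_find_nth(struct model_host *shost, int n)
{
	struct model_sdev *sdev;
	int i = 0;

	model_for_each_device(sdev, shost) {
		if (i++ == n)
			break;
	}
	return sdev;
}
```

/*
 * In this model, a caller of model_find_nth() that gets a non-NULL result
 * owns one reference and releases it with model_put(), mirroring the
 * scsi_device_put() requirement stated in the comment above.
 */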
/**
* __shost_for_each_device - iterate over all devices of a host (UNLOCKED)
* @sdev: the &struct scsi_device to use as a cursor
* @shost: the &struct Scsi_Host to iterate over
*
* Iterator that returns each device attached to @shost. It does _not_
* take a reference on the scsi_device, so the whole loop must be
* protected by shost->host_lock.
*
* Note: The only reason to use this is because you need to access the
* device list in interrupt context. Otherwise you really want to use
* shost_for_each_device instead.
*/
#define __shost_for_each_device(sdev, shost) \
list_for_each_entry((sdev), &((shost)->__devices), siblings)
extern int scsi_change_queue_depth(struct scsi_device *, int);
extern int scsi_track_queue_full(struct scsi_device *, int);
extern int scsi_set_medium_removal(struct scsi_device *, char);
int scsi_mode_sense(struct scsi_device *sdev, int dbd, int modepage,
int subpage, unsigned char *buffer, int len, int timeout,
int retries, struct scsi_mode_data *data,
struct scsi_sense_hdr *);
extern int scsi_mode_select(struct scsi_device *sdev, int pf, int sp,
unsigned char *buffer, int len, int timeout,
int retries, struct scsi_mode_data *data,
struct scsi_sense_hdr *);
extern int scsi_test_unit_ready(struct scsi_device *sdev, int timeout,
int retries, struct scsi_sense_hdr *sshdr);
extern int scsi_get_vpd_page(struct scsi_device *, u8 page, unsigned char *buf,
int buf_len);
int scsi_report_opcode(struct scsi_device *sdev, unsigned char *buffer,
unsigned int len, unsigned char opcode,
unsigned short sa);
extern int scsi_device_set_state(struct scsi_device *sdev,
enum scsi_device_state state);
extern struct scsi_event *sdev_evt_alloc(enum scsi_device_event evt_type,
gfp_t gfpflags);
extern void sdev_evt_send(struct scsi_device *sdev, struct scsi_event *evt);
extern void sdev_evt_send_simple(struct scsi_device *sdev,
enum scsi_device_event evt_type, gfp_t gfpflags);
extern int scsi_device_quiesce(struct scsi_device *sdev);
extern void scsi_device_resume(struct scsi_device *sdev);
extern void scsi_target_quiesce(struct scsi_target *);
extern void scsi_target_resume(struct scsi_target *);
extern void scsi_scan_target(struct device *parent, unsigned int channel,
unsigned int id, u64 lun,
enum scsi_scan_mode rescan);
extern void scsi_target_reap(struct scsi_target *);
void scsi_block_targets(struct Scsi_Host *shost, struct device *dev);
extern void scsi_target_unblock(struct device *, enum scsi_device_state);
extern void scsi_remove_target(struct device *);
extern const char *scsi_device_state_name(enum scsi_device_state);
extern int scsi_is_sdev_device(const struct device *);
extern int scsi_is_target_device(const struct device *);
extern void scsi_sanitize_inquiry_string(unsigned char *s, int len);
/*
* scsi_execute_cmd users can set scsi_failure.result to have
* scsi_check_passthrough fail/retry a command. scsi_failure.result can be a
* specific host byte or message code, or SCMD_FAILURE_RESULT_ANY can be used
* to match any host or message code.
*/
#define SCMD_FAILURE_RESULT_ANY 0x7fffffff
/*
* Set scsi_failure.result to SCMD_FAILURE_STAT_ANY to fail/retry any failure
* for which scsi_status_is_good() returns false.
*/
#define SCMD_FAILURE_STAT_ANY 0xff
/*
* The following can be set to the scsi_failure sense, asc and ascq fields to
* match on any sense, ASC, or ASCQ value.
*/
#define SCMD_FAILURE_SENSE_ANY 0xff
#define SCMD_FAILURE_ASC_ANY 0xff
#define SCMD_FAILURE_ASCQ_ANY 0xff
/* Always retry a matching failure. */
#define SCMD_FAILURE_NO_LIMIT -1
struct scsi_failure {
int result;
u8 sense;
u8 asc;
u8 ascq;
/*
* Number of times scsi_execute_cmd will retry the failure. These retries
* do not count toward total_allowed.
*/
s8 allowed;
/* Number of times the failure has been retried. */
s8 retries;
};
struct scsi_failures {
/*
* If a scsi_failure does not have its own retry limit set up, this limit
* will be used.
*/
int total_allowed;
int total_retries;
struct scsi_failure *failure_definitions;
};
/* Optional arguments to scsi_execute_cmd */
struct scsi_exec_args {
unsigned char *sense; /* sense buffer */
unsigned int sense_len; /* sense buffer len */
struct scsi_sense_hdr *sshdr; /* decoded sense header */
blk_mq_req_flags_t req_flags; /* BLK_MQ_REQ flags */
int scmd_flags; /* SCMD flags */
int *resid; /* residual length */
struct scsi_failures *failures; /* failures to retry */
};
int scsi_execute_cmd(struct scsi_device *sdev, const unsigned char *cmd,
blk_opf_t opf, void *buffer, unsigned int bufflen,
int timeout, int retries,
const struct scsi_exec_args *args);
void scsi_failures_reset_retries(struct scsi_failures *failures);
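/*
 * Illustration only, not part of this header: a simplified userspace model
 * of the wildcard and retry-budget rules documented for struct scsi_failure
 * and struct scsi_failures above. It is a sketch built from those comments,
 * not the kernel's matching code, and it leaves out the result/host-byte/
 * status matching entirely. All model_* and MODEL_* names are hypothetical;
 * the wildcard values copy the SCMD_FAILURE_* defines.
 */

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SENSE_ANY	0xff
#define MODEL_ASC_ANY	0xff
#define MODEL_ASCQ_ANY	0xff
#define MODEL_NO_LIMIT	-1

struct model_failure {
	uint8_t sense;
	uint8_t asc;
	uint8_t ascq;
	int8_t allowed;		/* per-failure retry limit, 0 if unset */
	int8_t retries;		/* per-failure retries used so far */
};

struct model_failures {
	int total_allowed;	/* shared limit for failures with no own limit */
	int total_retries;
};

/* Assumed matching order: sense key first, then ASC, then ASCQ. */
static bool model_sense_matches(const struct model_failure *f,
				uint8_t sense, uint8_t asc, uint8_t ascq)
{
	if (f->sense == MODEL_SENSE_ANY)
		return true;
	if (f->sense != sense)
		return false;
	if (f->asc == MODEL_ASC_ANY)
		return true;
	if (f->asc != asc)
		return false;
	return f->ascq == MODEL_ASCQ_ANY || f->ascq == ascq;
}

/* Charge one retry to the per-failure budget if set, else to the shared one. */
static bool model_may_retry(struct model_failures *all,
			    struct model_failure *f)
{
	if (f->allowed)
		return f->allowed == MODEL_NO_LIMIT ||
		       ++f->retries <= f->allowed;

	return all->total_allowed == MODEL_NO_LIMIT ||
	       ++all->total_retries <= all->total_allowed;
}
```

/*
 * Keeping the two counters separate is what lets a per-failure limit run
 * out without consuming the shared total_allowed budget.
 */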
extern void sdev_disable_disk_events(struct scsi_device *sdev);
extern void sdev_enable_disk_events(struct scsi_device *sdev);
extern int scsi_vpd_lun_id(struct scsi_device *, char *, size_t);
extern int scsi_vpd_tpg_id(struct scsi_device *, int *);
#ifdef CONFIG_PM
extern int scsi_autopm_get_device(struct scsi_device *);
extern void scsi_autopm_put_device(struct scsi_device *);
#else
static inline int scsi_autopm_get_device(struct scsi_device *d) { return 0; }
static inline void scsi_autopm_put_device(struct scsi_device *d) {}
#endif /* CONFIG_PM */
static inline int __must_check scsi_device_reprobe(struct scsi_device *sdev)
{
return device_reprobe(&sdev->sdev_gendev);
}
static inline unsigned int sdev_channel(struct scsi_device *sdev)
{
return sdev->channel;
}
static inline unsigned int sdev_id(struct scsi_device *sdev)
{
return sdev->id;
}
#define scmd_id(scmd) sdev_id((scmd)->device)
#define scmd_channel(scmd) sdev_channel((scmd)->device)
/*
* checks for positions of the SCSI state machine
*/
static inline int scsi_device_online(struct scsi_device *sdev)
{
return (sdev->sdev_state != SDEV_OFFLINE &&
sdev->sdev_state != SDEV_TRANSPORT_OFFLINE &&
sdev->sdev_state != SDEV_DEL);
}
static inline int scsi_device_blocked(struct scsi_device *sdev)
{
return sdev->sdev_state == SDEV_BLOCK ||
sdev->sdev_state == SDEV_CREATED_BLOCK;
}
static inline int scsi_device_created(struct scsi_device *sdev)
{
return sdev->sdev_state == SDEV_CREATED ||
sdev->sdev_state == SDEV_CREATED_BLOCK;
}
int scsi_internal_device_block_nowait(struct scsi_device *sdev);
int scsi_internal_device_unblock_nowait(struct scsi_device *sdev,
enum scsi_device_state new_state);
/* accessor functions for the SCSI parameters */
static inline int scsi_device_sync(struct scsi_device *sdev)
{
return sdev->sdtr;
}
static inline int scsi_device_wide(struct scsi_device *sdev)
{
return sdev->wdtr;
}
static inline int scsi_device_dt(struct scsi_device *sdev)
{
return sdev->ppr;
}
static inline int scsi_device_dt_only(struct scsi_device *sdev)
{
if (sdev->inquiry_len < 57)
return 0;
return (sdev->inquiry[56] & 0x0c) == 0x04;
}
static inline int scsi_device_ius(struct scsi_device *sdev)
{
if (sdev->inquiry_len < 57)
return 0;
return sdev->inquiry[56] & 0x01;
}
static inline int scsi_device_qas(struct scsi_device *sdev)
{
if (sdev->inquiry_len < 57)
return 0;
return sdev->inquiry[56] & 0x02;
}
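/*
 * Illustration only, not part of this header: a userspace sketch of the
 * byte-56 decoding done by scsi_device_dt_only(), scsi_device_ius() and
 * scsi_device_qas() above. The masks and the length guard are copied from
 * those helpers; the model_* names and the raw buffer interface are
 * hypothetical.
 */

```c
#include <stdbool.h>
#include <stddef.h>

/* Byte 56 is only present when at least 57 bytes of INQUIRY data exist. */
static bool model_has_byte56(size_t inquiry_len)
{
	return inquiry_len >= 57;
}

/* Bits 3:2 equal to 01b, as tested by scsi_device_dt_only(). */
static bool model_dt_only(const unsigned char *inq, size_t len)
{
	return model_has_byte56(len) && (inq[56] & 0x0c) == 0x04;
}

/* Bit 0, as tested by scsi_device_ius(). */
static bool model_ius(const unsigned char *inq, size_t len)
{
	return model_has_byte56(len) && (inq[56] & 0x01);
}

/* Bit 1, as tested by scsi_device_qas(). */
static bool model_qas(const unsigned char *inq, size_t len)
{
	return model_has_byte56(len) && (inq[56] & 0x02);
}
```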
static inline int scsi_device_enclosure(struct scsi_device *sdev)
{
return sdev->inquiry ? (sdev->inquiry[6] & (1<<6)) : 1;
}
static inline int scsi_device_protection(struct scsi_device *sdev)
{
	if (sdev->no_dif)
		return 0;
	return sdev->scsi_level > SCSI_2 && sdev->inquiry[5] & (1<<0);
}
static inline int scsi_device_tpgs(struct scsi_device *sdev)
{
	return sdev->inquiry ? (sdev->inquiry[5] >> 4) & 0x3 : 0;
}
/**
 * scsi_device_supports_vpd - test if a device supports VPD pages
 * @sdev: the &struct scsi_device to test
 *
 * If the 'try_vpd_pages' flag is set it takes precedence.
 * Otherwise we will assume VPD pages are supported if the
 * SCSI level is at least SPC-2 and 'skip_vpd_pages' is not set.
 */
static inline int scsi_device_supports_vpd(struct scsi_device *sdev)
{
	/* Attempt VPD inquiry if the device blacklist explicitly calls
	 * for it.
	 */
	if (sdev->try_vpd_pages)
		return 1;
	/*
	 * Although VPD inquiries can go to SCSI-2 type devices,
	 * some USB ones crash on receiving them, and the pages
	 * we currently ask for are mandatory for SPC-2 and beyond
	 */
	if (sdev->scsi_level >= SCSI_SPC_2 && !sdev->skip_vpd_pages)
		return 1;
	return 0;
}
static inline int scsi_device_busy(struct scsi_device *sdev)
{
	return sbitmap_weight(&sdev->budget_map);
}
#define MODULE_ALIAS_SCSI_DEVICE(type) \
	MODULE_ALIAS("scsi:t-" __stringify(type) "*")
#define SCSI_DEVICE_MODALIAS_FMT "scsi:t-0x%02x"
#endif /* _SCSI_SCSI_DEVICE_H */