25008 Commits

Author SHA1 Message Date
Martin K. Petersen
359aeb8648 Merge patch series "Update lpfc to revision 14.4.0.5"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.4.0.5

This patch set contains bug fixes related to HBA state clean ups, FCP
discovery on older adapters, kref imbalances, log message improvements,
and support for a new diagnostic loopback testing mode.

The patches were cut against Martin's 6.12/scsi-queue tree.

Link: https://lore.kernel.org/r/20240912232447.45607-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:22:25 -04:00
Justin Tee
b071c1a909 scsi: lpfc: Update lpfc version to 14.4.0.5
Update lpfc version to 14.4.0.5

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
eeb85c658e scsi: lpfc: Support loopback tests with VMID enabled
The VMID feature adds an extra application services header to each frame.
As such, the loopback test path is updated to accommodate the extra
application header.

Changes include filling in APPID and WQES bit fields for XMIT_SEQUENCE64
commands, a special loopback source APPID for verifying received loopback
data matches what is sent, and increasing ELS WQ size to accommodate the
APPID field in loopback test mode.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
1af9af1f8a scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING
Revise certain log messages marked as KERN_ERR LOG_TRACE_EVENT to
KERN_WARNING and use the lpfc_vlog_msg() macro to still log the event.  The
benefit is that events of interest are still logged and the entire trace
buffer is not dumped with extraneous logging information when using default
lpfc_log_verbose driver parameter settings.

Also, delete the keyword "fail" from such log messages as they aren't
really causes for concern.  The log messages are more for warnings to a SAN
admin about SAN activity.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
0a3c84f716 scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
Deleting an NPIV instance requires all fabric ndlps to be released before
an NPIV's resources can be torn down.  Failure to release fabric ndlps
beforehand opens kref imbalance race conditions.  Fix by forcing the DA_ID
to complete synchronously with usage of wait_queue.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
d1a2ef63fc scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler
With a FLOGI outstanding and loss of physical link connection to the fabric
for the duration of dev_loss_tmo, there is a fabric ndlp kref imbalance
that decrements the kref and sets the NLP_IN_RECOV_POST_DEV_LOSS flag at
the same time.

The issue is that when the FLOGI completion routine executes, the fabric
ndlp could already be freed because of the final kref put from the
dev_loss_tmo handler.  Fix by early returning before the ndlp kref put if
the ndlp is deemed a candidate for NLP_IN_RECOV_POST_DEV_LOSS in the FLOGI
completion routine.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
05ab4e7846 scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs
An older generation of HBAs are failing FCP discovery due to usage of an
outdated field in FCP command WQEs.

Fix by checking the SLI Interface Type register for applicable support of
32 Byte CDB commands, and restore a setting for a WQE path using normal 16
byte CDBs.

Fixes: af20bb73ac25 ("scsi: lpfc: Add support for 32 byte CDBs")
Cc: stable@vger.kernel.org # v6.10+
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
fc318cac66 scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE
It's possible for the driver to send a CMF_SYNC_WQE to nonresponsive
firmware during reset of the adapter.  The phba link_state conditional
check is currently a strict == LPFC_LINK_DOWN, which does not cover
initialization states before reaching the LPFC_LINK_UP state.

Update the phba->link_state conditional to < LPFC_LINK_UP so that all
initialization states are covered before allowing sending CMF_SYNC_WQE.

Update taking of the hbalock to be during this link_state check as well.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
93bcc5f398 scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
During HBA stress testing, a spam of received PLOGIs exposes a resource
recovery bug causing leakage of lpfc_sqlq entries from the global
phba->sli4_hba.lpfc_els_sgl_list.

The issue is in lpfc_els_flush_cmd(), where the driver attempts to recover
outstanding ELS sgls when walking the txcmplq.  Only CMD_ELS_REQUEST64_CRs
and CMD_GEN_REQUEST64_CRs are added to the abort and cancel lists.  A check
for CMD_XMIT_ELS_RSP64_WQE is missing in order to recover LS_ACC usages of
the phba->sli4_hba.lpfc_els_sgl_list too.

Fix by adding CMD_XMIT_ELS_RSP64_WQE as part of the txcmplq walk when
adding WQEs to the abort and cancel list in lpfc_els_flush_cmd().  Also,
update naming convention from CRs to WQEs.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:18 -04:00
Martin K. Petersen
95474648b8 Merge patch series "mpi3mr: Few Enhancements and minor fix"
Ranjan Kumar <ranjan.kumar@broadcom.com> says:

Few Enhancements and minor fix of mpi3mr driver.

Link: https://lore.kernel.org/r/20240905102753.105310-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:11:28 -04:00
Ranjan Kumar
e7d67f3f9f scsi: mpi3mr: Update driver version to 8.12.0.0.50
Update driver version to 8.12.0.0.50.

Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240905102753.105310-6-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:10:14 -04:00
Ranjan Kumar
4616a4b3cb scsi: mpi3mr: Improve wait logic while controller transitions to READY state
During controller transitioning to READY state, if the controller is found
in transient states ("becoming ready" or "reset requested"), driver waits
for 510 secs even if the controller transitions out of these states
early. This causes an unnecessary delay of 510 secs in the overall firmware
initialization sequence.

Poll the controller state periodically (every 100 milliseconds) while
waiting for the controller to come out of the mentioned transient
states. Once the controller transits out of the transient states, come out
of the wait loop.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240905102753.105310-5-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:10:13 -04:00
Ranjan Kumar
6e4c825f26 scsi: mpi3mr: Update MPI Headers to revision 34
Update MPI Headers to revision 34.

Signed-off-by: Prayas Patel <prayas.patel@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240905102753.105310-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:10:13 -04:00
Ranjan Kumar
fc1ddda330 scsi: mpi3mr: Use firmware-provided timestamp update interval
Make driver use the timestamp update interval value provided by firmware in
the driver page 1. If firmware fails to provide non-zero value, then the
driver will fall back to the driver defined macro.

Signed-off-by: Prayas Patel <prayas.patel@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240905102753.105310-3-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:10:13 -04:00
Ranjan Kumar
9634bb0708 scsi: mpi3mr: Enhance the Enable Controller retry logic
When enabling the IOC request and polling for controller ready status, poll
for controller fault and reset history bit. If the controller is faulted
or the reset history bit is set, retry the initialization a maximum of
three times (2 retries) or if the cumulative time taken for all retries
exceeds 510 seconds.

Signed-off-by: Prayas Patel <prayas.patel@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240905102753.105310-2-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:10:13 -04:00
Martin Wilck
f81eaf0838 scsi: sd: Fix off-by-one error in sd_read_block_characteristics()
Ff the device returns page 0xb1 with length 8 (happens with qemu v2.x, for
example), sd_read_block_characteristics() may attempt an out-of-bounds
memory access when accessing the zoned field at offset 8.

Fixes: 7fb019c46eee ("scsi: sd: Switch to using scsi_device VPD pages")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Wilck <mwilck@suse.com>
Link: https://lore.kernel.org/r/20240912134308.282824-1-mwilck@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:55:53 -04:00
Daniel Wagner
a141c17a54 scsi: pm8001: Do not overwrite PCI queue mapping
blk_mq_pci_map_queues() maps all queues but right after this, we overwrite
these mappings by calling blk_mq_map_queues(). Just use one helper but not
both.

Fixes: 42f22fe36d51 ("scsi: pm8001: Expose hardware queues for pm80xx")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Link: https://lore.kernel.org/r/20240912-do-not-overwrite-pci-mapping-v1-1-85724b6cec49@suse.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:52:00 -04:00
Christophe JAILLET
bba20b894e scsi: scsi_debug: Remove a useless memset()
'arr' is kzalloc()'ed, so there is no need to call memset(.., 0, ...) on
it. It is already cleared.

This is a follow up of commit b952eb270df3 ("scsi: scsi_debug: Allocate the
MODE SENSE response from the heap").

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/6296722174e39a51cac74b7fc68b0d75bd0db2a3.1725690433.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:45:48 -04:00
Chen Ni
4708c9332d scsi: pmcraid: Convert comma to semicolon
Replace comma between expressions with semicolons.

Using a ',' in place of a ';' can have unintended side effects. Although
that is not the case here, it is seems best to use ';' unless ',' is
intended.

Found by inspection. No functional change intended. Compile tested only.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Link: https://lore.kernel.org/r/20240905023521.1642862-1-nichen@iscas.ac.cn
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:42:09 -04:00
Bart Van Assche
a8598aefae scsi: sd: Retry START STOP UNIT commands
During system resume, sd_start_stop_device() submits a START STOP UNIT
command to the SCSI device that is being resumed. That command is not
retried in case of a unit attention and hence may fail. An example:

[16575.983359] sd 0:0:0:3: [sdd] Starting disk
[16575.983693] sd 0:0:0:3: [sdd] Start/Stop Unit failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
[16575.983712] sd 0:0:0:3: [sdd] Sense Key : 0x6
[16575.983730] sd 0:0:0:3: [sdd] ASC=0x29 ASCQ=0x0
[16575.983738] sd 0:0:0:3: PM: dpm_run_callback(): scsi_bus_resume+0x0/0xa0 returns -5
[16575.983783] sd 0:0:0:3: PM: failed to resume async: error -5

Make the SCSI core retry the START STOP UNIT command if the device reports
that it has been powered on or that it has been reset.

Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240904210304.2947789-1-bvanassche@acm.org
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:40:32 -04:00
Tomas Henzl
24d7071d96 scsi: mpi3mr: A performance fix
Commit 0c52310f2600 ("hrtimer: Ignore slack time for RT tasks in
schedule_hrtimeout_range()") effectivelly shortens a sleep in a polling
function in the driver.  That is causing a performance regression as the
new value of just 2us is too low, in certain tests the perf drop is ~30%.
Fix this by adjusting the sleep to 20us (close to the previous value).

Reported-by: Jan Jurca <jjurca@redhat.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240903144729.37218-1-thenzl@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:34:44 -04:00
Colin Ian King
0557f49870 scsi: mpt3sas: Remove trailing space after \n newline
There is a extraneous space after a newline in an ioc_info message.
Remove it and join to split literal strings into one.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240902172708.369741-1-colin.i.king@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:30:43 -04:00
Colin Ian King
c7c846fa94 scsi: lpfc: Remove trailing space after \n newline
There is a extraneous space after a newline in two lpfc_printf_log()
messages. Remove the space.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240902150042.311157-1-colin.i.king@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:29:49 -04:00
Colin Ian King
fa557da6b0 scsi: qedf: Remove trailing space after \n newline
There is a extraneous space after a newline in a QEDF_INFO message.
Remove it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240902145138.310883-1-colin.i.king@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:28:58 -04:00
Colin Ian King
d2ce0e5ab5 scsi: hisi_sas: Remove trailing space after \n newline
There is a extraneous space after a newline in a dev_info message.
Remove it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240902144153.309920-1-colin.i.king@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:28:58 -04:00
Colin Ian King
571d81b482 scsi: megaraid_sas: Remove trailing space after \n newline
There is a extraneous space after a newline in a dev_err message.
Remove it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240902142252.309232-1-colin.i.king@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:28:50 -04:00
Colin Ian King
34f04a9b6e scsi: pm8001: Remove trailing space after \n newline
There is a extraneous space after a newline in a pm8001_dbg message.
Remove it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240902141537.308914-1-colin.i.king@gmail.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:26:22 -04:00
Colin Ian King
57bada8a5e scsi: zalon: Remove trailing space after \n newline
There is a extraneous space after a newline in a dev_printk message,
remove it. Also fix non-tabbed indentation of the statement.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240902141202.308632-1-colin.i.king@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:25:43 -04:00
Christophe JAILLET
45fad027df scsi: libcxgbi: Remove an unused field in struct cxgbi_device
Usage of .dev_ddp_cleanup() in libcxgbi was removed by commit 5999299f1ce9
("cxgb3i,cxgb4i,libcxgbi: remove iSCSI DDP support") on 2016-07.

.csk_rx_pdu_ready() and debugfs_root have apparently never been used since
introduction by commit 9ba682f01e2f ("[SCSI] libcxgbi: common library for
cxgb3i and cxgb4i")

Remove the now unused function pointer from struct cxgbi_device.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/58f77f690d85e2c653447e3e3fc4f8d3c3ce8563.1725223504.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:20:14 -04:00
Brian King
e368400694 scsi: ibmvfc: Add max_sectors module parameter
There are some scenarios that can occur, such as performing an upgrade of
the virtual I/O server, where the supported max transfer of the backing
device for an ibmvfc HBA can change. If the max transfer of the backing
device decreases, this can cause issues with previously discovered
LUNs. This patch accomplishes two things.  First, it changes the default
ibmvfc max transfer value to 1MB.  This is generally supported by all
backing devices, which should mitigate this issue out of the box. Secondly,
it adds a module parameter, enabling a user to increase the max transfer
value to values that are larger than 1MB, as long as they have configured
these larger values on the virtual I/O server as well.

[mkp: fix checkpatch warnings]

Signed-off-by: Brian King <brking@linux.ibm.com>
Link: https://lore.kernel.org/r/20240903134708.139645-2-brking@linux.ibm.com
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:18:30 -04:00
Hongbo Li
b112947ffc scsi: sd: Remove duplicate included header file linux/bio-integrity.h
The header file linux/bio-integrity.h is included twice.  Remove the last
one. The compilation test has passed.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://lore.kernel.org/r/20240830075858.3541907-1-lihongbo22@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:11:06 -04:00
Rafael Rocha
3d882cca73 scsi: st: Fix input/output error on empty drive reset
A previous change was introduced to prevent data loss during a power-on
reset when a tape is present inside the drive. This commit set the
"pos_unknown" flag to true to avoid operations that could compromise data
by performing actions from an untracked position. The relevant change is
commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")

As a consequence of this change, a new issue has surfaced: the driver now
returns an "Input/output error" even for empty drives when the drive, host,
or bus is reset. This issue stems from the "flush_buffer" function, which
first checks whether the "pos_unknown" flag is set. If the flag is set, the
user will encounter an "Input/output error" until the tape position is
known again. This behavior differs from the previous implementation, where
empty drives were not affected at system start up time, allowing tape
software to send commands to the driver to retrieve the drive's status and
other information.

The current behavior prioritizes the "pos_unknown" flag over the
"ST_NO_TAPE" status, leading to issues for software that detects drives
during system startup. This software will receive an "Input/output error"
until a tape is loaded and its position is known.

To resolve this, the "ST_NO_TAPE" status should take priority when the
drive is empty, allowing communication with the drive following a power-on
reset. At the same time, the change should continue to protect data by
maintaining the "pos_unknown" flag when the drive contains a tape and its
position is unknown.

Signed-off-by: Rafael Rocha <rrochavi@fnal.gov>
Link: https://lore.kernel.org/r/20240905173921.10944-1-rrochavi@fnal.gov
Fixes: 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:07:11 -04:00
Martin K. Petersen
cff06a799d Merge patch series "smartpqi updates"
Don Brace <don.brace@microchip.com> says:

These patches are based on Martin Petersen's 6.12/scsi-queue tree
  https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
  6.12/scsi-queue

There are two functional changes:
    smartpqi-add-fw-log-to-kdump
    smartpqi-add-counter-for-parity-write-stream-requests

There are three minor bug fixes:
    smartpqi-fix-stream-detection
    smartpqi-fix-rare-system-hang-during-LUN-reset
    smartpqi-fix-volume-size-updates

The other two patches add PCI-IDs for new controllers and change the
driver version.

This set of changes consists of:
* smartpqi-add-fw-log-to-kdump

  During a kdump, the driver tells the controller to copy its logging
  information to some pre-allocated buffers that can be analyzed
  later.

  This is a "feature" driven capability and is backward compatible
  with existing controller FW.

  This patch renames some prefixes for OFA (Online-Firmware Activation
  ofa_*) buffers to host_memory_*. So, not a lot of actual functional
  changes to smartpqi_init.c, mainly determining the memory size
  allocation.

  We added a function to notify the controller to copy debug data into
  host memory before continuing kdump.

  Most of the functional changes are in smartpqi_sis.c where the
  actual handshaking is done.

* smartpqi-fix-stream-detection

  Correct some false write-stream detections. The data structure used
  to check for write-streams was not initialized to all 0's causing
  some false write stream detections. The driver sends down streamed
  requests to the raid engine instead of using AIO bypass for some
  extra performance.  (Potential full-stripe write verses Read Modify
  Write).

  False detections have not caused any data corruption.  Found by
  internal testing. No known externally reported bugs.

* smartpqi-add-counter-for-parity-write-stream-requests

  Adding some counters for raid_bypass and write streams. These two
  counters are related because write stream detection is only checked
  if an I/O request is eligible for bypass (AIO).

  The bypass counter (raid_bypass_cnt) was moved into a common
  structure (pqi_raid_io_stats) and changed to type __percpu. The
  write stream counter is (write_stream_cnt) has been added to this
  same structure.

  These counters are __percpu counters for performance. We added a
  sysfs entry to show the write stream count. The raid bypass counter
  sysfs entry already exists.

  Useful for checking streaming writes. The change in the sysfs entry
  write_stream_cnt can be checked during AIO eligible write
  operations.

* smartpqi-add-new-controller-PCI-IDs

  Adding support for new controller HW.  No functional changes.

* smartpqi-fix-rare-system-hang-during-LUN-reset

  We found a rare race condition that can occur during a LUN reset. We
  were not emptying our internal queue completely.

  There have been some rare conditions where our internal request
  queue has requests for multiple LUNs and a reset comes in for one of
  the LUNs. The driver waits for this internal queue to empty. We were
  only clearing out the requests for the LUN being reset so the
  request queue was never empty causing a hang.

  The Fix:

     For all requests in our internal request queue:

        Complete requests with DID_RESET for queued requests for the
        device undergoing a reset.

        Complete requests with DID_REQUEUE for all other queued requests.

  Found by internal testing. No known externally reported bugs.

* smartpqi-fix-volume-size-updates

  The current code only checks for a size change if there is also a
  queue depth change.  We are separating the check for queue depth and
  the size changes.

  Found by internal testing. No known bugs were filed.

* smartpqi-update-version-to-2.1.30-031
  No functional changes.

Link: https://lore.kernel.org/r/20240827185501.692804-1-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 22:16:33 -04:00
Don Brace
bda1c931e2 scsi: smartpqi: update driver version to 2.1.30-031
Update driver version to 2.1.30-031.

Reviewed-by: David Strahan <david.strahan@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Gerry Morong <gerry.morong@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-8-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 22:10:35 -04:00
Don Brace
07dde72ff1 scsi: smartpqi: fix volume size updates
Correct logical volume size changes by moving the check for a volume rescan
outside of the check for a queue depth change.

Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-7-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 22:10:34 -04:00
Murthy Bhat
4e0a51716d scsi: smartpqi: fix rare system hang during LUN reset
Correct a rare case where in a LUN reset occurs on a device and I/O
requests for other devices persist in the driver's internal request queue.

Part of a LUN reset involves waiting for our internal request queue to
empty before proceeding. The internal request queue contains requests not
yet sent down to the controller.

We were clearing the requests queued for the LUN undergoing a reset, but
not all of the queued requests. Causing a hang.

For all requests in our internal request queue:

   Complete requests with DID_RESET for queued requests for the device
   undergoing a reset.

   Complete requests with DID_REQUEUE for all other queued requests.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-6-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 22:10:34 -04:00
David Strahan
dbc39b8454 scsi: smartpqi: add new controller PCI IDs
All PCI ID entries in Hex.

Add new cisco pci ids:
                                             VID  / DID  / SVID / SDID
                                             ----   ----   ----   ----
                                             9005   028f   1137   02fe
                                             9005   028f   1137   02ff
                                             9005   028f   1137   0300

Add new h3c pci ids:
                                             VID  / DID  / SVID / SDID
                                             ----   ----   ----   ----
                                             9005   028f   193d   0462
                                             9005   028f   193d   8462

Add new ieit pci ids:
                                             VID  / DID  / SVID / SDID
                                             ----   ----   ----   ----
                                             9005   028f   1ff9   00a3

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: David Strahan <David.Strahan@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-5-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 22:10:34 -04:00
Mahesh Rajashekhara
283dcc1b14 scsi: smartpqi: add counter for parity write stream requests
Add sysfs entry to check for write stream requests.

Move existing raid_bypass_cnt into a structure named pqi_raid_io_stats and
add member write_stream_cnt. These two counters are related because write
stream detection is only checked if an I/O request is eligible for bypass
(AIO).

Example usage:

lsscsi
[15:1:0:0]   disk    Adaptec  LOGICAL VOLUME   0129  /dev/sdae

cat /sys/block/sdae/device/ssd_smart_path_enabled
1
^
|
+---- NOTE: here bypass has been enabled on device sdae

To read the counter for parity write stream requests:

cat /sys/block/sdae/device/write_stream_cnt
0x60cd507

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Co-developed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-4-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 22:10:34 -04:00
Mahesh Rajashekhara
4c76114932 scsi: smartpqi: correct stream detection
Correct stream detection by initializing the structure
pqi_scsi_dev_raid_map_data to 0s.

When the OS issues SCSI READ commands, the driver erroneously considers
them as SCSI WRITES. If they are identified as sequential IOs, the driver
then submits those requests via the RAID path instead of the AIO path.

The 'is_write' flag might be set for SCSI READ commands also.  The driver
may interpret SCSI READ commands as SCSI WRITE commands, resulting in IOs
being submitted through the RAID path.

Note: This does not cause data corruption.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-3-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 22:10:34 -04:00
Murthy Bhat
058311b72f scsi: smartpqi: Add fw log to kdump
Add controller logs to kdump.

Driver allocates DMA memory and communicates this address to FW.  In the
event of system crash, host driver notifies the firmware about the crash
and firmware posts all the necessary logs in the pre-allocated host buffer
for firmware debugging.

Once firmware notifies the completion of the log uploading to the host
memory and host continues with the OS crash dump saving.

This is a "feature" driven capability and is backward compatible with
existing controller FW.

Rename some prefixes for OFA (Online-Firmware Activation ofa_*) buffers to
host_memory_*. So, not a lot of actual functional changes to
smartpqi_init.c, mainly determining the memory size allocation.

Added a function to notify the controller to copy debug data into host
memory before continuing kdump.

Most of the functional changes are in smartpqi_sis.c where the actual
handshaking is done.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-2-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 22:10:34 -04:00
Christophe JAILLET
d5a4b0d642 scsi: bnx2fc: Remove some unused fields in struct bnx2fc_rport
Some fields are unused in struct bnx2fc_rport.  Remove them in order to
save 96 bytes on a x86_64.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/42e20b159f3bbb12da7796463a521ca051bd5274.1724399924.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 21:33:21 -04:00
Christophe JAILLET
e59f43fb64 scsi: qla2xxx: Remove the unused 'del_list_entry' field in struct fc_port
The 'del_list_entry' field in "struct fc_port" is unused.

The field was introduced in commit 2d70c103fd2a ("[SCSI] qla2xxx: Add LLD
target-mode infrastructure for >= 24xx series") in 2012-05 and the last
user was removed in commit 726b85487067 ("qla2xxx: Add framework for async
fabric discovery") in 2017-02.

Remove this unused field.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/69155321ab26c1f4d473d5bb6cd44b59b9b6a020.1724094686.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 21:31:59 -04:00
Yue Haibing
adedd0f46c scsi: bnx2i: Remove unused declarations
Commit cf4e6363859d ("[SCSI] bnx2i: Add bnx2i iSCSI driver.") declared but
never implemented these.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20240824084724.3647307-1-yuehaibing@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-28 20:58:47 -04:00
Martin K. Petersen
70302fc7ad Merge patch series "Simplify multiple create*_workqueue() invocations"
Bart Van Assche <bvanassche@acm.org> says:

Hi Martin,

Multiple SCSI drivers use snprintf() to format a workqueue name before
invoking one of the create*_workqueue() macros. This patch series
simplifies such code by passing the format string and arguments to
alloc_workqueue(). Additionally, the structure members that are only
used as a temporary buffer for formatting workqueue names are
removed. Please consider this patch series for the next merge window.

Thanks,

Bart.

Link: https://lore.kernel.org/r/20240822195944.654691-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22 21:30:06 -04:00
Bart Van Assche
ba52850cb6 scsi: core: Simplify an alloc_workqueue() invocation
Let alloc_workqueue() format the workqueue name. Remove the
work_q_name[] member from struct Scsi_Host because it is no longer
used by any SCSI driver nor by the SCSI core.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240822195944.654691-19-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22 21:28:57 -04:00
Bart Van Assche
0ef9b0186d scsi: stex: Simplify an alloc_ordered_workqueue() invocation
Let alloc_ordered_workqueue() format the workqueue name instead of calling
snprintf() explicitly.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240822195944.654691-17-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22 21:28:57 -04:00
Bart Van Assche
06d5378976 scsi: scsi_transport_fc: Simplify alloc_workqueue() invocations
Let alloc_workqueue() format the workqueue name instead of calling
snprintf() explicitly.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240822195944.654691-16-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22 21:28:57 -04:00
Bart Van Assche
6411307b63 scsi: snic: Simplify alloc_workqueue() invocations
Let alloc_workqueue() format the workqueue name instead of calling
snprintf() explicitly. Not setting shost->work_q_name is safe because
there is no code that reads the value set by the removed code.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240822195944.654691-15-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22 21:28:56 -04:00
Bart Van Assche
19d7cda1c6 scsi: qedi: Simplify an alloc_workqueue() invocation
Let alloc_workqueue() format the workqueue name instead of calling
snprintf() explicitly.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240822195944.654691-14-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22 21:28:56 -04:00
Bart Van Assche
8bbe60bbd4 scsi: qedf: Simplify alloc_workqueue() invocations
Let alloc_workqueue() format the workqueue name instead of calling
snprintf() explicitly.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240822195944.654691-13-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22 21:28:56 -04:00