346 Commits

Author SHA1 Message Date
Lee Jones
0bb87e01d8 scsi: lpfc: Fix a bunch of kernel-doc misdemeanours
Fixes the following W=1 kernel build warning(s):

 drivers/scsi/lpfc/lpfc_scsi.c:746: warning: expecting prototype for lpfc_release_scsi_buf(). Prototype was for lpfc_release_scsi_buf_s3() instead
 drivers/scsi/lpfc/lpfc_scsi.c:979: warning: expecting prototype for App checking is required for(). Prototype was for BG_ERR_CHECK() instead
 drivers/scsi/lpfc/lpfc_scsi.c:3701: warning: Function parameter or member 'vport' not described in 'lpfc_scsi_prep_cmnd_buf'
 drivers/scsi/lpfc/lpfc_scsi.c:3701: warning: Excess function parameter 'phba' description in 'lpfc_scsi_prep_cmnd_buf'
 drivers/scsi/lpfc/lpfc_scsi.c:3717: warning: Function parameter or member 'fcpi_parm' not described in 'lpfc_send_scsi_error_event'
 drivers/scsi/lpfc/lpfc_scsi.c:3717: warning: Excess function parameter 'rsp_iocb' description in 'lpfc_send_scsi_error_event'
 drivers/scsi/lpfc/lpfc_scsi.c:3837: warning: Function parameter or member 'fcpi_parm' not described in 'lpfc_handle_fcp_err'
 drivers/scsi/lpfc/lpfc_scsi.c:3837: warning: expecting prototype for lpfc_handler_fcp_err(). Prototype was for lpfc_handle_fcp_err() instead
 drivers/scsi/lpfc/lpfc_scsi.c:4021: warning: Function parameter or member 'wcqe' not described in 'lpfc_fcp_io_cmd_wqe_cmpl'
 drivers/scsi/lpfc/lpfc_scsi.c:4021: warning: Excess function parameter 'pwqeOut' description in 'lpfc_fcp_io_cmd_wqe_cmpl'
 drivers/scsi/lpfc/lpfc_scsi.c:4621: warning: Function parameter or member 'vport' not described in 'lpfc_scsi_prep_cmnd_buf_s3'
 drivers/scsi/lpfc/lpfc_scsi.c:4621: warning: Excess function parameter 'phba' description in 'lpfc_scsi_prep_cmnd_buf_s3'
 drivers/scsi/lpfc/lpfc_scsi.c:4698: warning: Function parameter or member 'vport' not described in 'lpfc_scsi_prep_cmnd_buf_s4'
 drivers/scsi/lpfc/lpfc_scsi.c:4698: warning: Excess function parameter 'phba' description in 'lpfc_scsi_prep_cmnd_buf_s4'
 drivers/scsi/lpfc/lpfc_scsi.c:4954: warning: expecting prototype for lpfc_taskmgmt_def_cmpl(). Prototype was for lpfc_tskmgmt_def_cmpl() instead
 drivers/scsi/lpfc/lpfc_scsi.c:5094: warning: expecting prototype for lpfc_poll_rearm_time(). Prototype was for lpfc_poll_rearm_timer() instead

Link: https://lore.kernel.org/r/20210312094738.2207817-4-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-15 22:28:56 -04:00
James Smart
67073c69c8 scsi: lpfc: Update copyrights for 12.8.0.7 and 12.8.0.8 changes
For the files modified in 2021 via the 12.8.0.7 and 12.8.0.8 patch sets,
update the copyright for 2021.

Link: https://lore.kernel.org/r/20210301171821.3427-23-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:06 -05:00
James Smart
a94a40eb64 scsi: lpfc: Change wording of invalid pci reset log message
Message 8347 Invalid device found log message is logged when an LPe12000
adapter is installed.  The log message is supposed to indicate an
unsupported pci reset adapter rather than an invalid device.

Change the wording to: Incapable PCI reset device.

Link: https://lore.kernel.org/r/20210301171821.3427-19-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:05 -05:00
James Smart
ae960d78ec scsi: lpfc: Fix unnecessary null check in lpfc_release_scsi_buf
lpfc_fcp_io_cmd_wqe_cmpl() is intended to mirror
lpfc_nvme_io_cmd_wqe_cmpl() for sli4 fcp completions. When the routine was
added, lpfc_fcp_io_cmd_wqe_cmpl() included a null pointer check for
phba. However, phba is definitely valid, being dereferenced by the calling
routine and used later in the routine itself.

Remove the unnecessary null check.

Link: https://lore.kernel.org/r/20210301171821.3427-9-jsmart2021@gmail.com
Fixes: 96e209be6ecb ("scsi: lpfc: Convert SCSI I/O completions to SLI-3 and SLI-4 handlers")
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:04 -05:00
James Smart
68a6a66c51 scsi: lpfc: Fix reftag generation sizing errors
An LBA is 8 bytes. The driver generates a reftag from the LBA but the
reftag is 4 bytes. Thus scsi_get_lba() could return a value that exceeds
our reftag size.

Fix by converting all the code to calling the common routine
t10_pi_ref_tag() which returns a u32, thus ensuring a consistent 4byte
value.  Also correct a few code lines that access LBA directly and ensure
64-bit data types are used.

Link: https://lore.kernel.org/r/20210301171821.3427-4-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:04 -05:00
Muneendra Kumar
7f3a79a7fd scsi: lpfc: Add support for eh_should_retry_cmd()
Add support for eh_should_retry_cmd callback in lpfc_template.

Link: https://lore.kernel.org/r/1609969748-17684-6-git-send-email-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-14 22:55:18 -05:00
James Smart
a22d73b655 scsi: lpfc: Implement health checking when aborting I/O
Several errors have occurred where the adapter stops or fails but does not
raise the register values for the driver to detect failure. Thus driver is
unaware of the failure. The failure typically results in I/O timeouts, the
I/O timeout handler failing (after several seconds), and the error handler
escalating recovery policy and resulting in more errors. Eventually, the
driver is in a position where things have spiraled and it can't do recovery
because other recovery ops are still outstanding and it becomes unusable.

Resolve the situation by having the I/O timeout handler (actually a els,
SCSI I/O, NVMe ls, or NVMe I/O timeout), in addition to aborting the I/O,
perform a mailbox command and look for a response from the hardware.  If
the mailbox command fails, it will mark the adapter offline and then invoke
the adapter reset handler to clean up.

The new I/O timeout test will be limited to a test every 5s. If there are
multiple I/O timeouts concurrently, only the 1st I/O timeout will generate
the mailbox command. Further testing will only occur once a timeout occurs
after a 5s delay from the last mailbox command has expired.

Link: https://lore.kernel.org/r/20210104180240.46824-14-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:37 -05:00
James Smart
31051249f1 scsi: lpfc: Fix target reset failing
Target reset is failed by the target as an invalid command.

The Target Reset TMF has been obsoleted in T10 for a while, but continues
to be used. On (newer) devices, the TMF is rejected causing the reset
handler to escalate to adapter resets.

Fix by having Target Reset TMF rejections be translated into a LOGO and
re-PLOGI with the target device. This provides the same semantic action
(although, if the device also supports nvme traffic, it will terminate nvme
traffic as well - but it's still recoverable).

Link: https://lore.kernel.org/r/20210104180240.46824-10-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:36 -05:00
James Smart
da09ae4864 scsi: lpfc: Fix error log messages being logged following SCSI task mgnt
A successful task mgmt command is logging errors, making it look like
problems were encountered.  This is due to log messages for the
device/target and bus reset handlers having the LOG_TRACE_EVENT flag set.

Fix by adjusting the event flag such that the call to the logging routine
only receives a LOG_TRACE_EVENT if a prior call actually failed.

Link: https://lore.kernel.org/r/20210104180240.46824-9-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-07 23:02:36 -05:00
Colin Ian King
1e7dddb2e7 scsi: lpfc: Fix pointer defereference before it is null checked issue
There is a null check on pointer lpfc_cmd after the pointer has been
dereferenced when pointers rdata and ndlp are initialized at the start of
the function. Fix this by only assigning rdata and ndlp after the pointer
lpfc_cmd has been null checked.

Link: https://lore.kernel.org/r/20201118131345.460631-1-colin.king@canonical.com
Fixes: 96e209be6ecb ("scsi: lpfc: Convert SCSI I/O completions to SLI-3 and SLI-4 handlers")
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Addresses-Coverity: ("Dereference before null check")
2020-11-19 22:13:42 -05:00
James Smart
db7531d2b3 scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers
This patch reworks the abort interfaces such that SLI-3 retains the
iocb-based formatting and completions and SLI-4 now uses native WQEs and
completion routines.

The following changes are made:

 - The code is refactored from a confusing 2 routine sequence of
   xx_abort_iotag_issue(), which creates/formats and abort cmd, and
   xx_issue_abort_tag(), which then issues and handles the completion of
   the abort cmd - into a single interface of xx_issue_abort_iotag().  The
   new interface will determine whether SLI-3 or SLI-4 and then call the
   appropriate handler. A completion handler can now be specified to
   address the differences in completion handling.  Note: original code is
   all iocb based, with SLI-4 converting to SLI-3 for the SCSI/ELS path,
   and NVMe natively using wqes.

 - The SLI-3 side is refactored:

   The older iocb-base lpfc_sli_issue_abort_iotag() routine is combined
   with the logic of lpfc_sli_abort_iotag_issue() as well as the
   iocb-specific code in lpfc_abort_handler() and lpfc_sli_abort_iocb() to
   create the new single SLI-3 abort routine that formats and issues the
   iocb.

 - The SLI-4 side is refactored and added to:

   The native WQE abort code in NVMe is moved to the new SLI-4
   issue_abort_iotag() routine. Items in SCSI that set fields not set by
   NVMe is migrated into the new routine. Thus the routine supports NVMe
   and SCSI initiators. The nvmet block (target) formats the abort slightly
   different (like the old NVMe initiator) thus it has its own prep routine
   stolen from NVMe initiator and it retains the current code it has for
   issuing the WQE (does not use the commonized routine the initiators
   do). SLI-4 completion handlers were also added.

 - lpfc_abort_handler now becomes a wrapper that determines whether
   SLI-3 or SLI-4 and calls the proper abort handler.

Link: https://lore.kernel.org/r/20201115192646.12977-16-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17 00:43:56 -05:00
James Smart
96e209be6e scsi: lpfc: Convert SCSI I/O completions to SLI-3 and SLI-4 handlers
The current driver implementation uses SLI-4 WQE to iocb conversion before
calling the cmpl callback function.

Rework the FCP I/O completion path to utilize the SLI-4 WQE.

This patch converts the SCSI I/O completion paths from the iocb-centric
interfaces to the routines are native for whether I/Os are iocb-based
(SLI-3) or WQE-based (SLI-4).

Most existing routines were iocb-based, so this creates a lot of SLI-4
specific routines to provide the functionality.

Link: https://lore.kernel.org/r/20201115192646.12977-15-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17 00:43:56 -05:00
James Smart
da255e2e7c scsi: lpfc: Convert SCSI path to use common I/O submission path
This patch converts the SCSI I/O path from the iocb-centric interfaces to
the common I/O submission path which supports native SLI-4 WQEs.

A wrapper routine is put in place to distinguish SLI-3 from SLI. If SLI-3,
the same iocb-centric paths are used, perhaps with refactored code that is
explicitly for SLI-3.  For SLI-4, any iocb-related formatting is replaced
by wqe-based formatting, although much of that is addressed by the common
wqe templates in the SLI-4 path.

Link: https://lore.kernel.org/r/20201115192646.12977-14-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17 00:43:56 -05:00
James Smart
47ff4c510f scsi: lpfc: Enable common send_io interface for SCSI and NVMe
To set up common use by the SCSI and NVMe I/O paths, create a new routine
that issues FCP I/O commands which can be used by either protocol.  The new
routine addresses SLI-3 vs SLI-4 differences within its implementation.

Replace the (SLI-3 centric) iocb routine in the SCSI path with this new
WQE-centric common routine.

Link: https://lore.kernel.org/r/20201115192646.12977-13-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17 00:43:56 -05:00
James Smart
c6adba1501 scsi: lpfc: Rework remote port lock handling
Currently the discovery layers within the driver use the SCSI midlayer
host_lock to access node-specific structures. This can contend with the I/O
path and is too coarse of a lock.

Rework the driver so that it uses a lock specific to the remote port node
structure when accessing the structure contents. A few of the changes
brought out spots were some slightly reorganized routines worked better.

Link: https://lore.kernel.org/r/20201115192646.12977-6-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17 00:43:54 -05:00
James Smart
307e338097 scsi: lpfc: Rework remote port ref counting and node freeing
When a remote port is disconnected and disappears, its node structure
(ndlp) stays allocated and on a vport node list. While on the list it can
be matched, thus requires validation checks on state to be added in
numerous code paths. If the node comes back, its possible for there to be
multiple node structures for the same device on the vport node list. There
is no reason to keep the node structure around after it is no longer in
existence, and the current implementation creates problems for itself
(multiple nodes) and lots of unnecessary code for state validation.

Additionally, the reference taking on the node structure didn't follow the
normal model used by the kernel kref api. It included lots of odd logic to
match state with reference count.  The combination of this odd logic plus
the way it was implicitly used in the discovery engine made its reference
taking implementation suspect and extremely hard to follow.

Change the driver such that the reference taking routines are now normal
ref increments/decrements and callout on refcount=0.

With this in place, the rework can be done such that the node structure is
fully removed and deallocated when the remote port no longer exists and all
references are removed.  This removal logic, and the basic ref counting are
intrically tied, thus in a single patch.

Link: https://lore.kernel.org/r/20201115192646.12977-2-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-17 00:43:54 -05:00
Lee Jones
eceee00e41 scsi: lpfc: lpfc_scsi: Fix a whole host of kernel-doc issues
Fixes the following W=1 kernel build warning(s):

 drivers/scsi/lpfc/lpfc_scsi.c:331: warning: Function parameter or member 'num_to_alloc' not described in 'lpfc_new_scsi_buf_s3'
 drivers/scsi/lpfc/lpfc_scsi.c:331: warning: Excess function parameter 'num_to_allocate' description in 'lpfc_new_scsi_buf_s3'
 drivers/scsi/lpfc/lpfc_scsi.c:507: warning: Function parameter or member 'idx' not described in 'lpfc_sli4_io_xri_aborted'
 drivers/scsi/lpfc/lpfc_scsi.c:593: warning: Function parameter or member 'ndlp' not described in 'lpfc_get_scsi_buf_s3'
 drivers/scsi/lpfc/lpfc_scsi.c:593: warning: Function parameter or member 'cmnd' not described in 'lpfc_get_scsi_buf_s3'
 drivers/scsi/lpfc/lpfc_scsi.c:632: warning: Function parameter or member 'ndlp' not described in 'lpfc_get_scsi_buf_s4'
 drivers/scsi/lpfc/lpfc_scsi.c:632: warning: Function parameter or member 'cmnd' not described in 'lpfc_get_scsi_buf_s4'
 drivers/scsi/lpfc/lpfc_scsi.c:744: warning: Function parameter or member 'ndlp' not described in 'lpfc_get_scsi_buf'
 drivers/scsi/lpfc/lpfc_scsi.c:744: warning: Function parameter or member 'cmnd' not described in 'lpfc_get_scsi_buf'
 drivers/scsi/lpfc/lpfc_scsi.c:986: warning: Function parameter or member 'new_guard' not described in 'lpfc_bg_err_inject'
 drivers/scsi/lpfc/lpfc_scsi.c:1393: warning: Function parameter or member 'txop' not described in 'lpfc_sc_to_bg_opcodes'
 drivers/scsi/lpfc/lpfc_scsi.c:1393: warning: Function parameter or member 'rxop' not described in 'lpfc_sc_to_bg_opcodes'
 drivers/scsi/lpfc/lpfc_scsi.c:1393: warning: Excess function parameter 'txopt' description in 'lpfc_sc_to_bg_opcodes'
 drivers/scsi/lpfc/lpfc_scsi.c:1393: warning: Excess function parameter 'rxopt' description in 'lpfc_sc_to_bg_opcodes'
 drivers/scsi/lpfc/lpfc_scsi.c:1473: warning: Function parameter or member 'txop' not described in 'lpfc_bg_err_opcodes'
 drivers/scsi/lpfc/lpfc_scsi.c:1473: warning: Function parameter or member 'rxop' not described in 'lpfc_bg_err_opcodes'
 drivers/scsi/lpfc/lpfc_scsi.c:1473: warning: Excess function parameter 'txopt' description in 'lpfc_bg_err_opcodes'
 drivers/scsi/lpfc/lpfc_scsi.c:1473: warning: Excess function parameter 'rxopt' description in 'lpfc_bg_err_opcodes'
 drivers/scsi/lpfc/lpfc_scsi.c:1565: warning: Function parameter or member 'datasegcnt' not described in 'lpfc_bg_setup_bpl'
 drivers/scsi/lpfc/lpfc_scsi.c:1565: warning: Excess function parameter 'datacnt' description in 'lpfc_bg_setup_bpl'
 drivers/scsi/lpfc/lpfc_scsi.c:1951: warning: Function parameter or member 'datasegcnt' not described in 'lpfc_bg_setup_sgl'
 drivers/scsi/lpfc/lpfc_scsi.c:1951: warning: Function parameter or member 'lpfc_cmd' not described in 'lpfc_bg_setup_sgl'
 drivers/scsi/lpfc/lpfc_scsi.c:1951: warning: Excess function parameter 'datacnt' description in 'lpfc_bg_setup_sgl'
 drivers/scsi/lpfc/lpfc_scsi.c:2131: warning: Function parameter or member 'lpfc_cmd' not described in 'lpfc_bg_setup_sgl_prot'
 drivers/scsi/lpfc/lpfc_scsi.c:4476: warning: Function parameter or member 't' not described in 'lpfc_poll_timeout'
 drivers/scsi/lpfc/lpfc_scsi.c:4476: warning: Excess function parameter 'ptr' description in 'lpfc_poll_timeout'
 drivers/scsi/lpfc/lpfc_scsi.c:4503: warning: Function parameter or member 'shost' not described in 'lpfc_queuecommand'
 drivers/scsi/lpfc/lpfc_scsi.c:4503: warning: Excess function parameter 'done' description in 'lpfc_queuecommand'
 drivers/scsi/lpfc/lpfc_scsi.c:5035: warning: Function parameter or member 'cmnd' not described in 'lpfc_send_taskmgmt'
 drivers/scsi/lpfc/lpfc_scsi.c:5035: warning: Excess function parameter 'rdata' description in 'lpfc_send_taskmgmt'
 drivers/scsi/lpfc/lpfc_scsi.c:5688: warning: Function parameter or member 'phba' not described in 'lpfc_create_device_data'
 drivers/scsi/lpfc/lpfc_scsi.c:5688: warning: Function parameter or member 'pri' not described in 'lpfc_create_device_data'
 drivers/scsi/lpfc/lpfc_scsi.c:5688: warning: Excess function parameter 'pha' description in 'lpfc_create_device_data'
 drivers/scsi/lpfc/lpfc_scsi.c:5730: warning: Function parameter or member 'phba' not described in 'lpfc_delete_device_data'
 drivers/scsi/lpfc/lpfc_scsi.c:5730: warning: Excess function parameter 'pha' description in 'lpfc_delete_device_data'
 drivers/scsi/lpfc/lpfc_scsi.c:5762: warning: Function parameter or member 'phba' not described in '__lpfc_get_device_data'
 drivers/scsi/lpfc/lpfc_scsi.c:5762: warning: Excess function parameter 'pha' description in '__lpfc_get_device_data'
 drivers/scsi/lpfc/lpfc_scsi.c:5818: warning: Function parameter or member 'phba' not described in 'lpfc_find_next_oas_lun'
 drivers/scsi/lpfc/lpfc_scsi.c:5818: warning: Function parameter or member 'found_lun_pri' not described in 'lpfc_find_next_oas_lun'
 drivers/scsi/lpfc/lpfc_scsi.c:5818: warning: Excess function parameter 'pha' description in 'lpfc_find_next_oas_lun'
 drivers/scsi/lpfc/lpfc_scsi.c:5909: warning: Function parameter or member 'phba' not described in 'lpfc_enable_oas_lun'
 drivers/scsi/lpfc/lpfc_scsi.c:5909: warning: Function parameter or member 'pri' not described in 'lpfc_enable_oas_lun'
 drivers/scsi/lpfc/lpfc_scsi.c:5909: warning: Excess function parameter 'pha' description in 'lpfc_enable_oas_lun'
 drivers/scsi/lpfc/lpfc_scsi.c:5968: warning: Function parameter or member 'phba' not described in 'lpfc_disable_oas_lun'
 drivers/scsi/lpfc/lpfc_scsi.c:5968: warning: Function parameter or member 'pri' not described in 'lpfc_disable_oas_lun'
 drivers/scsi/lpfc/lpfc_scsi.c:5968: warning: Excess function parameter 'pha' description in 'lpfc_disable_oas_lun'

Link: https://lore.kernel.org/r/20201102142359.561122-4-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-10 22:27:45 -05:00
James Smart
7c30bb62ed scsi: lpfc: Enlarge max_sectors in scsi host templates
The driver supports arbitrarily large scatter-gather lists and the current
value for max_sectors is limiting.

Change max_sectors to the largest value.  This was actually done prior but
it only corrected one template and that template was later removed.

So change the remaining 2 templates. Other areas which hard-set the sectors
value should be inheriting what is in the template.

Link: https://lore.kernel.org/r/20201020202719.54726-7-james.smart@broadcom.com
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-26 21:42:38 -04:00
Tom Rix
170b7d2de2 scsi: Remove unneeded break statements
A break is not needed if it is preceded by a return or goto.

Link: https://lore.kernel.org/r/20201019142333.16584-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-26 18:23:24 -04:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Dick Kennedy
372c187b8a scsi: lpfc: Add an internal trace log buffer
The current logging methods typically end up requesting a reproduction with
a different logging level set to figure out what happened. This was mainly
by design to not clutter the kernel log messages with things that were
typically not interesting and the messages themselves could cause other
issues.

When looking to make a better system, it was seen that in many cases when
more data was wanted was when another message, usually at KERN_ERR level,
was logged.  And in most cases, what the additional logging that was then
enabled was typically. Most of these areas fell into the discovery machine.

Based on this summary, the following design has been put in place: The
driver will maintain an internal log (256 elements of 256 bytes).  The
"additional logging" messages that are usually enabled in a reproduction
will be changed to now log all the time to the internal log.  A new logging
level is defined - LOG_TRACE_EVENT.  When this level is set (it is not by
default) and a message marked as KERN_ERR is logged, all the messages in
the internal log will be dumped to the kernel log before the KERN_ERR
message is logged.

There is a timestamp on each message added to the internal log. However,
this timestamp is not converted to wall time when logged. The value of the
timestamp is solely to give a crude time reference for the messages.

Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-02 23:06:49 -04:00
James Smart
2fcbc569b9 scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI
Currently driver ktime stats, measuring code paths, is NVME-specific.

Convert the stats routines such that the code paths are generic, providing
status for NVME and SCSI. Added ktime stat calls in SCSI queuecommand and
cmpl routines.

Link: https://lore.kernel.org/r/20200322181304.37655-11-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-29 18:10:58 -04:00
James Smart
840eda9602 scsi: lpfc: Fix erroneous cpu limit of 128 on I/O statistics
The cpu io statistics were capped by a hard define limit of 128. This
effectively was a max number of CPUs, not an actual CPU count, nor actual
CPU numbers which can be even larger than both of those values. This made
stats off/misleading and on large CPU count systems, wrong.

Fix the stats so that all CPUs can have a stats struct.  Fix the looping
such that it loops by hdwq, finds CPUs that used the hdwq, and sum the
stats, then display.

Link: https://lore.kernel.org/r/20200322181304.37655-9-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-29 18:10:48 -04:00
James Smart
c90b448023 scsi: lpfc: Fix scsi host template for SLI3 vports
SCSI layer sends driver IOs with more s/g segments than driver can handle.
This results in "Too many sg segments from dma_map_sg. Config 64, seg_cnt
219" error messages from the lpfc_scsi_prep_dma_buf_s3() routine.

The was due to use the driver using individual templates for pport and
vport, host reset enabled or not, nvme vs scsi, etc. In the end, there was
a combination for a vport that didn't match the pport.

Rather than enumerating more templates and more discretionary assignments,
revert to a base template that is copied to a template specific to the
pport/vport. Then, based on role, attributes and sli type, modify the
fields that are different for that port.  Added a log message to
lpfc_create_port to validate values.

Link: https://lore.kernel.org/r/20200322181304.37655-5-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-03-26 23:15:08 -04:00
James Smart
145e5a8a5c scsi: lpfc: Copyright updates for 12.6.0.4 patches
Update copyrights to 2020 for files modified in the 12.6.0.4 patch set.

Link: https://lore.kernel.org/r/20200128002312.16346-13-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-10 22:46:56 -05:00
James Smart
0ab384a49c scsi: lpfc: Fix lpfc_io_buf resource leak in lpfc_get_scsi_buf_s4 error path
If a call to lpfc_get_cmd_rsp_buf_per_hdwq returns NULL (memory allocation
failure), a previously allocated lpfc_io_buf resource is leaked.

Fix by releasing the lpfc_io_buf resource in the failure path.

Fixes: d79c9e9d4b3d ("scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20200128002312.16346-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-10 22:46:55 -05:00
James Smart
c438d0628a scsi: lpfc: Fix improper flag check for IO type
Current driver code looks at iocb types and uses a "==" comparison on the
flags to determine type. If another flag were set, it would disrupt the
comparison.

Fix by converting to a bitwise & operation.

Link: https://lore.kernel.org/r/20191218235808.31922-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-21 13:42:42 -05:00
Linus Torvalds
ef2cc88e2a SCSI misc on 20191130
This is mostly update of the usual drivers: aacraid, ufs, zfcp,
 NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
 plus a whole load of minor updates and fixes.  The two major core
 changes are Al Viro's reworking of sg's handling of copy to/from user,
 Ming Lei's removal of the host busy counter to avoid contention in the
 multiqueue case and Damien Le Moal's fixing of residual tracking
 across error handling.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXeKvHCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQJMAQDAjlAi
 SNfbyndMqyf+rZGWufDI+43Up1VvW9GeWJHeDwEAxfO5XZsCks2uT8UxXhpEp9L7
 HkiUww3zbcgl0FWFkUM=
 =cdVU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: aacraid, ufs, zfcp,
  NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
  plus a whole load of minor updates and fixes.

  The major core changes are Al Viro's reworking of sg's handling of
  copy to/from user, Ming Lei's removal of the host busy counter to
  avoid contention in the multiqueue case and Damien Le Moal's fixing of
  residual tracking across error handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (251 commits)
  scsi: bnx2fc: timeout calculation invalid for bnx2fc_eh_abort()
  scsi: target: core: Fix a pr_debug() argument
  scsi: iscsi: Don't send data to unbound connection
  scsi: target: iscsi: Wait for all commands to finish before freeing a session
  scsi: target: core: Release SPC-2 reservations when closing a session
  scsi: target: core: Document target_cmd_size_check()
  scsi: bnx2i: fix potential use after free
  Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails"
  scsi: NCR5380: Add disconnect_mask module parameter
  scsi: NCR5380: Unconditionally clear ICR after do_abort()
  scsi: NCR5380: Call scsi_set_resid() on command completion
  scsi: scsi_debug: num_tgts must be >= 0
  scsi: lpfc: use hdwq assigned cpu for allocation
  scsi: arcmsr: fix indentation issues
  scsi: qla4xxx: fix double free bug
  scsi: pm80xx: Modified the logic to collect fatal dump
  scsi: pm80xx: Tie the interrupt name to the module instance
  scsi: pm80xx: Controller fatal error through sysfs
  scsi: pm80xx: Do not request 12G sas speeds
  scsi: pm80xx: Cleanup command when a reset times out
  ...
2019-12-02 13:37:02 -08:00
James Smart
6f23f8c5c9 scsi: lpfc: fix: Coverity: lpfc_get_scsi_buf_s3(): Null pointer dereferences
Coverity reported the following:

*** CID 1487391:  Null pointer dereferences  (FORWARD_NULL)
/drivers/scsi/lpfc/lpfc_scsi.c: 614 in lpfc_get_scsi_buf_s3()
608     		spin_unlock(&phba->scsi_buf_list_put_lock);
609     	}
610     	spin_unlock_irqrestore(&phba->scsi_buf_list_get_lock, iflag);
611
612     	if (lpfc_ndlp_check_qdepth(phba, ndlp)) {
613     		atomic_inc(&ndlp->cmd_pending);
vvv     CID 1487391:  Null pointer dereferences  (FORWARD_NULL)
vvv     Dereferencing null pointer "lpfc_cmd".
614     		lpfc_cmd->flags |= LPFC_SBUF_BUMP_QDEPTH;
615     	}
616     	return  lpfc_cmd;
617     }
618     /**
619      * lpfc_get_scsi_buf_s4 - Get a scsi buffer from io_buf_list of the HBA

Fix by checking lpfc_cmd to be non-NULL as part of line 612

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1487391 ("Null pointer dereferences")
Fixes: 2a5b7d626ed2 ("scsi: lpfc: Limit tracking of tgt queue depth in fast path")

CC: "Martin K. Petersen" <martin.petersen@oracle.com>
CC: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
CC: linux-next@vger.kernel.org
Link: https://lore.kernel.org/r/20191111230401.12958-2-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
James Smart
22770cbabf scsi: lpfc: Slight fast-path performance optimizations
Slightly rework some error check code paths for better streamlining.

Added compiler unlikely hints to allow slightly better optimization of the
fast-path.

Removed a few pointer checks that were obviously already valid.

Link: https://lore.kernel.org/r/20191018211832.7917-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:05 -04:00
James Smart
91a52b617c scsi: lpfc: Fix hardlockup in lpfc_abort_handler
In lpfc_abort_handler, the lock acquire order is hbalock (irqsave),
buf_lock (irq) and ring_lock (irq).  The issue is that in two places the
locks are released out of order - the buf_lock and the hbalock - resulting
in the cpu preemption/lock flags getting restored out of order and
deadlocking the cpu.

Fix the unlock order by fully releasing the hbalocks as well.

CC: Zhangguanghui <zhang.guanghui@h3c.com>
Link: https://lore.kernel.org/r/20191018211832.7917-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:04 -04:00
James Smart
324e1c4020 scsi: lpfc: Fix bad ndlp ptr in xri aborted handling
In cases where I/O may be aborted, such as driver unload or link bounces,
the system will crash based on a bad ndlp pointer.

Example:
  RIP: 0010:lpfc_sli4_abts_err_handler+0x15/0x140 [lpfc]
  ...
  lpfc_sli4_io_xri_aborted+0x20d/0x270 [lpfc]
  lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x84/0x170 [lpfc]
  lpfc_sli4_fp_handle_cqe+0xc2/0x480 [lpfc]
  __lpfc_sli4_process_cq+0xc6/0x230 [lpfc]
  __lpfc_sli4_hba_process_cq+0x29/0xc0 [lpfc]
  process_one_work+0x14c/0x390

Crash was caused by a bad ndlp address passed to I/O indicated by the XRI
aborted CQE.  The address was not NULL so the routine deferenced the ndlp
ptr. The bad ndlp also caused the lpfc_sli4_io_xri_aborted to call an
erroneous io handler.  Root cause for the bad ndlp was an lpfc_ncmd that
was aborted, put on the abort_io list, completed, taken off the abort_io
list, sent to lpfc_release_nvme_buf where it was put back on the abort_io
list because the lpfc_ncmd->flags setting LPFC_SBUF_XBUSY was not cleared
on the final completion.

Rework the exchange busy handling to ensure the flags are properly set for
both scsi and nvme.

Fixes: c490850a0947 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing")
Cc: <stable@vger.kernel.org> # v5.1+
Link: https://lore.kernel.org/r/20191018211832.7917-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:04 -04:00
Hannes Reinecke
1052b41b25 scsi: lpfc: remove left-over BUILD_NVME defines
The BUILD_NVME define never got defined anywhere, causing NVMe commands to
be treated as SCSI commands when freeing the buffers.  This was causing a
stuck discovery and a horrible crash in lpfc_set_rrq_active() later on.

Link: https://lore.kernel.org/r/20191017150019.75769-1-hare@suse.de
Fixes: c00f62e6c546 ("scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-17 22:01:27 -04:00
James Smart
43bfea1bff scsi: lpfc: Fix coverity errors on NULL pointer checks
Coverity flagged several scenarios where checking of null pointer values
wasn't consistent.

Fix the code to that be consistent on checking.

Link: https://lore.kernel.org/r/20190922035906.10977-12-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 22:07:10 -04:00
James Smart
9df0a0381a scsi: lpfc: Fix GPF on scsi command completion
Faults are seen with RIP of lpfc_scsi_cmd_iocb_cmpl().  The failure is when
lpfc_update_status is being called as part of the completion.  After
debugging, it was seen the issue was the shost pointer that the driver
derived from the scsi cmd.  The crash showed the cmd->device pointer being
bogus, which is likely as the scsi devices were offlined prior. The bogus
device pointer caused subsequent pointers derived from the location,
specifically the vport, to be bogus.

Fix by adjusting the calling sequence to pass in the vport rather than
having to derive it from the cmd structure.

Link: https://lore.kernel.org/r/20190922035906.10977-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 22:07:09 -04:00
James Smart
9db6c14c36 scsi: lpfc: Remove bg debugfs buffers
Capturing and downloading dif command data and dif data was done a dozen
years ago and no longer being used. Also creates a potential security hole.

Remove the debugfs buffer for dif debugging.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com>
CC: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-29 18:08:58 -04:00
James Smart
c00f62e6c5 scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair
Currently, each hardware queue, typically allocated per-cpu, consists of a
WQ/CQ pair per protocol. Meaning if both SCSI and NVMe are supported 2
WQ/CQ pairs will exist for the hardware queue. Separate queues are
unnecessary. The current implementation wastes memory backing the 2nd set
of queues, and the use of double the SLI-4 WQ/CQ's means less hardware
queues can be supported which means there may not always be enough to have
a pair per cpu. If there is only 1 pair per cpu, more cpu's may get their
own WQ/CQ.

Rework the implementation to use a single WQ/CQ pair by both protocols.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:12 -04:00
James Smart
d79c9e9d4b scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.
Typical SLI-4 hardware supports up to 2 4KB pages to be registered per XRI
to contain the exchanges Scatter/Gather List. This caps the number of SGL
elements that can be in the SGL. There are not extensions to extend the
list out of the 2 pages.

The G7 hardware adds a SGE type that allows the SGL to be vectored to a
different scatter/gather list segment. And that segment can contain a SGE
to go to another segment and so on.  The initial segment must still be
pre-registered for the XRI, but it can be a much smaller amount (256Bytes)
as it can now be dynamically grown.  This much smaller allocation can
handle the SG list for most normal I/O, and the dynamic aspect allows it to
support many MB's if needed.

The implementation creates a pool which contains "segments" and which is
initially sized to hold the initial small segment per xri. If an I/O
requires additional segments, they are allocated from the pool.  If the
pool has no more segments, the pool is grown based on what is now
needed. After the I/O completes, the additional segments are returned to
the pool for use by other I/Os. Once allocated, the additional segments are
not released under the assumption of "if needed once, it will be needed
again". Pools are kept on a per-hardware queue basis, which is typically
1:1 per cpu, but may be shared by multiple cpus.

The switch to the smaller initial allocation significantly reduces the
memory footprint of the driver (which only grows if large ios are
issued). Based on the several K of XRIs for the adapter, the 8KB->256B
reduction can conserve 32MBs or more.

It has been observed with per-cpu resource pools that allocating a resource
on CPU A, may be put back on CPU B. While the get routines are distributed
evenly, only a limited subset of CPUs may be handling the put routines.
This can put a strain on the lpfc_put_cmd_rsp_buf_per_cpu routine because
all the resources are being put on a limited subset of CPUs.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:12 -04:00
James Smart
3235066449 scsi: lpfc: Migrate to %px and %pf in kernel print calls
In order to see real addresses, convert %p with %px for kernel addresses
and replace %p with %pf for functions.

While converting, standardize on "x%px" throughout (not %px or 0x%px).

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:11 -04:00
James Smart
5e0e2318aa scsi: lpfc: Fix too many sg segments spamming in kernel log
This issue is specific to SLI-3 adapters, specifically when DIF is used.

Once seen, this message floods the logs:
9064 BLKGRD: lpfc_scsi_prep_dma_buf_s3: Too many sg segments from
  dma_map_sg

The driver, upon detecting an error such as too many elements in an sglist,
misrepresents the error by treating it as a temporary resource issue by
returning MLQUEUE_HOST_BUSY.  In these cases, no retry will fix it and it
should have been a hard error. The repeated retry was causing the spamming
of the log.

As for the initial reason of why an I/O encountered this issue at all is
not clear as parameters set by the driver should have avoided this.  The
dm multipath maintainer has been notified of the issue.

Fix by changing the return code for the dma mapping routines to indicate
cases that are not retryable and return DID_ERROR on those cases.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
8c24a4f643 scsi: lpfc: Fix crash due to port reset racing vs adapter error handling
If the adapter encounters a condition which causes the adapter to fail
(driver must detect the failure) simultaneously to a request to the driver
to reset the adapter (such as a host_reset), the reset path will be racing
with the asynchronously-detect adapter failure path.  In the failing
situation, one path has started to tear down the adapter data structures
(io_wq's) while the other path has initiated a repeat of the teardown and
is in the lpfc_sli_flush_xxx_rings path and attempting to access the
just-freed data structures.

Fix by the following:

 - In cases where an adapter failure is detected, rather than explicitly
   calling offline_eratt() to start the teardown, change the adapter state
   and let the later calls of posted work to the slowpath thread invoke the
   adapter recovery.  In essence, this means all requests to reset are
   serialized on the slowpath thread.

 - Clean up the routine that restarts the adapter. If there is a failure
   from brdreset, don't immediately error and leave things in a partial
   state. Instead, ensure the adapter state is set and finish the teardown
   of structures before returning.

 - If in the scsi host reset handler and the board fails to reset and
   restart (which can be due to parallel reset/recovery paths), instead of
   hard failing and explicitly calling offline_eratt() (which gets into the
   redundant path), just fail out and let the asynchronous path resolve the
   adapter state.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
996a02aeb9 scsi: lpfc: Fix fcp_rsp_len checking on lun reset
Issuing a LUN reset was resulting in a command failure which then escalated
to a host reset.

The FCP-4 spec allows fcp_rsp_len field to specify the number of valid
bytes of FCP_RSP_INFO, and the value could be 4 or 8.  The driver is
allowing only a value of 8, thus it failed the command.

Revise the driver to allow 4 or 8.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:22 -04:00
James Smart
b9e5a2d961 scsi: lpfc: Fix hardlockup in scsi_cmd_iocb_cmpl
There is a race condition with the abort handler declaring a waitq
item on it's stack, followed by a timeout in the abort handler that
has it give up on the abort return to its caller. When the io is
finally aborted and its completion handler called, it references
the waitq element that the abort_handler set up, which is no longer
valid resulting in a deadlock.

Fix by clearing the waitq reference, under lock, when the abort
handler timeout gives up. Have the completion handler validate the
waitq before referencing it.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:21 -04:00
James Smart
2d71dc8eb6 scsi: lpfc: Fix alloc context on oas lun creations
Softlockups are seen in low memory situations. They are due to doing
oas_lun allocation with GFP_KERNEL in atomic contexts.

Change the calls to oas_lun to indicate atomic context so that GFP_ATOMIC
is used.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:20 -04:00
Martin K. Petersen
17631462cd Merge branch '5.1/scsi-fixes' into 5.2/merge
We have a few submissions for 5.2 that depend on fixes merged post
5.1-rc1. Merge the fixes branch into queue.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-12 21:27:23 -04:00
James Smart
4eb0153588 scsi: lpfc: Fix missing wakeups on abort threads
Abort thread wakeups, on some wqe types, are not happening.  The thread
wakeup logic is dependent upon the LPFC_DRIVER_ABORTED flag. However, on
these wqes, the completion handler running prior to the io completion
routine ends up clearing the flag.

Rework the wakeup logic to look at a non-null waitq element which must be
set if the abort thread is waiting. This is reverting the change in the
indicated patch.

Fixes: c2017260eea2d ("scsi: lpfc: Rework locking on SCSI io completion")
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:35:09 -04:00
Bart Van Assche
d6d189ceab scsi: lpfc: Change smp_processor_id() into raw_smp_processor_id()
This patch avoids that a kernel warning appears when smp_processor_id() is
called with preempt debugging enabled.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:36 -04:00
Bart Van Assche
cd05c155d7 scsi: lpfc: Annotate switch/case fall-through
This patch avoids that the compiler warns about missing fall-through
annotation when building with W=1.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:35 -04:00
James Smart
bbd3d7380b scsi: lpfc: Fix driver crash in target reset handler
It's possible for the scsi error handler to fire and call the target reset
handler simultaneously to the driver logging out and relogging into the
system.  If hit just right, the re-login may not have fully re-established
the remote port and the rdata->pnod structure may be null.

Check for NULL in the reset handler and return failure if NULL.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 13:15:08 -04:00
James Smart
ff6bf89717 scsi: lpfc: Resolve inconsistent check of hdwq in lpfc_scsi_cmd_iocb_cmpl
A prior patch which added support for non-uniform allocation of MSIX
vectors now causes a smatch complaint:

 drivers/scsi/lpfc/lpfc_scsi.c:3674 lpfc_scsi_cmd_iocb_cmpl()
   error: we previously assumed 'phba->sli4_hba.hdwq' could be
          null (see line 3667)

Resolve by removing the unnecessary check for a NULL hdwq table.

Fixes 6a828b0f6192: ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 12:57:01 -04:00