The initial version of TCMU (in 3.18) does not properly handle
bidirectional SCSI commands -- those with both an in and out buffer. In
looking to fix this it also became clear that TCMU's support for adding
new types of entries (opcodes) to the command ring was broken. We need
to fix this now, so that future issues can be handled properly by adding
new opcodes.
We make the most of this ABI break by enabling bidi cmd handling within
TCMP_OP_CMD opcode. Add an iov_bidi_cnt field to tcmu_cmd_entry.req.
This enables TCMU to describe bidi commands, but further kernel work is
needed for full bidi support.
Enlarge tcmu_cmd_entry_hdr by 32 bits by pulling in cmd_id and __pad1. Turn
__pad1 into two 8 bit flags fields, for kernel-set and userspace-set flags,
"kflags" and "uflags" respectively.
Update version fields so userspace can tell the interface is changed.
Update tcmu-design.txt with details of how new stuff works:
- Specify an additional requirement for userspace to set UNKNOWN_OP
(bit 0) in hdr.uflags for unknown/unhandled opcodes.
- Define how Data-In and Data-Out fields are described in req.iov[]
Changed in v2:
- Change name of SKIPPED bit to UNKNOWN bit
- PAD op does not set the bit any more
- Change len_op helper functions to take just len_op, not the whole struct
- Change version to 2 in missed spots, and use defines
- Add 16 unused bytes to cmd_entry.req, in case additional SAM cmd
parameters need to be included
- Add iov_dif_cnt field to specify buffers used for DIF info in iov[]
- Rearrange fields to naturally align cdb_off
- Handle if userspace sets UNKNOWN_OP by indicating failure of the cmd
- Wrap some overly long UPDATE_HEAD lines
(Add missing req.iov_bidi_cnt + req.iov_dif_cnt zeroing - Ilias)
Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
When UNMAP command is issued with DIF protection support enabled,
the protection info for the unmapped region is remain unchanged.
So READ command for the region causes data integrity failure.
This fixes it by invalidating protection info for the unmapped region
by filling with 0xff pattern. This change also adds helper function
fd_do_prot_fill() in order to reduce code duplication with existing
fd_format_prot().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In fd_do_prot_rw(), it allocates prot_buf which is used to copy from
se_cmd->t_prot_sg by sbc_dif_copy_prot(). The SG table for prot_buf
is also initialized by allocating 'se_cmd->t_prot_nents' entries of
scatterlist and setting the data length of each entry to PAGE_SIZE
at most.
However if se_cmd->t_prot_sg contains a clustered entry (i.e.
sg->length > PAGE_SIZE), the SG table for prot_buf can't be
initialized correctly and sbc_dif_copy_prot() can't copy to prot_buf.
(This actually happened with TCM loopback fabric module)
As prot_buf is allocated by kzalloc() and it's physically contiguous,
we only need a single scatterlist entry.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
When CONFIG_DEBUG_SG=y and DIF protection support enabled, kernel
BUG()s are triggered due to the following two issues:
1) prot_sg is not initialized by sg_init_table().
When CONFIG_DEBUG_SG=y, scatterlist helpers check sg entry has a
correct magic value.
2) vmalloc'ed buffer is passed to sg_set_buf().
sg_set_buf() uses virt_to_page() to convert virtual address to struct
page, but it doesn't work with vmalloc address. vmalloc_to_page()
should be used instead. As prot_buf isn't usually too large, so
fix it by allocating prot_buf by kmalloc instead of vmalloc.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The loop in core_tmr_abort_task() iterates over sess_cmd_list.
That list is a list of regular commands and task management
functions (TMFs). Skip TMFs in this loop instead of letting
the target drivers filter out TMFs in their get_task_tag()
callback function.
(Drop bogus check removal in tcm_qla2xxx_get_task_tag - nab)
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: <qla2xxx-upstream@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Now that sbc_dif_generate can also be called for READ_INSERT, update
the debugging message accordingly.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The internal DIF emulation was not honoring se_cmd->prot_checks for
the WRPROTECT/RDPROTECT == 0x3 case, so sbc_dif_v1_verify() has been
updated to follow which checks have been calculated based on
WRPROTECT/RDPROTECT in sbc_set_prot_op_checks().
Reviewed-by: Martin Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In sbc_check_prot(), if PROTECT is non-zero for a backend device with
DIF disabled, and sess_prot_type is not set go ahead and return
INVALID_CDB_FIELD.
Reviewed-by: Martin Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The following incremental patch saves the current sess_prot_type into
se_node_acl, and will always reset sess_prot_type if a previous saved
value exists. So the PI setting for the fabric's session with backend
devices not supporting PI is persistent across session restart.
(Fix se_node_acl dereference for discovery sessions - DanCarpenter)
Reviewed-by: Martin Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The scatterlist for protection information which is passed to
sbc_dif_verify_read() or sbc_dif_verify_write() requires that
neighboring scatterlist entries are contiguous or chained so that they
can be iterated by sg_next().
However, the protection information for RD-MCP backends could be located
in the multiple scatterlist arrays when the ramdisk space is too large.
So if the read/write request straddles this boundary, sbc_dif_verify_read()
or sbc_dif_verify_write() can't iterate all scatterlist entries.
This problem can be fixed by chaining protection information scatterlist
at creation time. For the architectures which don't support sg chaining
(i.e. !CONFIG_ARCH_HAS_SG_CHAIN), fix it by allocating temporary
scatterlist if needed.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: target-devel@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The flag SCF_ACK_KREF is only set but never tested. Hence remove
this flag.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Avoid that sparse complains about context imbalances.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a bug for COMPARE_AND_WRITE handling with
fabrics using SCF_PASSTHROUGH_SG_TO_MEM_NOALLOC.
It adds the missing allocation for cmd->t_bidi_data_sg within
transport_generic_new_cmd() that is used by COMPARE_AND_WRITE
for the initial READ payload, even if the fabric is already
providing a pre-allocated buffer for cmd->t_data_sg.
Also, fix zero-length COMPARE_AND_WRITE handling within the
compare_and_write_callback() and target_complete_ok_work()
to queue the response, skipping the initial READ.
This fixes COMPARE_AND_WRITE emulation with loopback, vhost,
and xen-backend fabric drivers using SG_TO_MEM_NOALLOC.
Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Instead of calling target_fabric_configfs_init() +
target_fabric_configfs_register() / target_fabric_configfs_deregister()
target_fabric_configfs_free() from every target driver, rewrite the API
so that we have simple register/unregister functions that operate on
a const operations vector.
This patch also fixes a memory leak in several target drivers. Several
target drivers namely called target_fabric_configfs_deregister()
without calling target_fabric_configfs_free().
A large part of this patch is based on earlier changes from
Bart Van Assche <bart.vanassche@sandisk.com>.
(v2: Add a new TF_CIT_SETUP_DRV macro so that the core configfs code
can declare attributes as either core only or for drivers)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Drop unused argument & return value and consolidate a duplicate assignment.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Factor out code duplication in rd_execute_rw() into a helper function
rd_do_prot_rw(). This change is required to minimize the forthcoming
fix in rd_do_prot_rw().
(Fix up v4.1 for-next fuzz - nab)
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: target-devel@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Currently, for example, mkdir "tpgt_xyz" doesn't return error.
mkdir /sys/kernel/config/target/loopback/naa.60014055f195952b/tpgt_xyz
Replace obsoleted simple_strtoul with kstrtoul and check the conversion.
Signed-off-by: Ming Lin <mlin@kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
These variables are always accessed via struct isert_conn so
no need to have a "conn_" prefix for them.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
iser target can handle as many connect request as
the fabric sends to it. This backlog should not set as
a back-pressure mechanism (which is not very useful).
isert does need a back-pressure mechanism, but it should
be added in isert by monitoring the number of pending
established connections (will be added in a later stage).
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
In iser_connect_release there is no chance that
the iser device is set to NULL, if this happens
we have a BUG. So use BUG_ON.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Not sure what it was used for, but there is
no real need for it now as I see it. Go ahead
and get rid of it.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Move login buffer alloc/free code to dedicated
routines and introduce isert_conn_init which
initializes the connection lists and locks.
Simplifies and cleans up the code a little bit.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
isert_device_find_by_ib_dev and isert_device_try_release
can have a better, more common name like isert_device_[get|put].
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
No need to keep a local ib_dev as a device pointer.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Move the code for completion context handling to dedicated
routines. This simplifies the code and removes code duplication.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Simplify iser QP creation by splitting some unrelated
logic bulks to routines.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This is to favor the HCA cache hit rate using less MRs
and PDs. This commit partially reverts commit:
"eb6ab13 IB/isert: separate connection protection domains and dma MRs"
At the time I thought this would be needed.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Before we reach to connection established we may get an
error event. In this case the core won't teardown this
connection (never established it), so we take care of freeing
it ourselves.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This hang was a result of a missing command put when
a DIF error occurred during a rdma read (and we sent
an CHECK_CONDITION error without passing it to the
backend).
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch updates iscsi/iser-target to add a new fabric_prot_type
TPG attribute for iser-target, used for controlling LLD level
protection into LIO when the backend device does not support T10-PI.
This is required for ib_isert to enable WRITE_STRIP + READ_INSERT
hardware offloads.
It's disabled by default and controls which se_sesion->sess_prot_type
are set at iscsi_target_locate_portal() session registration time.
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Martin Petersen <martin.petersen@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch updates qla2xxx target to add a new fabric_prot_type TPG
attribute, used for controlling LLD level protection into LIO when
the backend device does not support T10-PI.
This is required for qla_target.c to enable WRITE_STRIP + READ_INSERT
hardware offloads.
It's disabled by default and controls which se_sesion->sess_prot_type
are set at tcm_qla2xxx_check_initiator_node_acl() session registration
time.
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds the missing TARGET_PROT_ALL when initializing a new
session and declaring the capable se_sess->sup_prot_ops for T10-PI.
This is required in order to function with existing qla_target.c
DIF protection offload support.
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch updates vhost-scsi to add a new fabric_prot_type TPG
attribute, used for controlling LLD level protection into LIO when
the backend device does not support T10-PI.
This is required for vhost-scsi to enable WRITE_STRIP + READ_INSERT
operations using software emulation + crct10dif instruction offload.
It's disabled by default and controls which se_sesion->sess_prot_type
are set at vhost_scsi_make_nexus() session registration time.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch updates loopback to add a new fabric_prot_type TPG attribute,
used for controlling LLD level protection into LIO when the backend
device does not support T10-PI.
Also, go ahead and set DIN_PASS + DOUT_PASS so target-core knows that
it will be doing any WRITE_STRIP and READ_INSERT operations.
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Make sure that RAMDISK only attempts to use backend DIF emulation
when it's actually enabled at device level.
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Make sure that IBLOCK only attempts to use backend DIF emulation
when it's actually enabled at device level.
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Make sure that FILEIO only attempts to use backend DIF emulation
when it's actually enabled at device level.
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds READ_INSERT support in target_read_prot_action() that
invokes sbc_dif_generate() when LIO is responsible for generating the
outgoing T10-PI.
Required for supporting fabrics that exchange protection information,
and would like to function with un-protected devices.
Reviewed-by: Martin Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch moves the existing target_complete_ok_work() check for
cmd->prot_op into it's own function, so it's easier to add future
support for READ INSERT.
Reviewed-by: Martin Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds WRITE_STRIP support in target_write_prot_action() that
invokes sbc_dif_verify_write() for checking T10-PI metadata before
submitting the I/O to a backend driver.
Upon verify failure, the specific sense code is propigated up the
failure path up to transport_generic_request_failure().
Also, update sbc_dif_verify_write() to only perform the subsequent
protection metadata copy when a valid *sg is passed.
(Use ilog2 instead of division and unlikely for pi_err - Sagi)
Reviewed-by: Martin Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch moves the existing target_execute_cmd() check for
cmd->prot_op into it's own function, so it's easier to add
future support for WRITE STRIP.
(Use better target_write_prot_action name - Sagi)
Reviewed-by: Martin Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch updates standard INQUIRY, INQUIRY EVPD=0x86, READ_CAPACITY_16
and control mode pages to use se_sess->sess_prot_type when determing which
type of T10-PI related feature bits can be exposed.
This is required for fabric sessions supporting T10-PI metadata to
backend devices that don't have protection enabled.
Reviewed-by: Martin Petersen <martin.petersen@oracle.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Doug Gilbert <dgilbert@interlog.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>