usNIC userspace/kernel ABI should be set to 4 instead of 3.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
usNIC default transport is UDP. Hence, advertise RDMA_NODE_USNIC_UDP
by default for usNIC devices.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add the complementary RDMA_NODE_USNIC_UDP for RDMA_TRANSPORT_USNIC_UDP.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
usNIC needs inet notifiers to function correctly, so add a Kconfig
dependency on CONFIG_INET.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch removes the net_random and net_srandom macros and replaces
them with direct calls to the prandom ones. As new commits only seem to
use prandom_u32 there is no use to keep them around.
This change makes it easier to grep for users of prandom_u32.
Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add the support for Ethernet L2 attributes in the
verbs/cm/cma structures.
When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
in a similar manner that the IB L2 (and the L4 PKEY) attributes are used.
Thus, those attributes were added to the following structures:
* ib_ah_attr - added dmac
* ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
* ib_wc - added smac, vlan_id
* ib_sa_path_rec - added smac, dmac, vlan_id
* cm_av - added smac and vlan_id
For the path record structure, extra care was taken to avoid the new
fields when packing it into wire format, so we don't break the IB CM
and SA wire protocol.
On the active side, the CM fills. its internal structures from the
path provided by the ULP. We add there taking the ETH L2 attributes
and placing them into the CM Address Handle (struct cm_av).
On the passive side, the CM fills its internal structures from the WC
associated with the REQ message. We add there taking the ETH L2
attributes from the WC.
When the HW driver provides the required ETH L2 attributes in the WC,
they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
code checks for the presence of these flags, and in their absence does
address resolution from the ib_init_ah_from_wc() helper function.
ib_modify_qp_is_ok is also updated to consider the link layer. Some
parameters are mandatory for Ethernet link layer, while they are
irrelevant for IB. Vendor drivers are modified to support the new
function signature.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch adds support for steerable (NETIF) QP creation. When we
create the device, we allocate a range of steerable QPs.
Afterward when a QP is created with the NETIF flag, it's allocated
from this range. Allocation is managed by bitmap allocator.
Internal steering rules for those QPs is automatically generated on
their creation.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The mlx4 device requires adding IB flow spec to rules that apply over
infiniband link layer. This patch adds a mechanism to add such a rule.
If higher levels e.g. IP/UDP/TCP flow specs are provided, the device
requires us to add an empty wild-carded IB rule. Furthermore, the device
requires the QPN to be put in the rule.
Add here specific parsing support for IB empty rules and the ability
to self-generate missing specs based on existing ones.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Up until now, flow steering wasn't supported when using IB ports.
This patch enables support for flow steering if all hardware ports
support that, for example the new MLX4_DEV_CAP_FLAG2_DMFS_IPOIB mlx4
device capability.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
When creating an IPoIB UD QP, provide a hint to the low level driver
that the QP should support flow-steering. This means that privileged
user space applications can steer TCP/IP IPoIB traffic from the
network stack, in a similar manner done with Ethernet RAW_PACKET QPs.
The hint is provided through new QP creation flag called NETIF_QP.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The micro UAR (uuar) allocator had a bug which resulted from the fact
that in each UAR we only have two micro UARs avaialable, those at
index 0 and 1. This patch defines iterators to aid in traversing the
list of available micro UARs when allocating a uuar.
In addition, change the logic in create_user_qp() so that if high
class allocation fails (high class means lower latency), we revert to
medium class and not to the low class.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The variable start in struct mlx5_ib_mr is never used. Remove it.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add comment describing usnic_transport_rsrv port and remove
extraneous space from usnic_transport.c.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Use for_each_sg() instead of an explicit for-loop to iterate over
scatter-gather list.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add RDMA_TRANSPORT_USNIC_UDP which will be used by usNIC.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add supports for:
1) Parsing the socket file descriptor pass down from userspace.
2) IP notifiers
3) Encoding the IP in the GID
4) Other aux. changes to support UDP
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch provides API for rest of usNIC code to increment or decrement
socket's reference count. Auxiliary socket APIs are also provided.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add *ip field* to *struct usnic_fwd_dev* as well as new *functions* to
manipulate the *ip field.* Furthermore, add new functions for
programming UDP flows in the forwarding device.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Expand the kernel/userspace interface so userspace may push down
a socket file descriptor to usNIC. Also, bump up the abi and version
numbers.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch ports usnic_ib_sysfs.c to the new interface of
usnic_fwd.h.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch ports usnic_ib_qp_grp.[hc] to the new interface
of usnic_fwd.h.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch ports usnic_ib_main.c, usnic_ib_verbs.c and usnic_ib.h
to the new interface of usnic_fwd.h.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Push all of the usnic device forwarding state - such as mtu, mac - to
usnic_fwd_dev. Furthermore, usnic_fwd.h exposes a improved interface
for rest of the usnic code. The primary improvement is that
usnic_fwd.h's flow management interface takes in high-level *filter*
and *action* structures now, instead of low-level paramaters such as
vnic_idx, rq_idx.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add *struct usnic_transport_spec* for passing around transport
specifications.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
usNIC calls WARN_ON(spin_is_locked..) at few places. In some of these
instances, the call is made while holding a spinlock. Change
all WARN_ON(spin_is_locked...) calls in usNIC to
lockdep_assert_held to make it fool-proof bc the latter can be
called while holding a spinlock and unlike spin_is_locked,
lockdep_assert_held also works correctly on UP.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This adds a driver that allows userspace to use UD-like QPs over a
proprietary Cisco transport with Cisco's Virtual Interface Cards (VICs),
including VIC 1240 and 1280 cards.
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
OCRDMA_GEN2_FAMILY is wrongly defined as 0x02 -- it should be 0x0F.
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Fix ah->av->valid bit position and big endian portability.
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Conflicts:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
net/ipv6/ip6_tunnel.c
net/ipv6/ip6_vti.c
ipv6 tunnel statistic bug fixes conflicting with consolidation into
generic sw per-cpu net stats.
qlogic conflict between queue counting bug fix and the addition
of multiple MAC address support.
Signed-off-by: David S. Miller <davem@davemloft.net>
When we create a new infiniband link with uninfiniband device, e.g. `ip link
add link em1 type ipoib pkey 0x8001`. We will get a NULL pointer dereference
cause other dev like Ethernet don't have struct ib_device.
The code path is:
rtnl_newlink
|-- ipoib_new_child_link
|-- __ipoib_vlan_add
|-- ipoib_set_dev_features
|-- ib_query_device
Fix this bug by make sure the src net is infiniband when create new link.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"Some holiday bug fixes for 3.13... There is still one bug I'd like to
get fixed before 3.13-final.
The vlan code erroneously assignes the header ops of the underlying
real device to the VLAN device above it when the real device can
hardware offload VLAN handling. That's completely bogus because
header ops are tied to the device type, so they only expect to see a
'dev' argument compatible with their ops.
The fix is the have the VLAN code use a special set of header ops that
does the pass-thru correctly, by calling the underlying real device's
header ops but _also_ passing in the real device instead of the VLAN
device.
That fix is currently waiting some testing.
Anyways, of note here:
1) Fix bitmap edge case in radiotap, from Johannes Berg.
2) Fix oops on driver unload in rtlwifi, from Larry Finger.
3) Bonding doesn't do locking correctly during speed/duplex/link
changes, from Ding Tianhong.
4) Fix header parsing in GRE code, this bug has been around for a few
releases. From Timo Teräs.
5) SIT tunnel driver MTU check needs to take GSO into account, from
Eric Dumazet.
6) Minor info leak in inet_diag, from Daniel Borkmann.
7) Info leak in YAM hamradio driver, from Salva Peiró.
8) Fix route expiration state handling in ipv6 routing code, from Li
RongQing.
9) DCCP probe module does not check request_module()'s return value,
from Wang Weidong.
10) cpsw driver passes NULL device names to request_irq(), from
Mugunthan V N.
11) Prevent a NULL splat in RDS binding code, from Sasha Levin.
12) Fix 4G overflow test in tg3 driver, from Nithin Sujir.
13) Cure use after free in arc_emac and fec driver's software
timestamp handling, from Eric Dumazet.
14) SIT driver can fail to release the route when
iptunnel_handle_offloads() throws an error. From Li RongQing.
15) Several batman-adv fixes from Simon Wunderlich and Antonio
Quartulli.
16) Fix deadlock during TIPC socket release, from Ying Xue.
17) Fix regression in ROSE protocol recvmsg() msg_name handling, from
Florian Westphal.
18) stmmac PTP support releases wrong spinlock, from Vince Bridgers"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
stmmac: Fix incorrect spinlock release and PTP cap detection.
phy: IRQ cannot be shared
net: rose: restore old recvmsg behavior
xen-netback: fix guest-receive-side array sizes
fec: Do not assume that PHY reset is active low
tipc: fix deadlock during socket release
netfilter: nf_tables: fix wrong datatype in nft_validate_data_load()
batman-adv: fix vlan header access
batman-adv: clean nf state when removing protocol header
batman-adv: fix alignment for batadv_tvlv_tt_change
batman-adv: fix size of batadv_bla_claim_dst
batman-adv: fix size of batadv_icmp_header
batman-adv: fix header alignment by unrolling batadv_header
batman-adv: fix alignment for batadv_coded_packet
netfilter: nf_tables: fix oops when updating table with user chains
netfilter: nf_tables: fix dumping with large number of sets
ipv6: release dst properly in ipip6_tunnel_xmit
netxen: Correct off-by-one errors in bounds checks
net: Add some clarification to skb_tx_timestamp() comment.
arc_emac: fix potential use after free
...
Use the possibly more efficient ether_addr_equal
to instead of memcmp.
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Faisal Latif <faisal.latif@intel.com>
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Additional checks for uverbs to ensure forward compatibility, handle
malformed input better.
- Fix potential use-after-free in iWARP connection manager.
- Make a function static.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABCAAGBQJSuHFcAAoJEENa44ZhAt0hetAP/2lMjThZ/dK/Q4eyGXr86qQ6
ygjFc26H7SPEqZ2cjmcm0mgObHhVj4+94YiJDcemHpjwfzLhhZywZxjySTTJdKna
ZSzmSb8si0KNdPAoEdldPmtmO04oTI0+mz7kK2sz9mbn033xDVcJ24M8jSUy6/dl
KBmYWCaiE6rabSyvRNwXOIFilj5RwomQFZzJMjg9JGvWY5qpmwKbcvVdZMCSzAwS
VsZ2Z5UEDaghsGtnYjKQzvR1TDGeq6XyWtR4rfjCH19U6rUcInbA1oZFDWcOqE3C
r/o9jOCUszAVefG3iPX5PJ+A3kKCHB2lcDK9owfoIM/cLfV8KWBFkp0WouD5wiGr
8sfENcKI7aMq+Rptsa24FDNkJVCcdG1HhribRqWY/mn4CwI4c3Qe+5s4x+kgHnxU
VngX/Hp54lw+pnrgWWrKCT5tFVuDgdVhHqyYLabtnKrCHIPsb2Eqp8Kaz2JPNLkI
OYWOcS17wzg7nAPoiqbMVfWInGmziqOpWqVZkFwZ+EBTfJIjE7td+Mxn46PzKsrO
gbVYifn+q5cToDGwmK4pn14G1Xh45tqYW/P3KfZNwyGB6+Ui7u18Q5/epp8BHIWy
8bn3jTcJzBIMlOmFxqirrJ2OixaeAWJhZjgAMJIA3So9MHwmhbkJfgo0J+S6jLnI
p1WRy0kTmPC5NNX21YN8
=bol6
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband fixes from Roland Dreier:
"Last batch of InfiniBand/RDMA changes for 3.13 / 2014:
- Additional checks for uverbs to ensure forward compatibility,
handle malformed input better.
- Fix potential use-after-free in iWARP connection manager.
- Make a function static"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/uverbs: Check access to userspace response buffer in extended command
IB/uverbs: Check input length in flow steering uverbs
IB/uverbs: Set error code when fail to consume all flow_spec items
IB/uverbs: Check reserved fields in create_flow
IB/uverbs: Check comp_mask in destroy_flow
IB/uverbs: Check reserved field in extended command header
IB/uverbs: New macro to set pointers to NULL if length is 0 in INIT_UDATA()
IB/core: const'ify inbuf in struct ib_udata
RDMA/iwcm: Don't touch cm_id after deref in rem_ref
RDMA/cxgb4: Make _c4iw_write_mem_dma() static
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on original work by Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a check on the output buffer with access_ok(VERIFY_WRITE, ...)
to ensure the whole buffer is in userspace memory before using the
pointer in uverbs functions. If the buffer or a subset of it is not
valid, returns -EFAULT to the caller.
This will also catch invalid buffer before the final call to
copy_to_user() which happen late in most uverb functions.
Just like the check in read(2) syscall, it's a sanity check to detect
invalid parameters provided by userspace. This particular check was added
in vfs_read() by Linus Torvalds for v2.6.12 with following commit message:
https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=fd770e66c9a65b14ce114e171266cf6f393df502
Make read/write always do the full "access_ok()" tests.
The actual user copy will do them too, but only for the
range that ends up being actually copied. That hides
bugs when the range has been clamped by file size or other
issues.
Note: there's no need to check input buffer since vfs_write() already does
access_ok(VERIFY_READ, ...) as part of write() syscall.
Link: http://marc.info/?i=cover.1387273677.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Since ib_copy_from_udata() doesn't check yet the available input data
length before accessing userspace memory, an explicit check of this
length is required to prevent:
- reading past the user provided buffer,
- underflow when subtracting the expected command size from the input
length.
This will ensure the newly added flow steering uverbs don't try to
process truncated commands.
Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
If the flow_spec items parsed count does not match the number of items
declared in the flow_attr command, or if not all bytes are used for
flow_spec items (eg. trailing garbage), a log message is reported and
the function leave through the error path. Unfortunately the error
code is currently not set.
This patch set error code to -EINVAL in such cases, so that the error
is reported to userspace instead of silently fail.
Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
As noted by Daniel Vetter in its article "Botching up ioctls"[1]
"Check *all* unused fields and flags and all the padding for whether
it's 0, and reject the ioctl if that's not the case. Otherwise
your nice plan for future extensions is going right down the
gutters since someone *will* submit an ioctl struct with random
stack garbage in the yet unused parts. Which then bakes in the ABI
that those fields can never be used for anything else but garbage."
It's important to ensure that reserved fields are set to known value,
so that it will be possible to use them latter to extend the ABI.
The same reasonning apply to comp_mask field present in newer uverbs
command: per commit 22878dbc9173 ("IB/core: Better checking of
userspace values for receive flow steering"), unsupported values in
comp_mask are rejected.
[1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html
Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
As noted by Daniel Vetter in its article "Botching up ioctls"[1]
"Check *all* unused fields and flags and all the padding for whether
it's 0, and reject the ioctl if that's not the case. Otherwise
your nice plan for future extensions is going right down the
gutters since someone *will* submit an ioctl struct with random
stack garbage in the yet unused parts. Which then bakes in the ABI
that those fields can never be used for anything else but garbage."
It's important to ensure that reserved fields are set to known value,
so that it will be possible to use them latter to extend the ABI.
The same reasonning apply to comp_mask field present in newer uverbs
command: per commit 22878dbc9173 ("IB/core: Better checking of
userspace values for receive flow steering"), unsupported values in
comp_mask are rejected.
[1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html
Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Trying to have a ternary operator to choose between NULL (or 0) and the
real pointer value in invocations leads to an impossible choice between
a sparse error about a literal 0 used as a NULL pointer, and a gcc
warning about "pointer/integer type mismatch in conditional expression."
Rather than clutter the source with more casts, move the ternary
operator into a new INIT_UDATA_BUF_OR_NULL() macro, which makes it
easier to use and simplifies its callers.
Reported-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch moves INIT_WORK setup for cq_desc->cq_[rx,tx]_work into
isert_create_device_ib_res(), instead of being done each callback
invocation in isert_cq_[rx,tx]_callback().
This also fixes a 'INFO: trying to register non-static key' warning
when cancel_work_sync() is called before INIT_WORK has setup the
struct work_struct.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: <stable@vger.kernel.org> #3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>