82455 Commits

Author SHA1 Message Date
Doug Ledford
8779e7658d Merge branch 'hfi1-2' into k.o/for-4.7 2016-05-26 12:50:05 -04:00
Mike Marciniszyn
8b103e9cde IB/rdamvt: Fix rdmavt s_ack_queue sizing
rdmavt allows the driver to specify the size of the ack queue, but
only uses it for the modify QP limit testing for setting the atomic
limit value.

The driver dependent size is now used to size the s_ack_queue ring
dynamicially.

Since the driver knows its size, the driver will use its define
for any ring size dependent code.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 12:21:10 -04:00
Mike Marciniszyn
4c0b653335 IB/rdmavt: Max atomic value should be a u8
This matches the ib_qp_attr size and
avoids a extremely large value when the lower level
driver registers.

As part of the patch, the u8 ordinals are moved to the
end of the struct to reduce pahole noted excesses.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 12:21:10 -04:00
Doug Ledford
e6f61130ed Merge branches 'misc-4.7-2', 'ipoib' and 'ib-router' into k.o/for-4.7 2016-05-26 11:55:19 -04:00
Dennis Dalessandro
380fb94288 IB/hfi1: Remove write(), use ioctl() for user cmds
Remove the write() handler for user space commands now that ioctl
handling is available. User apps will need to change to use ioctl from
this point forward.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 11:35:13 -04:00
Dennis Dalessandro
8d970cf991 IB/hfi1: Add ioctl() interface for user commands
IOCTL is more suited to what user space commands need to do than the
write() interface. Add IOCTL definitions for all existing write commands
and the handling for those. The write() interface will be removed in a
follow on patch.

Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 11:35:06 -04:00
Dennis Dalessandro
ac56f162d4 IB/hfi1: Remove unused user command
The HFI1_CMD_SDMA_STATUS_UPD command was never implemented it has no
reason to live in the driver. Remove it.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 11:23:18 -04:00
Dennis Dalessandro
d079031742 IB/hfi1: Remove EPROM functionality from data device
Remove EPROM handling from the cdev which is used for user application
data traffic.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 11:23:18 -04:00
Dennis Dalessandro
0eb626590d IB/hfi1: Remove multiple device cdev
hfi1 current exports a cdev that can be used to target all of the hfi's
in the system. However there is a problem with this approach in
that the devices could be on different subnets. This is a problem that
user space can figure out and explicitly tell the driver on which device
to create a context.

Remove the multi-purpose cdev leaving a dedicated cdev for each port.
Also remove the striping capability that is dependent upon the user
choosing the multi-purpose cdev. It is now up to user space to determine
how to stripe contexts.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-26 11:23:17 -04:00
Erez Shitrit
628e6f7515 IB/SA Agent: Add support for SA agent get ClassPortInfo
New SA query function to return the ClassPortInfo struct from the SA.
If the SM supports FullMemberSendOnly mode for MCG's, it sets a
capability bit in the capability_mask2 field of the response.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-25 15:39:02 -04:00
Erez Shitrit
507f6afa3a IB/core: Introduce capabilitymask2 field in ClassPortInfo mad
Change struct ib_class_port_info to conform to IB Spec 1.3
That in order to get specific capability mask from ClassPortInfo mad.

>From the IB Spec, ClassPortInfo section:
        "CapabilityMask2 Bits 0-26: Additional class-specific capabilities...
         RespTimeValue the rest 5 bits"

The new struct now has one field for capabilitymask2 (previously was the
reserved field) and the resp_time field.

And it fixes up qib and srpt, use of the field repurposed to be used as
capabilitymask2:
IB/qib: Change pma_get_classportinfo
IB/srpt: Adjust the use of ib_class_port_info

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-25 15:39:02 -04:00
Mark Bloch
c34d376187 IB/netlink: Add a new local service operation
This commits adds a new RDMA local service operation:
- IP to GID resolution.

The client request would include the ifindex of the outgoing interface
and would place in an attribute (LS_NLA_TYPE_IPV4 or LS_NLA_TYPE_IPV6)
the destnation IP.

The local service would answer with a message that has the attribute:
- LS_NLA_TYPE_DGID - The destination GID.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-24 14:42:48 -04:00
Matan Barak
94c6825e0f net/mlx5_core: Use tasklet for user-space CQ completion events
Previously, we've fired all our completion callbacks straight from
our ISR.

Some of those callbacks were lightweight (for example, mlx5 Ethernet
napi callbacks), but some of them did more work (for example,
the user-space RDMA stack uverbs' completion handler). Besides that,
doing more than the minimal work in ISR is generally considered wrong,
it could even lead to a hard lockup of the system. Since when a lot
of completion events are generated by the hardware, the loop over
those events could be so long, that we'll get into a hard lockup by
the system watchdog.

In order to avoid that, add a new way of invoking completion events
callbacks. In the interrupt itself, we add the CQs which receive
completion event to a per-EQ list and schedule a tasklet. In the
tasklet context we loop over all the CQs in the list and invoke the
user callback.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-18 10:45:49 -04:00
Doug Ledford
0651ec932a Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into k.o/for-4.7 2016-05-13 19:40:38 -04:00
Majd Dibbiny
b531b90948 IB/core: Add Scatter FCS create flag
Raw Packet QPs that were created with Scatter FCS flag, will scatter
the FCS into the receive buffers.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:28 -04:00
Majd Dibbiny
45686f2d65 IB/core: Add Raw Scatter FCS device capability
Raw Scatter FCS device capability is set when the NIC supports
scattering the FCS to the receive buffers of Raw Packet QPs.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:27 -04:00
Majd Dibbiny
0b24e5ac93 IB/core: Add extended device capability flags
Since all the uverbs device_cap_flags are occupied, we need a place to
expose more device capabilities.

This patch adds a new 64 bit device_cap_flags_ex to expose new
device capabilities.

The lower 32 bits will be identical to the original device_cap_flags,
The upper 32 bits will be new capabilities.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:27 -04:00
Jianxin Xiong
bf77cc34f3 ib_pack.h: Add opcode definition for send with invalidate
The opcode for "SEND Last with Invalidate" and "SEND Only with
Invalidate" have been defined for RC in IBA Specification Vol 1
since Release 1.2. Add the definition to the header file in
preparation of supporting these opcodes in rdmavt based drivers.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:39:19 -04:00
Bart Van Assche
9aa8b3217e IB/core: Enhance ib_map_mr_sg()
The SRP initiator allows to set max_sectors to a value that exceeds
the largest amount of data that can be mapped at once with an mlx4
HCA using fast registration and a page size of 4 KB. Hence modify
ib_map_mr_sg() such that it can map partial sg-elements. If an
sg-element has been mapped partially, let the caller know
which fraction has been mapped by adjusting *sg_offset.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:57 -04:00
Christoph Hellwig
0e353e34e1 IB/core: add RW API support for signature MRs
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:20 -04:00
Christoph Hellwig
e64aa657c3 target: enhance and export target_alloc_sgl/target_free_sgl
The SRP target driver will need to allocate and chain it's own SGLs soon.
For this export target_alloc_sgl, and add a new argument to it so that it
can allocate an additional chain entry that doesn't point to a page.  Also
export transport_free_sgl after renaming it to target_free_sgl to free
these SGLs again.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:19 -04:00
Christoph Hellwig
a060b5629a IB/core: generic RDMA READ/WRITE API
This supports both manual mapping of lots of SGEs, as well as using MRs
from the QP's MR pool, for iWarp or other cases where it's more optimal.
For now, MRs are only used for iWARP transports.  The user of the RDMA-RW
API must allocate the QP MR pool as well as size the SQ accordingly.

Thanks to Steve Wise for testing, fixing and rewriting the iWarp support,
and to Sagi Grimberg for ideas, reviews and fixes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:19 -04:00
Steve Wise
d4a85c309b IB/core: add a need_inval flag to struct ib_mr
This is the first step toward moving MR invalidation decisions
to the core.  It will be needed by the upcoming RW API.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:19 -04:00
Christoph Hellwig
fffb0383cf IB/core: add a simple MR pool
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:18 -04:00
Christoph Hellwig
002516edf5 IB/core: add a helper to check for READ WITH INVALIDATE support
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:18 -04:00
Christoph Hellwig
ff2ba99365 IB/core: Add passing an offset into the SG to ib_map_mr_sg
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:11 -04:00
Saeed Mahameed
80835cba4b net/mlx5: Update mlx5_ifc hardware features
Adding the needed mlx5_ifc hardware bits and structs
for the following features:

* Add vport to steering commands for SRIOV ACL support
* Add mlcr, pcmr and mcia registers for dump module EEPROM
* Add support for FCS, beacon led and disable_link bits to
  hca caps
* Add CQE period mode bit in CQ context for CQE based CQ
  moderation support
* Add umr SQ bit for fragmented memory registration
* Add needed bits and caps for Striding RQ support

In-order to avoid possible future conflicts between rdma and
net-next we added all expected updates to this file for this release.
If more changes will be submitted, we plan to do it only through
one of the subsystems, probably net-next.

All updated bits in this patch will be later used in
the up-coming submissions to net-next and rdma trees.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-12 14:20:47 -04:00
Tariq Toukan
c16aea129b net/mlx5: Fix mlx5 ifc cmd_hca_cap bad offsets
All reserved fields after early_vf_enable are off by 1, since
early_vf_enable was not explicitly declared as array of size 1.

Reserved field before cqe_zip had a wrong size, it should
be 0x80 + 0x3f.

Fixes: b0844444590e ("net/mlx5_core: Introduce access function to read internal timer ")
Fixes: b4ff3a36d3e4 ("net/mlx5: Use offset based reserved field names in the IFC header file")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-12 14:20:46 -04:00
Jubin John
ea0e4ce3bc IB/rdmavt,hfi1,qib: Fix memory leak
rdi->ports has memory allocated in rvt_alloc_device(), but does not get
freed because the hfi1 and qib drivers drivers call ib_dealloc_device()
directly instead of going through rdmavt. Add a rvt_dealloc_device()
that frees rdi->ports and then calls ib_dealloc_device(). Switch hfi1
and qib drivers to calling rvt_dealloc_device() instead of
ib_dealloc_device() directly.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Jubin John <jubin.john@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28 16:32:27 -04:00
Mike Marciniszyn
f39cc34df7 IB/rdmavt: Fix adaptive pio hang
The RVT_S_WAIT_PIO_DRAIN flag was missing from
the set of flags indicating a qp is waiting
on a resource.

This caused the sleep/wakeup for adaptive pio
drain to lose a wakeup "hanging" a QP.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28 16:32:26 -04:00
Doug Ledford
e29bff46b9 Merge branch 'k.o/for-4.6-rc' into testing/4.6 2016-04-28 15:16:32 -04:00
Jason Gunthorpe
e6bd18f57a IB/security: Restrict use of the write() interface
The drivers/infiniband stack uses write() as a replacement for
bi-directional ioctl().  This is not safe. There are ways to
trigger write calls that result in the return structure that
is normally written to user space being shunted off to user
specified kernel memory instead.

For the immediate repair, detect and deny suspicious accesses to
the write API.

For long term, update the user space libraries and the kernel API
to something that doesn't present the same security vulnerabilities
(likely a structured ioctl() interface).

The impacted uAPI interfaces are generally only available if
hardware from drivers/infiniband is installed in the system.

Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
[ Expanded check to all known write() entry points ]
Cc: stable@vger.kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28 12:03:16 -04:00
Sagi Grimberg
986ef95ecd IB/mlx5: Expose correct max_sge_rd limit
mlx5 devices (Connect-IB, ConnectX-4, ConnectX-4-LX) has a limitation
where rdma read work queue entries cannot exceed 512 bytes.
A rdma_read wqe needs to fit in 512 bytes:
- wqe control segment (16 bytes)
- rdma segment (16 bytes)
- scatter elements (16 bytes each)

So max_sge_rd should be: (512 - 16 - 16) / 16 = 30.

Cc: linux-stable@vger.kernel.org
Reported-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagig@grimberg.me>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28 10:49:17 -04:00
Linus Torvalds
763cfc86ee Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "Two patches to fix a deadlock which can be easily triggered if memcg
  charge moving is used.

  This bug was introduced while converting threadgroup locking to a
  global percpu_rwsem and is caused by cgroup controller task migration
  path depending on the ability to create new kthreads.  cpuset had a
  similar issue which was fixed by performing heavy-lifting operations
  asynchronous to task migration.  The two patches fix the same issue in
  memcg in a similar way.  The first patch makes the mechanism generic
  and the second relocates memcg charge moving outside the migration
  path.

  Given that we don't want to perform heavy operations while
  writelocking threadgroup lock anyway, moving them out of the way is a
  desirable solution.  One thing to note is that the problem was
  difficult to debug because lockdep couldn't figure out the deadlock
  condition.  Looking into how to improve that"

* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  memcg: relocate charge moving from ->attach to ->post_attach
  cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback
2016-04-27 11:41:14 -07:00
Linus Torvalds
f28f20da70 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Handle v4/v6 mixed sockets properly in soreuseport, from Craig
    Gallak.

 2) Bug fixes for the new macsec facility (missing kmalloc NULL checks,
    missing locking around netdev list traversal, etc.) from Sabrina
    Dubroca.

 3) Fix handling of host routes on ifdown in ipv6, from David Ahern.

 4) Fix double-fdput in bpf verifier.  From Jann Horn.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
  bpf: fix double-fdput in replace_map_fd_with_map_ptr()
  net: ipv6: Delete host routes on an ifdown
  Revert "ipv6: Revert optional address flusing on ifdown."
  net/mlx4_en: fix spurious timestamping callbacks
  net: dummy: remove note about being Y by default
  cxgbi: fix uninitialized flowi6
  ipv6: Revert optional address flusing on ifdown.
  ipv4/fib: don't warn when primary address is missing if in_dev is dead
  net/mlx5: Add pci shutdown callback
  net/mlx5_core: Remove static from local variable
  net/mlx5e: Use vport MTU rather than physical port MTU
  net/mlx5e: Fix minimum MTU
  net/mlx5e: Device's mtu field is u16 and not int
  net/mlx5_core: Add ConnectX-5 to list of supported devices
  net/mlx5e: Fix MLX5E_100BASE_T define
  net/mlx5_core: Fix soft lockup in steering error flow
  qlcnic: Update version to 5.3.64
  net: stmmac: socfpga: Remove re-registration of reset controller
  macsec: fix netlink attribute validation
  macsec: add missing macsec prefix in uapi
  ...
2016-04-26 16:25:51 -07:00
Linus Torvalds
8ead9dd547 devpts: more pty driver interface cleanups
This is more prep-work for the upcoming pty changes.  Still just code
cleanup with no actual semantic changes.

This removes a bunch pointless complexity by just having the slave pty
side remember the dentry associated with the devpts slave rather than
the inode.  That allows us to remove all the "look up the dentry" code
for when we want to remove it again.

Together with moving the tty pointer from "inode->i_private" to
"dentry->d_fsdata" and getting rid of pointless inode locking, this
removes about 30 lines of code.  Not only is the end result smaller,
it's simpler and easier to understand.

The old code, for example, depended on the d_find_alias() to not just
find the dentry, but also to check that it is still hashed, which in
turn validated the tty pointer in the inode.

That is a _very_ roundabout way to say "invalidate the cached tty
pointer when the dentry is removed".

The new code just does

	dentry->d_fsdata = NULL;

in devpts_pty_kill() instead, invalidating the tty pointer rather more
directly and obviously.  Don't do something complex and subtle when the
obvious straightforward approach will do.

The rest of the patch (ie apart from code deletion and the above tty
pointer clearing) is just switching the calling convention to pass the
dentry or file pointer around instead of the inode.

Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Jann Horn <jann@thejh.net>
Cc: Greg KH <greg@kroah.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Florian Weimer <fw@deneb.enyo.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-26 15:47:32 -07:00
David S. Miller
6a923934c3 Revert "ipv6: Revert optional address flusing on ifdown."
This reverts commit 841645b5f2dfceac69b78fcd0c9050868d41ea61.

Ok, this puts the feature back.  I've decided to apply David A.'s
bug fix and run with that rather than make everyone wait another
whole release for this feature.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 11:47:41 -04:00
Tejun Heo
5cf1cacb49 cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys->post_attach callback
Since e93ad19d0564 ("cpuset: make mm migration asynchronous"), cpuset
kicks off asynchronous NUMA node migration if necessary during task
migration and flushes it from cpuset_post_attach_flush() which is
called at the end of __cgroup_procs_write().  This is to avoid
performing migration with cgroup_threadgroup_rwsem write-locked which
can lead to deadlock through dependency on kworker creation.

memcg has a similar issue with charge moving, so let's convert it to
an official callback rather than the current one-off cpuset specific
function.  This patch adds cgroup_subsys->post_attach callback and
makes cpuset register cpuset_post_attach_flush() as its ->post_attach.

The conversion is mostly one-to-one except that the new callback is
called under cgroup_mutex.  This is to guarantee that no other
migration operations are started before ->post_attach callbacks are
finished.  cgroup_mutex is one of the outermost mutex in the system
and has never been and shouldn't be a problem.  We can add specialized
synchronization around __cgroup_procs_write() but I don't think
there's any noticeable benefit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org> # 4.4+ prerequisite for the next patch
2016-04-25 15:45:14 -04:00
David S. Miller
841645b5f2 ipv6: Revert optional address flusing on ifdown.
This reverts the following three commits:

70af921db6f8835f4b11c65731116560adb00c14
799977d9aafbf0ca0b9c39b04cbfb16db71302c9
f1705ec197e705b79ea40fe7a2cc5acfa1d3bfac

The feature was ill conceived, has terrible semantics, and has added
nothing but regressions to the already fragile ipv6 stack.

Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-25 15:33:55 -04:00
Majd Dibbiny
5fc7197d3a net/mlx5: Add pci shutdown callback
This patch introduces kexec support for mlx5.
When switching kernels, kexec() calls shutdown, which unloads
the driver and cleans its resources.

In addition, remove unregister netdev from shutdown flow. This will
allow a clean shutdown, even if some netdev clients did not release their
reference from this netdev. Releasing The HW resources only is enough as
the kernel is shutting down

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:39 -04:00
Saeed Mahameed
cd255efff9 net/mlx5e: Use vport MTU rather than physical port MTU
Set and report vport MTU rather than physical MTU,
Driver will set both vport and physical port mtu and will
rely on the query of vport mtu.

SRIOV VFs have to report their MTU to their vport manager (PF),
and this will allow them to work with any MTU they need
without failing the request.

Also for some cases where the PF is not a port owner, PF can
work with MTU less than the physical port mtu if set physical
port mtu didn't take effect.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:39 -04:00
Saeed Mahameed
046339eaab net/mlx5e: Device's mtu field is u16 and not int
For set/query MTU port firmware commands the MTU field
is 16 bits, here I changed all the "int mtu" parameters
of the functions wrapping those firmware commands to be u16.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:38 -04:00
Sabrina Dubroca
748164802c macsec: add missing macsec prefix in uapi
I accidentally forgot some MACSEC_ prefixes in if_macsec.h.

Fixes: dece8d2b78d1 ("uapi: add MACsec bits")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:31:59 -04:00
Elad Raz
7ceb2afbd6 switchdev: Adding complete operation to deferred switchdev ops
When using switchdev deferred operation (SWITCHDEV_F_DEFER), the operation
is executed in different context and the application doesn't have any way
to get the operation real status.

Adding a completion callback fixes that. This patch adds fields to
switchdev_attr and switchdev_obj "complete_priv" field which is used by
the "complete" callback.

Application can set a complete function which will be called once the
operation executed.

Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:23:32 -04:00
Linus Torvalds
913f201083 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal fixes from Eduardo Valentin:
 "Specifics in this pull request:

   - Fixes in mediatek and OF thermal drivers

   - Fixes in power_allocator governor

   - More fixes of unsigned to int type change in thermal_core.c.

  These change have been CI tested using KernelCI bot. \o/"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: fix Mediatek thermal controller build
  thermal: consistently use int for trip temp
  thermal: fix mtk_thermal build dependency
  thermal: minor mtk_thermal.c cleanups
  thermal: power_allocator: req_range multiplication should be a 64 bit type
  thermal: of: add __init attribute
2016-04-23 17:15:39 -07:00
Linus Torvalds
4dfa5739d9 asm-generic changes for 4.6-rc
Here is one patch to wire up the preadv/pwritev system calls in the
 generic system call table, which is required for all architectures
 that were merged in the last few years, including arm64.
 
 Usually these get merged along with the syscall implementation
 or one of the architecture trees, but this time that did not
 happen.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVxvoK2CrR//JCVInAQJ6ixAApP2U+bTRTi8CCn5fOkHvx5o7iev1T8bQ
 qgQdTeIBOnUGm7K2zpbzQrq9iotwkdeGTzmGOkXd7+319LI7oRed1yURgFSaUzdl
 GSgkI8mfFN/Jf+cwUKLtYrBiMhThDOlCJqKmCu1xQ9DggHbe16x5QUtapxKJLE+r
 8I7/BVopBXl9rNh6NJ9kv3TXhTJN5YTsF4E0eiPb8gOsx+RkfFtGAsrjFm8eQBY6
 vnbF5bnKT3tH80NWC3PYHhoTJKowNgG1hna2/neuKodkaDhrnnnIHQYwCfKwpGFz
 BnPEimGsoEbqmXJtrBKQa4QsMROO0V/0q+bKxrNDeiqCZHiNZ2t0JOX5x/66Y1p+
 vmLkZf4B88VwfGtNFOOE+G4dziWPuCi0q9KqAyVkOrIpMs4jtZmH2Usp30RSrYjC
 q8Ue3CAEBKZHET+LLUTh1ZrbgAYg8d13UdQUoc3LjSnNM21QQKYxkWw4E48MwvU/
 hTQbJ12BnURNvipurgJYid9vxvqnR1DpuK/odGRdmkulLTXo+MNF0YNkg0wqoJzS
 HLsR9b5LWdj0On+iIoClPms05CxURSy3Mk8dH5LZvnrugrBBepPPnaJIOHEYuzOR
 3dI7LI28k9YsdjnDWWScncMfWU0Lc99HVQHNTpKK78ekST/klhtNh+hxLfJZjnqQ
 IqouDJsB2wc=
 =aE0K
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic update from Arnd Bergmann:
 "Here is one patch to wire up the preadv/pwritev system calls in the
  generic system call table, which is required for all architectures
  that were merged in the last few years, including arm64.

  Usually these get merged along with the syscall implementation or one
  of the architecture trees, but this time that did not happen.

  Andre and Christoph both sent a version of this patch, I picked the
  one I got first"

* tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  generic syscalls: wire up preadv2 and pwritev2 syscalls
2016-04-23 14:53:11 -07:00
Andre Przywara
987aedb5d6 generic syscalls: wire up preadv2 and pwritev2 syscalls
These new syscalls are implemented as generic code, so enable them for
architectures like arm64 which use the generic syscall table.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-04-23 22:38:08 +02:00
Linus Torvalds
0e11d25651 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "Misc fixes:

  pvqspinlocks:
   - an instrumentation fix

  futexes:
   - preempt-count vs pagefault_disable decouple corner case fix
   - futex requeue plist race window fix
   - futex UNLOCK_PI transaction fix for a corner case"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic()
  futex: Acknowledge a new waiter in counter before plist
  futex: Handle unlock_pi race gracefully
  locking/pvqspinlock: Fix division by zero in qstat_read()
2016-04-23 11:39:48 -07:00
Linus Torvalds
d61fb48b2f Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "i915, nouveau and amdgpu/radeon fixes in this:

  nouveau:
     Two fixes, one for a regression with dithering and one for a bug
     hit by the userspace drivers.

  i915:
     A few fixes, mostly things heading for stable, two important
     skylake GT3/4 hangs.

  radeon/amdgpu:
     Some audio, suspend/resume and some runtime PM fixes, along with
     two patches to harden the userptr ABI a bit"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (24 commits)
  drm: Loongson-3 doesn't fully support wc memory
  drm/nouveau/gr/gf100: select a stream master to fixup tfb offset queries
  amdgpu/uvd: add uvd fw version for amdgpu
  drm/amdgpu: forbid mapping of userptr bo through radeon device file
  drm/radeon: forbid mapping of userptr bo through radeon device file
  drm/amdgpu: bump the afmt limit for CZ, ST, Polaris
  drm/amdgpu: use defines for CRTCs and AMFT blocks
  drm/dp/mst: Validate port in drm_dp_payload_send_msg()
  drm/nouveau/kms: fix setting of default values for dithering properties
  drm/radeon: print a message if ATPX dGPU power control is missing
  Revert "drm/radeon: disable runtime pm on PX laptops without dGPU power control"
  drm/amdgpu/acp: fix resume on CZ systems with AZ audio
  drm/radeon: add a quirk for a XFX R9 270X
  drm/radeon: print pci revision as well as pci ids on driver load
  drm/i915: Use fw_domains_put_with_fifo() on HSW
  drm/i915: Force ringbuffers to not be at offset 0
  drm/i915: Adjust size of PIPE_CONTROL used for gen8 render seqno write
  drm/i915/skl: Fix spurious gpu hang with gt3/gt4 revs
  drm/i915/skl: Fix rc6 based gpu/system hang
  drm/i915/userptr: Hold mmref whilst calling get-user-pages
  ...
2016-04-22 10:29:52 -07:00
Linus Torvalds
d4b0528827 sound fixes for 4.6-rc5
Again a relatively calm week without surprise: most of fixes are about
 HD-audio, including fixes for Cirrus codec regression and a race over
 regmap access.  Although both change are slightly unintuitive, the
 risk of further breakage is quite low, I hope.
 
 Other than that, all the rest are trivial.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXGeZhAAoJEGwxgFQ9KSmkOg4P/RWGLla59n6L6QCE2kNp3rQS
 fPjVG2ctEHAqVi6lqLSletV2HkA1K6MY/tyawjtjG2Fs6JqbITasOWJmI10aDzDl
 tmLWBlmJf4zRye2YbFmP4v5dE9VuRhU9ucMc4IqBT9+DgTEVNuCcQiW9amyj4kJ7
 XcjT6z0tj/5MXe+hmjvHuEwc6qYk/4Z7HJDcKlNxsD7dQ9zoLhXmbEndw5600Os5
 cMFzZzRPDofyRbvhooQrn1oJG143bilnseMV83O7WTwzJ+DSiZlJG8ibY7bPYsPL
 qicr4keet5n96gZHbPXeZHvxlCzW4dStna1qPYOWTg8xy1kHGA7tAExznPG23f2h
 ANTQ0lAN+A0I3PjB0S+MP6I2Uw+bkkKpo0w6ugSz8nmSfO1k9mkavK51RWAuQTef
 ITdORv/HTkCHp7jyKBx2ArM+s3mckrNA4kNFOAHYpdXln3vQDPc0BnQkfq1LW6vf
 RQVQSAoSBqZahefgfL2nuhHWa/QqZ+9XTPmxp1dA2YLmYLKtkGOZYuz0XkDYqSaC
 JdnL+3Q7OmTbVvgPPOn595eq7R8VrH2vbQH1pJC+zRL+I9dVmZ7Et7KtBYCr0pON
 G4S+ZieILnMTYdYsEK4jQkC0ZiWwpr+y2Lurz97qVm8yIbKSBEpxakxpVHNP5dDe
 I/6ZrY3i3K4YqYbxPwEF
 =wz0C
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Again a relatively calm week without surprise: most of fixes are about
  HD-audio, including fixes for Cirrus codec regression and a race over
  regmap access.  Although both change are slightly unintuitive, the
  risk of further breakage is quite low, I hope.

  Other than that, all the rest are trivial"

* tag 'sound-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix possible race on regmap bypass flip
  ALSA: pcxhr: Fix missing mutex unlock
  ALSA: hda - add PCI ID for Intel Broxton-T
  ALSA: hda - Keep powering up ADCs on Cirrus codecs
  ALSA: hda/realtek - Add ALC3234 headset mode for Optiplex 9020m
  ALSA - hda: hdmi check NULL pointer in hdmi_set_chmap
  ALSA: hda - Don't trust the reported actual power state
2016-04-22 10:17:18 -07:00