51319 Commits

Author SHA1 Message Date
Ingo Molnar
44530d588e Revert "perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86"
This reverts commit 2c95afc1e83d93fac3be6923465e1753c2c53b0a.

Stephane reported the following regression:

 > Since Andi added:
 >
 > commit 2c95afc1e83d93fac3be6923465e1753c2c53b0a
 > Author: Andi Kleen <ak@linux.intel.com>
 > Date:   Thu Jun 9 06:14:38 2016 -0700
 >
 >    perf/x86/intel, watchdog: Switch NMI watchdog to ref cycles on x86
 >
 > $ perf stat -e ref-cycles ls
 >   <not counted> ....
 >
 > fails systematically because the ref-cycles is now used by the
 > watchdog and given this is a system-wide pinned event, it monopolizes
 > the fixed counter 2 which is the only counter able to measure this event.

Since the next merge window is near, fix the regression for now
by reverting the commit.

Reported-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-10 20:58:36 +02:00
Lukas Wunner
abb2bafd29 x86/quirks: Add early quirk to reset Apple AirPort card
The EFI firmware on Macs contains a full-fledged network stack for
downloading OS X images from osrecovery.apple.com. Unfortunately
on Macs introduced 2011 and 2012, EFI brings up the Broadcom 4331
wireless card on every boot and leaves it enabled even after
ExitBootServices has been called. The card continues to assert its IRQ
line, causing spurious interrupts if the IRQ is shared. It also corrupts
memory by DMAing received packets, allowing for remote code execution
over the air. This only stops when a driver is loaded for the wireless
card, which may be never if the driver is not installed or blacklisted.

The issue seems to be constrained to the Broadcom 4331. Chris Milsted
has verified that the newer Broadcom 4360 built into the MacBookPro11,3
(2013/2014) does not exhibit this behaviour. The chances that Apple will
ever supply a firmware fix for the older machines appear to be zero.

The solution is to reset the card on boot by writing to a reset bit in
its mmio space. This must be done as an early quirk and not as a plain
vanilla PCI quirk to successfully combat memory corruption by DMAed
packets: Matthew Garrett found out in 2012 that the packets are written
to EfiBootServicesData memory (http://mjg59.dreamwidth.org/11235.html).
This type of memory is made available to the page allocator by
efi_free_boot_services(). Plain vanilla PCI quirks run much later, in
subsys initcall level. In-between a time window would be open for memory
corruption. Random crashes occurring in this time window and attributed
to DMAed packets have indeed been observed in the wild by Chris
Bainbridge.

When Matthew Garrett analyzed the memory corruption issue in 2012, he
sought to fix it with a grub quirk which transitions the card to D3hot:
http://git.savannah.gnu.org/cgit/grub.git/commit/?id=9d34bb85da56

This approach does not help users with other bootloaders and while it
may prevent DMAed packets, it does not cure the spurious interrupts
emanating from the card. Unfortunately the card's mmio space is
inaccessible in D3hot, so to reset it, we have to undo the effect of
Matthew's grub patch and transition the card back to D0.

Note that the quirk takes a few shortcuts to reduce the amount of code:
The size of BAR 0 and the location of the PM capability is identical
on all affected machines and therefore hardcoded. Only the address of
BAR 0 differs between models. Also, it is assumed that the BCMA core
currently mapped is the 802.11 core. The EFI driver seems to always take
care of this.

Michael Büsch, Bjorn Helgaas and Matt Fleming contributed feedback
towards finding the best solution to this problem.

The following should be a comprehensive list of affected models:
    iMac13,1        2012  21.5"       [Root Port 00:1c.3 = 8086:1e16]
    iMac13,2        2012  27"         [Root Port 00:1c.3 = 8086:1e16]
    Macmini5,1      2011  i5 2.3 GHz  [Root Port 00:1c.1 = 8086:1c12]
    Macmini5,2      2011  i5 2.5 GHz  [Root Port 00:1c.1 = 8086:1c12]
    Macmini5,3      2011  i7 2.0 GHz  [Root Port 00:1c.1 = 8086:1c12]
    Macmini6,1      2012  i5 2.5 GHz  [Root Port 00:1c.1 = 8086:1e12]
    Macmini6,2      2012  i7 2.3 GHz  [Root Port 00:1c.1 = 8086:1e12]
    MacBookPro8,1   2011  13"         [Root Port 00:1c.1 = 8086:1c12]
    MacBookPro8,2   2011  15"         [Root Port 00:1c.1 = 8086:1c12]
    MacBookPro8,3   2011  17"         [Root Port 00:1c.1 = 8086:1c12]
    MacBookPro9,1   2012  15"         [Root Port 00:1c.1 = 8086:1e12]
    MacBookPro9,2   2012  13"         [Root Port 00:1c.1 = 8086:1e12]
    MacBookPro10,1  2012  15"         [Root Port 00:1c.1 = 8086:1e12]
    MacBookPro10,2  2012  13"         [Root Port 00:1c.1 = 8086:1e12]

For posterity, spurious interrupts caused by the Broadcom 4331 wireless
card resulted in splats like this (stacktrace omitted):

    irq 17: nobody cared (try booting with the "irqpoll" option)
    handlers:
    [<ffffffff81374370>] pcie_isr
    [<ffffffffc0704550>] sdhci_irq [sdhci] threaded [<ffffffffc07013c0>] sdhci_thread_irq [sdhci]
    [<ffffffffc0a0b960>] azx_interrupt [snd_hda_codec]
    Disabling IRQ #17

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79301
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111781
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=728916
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=895951#c16
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1009819
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1098621
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1149632#c5
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1279130
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1332732
Tested-by: Konstantin Simanov <k.simanov@stlk.ru>        # [MacBookPro8,1]
Tested-by: Lukas Wunner <lukas@wunner.de>                # [MacBookPro9,1]
Tested-by: Bryan Paradis <bryan.paradis@gmail.com>       # [MacBookPro9,2]
Tested-by: Andrew Worsley <amworsley@gmail.com>          # [MacBookPro10,1]
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> # [MacBookPro10,2]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chris Milsted <cmilsted@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Michael Buesch <m@bues.ch>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: b43-dev@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: stable@vger.kernel.org # 123456789abc: x86/quirks: Apply nvidia_bugs quirk only on root bus
Cc: stable@vger.kernel.org # 123456789abc: x86/quirks: Reintroduce scanning of secondary buses
Link: http://lkml.kernel.org/r/48d0972ac82a53d460e5fce77a07b2560db95203.1465690253.git.lukas@wunner.de
[ Did minor readability edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-10 20:13:53 +02:00
Paolo Bonzini
2e9d1e150a x86/entry: Avoid interrupt flag save and restore
Thanks to all the work that was done by Andy Lutomirski and others,
enter_from_user_mode() and prepare_exit_to_usermode() are now called only with
interrupts disabled.  Let's provide them a version of user_enter()/user_exit()
that skips saving and restoring the interrupt flag.

On an AMD-based machine I tested this patch on, with force-enabled
context tracking, the speed-up in system calls was 90 clock cycles or 6%,
measured with the following simple benchmark:

    #include <sys/signal.h>
    #include <time.h>
    #include <unistd.h>
    #include <stdio.h>

    unsigned long rdtsc()
    {
        unsigned long result;
        asm volatile("rdtsc; shl $32, %%rdx; mov %%eax, %%eax\n"
                     "or %%rdx, %%rax" : "=a" (result) : : "rdx");
        return result;
    }

    int main()
    {
        unsigned long tsc1, tsc2;
        int pid = getpid();
        int i;

        tsc1 = rdtsc();
        for (i = 0; i < 100000000; i++)
            kill(pid, SIGWINCH);
        tsc2 = rdtsc();

        printf("%ld\n", tsc2 - tsc1);
    }

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/1466434712-31440-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-10 13:33:02 +02:00
Ingo Molnar
52e31f89cc Merge branch 'linus' into x86/asm, to pick up fixes before merging new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-09 10:43:49 +02:00
David S. Miller
cc3baecb21 RxRPC rewrite
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAV3zeiPSw1s6N8H32AQI9Iw//RxwhAzb+ovNIizPqcFRt4BIS3Or6unov
 Bx9jtq+U1POWZWtlIHYZKDH8ndxMnC56NkEHmpIK3uEiqJ6BoQPpDdfN+hhREysV
 Hoa+JOkMgufPBanU/JyKY/4vsmlvLuoOdppN1OA/kx1KECux9xJWIrvFsCUQGeat
 nDsdWChHkZAm/GDPZiFvxEBVaxDe2dmnDMBFTst1RsrH2uICSqM4k5srmjc3NPAY
 bsTqZeQGTIK1V9MggwBHxHFMvvERlGDpcrpoMRjeTzMmCpCg5endJoSu3hdNjHUO
 o5Fi50dhLI5jo84DQiXL0wM4SLND0QQygl+QeU3zlJYtsQsF6WxPnIEGqlGr3+WV
 I4wjDc5lxECyQIjCsrCo5ZwJ47Kqmm/ZQ4uGd9JooAVhqhP7/2dhFH0zXywJZzKs
 zo+dWTF5Xvde+mlknm1RCTgkdx3msPH9EVkEoO4FOPOAg6lhIMQFFXLXZfGr9oX6
 V99t+8t8YhDPTbL4AQnzh/aMHtbpM6be4TYiRRjT6iZLuvWOPW/zpp/1hmyeEkbU
 KZNDunC2tH030Fx5toGi3b2i8M5SJdyex9Udg/YsNexpWmyHMS49PoGk9ZnRRPA+
 xn9+xIVsqTh+xbiyCOPJqlQMMK9ayF7isT2N8T19qoVJxurdE/tMBdtKrJ5uTJFT
 W0n8KV46a+4=
 =1ZSd
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-rewrite-20160706' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Improve conn/call lookup and fix call number generation [ver #3]

I've fixed a couple of patch descriptions and excised the patch that
duplicated the connections list for reconsideration at a later date.

For reference, the excised patch is sitting on the rxrpc-experimental
branch of my git tree, based on top of the rxrpc-rewrite branch.  Diffing
it against yesterday's tag shows no differences.

Would you prefer the patch set to be emailed afresh instead of a git-pull
request?

David
---
Here's the next part of the AF_RXRPC rewrite.  The two main purposes of
this set are to fix the call number handling and to make use of RCU when
looking up the connection or call to pass a received packet to.

Important changes in this set include:

 (1) Avoidance of placing stack data into SG lists in rxkad so that kernel
     stacks can become vmalloc'd (Herbert Xu).

 (2) Calls cease pinning the connection they used as soon as possible,
     which allows the connection to be discarded sooner and allows the call
     channel on that connection to be reused earlier.

 (3) Make each call channel on a connection have a separate and independent
     call number space rather than having a shared number space for the
     connection.  Call numbers should increment monotonically per channel
     on the client, and the server should ignore a call with a lower call
     number for that channel than the latest it has seen.  The RESPONSE
     packet sets the minimum values of each call ID counter on a
     connection.

 (4) Look up calls by indexing the channel array on a connection rather
     than by keeping calls in an rbtree on that connection.  Also look up
     calls using the channel array rather than using a hashtable.

     The call hashtable can then be removed.

 (5) Call terminal statuses are cached in the channel array for the last
     call.  It is assumed that if we the server have seen call N, then the
     client no longer cares about call N-1 on the same channel.

     This will allow retransmission of the terminal status in future
     without the need to keep the rxrpc_call struct around.

 (6) Peer lookups are moved out of common connection handling code and into
     service connection handling code as client connections (a) must point
     to a peer before they can be used and (b) are looked up by a
     machine-unique connection ID directly, so we only need to look up the
     peer first if we're going to deal with a service call.

 (7) The reference count on a connection is held elevated by 1 whilst it is
     alive (ie. idle unused connections have a refcount of 1).  The reaper
     will attempt to change the refcount from 1->0 and skip if this cannot
     be done, whilst look ups only increment the refcount if it's non-zero.

     This makes the implementation of RCU lookups easier as we don't have
     to get a ref on the connection or a lock on the connection list to
     prevent a connection being reaped whilst we're contemplating queueing
     a packet that initiates a new service call upon it.

     If we need to get a connection, but there's a dead connection in the
     tree, we use rb_replace_node() to replace the dead one with a new one.

 (8) Use a seqlock to validate the walk over the service connection rbtree
     attached to a peer when it's being walked in RCU mode.

 (9) Make the incoming call/connection packet handling code use RCU mode
     and locks and make it only take a reference if the call/connection
     gets queued on a workqueue.

The intention is that the next set will introduce the connection lifetime
management and capacity limits to prevent clients from overloading the
server.

There are some fixes too:

 (1) Verifying that a packet coming in to a client connection came from the
     expected source.

 (2) Fix handling of connection failure in client call creation where we
     don't reinitialise the list linkage block and a second attempt to
     unlink the failed connection oopses and also we don't set the state
     correctly, which causes an assertion failure.

 (3) New service calls were being added to the socket's accept queue under
     the wrong lock.

Changes:

 (V2) In rxrpc_find_service_conn_rcu() initialised the sequence number to 0.

      Fixed the RCU handling in conn_service.c by introducing and using
      rb_replace_node_rcu() as an RCU-safe alternative in
      rxrpc_publish_service_conn().

      Modified and used rcu_dereference_raw() to avoid RCU sparse warnings
      in rxrpc_find_service_conn_rcu().

      Added in some missing RCU dereference wrappers.  It seems to be
      necessary to turn on CONFIG_PROVE_RCU_REPEATEDLY as well as
      CONFIG_SPARSE_RCU_POINTER to get the static __rcu annotation checking
      to happen.

      Fixed some other sparse warnings, including a missing ntohs() in
      jumbo packet processing.

 (V3) Fixed some commit descriptions.

      Excised the patch that duplicated the connection list to separate out
      the procfs list for reconsideration at a later date.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-08 23:52:12 -04:00
Jens Axboe
41d512e51b Merge branch 'for-4.8/block' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm into for-4.8/drivers
Dan writes:

"The removal of ->driverfs_dev in favor of just passing the parent
device in as a parameter to add_disk().  See below, it has received a
"Reviewed-by" from Christoph, Bart, and Johannes.

It is also a pre-requisite for Fam Zheng's work to cleanup gendisk
uevents vs attribute visibility [1].  We would extend device_add_disk()
to take an attribute_group list.

This is based off a branch of block.git/for-4.8/drivers and has
received a positive build success notification from the kbuild robot
across several configs.

[1]: "gendisk: Generate uevent after attribute available"
http://marc.info/?l=linux-virtualization&m=146725201522201&w=2"
2016-07-08 16:04:11 -06:00
Hans Verkuil
9ad52b4db7 [media] cec-funcs.h: add missing 'reply' for short audio descriptor
The cec_msg_request_short_audio_descriptor function was missing the reply
argument. That's needed if you want the framework to wait for the reply
message.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-08 18:35:42 -03:00
Hans Verkuil
0c1f856994 [media] cec-funcs.h: add missing const modifier
The cec_ops_* functions never touch cec_msg, so mark it as const.

This was done for some of the cec_ops_ functions, but not all.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-08 18:35:17 -03:00
Hans Verkuil
0cb1f1e44c [media] cec-funcs.h: add length checks
Add msg->len sanity checks to fix static checker warning:

	include/linux/cec-funcs.h:1154 cec_ops_set_osd_string()
	warn: setting length 'msg->len - 3' to negative one

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-08 18:34:42 -03:00
Hans Verkuil
12655445a2 [media] cec.h/cec-funcs.h: add option to use BSD license
Like the videodev2.h and other headers, explicitly allow these headers
to be used with the BSD license.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2016-07-08 18:31:33 -03:00
Mauro Carvalho Chehab
cbb5c8355a Merge branch 'topic/cec' into patchwork
* topic/cec:
  [media] DocBook/media: add CEC documentation
  [media] s5p_cec: get rid of an unused var
  [media] move s5p-cec to staging
  [media] vivid: add CEC emulation
  [media] cec: s5p-cec: Add s5p-cec driver
  [media] cec: adv7511: add cec support
  [media] cec: adv7842: add cec support
  [media] cec: adv7604: add cec support
  [media] cec: add compat32 ioctl support
  [media] cec/TODO: add TODO file so we know why this is still in staging
  [media] cec: add HDMI CEC framework (api)
  [media] cec: add HDMI CEC framework (adapter)
  [media] cec: add HDMI CEC framework (core)
  [media] cec-funcs.h: static inlines to pack/unpack CEC messages
  [media] cec.h: add cec header
  [media] cec-edid: add module for EDID CEC helper functions
  [media] cec.txt: add CEC framework documentation
  [media] rc: Add HDMI CEC protocol handling
2016-07-08 18:16:10 -03:00
Rafael J. Wysocki
63f9ccb895 Merge back earlier suspend/hibernation changes for v4.8. 2016-07-08 23:14:17 +02:00
Mauro Carvalho Chehab
fb810cb5ed Linux 4.7-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXefulAAoJEHm+PkMAQRiG6nMH/2O1vcZeOtqmx2yCMUeXyKAT
 wG88XflXzf3rM7C7TiObEYVf/bbLleJ7saDLEeic7ButD5gyYacIuzylVnrcqfBc
 vinz4cOw5kvu9DrRkCKdOfiTAgwYtqQW+syJ8ZK4lPQuSxnPAs+F/FKSOpyUF5FN
 Dngr520KjYKBEtn27W9UDPChFRwQoWAlaOC534eusaArCJtHGHHiuq5TEDn2EIo8
 pUw2vwx5JiquSHOY34WLU7r+QoilovCQlUSsBQdLlPjfMB1QFtclPYa+5yEMjkT4
 wusOUOfS/zK0rV6KnEdc/SkpiVX5C9WpFiWUOdEeJ5mZ+KijVkaOa9K1EDx8jSM=
 =7Hwh
 -----END PGP SIGNATURE-----

Merge tag 'v4.7-rc6' into patchwork

Linux 4.7-rc6

* tag 'v4.7-rc6': (1245 commits)
  Linux 4.7-rc6
  ovl: warn instead of error if d_type is not supported
  MIPS: Fix possible corruption of cache mode by mprotect.
  locks: use file_inode()
  usb: dwc3: st: Use explicit reset_control_get_exclusive() API
  phy: phy-stih407-usb: Use explicit reset_control_get_exclusive() API
  phy: miphy28lp: Inform the reset framework that our reset line may be shared
  namespace: update event counter when umounting a deleted dentry
  9p: use file_dentry()
  lockd: unregister notifier blocks if the service fails to come up completely
  ACPI,PCI,IRQ: correct operator precedence
  fuse: serialize dirops by default
  drm/i915: Fix missing unlock on error in i915_ppgtt_info()
  powerpc: Initialise pci_io_base as early as possible
  mfd: da9053: Fix compiler warning message for uninitialised variable
  mfd: max77620: Fix FPS switch statements
  phy: phy-stih407-usb: Inform the reset framework that our reset line may be shared
  usb: dwc3: st: Inform the reset framework that our reset line may be shared
  usb: host: ehci-st: Inform the reset framework that our reset line may be shared
  usb: host: ohci-st: Inform the reset framework that our reset line may be shared
  ...
2016-07-08 18:14:03 -03:00
Octavian Purdila
68bdb67732 ACPI: add support for ACPI reconfiguration notifiers
Add support for ACPI reconfiguration notifiers to allow subsystems
to react to changes in the ACPI tables that happen after the initial
enumeration. This is similar with the way dynamic device tree
notifications work.

The reconfigure notifications supported for now are device add and
device remove.

Since ACPICA allows only one table notification handler, this patch
makes the table notifier function generic and moves it out of the
sysfs specific code.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-08 21:52:35 +02:00
Octavian Purdila
10c7e20b2f ACPI / scan: fix enumeration (visited) flags for bus rescans
If the ACPI tables change as a result of a dinamically loaded table
and a bus rescan is required the enumeration/visited flag are not
consistent.

I2C/SPI are not directly enumerated in acpi_bus_attach(), however
the visited flag is set. This makes it impossible to check if an
ACPI device has already been enumerated by the I2C and SPI
subsystems. To fix this issue we only set the visited flags if
the device is not I2C or SPI.

With this change we also need to remove setting visited to false
from acpi_bus_attach(), otherwise if we rescan already enumerated
I2C/SPI devices we try to re-enumerate them.

Note that I2C/SPI devices can be enumerated either via a scan handler
(when using PRP0001) or via regular device_attach(). In either case
the flow goes through acpi_default_enumeration() which makes it the
ideal place to mark the ACPI device as enumerated.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-08 21:52:34 +02:00
Sagi Grimberg
931a6de4d7 nvme-rdma.h: Add includes for nvme rdma_cm negotiation
NVMe over Fabrics RDMA transport defines a connection establishment
protocol over the RDMA connection manager. This header will be used by
both the host and target drivers to negotiate the connection
establishment parameters.

Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-08 08:38:49 -06:00
Sagi Grimberg
486cf9899e blk-mq: Introduce blk_mq_reinit_tagset
The new nvme-rdma driver will need to reinitialize all the tags as part of
the error recovery procedure (realloc the tag memory region). Add a helper
in blk-mq for it that can iterate over all requests in a tagset to make
this easier.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Ming Lin <ming.l@ssi.samsung.com>
Reviewed-by: Stephen Bates <Stephen.Bates@pmcs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-08 08:38:49 -06:00
Martin Blumenstingl
3467f0d433 ath9k: Allow configuration of LED polarity in platform data.
Some devices running OpenWrt need this and it makes sense to add this
to ath9k_platform_data as the next patches will add a devicetree
(boolean) property for it as well.

Suggested-by: Vittorio Gambaletta <openwrt@vittgam.net>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
2016-07-08 17:01:14 +03:00
Ingo Molnar
7fb2b43c32 efi: Reorganize the GUID table to make it easier to read
Re-organize the GUID table so that every GUID takes a single line.

This makes each line super long, but if you have a large enough terminal
(or zoom out of a small terminal) then you can see the structure at
a glance - which is more readable than it was the case with the
multi-line layout.

Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Joe Perches <joe@perches.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20160627104920.GA9099@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-08 15:21:37 +02:00
Dmitry Safonov
b059a453b1 x86/vdso: Add mremap hook to vm_special_mapping
Add possibility for 32-bit user-space applications to move
the vDSO mapping.

Previously, when a user-space app called mremap() for the vDSO
address, in the syscall return path it would land on the previous
address of the vDSOpage, resulting in segmentation violation.

Now it lands fine and returns to userspace with a remapped vDSO.

This will also fix the context.vdso pointer for 64-bit, which does
not affect the user of vDSO after mremap() currently, but this
may change in the future.

As suggested by Andy, return -EINVAL for mremap() that would
split the vDSO image: that operation cannot possibly result in
a working system so reject it.

Renamed and moved the text_mapping structure declaration inside
map_vdso(), as it used only there and now it complements the
vvar_mapping variable.

There is still a problem for remapping the vDSO in glibc
applications: the linker relocates addresses for syscalls
on the vDSO page, so you need to relink with the new
addresses.

Without that the next syscall through glibc may fail:

  Program received signal SIGSEGV, Segmentation fault.
  #0  0xf7fd9b80 in __kernel_vsyscall ()
  #1  0xf7ec8238 in _exit () from /usr/lib32/libc.so.6

Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160628113539.13606-2-dsafonov@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-08 14:17:51 +02:00
Alexander Aring
19580cc1ed ieee802154: add ieee802154_skb_src_pan helper
This patch adds ieee802154_skb_src_pan function to get the pointer
address of the source pan id at skb mac pointer.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 13:23:12 +02:00
Alexander Aring
9cc577dd25 ieee802154: add ieee802154_skb_dst_pan helper
This patch adds ieee802154_skb_dst_pan function to get the pointer
address of the destination pan id at skb mac pointer.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 13:23:12 +02:00
Borislav Petkov
069f0cd00d printk: Make the printk*once() variants return a value
Have printk*once() return a bool which denotes whether the string was
printed or not so that calling code can react accordingly.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1467671487-10344-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-08 11:33:18 +02:00
Dan Williams
29b9aa0aa3 libnvdimm: introduce devm_nvdimm_memremap(), convert nfit_spa_map() users
In preparation for generically mapping flush hint addresses for both the
BLK and PMEM use case, provide a generic / reference counted mapping
api.  Given the fact that a dimm may belong to multiple regions (PMEM
and BLK), the flush hint addresses need to be held valid as long as any
region associated with the dimm is active.  This is similar to the
existing BLK-region case where multiple BLK-regions may share an
aperture mapping.  Up-level this shared / reference-counted mapping
capability from the nfit driver to a core nvdimm capability.

This eliminates the need for the nd_blk_region.disable() callback.  Note
that the removal of nfit_spa_map() and related infrastructure is
deferred to a later patch.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-07 17:11:09 -07:00
Matias Bjørling
8680f165ea lightnvm: make ppa_list const in nvm_set_rqd_list
The passed by reference ppa list in nvm_set_rqd_list() is updated when
multiple planes are available. In that case, each PPA plane is
incremented when the device side PPA list is created. This prevents the
caller to rely on the PPA list to be unmodified after a call.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling
41285fad51 lightnvm: remove _unlocked variant of [get/put]_blk
The [get/put]_blk API enables targets to get ownership of blocks at
runtime. This information is currently not recorded on disk, and the
information is therefore lost on power failure. To restore the
metadata, the [get/put]_blk must persist its metadata. In that case,
we need to control the outer lock, so that we can disable them while
updating the on-disk metadata. Fortunately, the _unlocked versions can
be removed, which allows us to move the lock into the [get/put]_blk
functions.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling
b76eb20bb0 lightnvm: move target mgmt into media mgr
To enable persistent block management to easily control creation and
removal of targets, we move target management into the media
manager. The LightNVM core continues to maintain which target types are
registered, while the media manager now keeps track of its initialized
targets.

Two new callbacks for the media manager are introduced. create_tgt and
remove_tgt. Note that remove_tgt returns 0 on successfully removing a
target, and returns 1 if the target was not found.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Matias Bjørling
077d238999 lightnvm: remove open/close statistics for gennvm
The responsibility of the media manager is not to keep track of
open/closed blocks. This is better maintained within a target,
that already manages this information on writes.

Remove the statistics and merge the states NVM_BLK_ST_OPEN and
NVM_BLK_ST_CLOSED.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Javier González
5389a1dfb3 lightnvm: initialize ppa_addr in dev_to_generic_addr()
The ->reserved bit is not initialized when allocated on stack.
This may lead targets to misinterpret the PPA as cached.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Javier González
529435e818 lightnvm: add media manager mark_blk helper
Expose media manager mark_blk() to targets, as done for the rest of the
media manager callback functions.

Signed-off-by: Javier González <javier@cnexlabs.com>
Updated description
Signed-off-by: Matias Bjørling <m@bjorling.me>

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-07-07 08:51:52 -06:00
Thomas Gleixner
3d93f42d44 Merge branch 'clockevents/4.8' of http://git.linaro.org/people/daniel.lezcano/linux into timers/core
Pull the clockevents/clocksource tree from Daniel Lezcano:

  - Convert the clocksource-probe init functions to return a value in order to
    prepare the consolidation of the drivers using the DT. It is a big patchset
    but went through 01.org (kbuild bot), linux next and kernel-ci (continuous
    integration) (Daniel Lezcano)

  - Fix a bad error handling by returning the right value for cadence_ttc
    (Christophe Jaillet)

  - Fix typo in the Kconfig for the Samsung pwm (Alexandre Belloni)

  - Change functions to static for armada-370-xp and digicolor (Ben Dooks)

  - Add support for the rk3399 SoC timer by adding bindings and a slight
    change in the base address. Take the opportunity to add the DYNIRQ flag
    (Huang Tao)

  - Fix endian accessors for the Samsung pwm timer (Matthew Leach)

  - Add Oxford Semiconductor RPS Dual Timer driver (Neil Armstrong)

  - Add a kernel parameter to swich on/off the event stream feature of the arch
    arm timer (Will Deacon)
2016-07-07 15:41:13 +02:00
Arnd Bergmann
82be1178ee Fix a long time regression for ir-rx51 driver for n900 device tree
booting.
 
 This driver has been unusable with multiarch because of the hardware
 timer access. With the recent PWM changes, we can finally fix the
 driver for multiarch and device tree support. And naturally there
 is no rush for these for the -rc cycle, these can wait for the
 merge window.
 
 The PWM changes have been acked by Thierry. For the media changes
 I did not get an ack from Mauro but he was Cc'd in the discussion
 and these changes do not conflict with other media changes.
 
 After this series we can drop the remaining omap3 legacy booting
 board files finally.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXe1v2AAoJEBvUPslcq6VzLvsQALIWLcgyqUKNgPV6EOxAsxeO
 azvjPPZgI0oTl5+voS6dzSg5mXhQt3rVlaXxJXxCdxYgOf67zYjugrJqf5Sm4stF
 BEtzPGLRTwR8miJbHNB8eGfcBGU3q+hx19dT+U7EAsPQqItlFfxVTh5ZzS0TOZc2
 9bQGO6MYIWNEk6V0TRxNUfRt+4fQN17zRdIpKL8NCJOsoBn9ZpztB76GeHUnd34t
 pYeaMFPdVUqBCQOmE9EA3dmOB7q37JgtsM312zpn72OgzpTTttza1Vo6HAh7dTWM
 Pmphin+H54zmsrB3wvI0h4G2WmuZjNNEHBTbLV4fgTjDer4o84RHJUGOgQmmPnni
 q9nyuV0y1gudk8ihlq7hxnickPS+NCSk8aQgAfspY+8rhxYWstJdJ66gUl+bUPlQ
 qaOvNACaEMVuRdeqnuTLm/mhQdVLz5VuUzSZkAWosVVuCVY9ONLt/t/Mxbylqf9P
 CfKiDBEUp+DQ8Ly4L0odD7n4fOWOq6XOx3RR6C4IJqkLyb6VL5O8POKNsHuMvb1q
 ptWyN1aWHrdYvd4PD0QB6hakY+C2P9BSAMb0t9ZscAX1nGvkGY27TRzHD8rkhHZD
 WKy5J4gzr8VDLnQom9MwoXcoqrWKTGY5QCqw1hdUrqNlVnRJTu2jYc5NrYxOK0mD
 TUC5drgK5e2mjmgUv95q
 =frvc
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v4.8/ir-rx51-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/drivers

Merge "omap ir-rx51 driver fixes for multiarch for v4.8 merge window"
from Tony Lindgren:

Fix a long time regression for ir-rx51 driver for n900 device tree
booting.

This driver has been unusable with multiarch because of the hardware
timer access. With the recent PWM changes, we can finally fix the
driver for multiarch and device tree support. And naturally there
is no rush for these for the -rc cycle, these can wait for the
merge window.

The PWM changes have been acked by Thierry. For the media changes
I did not get an ack from Mauro but he was Cc'd in the discussion
and these changes do not conflict with other media changes.

After this series we can drop the remaining omap3 legacy booting
board files finally.

* tag 'omap-for-v4.8/ir-rx51-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ir-rx51: use hrtimer instead of dmtimer
  ir-rx51: add DT support to driver
  ir-rx51: use PWM framework instead of OMAP dmtimer
  pwm: omap-dmtimer: Allow for setting dmtimer clock source
  ir-rx51: Fix build after multiarch changes broke it
2016-07-07 14:32:08 +02:00
Arnd Bergmann
7dccd2ec96 Reset controller changes for v4.8, part 3
- change request API to be more explicit about the difference between
   exclusive and shared resets (the former guarantee the reset line is
   asserted immediately when reset_control_assert is called, the latter
   are refcounted and do not guarantee this).
 - add Hisilicon hi6220 media subsystem reset controller support
 - add TI SYSCON based reset controller support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJXdEKmAAoJEFDCiBxwnmDr0qUQAK7zmAATBcgGp/ipvkEajNzp
 oTnxTC8kpGJ1oG0rQzXVDHImcOGxsLaV2CwEx5KLRyTkMvTRWroCrmpspmwpFgbg
 P9CY2HBim7TJzEKG3Z7qJ5cymvGeS4aDFWHZ97Il26/ElGcXqnNRlI8jmYEBuAQD
 GiW3Q1mzxp/fsr0vjIn8JE/j4T4kUrs2A1TSv4AjlBtDp1eHdDzWq1l11TvH9GTO
 fpSOMEPxfU+KPN4Bdu0ZrxQan8wlCFr+YBvkXzaJRZv1bjRYOqgqhYnprM0VT62J
 KZkkrDUzZ/QweOhH4bFr0REddWGGnj9NZ7rmaAYLi6iTy9vHv8AwFuST3J+bmCuu
 soOzx+HTLJuYnOtb4dg7KQRqLik04V3oNKqdeuCxeh+nRpOjssb1HeycN7tcrIPb
 6Io4amMYP3S81B3LZIXVh/zxUxHmPudtKeiDl7YEaZR9/oY7LIh1DSlysBru9+OQ
 bX+lxOzUdNaXBZAt/9MHC4Z6GDe0/xr7G6v1cQjRSzwX6HC8PPY3fgnyB04CfUZK
 srnLI+ZzxOjEqEaqJ3/ekJRZQ3dLUIB1NibrUF4U7JPTwkM/509jyOUwOb9CsPuP
 5P6x/ufpVHFGZvHc1LIk1GKwUi+Zh0VQbFwGCaVn7UdIenw5rURiWS9/Bv/VhtB7
 bbfx0gRGzvgIfs1esRPS
 =5bzt
 -----END PGP SIGNATURE-----

Merge tag 'reset-for-4.8-3' of git://git.pengutronix.de/git/pza/linux into next/drivers

Merge "Reset controller changes for v4.8, part 3" from Philipp Zabel:

- change request API to be more explicit about the difference between
  exclusive and shared resets (the former guarantee the reset line is
  asserted immediately when reset_control_assert is called, the latter
  are refcounted and do not guarantee this).
- add Hisilicon hi6220 media subsystem reset controller support
- add TI SYSCON based reset controller support

* tag 'reset-for-4.8-3' of git://git.pengutronix.de/git/pza/linux:
  reset: add TI SYSCON based reset driver
  Documentation: dt: reset: Add TI syscon reset binding
  reset: hisilicon: Add hi6220 media subsystem reset support
  reset: hisilicon: Change to syscon register access
  arm64: dts: hi6220: Add media subsystem reset dts
  reset: hisilicon: Add media reset controller binding
  reset: TRIVIAL: Add line break at same place for similar APIs
  reset: Supply *_shared variant calls when using *_optional APIs
  reset: Supply *_shared variant calls when using of_* API
  reset: Ensure drivers are explicit when requesting reset lines
  reset: Reorder inline reset_control_get*() wrappers
2016-07-07 14:09:00 +02:00
Ingo Molnar
4b4b20852d Merge branch 'timers/fast-wheel' into timers/core 2016-07-07 10:35:28 +02:00
Thomas Gleixner
53bf837b78 timers: Remove set_timer_slack() leftovers
We now have implicit batching in the timer wheel. The slack API is no longer
used, so remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrew F. Davis <afd@ti.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jaehoon Chung <jh80.chung@samsung.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathias Nyman <mathias.nyman@intel.com>
Cc: Pali Rohár <pali.rohar@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mmc@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.189813118@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:09 +02:00
Thomas Gleixner
500462a9de timers: Switch to a non-cascading wheel
The current timer wheel has some drawbacks:

1) Cascading:

   Cascading can be an unbound operation and is completely pointless in most
   cases because the vast majority of the timer wheel timers are canceled or
   rearmed before expiration. (They are used as timeout safeguards, not as
   real timers to measure time.)

2) No fast lookup of the next expiring timer:

   In NOHZ scenarios the first timer soft interrupt after a long NOHZ period
   must fast forward the base time to the current value of jiffies. As we
   have no way to find the next expiring timer fast, the code loops linearly
   and increments the base time one by one and checks for expired timers
   in each step. This causes unbound overhead spikes exactly in the moment
   when we should wake up as fast as possible.

After a thorough analysis of real world data gathered on laptops,
workstations, webservers and other machines (thanks Chris!) I came to the
conclusion that the current 'classic' timer wheel implementation can be
modified to address the above issues.

The vast majority of timer wheel timers is canceled or rearmed before
expiry. Most of them are timeouts for networking and other I/O tasks. The
nature of timeouts is to catch the exception from normal operation (TCP ack
timed out, disk does not respond, etc.). For these kinds of timeouts the
accuracy of the timeout is not really a concern. Timeouts are very often
approximate worst-case values and in case the timeout fires, we already
waited for a long time and performance is down the drain already.

The few timers which actually expire can be split into two categories:

 1) Short expiry times which expect halfways accurate expiry

 2) Long term expiry times are inaccurate today already due to the
    batching which is done for NOHZ automatically and also via the
    set_timer_slack() API.

So for long term expiry timers we can avoid the cascading property and just
leave them in the less granular outer wheels until expiry or
cancelation. Timers which are armed with a timeout larger than the wheel
capacity are no longer cascaded. We expire them with the longest possible
timeout (6+ days). We have not observed such timeouts in our data collection,
but at least we handle them, applying the rule of the least surprise.

To avoid extending the wheel levels for HZ=1000 so we can accomodate the
longest observed timeouts (5 days in the network conntrack code) we reduce the
first level granularity on HZ=1000 to 4ms, which effectively is the same as
the HZ=250 behaviour. From our data analysis there is nothing which relies on
that 1ms granularity and as a side effect we get better batching and timer
locality for the networking code as well.

Contrary to the classic wheel the granularity of the next wheel is not the
capacity of the first wheel. The granularities of the wheels are in the
currently chosen setting 8 times the granularity of the previous wheel.

So for HZ=250 we end up with the following granularity levels:

 Level Offset   Granularity                  Range
     0      0          4 ms                 0 ms -        252 ms
     1     64         32 ms               256 ms -       2044 ms (256ms - ~2s)
     2    128        256 ms              2048 ms -      16380 ms (~2s   - ~16s)
     3    192       2048 ms (~2s)       16384 ms -     131068 ms (~16s  - ~2m)
     4    256      16384 ms (~16s)     131072 ms -    1048572 ms (~2m   - ~17m)
     5    320     131072 ms (~2m)     1048576 ms -    8388604 ms (~17m  - ~2h)
     6    384    1048576 ms (~17m)    8388608 ms -   67108863 ms (~2h   - ~18h)
     7    448    8388608 ms (~2h)    67108864 ms -  536870911 ms (~18h  - ~6d)

That's a worst case inaccuracy of 12.5% for the timers which are queued at the
beginning of a level.

So the new wheel concept addresses the old issues:

1) Cascading is avoided completely

2) By keeping the timers in the bucket until expiry/cancelation we can track
   the buckets which have timers enqueued in a bucket bitmap and therefore can
   look up the next expiring timer very fast and O(1).

A further benefit of the concept is that the slack calculation which is done
on every timer start is no longer necessary because the granularity levels
provide natural batching already.

Our extensive testing with various loads did not show any performance
degradation vs. the current wheel implementation.

This patch does not address the 'fast lookup' issue as we wanted to make sure
that there is no regression introduced by the wheel redesign. The
optimizations are in follow up patches.

This patch contains fixes from Anna-Maria Gleixner and Richard Cochran.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.108621834@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:09 +02:00
Thomas Gleixner
b0d6e2dcb2 timers: Reduce the CPU index space to 256k
We want to store the array index in the flags space. 256k CPUs should be
enough for a while.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.030144293@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:08 +02:00
Thomas Gleixner
15dba1e37b hlist: Add hlist_is_singular_node() helper
Required to figure out whether the entry is the only one in the hlist.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.867631372@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:07 +02:00
Thomas Gleixner
177ec0a0a5 timers: Remove the deprecated mod_timer_pinned() API
We switched all users to initialize the timers as pinned and call
mod_timer(). Remove the now unused timer API function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.706205231@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:35:06 +02:00
Thomas Gleixner
e675447bda timers: Make 'pinned' a timer property
We want to move the timer migration logic from a 'push' to a 'pull' model.

Under the current 'push' model pinned timers are handled via
a runtime API variant: mod_timer_pinned().

The 'pull' model requires us to store the pinned attribute of a timer
in the timer_list structure itself, as a new TIMER_PINNED bit in
timer->flags.

This flag must be set at initialization time and the timer APIs
recognize the flag.

This patch:

 - Implements the new flag and associated new-style initialization
   methods

 - makes mod_timer() recognize new-style pinned timers,

 - and adds some migration helper facility to allow
   step by step conversion of old-style to new-style
   pinned timers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.049338558@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 10:25:13 +02:00
Davidlohr Bueso
f06628638c locking/atomic: Introduce inc/dec variants for the atomic_fetch_$op() API
With the inclusion of atomic FETCH-OP variants, many places in the
kernel can make use of atomic_fetch_$op() to avoid the callers that
need to compute the value/state _before_ the operation.

Peter Zijlstra laid out the machinery but we are still missing the
simpler dec,inc() calls (which future patches will make use of).

This patch only deals with the generic code, as at least right now
no arch actually implement them -- which is similar to what the
OP-RETURN primitives currently do.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James.Bottomley@HansenPartnership.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: awalls@md.metrocast.net
Cc: bp@alien8.de
Cc: cw00.choi@samsung.com
Cc: davem@davemloft.net
Cc: dledford@redhat.com
Cc: dougthompson@xmission.com
Cc: gregkh@linuxfoundation.org
Cc: hans.verkuil@cisco.com
Cc: heiko.carstens@de.ibm.com
Cc: jikos@kernel.org
Cc: kys@microsoft.com
Cc: mchehab@osg.samsung.com
Cc: pfg@sgi.com
Cc: schwidefsky@de.ibm.com
Cc: sean.hefty@intel.com
Cc: sumit.semwal@linaro.org
Link: http://lkml.kernel.org/r/20160628215651.GA20048@linux-80c1.suse
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 09:16:20 +02:00
Ingo Molnar
3ebe3bd8fb Merge branch 'perf/urgent' into perf/core, to pick up fixes before merging new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-07 08:58:23 +02:00
David S. Miller
a90a6e55f3 One more set of new features:
* beacon report (for radio measurement) support in cfg80211/mac80211
  * hwsim: allow wmediumd in namespaces
  * mac80211: extend 160MHz workaround to CSA IEs
  * mesh: properly encrypt group-addressed privacy action frames
  * mesh: allow setting peer AID
  * first steps for MU-MIMO monitor mode
  * along with various other cleanups and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJXfSCuAAoJEGt7eEactAAdaugP/ilrcELQRsIN5ZXCAKYZuwXV
 T01JPwgOaWL9ILu7h1SfG/+j9kzMnyk4WmRpeoj2FGNcyfG2AvULWSLpQJ2abwgQ
 8o/emuLinQwRENevaMUTRSOE0HkXoFPCbbq37+a2i6bAv1QSYY3A0xvWpcU5fZ4D
 7CKYDYPBAdMXYwEwy1g4nYWfDAYqS4rthr3l3rS1Cy7Q2T1ZlMlD90GjD7oeQAEw
 orKulhkkDSzvxfvZCYTzXmUoBQE8sNXGDD+OFsJyowyt+ugM/xan+2tmhCaHSnda
 HpdCS2aRj779UBn9cOfELjffTNpS++PM6KFd8ZDaPcJSMginn/BAHTOeNfNUfL0Q
 +Enu59I82qMDzbG2z1Qezzjv7OTzyEvyvYzNbLOqljTBSBklDa3rHwhyk+g1oVBH
 +4xX1Vk5QBLde+Q0NS0gTkcqOQK8KT5+HEqiUfgLSNDETN0lSGsKbtvMfU/ikE1Y
 aLRkTp7nzUd03qjIFLS6RMf7JjucWWzH1ZXTHvbpDFAG7riOhYRD3Sw+0e7madTd
 +HXjH9dnOGnGPDL+FyDwtW6iclYwNjcIPQiNdOjwWfMA2Wmr7iq+aFUptRCwQTHB
 WtgJ3f8OHax2JXcm4grYfxELZip5vbWJJHUC84Drvmzw3X7FRITf+OEWjdNOzsRD
 Fc7w5ceThh9Id1BvcH2+
 =R1qP
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2016-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
One more set of new features:
 * beacon report (for radio measurement) support in cfg80211/mac80211
 * hwsim: allow wmediumd in namespaces
 * mac80211: extend 160MHz workaround to CSA IEs
 * mesh: properly encrypt group-addressed privacy action frames
 * mesh: allow setting peer AID
 * first steps for MU-MIMO monitor mode
 * along with various other cleanups and improvements
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-06 22:32:15 -07:00
Olof Johansson
c341376ae9 Second Round of Renesas ARM Based SoC Updates for v4.8
* Add DT support to the APMU driver and prioritise DT APMU support
 * Obtain extal frequency from DT
 * Add support for r8a7792
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXdR2IAAoJENfPZGlqN0++PFsP+QG1N24uc3pc1+kDFvgnPzqH
 Hs3SzPdSLTjx72K4gybaxLp6N6IMLWwHZUSfWDa1WoUDw8j4yVieJ+bTqPQ52SAS
 +9bRbUb5P1kowQ643xIxOoNPKN+XERJtFMUgt9NW/auzxGJFtVuxkDkxzovwWtgm
 f4+wQSgf0zgl/JabbD1qZtf8zhm8gPGsG92CFJqPmkGYoXjVNkiGwsAF4W1WCKfJ
 oXGUAwuVCpoZluLTg7sY2drqIb/L2mZ0QunjdY3dEJO7YxDF7cYTCE0BgGhN/oGH
 nFme3l9WPi2cPMfEl6Ucfvj4lXPeReXhTIGq8XzKCyJy75WlZsCCPgNJ0nuBvriC
 i0V4q2wfpfubzxjcVGdGfP/05QOY2ZZbvBvqaLyxorjw7atTGhn0QPgQtIfL4P1T
 nlf1fMMtlG5m8CjM7J3nAR6BKcEEF1plUL0c1ZG6UeZCAENX9MyB0J47B7+umI1a
 xuNugqKJg2Z8OsFKJ40MwdvlQuw6QRNw8am4erEou0pt3tzY2+O9pgw7yV2ElNkX
 OVskuk+ove0uK5XAmOdClZHtTl5+wNp+XKCKTcUGHY7L4RaC9zax0YujrrBoe/wz
 QvqwZF5As7pXk8ckzDS4Zvxsc2AuKQBRKfIp9JMcM9urEwRdlWvhb/OgKUjlqsve
 BSj+hDjpO4V4VZCSQv8r
 =+RmG
 -----END PGP SIGNATURE-----

Merge tag 'renesas-soc2-for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/soc

Second Round of Renesas ARM Based SoC Updates for v4.8

* Add DT support to the APMU driver and prioritise DT APMU support
* Obtain extal frequency from DT
* Add support for r8a7792

* tag 'renesas-soc2-for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: r8a7791: Prioritize DT APMU support
  ARM: shmobile: r8a7790: Prioritize DT APMU support
  ARM: shmobile: smp: Add function to prioritize DT SMP
  ARM: shmobile: apmu: Add APMU DT support via Enable method
  ARM: shmobile: apmu: Move #ifdef CONFIG_SMP to cover more functions
  ARM: shmobile: rcar-gen2: Correct arch timer frequency on R-Car V2H
  ARM: shmobile: rcar-gen2: Obtain extal frequency from DT
  ARM: shmobile: r8a7792: basic SoC support
  soc: renesas: rcar-sysc: Improve SYSC interrupt config in legacy wrapper
  soc: renesas: rcar-sysc: Move SYSC interrupt config to rcar-sysc driver
  soc: renesas: rcar-sysc: Make rcar_sysc_init() init the PM domains
  soc: renesas: rcar-sysc: Fix uninitialized error code in rcar_sysc_pd_init()
  soc: renesas: rcar-sysc: add R8A7792 support
  soc: renesas: rcar-sysc: Add support for R-Car M3-W power areas
  soc: renesas: Add r8a7796 SYSC PM Domain Binding Definitions
  soc: renesas: rcar-sysc: Document r8a7796 support

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-07-06 22:21:11 -07:00
Olof Johansson
b85751c761 Second Round of Renesas ARM Based SoC R-Car SYSC Updates for v4.8
* Prepare for handling SYSC interrupt configuration purely
   from DT in the rcar-sysc driver for new SoCs, while preserving
   backward compatibility with old DTBs for R-Car H1, H2, and M2-W
 * Add R8A7792 support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXdRvUAAoJENfPZGlqN0++qocQAJ6MApU5z9Bn+bLCCLOUeL+F
 5qtRZIQoO/VQ+7sbRR7pBTTLbKs13Zeizq9CxvhGVtq/YUqzP25/RJ8mE25j1Hq1
 +X3QVVOIqCnRQVDlGSq6Og7IMOvCVsNgqxWdmAwtkuiRkD75Qsb6lkmLH8MyNkRj
 6UcC7UUJ5P/Pes9Qgh7ZqfWHD+A7JkUbYlN8THQkuSOJyA5Vf/jAry6riQSkJiuy
 854xJ/f1FRAQas/Usqr3i+6gpZtRw66n5DA+JFubXJLxvrqnZUDdjV2d1tS8phC6
 /AIGxx8n967lPg7j9DO6YxEViF3sxFrvQNds8qYoxzH6+z7FtApBJW9EDLkb58+H
 nvP93DnYz1PzIjcKq0CDreqVnPDtBFK4kfDzkMxLIfbfQVgtSQlo0rpjHQzUA24u
 6zk0Nhl2DM7pe9BYxQNGwWK9f9QIKxiKx9y8Qqkghylq27V34tG+dBgRj0Klv6G+
 QHJNg2ShGZ/esup14eOh4YOw4ks6oNPeda0Rbl4ox3Ec970/42uIR495rsM40Avb
 G17hpvsCd4oVq5xhmr8CKSDzNleewRYGxbzs5Yp1VpWxAjkKOO2AlCxaWSYtYpQe
 gO1sT/YDWx4m0wiEa1/W0z3tO+U3w3wKaKOtxOZyjOj6dq99MJxPwMRoHYUGyMKW
 gGN/bJjsoxBJO/Kl0om+
 =/p9r
 -----END PGP SIGNATURE-----

Merge tag 'renesas-rcar-sysc2-for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/drivers

Second Round of Renesas ARM Based SoC R-Car SYSC Updates for v4.8

* Prepare for handling SYSC interrupt configuration purely
  from DT in the rcar-sysc driver for new SoCs, while preserving
  backward compatibility with old DTBs for R-Car H1, H2, and M2-W
* Add R8A7792 support

* tag 'renesas-rcar-sysc2-for-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  soc: renesas: rcar-sysc: Improve SYSC interrupt config in legacy wrapper
  soc: renesas: rcar-sysc: Move SYSC interrupt config to rcar-sysc driver
  soc: renesas: rcar-sysc: Make rcar_sysc_init() init the PM domains
  soc: renesas: rcar-sysc: Fix uninitialized error code in rcar_sysc_pd_init()
  soc: renesas: rcar-sysc: add R8A7792 support

Signed-off-by: Olof Johansson <olof@lixom.net>
2016-07-06 22:12:04 -07:00
Viresh Kumar
da0c6dc00c cpufreq: Handle sorted frequency tables more efficiently
cpufreq drivers aren't required to provide a sorted frequency table
today, and even the ones which provide a sorted table aren't handled
efficiently by cpufreq core.

This patch adds infrastructure to verify if the freq-table provided by
the drivers is sorted or not, and use efficient helpers if they are
sorted.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-07 00:13:20 +02:00
David S. Miller
30d0844bdc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mellanox/mlx5/core/en.h
	drivers/net/ethernet/mellanox/mlx5/core/en_main.c
	drivers/net/usb/r8152.c

All three conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-06 10:35:22 -07:00
Linus Torvalds
bc86765181 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) All users of AF_PACKET's fanout feature want a symmetric packet
    header hash for load balancing purposes, so give it to them.

 2) Fix vlan state synchronization in e1000e, from Jarod Wilson.

 3) Use correct socket pointer in ip_skb_dst_mtu(), from Shmulik
    Ladkani.

 4) mlx5 bug fixes from Mohamad Haj Yahia, Daniel Jurgens, Matthew
    Finlay, Rana Shahout, and Shaker Daibes.  Mostly to do with
    operation timeouts and PCI error handling.

 5) Fix checksum handling in mirred packet action, from WANG Cong.

 6) Set skb->dev correctly when transmitting in !protect_frames case of
    macsec driver, from Daniel Borkmann.

 7) Fix MTU calculation in geneve driver, from Haishuang Yan.

 8) Missing netif_napi_del() in unregister path of qeth driver, from
    Ursula Braun.

 9) Handle malformed route netlink messages in decnet properly, from
    Vergard Nossum.

10) Memory leak of percpu data in ipv6 routing code, from Martin KaFai
    Lau.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
  ipv6: Fix mem leak in rt6i_pcpu
  net: fix decnet rtnexthop parsing
  cxgb4: update latest firmware version supported
  net/mlx5: Avoid setting unused var when modifying vport node GUID
  bonding: fix enslavement slave link notifications
  r8152: fix runtime function for RTL8152
  qeth: delete napi struct when removing a qeth device
  Revert "fsl/fman: fix error handling"
  fsl/fman: fix error handling
  cdc_ncm: workaround for EM7455 "silent" data interface
  RDS: fix rds_tcp_init() error path
  geneve: fix max_mtu setting
  net: phy: dp83867: Fix initialization of PHYCR register
  enc28j60: Fix race condition in enc28j60 driver
  net: stmmac: Fix null-function call in ISR on stmmac1000
  tipc: fix nl compat regression for link statistics
  net: bcmsysport: Device stats are unsigned long
  macsec: set actual real device for xmit when !protect_frames
  net_sched: fix mirrored packets checksum
  packet: Use symmetric hash for PACKET_FANOUT_HASH.
  ...
2016-07-06 09:42:43 -07:00
David S. Miller
ae3e4562e2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next,
they are:

1) Don't use userspace datatypes in bridge netfilter code, from
   Tobin Harding.

2) Iterate only once over the expectation table when removing the
   helper module, instead of once per-netns, from Florian Westphal.

3) Extra sanitization in xt_hook_ops_alloc() to return error in case
   we ever pass zero hooks, xt_hook_ops_alloc():

4) Handle NFPROTO_INET from the logging core infrastructure, from
   Liping Zhang.

5) Autoload loggers when TRACE target is used from rules, this doesn't
   change the behaviour in case the user already selected nfnetlink_log
   as preferred way to print tracing logs, also from Liping Zhang.

6) Conntrack slabs with SLAB_HWCACHE_ALIGN to allow rearranging fields
   by cache lines, increases the size of entries in 11% per entry.
   From Florian Westphal.

7) Skip zone comparison if CONFIG_NF_CONNTRACK_ZONES=n, from Florian.

8) Remove useless defensive check in nf_logger_find_get() from Shivani
   Bhardwaj.

9) Remove zone extension as place it in the conntrack object, this is
   always include in the hashing and we expect more intensive use of
   zones since containers are in place. Also from Florian Westphal.

10) Owner match now works from any namespace, from Eric Bierdeman.

11) Make sure we only reply with TCP reset to TCP traffic from
    nf_reject_ipv4, patch from Liping Zhang.

12) Introduce --nflog-size to indicate amount of network packet bytes
    that are copied to userspace via log message, from Vishwanath Pai.
    This obsoletes --nflog-range that has never worked, it was designed
    to achieve this but it has never worked.

13) Introduce generic macros for nf_tables object generation masks.

14) Use generation mask in table, chain and set objects in nf_tables.
    This allows fixes interferences with ongoing preparation phase of
    the commit protocol and object listings going on at the same time.
    This update is introduced in three patches, one per object.

15) Check if the object is active in the next generation for element
    deactivation in the rbtree implementation, given that deactivation
    happens from the commit phase path we have to observe the future
    status of the object.

16) Support for deletion of just added elements in the hash set type.

17) Allow to resize hashtable from /proc entry, not only from the
    obscure /sys entry that maps to the module parameter, from Florian
    Westphal.

18) Get rid of NFT_BASECHAIN_DISABLED, this code is not exercised
    anymore since we tear down the ruleset whenever the netdevice
    goes away.

19) Support for matching inverted set lookups, from Arturo Borrero.

20) Simplify the iptables_mangle_hook() by removing a superfluous
    extra branch.

21) Introduce ether_addr_equal_masked() and use it from the netfilter
    codebase, from Joe Perches.

22) Remove references to "Use netfilter MARK value as routing key"
    from the Netfilter Kconfig description given that this toggle
    doesn't exists already for 10 years, from Moritz Sichert.

23) Introduce generic NF_INVF() and use it from the xtables codebase,
    from Joe Perches.

24) Setting logger to NONE via /proc was not working unless explicit
    nul-termination was included in the string. This fixes seems to
    leave the former behaviour there, so we don't break backward.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-06 09:15:15 -07:00
Paul E. McKenney
995f140561 rcu: Suppress sparse warnings for rcu_dereference_raw()
Data structures that are used both with and without RCU protection
are difficult to write in a sparse-clean manner.  If you mark the
relevant pointers with __rcu, sparse will complain about all non-RCU
uses, but if you don't mark those pointers, sparse will complain about
all RCU uses.

This commit therefore suppresses sparse warnings for rcu_dereference_raw(),
allowing mixed-protection data structures to avoid these warnings.

Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2016-07-06 10:51:14 +01:00