19140 Commits

Author SHA1 Message Date
Linus Torvalds
47137c6ba1 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Nothing really exciting this time:

   - a few fixlets in the NOHZ code

   - a new ARM SoC timer abomination.  One should expect that we have
     enough of them already, but they insist on inventing new ones.

   - the usual bunch of ARM SoC timer updates.  That feels like herding
     cats"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: arm_arch_timer: Consolidate arch_timer_evtstrm_enable
  clocksource: arm_arch_timer: Enable counter access for 32-bit ARM
  clocksource: arm_arch_timer: Change clocksource name if CP15 unavailable
  clocksource: sirf: Disable counter before re-setting it
  clocksource: cadence_ttc: Add support for 32bit mode
  clocksource: tcb_clksrc: Sanitize IRQ request
  clocksource: arm_arch_timer: Discard unavailable timers correctly
  clocksource: vf_pit_timer: Support shutdown mode
  ARM: meson6: clocksource: Add Meson6 timer support
  ARM: meson: documentation: Add timer documentation
  clocksource: sh_tmu: Document r8a7779 binding
  clocksource: sh_mtu2: Document r7s72100 binding
  clocksource: sh_cmt: Document SoC specific bindings
  timerfd: Remove an always true check
  nohz: Avoid tick's double reprogramming in highres mode
  nohz: Fix spurious periodic tick behaviour in low-res dynticks mode
2014-10-09 06:35:05 -04:00
Linus Torvalds
afa3536be8 Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
 "Main changes:

  - Fix the deadlock reported by Dave Jones et al
  - Clean up and fix nohz_full interaction with arch abilities
  - nohz init code consolidation/cleanup"

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  nohz: nohz full depends on irq work self IPI support
  nohz: Consolidate nohz full init code
  arm64: Tell irq work about self IPI support
  arm: Tell irq work about self IPI support
  x86: Tell irq work about self IPI support
  irq_work: Force raised irq work to run on irq work interrupt
  irq_work: Introduce arch_irq_work_has_interrupt()
  nohz: Move nohz full init call to tick init
2014-10-09 06:30:57 -04:00
Martin Schwidefsky
fe0f49768d s390/nohz: use a per-cpu flag for arch_needs_cpu
Move the nohz_delay bit from the s390_idle data structure to the
per-cpu flags. Clear the nohz delay flag in __cpu_disable and
remove the cpu hotplug notifier that used to do this.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-10-09 09:14:02 +02:00
Ingo Molnar
fd19bda491 Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull additional commits for locktorture, from Paul E. McKenney.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-09 08:39:25 +02:00
Al Viro
849f3127bb switch /dev/kmsg to ->write_iter()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-10-09 02:39:09 -04:00
Linus Torvalds
35a9ad8af0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Most notable changes in here:

   1) By far the biggest accomplishment, thanks to a large range of
      contributors, is the addition of multi-send for transmit.  This is
      the result of discussions back in Chicago, and the hard work of
      several individuals.

      Now, when the ->ndo_start_xmit() method of a driver sees
      skb->xmit_more as true, it can choose to defer the doorbell
      telling the driver to start processing the new TX queue entries.

      skb->xmit_more means that the generic networking is guaranteed to
      call the driver immediately with another SKB to send.

      There is logic added to the qdisc layer to dequeue multiple
      packets at a time, and the handling of mis-predicted offloads in
      software is now done with no locks held.

      Finally, pktgen is extended to have a "burst" parameter that can
      be used to test a multi-send implementation.

      Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4,
      virtio_net

      Adding support is almost trivial, so expect more drivers to
      support this optimization soon.

      I want to thank, in no particular or implied order, Jesper
      Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal
      Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann,
      David Tat, Hannes Frederic Sowa, and Rusty Russell.

   2) PTP and timestamping support in bnx2x, from Michal Kalderon.

   3) Allow adjusting the rx_copybreak threshold for a driver via
      ethtool, and add rx_copybreak support to enic driver.  From
      Govindarajulu Varadarajan.

   4) Significant enhancements to the generic PHY layer and the bcm7xxx
      driver in particular (EEE support, auto power down, etc.) from
      Florian Fainelli.

   5) Allow raw buffers to be used for flow dissection, allowing drivers
      to determine the optimal "linear pull" size for devices that DMA
      into pools of pages.  The objective is to get exactly the
      necessary amount of headers into the linear SKB area pre-pulled,
      but no more.  The new interface drivers use is eth_get_headlen().
      From WANG Cong, with driver conversions (several had their own
      by-hand duplicated implementations) by Alexander Duyck and Eric
      Dumazet.

   6) Support checksumming more smoothly and efficiently for
      encapsulations, and add "foo over UDP" facility.  From Tom
      Herbert.

   7) Add Broadcom SF2 switch driver to DSA layer, from Florian
      Fainelli.

   8) eBPF now can load programs via a system call and has an extensive
      testsuite.  Alexei Starovoitov and Daniel Borkmann.

   9) Major overhaul of the packet scheduler to use RCU in several major
      areas such as the classifiers and rate estimators.  From John
      Fastabend.

  10) Add driver for Intel FM10000 Ethernet Switch, from Alexander
      Duyck.

  11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric
      Dumazet.

  12) Add Datacenter TCP congestion control algorithm support, from
      Florian Westphal.

  13) Reorganize sk_buff so that __copy_skb_header() is significantly
      faster.  From Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits)
  netlabel: directly return netlbl_unlabel_genl_init()
  net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers
  net: description of dma_cookie cause make xmldocs warning
  cxgb4: clean up a type issue
  cxgb4: potential shift wrapping bug
  i40e: skb->xmit_more support
  net: fs_enet: Add NAPI TX
  net: fs_enet: Remove non NAPI RX
  r8169:add support for RTL8168EP
  net_sched: copy exts->type in tcf_exts_change()
  wimax: convert printk to pr_foo()
  af_unix: remove 0 assignment on static
  ipv6: Do not warn for informational ICMP messages, regardless of type.
  Update Intel Ethernet Driver maintainers list
  bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING
  tipc: fix bug in multicast congestion handling
  net: better IFF_XMIT_DST_RELEASE support
  net/mlx4_en: remove NETDEV_TX_BUSY
  3c59x: fix bad split of cpu_to_le32(pci_map_single())
  net: bcmgenet: fix Tx ring priority programming
  ...
2014-10-08 21:40:54 -04:00
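The multi-send behaviour described in item 1 of the net-next pull message above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyNic`, `start_xmit`, `ring_doorbell`, and `send_burst` are hypothetical names invented for the illustration, not kernel APIs, and the model ignores queue-full handling and locking entirely. It shows just the one idea from the message: when the stack signals that another packet follows immediately, the driver may postpone the doorbell and announce the whole batch once.

```python
class ToyNic:
    """Hypothetical NIC model (not a kernel API): counts doorbell writes."""

    def __init__(self):
        self.ring = []       # descriptors queued but not yet announced
        self.sent = []       # packets the "hardware" has been told about
        self.doorbells = 0   # number of (expensive) doorbell writes

    def start_xmit(self, pkt, xmit_more):
        # Stand-in for a driver's transmit hook: queue the descriptor,
        # and ring the doorbell only if no further packet is promised.
        self.ring.append(pkt)
        if not xmit_more:
            self.ring_doorbell()

    def ring_doorbell(self):
        # One doorbell announces every descriptor queued so far.
        self.doorbells += 1
        self.sent.extend(self.ring)
        self.ring.clear()


def send_burst(nic, packets, batching=True):
    """Stand-in for the stack: set xmit_more on all but the last packet."""
    for i, pkt in enumerate(packets):
        is_last = i == len(packets) - 1
        nic.start_xmit(pkt, xmit_more=batching and not is_last)


if __name__ == "__main__":
    batched, unbatched = ToyNic(), ToyNic()
    send_burst(batched, list(range(8)), batching=True)
    send_burst(unbatched, list(range(8)), batching=False)
    print("doorbells with xmit_more:", batched.doorbells)
    print("doorbells without:", unbatched.doorbells)
```

In this model the same eight packets reach the device either way; the only difference is how many doorbell writes were needed to get them there.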
Peter Zijlstra
fe0e01c77d tracing: Robustify wait loop
The pending nested sleep debugging triggered on the potential stale
TASK_INTERRUPTIBLE in this code.

While there, fix the loop such that we won't revert to a while(1)
yield() 'spin' loop if we ever get a spurious wakeup.

And fix the actual issue by properly terminating the 'wait' loop by
setting TASK_RUNNING.

Link: http://lkml.kernel.org/p/20141008165110.GA14547@worktop.programming.kicks-ass.net

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-10-08 19:51:01 -04:00
David S. Miller
64b1f00a08 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
2014-10-08 16:22:22 -04:00
Linus Torvalds
25641c0c8d NFS client updates for Linux 3.18
Highlights include:
 
 Stable fixes:
 - fix an NFSv4.1 state renewal regression
 - fix open/lock state recovery error handling
 - fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails
 - fix statd when reconnection fails
 - Don't wake tasks during connection abort
 - Don't start reboot recovery if lease check fails
 - fix duplicate proc entries
 
 Features:
 - pNFS block driver fixes and clean ups from Christoph
 - More code cleanups from Anna
 - Improve mmap() writeback performance
 - Replace use of PF_TRANS with a more generic mechanism for avoiding
   deadlocks in nfs_release_page

Merge tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable fixes:
   - fix an NFSv4.1 state renewal regression
   - fix open/lock state recovery error handling
   - fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails
   - fix statd when reconnection fails
   - don't wake tasks during connection abort
   - don't start reboot recovery if lease check fails
   - fix duplicate proc entries

  Features:
  - pNFS block driver fixes and clean ups from Christoph
  - More code cleanups from Anna
  - Improve mmap() writeback performance
  - Replace use of PF_TRANS with a more generic mechanism for avoiding
    deadlocks in nfs_release_page"

* tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (66 commits)
  NFSv4.1: Fix an NFSv4.1 state renewal regression
  NFSv4: fix open/lock state recovery error handling
  NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails
  NFS: Fabricate fscache server index key correctly
  SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT
  NFSv3: Fix missing includes of nfs3_fs.h
  NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page()
  NFS: avoid waiting at all in nfs_release_page when congested.
  NFS: avoid deadlocks with loop-back mounted NFS filesystems.
  MM: export page_wakeup functions
  SCHED: add some "wait..on_bit...timeout()" interfaces.
  NFS: don't use STABLE writes during writeback.
  NFSv4: use exponential retry on NFS4ERR_DELAY for async requests.
  rpc: Add -EPERM processing for xs_udp_send_request()
  rpc: return sent and err from xs_sendpages()
  lockd: Try to reconnect if statd has moved
  SUNRPC: Don't wake tasks during connection abort
  Fixing lease renewal
  nfs: fix duplicate proc entries
  pnfs/blocklayout: Fix a 64-bit division/remainder issue in bl_map_stripe
  ...
2014-10-08 12:49:23 -04:00
Linus Torvalds
87d7bcee4f Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 - add multibuffer infrastructure (single_task_running scheduler helper,
   OKed by Peter on lkml).
 - add SHA1 multibuffer implementation for AVX2.
 - reenable "by8" AVX CTR optimisation after fixing counter overflow.
 - add APM X-Gene SoC RNG support.
 - SHA256/SHA512 now handles unaligned input correctly.
 - set lz4 decompressed length correctly.
 - fix algif socket buffer allocation failure for 64K page machines.
 - misc fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (47 commits)
  crypto: sha - Handle unaligned input data in generic sha256 and sha512.
  Revert "crypto: aesni - disable "by8" AVX CTR optimization"
  crypto: aesni - remove unused defines in "by8" variant
  crypto: aesni - fix counter overflow handling in "by8" variant
  hwrng: printk replacement
  crypto: qat - Removed unneeded partial state
  crypto: qat - Fix typo in name of tasklet_struct
  crypto: caam - Dynamic allocation of addresses for various memory blocks in CAAM.
  crypto: mcryptd - Fix typos in CRYPTO_MCRYPTD description
  crypto: algif - avoid excessive use of socket buffer in skcipher
  arm64: dts: add random number generator dts node to APM X-Gene platform.
  Documentation: rng: Add X-Gene SoC RNG driver documentation
  hwrng: xgene - add support for APM X-Gene SoC RNG support
  crypto: mv_cesa - Add missing #define
  crypto: testmgr - add test for lz4 and lz4hc
  crypto: lz4,lz4hc - fix decompression
  crypto: qat - Use pci_enable_msix_exact() instead of pci_enable_msix()
  crypto: drbg - fix maximum value checks on 32 bit systems
  crypto: drbg - fix sparse warning for cpu_to_be[32|64]
  crypto: sha-mb - sha1_mb_alg_state can be static
  ...
2014-10-08 06:44:48 -04:00
Linus Torvalds
6325e940e7 arm64 updates for 3.18:
- eBPF JIT compiler for arm64
 - CPU suspend backend for PSCI (firmware interface) with standard idle
   states defined in DT (generic idle driver to be merged via a different
   tree)
 - Support for CONFIG_DEBUG_SET_MODULE_RONX
 - Support for unmapped cpu-release-addr (outside kernel linear mapping)
 - set_arch_dma_coherent_ops() implemented and bus notifiers removed
 - EFI_STUB improvements when base of DRAM is occupied
 - Typos in KGDB macros
 - Clean-up to (partially) allow kernel building with LLVM
 - Other clean-ups (extern keyword, phys_addr_t usage)

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 - eBPF JIT compiler for arm64
 - CPU suspend backend for PSCI (firmware interface) with standard idle
   states defined in DT (generic idle driver to be merged via a
   different tree)
 - Support for CONFIG_DEBUG_SET_MODULE_RONX
 - Support for unmapped cpu-release-addr (outside kernel linear mapping)
 - set_arch_dma_coherent_ops() implemented and bus notifiers removed
 - EFI_STUB improvements when base of DRAM is occupied
 - Typos in KGDB macros
 - Clean-up to (partially) allow kernel building with LLVM
 - Other clean-ups (extern keyword, phys_addr_t usage)

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (51 commits)
  arm64: Remove unneeded extern keyword
  ARM64: make of_device_ids const
  arm64: Use phys_addr_t type for physical address
  aarch64: filter $x from kallsyms
  arm64: Use DMA_ERROR_CODE to denote failed allocation
  arm64: Fix typos in KGDB macros
  arm64: insn: Add return statements after BUG_ON()
  arm64: debug: don't re-enable debug exceptions on return from el1_dbg
  Revert "arm64: dmi: Add SMBIOS/DMI support"
  arm64: Implement set_arch_dma_coherent_ops() to replace bus notifiers
  of: amba: use of_dma_configure for AMBA devices
  arm64: dmi: Add SMBIOS/DMI support
  arm64: Correct ftrace calls to aarch64_insn_gen_branch_imm()
  arm64:mm: initialize max_mapnr using function set_max_mapnr
  setup: Move unmask of async interrupts after possible earlycon setup
  arm64: LLVMLinux: Fix inline arm64 assembly for use with clang
  arm64: pageattr: Correctly adjust unaligned start addresses
  net: bpf: arm64: fix module memory leak when JIT image build fails
  arm64: add PSCI CPU_SUSPEND based cpu_suspend support
  arm64: kernel: introduce cpu_init_idle CPU operation
  ...
2014-10-08 05:34:24 -04:00
Linus Torvalds
536fd93d43 Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "Included in these updates are:
   - Performance optimisation to avoid writing the control register at
     every exception.
   - Use static inline instead of extern inline in ftrace code.
   - Crypto ARM assembly updates for big endian
   - Alignment of initrd/.init memory to page sizes when freeing to
     ensure that we fully free the regions
   - Add gcov support
   - A couple of preparatory patches for VDSO support: use
     _install_special_mapping, and randomize the sigpage placement above
     stack.
   - Add L2 ePAPR DT cache properties so that DT can specify the cache
     geometry.
   - Preparatory patch for FIQ (NMI) kernel C code for things like
     spinlock lockup debug.  Following on from this are a couple of my
     patches cleaning up show_regs() and removing an unused (probably
     since 1.x days) do_unexp_fiq() function.
   - Use pr_warn() rather than pr_warning().
   - A number of cleanups (smp, footbridge, return_address)"

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (21 commits)
  ARM: 8167/1: extend the reserved memory for initrd to be page aligned
  ARM: 8168/1: extend __init_end to a page align address
  ARM: 8169/1: l2c: parse cache properties from ePAPR definitions
  ARM: 8160/1: drop warning about return_address not using unwind tables
  ARM: 8161/1: footbridge: select machine dir based on ARCH_FOOTBRIDGE
  ARM: 8158/1: LLVMLinux: use static inline in ARM ftrace.h
  ARM: 8155/1: place sigpage at a random offset above stack
  ARM: 8154/1: use _install_special_mapping for sigpage
  ARM: 8153/1: Enable gcov support on the ARM architecture
  ARM: Avoid writing to control register on every exception
  ARM: 8152/1: Convert pr_warning to pr_warn
  ARM: remove unused do_unexp_fiq() function
  ARM: remove extraneous newline in show_regs()
  ARM: 8150/3: fiq: Replace default FIQ handler
  ARM: 8140/1: ep93xx: Enable DEBUG_LL_UART_PL01X
  ARM: 8139/1: versatile: Enable DEBUG_LL_UART_PL01X
  ARM: 8138/1: drop ISAR0 workaround for B15
  ARM: 8136/1: sa1100: add Micro ASIC platform device
  ARM: 8131/1: arm/smp: Absorb boot_secondary()
  ARM: 8126/1: crypto: enable NEON SHA-384/SHA-512 for big endian
  ...
2014-10-08 05:30:03 -04:00
Linus Torvalds
28596c9722 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull "trivial tree" updates from Jiri Kosina:
 "Usual pile from trivial tree everyone is so eagerly waiting for"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Remove MN10300_PROC_MN2WS0038
  mei: fix comments
  treewide: Fix typos in Kconfig
  kprobes: update jprobe_example.c for do_fork() change
  Documentation: change "&" to "and" in Documentation/applying-patches.txt
  Documentation: remove obsolete pcmcia-cs from Changes
  Documentation: update links in Changes
  Documentation: Docbook: Fix generated DocBook/kernel-api.xml
  score: Remove GENERIC_HAS_IOMAP
  gpio: fix 'CONFIG_GPIO_IRQCHIP' comments
  tty: doc: Fix grammar in serial/tty
  dma-debug: modify check_for_stack output
  treewide: fix errors in printk
  genirq: fix reference in devm_request_threaded_irq comment
  treewide: fix synchronize_rcu() in comments
  checkstack.pl: port to AArch64
  doc: queue-sysfs: minor fixes
  init/do_mounts: better syntax description
  MIPS: fix comment spelling
  powerpc/simpleboot: fix comment
  ...
2014-10-07 21:16:26 -04:00
Linus Torvalds
d0cd84817c dmaengine-3.17
1/ Step down as dmaengine maintainer see commit 08223d80df38 "dmaengine
    maintainer update"
 
 2/ Removal of net_dma, as it has been marked 'broken' since 3.13 (commit
    77873803363c "net_dma: mark broken"), without reports of performance
    regression.
 
 3/ Miscellaneous fixes

Merge tag 'dmaengine-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine

Pull dmaengine updates from Dan Williams:
 "Even though this has fixes marked for -stable, given the size and the
  needed conflict resolutions this is 3.18-rc1/merge-window material.

  These patches have been languishing in my tree for a long while.  The
  fact that I do not have the time to do proper/prompt maintenance of
  this tree is a primary factor in the decision to step down as
  dmaengine maintainer.  That and the fact that the bulk of drivers/dma/
  activity is going through Vinod these days.

  The net_dma removal has not been in -next.  It has developed simple
  conflicts against mainline and net-next (for-3.18).

  Continuing thanks to Vinod for staying on top of drivers/dma/.

  Summary:

   1/ Step down as dmaengine maintainer see commit 08223d80df38
      "dmaengine maintainer update"

   2/ Removal of net_dma, as it has been marked 'broken' since 3.13
      (commit 77873803363c "net_dma: mark broken"), without reports of
      performance regression.

   3/ Miscellaneous fixes"

* tag 'dmaengine-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
  net: make tcp_cleanup_rbuf private
  net_dma: revert 'copied_early'
  net_dma: simple removal
  dmaengine maintainer update
  dmatest: prevent memory leakage on error path in thread
  ioat: Use time_before_jiffies()
  dmaengine: fix xor sources continuation
  dma: mv_xor: Rename __mv_xor_slot_cleanup() to mv_xor_slot_cleanup()
  dma: mv_xor: Remove all callers of mv_xor_slot_cleanup()
  dma: mv_xor: Remove unneeded mv_xor_clean_completed_slots() call
  ioat: Use pci_enable_msix_exact() instead of pci_enable_msix()
  drivers: dma: Include appropriate header file in dca.c
  drivers: dma: Mark functions as static in dma_v3.c
  dma: mv_xor: Add DMA API error checks
  ioat/dca: Use dev_is_pci() to check whether it is pci device
2014-10-07 20:39:25 -04:00
Linus Torvalds
bdf428feb2 Nothing major: support for compressing modules, and auto-tainting params.
Cheers,
 Rusty.
 PS.  My virtio-next tree is empty: DaveM took the patches I had.  There might
      be a virtio-rng starvation fix, but so far it's a bit voodoo so I will
      get to that in the next two days or it will wait.

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module update from Rusty Russell:
 "Nothing major: support for compressing modules, and auto-tainting
  params.

  PS. My virtio-next tree is empty: DaveM took the patches I had.  There
      might be a virtio-rng starvation fix, but so far it's a bit voodoo
      so I will get to that in the next two days or it will wait"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  moduleparam: Resolve missing-field-initializer warning
  kbuild: handle module compression while running 'make modules_install'.
  modinst: wrap long lines in order to enhance cmd_modules_install
  modsign: lookup lines ending in .ko in .mod files
  modpost: simplify file name generation of *.mod.c files
  modpost: reduce visibility of symbols and constify r/o arrays
  param: check for tainting before calling set op.
  drm/i915: taint the kernel if unsafe module parameters are set
  module: add module_param_unsafe and module_param_named_unsafe
  module: make it possible to have unsafe, tainting module params
  module: rename KERNEL_PARAM_FL_NOARG to avoid confusion
2014-10-07 20:17:38 -04:00
Linus Torvalds
74da38631a Tinification for 3.18

Merge tag 'tiny/for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/josh/linux

Pull "tinification" patches from Josh Triplett.

Work on making smaller kernels.

* tag 'tiny/for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/josh/linux:
  bloat-o-meter: Ignore syscall aliases SyS_ and compat_SyS_
  mm: Support compiling out madvise and fadvise
  x86: Support compiling out human-friendly processor feature names
  x86: Drop support for /proc files when !CONFIG_PROC_FS
  x86, boot: Don't compile early_serial_console.c when !CONFIG_EARLY_PRINTK
  x86, boot: Don't compile aslr.c when !CONFIG_RANDOMIZE_BASE
  x86, boot: Use the usual -y -n mechanism for objects in vmlinux
  x86: Add "make tinyconfig" to configure the tiniest possible kernel
  x86, platform, kconfig: move kvmconfig functionality to a helper
2014-10-07 08:51:59 -04:00
Rafael J. Wysocki
49a09c9ab0 Merge branch 'pm-domains'
* pm-domains: (32 commits)
  PM / Domains: Rename cpu_data to cpuidle_data
  PM / Domains: Move dev_pm_domain_attach|detach() to pm_domain.h
  PM / Domains: Remove legacy API for adding devices through DT
  PM / Domains: Add genpd attach/detach callbacks
  PM / Domains: add debugfs listing of struct generic_pm_domain-s
  ACPI / PM: Convert acpi_dev_pm_detach() into a static function
  ARM: exynos: Move to generic PM domain DT bindings
  amba: Add support for attach/detach of PM domains
  spi: core: Convert to dev_pm_domain_attach|detach()
  mmc: sdio: Convert to dev_pm_domain_attach|detach()
  i2c: core: Convert to dev_pm_domain_attach|detach()
  drivercore / platform: Convert to dev_pm_domain_attach|detach()
  PM / Domains: Add APIs to attach/detach a PM domain for a device
  PM / Domains: Add generic OF-based PM domain look-up
  ACPI / PM: Assign the ->detach() callback when attaching the PM domain
  PM / Domains: Add a detach callback to the struct dev_pm_domain
  PM / domains: Spelling s/domian/domain/
  PM / domains: Keep declaration of dev_power_governors together
  PM / domains: Remove default_stop_ok() API
  drivers: sh: Leave disabling of unused PM domains to genpd
  ...
2014-10-07 01:18:12 +02:00
Rafael J. Wysocki
28c399e2a1 Merge branch 'acpi-pm'
* acpi-pm:
  ACPI / sleep: Rework the handling of ACPI GPE wakeup from suspend-to-idle
  PM / sleep: Rename platform suspend/resume functions in suspend.c
  PM / sleep: Export dpm_suspend_late/noirq() and dpm_resume_early/noirq()
2014-10-07 01:17:50 +02:00
Rafael J. Wysocki
0ede470030 Merge branch 'pm-sleep'
* pm-sleep:
  PM / hibernate: Iterate over set bits instead of PFNs in swsusp_free()
  PM / sleep: new suspend_resume trace event for console resume
  PM / sleep: Update test_suspend option documentation
  PM / sleep: Enhance test_suspend option with repeat capability
  PM / sleep: Support freeze as test_suspend option
  PM / sysfs: avoid shadowing variables
2014-10-07 01:17:30 +02:00
Rafael J. Wysocki
88b42a4883 Merge branch 'pm-genirq'
* pm-genirq:
  PM / genirq: Document rules related to system suspend and interrupts
  PCI / PM: Make PCIe PME interrupts wake up from suspend-to-idle
  x86 / PM: Set IRQCHIP_SKIP_SET_WAKE for IOAPIC IRQ chip objects
  genirq: Simplify wakeup mechanism
  genirq: Mark wakeup sources as armed on suspend
  genirq: Create helper for flow handler entry check
  genirq: Distangle edge handler entry
  genirq: Avoid double loop on suspend
  genirq: Move MASK_ON_SUSPEND handling into suspend_device_irqs()
  genirq: Make use of pm misfeature accounting
  genirq: Add sanity checks for PM options on shared interrupt lines
  genirq: Move suspend/resume logic into irq/pm code
  PM / sleep: Mechanism for aborting system suspends unconditionally
2014-10-07 01:17:21 +02:00
Joe Lawrence
3e28e37720 workqueue: Use cond_resched_rcu_qs macro
Tidy up and use cond_resched_rcu_qs when calling cond_resched and
reporting potential quiescent state to RCU.  Splitting this change in
this way allows easy backporting to -stable for kernel versions not
having cond_resched_rcu_qs().

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-10-06 05:58:26 -07:00
Joe Lawrence
789cbbeca4 workqueue: Add quiescent state between work items
Similar to the stop_machine deadlock scenario on !PREEMPT kernels
addressed in b22ce2785d97 "workqueue: cond_resched() after processing
each work item", kworker threads requeueing back-to-back with zero jiffy
delay can stall RCU. The cond_resched call introduced in that fix will
yield only iff there are other higher priority tasks to run, so force a
quiescent RCU state between work items.

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com
Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com
Fixes: b22ce2785d97 ("workqueue: cond_resched() after processing each work item")
Cc: <stable@vger.kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-10-06 05:57:43 -07:00
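The stall described in the commit message above can be pictured with a very small toy model. It is a sketch under loud assumptions, not kernel code: `run_worker`, `force_qs`, and `other_task_waiting` are hypothetical names, and "quiescent state" is reduced to a counter. The model only captures the reasoning in the message: a conditional yield reports nothing to RCU when no other task is waiting, whereas an explicit report between work items always does.

```python
def run_worker(work_items, force_qs, other_task_waiting=lambda: False):
    """Toy kworker loop: return how many quiescent states RCU observed.

    force_qs=False models a plain conditional yield, which only results in
    a quiescent state when another task is actually waiting to run.
    force_qs=True models explicitly reporting a quiescent state after
    every work item, regardless of whether a context switch happens.
    """
    quiescent_states = 0
    for work in work_items:
        work()
        if force_qs:
            quiescent_states += 1   # unconditional report between items
        elif other_task_waiting():
            quiescent_states += 1   # a real context switch also counts
    return quiescent_states


if __name__ == "__main__":
    items = [lambda: None] * 5      # back-to-back work, nothing else runnable
    print("conditional yield only:", run_worker(items, force_qs=False))
    print("forced quiescent state:", run_worker(items, force_qs=True))
```

With nothing else runnable, the first variant never reports a quiescent state no matter how many items are processed, which is the situation the commit addresses.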
Linus Torvalds
039001972a Merge tag 'trace-fixes-v3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull trace ring buffer iterator fix from Steven Rostedt:
 "While testing some new changes for 3.18, I kept hitting a bug every so
  often in the ring buffer.  At first I thought it had to do with some
  of the changes I was working on, but then testing something else I
  realized that the bug was in 3.17 itself.  I ran several bisects as
  the bug was not very reproducible, and finally came up with the commit
  that I could reproduce easily within a few minutes, and without the
  change I could run the tests over an hour without issue.  The change
  fit the bug and I figured out a fix.  That bad commit was:

    Commit 651e22f2701b "ring-buffer: Always reset iterator to reader page"

  This commit fixed a bug, but in the process created another one.  It
  used the wrong value as the cached value that is used to see if things
  changed while an iterator was in use.  This made it look like a change
  always happened, and could cause the iterator to go into an infinite
  loop"

* tag 'trace-fixes-v3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Fix infinite spin in reading buffer
2014-10-03 13:31:57 -07:00
Peter Zijlstra
8acd91e862 locking/lockdep: Revert qrwlock recursive stuff
Commit f0bab73cb539 ("locking/lockdep: Restrict the use of recursive
read_lock() with qrwlock") changed lockdep to try and conform to the
qrwlock semantics which differ from the traditional rwlock semantics.

In particular qrwlock is fair outside of interrupt context, but in
interrupt context readers will ignore all fairness.

The problem modeling this is that read and write side have different
lock state (interrupts) semantics but we only have a single
representation of these. Therefore lockdep will get confused, thinking
the lock can cause interrupt lock inversions.

So revert it for now; the old rwlock semantics were already imperfectly
modeled and the qrwlock extra won't fit either.

If we want to properly fix this, I think we need to resurrect the work
Gautham did a few years ago that split the read and write state of
locks:

   http://lwn.net/Articles/332801/

FWIW the locking selftest that would've failed (and was reported by
Borislav earlier) is something like:

  RL(X1);	/* IRQ-ON */
  LOCK(A);
  UNLOCK(A);
  RU(X1);

  IRQ_ENTER();
  RL(X1);	/* IN-IRQ */
  RU(X1);
  IRQ_EXIT();

At which point it would report that because A is an IRQ-unsafe lock we
can suffer the following inversion:

	CPU0		CPU1

	lock(A)
			lock(X1)
			lock(A)
	<IRQ>
	 lock(X1)

And this is 'wrong' because X1 can recurse (assuming the above locks are
in fact read-locks) but lockdep doesn't know about this.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: ego@linux.vnet.ibm.com
Cc: bp@alien8.de
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20140930132600.GA7444@worktop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 06:09:30 +02:00
Jason Low
debfab74e4 locking/rwsem: Avoid double checking before try acquiring write lock
Commit 9b0fc9c09f1b ("rwsem: skip initial trylock in rwsem_down_write_failed")
checks whether there are known active lockers in order to avoid write trylocking
using expensive cmpxchg() when it likely wouldn't get the lock.

However, a subsequent patch was added such that we directly
check for sem->count == RWSEM_WAITING_BIAS right before trying
that cmpxchg().

Thus, commit 9b0fc9c09f1b now just adds overhead.

This patch modifies it so that we only check whether
count == RWSEM_WAITING_BIAS.

Also, add a comment on why we do an "extra check" of count
before the cmpxchg().
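
The "cheap check first, cmpxchg second" pattern described above can be
sketched in userspace with C11 atomics. This is only an illustration: the
toy_rwsem type, the toy_try_write_lock() helper and both bias values are
hypothetical stand-ins, not the kernel's rwsem implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical values; the kernel's real bias constants are arch dependent. */
#define TOY_WAITING_BIAS      (-65536L)
#define TOY_ACTIVE_WRITE_BIAS (-65535L)

struct toy_rwsem {
	atomic_long count;
};

/*
 * Try to take the write lock.  The plain load is cheap and filters out the
 * common "somebody is still active" case, so the expensive compare-and-swap
 * only runs when it has a realistic chance of succeeding.
 */
static bool toy_try_write_lock(struct toy_rwsem *sem)
{
	long expected = TOY_WAITING_BIAS;

	if (atomic_load(&sem->count) != TOY_WAITING_BIAS)
		return false;

	return atomic_compare_exchange_strong(&sem->count, &expected,
					      TOY_ACTIVE_WRITE_BIAS);
}
```

The load does not make the compare-and-swap redundant: the count can still
change between the two, which is why the swap itself stays in place.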

Signed-off-by: Jason Low <jason.low2@hp.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Chegu Vinod <chegu_vinod@hp.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1410913017.2447.22.camel@j-VirtualBox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 06:09:29 +02:00
Kirill Tkhai
f10e00f4bf sched/dl: Use dl_bw_of() under rcu_read_lock_sched()
rq->rd is freed using call_rcu_sched(), so rcu_read_lock() to access it
is not enough. We should use either rcu_read_lock_sched() or preempt_disable().

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Fixes: 66339c31bc39 "sched: Use dl_bw_of() under RCU read lock"
Link: http://lkml.kernel.org/r/1412065417.20287.24.camel@tkhai
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 05:46:58 +02:00
Kirill Tkhai
10a12983b3 sched/fair: Delete resched_cpu() from idle_balance()
We already reschedule env.dst_cpu in attach_tasks()->check_preempt_curr()
if this is necessary.

Furthermore, a higher priority class task may be current on dest rq,
we shouldn't disturb it.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Cc: Juri Lelli <juri.lelli@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140930210441.5258.55054.stgit@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 05:46:56 +02:00
Rik van Riel
347abad981 sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
On 32 bit systems cmpxchg cannot handle 64 bit values, so
some additional magic is required to allow a 32 bit system
with CONFIG_VIRT_CPU_ACCOUNTING_GEN=y enabled to build.

Make sure the correct cmpxchg function is used when doing
an atomic swap of a cputime_t.
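
As a rough userspace analogue of an atomic swap on a 64 bit counter, the
sketch below uses C11 atomics, which pick a width-appropriate
compare-and-swap (possibly via a helper library on some 32 bit targets).
The toy_cputime_t type and toy_cputime_advance() are hypothetical names
chosen for illustration; they are not the kernel's helpers.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t toy_cputime_t;

/*
 * Move *counter forward to new_val, but never backwards.  A failed
 * compare-and-swap reloads "old" with the value that is currently stored,
 * so the loop condition is re-evaluated against fresh data.
 */
static void toy_cputime_advance(_Atomic toy_cputime_t *counter,
				toy_cputime_t new_val)
{
	toy_cputime_t old = atomic_load(counter);

	while (new_val > old) {
		if (atomic_compare_exchange_weak(counter, &old, new_val))
			break;
	}
}
```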

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: umgwanakikbuti@gmail.com
Cc: fweisbec@gmail.com
Cc: srao@redhat.com
Cc: lwoodman@redhat.com
Cc: atheurer@redhat.com
Cc: oleg@redhat.com
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linux390@de.ibm.com
Cc: linux-arch@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/20140930155947.070cdb1f@annuminas.surriel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 05:46:55 +02:00
Vincent Guittot
43f4d66637 sched: Improve sysbench performance by fixing spurious active migration
Since commit caeb178c60f4 ("sched/fair: Make update_sd_pick_busiest() ...")
sd_pick_busiest returns a group that can be neither imbalanced nor overloaded
but is only more loaded than others. This change has been introduced to ensure
a better load balance in systems that are not overloaded but as a side effect,
it can also generate useless active migration between groups.

Let's take the example of 3 tasks on a quad-core system. We will always have an
idle core so the load balance will find a busiest group (core) whenever an ILB
is triggered and it will force an active migration (once above the
nr_balance_failed threshold) so the idle core becomes busy but another core
will become idle. With the next ILB, the freshly idle core will try to pull the
task of a busy CPU.
The number of spurious active migrations is not so huge on a quad-core system
because the ILB is not triggered so much. But it becomes significant as soon as
you have more than one sched_domain level, like on a dual cluster of quad cores
where the ILB is triggered every tick when you have more than 1 busy_cpu.

We need to ensure that the migration generates a real improvement and will not
only move the avg_load imbalance to another CPU.

Before caeb178c60f4f93f1b45c0bc056b5cf6d217b67f, the filtering of such use
case was ensured by the following test in f_b_g:

  if ((local->idle_cpus < busiest->idle_cpus) &&
		    busiest->sum_nr_running  <= busiest->group_weight)

This patch modifies the condition to take into account the situation where the busiest
group is not overloaded: if the difference between the number of idle CPUs in the 2
groups is less than or equal to 1 and the busiest group is not overloaded,
moving a task will not improve the load balance but just move it.
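
A minimal sketch of the rule stated in the previous paragraph, written as a
standalone predicate. The toy_group_stats structure and the function name
are hypothetical; only the idle-CPU comparison and the "not overloaded"
condition come from the description above.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down per-group statistics. */
struct toy_group_stats {
	unsigned int idle_cpus;
	bool overloaded;
};

/*
 * Returns true when pulling a task from "busiest" to "local" would only
 * move the imbalance around: the busiest group is not overloaded and the
 * local group has at most one more idle CPU than the busiest one.
 */
static bool toy_migration_is_pointless(const struct toy_group_stats *local,
				       const struct toy_group_stats *busiest)
{
	return !busiest->overloaded &&
	       local->idle_cpus <= busiest->idle_cpus + 1;
}
```

In the 3 tasks on 4 cores example, the idle core has idle_cpus == 1 and any
busy core has idle_cpus == 0, so the predicate is true and no active
migration would be forced.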

A test with sysbench on a dual cluster of quad cores gives the following
results:

  command: sysbench --test=cpu --num-threads=5 --max-time=5 run

The HZ is 200 which means that 1000 ticks have fired during the test.

With Mainline, perf gives the following figures:

 Samples: 727  of event 'sched:sched_migrate_task'
 Event count (approx.): 727
  Overhead  Command          Shared Object  Symbol
  ........  ...............  .............  ..............
    12.52%  migration/1      [unknown]      [.] 00000000
    12.52%  migration/5      [unknown]      [.] 00000000
    12.52%  migration/7      [unknown]      [.] 00000000
    12.10%  migration/6      [unknown]      [.] 00000000
    11.83%  migration/0      [unknown]      [.] 00000000
    11.83%  migration/3      [unknown]      [.] 00000000
    11.14%  migration/4      [unknown]      [.] 00000000
    10.87%  migration/2      [unknown]      [.] 00000000
     2.75%  sysbench         [unknown]      [.] 00000000
     0.83%  swapper          [unknown]      [.] 00000000
     0.55%  ktps65090charge  [unknown]      [.] 00000000
     0.41%  mmcqd/1          [unknown]      [.] 00000000
     0.14%  perf             [unknown]      [.] 00000000

With this patch, perf gives the following figures:

 Samples: 20  of event 'sched:sched_migrate_task'
 Event count (approx.): 20
  Overhead  Command          Shared Object  Symbol
  ........  ...............  .............  ..............
    80.00%  sysbench         [unknown]      [.] 00000000
    10.00%  swapper          [unknown]      [.] 00000000
     5.00%  ktps65090charge  [unknown]      [.] 00000000
     5.00%  migration/1      [unknown]      [.] 00000000

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1412170735-5356-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 05:46:54 +02:00
Peter Zijlstra
9c2b9d30e2 perf: Fix perf bug in fork()
Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by
calling perf_event_free_task() when failing sched_fork() we will not yet
have done the memset() on ->perf_event_ctxp[] and will therefore try and
'free' the inherited contexts, which are still in use by the parent
process.

This is bad and might explain some outstanding fuzzer failures ...

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Sylvain 'ythier' Hitier <sylvain.hitier@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Daeseok Youn <daeseok.youn@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20140929101201.GE5430@worktop
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 05:41:08 +02:00
Peter Zijlstra
211de6eba8 perf: Fix unclone_ctx() vs. locking
The idiot who did 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
forgot to pay attention and fix all similar cases. Do so now.

In particular, unclone_ctx() must be called while holding ctx->lock,
therefore all such sites are broken for the same reason. Pull the
put_ctx() call out from under ctx->lock.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Probably-also-reported-by: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Cong Wang <cwang@twopensource.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140930172308.GI4241@worktop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 05:41:06 +02:00
Peter Zijlstra
6c72e3501d perf: fix perf bug in fork()
Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by
calling perf_event_free_task() when failing sched_fork() we will not yet
have done the memset() on ->perf_event_ctxp[] and will therefore try and
'free' the inherited contexts, which are still in use by the parent
process.  This is bad.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Sylvain 'ythier' Hitier <sylvain.hitier@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-02 16:28:44 -07:00
Steven Rostedt (Red Hat)
24607f114f ring-buffer: Fix infinite spin in reading buffer
Commit 651e22f2701b "ring-buffer: Always reset iterator to reader page"
fixed one bug but in the process caused another one. The reset is to
update the header page, but that fix also changed the way the cached
reads were updated. The cache reads are used to test if an iterator
needs to be updated or not.

A ring buffer iterator, when created, disables writes to the ring buffer
but does not stop other readers or consuming reads from happening.
Although all readers are synchronized via a lock, they are only
synchronized when in the ring buffer functions. Those functions may
be called by any number of readers. The iterator continues down when
it's not interrupted by a consuming reader. If a consuming read
occurs, the iterator starts from the beginning of the buffer.

The way the iterator sees that a consuming read has happened since
its last read is by checking the reader "cache". The cache holds the
last counts of the read and the reader page itself.

Commit 651e22f2701b changed what was saved by the cache_read when
the rb_iter_reset() occurred, making the iterator never match the cache.
Then if the iterator calls rb_iter_reset(), it will go into an
infinite loop by checking if the cache doesn't match, doing the reset
and retrying, just to see that the cache still doesn't match! Which
should never happen as the reset is supposed to set the cache to the
current value and there are locks that keep a consuming reader from
having access to the data.
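
The failure mode can be modelled with a toy iterator. Everything below is a
simplified, hypothetical model (toy_buffer, toy_iter and the three helpers
are invented for illustration) and not the ring buffer code itself: the
point is only that the reset must cache the same quantity that the
staleness check later compares against.

```c
#include <stdbool.h>

/* Hypothetical model: "read" counts events consumed from the buffer. */
struct toy_buffer {
	unsigned long read;
};

struct toy_iter {
	unsigned long head;		/* position within the current page */
	unsigned long cache_read;	/* snapshot used to detect consumers */
};

/* The iterator is stale when a consuming read happened since the snapshot. */
static bool toy_iter_is_stale(const struct toy_iter *it,
			      const struct toy_buffer *buf)
{
	return it->cache_read != buf->read;
}

/* Broken reset: caches a value unrelated to what the check compares. */
static void toy_iter_reset_buggy(struct toy_iter *it,
				 const struct toy_buffer *buf)
{
	(void)buf;
	it->head = 0;
	it->cache_read = it->head;
}

/* Working reset: caches exactly the value the staleness check looks at. */
static void toy_iter_reset_fixed(struct toy_iter *it,
				 const struct toy_buffer *buf)
{
	it->head = 0;
	it->cache_read = buf->read;
}
```

With the buggy reset, any buffer whose read count differs from the cached
value still looks stale right after the reset, so a "reset and retry" loop
never terminates.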

Fixes: 651e22f2701b "ring-buffer: Always reset iterator to reader page"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-10-02 16:51:18 -04:00
Russell King
d5d1689224 Merge branches 'fiq' (early part), 'fixes', 'l2c' (early part) and 'misc' into for-next
2014-10-02 21:47:02 +01:00
David S. Miller
739e4a758e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/r8152.c
	net/netfilter/nfnetlink.c

Both r8152 and nfnetlink conflicts were simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-02 11:25:43 -07:00
Kyle McMartin
6c34f1f542 aarch64: filter $x from kallsyms
Similar to ARM, AArch64 is generating $x and $d syms... which isn't
terribly helpful when looking at %pF output and the like. Filter those
out in kallsyms, modpost and when looking at module symbols.

Seems simplest since none of these check EM_ARM anyway, to just add it
to the strchr used, rather than trying to make things overly
complicated.

initcall_debug improves:
dmesg_before.txt: initcall $x+0x0/0x154 [sg] returned 0 after 26331 usecs
dmesg_after.txt: initcall init_sg+0x0/0x154 [sg] returned 0 after 15461 usecs

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-10-02 17:01:51 +01:00
Alexei Starovoitov
f1bca824da bpf: add search pruning optimization to verifier
Consider a C program represented in eBPF:
int filter(int arg)
{
    int a, b, c, *ptr;

    if (arg == 1)
        ptr = &a;
    else if (arg == 2)
        ptr = &b;
    else
        ptr = &c;

    *ptr = 0;
    return 0;
}
eBPF verifier has to follow all possible paths through the program
to recognize that '*ptr = 0' instruction would be safe to execute
in all situations.
It does so by picking a path towards the end and observing changes
to registers and stack at every insn until it reaches bpf_exit.
Then it comes back to one of the previous branches and goes towards
the end again with potentially different values in registers.
When a program has a lot of branches, the number of possible combinations
of branches is huge, so the verifier has a hard limit of walking no more
than 32k instructions. This limit can be reached and complex (but valid)
programs could be rejected. Therefore it's important to recognize equivalent
verifier states to prune this depth first search.

Basic idea can be illustrated by the program (where .. are some eBPF insns):
    1: ..
    2: if (rX == rY) goto 4
    3: ..
    4: ..
    5: ..
    6: bpf_exit
In the first pass towards bpf_exit the verifier will walk insns: 1, 2, 3, 4, 5, 6
Since insn#2 is a branch the verifier will remember its state in verifier stack
to come back to it later.
Since insn#4 is marked as 'branch target', the verifier will remember its state
in explored_states[4] linked list.
Once it reaches insn#6 successfully it will pop the state recorded at insn#2 and
will continue.
Without search pruning optimization verifier would have to walk 4, 5, 6 again,
effectively simulating execution of insns 1, 2, 4, 5, 6
With search pruning it will check whether state at #4 after jumping from #2
is equivalent to one recorded in explored_states[4] during first pass.
If there is an equivalent state, verifier can prune the search at #4 and declare
this path to be safe as well.
In other words two states at #4 are equivalent if execution of 1, 2, 3, 4 insns
and 1, 2, 4 insns produces equivalent registers and stack.
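
The pruning check can be sketched as a per-instruction list of already
explored states. This is a toy model under stated assumptions: a state is
reduced to a few register values, equivalence is plain equality (a
conservative stand-in for whatever equivalence the verifier really uses),
and all toy_* names and sizes are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_NUM_REGS   4
#define TOY_NUM_INSNS  8
#define TOY_MAX_STATES 16

/* Toy verifier state: just the register contents. */
struct toy_state {
	long regs[TOY_NUM_REGS];
};

/* States already seen at one instruction (array instead of linked list). */
struct toy_explored {
	struct toy_state states[TOY_MAX_STATES];
	int count;
};

static struct toy_explored toy_explored_states[TOY_NUM_INSNS];

static bool toy_states_equal(const struct toy_state *a,
			     const struct toy_state *b)
{
	return memcmp(a->regs, b->regs, sizeof(a->regs)) == 0;
}

/*
 * Called when the walk reaches a branch target.  Returns true if an
 * equivalent state was already explored from here, so the walk can be
 * pruned.  Otherwise records the state (if there is room) and returns
 * false so the walk continues towards bpf_exit.
 */
static bool toy_is_state_visited(int insn, const struct toy_state *cur)
{
	struct toy_explored *e = &toy_explored_states[insn];

	for (int i = 0; i < e->count; i++)
		if (toy_states_equal(&e->states[i], cur))
			return true;

	if (e->count < TOY_MAX_STATES)
		e->states[e->count++] = *cur;
	return false;
}
```

In the six-instruction example, the first pass records its state at insn 4;
when the walk resumes from the branch at insn 2 and arrives at insn 4 with
equal registers, the call returns true and insns 4, 5, 6 are not walked
again.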

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:30:33 -04:00
Joerg Roedel
fdd64ed54e PM / hibernate: Iterate over set bits instead of PFNs in swsusp_free()
The existing implementation of swsusp_free iterates over all
pfns in the system and checks every bit in the two memory
bitmaps.

This doesn't scale very well with large numbers of pfns,
especially when the bitmaps are not populated very densely.
Change the algorithm to iterate over the set bits in the
bitmaps instead to make it scale better in large memory
configurations.

Also add a memory_bm_clear_current() helper function that
clears the bit for the last position returned from the
memory bitmap.

This new version adds a !NULL check for the memory bitmaps
before they are walked. Not doing so causes a kernel crash
when the bitmaps are NULL.
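
The difference between "test every pfn" and "visit only the set bits" can
be illustrated on a flat bitmap. The sketch below is a userspace
approximation with hypothetical helper names; the memory bitmaps in the
hibernation code are a more elaborate structure than a plain array of
words.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first set bit at or after "start", or nbits if
 * there is none.  Empty words are skipped with a single comparison, which
 * is what makes sparse bitmaps cheap to walk.
 */
static size_t toy_next_set_bit(const unsigned long *bitmap, size_t nbits,
			       size_t start)
{
	size_t bit = start;

	if (!bitmap)	/* mirrors the !NULL check mentioned above */
		return nbits;

	while (bit < nbits) {
		unsigned long word = bitmap[bit / TOY_BITS_PER_WORD] >>
				     (bit % TOY_BITS_PER_WORD);

		if (word == 0) {
			/* Nothing left in this word, jump to the next one. */
			bit = (bit / TOY_BITS_PER_WORD + 1) * TOY_BITS_PER_WORD;
			continue;
		}
		while (!(word & 1UL)) {
			word >>= 1;
			bit++;
		}
		return bit < nbits ? bit : nbits;
	}
	return nbits;
}

static void toy_clear_bit(unsigned long *bitmap, size_t bit)
{
	bitmap[bit / TOY_BITS_PER_WORD] &= ~(1UL << (bit % TOY_BITS_PER_WORD));
}

/* Visit and clear every set bit; returns how many bits were handled. */
static size_t toy_release_all(unsigned long *bitmap, size_t nbits)
{
	size_t handled = 0;

	for (size_t bit = toy_next_set_bit(bitmap, nbits, 0); bit < nbits;
	     bit = toy_next_set_bit(bitmap, nbits, bit + 1)) {
		toy_clear_bit(bitmap, bit);
		handled++;
	}
	return handled;
}
```

The work done by toy_release_all() grows with the number of words plus the
number of set bits, rather than with one test per possible bit position.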

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-30 21:12:20 +02:00
Rafael J. Wysocki
a8d46b9e4e ACPI / sleep: Rework the handling of ACPI GPE wakeup from suspend-to-idle
The ACPI GPE wakeup from suspend-to-idle is currently based on using
the IRQF_NO_SUSPEND flag for the ACPI SCI, but that is problematic
for a couple of reasons.  First, in principle the ACPI SCI may be
shared and IRQF_NO_SUSPEND does not really work well with shared
interrupts.  Second, it may require the ACPI subsystem to special-case
the handling of device notifications depending on whether or not
they are received during suspend-to-idle in some places which would
lead to fragile code.  Finally, it's better to handle ACPI wakeup
interrupts consistently with wakeup interrupts from other sources.

For this reason, remove the IRQF_NO_SUSPEND flag from the ACPI SCI
and use enable_irq_wake()/disable_irq_wake() with it instead, which
requires two additional platform hooks to be added to struct
platform_freeze_ops.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-30 21:06:07 +02:00
Rafael J. Wysocki
ebc3e41e37 PM / sleep: Rename platform suspend/resume functions in suspend.c
Rename several local functions related to platform handling during
system suspend/resume in suspend.c so that their names better
reflect their roles.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-30 21:05:59 +02:00
Rafael J. Wysocki
2a8a8ce651 PM / sleep: Export dpm_suspend_late/noirq() and dpm_resume_early/noirq()
Subsequent change sets will add platform-related operations between
dpm_suspend_late() and dpm_suspend_noirq() as well as between
dpm_resume_noirq() and dpm_resume_early() in suspend_enter(), so
export these functions for suspend_enter() to be able to call them
separately and split the invocations of dpm_suspend_end() and
dpm_resume_start() in there accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-30 21:05:59 +02:00
Rafael J. Wysocki
e4cb0c9e92 Merge branch 'pm-genirq' into acpi-pm
2014-09-30 20:46:13 +02:00
Davidlohr Bueso
c98fed9fc6 locktorture: Cleanup header usage
Remove some unnecessary ones and explicitly include rwsem.h

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-30 00:10:02 -07:00
Davidlohr Bueso
a122949100 locktorture: Cannot hold read and write lock
... trigger an error if so.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-30 00:10:02 -07:00
Davidlohr Bueso
219f800f99 locktorture: Fix __acquire annotation for spinlock irq
It's quite easy to get mixed up with the names -- 'torture_spinlock_irq'
is not actually a valid spinlock name.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-30 00:10:02 -07:00
Davidlohr Bueso
e34191fad8 locktorture: Support rwlocks
Add a "rw_lock" torture test to stress kernel rwlocks and their irq
variant. Reader critical regions are 5x longer than writers. As such
a similar ratio of lock acquisitions is seen in the statistics. In the
case of massive contention, both hold the lock for 1/10 of a second.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-30 00:10:00 -07:00
Rafael J. Wysocki
905563ff47 Merge back earlier 'pm-sleep' material for v3.18.
2014-09-29 15:33:26 +02:00
Dan Williams
7bced39751 net_dma: simple removal
Per commit "77873803363c net_dma: mark broken" net_dma is no longer used
and there is no plan to fix it.

This is the mechanical removal of bits in CONFIG_NET_DMA ifdef guards.
Reverting the remainder of the net_dma induced changes is deferred to
subsequent patches.

Marked for stable due to Roman's report of a memory leak in
dma_pin_iovec_pages():

    https://lkml.org/lkml/2014/9/3/177

Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: David Whipple <whipple@securedatainnovations.ch>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: <stable@vger.kernel.org>
Reported-by: Roman Gushchin <klamm@yandex-team.ru>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2014-09-28 07:05:16 -07:00
Linus Torvalds
6111da3432 Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "This is quite late but these need to be backported anyway.

  This is the fix for a long-standing cpuset bug which existed from
  2009.  cpuset makes use of PF_SPREAD_{PAGE|SLAB} flags to modify the
  task's memory allocation behavior according to the settings of the
  cpuset it belongs to; unfortunately, when those flags have to be
  changed, cpuset did so directly even while the target task is running,
  which is obviously racy as task->flags may be modified by the task
  itself at any time.  This obscure bug manifested as corrupt
  PF_USED_MATH flag leading to a weird crash.

  The bug is fixed by moving the flag to task->atomic_flags.  The first
  two are preparatory ones to help defining atomic_flags accessors and the
  third one is the actual fix"

* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags
  sched: add macros to define bitops for task atomic flags
  sched: fix confusing PFA_NO_NEW_PRIVS constant
2014-09-27 16:45:33 -07:00
Alexei Starovoitov
3c731eba48 bpf: mini eBPF library, test stubs and verifier testsuite
1.
the library includes a trivial set of BPF syscall wrappers:
int bpf_create_map(int key_size, int value_size, int max_entries);
int bpf_update_elem(int fd, void *key, void *value);
int bpf_lookup_elem(int fd, void *key, void *value);
int bpf_delete_elem(int fd, void *key);
int bpf_get_next_key(int fd, void *key, void *next_key);
int bpf_prog_load(enum bpf_prog_type prog_type,
		  const struct sock_filter_int *insns, int insn_len,
		  const char *license);
bpf_prog_load() stores verifier log into global bpf_log_buf[] array

and BPF_*() macros to build instructions

2.
test stubs configure eBPF infra with 'unspec' map and program types.
These are fake types used by user space testsuite only.

3.
verifier tests valid and invalid programs and expects predefined
error log messages from kernel.
40 tests so far.

$ sudo ./test_verifier
 #0 add+sub+mul OK
 #1 unreachable OK
 #2 unreachable2 OK
 #3 out of range jump OK
 #4 out of range jump2 OK
 #5 test1 ld_imm64 OK
 ...

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26 15:05:15 -04:00