737123 Commits

Author SHA1 Message Date
Wei Yongjun
0725390da9 cpufreq: scpi: fix error return code in scpi_cpufreq_init()
Fix to return a negative error code from the clk_get() error handling
case instead of 0, as done elsewhere in this function.
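
The pattern described above can be sketched in userspace C. This is only an
illustration: err_ptr(), is_err(), ptr_err() and fake_clk_get() are
hypothetical stand-ins written for this example, not the kernel helpers or the
driver's code, and the real diff is not shown in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical clock lookup that fails with -ENOENT on request. */
static void *fake_clk_get(int should_fail)
{
	static int dummy_clk;

	return should_fail ? err_ptr(-ENOENT) : (void *)&dummy_clk;
}

/* Buggy shape: ret still holds 0 when the error path is taken. */
static int init_buggy(int clk_fails)
{
	int ret = 0;
	void *clk = fake_clk_get(clk_fails);

	if (is_err(clk))
		goto out;	/* ret was never set, so the caller sees success */
	return 0;
out:
	return ret;
}

/* Fixed shape: propagate the negative error code encoded in the pointer. */
static int init_fixed(int clk_fails)
{
	int ret = 0;
	void *clk = fake_clk_get(clk_fails);

	if (is_err(clk)) {
		ret = (int)ptr_err(clk);
		goto out;
	}
	return 0;
out:
	return ret;
}
```

With a failing lookup, init_buggy(1) returns 0 while init_fixed(1) returns
-ENOENT.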

Fixes: 343a8d17fa (cpufreq: scpi: remove arm_big_little dependency)
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-08 10:21:15 +01:00
Greg Kroah-Hartman
43cdd1b716 ACPI: sbshc: remove raw pointer from printk() message
There's no need to be printing a raw kernel pointer to the kernel log at
every boot.  So just remove it, and change the whole message to use the
correct dev_info() call at the same time.

Reported-by: Wang Qize <wang_qize@venustech.com.cn>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-08 09:50:08 +01:00
Ulf Magnusson
48973df8c9 s390/kconfig: Remove ARCH_WANTS_PROT_NUMA_PROT_NONE select
The ARCH_WANTS_PROT_NUMA_PROT_NONE symbol was removed by
commit 6a33979d5b ("mm: remove misleading ARCH_USES_NUMA_PROT_NONE"),
but S390 still selects it.

Remove the ARCH_WANTS_PROT_NUMA_PROT_NONE select from the S390 symbol.

Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py
script.

Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-02-08 07:07:44 +01:00
Ulf Magnusson
57ea5f161a KVM: PPC: Book3S PR: Fix broken select due to misspelling
Commit 76d837a4c0 ("KVM: PPC: Book3S PR: Don't include SPAPR TCE code
on non-pseries platforms") added a reference to the globally undefined
symbol PPC_SERIES. Looking at the rest of the commit, PPC_PSERIES was
probably intended.

Change PPC_SERIES to PPC_PSERIES.

Discovered with the
https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py
script.

Fixes: 76d837a4c0 ("KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-02-08 16:42:16 +11:00
Song Liu
5c487bb9ad tcp: tracepoint: only call trace_tcp_send_reset with full socket
tracepoint tcp_send_reset requires a full socket to work. However, it
may be called when in TCP_TIME_WAIT:

        case TCP_TW_RST:
                tcp_v6_send_reset(sk, skb);
                inet_twsk_deschedule_put(inet_twsk(sk));
                goto discard_it;

To avoid this problem, this patch checks the socket with sk_fullsock()
before calling trace_tcp_send_reset().

Fixes: c24b14c46b ("tcp: add tracepoint trace_tcp_send_reset")
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 22:00:42 -05:00
Md. Islam
043e337f55 sch_netem: Bug fixing in calculating Netem interval
In Kernel 4.15.0+, Netem does not work properly.

Netem setup:

tc qdisc add dev h1-eth0 root handle 1: netem delay 10ms 2ms

Result:

PING 172.16.101.2 (172.16.101.2) 56(84) bytes of data.
64 bytes from 172.16.101.2: icmp_seq=1 ttl=64 time=22.8 ms
64 bytes from 172.16.101.2: icmp_seq=2 ttl=64 time=10.9 ms
64 bytes from 172.16.101.2: icmp_seq=3 ttl=64 time=10.9 ms
64 bytes from 172.16.101.2: icmp_seq=5 ttl=64 time=11.4 ms
64 bytes from 172.16.101.2: icmp_seq=6 ttl=64 time=11.8 ms
64 bytes from 172.16.101.2: icmp_seq=4 ttl=64 time=4303 ms
64 bytes from 172.16.101.2: icmp_seq=10 ttl=64 time=11.2 ms
64 bytes from 172.16.101.2: icmp_seq=11 ttl=64 time=10.3 ms
64 bytes from 172.16.101.2: icmp_seq=7 ttl=64 time=4304 ms
64 bytes from 172.16.101.2: icmp_seq=8 ttl=64 time=4303 ms

Patch:

(rnd % (2 * sigma)) - sigma was overflowing s32. After applying the
patch, I got the following output, which is the desired behavior.

PING 172.16.101.2 (172.16.101.2) 56(84) bytes of data.
64 bytes from 172.16.101.2: icmp_seq=1 ttl=64 time=21.1 ms
64 bytes from 172.16.101.2: icmp_seq=2 ttl=64 time=8.46 ms
64 bytes from 172.16.101.2: icmp_seq=3 ttl=64 time=9.00 ms
64 bytes from 172.16.101.2: icmp_seq=4 ttl=64 time=11.8 ms
64 bytes from 172.16.101.2: icmp_seq=5 ttl=64 time=8.36 ms
64 bytes from 172.16.101.2: icmp_seq=6 ttl=64 time=11.8 ms
64 bytes from 172.16.101.2: icmp_seq=7 ttl=64 time=8.11 ms
64 bytes from 172.16.101.2: icmp_seq=8 ttl=64 time=10.0 ms
64 bytes from 172.16.101.2: icmp_seq=9 ttl=64 time=11.3 ms
64 bytes from 172.16.101.2: icmp_seq=10 ttl=64 time=11.5 ms
64 bytes from 172.16.101.2: icmp_seq=11 ttl=64 time=10.2 ms
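
The diff itself is not part of this log, so the following is only a
reconstruction of how a 32-bit wrap in that expression can produce delays of
about 4.3 s (2^32 ns is roughly 4.29 s). The function names, the parameter
types, and the idea of widening to 64 bits before subtracting are assumptions
made for this sketch, not the actual patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed pre-fix shape: the subtraction is carried out in 32-bit
 * unsigned arithmetic, so a remainder smaller than sigma wraps around
 * to a value close to 2^32 before mu is added.
 */
static int64_t delay_wrapping(uint32_t rnd, int64_t mu, int32_t sigma)
{
	assert(sigma > 0);
	uint32_t spread = rnd % (2 * (uint32_t)sigma);
	uint32_t wrapped = spread - (uint32_t)sigma;	/* wraps if spread < sigma */

	return (int64_t)wrapped + mu;
}

/* Assumed fixed shape: widen to 64 bits first, then subtract sigma. */
static int64_t delay_widened(uint32_t rnd, int64_t mu, int32_t sigma)
{
	assert(sigma > 0);
	uint32_t spread = rnd % (2 * (uint32_t)sigma);

	return ((int64_t)spread + mu) - sigma;
}
```

For mu = 10 ms and sigma = 2 ms expressed in nanoseconds, rnd = 0 gives
4302967296 ns (about 4303 ms) in the wrapping version and 8000000 ns (8 ms) in
the widened one, which is in line with the two ping runs above.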

Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:59:12 -05:00
Grygorii Strashko
62f94c2101 net: ethernet: ti: cpsw: fix net watchdog timeout
It was discovered that a simple program which indefinitely sends 200b UDP
packets and runs on a TI AM574x SoC (SMP) under an RT kernel triggers a
network watchdog timeout in the TI CPSW driver (<6 hours run). The network
watchdog timeout is triggered by a race between cpsw_ndo_start_xmit() and
cpsw_tx_handler() [NAPI]:

cpsw_ndo_start_xmit()
	if (unlikely(!cpdma_check_free_tx_desc(txch))) {
		txq = netdev_get_tx_queue(ndev, q_idx);
		netif_tx_stop_queue(txq);

^^ as per [1] a barrier has to be used after set_bit(), otherwise the new
value might not be visible to other CPUs
	}

cpsw_tx_handler()
	if (unlikely(netif_tx_queue_stopped(txq)))
		netif_tx_wake_queue(txq);

When this happens, the ndev TX queue stays disabled forever while the
driver's HW TX queue is empty.

Fix this by adding smp_mb__after_atomic() after the netif_tx_stop_queue()
calls and by double-checking for free TX descriptors after stopping the ndev
TX queue: if there are free TX descriptors, wake up the ndev TX queue.
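
The stop, barrier, re-check sequence can be modelled in userspace with C11
atomics. This is a sketch under stated assumptions: struct txq_model and both
functions are hypothetical, and atomic_thread_fence() only plays the role that
smp_mb__after_atomic() has in the driver; it is not the driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one TX queue and its descriptor pool. */
struct txq_model {
	atomic_bool stopped;	/* models the queue's "stopped" bit */
	atomic_int free_desc;	/* models the number of free TX descriptors */
};

/* Transmit side: called after a check found no free descriptor. */
static void stop_queue_and_recheck(struct txq_model *q)
{
	assert(q != NULL);
	atomic_store_explicit(&q->stopped, true, memory_order_relaxed);

	/* Full fence: publish the stop before re-reading the pool. */
	atomic_thread_fence(memory_order_seq_cst);

	/* A completion may have run in between; if so, undo the stop. */
	if (atomic_load_explicit(&q->free_desc, memory_order_relaxed) > 0)
		atomic_store_explicit(&q->stopped, false, memory_order_relaxed);
}

/* Completion side: return one descriptor, then wake a stopped queue. */
static void complete_one(struct txq_model *q)
{
	assert(q != NULL);
	atomic_fetch_add(&q->free_desc, 1);
	if (atomic_load(&q->stopped))
		atomic_store(&q->stopped, false);
}
```

Without the re-check, a completion that runs just before the stop becomes
visible would leave the queue stopped with nothing left to wake it.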

[1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:57:10 -05:00
Thomas Falcon
b0992eca00 ibmvnic: Ensure that buffers are NULL after free
This change will guard against a double free in the case that the
buffers were previously freed at some other time, such as during
a device reset. It resolves a kernel oops that occurred when changing
the VNIC device's MTU.
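
A minimal userspace sketch of the idea, assuming a hypothetical pool structure
(struct rx_pool_model and release_buf() are illustrative names, not the
driver's):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver structure that owns one buffer. */
struct rx_pool_model {
	void *buf;
};

/* Free the buffer and clear the pointer so a repeated call is harmless. */
static void release_buf(struct rx_pool_model *pool)
{
	assert(pool != NULL);
	free(pool->buf);	/* free(NULL) is defined to do nothing */
	pool->buf = NULL;
}
```

Calling release_buf() twice on the same pool is safe because the second call
only sees NULL.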

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:55:52 -05:00
John Allen
3468656fd7 ibmvnic: Fix rx queue cleanup for non-fatal resets
At some point, a check was added to exit the polling routine during resets.
This makes sense for most reset conditions, but for a non-fatal error, we
expect the polling routine to continue running to properly clean up the rx
queues. This patch checks if we are performing a non-fatal reset and if we
are, continues normal polling operation.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:55:33 -05:00
Amritha Nambiar
bc6d33c8d9 i40e: Fix the number of queues available to be mapped for use
Fix the number of queues per enabled TC and report available queues
to the kernel without having to limit them to the max RSS limit so
they are available to be mapped for XPS. This allows a queue per
processing thread available for handling traffic for the given
traffic class.

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:53:32 -05:00
David Ahern
44750f8483 net/ipv6: onlink nexthop checks should default to main table
Because of differences in how IPv4 and IPv6 handle fib lookups,
verification of nexthops with the onlink flag needs to default to the main
table rather than the local table used by IPv4. As it stands an
address within a connected route on device 1 can be used with
onlink on device 2. Updating the table properly rejects the route
due to the egress device mismatch.

Update the extack message as well to show it could be a device
mismatch for the nexthop spec.

Fixes: fc1e64e109 ("net/ipv6: Add support for onlink flag")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:52:42 -05:00
David Ahern
58e354c01b net/ipv6: Handle reject routes with onlink flag
Verification of nexthops with the onlink flag needs to handle unreachable
routes. The lookup is only intended to validate the gateway address
is not a local address and if the gateway resolves the egress device
must match the given device. Hence, hitting any default reject route
is ok.

Fixes: fc1e64e109 ("net/ipv6: Add support for onlink flag")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:52:06 -05:00
Shannon Nelson
c861ef83d7 sun: Add SPDX license tags to Sun network drivers
Add the appropriate SPDX license tags to the Sun network drivers
as outlined in Documentation/process/license-rules.rst.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:51:02 -05:00
David Howells
17e9e23b13 rxrpc: Fix received abort handling
AF_RXRPC is incorrectly sending back to the server any abort it receives
for a client connection.  This is due to the final-ACK offload to the
connection event processor patch.  The abort code is copied into the
last-call information on the connection channel and then the event
processor is set.

Instead, the following should be done:

 (1) In the case of a final-ACK for a successful call, the ACK should be
     scheduled as before.

 (2) In the case of a locally generated ABORT, the ABORT details should be
     cached for sending in response to further packets related to that
     call and no further action scheduled at call disconnect time.

 (3) In the case of an ABORT received from the peer, the call should be
     considered dead, no ABORT should be transmitted at this time.  In
     response to further non-ABORT packets from the peer relating to this
     call, an RX_USER_ABORT ABORT should be transmitted.

 (4) In the case of a call killed due to network error, an RX_USER_ABORT
     ABORT should be cached for transmission in response to further
     packets, but no ABORT should be sent at this time.

Fixes: 3136ef49a1 ("rxrpc: Delay terminal ACK transmission on a client call")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:47:10 -05:00
Christophe JAILLET
e729452ec3 cxgb4: Fix error handling path in 'init_one()'
Commit baf5086840 ("cxgb4: restructure VF mgmt code") has reordered
some code but an error handling label has not been updated accordingly.
So fix it and free 'adapter' if 't4_wait_dev_ready()' fails.
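
The general shape of such an unwind path can be sketched as follows. Every
name here (struct adapter_model, init_one_model(), live_allocs) is
hypothetical and the code is a userspace model, not the cxgb4 function.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-device structure. */
struct adapter_model {
	int ready;
};

/* Counts outstanding allocations so that a leak is observable. */
static int live_allocs;

static int init_one_model(int dev_ready, struct adapter_model **out)
{
	struct adapter_model *adapter;
	int err;

	assert(out != NULL);
	adapter = calloc(1, sizeof(*adapter));
	if (!adapter)
		return -ENOMEM;
	live_allocs++;

	if (!dev_ready) {
		err = -EIO;
		/* Jump to the label that frees what exists at this point. */
		goto out_free_adapter;
	}

	adapter->ready = 1;
	*out = adapter;
	return 0;

out_free_adapter:
	free(adapter);
	live_allocs--;
	return err;
}
```

When code is reordered, each goto has to be re-checked against the resources
that are live at that point; jumping to a later label here would leak adapter.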

Fixes: baf5086840 ("cxgb4: restructure VF mgmt code")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 21:46:06 -05:00
Trond Myklebust
90ea9f1b60 Make the xprtiod workqueue unbounded.
This should help reduce the latency on replies.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2018-02-07 18:31:54 -05:00
Naresh Kamboju
035d808f7c selftests: bpf: test_kmod.sh: check the module path before insmod
test_kmod.sh reported a false failure when the module was not present.
Check that test_bpf.ko is present in the path before loading it.

There are two cases to address here.
During development of test_bpf.c, developers do unit testing by using
"insmod $SRC_TREE/lib/test_bpf.ko".

Testers, on the other hand, run the full tests by installing modules on the
device under test (DUT) and then using modprobe to insert the modules.

Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-08 00:24:55 +01:00
Jens Axboe
8525e5ff45 Merge branch 'for-linus' into test
* for-linus:
  block, bfq: add requeue-request hook
  bcache: fix for data collapse after re-attaching an attached device
  bcache: return attach error when no cache set exist
  bcache: set writeback_rate_update_seconds in range [1, 60] seconds
  bcache: fix for allocator and register thread race
  bcache: set error_limit correctly
  bcache: properly set task state in bch_writeback_thread()
  bcache: fix high CPU occupancy during journal
  bcache: add journal statistic
  block: Add should_fail_bio() for bpf error injection
  blk-wbt: account flush requests correctly
2018-02-07 15:54:25 -07:00
Jens Axboe
61a695184f Merge branch 'master' into test
* master: (1190 commits)
  ASoC: stm32: add of dependency for stm32 drivers
  ASoC: mt8173-rt5650: fix child-node lookup
  ASoC: dapm: fix debugfs read using path->connected
  platform/x86: samsung-laptop: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  platform/x86: ideapad-laptop: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  platform/x86: dell-laptop: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  seq_file: Introduce DEFINE_SHOW_ATTRIBUTE() helper macro
  Documentation/sysctl/user.txt: fix typo
  MAINTAINERS: update ARM/QUALCOMM SUPPORT patterns
  MAINTAINERS: update various PALM patterns
  MAINTAINERS: update "ARM/OXNAS platform support" patterns
  MAINTAINERS: update Cortina/Gemini patterns
  MAINTAINERS: remove ARM/CLKDEV SUPPORT file pattern
  MAINTAINERS: remove ANDROID ION pattern
  mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors
  mm: docs: fix parameter names mismatch
  mm: docs: fixup punctuation
  pipe: read buffer limits atomically
  pipe: simplify round_pipe_size()
  pipe: reject F_SETPIPE_SZ with size over UINT_MAX
  ...
2018-02-07 15:54:20 -07:00
Linus Torvalds
581e400ff9 Modules updates for v4.16
Summary of modules changes for the 4.16 merge window:
 
 - Minor code cleanups and MAINTAINERS update
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>

Merge tag 'modules-for-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull modules updates from Jessica Yu:
 "Minor code cleanups and MAINTAINERS update"

* tag 'modules-for-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  modpost: Remove trailing semicolon
  ftrace/module: Move ftrace_release_mod() to ddebug_cleanup label
  MAINTAINERS: Remove from module & paravirt maintenance
2018-02-07 14:29:34 -08:00
Linus Torvalds
6fbac201f9 iversion.h related cleanup for v4.16

Merge tag 'iversion-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull inode->i_version cleanup from Jeff Layton:
 "Goffredo went ahead and sent a patch to rename this function, and
  reverse its sense, as we discussed last week.

  The patch is very straightforward and I figure it's probably best to
  go ahead and merge this to get the API as settled as possible"

* tag 'iversion-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  iversion: Rename make inode_cmp_iversion{+raw} to inode_eq_iversion{+raw}
2018-02-07 14:25:22 -08:00
Linus Torvalds
fe803f8628 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF and ext2 fixlets from Jan Kara:
 "A UDF fix and an ext2 cleanup"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  ext2: drop unneeded newline
  udf: Sanitize nanoseconds for time stamps
2018-02-07 14:23:06 -08:00
Dave Airlie
94fc27ac48 Fix for pcode timeouts on BXT and GLK, cmdparser fixes and fixes
for new vbt version on CFL and CNL.
 
 GVT contains vGPU reset enhancement, which refines vGPU reset flow
 and the support of virtual aperture read/write when x-no-mmap=on
 is set in KVM, which is required by a test case from Redhat and
 also another fix for virtual OpRegion.

Merge tag 'drm-intel-next-fixes-2018-02-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

* tag 'drm-intel-next-fixes-2018-02-07' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915/bios: add DP max link rate to VBT child device struct
  drm/i915/cnp: Properly handle VBT ddc pin out of bounds.
  drm/i915/cnp: Ignore VBT request for know invalid DDC pin.
  drm/i915/cmdparser: Do not check past the cmd length.
  drm/i915/cmdparser: Check reg_table_count before derefencing.
  drm/i915/bxt, glk: Increase PCODE timeouts during CDCLK freq changing
  drm/i915/gvt: Use KVM r/w to access guest opregion
  drm/i915/gvt: Fix aperture read/write emulation when enable x-no-mmap=on
  drm/i915/gvt: only reset execlist state of one engine during VM engine reset
  drm/i915/gvt: refine intel_vgpu_submission_ops as per engine ops
2018-02-08 08:21:37 +10:00
Paolo Valente
a787739061 block, bfq: add requeue-request hook
Commit 'a6a252e64914 ("blk-mq-sched: decide how to handle flush rq via
RQF_FLUSH_SEQ")' makes all non-flush re-prepared requests for a device
be re-inserted into the active I/O scheduler for that device. As a
consequence, I/O schedulers may get the same request inserted again,
even several times, without a finish_request invoked on that request
before each re-insertion.

This fact is the cause of the failure reported in [1]. For an I/O
scheduler, every re-insertion of the same re-prepared request is
equivalent to the insertion of a new request. For schedulers like
mq-deadline or kyber, this fact causes no harm. In contrast, it
confuses a stateful scheduler like BFQ, which keeps state for an I/O
request, until the finish_request hook is invoked on the request. In
particular, BFQ may get stuck, waiting forever for the number of
request dispatches, of the same request, to be balanced by an equal
number of request completions (while there will be one completion for
that request). In this state, BFQ may refuse to serve I/O requests
from other bfq_queues. The hang reported in [1] then follows.

However, the above re-prepared requests undergo a requeue, thus the
requeue_request hook of the active elevator is invoked for these
requests, if set. This commit then addresses the above issue by
properly implementing the hook requeue_request in BFQ.

[1] https://marc.info/?l=linux-block&m=151211117608676

Reported-by: Ivan Kozik <ivan@ludios.org>
Reported-by: Alban Browaeys <alban.browaeys@gmail.com>
Tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Serena Ziviani <ziviani.serena@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 15:17:46 -07:00
Linus Torvalds
ffefb18172 regulator: Fix suspend to idle
Testing on mainline after the initial regulator pull request went in
 identified a regression for suspend to idle due to it calling the
 suspend operations with states that it wasn't realized could happen,
 this patch fixes the problem.

Merge tag 'regulator-fix-v4.16-suspend' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
 "Fix suspend to idle.

  Testing on mainline after the initial regulator pull request went in
  identified a regression for suspend to idle due to it calling the
  suspend operations with states that it wasn't realized could happen,
  this patch fixes the problem"

* tag 'regulator-fix-v4.16-suspend' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Fix suspend to idle
2018-02-07 13:17:07 -08:00
Linus Torvalds
c3611b6d7f fbdev changes for v4.16:
- fix display-timings lookup in the Device Tree in atmel_lcdfb
   driver (Johan Hovold)
 
 - fix video mode and line_length to be set correctly in vfb driver
   (Pieter "PoroCYon" Sluys)
 
 - fix returning nonsensical values to the user-space on GIO_FONTX
   ioctl when using dummy console (Nicolas Pitre)
 
 - add missing license tag to mmpfb driver (Arnd Bergmann)
 
 - convert radeonfb and pxa3xx_gcu drivers to use ktime_get[_ts64]()
   instead of the deprecated do_gettimeofday() (Arnd Bergmann)
 
 - switch udlfb driver from using the pr_*() logging functions to
   the dev_*() ones + related cleanups (Ladislav Michl)
 
 - use __raw I/O accessors also on arm64 (Ji Zhang)
 
 - fix Kconfig help text for intelfb driver (Randy Dunlap)
 
 - do not duplicate features data in omapfb driver (Ladislav Michl)
 
 - misc cleanups (Colin Ian King, Markus Elfring, Rasmus Villemoes,
   Vasyl Gomonovych, Himanshu Jha, Michael Trimarchi)

Merge tag 'fbdev-v4.16' of git://github.com/bzolnier/linux

Pull fbdev updates from Bartlomiej Zolnierkiewicz:
 "There is nothing really major here:

   - fix display-timings lookup in the Device Tree in atmel_lcdfb driver
     (Johan Hovold)

   - fix video mode and line_length to be set correctly in vfb driver
     (Pieter "PoroCYon" Sluys)

   - fix returning nonsensical values to the user-space on GIO_FONTX
     ioctl when using dummy console (Nicolas Pitre)

   - add missing license tag to mmpfb driver (Arnd Bergmann)

   - convert radeonfb and pxa3xx_gcu drivers to use ktime_get[_ts64]()
     instead of the deprecated do_gettimeofday() (Arnd Bergmann)

   - switch udlfb driver from using the pr_*() logging functions to the
     dev_*() ones + related cleanups (Ladislav Michl)

   - use __raw I/O accessors also on arm64 (Ji Zhang)

   - fix Kconfig help text for intelfb driver (Randy Dunlap)

   - do not duplicate features data in omapfb driver (Ladislav Michl)

   - misc cleanups (Colin Ian King, Markus Elfring, Rasmus Villemoes,
     Vasyl Gomonovych, Himanshu Jha, Michael Trimarchi)"

* tag 'fbdev-v4.16' of git://github.com/bzolnier/linux: (25 commits)
  video: udlfb: Switch from the pr_*() to the dev_*() logging functions
  video: udlfb: Constify read only data
  video: fbdev/mmp: add MODULE_LICENSE
  console/dummy: leave .con_font_get set to NULL
  fbdev: mxsfb: use framebuffer_alloc in the correct way
  video: udlfb: Do not name private data 'dev'
  video: udlfb: Remove noisy warnings
  video: udlfb: Remove redundant gdev variable
  video: udlfb: Remove unnecessary local variable
  fbdev: auo_k190x: Use zeroing memory allocator instead of allocator/memset
  vfb: fix video mode and line_length being set when loaded
  fbdev: arm64 use __raw I/O memory api
  omapfb: dss: Do not duplicate features data
  video: fbdev: omap2: Use PTR_ERR_OR_ZERO()
  fbdev: au1200fb: delete duplicate header contents
  fbdev: pxa3xx: use ktime_get_ts64 for time stamps
  fbdev: radeon: use ktime_get() for HZ calibration
  video: smscufx: Improve a size determination in two functions
  video: udlfb: Delete an unnecessary return statement in two functions
  video: udlfb: Improve a size determination in dlfb_alloc_urb_list()
  ...
2018-02-07 13:10:43 -08:00
Linus Torvalds
cc006a2241 platform-drivers-x86 for v4.16-2
DEFINE_SHOW_ATTRIBUTE() macro defined privately in 3 locations is
 useful for new and old users to avoid a lot of code duplication.
 
 Move the macro to seq_file.h.
 
 Along with above, clean up 3 drivers to use that macro.
 
 The following is an automated git shortlog grouped by driver:
 
 dell-laptop:
  -  Re-use DEFINE_SHOW_ATTRIBUTE() macro
 
 ideapad-laptop:
  -  Re-use DEFINE_SHOW_ATTRIBUTE() macro
 
 samsung-laptop:
  -  Re-use DEFINE_SHOW_ATTRIBUTE() macro
 
 seq_file:
  -  Introduce DEFINE_SHOW_ATTRIBUTE() helper macro

Merge tag 'platform-drivers-x86-v4.16-2' of git://git.infradead.org/linux-platform-drivers-x86

Pull more x86 platform-drivers updates from Andy Shevchenko:
 "The DEFINE_SHOW_ATTRIBUTE() macro was defined privately in three
  locations and is useful for new and old users to avoid a lot of code
  duplication.

  Move the macro to seq_file.h.

  Along with above, clean up three drivers to use that macro.

  This, due to dependencies, was sent separately since affected changes
  weren't upstream originally yet. The rationale of doing this now is to
  allow use of new macro in v4.17 cycle in a conflictless manner"

* tag 'platform-drivers-x86-v4.16-2' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: samsung-laptop: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  platform/x86: ideapad-laptop: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  platform/x86: dell-laptop: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  seq_file: Introduce DEFINE_SHOW_ATTRIBUTE() helper macro
2018-02-07 12:47:23 -08:00
Jani Nikula
6dd3104e78 drm/i915/bios: add DP max link rate to VBT child device struct
Update VBT defs to reflect revision 216. While at it, default the
expected child device struct size to the sizeof of the struct rather than
a hardcoded value.

v2: Fix bit order (David)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180118153310.32437-1-jani.nikula@intel.com
(cherry picked from commit c4fb60b9ab)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-02-07 12:32:14 -08:00
Linus Torvalds
7590e37bda ASoC: Updates for v4.16
With the merge window having been delayed for another week here's
 another batch of updates that came in during that week.  There's a few
 important fixes in here, mainly a fix for I/O on a number of devices
 caused by some of the component rework and a fix for a potential issue
 if more than one component in a link provides compressed operations.
 The I/O fixes are particularly important as the problem causes a power
 regression on a number of OMAP platforms.

Merge tag 'asoc-v4.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound

Pull more ASoC updates from Mark Brown:
 "With the merge window having been delayed for another week here's
  another batch of updates that came in during that week.

  There's a few important fixes in here, mainly a fix for I/O on a
  number of devices caused by some of the component rework and a fix for
  a potential issue if more than one component in a link provides
  compressed operations. The I/O fixes are particularly important as the
  problem causes a power regression on a number of OMAP platforms"

* tag 'asoc-v4.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound: (22 commits)
  ASoC: stm32: add of dependency for stm32 drivers
  ASoC: mt8173-rt5650: fix child-node lookup
  ASoC: dapm: fix debugfs read using path->connected
  ASoC: compress: Fixup error messages
  ASoC: compress: Remove some extraneous blank lines
  ASoC: compress: Correct handling of copy callback
  ASoC: Intel: kbl: Enable mclk and ssp sclk early
  ASoC: Intel: Skylake: Add extended I2S config blob support in Clock driver
  ASoC: Intel: Skylake: Add ssp clock driver
  ASoC: Fix twl4030 and 6040 regression by adding back read and write
  ASoC: sun8i-codec: Add ADC support for a33
  ASoC: rockchip: Use dummy_dai for rt5514 dsp dailink
  ASoC: soc-pcm: rename .pmdown_time to .use_pmdown_time for Component
  ASoC: ak4613: call dummy write for PW_MGMT1/3 when Playback
  ASoC: soc-pcm: don't call flush_delayed_work() many times in soc_pcm_private_free()
  ASoC: soc-core: snd_soc_rtdcom_lookup() cares component driver name
  ASoC: sam9x5_wm8731: Drop 'ASoC' prefix from error messages
  ASoC: sam9g20_wm8731: use dev_*() logging functions
  ASoC: max98373 Changed SPDX header in C++ comments style
  ASoC: dmic: Fix check of return value from read of 'num-channels'
  ...
2018-02-07 12:11:09 -08:00
Dave Airlie
2dd27794b9 Merge branch 'drm-next-4.16' of git://people.freedesktop.org/~agd5f/linux into drm-next
A few more misc fixes for 4.16.

* 'drm-next-4.16' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: re-enable CGCG on CZ and disable on ST
  drm/amdgpu: disable coarse grain clockgating for ST
  drm/radeon: adjust tested variable
  drm/amdgpu: remove WARN_ON when VM isn't found v2
  drm/amdgpu: fix locking in vega10_ih_prescreen_iv
  drm/amdgpu: fix another potential cause of VM faults
  drm/amdgpu: use queue 0 for kiq ring
  drm/ttm: Fix 'buf' pointer update in ttm_bo_vm_access_kmap() (v2)
  drm/ttm: fix missing parameter change for ttm_bo_cleanup_refs
2018-02-08 06:05:52 +10:00
Linus Torvalds
7e6127c124 linux-watchdog 4.16-rc1 merge window tag

Merge tag 'linux-watchdog-4.16-rc1' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

 - new watchdog device drivers for Realtek RTD1295 and Spreadtrum SC9860
   platform

 - add support for the following devices: jz4780 SoC, AST25xx series SoC
   and r8a77970 SoC

 - convert to watchdog framework: i6300esb_wdt, xen_wdt and sp5100_tco

 - several fixes for watchdog core

 - remove at32ap700x and obsolete documentation

 - gpio: Convert to use GPIO descriptors

 - rename gemini into FTWDT010 as this IP block is generic from Faraday
   Technology

 - various clean-ups and small bugfixes

 - add Guenter Roeck as co-maintainer

 - change maintainers e-mail address

* tag 'linux-watchdog-4.16-rc1' of git://www.linux-watchdog.org/linux-watchdog: (74 commits)
  documentation: watchdog: remove documentation of w83697hf_wdt/w83697ug_wdt
  documentation: watchdog: remove documentation for ixp2000
  documentation: watchdog: remove documentation of at32ap700x_wdt
  watchdog: remove at32ap700x_wdt
  watchdog: sp5100_tco: Add support for recent FCH versions
  watchdog: sp5100-tco: Abort if watchdog is disabled by hardware
  watchdog: sp5100_tco: Use bit operations
  watchdog: sp5100_tco: Convert to use watchdog subsystem
  watchdog: sp5100_tco: Clean up function and variable names
  watchdog: sp5100_tco: Use dev_ print functions where possible
  watchdog: sp5100_tco: Match PCI device early
  watchdog: sp5100_tco: Clean up sp5100_tco_setupdevice
  watchdog: sp5100_tco: Use standard error codes
  watchdog: sp5100_tco: Use request_muxed_region where possible
  watchdog: sp5100_tco: Fix watchdog disable bit
  watchdog: sp5100_tco: Always use SP5100_IO_PM_{INDEX_REG,DATA_REG}
  watchdog: core: make sure the watchdog_worker is not deferred
  watchdog: mt7621: switch to using managed devm_watchdog_register_device()
  watchdog: mt7621: set WDOG_HW_RUNNING bit when appropriate
  watchdog: imx2_wdt: restore previous timeout after suspend+resume
  ...
2018-02-07 11:54:34 -08:00
Tang Junhui
73ac105be3 bcache: fix for data collapse after re-attaching an attached device
Back-end device sdm is already attached to a cache set with ID
f67ebe1f-f8bc-4d73-bfe5-9dc88607f119. Trying to attach it to
another cache set then fails with an error:
[root]# cd /sys/block/sdm/bcache
[root]# echo 5ccd0a63-148e-48b8-afa2-aca9cbd6279f > attach
-bash: echo: write error: Invalid argument

After that, run a command to modify the label of the bcache
device:
[root]# echo data_disk1 > label

Then reboot the system. When the system powers on, the back-end
device cannot attach to the cache set, and a message shows in the log:
Feb  5 12:05:52 ceph152 kernel: [922385.508498] bcache:
bch_cached_dev_attach() couldn't find uuid for sdm in set

In sysfs_attach(), dc->sb.set_uuid is assigned the value written
through sysfs regardless of whether bch_cached_dev_attach()
succeeds. For example, if the back-end device is already attached
to a cache set, bch_cached_dev_attach() fails, but dc->sb.set_uuid
has already been changed. Modifying the label of the bcache device
then calls bch_write_bdev_super(), which writes dc->sb.set_uuid to
the super block, so a wrong cache set ID is recorded there. After
the system reboots, the cache set cannot find the UUID of the
back-end device, so the bcache device no longer exists and cannot
be used.

This patch no longer assigns the cache set ID to dc->sb.set_uuid
directly in sysfs_attach(). Instead it passes the ID into
bch_cached_dev_attach(), which assigns it to dc->sb.set_uuid only
after the back-end device has attached to the cache set successfully.

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 12:50:01 -07:00
Tang Junhui
7f4fc93d47 bcache: return attach error when no cache set exist
When I attach a back-end device to a cache set that is not registered
yet, the attach fails but no error is returned:
[root]# echo 87859280-fec6-4bcc-20df7ca8f86b > /sys/block/sde/bcache/attach
[root]#

In sysfs_attach(), the return value "v" is initialized to "size" at
the beginning. If no cache set exists in bch_cache_sets, "v" is never
changed and is returned to sysfs, which treats it as success because
"size" is a positive number.

This patch fixes the issue by initializing "v" to "-ENOENT".

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 12:50:01 -07:00
Coly Li
7a5e3ecbe5 bcache: set writeback_rate_update_seconds in range [1, 60] seconds
dc->writeback_rate_update_seconds can be set via sysfs to any value in
[1, ULONG_MAX].  Such a large value makes no sense; 60 seconds is long
enough, considering that the default of 5 seconds has worked well for a
long time.

Because dc->writeback_rate_update is a special delayed work that re-arms
itself inside the delayed work routine update_writeback_rate(), stopping
it with cancel_delayed_work_sync() needs a timeout to wait and make sure
the re-armed delayed work is stopped too. A small maximum value of
dc->writeback_rate_update_seconds also helps to choose a reasonably small
timeout.

This patch limits the sysfs interface to setting
dc->writeback_rate_update_seconds in the range [1, 60] seconds, and
replaces the hand-coded numbers with macros.

Changelog:
v2: fix a rebase typo in v4, which is pointed out by Michael Lyle.
v1: initial version.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 12:50:01 -07:00
Tang Junhui
682811b3ce bcache: fix for allocator and register thread race
After running random small-I/O writes for a long time,
I rebooted the machine, and after it powered on
I found bcache got stuck. The stacks are:
[root@ceph153 ~]# cat /proc/2510/task/*/stack
[<ffffffffa06b2455>] closure_sync+0x25/0x90 [bcache]
[<ffffffffa06b6be8>] bch_journal+0x118/0x2b0 [bcache]
[<ffffffffa06b6dc7>] bch_journal_meta+0x47/0x70 [bcache]
[<ffffffffa06be8f7>] bch_prio_write+0x237/0x340 [bcache]
[<ffffffffa06a8018>] bch_allocator_thread+0x3c8/0x3d0 [bcache]
[<ffffffff810a631f>] kthread+0xcf/0xe0
[<ffffffff8164c318>] ret_from_fork+0x58/0x90
[<ffffffffffffffff>] 0xffffffffffffffff
[root@ceph153 ~]# cat /proc/2038/task/*/stack
[<ffffffffa06b1abd>] __bch_btree_map_nodes+0x12d/0x150 [bcache]
[<ffffffffa06b1bd1>] bch_btree_insert+0xf1/0x170 [bcache]
[<ffffffffa06b637f>] bch_journal_replay+0x13f/0x230 [bcache]
[<ffffffffa06c75fe>] run_cache_set+0x79a/0x7c2 [bcache]
[<ffffffffa06c0cf8>] register_bcache+0xd48/0x1310 [bcache]
[<ffffffff812f702f>] kobj_attr_store+0xf/0x20
[<ffffffff8125b216>] sysfs_write_file+0xc6/0x140
[<ffffffff811dfbfd>] vfs_write+0xbd/0x1e0
[<ffffffff811e069f>] SyS_write+0x7f/0xe0
[<ffffffff8164c3c9>] system_call_fastpath+0x16/0x1
The stacks show that the register thread and the allocator thread
got stuck while registering the cache device.

I rebooted the machine several times, and the issue always
exists on this machine.

I debugged the code and found the call trace below:
register_bcache()
   ==>run_cache_set()
      ==>bch_journal_replay()
         ==>bch_btree_insert()
            ==>__bch_btree_map_nodes()
               ==>btree_insert_fn()
                  ==>btree_split() //node need split
                     ==>btree_check_reserve()
In btree_check_reserve(), it checks whether there are enough buckets
of RESERVE_BTREE type. Since the allocator thread is not working yet,
no buckets of RESERVE_BTREE type have been allocated, so the register
thread waits on c->btree_cache_wait and goes to sleep.

Then the allocator thread is initialized, and its call trace is below:
bch_allocator_thread()
==>bch_prio_write()
   ==>bch_journal_meta()
      ==>bch_journal()
         ==>journal_wait_for_write()
In journal_wait_for_write(), it checks whether the journal is full via
journal_full(), but the long-running random small-I/O writes have
exhausted the journal buckets (journal.blocks_free=0).
To release journal buckets, the allocator calls btree_flush_write() to
flush keys to btree nodes, and waits on c->journal.wait until the btree
node writes finish or some journal bucket space becomes available; the
allocator thread then goes to sleep. But in btree_flush_write(), since
bch_journal_replay() has not finished, no btree node has a journal
(the condition "if (btree_current_write(b)->journal)" is never satisfied),
so there is no btree node to flush, no journal bucket is released,
and the allocator sleeps forever.

From the above analysis, we can see that:
1) The register thread waits for the allocator thread to allocate buckets
   of RESERVE_BTREE type;
2) The allocator thread waits for the register thread to replay the
   journal, so that it can flush btree nodes and get a journal bucket.
So both are stuck waiting for each other.
Hua Rui provided a patch for me that allocates some buckets of
RESERVE_BTREE type in advance, so that the register thread can get a
bucket when a btree node splits without waiting for the allocator
thread. I tested it; it helps, and the register thread runs a step
further, but in the end both still get stuck. The reason is that only 8
buckets of RESERVE_BTREE type were allocated, and in bch_journal_replay(),
after 2 btree node splits, only 4 buckets of RESERVE_BTREE type are left.
Then btree_check_reserve() is no longer satisfied, so the register thread
goes to sleep again, and at the same time the allocator thread has not
flushed enough btree nodes to release a journal bucket, so both get stuck
again.

So we need to allocate more buckets of RESERVE_BTREE type in advance,
but how many is enough?  From experience and testing, I think it should be
as many as there are journal buckets. I modified the code as in this patch,
tested it on the machine, and it works.

This patch is based on Hua Rui’s patch and allocates more buckets
of RESERVE_BTREE type in advance, to keep the register thread and the
allocator thread from waiting for each other.

[patch v2] ca->sb.njournal_buckets is 0 the first time after
cache creation, when no journal exists, so just 8 btree buckets is OK.

Signed-off-by: Hua Rui <huarui.dev@gmail.com>
Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 12:50:01 -07:00
Coly Li
7ba0d830dc bcache: set error_limit correctly
Struct cache uses io_errors for two purposes:
- Error decay: when the cache set's error_decay is set, io_errors is used
  to generate a small delay when an I/O error happens.
- I/O errors counter: to generate a big enough value for error decay, the
  I/O errors counter is stored left-shifted by 20 bits (a.k.a.
  IO_ERROR_SHIFT).

In function bch_count_io_errors(), if the I/O errors counter reaches the
cache set error limit, bch_cache_set_error() is called to retire the whole
cache set. But the current code is problematic when checking the error
limit; see the following code from bch_count_io_errors():

 90     if (error) {
 91             char buf[BDEVNAME_SIZE];
 92             unsigned errors = atomic_add_return(1 << IO_ERROR_SHIFT,
 93                                                 &ca->io_errors);
 94             errors >>= IO_ERROR_SHIFT;
 95
 96             if (errors < ca->set->error_limit)
 97                     pr_err("%s: IO error on %s, recovering",
 98                            bdevname(ca->bdev, buf), m);
 99             else
100                     bch_cache_set_error(ca->set,
101                                         "%s: too many IO errors %s",
102                                         bdevname(ca->bdev, buf), m);
103     }

At line 94, errors is right-shifted by IO_ERROR_SHIFT bits, so it is the
real errors counter when compared at line 96. But ca->set->error_limit is
initialized with an amplified value in bch_cache_set_alloc(),
1545         c->error_limit  = 8 << IO_ERROR_SHIFT;

It means that by default, in bch_count_io_errors(), bch_cache_set_error()
won't be called to retire the problematic cache device before 8<<20 errors
have happened. If the average request size is 64KB, bcache won't handle
the failed device until 512GB of data has been requested. This is too large
to be an I/O threshold, so I believe the correct error limit should be much
smaller.

This patch sets the default cache set error limit to 8, so in
bch_count_io_errors() when the errors counter reaches 8 (if it is the
default value), bch_cache_set_error() is called to retire the whole
cache set. This patch also removes the bit shifting when storing or
showing the io_error_limit value via the sysfs interface.

Nowadays most SSDs handle internal flash failures automatically by
remapping LBA addresses. If an I/O error can be observed by upper-layer
code, it is a notable error, because the SSD cannot remap the problematic
LBA address to an available flash block. This situation indicates that the
whole SSD will fail very soon. Therefore setting 8 as the default I/O
error limit makes sense; it is enough for most cache devices.

Changelog:
v2: add reviewed-by from Hannes.
v1: initial version for review.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Junhui Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 12:50:01 -07:00
Coly Li
99361bbf26 bcache: properly set task state in bch_writeback_thread()
Kernel thread routine bch_writeback_thread() has the following code block,

447         down_write(&dc->writeback_lock);
448~450     if (check conditions) {
451                 up_write(&dc->writeback_lock);
452                 set_current_state(TASK_INTERRUPTIBLE);
453
454                 if (kthread_should_stop())
455                         return 0;
456
457                 schedule();
458                 continue;
459         }

If the condition check is true, the task state is set to
TASK_INTERRUPTIBLE and schedule() is called to wait for others to wake it
up.

There are 2 issues in the current code:
1, The task state is set to TASK_INTERRUPTIBLE after the condition checks.
   If another process changes the condition and calls
   wake_up_process(dc->writeback_thread), then at line 452 the task state
   is set back to TASK_INTERRUPTIBLE and the writeback kernel thread loses
   a chance to be woken up.
2, At line 454, if kthread_should_stop() is true, the writeback kernel
   thread returns to kernel/kthread.c:kthread() with TASK_INTERRUPTIBLE and
   calls do_exit(). It is not good to enter do_exit() with task state
   TASK_INTERRUPTIBLE: in the following code path might_sleep() is called
   and a warning message is reported by __might_sleep(): "WARNING: do not
   call blocking ops when !TASK_RUNNING; state=1 set at [xxxx]".

For the first issue, the task state should be set before the condition
checks. Indeed, because dc->writeback_lock is required when modifying all
the conditions, calling set_current_state() inside the code block where
dc->writeback_lock is held is safe. But this is quite implicit, so I still
move set_current_state() before all the condition checks.

For the second issue, frankly speaking it does not hurt when a kernel
thread exits in TASK_INTERRUPTIBLE state, but the warning message scares
users and makes them feel there might be something risky in bcache that
could hurt their data.  Setting the task state to TASK_RUNNING before
returning fixes this problem.

In alloc.c:allocator_wait(), there is a similar issue, which is also
fixed in this patch.

Changelog:
v3: merge two similar fixes into one patch
v2: fix the race issue in v1 patch.
v1: initial buggy fix.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Junhui Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 12:50:01 -07:00
Tang Junhui
c4dc2497d5 bcache: fix high CPU occupancy during journal
After running small write I/O for a long time, we found that CPU occupancy
is very high and I/O performance has dropped by about half:

[root@ceph151 internal]# top
top - 15:51:05 up 1 day,2:43,  4 users,  load average: 16.89, 15.15, 16.53
Tasks: 2063 total,   4 running, 2059 sleeping,   0 stopped,   0 zombie
%Cpu(s):4.3 us, 17.1 sy 0.0 ni, 66.1 id, 12.0 wa,  0.0 hi,  0.5 si,  0.0 st
KiB Mem : 65450044 total, 24586420 free, 38909008 used,  1954616 buff/cache
KiB Swap: 65667068 total, 65667068 free,        0 used. 25136812 avail Mem

  PID USER PR NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 2023 root 20  0       0      0      0 S 55.1  0.0   0:04.42 kworker/11:191
14126 root 20  0       0      0      0 S 42.9  0.0   0:08.72 kworker/10:3
 9292 root 20  0       0      0      0 S 30.4  0.0   1:10.99 kworker/6:1
 8553 ceph 20  0 4242492 1.805g  18804 S 30.0  2.9 410:07.04 ceph-osd
12287 root 20  0       0      0      0 S 26.7  0.0   0:28.13 kworker/7:85
31019 root 20  0       0      0      0 S 26.1  0.0   1:30.79 kworker/22:1
 1787 root 20  0       0      0      0 R 25.7  0.0   5:18.45 kworker/8:7
32169 root 20  0       0      0      0 S 14.5  0.0   1:01.92 kworker/23:1
21476 root 20  0       0      0      0 S 13.9  0.0   0:05.09 kworker/1:54
 2204 root 20  0       0      0      0 S 12.5  0.0   1:25.17 kworker/9:10
16994 root 20  0       0      0      0 S 12.2  0.0   0:06.27 kworker/5:106
15714 root 20  0       0      0      0 R 10.9  0.0   0:01.85 kworker/19:2
 9661 ceph 20  0 4246876 1.731g  18800 S 10.6  2.8 403:00.80 ceph-osd
11460 ceph 20  0 4164692 2.206g  18876 S 10.6  3.5 360:27.19 ceph-osd
 9960 root 20  0       0      0      0 S 10.2  0.0   0:02.75 kworker/2:139
11699 ceph 20  0 4169244 1.920g  18920 S 10.2  3.1 355:23.67 ceph-osd
 6843 ceph 20  0 4197632 1.810g  18900 S  9.6  2.9 380:08.30 ceph-osd

The kernel workers consumed a lot of CPU, and I found they were running
journal work. The journal is reclaiming resources and flushing btree nodes
with surprising frequency.

Through further analysis, we found that in btree_flush_write() we try to
get the btree node with the smallest fifo index to flush by traversing all
the btree nodes in c->bucket_hash. After we get it, since no lock protects
it, this btree node may already have been written to the cache device by
other workers, and if this occurs, we traverse c->bucket_hash again to get
another btree node. When the problem occurred, the retry count was very
high, and we consumed a lot of CPU looking for an appropriate btree node.

In this patch, we record the 128 btree nodes with the smallest fifo index
in a heap, and pop them one by one when we need to flush a btree node. This
greatly reduces the time the loop takes to find an appropriate btree node,
and also reduces CPU occupancy.

[note by mpl: this triggers a checkpatch error because of adjacent,
pre-existing style violations]

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 12:50:01 -07:00
Tang Junhui
a728eacbbd bcache: add journal statistic
Sometimes the journal takes up a lot of CPU, and we need statistics
to know what the journal is doing. So this patch provides
some journal statistics:
1) reclaim: how many times the journal tries to reclaim resources,
   usually because the journal buckets and/or the pins are exhausted.
2) flush_write: how many times the journal tries to flush a btree node
   to the cache device, usually because the journal buckets are exhausted.
3) retry_flush_write: how many times the journal retries flushing
   the next btree node, usually because the previous btree node has been
   flushed by another thread.
These statistics are shown through the sysfs interface. With them
we can see the status of the journal module when CPU usage is too
high.

Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-07 12:50:01 -07:00
Linus Torvalds
413879a10b RISC-V changes for 4.16
This tag contains the fixes we'd like to target for the 4.16 merge
 window.  It's not as much as I was originally hoping to do but between
 glibc, the chip, and FOSDEM there just wasn't enough time to get
 everything put together.  As such, this merge window is essentially just
 going to be small changes.  This includes mostly cleanups:
 
 * A build failure fix for the audit test cases.  RISC-V doesn't have
   renameat because the generic syscall ABI moved to renameat2 by the
   time of our port.  The syscall audit test cases don't understand this,
   so I added a trivial fix.  This went through mailing list review
   during the 4.15 merge window, but nobody has picked it up so I think
   it's best to just do this here.
 * The removal of our command-line argument processing code.  The
   "mem_end" stuff was broken and the rest duplicated generic device tree
   code.  The generic code was already being called.
 * Some unused/redundant code has been removed, including
   __ARCH_HAVE_MMU, current_pgdir, and the initialization of init_mm.pgd.
 * SUM is disabled upon taking a trap, which means that user memory is
   protected during traps taken inside copy_{to,from}_user().
 * The sptbr CSR has been renamed to satp in C code.  We haven't changed
   the assembly code in order to maintain compatibility with binutils
   2.29, which doesn't understand the new name.
 
 Additionally, we're adding some new features:
 
 * Basic ftrace support, thanks to Alan Kao!
 * Support for ZONE_DMA32.  This is necessary for all the normal reasons,
   but also to deal with a deficiency in the Xilinx PCIe controller we're
   using on our FPGA-based systems.  While the ZONE_DMA32 addition should
   be sufficient for most uses, it doesn't complete the fix for the
   Xilinx controller.
 * TLB shootdowns now only target the harts where they're necessary,
   instead of applying to all harts in the system.
 
 These patches have all been sitting on our linux-next branch for a while
 now.  Due to time constraints this is all I feel comfortable submitting
 during the 4.16 merge window, hopefully we'll do better next time!

Merge tag 'riscv-for-linus-4.16-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V updates from Palmer Dabbelt:
 "This contains the fixes we'd like to target for the 4.16 merge window.
  It's not as much as I was originally hoping to do but between glibc,
  the chip, and FOSDEM there just wasn't enough time to get everything
  put together. As such, this merge window is essentially just going to
  be small changes. This includes mostly cleanups:

   - A build failure fix for the audit test cases.

     RISC-V doesn't have renameat because the generic syscall ABI moved
     to renameat2 by the time of our port. The syscall audit test cases
     don't understand this, so I added a trivial fix. This went through
     mailing list review during the 4.15 merge window, but nobody has
     picked it up so I think it's best to just do this here.

   - The removal of our command-line argument processing code. The
     "mem_end" stuff was broken and the rest duplicated generic device
     tree code. The generic code was already being called.

   - Some unused/redundant code has been removed, including
     __ARCH_HAVE_MMU, current_pgdir, and the initialization of
     init_mm.pgd.

   - SUM is disabled upon taking a trap, which means that user memory is
     protected during traps taken inside copy_{to,from}_user().

   - The sptbr CSR has been renamed to satp in C code. We haven't
     changed the assembly code in order to maintain compatibility with
     binutils 2.29, which doesn't understand the new name.

  Additionally, we're adding some new features:

   - Basic ftrace support, thanks to Alan Kao!

   - Support for ZONE_DMA32.

     This is necessary for all the normal reasons, but also to deal with
     a deficiency in the Xilinx PCIe controller we're using on our
     FPGA-based systems. While the ZONE_DMA32 addition should be
     sufficient for most uses, it doesn't complete the fix for the
     Xilinx controller.

   - TLB shootdowns now only target the harts where they're necessary,
     instead of applying to all harts in the system.

  These patches have all been sitting on our linux-next branch for a
  while now. Due to time constraints this is all I feel comfortable
  submitting during the 4.16 merge window, hopefully we'll do better
  next time!"

[ Note to self: "harts" is RISC-V speak for "hardware threads".  I had
  to look that up.    - Linus ]

* tag 'riscv-for-linus-4.16-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  riscv: inline set_pgdir into its only caller
  riscv: rename sptbr to satp
  riscv: don't read back satp in paging_init
  riscv: remove the unused current_pgdir function
  riscv: add ZONE_DMA32
  RISC-V: Limit the scope of TLB shootdowns
  riscv: disable SUM in the exception handler
  riscv: remove redundant unlikely()
  riscv: remove unused __ARCH_HAVE_MMU define
  riscv/ftrace: Add basic support
  RISC-V: Remove mem_end command line processing
  RISC-V: Remove duplicate command-line parsing logic
  audit: Avoid build failures on systems without renameat
2018-02-07 11:33:08 -08:00
Linus Torvalds
0bd2afc748 MIPS fixes for 4.16-rc1
A couple of MIPS fixes for 4.16-rc1, including an important regression
 in 4.15 and a rather more longstanding corner case build fix.
 
 - Fix CPS regression on older binutils due to MIPS_ISA_LEVEL_RAW fix
   (4.15)
 - Fix allmodconfig + CONFIG_MACH_TX49XX=y builds due to incorrect use of
   IS_ENABLED() (2.6.28)

Merge tag 'mips_fixes_4.16_1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips

Pull MIPS fixes from James Hogan:
 "A couple of MIPS fixes for 4.16-rc1, including an important regression
  in 4.15 and a rather more longstanding corner case build fix.

  These are separate from the main pull request as one of the bugs fixed
  was only recently introduced in v4.15-rc8.

   - Fix CPS regression on older binutils due to MIPS_ISA_LEVEL_RAW fix
     (4.15)

   - Fix allmodconfig + CONFIG_MACH_TX49XX=y builds due to incorrect use
     of IS_ENABLED() (2.6.28)"

* tag 'mips_fixes_4.16_1' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
  MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS
  MIPS: CPS: Fix MIPS_ISA_LEVEL_RAW fallout
2018-02-07 11:31:05 -08:00
Linus Torvalds
8578953687 MIPS changes for 4.16
These are the main MIPS changes for 4.16. Rough overview:
  - Basic support for the Ingenic JZ4770 based GCW Zero open-source
    handheld video game console
  - Support for the Ranchu board (used by Android emulator)
  - Various cleanups and misc improvements
 
 Fixes:
  - Fix generic platform's USB_*HCI_BIG_ENDIAN selects (4.9)
  - Fix vmlinuz default build when ZBOOT selected
  - Fix clean up of vmlinuz targets
  - Fix command line duplication (in preparation for Ingenic JZ4770)
 
 Miscellaneous:
  - Allow Processor ID reads to be optimised away by the compiler
    (improves performance when running in guest)
  - Push ARCH_MIGHT_HAVE_PC_SERIO/PARPORT down to platform level to
    disable on generic platform with Ranchu board support
  - Add helpers for assembler macro instructions for older assemblers
  - Use assembler macro instructions to support VZ, XPA & MSA operations
    on older assemblers, removing C wrapper duplication
  - Various improvements to VZ & XPA assembly wrappers
  - Add drivers/platform/mips/ to MIPS MAINTAINERS entry
 
 Minor cleanups:
  - Misc FPU emulation cleanups (removal of unnecessary include, moving
    macros to common header, checkpatch and sparse fixes)
  - Remove duplicate assignment of core in play_dead()
  - Remove duplication in watchpoint handling
  - Remove mips_dma_mapping_error() stub
  - Use NULL instead of 0 in prepare_ftrace_return()
  - Use proper kernel-doc Return keyword for
    __compute_return_epc_for_insn()
  - Remove duplicate semicolon in csum_fold()
 
 Platform support:
 
 Broadcom:
  - Enable ZBOOT on BCM47xx
 
 Generic platform:
  - Add Ranchu board support, used by Android emulator
  - Fix machine compatible string matching for Ranchu
  - Support GIC in EIC mode
 
 Ingenic platforms:
  - Add DT, defconfig and other support for JZ4770 SoC and GCW Zero
  - Support dynamic machine types (i.e. JZ4740 / JZ4770 / JZ4780)
  - Add Ingenic JZ4770 CGU clocks
  - General Ingenic clk changes to prepare for JZ4770 SoC support
  - Use common command line handling code
  - Add DT vendor prefix to GCW (Game Consoles Worldwide)
 
 Loongson:
  - Add MAINTAINERS entry for Loongson2 and Loongson3 platforms
  - Drop 32-bit support for Loongson 2E/2F devices
  - Fix build failures due to multiple use of "MEM_RESERVED"

Merge tag 'mips_4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips

Pull MIPS updates from James Hogan:
 "These are the main MIPS changes for 4.16.

  Rough overview:

   (1) Basic support for the Ingenic JZ4770 based GCW Zero open-source
       handheld video game console

   (2) Support for the Ranchu board (used by Android emulator)

   (3) Various cleanups and misc improvements

  More detailed summary:

  Fixes:
   - Fix generic platform's USB_*HCI_BIG_ENDIAN selects (4.9)
   - Fix vmlinuz default build when ZBOOT selected
   - Fix clean up of vmlinuz targets
   - Fix command line duplication (in preparation for Ingenic JZ4770)

  Miscellaneous:
   - Allow Processor ID reads to be optimised away by the compiler
     (improves performance when running in guest)
   - Push ARCH_MIGHT_HAVE_PC_SERIO/PARPORT down to platform level to
     disable on generic platform with Ranchu board support
   - Add helpers for assembler macro instructions for older assemblers
   - Use assembler macro instructions to support VZ, XPA & MSA
     operations on older assemblers, removing C wrapper duplication
   - Various improvements to VZ & XPA assembly wrappers
   - Add drivers/platform/mips/ to MIPS MAINTAINERS entry

  Minor cleanups:
   - Misc FPU emulation cleanups (removal of unnecessary include, moving
     macros to common header, checkpatch and sparse fixes)
   - Remove duplicate assignment of core in play_dead()
   - Remove duplication in watchpoint handling
   - Remove mips_dma_mapping_error() stub
   - Use NULL instead of 0 in prepare_ftrace_return()
   - Use proper kernel-doc Return keyword for
     __compute_return_epc_for_insn()
   - Remove duplicate semicolon in csum_fold()

  Platform support:

  Broadcom:
   - Enable ZBOOT on BCM47xx

  Generic platform:
   - Add Ranchu board support, used by Android emulator
   - Fix machine compatible string matching for Ranchu
   - Support GIC in EIC mode

  Ingenic platforms:
   - Add DT, defconfig and other support for JZ4770 SoC and GCW Zero
   - Support dynamic machine types (i.e. JZ4740 / JZ4770 / JZ4780)
   - Add Ingenic JZ4770 CGU clocks
   - General Ingenic clk changes to prepare for JZ4770 SoC support
   - Use common command line handling code
   - Add DT vendor prefix to GCW (Game Consoles Worldwide)

  Loongson:
   - Add MAINTAINERS entry for Loongson2 and Loongson3 platforms
   - Drop 32-bit support for Loongson 2E/2F devices
   - Fix build failures due to multiple use of 'MEM_RESERVED'"

* tag 'mips_4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips: (53 commits)
  MIPS: Malta: Sanitize mouse and keyboard configuration.
  MIPS: Update defconfigs after previous patch.
  MIPS: Push ARCH_MIGHT_HAVE_PC_SERIO down to platform level
  MIPS: Push ARCH_MIGHT_HAVE_PC_PARPORT down to platform level
  MIPS: SMP-CPS: Remove duplicate assignment of core in play_dead
  MIPS: Generic: Support GIC in EIC mode
  MIPS: generic: Fix Makefile alignment
  MIPS: generic: Fix ranchu_of_match[] termination
  MIPS: generic: Fix machine compatible matching
  MIPS: Loongson fix name confict - MEM_RESERVED
  MIPS: bcm47xx: enable ZBOOT support
  MIPS: Fix trailing semicolon
  MIPS: Watch: Avoid duplication of bits in mips_read_watch_registers
  MIPS: Watch: Avoid duplication of bits in mips_install_watch_registers.
  MIPS: MSA: Update helpers to use new asm macros
  MIPS: XPA: Standardise readx/writex accessors
  MIPS: XPA: Allow use of $0 (zero) to MTHC0
  MIPS: XPA: Use XPA instructions in assembly
  MIPS: VZ: Pass GC0 register names in $n format
  MIPS: VZ: Update helpers to use new asm macros
  ...
2018-02-07 11:22:44 -08:00
David S. Miller
4d80ecdb80 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for your net tree, they
are:

1) Restore __GFP_NORETRY in xt_table allocations to mitigate effects of
   large memory allocation requests, from Michal Hocko.

2) Release IPv6 fragment queue in case of error in fragmentation header,
   this is a follow up to amend patch 83f1999cae, from Subash Abhinov
   Kasiviswanathan.

3) Flowtable infrastructure depends on NETFILTER_INGRESS as it registers
   a hook for each flowtable, reported by John Crispin.

4) Missing initialization of info->priv in xt_cgroup version 1, from
   Cong Wang.

5) Give the garbage collector a chance to run after scheduling flowtable
   cleanup.

6) Releasing flowtable content on nft_flow_offload module removal is
   not required at all; there are no dependencies between this module
   and flowtables, so remove it.

7) Fix missing xt_rateest_mutex grabbing for hash insertions, also from
   Cong Wang.

8) Move nf_flow_table_cleanup() routine to flowtable core, this patch is
   a dependency for the next patch in this list.

9) Flowtable resources are not properly released on removal from the
   control plane. Fix this resource leak by scheduling removal of all
   entries and making an explicit call to the garbage collector.

10) nf_ct_nat_offset() declaration is dead code, this function prototype
    is not used anywhere, remove it. From Taehee Yoo.

11) Fix another flowtable resource leak on entry insertion failures,
    this patch also fixes a possible use-after-free. Patch from Felix
    Fietkau.
====================
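
Item 7 above concerns a mutex that was not held during hash insertions. The series description does not show the code, so the following is only a user-space sketch of the general pattern, assuming the hazard is a lookup followed by an unlocked insert. All names (demo_est, demo_est_get_or_create, demo_mutex) are hypothetical, and a pthread mutex stands in for a kernel mutex.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BUCKETS 16

/* Hypothetical named entry kept in a small chained hash table. */
struct demo_est {
	struct demo_est *next;
	unsigned int refcnt;
	char name[16];
};

static struct demo_est *demo_hash[DEMO_BUCKETS];
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

static unsigned int demo_bucket(const char *name)
{
	unsigned int h = 0;

	while (*name)
		h = h * 31 + (unsigned char)*name++;
	return h % DEMO_BUCKETS;
}

/*
 * Look up an entry by name and create it if it is absent. The mutex is
 * held across both the lookup and the insertion, so two callers cannot
 * both miss the lookup and then insert duplicate entries.
 * Returns NULL if the name does not fit or allocation fails.
 */
struct demo_est *demo_est_get_or_create(const char *name)
{
	struct demo_est *est = NULL;
	size_t len = strlen(name);
	unsigned int b = demo_bucket(name);

	if (len >= sizeof(est->name))
		return NULL;

	pthread_mutex_lock(&demo_mutex);
	for (est = demo_hash[b]; est; est = est->next) {
		if (strcmp(est->name, name) == 0) {
			est->refcnt++;
			goto out;
		}
	}
	est = calloc(1, sizeof(*est));
	if (est) {
		memcpy(est->name, name, len); /* calloc already zero-terminated it */
		est->refcnt = 1;
		est->next = demo_hash[b];
		demo_hash[b] = est;
	}
out:
	pthread_mutex_unlock(&demo_mutex);
	return est;
}
```

Calling demo_est_get_or_create() twice with the same name returns the same entry with its reference count raised, rather than a second entry in the same bucket.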

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-07 13:55:20 -05:00
Linus Torvalds
e03ab6c4ad A few late-arriving fixes, along with Konstantin's PGP document that had
no reason to wait another cycle.

Merge tag 'docs-4.16-2' of git://git.lwn.net/linux

Pull more documentation updates from Jonathan Corbet:
 "A few late-arriving fixes, along with Konstantin's PGP document that
  had no reason to wait another cycle"

* tag 'docs-4.16-2' of git://git.lwn.net/linux:
  Documentation/process: tweak pgp maintainer guide
  Documentation/admin-guide: fixes for thunderbolt.rst
  Documentation: mips: Update AU1xxx_IDE Kconfig dependencies
  Fix broken link in Documentation/process/kernel-docs.rst
  Documentation/process: kernel maintainer PGP guide
2018-02-07 09:42:59 -08:00
Steve French
5f60a56494 Add missing structs and defines from recent SMB3.1.1 documentation
The last two updates to MS-SMB2 protocol documentation added various
flags and structs (especially relating to SMB3.1.1 tree connect).
Add missing defines and structs to smb2pdu.h

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-02-07 09:36:46 -06:00
Steve French
f9de151bf2 address lock imbalance warnings in smbdirect.c
Although at least one of these was an overly strict sparse warning
in the new smbdirect code, it is cleaner to fix - so no warnings.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-02-07 09:36:43 -06:00
Arnd Bergmann
ade7db991b cifs: silence compiler warnings showing up with gcc-8.0.0
This bug was fixed before, but came up again with the latest
compiler in another function:

fs/cifs/cifssmb.c: In function 'CIFSSMBSetEA':
fs/cifs/cifssmb.c:6362:3: error: 'strncpy' offset 8 is out of the bounds [0, 4] [-Werror=array-bounds]
   strncpy(parm_data->list[0].name, ea_name, name_len);

Let's apply the same fix that was used for the other instances.
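
The message does not spell out what "the same fix" is. As an illustration only, the sketch below shows one common way to avoid this class of -Warray-bounds diagnostic: derive the destination pointer from the start of the enclosing structure with offsetof(), instead of from the address of a small member. The structure and helper names (demo_req, demo_param_area, demo_set_name) are hypothetical and much simpler than the real CIFS definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified request layout; not the real CIFS structures. */
struct demo_req {
	unsigned char length[4];
	char protocol[4];
	unsigned char body[64];
};

/*
 * A pointer computed as
 *
 *     (char *)&req->protocol + offset
 *
 * is derived from a 4-byte member, so a compiler may conclude that any
 * access through it must stay within bounds [0, 4] and warn otherwise.
 * Computing the same address from the start of the enclosing structure
 * avoids tying the pointer to the small member.
 */
static char *demo_param_area(struct demo_req *req, size_t offset)
{
	return (char *)req + offsetof(struct demo_req, protocol) + offset;
}

/*
 * Copy name_len bytes of name to the area that starts offset bytes
 * after the protocol field. Returns 0 on success, or -1 if the copy
 * would run past the end of the request.
 */
static int demo_set_name(struct demo_req *req, size_t offset,
			 const char *name, size_t name_len)
{
	size_t start = offsetof(struct demo_req, protocol) + offset;

	if (start > sizeof(*req) || name_len > sizeof(*req) - start)
		return -1;
	memcpy(demo_param_area(req, offset), name, name_len);
	return 0;
}
```

The explicit length check in demo_set_name() is part of the sketch, not something the commit message describes.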

Fixes: b2a3ad9ca5 ("cifs: silence compiler warnings showing up with gcc-4.7.0")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steve French <smfrench@gmail.com>
2018-02-07 09:36:41 -06:00
Steve French
ede2e520a1 Add some missing debug fields in server and tcon structs
Allow dumping out debug information on dialect, signing, unix extensions
and encryption

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-02-07 09:36:38 -06:00
Julia Lawall
a2b0fe7435 coccinelle: deref_null: avoid useless computation
The effect of the rules ifm1, pr11, and pr12 is only used in the final rule,
which depends on context && !org && !report.  Thus these rules should only
be performed in those circumstances.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-02-08 00:16:12 +09:00
Martin Schwidefsky
f19fbd5ed6 s390: introduce execute-trampolines for branches
Add CONFIG_EXPOLINE to enable the use of the new -mindirect-branch= and
-mfunction-return= compiler options to create a kernel fortified against
the Spectre v2 attack.

With CONFIG_EXPOLINE=y all indirect branches will be issued with an
execute type instruction. For z10 or newer the EXRL instruction will
be used, for older machines the EX instruction. The typical indirect
call

	basr	%r14,%r1

is replaced with a PC relative call to a new thunk

	brasl	%r14,__s390x_indirect_jump_r1

The thunk contains the EXRL/EX instruction to the indirect branch

__s390x_indirect_jump_r1:
	exrl	0,0f
	j	.
0:	br	%r1

The detour via the execute type instruction has a performance impact.
To get rid of the detour, the new kernel parameters "nospectre_v2" and
"spectre_v2=[on,off,auto]" can be used. If one of the parameters is
specified, the kernel and module code will be patched at runtime.
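
As a rough illustration of the two parameter spellings, the following user-space sketch scans a command line string and maps "nospectre_v2" and "spectre_v2=[on,off,auto]" to a mode value. It is not the kernel's parser. The enum, the function names, the default of "auto" when nothing is given, and the rule that a later token overrides an earlier one are all assumptions made for the example.

```c
#include <string.h>

/* Hypothetical mode values for the example. */
enum demo_expoline_mode {
	DEMO_EXPOLINE_AUTO,
	DEMO_EXPOLINE_ON,
	DEMO_EXPOLINE_OFF,
};

/* Return nonzero if the len-byte token equals the literal lit. */
static int demo_token_is(const char *tok, size_t len, const char *lit)
{
	return strlen(lit) == len && strncmp(tok, lit, len) == 0;
}

/*
 * Scan a space-separated command line for the two spellings named in
 * the commit message. Unknown tokens and unknown values are ignored.
 * Assumptions: the default is "auto" and the last matching token wins.
 */
enum demo_expoline_mode demo_parse_spectre_v2(const char *cmdline)
{
	static const char prefix[] = "spectre_v2=";
	const size_t plen = sizeof(prefix) - 1;
	enum demo_expoline_mode mode = DEMO_EXPOLINE_AUTO;
	const char *p = cmdline;

	while (*p) {
		size_t len;

		p += strspn(p, " ");	/* skip separators */
		len = strcspn(p, " ");	/* length of the next token */
		if (len == 0)
			break;

		if (demo_token_is(p, len, "nospectre_v2")) {
			mode = DEMO_EXPOLINE_OFF;
		} else if (len > plen && strncmp(p, prefix, plen) == 0) {
			const char *val = p + plen;
			size_t vlen = len - plen;

			if (demo_token_is(val, vlen, "on"))
				mode = DEMO_EXPOLINE_ON;
			else if (demo_token_is(val, vlen, "off"))
				mode = DEMO_EXPOLINE_OFF;
			else if (demo_token_is(val, vlen, "auto"))
				mode = DEMO_EXPOLINE_AUTO;
		}
		p += len;
	}
	return mode;
}
```

Treating "nospectre_v2" as equivalent to "spectre_v2=off" is also an assumption of this sketch.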

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-02-07 15:57:02 +01:00