Linux kernel stable tree
Go to file
Michal Pecio 42b7581376 usb: xhci: Limit Stop Endpoint retries
Some host controllers fail to atomically transition an endpoint to the
Running state on a doorbell ring and enter a hidden "Restarting" state,
which looks very much like Stopped, with the important difference that
it will spontaneously transition to Running anytime soon.

A Stop Endpoint command queued in the Restarting state typically fails
with Context State Error and the completion handler sees the Endpoint
Context State as either still Stopped or already Running. Even a case
of Halted was observed, when an error occurred right after the restart.

The Halted state is already recovered from by resetting the endpoint.
The Running state is handled by retrying Stop Endpoint.

The Stopped state was recognized as a problem on NEC controllers and
worked around also by retrying, because the endpoint soon restarts and
then stops for good. But there is a risk: the command may fail if the
endpoint is "stopped for good" already, and retries will fail forever.

The possibility of this was not realized at the time, but a number of
cases were discovered later and reproduced. Some proved difficult to
deal with, and it is outright impossible to predict if an endpoint may
fail to ever start at all due to a hardware bug. One such bug (albeit
on ASM3142, not on NEC) was found to be reliably triggered simply by
toggling an AX88179 NIC up/down in a tight loop for a few seconds.

An endless retries storm is quite nasty. Besides putting needless load
on the xHC and CPU, it causes URBs never to be given back, paralyzing
the device and connection/disconnection logic for the whole bus if the
device is unplugged. User processes waiting for URBs become unkillable,
drivers and kworker threads lock up and xhci_hcd cannot be reloaded.

For peace of mind, impose a timeout on Stop Endpoint retries in this
case. If they don't succeed in 100ms, consider the endpoint stopped
permanently for some reason and just give back the unlinked URBs. This
failure case is rare already and work is under way to make it rarer.

Start this work today by also handling one simple case of race with
Reset Endpoint, because it costs just two lines to implement.

Fixes: fd9d55d190 ("xhci: retry Stop Endpoint on buggy NEC controllers")
CC: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20241106101459.775897-32-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-06 13:26:16 +01:00
arch A trivial compile test fix for x86 2024-11-03 08:26:00 -10:00
block block-6.12-20241101 2024-11-01 13:41:55 -10:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto This push fixes the following issues: 2024-10-16 08:42:54 -07:00
Documentation dt-bindings: usb: qcom,dwc3: Add SAR2130P compatible 2024-11-05 13:29:17 +01:00
drivers usb: xhci: Limit Stop Endpoint retries 2024-11-06 13:26:16 +01:00
fs 17 hotfixes. 9 are cc:stable. 13 are MM and 4 are non-MM. 2024-11-03 10:25:05 -10:00
include Merge v6.12-rc6 into usb-next 2024-11-05 09:56:08 +01:00
init cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS 2024-10-13 22:23:13 +02:00
io_uring io_uring/rw: fix missing NOWAIT check for O_DIRECT start write 2024-10-31 08:21:02 -06:00
ipc struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
kernel A single fix for posix CPU timers 2024-11-03 08:22:21 -10:00
lib arm64 fixes for -rc6 2024-11-01 07:54:11 -10:00
LICENSES LICENSES: add 0BSD license text 2024-09-01 20:43:24 -07:00
mm 17 hotfixes. 9 are cc:stable. 13 are MM and 4 are non-MM. 2024-11-03 10:25:05 -10:00
net nfsd-6.12 fixes: 2024-11-02 09:27:11 -10:00
rust Driver core fix for 6.12-rc3 2024-10-13 09:10:52 -07:00
samples [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
scripts Kbuild fixes for v6.12 (2nd) 2024-11-03 08:29:02 -10:00
security ipe: fallback to platform keyring also if key in trusted keyring is rejected 2024-10-18 12:14:53 -07:00
sound ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1 2024-10-30 14:46:59 +01:00
tools 17 hotfixes. 9 are cc:stable. 13 are MM and 4 are non-MM. 2024-11-03 10:25:05 -10:00
usr initramfs: shorten cmd_initfs in usr/Makefile 2024-07-16 01:07:52 +09:00
virt ARM64: 2024-10-21 11:22:04 -07:00
.clang-format clang-format: Update with v6.11-rc1's for_each macro list 2024-08-02 13:20:31 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore Add Jeff Kirsher to .get_maintainer.ignore 2024-03-08 11:36:54 +00:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore Kbuild updates for v6.12 2024-09-24 13:02:06 -07:00
.mailmap .mailmap: update e-mail address for Eugen Hristev 2024-10-31 20:27:04 -07:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS CREDITS: sort alphabetically by name 2024-10-09 12:47:19 -07:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Merge v6.12-rc6 into usb-next 2024-11-05 09:56:08 +01:00
Makefile Linux 6.12-rc6 2024-11-03 14:05:52 -10:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.