linux/drivers/net/wireless
Guilherme G. Piccoli 5c1b544563 wifi: rtlwifi: Drastically reduce the attempts to read efuse in case of failures
Syzkaller reported a hung task with uevent_show() on stack trace. That
specific issue was addressed by another commit [0], but even with that
fix applied (for example, running v6.12-rc5) we face another type of hung
task that comes from the same reproducer [1]. By investigating that, we
could narrow it to the following path:

(a) Syzkaller emulates a Realtek USB WiFi adapter using raw-gadget and
dummy_hcd infrastructure.

(b) During the probe of rtl8192cu, the driver ends up performing an efuse
read procedure (which is related to EEPROM load IIUC), and here lies the
issue: the function read_efuse() calls read_efuse_byte() many times, with
the number of loop iterations depending on the efuse size (in our example,
512 in total).

This procedure for reading efuse bytes relies on a loop that performs an
I/O read up to *10k* times in case of failures. We measured the time of
the loop inside read_efuse_byte() alone, and in this reproducer (which
involves the dummy_hcd emulation layer), it takes 15 seconds per call. As
a consequence, the driver is stuck in its probe routine for a long time,
exposing a stack trace like the one below if we attempt to reboot the
system, for example:

task:kworker/0:3 state:D stack:0 pid:662 tgid:662 ppid:2 flags:0x00004000
Workqueue: usb_hub_wq hub_event
Call Trace:
 __schedule+0xe22/0xeb6
 schedule_timeout+0xe7/0x132
 __wait_for_common+0xb5/0x12e
 usb_start_wait_urb+0xc5/0x1ef
 ? usb_alloc_urb+0x95/0xa4
 usb_control_msg+0xff/0x184
 _usbctrl_vendorreq_sync+0xa0/0x161
 _usb_read_sync+0xb3/0xc5
 read_efuse_byte+0x13c/0x146
 read_efuse+0x351/0x5f0
 efuse_read_all_map+0x42/0x52
 rtl_efuse_shadow_map_update+0x60/0xef
 rtl_get_hwinfo+0x5d/0x1c2
 rtl92cu_read_eeprom_info+0x10a/0x8d5
 ? rtl92c_read_chip_version+0x14f/0x17e
 rtl_usb_probe+0x323/0x851
 usb_probe_interface+0x278/0x34b
 really_probe+0x202/0x4a4
 __driver_probe_device+0x166/0x1b2
 driver_probe_device+0x2f/0xd8
 [...]

We hereby propose to drastically reduce the number of I/O read attempts
in case of failures, restricted to USB devices (given that they're
inherently slower than PCIe ones). By retrying up to 10 times (instead
of 10000), we got responsiveness in the reproducer, and it seems
reasonable to believe that there's no sane USB device implementation in
the field requiring this amount of retries at every I/O read in order
to work properly. Based on that assumption, it would be good to have
this backported to stable, though perhaps not all the way back to the
driver's introduction (the 10k number comes from day 0); backporting to
the 6.x series probably makes sense.
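
The retry cap described above can be illustrated with a small user-space
sketch. This is not the driver code: the names bus_type, fake_read_ctrl(),
max_attempts_for() and poll_ready(), as well as the choice of bit 31 as the
"ready" flag, are assumptions made for illustration only. The simulated
read always fails, so the sketch only shows how many I/O reads each bus
type issues before giving up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bus types, standing in for the driver's interface flag. */
enum bus_type { BUS_PCI, BUS_USB };

#define MAX_ATTEMPTS_DEFAULT 10000u /* original retry budget */
#define MAX_ATTEMPTS_USB     10u    /* reduced budget for USB devices */
#define READY_BIT            0x80000000u /* assumed "ready" flag */

/* Counts how many simulated I/O reads were issued. */
unsigned int io_reads;

/* Simulated control register read that never reports "ready". */
uint32_t fake_read_ctrl(void)
{
	io_reads++;
	return 0;
}

/* Pick the retry budget according to the bus type. */
unsigned int max_attempts_for(enum bus_type bus)
{
	return bus == BUS_USB ? MAX_ATTEMPTS_USB : MAX_ATTEMPTS_DEFAULT;
}

/*
 * Poll the register until the ready bit is set or the retry budget is
 * exhausted. Returns true if the ready bit was seen, false otherwise.
 */
bool poll_ready(enum bus_type bus)
{
	unsigned int max_attempts = max_attempts_for(bus);
	unsigned int retry = 0;
	uint32_t val = fake_read_ctrl();

	while (!(val & READY_BIT) && retry < max_attempts) {
		val = fake_read_ctrl();
		retry++;
	}

	return (val & READY_BIT) != 0;
}
```

With a read that never succeeds, poll_ready(BUS_USB) gives up after 11
simulated reads (one initial read plus 10 retries), while
poll_ready(BUS_PCI) issues 10001 before returning false.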

[0] Commit 15fffc6a56 ("driver core: Fix uevent_show() vs driver detach race")

[1] A note about that: this syzkaller report presents multiple reproducers
that differ by the type of emulated USB device. For this specific case,
check the entry from 2024/08/08 06:23 in the list of crashes; the C repro
is available at https://syzkaller.appspot.com/text?tag=ReproC&x=1521fc83980000.

Cc: stable@vger.kernel.org # v6.1+
Reported-by: syzbot+edd9fe0d3a65b14588d5@syzkaller.appspotmail.com
Tested-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20241101193412.1390391-1-gpiccoli@igalia.com
2024-11-06 14:32:59 +08:00
admtek wifi: mac80211: inform the low level if drv_stop() is a suspend 2024-06-26 10:25:46 +02:00
ath move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
atmel wifi: mac80211: inform the low level if drv_stop() is a suspend 2024-06-26 10:25:46 +02:00
broadcom wifi: brcmfmac: of: use devm_clk_get_optional_enabled_with_rate() 2024-10-17 19:49:59 +03:00
intel wifi: ipw: select CRYPTO_LIB_ARC4 2024-10-17 19:44:43 +03:00
intersil wifi: p54: Use IRQF_NO_AUTOEN flag in request_irq() 2024-09-18 16:54:30 +03:00
marvell wifi: mwifiex: Fix memcpy() field-spanning write warning in mwifiex_config_scan() 2024-10-17 19:50:17 +03:00
mediatek move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
microchip wifi: wilc1000: Set MAC after operation mode 2024-10-17 19:48:55 +03:00
purelifi move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
quantenna wifi: qtnfmac: don't include lib80211.h 2024-10-08 21:52:16 +02:00
ralink wifi: rt2x00: convert comma to semicolon 2024-10-17 19:46:20 +03:00
realtek wifi: rtlwifi: Drastically reduce the attempts to read efuse in case of failures 2024-11-06 14:32:59 +08:00
rsi wifi: rsi: Remove an unused field in struct rsi_debugfs 2024-09-09 15:30:49 +03:00
silabs wifi: wfx: repair open network AP mode 2024-08-27 10:49:26 +03:00
st wifi: cw1200: Remove unused cw1200_queue_requeue_all() 2024-10-17 19:50:41 +03:00
ti wifi: wl1251: Use IRQF_NO_AUTOEN flag in request_irq() 2024-09-18 16:54:30 +03:00
virtual wifi: mac80211: handle ieee80211_radar_detected() for MLO 2024-09-06 13:01:05 +02:00
zydas move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
Kconfig wifi: remove orphaned rndis_wlan driver 2023-10-30 19:30:33 +02:00
Makefile wifi: remove orphaned rndis_wlan driver 2023-10-30 19:30:33 +02:00