linux-stable/net/dccp
Valentin Schneider b334b924c9 net: tcp/dccp: prepare for tw_timer un-pinning
The TCP timewait timer is proving to be problematic for setups where
scheduler CPU isolation is achieved at runtime via cpusets (as opposed to
statically via isolcpus=domains).

What happens there is a CPU goes through tcp_time_wait(), arming the
time_wait timer, then gets isolated. TCP_TIMEWAIT_LEN later, the timer
fires, causing interference for the now-isolated CPU. This is conceptually
similar to the issue described in commit e02b931248 ("workqueue: Unbind
kworkers before sending them to exit()")

Move inet_twsk_schedule() to within inet_twsk_hashdance(), with the ehash
lock held. Expand the lock's critical section from inet_twsk_kill() to
inet_twsk_deschedule_put(), serializing the scheduling vs descheduling of
the timer. IOW, this prevents the following race:

			     tcp_time_wait()
			       inet_twsk_hashdance()
  inet_twsk_deschedule_put()
    del_timer_sync()
			       inet_twsk_schedule()

Thanks to Paolo Abeni for suggesting to leverage the ehash lock.

This also restores a comment from commit ec94c2696f ("tcp/dccp: avoid
one atomic operation for timewait hashdance") as inet_twsk_hashdance() had
a "Step 1" and "Step 3" comment, but the "Step 2" had gone missing.

inet_twsk_deschedule_put() now acquires the ehash spinlock to synchronize
with inet_twsk_hashdance_schedule().

To ease possible regression search, actual un-pin is done in next patch.

Link: https://lore.kernel.org/all/ZPhpfMjSiHVjQkTk@localhost.localdomain/
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Co-developed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-10 11:54:18 +01:00
..
ccids net: dccp: Fix ccid2_rtt_estimator() kernel-doc 2024-05-07 16:15:08 -07:00
ackvec.c net: dccp: Simplify the allocation of slab caches in dccp_ackvec_init 2024-02-02 12:19:26 +00:00
ackvec.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
ccid.c net: dccp: Add __printf() markup to fix -Wsuggest-attribute=format 2020-10-30 11:31:46 -07:00
ccid.h net: dccp: Replace zero-length array with flexible-array member 2020-02-28 12:08:37 -08:00
dccp.h net: ioctl: Use kernel memory on protocol ioctl callbacks 2023-06-15 22:33:26 -07:00
diag.c inet_diag: add module pointer to "struct inet_diag_handler" 2024-01-23 15:13:54 +01:00
feat.c dccp: Return the correct errno code 2021-02-06 11:15:28 -08:00
feat.h dccp: Remove unused declaration dccp_feat_initialise_sysctls() 2023-07-27 17:16:26 -07:00
input.c net: dccp: Convert to use the preferred fallthrough macro 2020-08-22 12:38:34 -07:00
ipv4.c rstreason: prepare for passive reset 2024-04-26 15:34:00 +02:00
ipv6.c rstreason: prepare for passive reset 2024-04-26 15:34:00 +02:00
ipv6.h ipv6: remove hard coded limitation on ipv6_pinfo 2023-07-24 09:39:31 +01:00
Kconfig dccp: Replace HTTP links with HTTPS ones 2020-07-13 11:54:07 -07:00
Makefile net: dccp: Remove dccpprobe module 2018-01-02 14:27:30 -05:00
minisocks.c net: tcp/dccp: prepare for tw_timer un-pinning 2024-06-10 11:54:18 +01:00
options.c net: dccp: Convert to use the preferred fallthrough macro 2020-08-22 12:38:34 -07:00
output.c net: add sk_wake_async_rcu() helper 2024-03-29 15:03:11 -07:00
proto.c dccp: annotate data-races in dccp_poll() 2023-08-18 19:30:24 -07:00
qpolicy.c net: dccp: Fix most of the kerneldoc warnings 2020-10-30 12:08:54 -07:00
sysctl.c net: Remove the now superfluous sentinel elements from ctl_table array 2024-05-03 13:29:41 +01:00
timer.c tcp: record last received ipv6 flowlabel 2023-10-10 10:02:59 +02:00
trace.h net: dccp: Use memset_startat() for TP zeroing 2021-11-19 11:22:49 +00:00