Filipe Manana 1020443840 btrfs: make the extent map shrinker run asynchronously as a work queue job
Currently the extent map shrinker is run synchronously for kswapd tasks
that end up calling the fs shrinker (fs/super.c:super_cache_scan()).
This has some disadvantages: for heavy workloads under memory pressure
it can cause delays and stalls that make a machine unresponsive for
some periods. This happens because:

1) We can have several kswapd tasks on machines with multiple NUMA zones,
   and running the extent map shrinker concurrently can cause high
   contention on some spin locks, namely the spin locks that protect
   the radix tree that tracks roots, the per root xarray that tracks
   open inodes and the list of delayed iputs. This not only delays the
   shrinker but also causes high CPU consumption and makes the task
   running the shrinker monopolize a core, resulting in the symptoms
   of an unresponsive system. This was noted in previous commits such as
   commit ae1e766f623f ("btrfs: only run the extent map shrinker from
   kswapd tasks");

2) The extent map shrinker's iteration over inodes can often be slow, even
   after changing the data structure that tracks open inodes for a root
   from a red black tree (up to kernel 6.10) to an xarray (kernel 6.10+).
   Although the transition to the xarray made things a bit faster, the
   iteration is still somewhat slow. For example, in a test scenario with
   10000 inodes that have no extent maps loaded, the extent map shrinker
   took between 5ms and 8ms, using a release, non-debug kernel. Iterating
   over the extent maps of an inode can also be slow if we have an inode
   with many thousands of extent maps, since we use a red black tree to
   track and search extent maps. So having the extent map shrinker run
   synchronously adds extra delay for other things a kswapd task does.

So make the extent map shrinker run asynchronously as a job for the
system unbounded workqueue, just like what we do for data and metadata
space reclaim jobs.
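
As a rough illustration of the pattern described above (and not the
actual btrfs code), the following userspace C sketch models a caller
that only records a request and queues a job, while the expensive scan
runs later in a worker context. All names here (work_item,
shrinker_state, request_shrink, run_pending_work) are hypothetical, the
single-slot queue stands in for a real workqueue, and the policy of
ignoring requests that arrive while a job is already pending is an
assumption made for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a work item; not the kernel's struct work_struct. */
struct work_item {
	void (*fn)(struct work_item *work);
	bool pending;
};

/* Hypothetical per-filesystem shrinker state. */
struct shrinker_state {
	struct work_item work;
	long nr_to_scan;     /* request recorded by the caller */
	long scanned_total;  /* work actually done by the worker */
	int runs;            /* number of times the worker ran */
};

/* Single-slot "workqueue": enough to show the deferral. */
static struct work_item *queue_slot;

/* Queue the work unless it is already pending. Returns true if queued. */
static bool queue_work_once(struct work_item *work)
{
	if (work->pending)
		return false;
	work->pending = true;
	queue_slot = work;
	return true;
}

/* Worker side: this is where the slow iteration would happen. */
static void shrinker_worker(struct work_item *work)
{
	/* Recover the enclosing state from the embedded work item. */
	struct shrinker_state *s = (struct shrinker_state *)
		((char *)work - offsetof(struct shrinker_state, work));

	s->scanned_total += s->nr_to_scan;
	s->nr_to_scan = 0;
	s->runs++;
}

static void shrinker_init(struct shrinker_state *s)
{
	s->work.fn = shrinker_worker;
	s->work.pending = false;
	s->nr_to_scan = 0;
	s->scanned_total = 0;
	s->runs = 0;
}

/*
 * Caller side (the role a kswapd task would play): record the request
 * and return immediately. Requests made while a job is pending are
 * ignored in this sketch.
 */
static void request_shrink(struct shrinker_state *s, long nr_to_scan)
{
	if (s->work.pending)
		return;
	s->nr_to_scan = nr_to_scan;
	queue_work_once(&s->work);
}

/* Stand-in for the workqueue thread picking up one queued job. */
static void run_pending_work(void)
{
	struct work_item *work = queue_slot;

	if (!work)
		return;
	assert(work->fn != NULL);
	queue_slot = NULL;
	work->pending = false;
	work->fn(work);
}
```

In this model the caller's cost is a flag check and a store, no matter
how long the worker's scan takes, and concurrent requests collapse into
a single pending job instead of several scans contending with each
other.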

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11 14:34:17 +01:00