linux-next

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git synced 2025-01-08 15:04:45 +00:00


Qi Zheng caa05325c9 mm: vmscan: make memcg slab shrink lockless

Like global slab shrink, this commit also uses SRCU to make memcg slab
shrink lockless.

We can reproduce the down_read_trylock() hotspot through the following
script:

```
DIR="/root/shrinker/memcg/mnt"

do_create()
{
    mkdir -p /sys/fs/cgroup/memory/test
    mkdir -p /sys/fs/cgroup/perf_event/test
    echo 4G > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    for i in `seq 0 $1`;
    do
        mkdir -p /sys/fs/cgroup/memory/test/$i;
        echo $$ > /sys/fs/cgroup/memory/test/$i/cgroup.procs;
        echo $$ > /sys/fs/cgroup/perf_event/test/cgroup.procs;
        mkdir -p $DIR/$i;
    done
}

do_mount()
{
    for i in `seq $1 $2`;
    do
        mount -t tmpfs $i $DIR/$i;
    done
}

do_touch()
{
    for i in `seq $1 $2`;
    do
        echo $$ > /sys/fs/cgroup/memory/test/$i/cgroup.procs;
        echo $$ > /sys/fs/cgroup/perf_event/test/cgroup.procs;
        dd if=/dev/zero of=$DIR/$i/file$i bs=1M count=1 &
    done
}

case "$1" in
  touch)
    do_touch $2 $3
    ;;
  test)
    do_create 4000
    do_mount 0 4000
    do_touch 0 3000
    ;;
  *)
    exit 1
    ;;
esac
```

Save the above script, then run test and touch commands. Then we can use
the following perf command to view hotspots:

    perf top -U -F 999

1) Before applying this patchset:

    32.31%  [kernel]  [k] down_read_trylock
    19.40%  [kernel]  [k] pv_native_safe_halt
    16.24%  [kernel]  [k] up_read
    15.70%  [kernel]  [k] shrink_slab
     4.69%  [kernel]  [k] _find_next_bit
     2.62%  [kernel]  [k] shrink_node
     1.78%  [kernel]  [k] shrink_lruvec
     0.76%  [kernel]  [k] do_shrink_slab

2) After applying this patchset:

    27.83%  [kernel]  [k] _find_next_bit
    16.97%  [kernel]  [k] shrink_slab
    15.82%  [kernel]  [k] pv_native_safe_halt
     9.58%  [kernel]  [k] shrink_node
     8.31%  [kernel]  [k] shrink_lruvec
     5.64%  [kernel]  [k] do_shrink_slab
     3.88%  [kernel]  [k] mem_cgroup_iter

At the same time, we use the following perf command to capture IPC
information:

    perf stat -e cycles,instructions -G test -a --repeat 5 -- sleep 10

1) Before applying this patchset:

    Performance counter stats for 'system wide' (5 runs):

        454187219766  cycles        test                          ( +-  1.84% )
         78896433101  instructions  test  # 0.17 insn per cycle   ( +-  0.44% )

        10.0020430 +- 0.0000366 seconds time elapsed  ( +-  0.00% )

2) After applying this patchset:

    Performance counter stats for 'system wide' (5 runs):

        841954709443  cycles        test                          ( +- 15.80% )  (98.69%)
        527258677936  instructions  test  # 0.63 insn per cycle   ( +- 15.11% )  (98.68%)

        10.01064 +- 0.00831 seconds time elapsed  ( +-  0.08% )

We can see that IPC drops very seriously when calling down_read_trylock()
at high frequency. After using SRCU, the IPC is at a normal level.

Link: https://lkml.kernel.org/r/20230313112819.38938-4-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Kirill Tkhai <tkhai@ya.ru>
Acked-by: Vlastimil Babka <Vbabka@suse.cz>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Christian König <christian.koenig@amd.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Sultan Alsawaf <sultan@kerneltoast.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

2023-03-28 16:20:16 -07:00
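
The "insn per cycle" values in the perf stat output quoted in the commit message are the instruction count divided by the cycle count. The following is a minimal sketch for checking that arithmetic, assuming a POSIX shell with awk available; the helper name `ipc` is hypothetical and is not part of the patch or of perf.

```shell
#!/bin/sh
# ipc INSTRUCTIONS CYCLES
# Hypothetical helper: print instructions per cycle, rounded to two decimals.
ipc() {
    awk -v insn="$1" -v cycles="$2" 'BEGIN { printf "%.2f\n", insn / cycles }'
}

# Counter values copied from the perf stat output in the commit message.
echo "before: $(ipc 78896433101 454187219766)"    # prints "before: 0.17"
echo "after:  $(ipc 527258677936 841954709443)"   # prints "after:  0.63"
```

Both results match the figures perf reported, so the quoted counters correspond to roughly a 3.6x increase in IPC for this workload.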
..
damon	mm/damon/paddr: fix folio_nr_pages() after folio_put() in damon_pa_mark_accessed_or_deactivate()	2023-03-07 17:04:55 -08:00
kasan	kasan: remove PG_skip_kasan_poison flag	2023-03-28 16:20:16 -07:00
kfence	mm: kfence: fix handling discontiguous page	2023-03-28 15:24:32 -07:00
kmsan	kmsan: add test_stackdepot_roundtrip	2023-03-28 16:20:14 -07:00
backing-dev.c	mm: add /sys/class/bdi/<bdi>/min_ratio_fine knob	2022-11-30 15:59:06 -08:00
balloon_compaction.c	mm: Convert all PageMovable users to movable_operations	2022-08-02 12:34:03 -04:00
bootmem_info.c	bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem	2022-08-28 14:02:45 -07:00
cma_debug.c	mm/cma_debug: show complete cma name in debugfs directories	2022-09-11 20:25:50 -07:00
cma_sysfs.c	mm: cma: make kobj_type structure constant	2023-03-28 16:20:06 -07:00
cma.c	mm/cma: fix potential memory loss on cma_declare_contiguous_nid	2023-02-02 22:33:24 -08:00
cma.h	mm/cma: provide option to opt out from exposing pages on activation failure	2022-03-22 15:57:09 -07:00
compaction.c	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add	2023-02-23 17:09:35 -08:00
debug_page_ref.c
debug_vm_pgtable.c	mm: prefer xxx_page() alloc/free functions for order-0 pages	2023-03-28 16:20:16 -07:00
debug.c	mm/debug: use %pGt to display page_type in dump_page()	2023-03-28 16:20:09 -07:00
dmapool.c	mm/dmapool.c: revert "make dma pool to use kmalloc_node"	2022-01-15 16:30:28 +02:00
early_ioremap.c	mm/early_ioremap: declare early_memremap_pgprot_adjust()	2022-03-22 15:57:11 -07:00
fadvise.c	mm: support POSIX_FADV_NOREUSE	2023-01-18 17:12:57 -08:00
failslab.c	mm: fix unexpected changes to {failslab\|fail_page_alloc}.attr	2022-11-22 18:50:44 -08:00
filemap.c	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add	2023-02-23 17:09:35 -08:00
folio-compat.c	mm: change to return bool for isolate_lru_page()	2023-02-20 12:46:17 -08:00
frontswap.c	frontswap: don't call ->init if no ops are registered	2022-09-26 12:14:34 -07:00
gup_test.c	mm/gup_test: free memory allocated via kvcalloc() using kvfree()	2022-12-15 16:37:48 -08:00
gup_test.h	mm/gup_test: start/stop/read functionality for PIN LONGTERM test	2022-11-08 17:37:15 -08:00
gup.c	mm/gup.c: fix typo in comments	2023-03-28 16:20:14 -07:00
highmem.c	highmem: fix kmap_to_page() for kmap_local_page() addresses	2022-10-12 18:51:51 -07:00
hmm.c	mm/hugetlb: make walk_hugetlb_range() safe to pmd unshare	2023-01-18 17:12:39 -08:00
huge_memory.c	mm: huge_memory: convert __do_huge_pmd_anonymous_page() to use a folio	2023-03-28 16:20:09 -07:00
hugetlb_cgroup.c	mm/hugetlb: increase use of folios in alloc_huge_page()	2023-02-13 15:54:27 -08:00
hugetlb_vmemmap.c	mm: prefer xxx_page() alloc/free functions for order-0 pages	2023-03-28 16:20:16 -07:00
hugetlb_vmemmap.h	mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability	2022-08-08 18:06:43 -07:00
hugetlb.c	mm: hugetlb: change to return bool for isolate_hugetlb()	2023-02-20 12:46:17 -08:00
hwpoison-inject.c	mm/hwpoison: add __init/__exit annotations to module init/exit funcs	2022-10-03 14:03:05 -07:00
init-mm.c	mm: remove rb tree.	2022-09-26 19:46:16 -07:00
internal.h	mm, printk: introduce new format %pGt for page_type	2023-03-28 16:20:09 -07:00
interval_tree.c	mm/interval_tree: add comments to improve code readability	2021-04-30 11:20:38 -07:00
io-mapping.c	mm: add a io_mapping_map_user helper	2021-04-30 11:20:39 -07:00
ioremap.c	mm: ioremap: Add ioremap/iounmap_allowed()	2022-06-27 12:22:31 +01:00
Kconfig	zsmalloc: set default zspage chain size to 8	2023-02-02 22:33:23 -08:00
Kconfig.debug	mm: move KMEMLEAK's Kconfig items from lib to mm	2023-02-02 22:33:26 -08:00
khugepaged.c	mm/khugepaged: cleanup memcg uncharge for failure path	2023-03-28 16:20:11 -07:00
kmemleak.c	lib/stackdepot, mm: rename stack_depot_want_early_init	2023-02-16 20:43:49 -08:00
ksm.c	mm: add tracepoints to ksm	2023-03-28 16:20:08 -07:00
list_lru.c	mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe	2022-06-16 19:48:31 -07:00
maccess.c	maccess: Fix writing offset in case of fault in strncpy_from_kernel_nofault()	2022-11-11 11:44:46 -08:00
madvise.c	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add	2023-02-23 17:09:35 -08:00
Makefile	mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol	2022-10-03 14:03:36 -07:00
mapping_dirty_helpers.c	mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export	2023-02-02 22:32:54 -08:00
memblock.c	memblock: small optimizations	2023-02-27 09:34:53 -08:00
memcontrol.c	mm, memcg: Prevent memory.soft_limit_in_bytes load/store tearing	2023-03-28 16:20:13 -07:00
memfd.c	mm/memfd: add write seals when apply SEAL_EXEC to executable memfd	2023-01-18 17:12:37 -08:00
memory_hotplug.c	mm/memory_hotplug: cleanup return value handing in do_migrate_range()	2023-02-20 12:46:18 -08:00
memory-failure.c	mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON	2023-02-27 17:00:14 -08:00
memory-tiers.c	memory tier: release the new_memtier in find_create_memory_tier()	2023-02-09 16:51:40 -08:00
memory.c	mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()	2023-03-28 16:20:12 -07:00
mempolicy.c	mm: hugetlb: change to return bool for isolate_hugetlb()	2023-02-20 12:46:17 -08:00
mempool.c	mempool: do not use ksize() for poisoning	2022-11-30 15:58:41 -08:00
memremap.c	mm/memremap.c: fix outdated comment in devm_memremap_pages	2023-02-09 16:51:46 -08:00
memtest.c
migrate_device.c	mm: change to return bool for isolate_lru_page()	2023-02-20 12:46:17 -08:00
migrate.c	mm/migrate: drop pte_mkhuge() in remove_migration_pte()	2023-03-28 16:20:11 -07:00
mincore.c	mm: teach mincore_hugetlb about pte markers	2023-03-07 17:04:53 -08:00
mlock.c	mm: introduce vm_flags_reset_once to replace WRITE_ONCE vm_flags updates	2023-02-09 16:51:41 -08:00
mm_init.c	memory: move hotplug memory notifier priority to same file for easy sorting	2022-11-08 17:37:17 -08:00
mm_slot.h	mm: introduce common struct mm_slot	2022-10-03 14:02:43 -07:00
mmap_lock.c	mm: mmap_lock: fix disabling preemption directly	2021-07-23 17:43:28 -07:00
mmap.c	mm: deduplicate error handling for map_deny_write_exec	2023-03-23 17:18:32 -07:00
mmu_gather.c	mm: prefer xxx_page() alloc/free functions for order-0 pages	2023-03-28 16:20:16 -07:00
mmu_notifier.c	mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export	2023-02-02 22:32:54 -08:00
mmzone.c	mm: multi-gen LRU: groundwork	2022-09-26 19:46:09 -07:00
mprotect.c	mm: fix error handling for map_deny_write_exec	2023-03-23 17:18:33 -07:00
mremap.c	x86/mm/pat: clear VM_PAT if copy_p4d_range failed	2023-03-28 16:20:07 -07:00
msync.c	mm/msync: use vma_find() instead of vma linked list	2022-09-26 19:46:25 -07:00
nommu.c	mm: replace vma->vm_flags direct modifications with modifier calls	2023-02-09 16:51:39 -08:00
oom_kill.c	mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export	2023-02-02 22:32:54 -08:00
page_alloc.c	mm: prefer xxx_page() alloc/free functions for order-0 pages	2023-03-28 16:20:16 -07:00
page_counter.c	mm: page_counter: remove unneeded atomic ops for low/min	2022-09-11 20:26:01 -07:00
page_ext.c	mm/page_ext: init page_ext early if there are no deferred struct pages	2023-02-02 22:33:22 -08:00
page_idle.c	mm: page_idle: convert page idle to use a folio	2023-01-18 17:12:52 -08:00
page_io.c	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add	2023-02-23 17:09:35 -08:00
page_isolation.c	mm/page_isolation: fix clang deadcode warning	2022-10-28 13:37:22 -07:00
page_owner.c	lib/stackdepot, mm: rename stack_depot_want_early_init	2023-02-16 20:43:49 -08:00
page_poison.c	mm: page_poison: print page info when corruption is caught	2021-04-30 11:20:36 -07:00
page_reporting.c	mm/page_reporting: replace rcu_access_pointer() with rcu_dereference_protected()	2023-01-18 17:12:50 -08:00
page_reporting.h	mm/page_reporting: export reporting order as module parameter	2021-06-29 10:53:47 -07:00
page_table_check.c	mm/page_ext: do not allocate space for page_ext->flags if not needed	2023-02-02 22:33:11 -08:00
page_vma_mapped.c	mm/hugetlb: introduce hugetlb_walk()	2023-01-18 17:12:39 -08:00
page-writeback.c	mm,jfs: move write_one_page/folio_write_one to jfs	2023-03-28 16:20:14 -07:00
pagewalk.c	mm/hugetlb: introduce hugetlb_walk()	2023-01-18 17:12:39 -08:00
percpu-internal.h	mm: percpu: fix incorrect size in pcpu_obj_full_size()	2023-02-16 20:43:55 -08:00
percpu-km.c	percpu: flush tlb in pcpu_reclaim_populated()	2021-07-04 18:30:17 +00:00
percpu-stats.c	mm: use vmalloc_array and vcalloc for array allocations	2022-03-08 09:30:46 -05:00
percpu-vm.c	percpu: flush tlb in pcpu_reclaim_populated()	2021-07-04 18:30:17 +00:00
percpu.c	mm: memcontrol: rename memcg_kmem_enabled()	2023-02-16 20:43:56 -08:00
pgalloc-track.h	mm: fix typos in comments	2021-05-07 00:26:35 -07:00
pgtable-generic.c	mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()	2023-03-28 16:20:12 -07:00
process_vm_access.c	use less confusing names for iov_iter direction initializers	2022-11-25 13:01:55 -05:00
ptdump.c	mm: pagewalk: Fix race between unmap and page walker	2022-09-03 10:13:13 -07:00
readahead.c	readahead: convert readahead_expand() to use a folio	2023-02-02 22:33:21 -08:00
rmap.c	mm/rmap: use atomic_try_cmpxchg in set_tlb_ubc_flush_pending	2023-03-28 16:20:09 -07:00
rodata_test.c	mm/rodata_test: use PAGE_ALIGNED() helper	2022-10-03 14:03:05 -07:00
secretmem.c	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add	2023-02-23 17:09:35 -08:00
shmem.c	shmem: add support to ignore swap	2023-03-28 16:20:15 -07:00
shrinker_debug.c	mm: shrinkers: fix deadlock in shrinker debugfs	2023-02-09 15:56:51 -08:00
shuffle.c	mm/shuffle: convert module_param_call to module_param_cb	2022-10-03 14:03:07 -07:00
shuffle.h	mm/shuffle: fix section mismatch warning	2021-05-22 15:09:07 -10:00
slab_common.c	mm/kasan: simplify and refine kasan_cache code	2023-01-18 17:12:55 -08:00
slab.c	slab fix for 6.3-rc4	2023-03-24 10:12:14 -07:00
slab.h	mm: memcontrol: rename memcg_kmem_enabled()	2023-02-16 20:43:56 -08:00
slob.c	Merge branch 'slab/for-6.1/kmalloc_size_roundup' into slab/for-next	2022-09-29 11:30:55 +02:00
slub.c	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add	2023-02-23 17:09:35 -08:00
sparse-vmemmap.c	mm/sparse-vmemmap: generalise vmemmap_populate_hugepages()	2022-12-11 18:12:12 -08:00
sparse.c	mm/sparse: fix "unused function 'pgdat_to_phys'" warning	2023-02-02 22:33:29 -08:00
swap_cgroup.c	mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled	2022-10-03 14:03:36 -07:00
swap_slots.c	mm/swap: convert put_swap_page() to put_swap_folio()	2022-10-03 14:02:46 -07:00
swap_state.c	swap_state: update shadow_nodes for anonymous page	2023-02-02 22:33:24 -08:00
swap.c	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add	2023-02-23 17:09:35 -08:00
swap.h	mm: remove the __swap_writepage return value	2023-02-02 22:33:33 -08:00
swapfile.c	mm: swap: remove unneeded cgroup_throttle_swaprate()	2023-03-28 16:20:10 -07:00
truncate.c	folio-compat: remove lru_cache_add()	2022-12-11 18:12:13 -08:00
usercopy.c	mm: use kstrtobool() instead of strtobool()	2022-11-30 15:58:45 -08:00
userfaultfd.c	mm/userfaultfd: support WP on multiple VMAs	2023-03-28 16:20:07 -07:00
util.c	mm: fix typo in __vm_enough_memory warning	2023-02-13 15:54:33 -08:00
vmalloc.c	mm: prefer xxx_page() alloc/free functions for order-0 pages	2023-03-28 16:20:16 -07:00
vmpressure.c	mm/vmpressure: fix data-race with memcg->socket_pressure	2021-11-06 13:30:40 -07:00
vmscan.c	mm: vmscan: make memcg slab shrink lockless	2023-03-28 16:20:16 -07:00
vmstat.c	mm: vmscan: split khugepaged stats from direct reclaim stats	2022-11-30 15:58:41 -08:00
workingset.c	swap_state: update shadow_nodes for anonymous page	2023-02-02 22:33:24 -08:00
z3fold.c	mm: remove PageMovable export	2023-01-18 17:12:57 -08:00
zbud.c	zpool: clean out dead code	2022-12-11 18:12:10 -08:00
zpool.c	zpool: clean out dead code	2022-12-11 18:12:10 -08:00
zsmalloc.c	zsmalloc: show per fullness group class stats	2023-03-28 16:20:12 -07:00
zswap.c	mm/zswap: try to avoid worst-case scenario on same element pages	2023-03-28 16:20:07 -07:00