linux-stable/mm
Roman Gushchin 2521664c1f mm: page_alloc: move mlocked flag clearance into free_pages_prepare()
commit 66edc3a589 upstream.

Syzbot reported a bad page state problem caused by a page being freed
using free_page() still having a mlocked flag at free_pages_prepare()
stage:

  BUG: Bad page state in process syz.5.504  pfn:61f45
  page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x61f45
  flags: 0xfff00000080204(referenced|workingset|mlocked|node=0|zone=1|lastcpupid=0x7ff)
  raw: 00fff00000080204 0000000000000000 dead000000000122 0000000000000000
  raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
  page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
  page_owner tracks the page as allocated
  page last allocated via order 0, migratetype Unmovable, gfp_mask 0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), pid 8443, tgid 8442 (syz.5.504), ts 201884660643, free_ts 201499827394
   set_page_owner include/linux/page_owner.h:32 [inline]
   post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1537
   prep_new_page mm/page_alloc.c:1545 [inline]
   get_page_from_freelist+0x303f/0x3190 mm/page_alloc.c:3457
   __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4733
   alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265
   kvm_coalesced_mmio_init+0x1f/0xf0 virt/kvm/coalesced_mmio.c:99
   kvm_create_vm virt/kvm/kvm_main.c:1235 [inline]
   kvm_dev_ioctl_create_vm virt/kvm/kvm_main.c:5488 [inline]
   kvm_dev_ioctl+0x12dc/0x2240 virt/kvm/kvm_main.c:5530
   __do_compat_sys_ioctl fs/ioctl.c:1007 [inline]
   __se_compat_sys_ioctl+0x510/0xc90 fs/ioctl.c:950
   do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
   __do_fast_syscall_32+0xb4/0x110 arch/x86/entry/common.c:386
   do_fast_syscall_32+0x34/0x80 arch/x86/entry/common.c:411
   entry_SYSENTER_compat_after_hwframe+0x84/0x8e
  page last free pid 8399 tgid 8399 stack trace:
   reset_page_owner include/linux/page_owner.h:25 [inline]
   free_pages_prepare mm/page_alloc.c:1108 [inline]
   free_unref_folios+0xf12/0x18d0 mm/page_alloc.c:2686
   folios_put_refs+0x76c/0x860 mm/swap.c:1007
   free_pages_and_swap_cache+0x5c8/0x690 mm/swap_state.c:335
   __tlb_batch_free_encoded_pages mm/mmu_gather.c:136 [inline]
   tlb_batch_pages_flush mm/mmu_gather.c:149 [inline]
   tlb_flush_mmu_free mm/mmu_gather.c:366 [inline]
   tlb_flush_mmu+0x3a3/0x680 mm/mmu_gather.c:373
   tlb_finish_mmu+0xd4/0x200 mm/mmu_gather.c:465
   exit_mmap+0x496/0xc40 mm/mmap.c:1926
   __mmput+0x115/0x390 kernel/fork.c:1348
   exit_mm+0x220/0x310 kernel/exit.c:571
   do_exit+0x9b2/0x28e0 kernel/exit.c:926
   do_group_exit+0x207/0x2c0 kernel/exit.c:1088
   __do_sys_exit_group kernel/exit.c:1099 [inline]
   __se_sys_exit_group kernel/exit.c:1097 [inline]
   __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1097
   x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232
   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
   do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  Modules linked in:
  CPU: 0 UID: 0 PID: 8442 Comm: syz.5.504 Not tainted 6.12.0-rc6-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
  Call Trace:
   <TASK>
   __dump_stack lib/dump_stack.c:94 [inline]
   dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
   bad_page+0x176/0x1d0 mm/page_alloc.c:501
   free_page_is_bad mm/page_alloc.c:918 [inline]
   free_pages_prepare mm/page_alloc.c:1100 [inline]
   free_unref_page+0xed0/0xf20 mm/page_alloc.c:2638
   kvm_destroy_vm virt/kvm/kvm_main.c:1327 [inline]
   kvm_put_kvm+0xc75/0x1350 virt/kvm/kvm_main.c:1386
   kvm_vcpu_release+0x54/0x60 virt/kvm/kvm_main.c:4143
   __fput+0x23f/0x880 fs/file_table.c:431
   task_work_run+0x24f/0x310 kernel/task_work.c:239
   exit_task_work include/linux/task_work.h:43 [inline]
   do_exit+0xa2f/0x28e0 kernel/exit.c:939
   do_group_exit+0x207/0x2c0 kernel/exit.c:1088
   __do_sys_exit_group kernel/exit.c:1099 [inline]
   __se_sys_exit_group kernel/exit.c:1097 [inline]
   __ia32_sys_exit_group+0x3f/0x40 kernel/exit.c:1097
   ia32_sys_call+0x2624/0x2630 arch/x86/include/generated/asm/syscalls_32.h:253
   do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
   __do_fast_syscall_32+0xb4/0x110 arch/x86/entry/common.c:386
   do_fast_syscall_32+0x34/0x80 arch/x86/entry/common.c:411
   entry_SYSENTER_compat_after_hwframe+0x84/0x8e
  RIP: 0023:0xf745d579
  Code: Unable to access opcode bytes at 0xf745d54f.
  RSP: 002b:00000000f75afd6c EFLAGS: 00000206 ORIG_RAX: 00000000000000fc
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 00000000ffffff9c RDI: 00000000f744cff4
  RBP: 00000000f717ae61 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
   </TASK>

The problem was originally introduced by commit b109b87050 ("mm/munlock:
replace clear_page_mlock() by final clearance"): it was focused on
handling pagecache and anonymous memory and wasn't suitable for lower
level get_page()/free_page() API's used for example by KVM, as with this
reproducer.

Fix it by moving the mlocked flag clearance down to free_page_prepare().

The bug itself if fairly old and harmless (aside from generating these
warnings), aside from a small memory leak - "bad" pages are stopped from
being allocated again.

Link: https://lkml.kernel.org/r/20241106195354.270757-1-roman.gushchin@linux.dev
Fixes: b109b87050 ("mm/munlock: replace clear_page_mlock() by final clearance")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reported-by: syzbot+e985d3026c4fd041578e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6729f475.050a0220.701a.0019.GAE@google.com
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14 19:54:31 +01:00
..
damon mm/damon/vaddr: protect vma traversal in __damon_va_thre_regions() with rcu read lock 2024-10-17 15:21:27 +02:00
kasan kasan: remove vmalloc_percpu test 2024-11-08 16:26:46 +01:00
kfence mm,kfence: decouple kfence from page granularity mapping judgement 2023-12-03 07:32:08 +01:00
kmsan kmsan: do not wipe out origin when doing partial unpoisoning 2024-06-16 13:41:38 +02:00
backing-dev.c writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs 2023-04-26 14:28:39 +02:00
balloon_compaction.c mm: Convert all PageMovable users to movable_operations 2022-08-02 12:34:03 -04:00
bootmem_info.c bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem 2022-08-28 14:02:45 -07:00
cma_debug.c mm/cma_debug: show complete cma name in debugfs directories 2022-09-11 20:25:50 -07:00
cma_sysfs.c mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
cma.c mm/cma: drop incorrect alignment check in cma_init_reserved_mem 2024-06-16 13:41:39 +02:00
cma.h mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
compaction.c mm, vmscan: prevent infinite loop for costly GFP_NOIO | __GFP_RETRY_MAYFAIL allocations 2024-04-03 15:19:42 +02:00
debug_page_ref.c
debug_vm_pgtable.c docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
debug.c mm: remove the vma linked list 2022-09-26 19:46:26 -07:00
dmapool.c mm/dmapool.c: revert "make dma pool to use kmalloc_node" 2022-01-15 16:30:28 +02:00
early_ioremap.c mm/early_ioremap: declare early_memremap_pgprot_adjust() 2022-03-22 15:57:11 -07:00
fadvise.c riscv: compat: syscall: Add compat_sys_call_table implementation 2022-04-26 13:36:25 -07:00
failslab.c mm: fix unexpected changes to {failslab|fail_page_alloc}.attr 2022-11-22 18:50:44 -08:00
filemap.c filemap: Fix bounds checking in filemap_read() 2024-11-14 13:15:17 +01:00
folio-compat.c mm: remove try_to_free_swap() 2022-10-03 14:02:53 -07:00
frontswap.c frontswap: don't call ->init if no ops are registered 2022-09-26 12:14:34 -07:00
gup_test.c mm: rename is_pinnable_page() to is_longterm_pinnable_page() 2022-07-17 17:14:27 -07:00
gup_test.h selftests/vm: gup_test: fix test flag 2021-05-05 11:27:26 -07:00
gup.c mm: always expand the stack with the mmap write lock held 2023-07-01 13:16:25 +02:00
highmem.c highmem: fix kmap_to_page() for kmap_local_page() addresses 2022-10-12 18:51:51 -07:00
hmm.c mm/swap: add swp_offset_pfn() to fetch PFN from swap entry 2022-09-26 19:46:05 -07:00
huge_memory.c mm: migrate: try again if THP split is failed due to page refcnt 2024-11-08 16:26:47 +01:00
hugetlb_cgroup.c mm/hugetlb_cgroup: convert hugetlb_cgroup_uncharge_page() to folios 2024-05-17 11:55:52 +02:00
hugetlb_vmemmap.c mm: hugetlb_vmemmap: fix a race between vmemmap pmd split 2023-09-19 12:27:56 +02:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability 2022-08-08 18:06:43 -07:00
hugetlb.c mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio() 2024-08-14 13:53:02 +02:00
hwpoison-inject.c mm/hwpoison: add __init/__exit annotations to module init/exit funcs 2022-10-03 14:03:05 -07:00
init-mm.c mm: remove rb tree. 2022-09-26 19:46:16 -07:00
internal.h mm: unconditionally close VMAs on error 2024-11-22 15:37:34 +01:00
interval_tree.c mm/interval_tree: add comments to improve code readability 2021-04-30 11:20:38 -07:00
io-mapping.c mm: add a io_mapping_map_user helper 2021-04-30 11:20:39 -07:00
ioremap.c mm: ioremap: Add ioremap/iounmap_allowed() 2022-06-27 12:22:31 +01:00
Kconfig mm: z3fold: deprecate CONFIG_Z3FOLD 2024-10-17 15:22:05 +02:00
Kconfig.debug mm: page_table_check: Make it dependent on EXCLUSIVE_SYSTEM_RAM 2023-06-14 11:15:29 +02:00
khugepaged.c mm: khugepaged: fix kernel BUG in hpage_collapse_scan_file() 2024-08-29 17:30:17 +02:00
kmemleak.c mm/kmemleak: prevent soft lockup in kmemleak_scan()'s object iteration loops 2022-10-28 13:37:22 -07:00
ksm.c mm/ksm: fix race with VMA iteration and mm_struct teardown 2023-03-30 12:49:29 +02:00
list_lru.c mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe 2022-06-16 19:48:31 -07:00
maccess.c mm: Fix copy_from_user_nofault(). 2023-06-28 11:12:17 +02:00
madvise.c madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check 2023-08-30 16:11:11 +02:00
Makefile mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol 2022-10-03 14:03:36 -07:00
mapping_dirty_helpers.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
memblock.c x86/numa: Fix the address overlap check in numa_fill_memblks() 2024-03-01 13:26:36 +01:00
memcontrol.c memcg: protect concurrent access to mem_cgroup_idr 2024-09-12 11:10:29 +02:00
memfd.c memfd: check for non-NULL file_seals in memfd_create() syscall 2023-06-28 11:12:27 +02:00
memory_hotplug.c x86/kaslr: Expose and use the end of the physical memory address space 2024-09-12 11:10:17 +02:00
memory-failure.c mm/memory-failure: use raw_spinlock_t in struct memory_failure_cpu 2024-08-29 17:30:15 +02:00
memory-tiers.c memory tier: release the new_memtier in find_create_memory_tier() 2023-03-10 09:34:27 +01:00
memory.c mm: avoid leaving partial pfn mappings around in error case 2024-09-18 19:23:04 +02:00
mempolicy.c mm/numa_balancing: teach mpol_to_str about the balancing mode 2024-08-03 08:49:40 +02:00
mempool.c mm/mempool: use might_alloc() 2022-06-16 19:48:30 -07:00
memremap.c mm/memremap.c: map FS_DAX device memory as decrypted 2022-11-08 15:57:23 -08:00
memtest.c memtest: use {READ,WRITE}_ONCE in memory scanning 2024-04-03 15:19:36 +02:00
migrate_device.c mm/migrate_device: return number of migrating pages in args->cpages 2022-11-22 18:50:43 -08:00
migrate.c migrate_pages_batch: fix statistics for longterm pin retry 2024-11-08 16:26:48 +01:00
mincore.c mm: teach mincore_hugetlb about pte markers 2023-03-22 13:34:03 +01:00
mlock.c mm/mlock: drop dead code in count_mm_mlocked_page_nr() 2022-09-26 19:46:27 -07:00
mm_init.c mm: multi-gen LRU: groundwork 2022-09-26 19:46:09 -07:00
mm_slot.h mm: introduce common struct mm_slot 2022-10-03 14:02:43 -07:00
mmap_lock.c mm: mmap_lock: replace get_memcg_path_buf() with on-stack buffer 2024-08-03 08:49:30 +02:00
mmap.c mm: resolve faulty mmap_region() error path behaviour 2024-11-22 15:37:34 +01:00
mmu_gather.c mm/khugepaged: fix GUP-fast interaction by sending IPI 2022-11-30 14:49:42 -08:00
mmu_notifier.c mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove() 2022-04-21 20:01:10 -07:00
mmzone.c mm: multi-gen LRU: groundwork 2022-09-26 19:46:09 -07:00
mprotect.c mm/uffd: fix warning without PTE_MARKER_UFFD_WP compiled in 2022-10-12 15:56:46 -07:00
mremap.c mm, mremap: fix mremap() expanding for vma's with vm_ops->close() 2023-02-09 11:28:22 +01:00
msync.c mm/msync: use vma_find() instead of vma linked list 2022-09-26 19:46:25 -07:00
nommu.c mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling 2024-11-22 15:37:34 +01:00
oom_kill.c mm: reduce noise in show_mem for lowmem allocations 2022-09-26 19:46:29 -07:00
page_alloc.c mm: page_alloc: move mlocked flag clearance into free_pages_prepare() 2024-12-14 19:54:31 +01:00
page_counter.c mm: page_counter: remove unneeded atomic ops for low/min 2022-09-11 20:26:01 -07:00
page_ext.c mm/page_exit: fix kernel doc warning in page_ext_put() 2022-11-22 18:50:41 -08:00
page_idle.c mm: don't be stuck to rmap lock on reclaim path 2022-05-19 14:08:54 -07:00
page_io.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
page_isolation.c mm/page_isolation: fix clang deadcode warning 2022-10-28 13:37:22 -07:00
page_owner.c mm: reuse pageblock_start/end_pfn() macro 2022-10-03 14:03:03 -07:00
page_poison.c mm: page_poison: print page info when corruption is caught 2021-04-30 11:20:36 -07:00
page_reporting.c mm/page_reporting: allow driver to specify reporting order 2021-06-29 10:53:47 -07:00
page_reporting.h mm/page_reporting: export reporting order as module parameter 2021-06-29 10:53:47 -07:00
page_table_check.c mm/page_table_check: fix crash on ZONE_DEVICE 2024-06-27 13:46:22 +02:00
page_vma_mapped.c mm/swap: add swp_offset_pfn() to fetch PFN from swap entry 2022-09-26 19:46:05 -07:00
page-writeback.c Revert "mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again" 2024-07-11 12:47:14 +02:00
pagewalk.c - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
percpu-internal.h percpu: improve percpu_alloc_percpu event trace 2022-05-13 07:20:18 -07:00
percpu-km.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu-stats.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
percpu-vm.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu.c mm: percpu: use kmemleak_ignore_phys() instead of kmemleak_free() 2022-07-17 17:14:47 -07:00
pgalloc-track.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgtable-generic.c mm: fix race between __split_huge_pmd_locked() and GUP-fast 2024-06-16 13:41:38 +02:00
process_vm_access.c use less confusing names for iov_iter direction initializers 2023-02-09 11:28:04 +01:00
ptdump.c mm: pagewalk: Fix race between unmap and page walker 2022-09-03 10:13:13 -07:00
readahead.c mm: use memalloc_nofs_save() in page_cache_ra_order() 2024-05-17 11:56:21 +02:00
rmap.c mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON 2023-03-10 09:34:25 +01:00
rodata_test.c mm/rodata_test: use PAGE_ALIGNED() helper 2022-10-03 14:03:05 -07:00
secretmem.c secretmem: disable memfd_secret() if arch cannot set direct map 2024-10-17 15:22:28 +02:00
shmem.c mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling 2024-11-22 15:37:34 +01:00
shrinker_debug.c mm: shrinkers: fix deadlock in shrinker debugfs 2023-02-22 12:59:46 +01:00
shuffle.c mm/shuffle: convert module_param_call to module_param_cb 2022-10-03 14:03:07 -07:00
shuffle.h mm/shuffle: fix section mismatch warning 2021-05-22 15:09:07 -10:00
slab_common.c mm: krealloc: Fix MTE false alarm in __do_krealloc 2024-11-17 15:07:22 +01:00
slab.c mm/slab: Fix undefined init_cache_node_node() for NUMA and !SMP 2023-03-30 12:49:23 +02:00
slab.h - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
slob.c Merge branch 'slab/for-6.1/kmalloc_size_roundup' into slab/for-next 2022-09-29 11:30:55 +02:00
slub.c treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
sparse-vmemmap.c mm: hugetlb_vmemmap: move vmemmap code related to HugeTLB to hugetlb_vmemmap.c 2022-08-08 18:06:42 -07:00
sparse.c x86/kaslr: Expose and use the end of the physical memory address space 2024-09-12 11:10:17 +02:00
swap_cgroup.c mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled 2022-10-03 14:03:36 -07:00
swap_slots.c mm/swap: convert put_swap_page() to put_swap_folio() 2022-10-03 14:02:46 -07:00
swap_state.c swap_state: convert free_swap_cache() to use a folio 2022-10-03 14:02:51 -07:00
swap.c mm: page_alloc: move mlocked flag clearance into free_pages_prepare() 2024-12-14 19:54:31 +01:00
swap.h mm/swap: fix race when skipping swapcache 2024-03-01 13:26:32 +01:00
swapfile.c mm/swapfile: skip HugeTLB pages for unuse_vma 2024-10-22 15:56:43 +02:00
truncate.c mm: Fix missing folio invalidation calls during truncation 2024-09-04 13:25:00 +02:00
usercopy.c mm: Fix copy_from_user_nofault(). 2023-06-28 11:12:17 +02:00
userfaultfd.c userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb 2024-02-23 09:12:51 +01:00
util.c mm: unconditionally close VMAs on error 2024-11-22 15:37:34 +01:00
vmalloc.c mm/vmalloc: fix page mapping if vm_area_alloc_pages() with high order fallback to order 0 2024-08-29 17:30:52 +02:00
vmpressure.c net-memcg: Fix scope of sockmem pressure indicators 2023-09-13 09:42:33 +02:00
vmscan.c mm/mglru: fix div-by-zero in vmpressure_calc_level() 2024-08-03 08:49:30 +02:00
vmstat.c vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event 2024-12-14 19:54:13 +01:00
workingset.c mm/mglru: fix underprotected page cache 2023-12-20 17:00:26 +01:00
z3fold.c mm: Convert all PageMovable users to movable_operations 2022-08-02 12:34:03 -04:00
zbud.c mm/zbud: add kerneldoc fields for zbud_pool 2021-07-01 11:06:03 -07:00
zpool.c zpool: remove the list of pools_head 2022-01-15 16:30:31 +02:00
zsmalloc.c zsmalloc: allow only one active pool compaction context 2023-08-23 17:52:40 +02:00
zswap.c mm: zswap: fix missing folio cleanup in writeback race path 2024-03-01 13:26:39 +01:00