linux-next/Documentation/mm
Ryan Roberts 3a5a8d343e mm: fix race between __split_huge_pmd_locked() and GUP-fast
__split_huge_pmd_locked() can be called for a present THP, devmap or
(non-present) migration entry.  It calls pmdp_invalidate() unconditionally
on the pmdp and only determines if it is present or not based on the
returned old pmd.  This is a problem for the migration entry case because
pmd_mkinvalid(), called by pmdp_invalidate(), must only be called for a
present pmd.

On arm64 at least, pmd_mkinvalid() will mark the pmd such that any future
call to pmd_present() will return true.  And therefore any lockless
pgtable walker could see the migration entry pmd in this state and start
interpreting the fields as if it were present, leading to BadThings (TM).
GUP-fast appears to be one such lockless pgtable walker.
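
To make the failure mode concrete, here is a minimal userspace sketch of
how an "invalidate" helper can turn a non-present entry into one that looks
present.  It is a toy model only: the toy_* names, the bit positions and
the swap-offset shift are hypothetical and are not arm64's real encoding.
It assumes just what the description above states, namely that the
invalidated state is itself reported as present.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: all names and bit layouts here are illustrative assumptions. */
typedef uint64_t toy_pmd_t;

#define TOY_PMD_VALID           (1ULL << 0) /* entry is mapped by hardware */
#define TOY_PMD_PRESENT_INVALID (1ULL << 1) /* software: present, but invalidated */
#define TOY_SWP_OFFSET_SHIFT    8           /* where the toy swap offset lives */

/* Present means either hardware-valid or temporarily invalidated. */
static bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd & (TOY_PMD_VALID | TOY_PMD_PRESENT_INVALID)) != 0;
}

/* Only meaningful for a present pmd: keeps it "present" while invalid. */
static toy_pmd_t toy_pmd_mkinvalid(toy_pmd_t pmd)
{
	pmd |= TOY_PMD_PRESENT_INVALID;
	pmd &= ~TOY_PMD_VALID;
	return pmd;
}

/* A non-present migration entry: both low bits are clear. */
static toy_pmd_t toy_make_migration_entry(uint64_t offset)
{
	return offset << TOY_SWP_OFFSET_SHIFT;
}
```

In this model, passing the result of toy_make_migration_entry() through
toy_pmd_mkinvalid() yields a value for which toy_pmd_present() returns
true, which is the state a lockless walker could misinterpret.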

x86 does not suffer the above problem, but instead pmd_mkinvalid() will
corrupt the offset field of the swap entry within the swap pte.  See link
below for discussion of that problem.

Fix all of this by only calling pmdp_invalidate() for a present pmd.  And
for good measure let's add a warning to all implementations of
pmdp_invalidate[_ad]().  I've manually reviewed all other
pmdp_invalidate[_ad]() call sites and believe all others to be conformant.
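
The shape of the fix can be sketched in userspace as "check presence
first, then invalidate".  This is not the actual patch: the toy_* helpers,
the bit layout and the boolean return values are hypothetical and exist
only to contrast the unconditional call with the guarded one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: all names and bit layouts here are illustrative assumptions. */
typedef uint64_t toy_pmd_t;

#define TOY_PMD_VALID           (1ULL << 0)
#define TOY_PMD_PRESENT_INVALID (1ULL << 1)

static bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd & (TOY_PMD_VALID | TOY_PMD_PRESENT_INVALID)) != 0;
}

/* Invalidate the entry in place and return the old value. */
static toy_pmd_t toy_pmdp_invalidate(toy_pmd_t *pmdp)
{
	toy_pmd_t old = *pmdp;

	*pmdp = (old | TOY_PMD_PRESENT_INVALID) & ~TOY_PMD_VALID;
	return old;
}

/* Old pattern: invalidate unconditionally, then inspect the old value. */
static bool toy_split_buggy(toy_pmd_t *pmdp)
{
	toy_pmd_t old = toy_pmdp_invalidate(pmdp);

	return toy_pmd_present(old);
}

/* Guarded pattern: a non-present entry is never passed to invalidate. */
static bool toy_split_fixed(toy_pmd_t *pmdp)
{
	if (!toy_pmd_present(*pmdp))
		return false;	/* e.g. migration entry: leave it untouched */

	toy_pmdp_invalidate(pmdp);
	return true;
}
```

With a non-present toy entry, toy_split_buggy() leaves behind a value that
toy_pmd_present() reports as present, while toy_split_fixed() leaves the
entry bit-for-bit unchanged.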

This is a theoretical bug found during code review.  I don't have any test
case to trigger it in practice.

Link: https://lkml.kernel.org/r/20240501143310.1381675-1-ryan.roberts@arm.com
Link: https://lore.kernel.org/all/0dd7827a-6334-439a-8fd0-43c98e6af22b@arm.com/
Fixes: 84c3fc4e9c ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-07 10:37:00 -07:00
damon Docs/mm/damon/design: document 'young page' type DAMOS filter 2024-05-05 17:53:55 -07:00
active_mm.rst lazy tlb: allow lazy tlb mm refcounting to be configurable 2023-03-28 16:20:08 -07:00
allocation-profiling.rst memprofiling: documentation 2024-04-25 20:55:58 -07:00
arch_pgtable_helpers.rst mm: fix race between __split_huge_pmd_locked() and GUP-fast 2024-05-07 10:37:00 -07:00
balance.rst - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
bootmem.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
free_page_reporting.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
highmem.rst Documentation work keeps chugging along; stuff for 6.6 includes: 2023-08-30 20:05:42 -07:00
hmm.rst docs/mm: remove references to hmm_mirror ops and clean typos 2023-08-28 12:41:17 -06:00
hugetlbfs_reserv.rst mm: convert free_huge_page() to free_huge_folio() 2023-08-21 14:28:43 -07:00
hwpoison.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
index.rst memprofiling: documentation 2024-04-25 20:55:58 -07:00
ksm.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
memory-model.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
mmu_notifier.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
multigen_lru.rst mm: multi-gen LRU: improve design doc 2023-03-28 16:20:07 -07:00
numa.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
oom.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
overcommit-accounting.rst docs: mm: fix vm overcommit documentation for OVERCOMMIT_GUESS 2023-10-10 13:35:55 -06:00
page_allocation.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
page_cache.rst ubifs: Convert ubifs_vm_page_mkwrite() to use a folio 2024-02-25 21:08:00 +01:00
page_frags.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
page_migration.rst Documentation: Fix typos 2023-08-18 11:29:03 -06:00
page_owner.rst mm,page_owner: fix refcount imbalance 2024-04-16 15:39:49 -07:00
page_reclaim.rst docs/mm: Physical Memory: remove useless markup 2023-02-02 10:18:04 -07:00
page_table_check.rst mm/page_table_check: support userfault wr-protect entries 2024-05-05 17:53:41 -07:00
page_tables.rst Documentation/page_tables: Add info about MMU/TLB and Page Faults 2023-10-10 13:35:55 -06:00
physical_memory.rst docs/mm: Physical Memory: Fix grammar 2023-04-11 16:16:50 -06:00
process_addrs.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
remap_file_pages.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
shmfs.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
slab.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
slub.rst mm/slub: make the description of slab_min_objects helpful in doc 2024-01-22 10:31:08 +01:00
split_page_table_lock.rst mm: remove pgtable_{pmd, pte}_page_{ctor, dtor}() wrappers 2023-08-21 13:37:58 -07:00
swap.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
transhuge.rst mm: track mapcount of large folios in single value 2024-05-05 17:53:28 -07:00
unevictable-lru.rst Documentation: stop referring to page_remove_rmap() 2023-12-29 11:58:54 -08:00
vmalloc.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
vmalloced-kernel-stacks.rst docs: rename Documentation/vm to Documentation/mm 2022-06-27 12:52:53 -07:00
vmemmap_dedup.rst remove references to page->flags in documentation 2024-04-25 20:56:15 -07:00
z3fold.rst docs/mm: remove useless markup 2023-02-02 10:18:05 -07:00
zsmalloc.rst mm: add orphaned kernel-doc to the rst files. 2023-08-24 16:20:31 -07:00