mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-07 14:32:23 +00:00
mm/khugepaged: don't recycle vma pgtable if uffd-wp registered
When we're trying to collapse a 2M huge shmem page, don't retract pgtable pmd
page if it's registered with uffd-wp, because that pgtable could have pte
markers installed.  Recycling of that pgtable means we'll lose the pte
markers.  That could cause data loss for an uffd-wp enabled application on
shmem.

Instead of disabling khugepaged on these files, simply skip retracting these
special VMAs, then the page cache can still be merged into a huge thp, and
other mm/vma can still map the range of file with a huge thp when proper.

Note that checking VM_UFFD_WP needs to be done with mmap_sem held for write,
that avoids race like:

         khugepaged                         user thread
         ==========                         ===========
     check VM_UFFD_WP, not set
                                   UFFDIO_REGISTER with uffd-wp on shmem
                                   wr-protect some pages (install markers)
     take mmap_sem write lock
     erase pmd and free pmd page
      --> pte markers are dropped unnoticed!

Link: https://lkml.kernel.org/r/20220405014921.14994-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
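The interleaving described in the commit message can be replayed in a small userspace model. This is only a sketch under stated assumptions, not kernel code: `toy_vma`, `toy_register_and_protect`, `toy_retract`, `racy_collapse` and `locked_collapse` are hypothetical names, the mmap write lock is represented only by comments, and the "user thread" is a plain function call placed at the point where the race would let it run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA; none of these fields are kernel names. */
struct toy_vma {
	bool uffd_wp;      /* models the VM_UFFD_WP flag */
	int  pte_markers;  /* markers stored in the pte page table */
	bool has_pgtable;  /* false once the page table has been recycled */
};

/* Models the user thread: UFFDIO_REGISTER, then wr-protect some pages. */
static void toy_register_and_protect(struct toy_vma *vma, int npages)
{
	vma->uffd_wp = true;
	vma->pte_markers += npages;
}

/* Models erasing the pmd and freeing the page table that held the markers. */
static void toy_retract(struct toy_vma *vma)
{
	assert(vma->has_pgtable);
	vma->has_pgtable = false;
	vma->pte_markers = 0;
}

/* Racy order: the flag is sampled before the lock is taken. */
static int racy_collapse(struct toy_vma *vma)
{
	bool wp = vma->uffd_wp;            /* check VM_UFFD_WP, not set */

	toy_register_and_protect(vma, 3);  /* user thread runs in the gap */
	/* (write lock would be taken here) */
	if (!wp)                           /* stale answer */
		toy_retract(vma);
	return vma->pte_markers;
}

/* Fixed order: the flag is read only while the write lock is held. */
static int locked_collapse(struct toy_vma *vma)
{
	toy_register_and_protect(vma, 3);  /* user thread finished earlier */
	/* (write lock taken here; registration is excluded from now on) */
	if (!vma->uffd_wp)
		toy_retract(vma);
	return vma->pte_markers;
}
```

Starting from a VMA with `uffd_wp` clear and no markers, `racy_collapse` ends with zero markers because they were freed along with the page table, while `locked_collapse` sees the flag and leaves all three markers and the page table in place.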
parent bc70fbf269
commit deb4c93a98
@@ -1456,6 +1456,10 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
 		return;
 
+	/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
+	if (userfaultfd_wp(vma))
+		return;
+
 	hpage = find_lock_page(vma->vm_file->f_mapping,
 			       linear_page_index(vma, haddr));
 	if (!hpage)
@@ -1591,7 +1595,15 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 			 * reverse order. Trylock is a way to avoid deadlock.
 			 */
 			if (mmap_write_trylock(mm)) {
-				if (!khugepaged_test_exit(mm))
+				/*
+				 * When a vma is registered with uffd-wp, we can't
+				 * recycle the pmd pgtable because there can be pte
+				 * markers installed.  Skip it only, so the rest mm/vma
+				 * can still have the same file mapped hugely, however
+				 * it'll always mapped in small page size for uffd-wp
+				 * registered ranges.
+				 */
+				if (!khugepaged_test_exit(mm) && !userfaultfd_wp(vma))
 					collapse_and_free_pmd(mm, vma, addr, pmd);
 				mmap_write_unlock(mm);
 			} else {
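The control flow of the second hunk (try the write lock, test both conditions under it, retract only when both are clear) can be sketched in userspace as follows. Everything prefixed `sk_` is hypothetical and chosen for illustration; a C11 `atomic_flag` stands in for the mmap write lock, and what the kernel does when the trylock fails is not shown in this hunk, so the sketch only reports that case to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

enum sk_result { SK_RETRACTED, SK_SKIPPED, SK_LOCK_BUSY };

/* Hypothetical stand-ins; these are not kernel structures. */
struct sk_mm {
	atomic_flag write_lock;  /* models the mmap write lock */
	bool exiting;            /* models khugepaged_test_exit(mm) */
};

struct sk_vma {
	bool uffd_wp;            /* models userfaultfd_wp(vma) */
	bool has_pgtable;        /* false once the page table is recycled */
};

static void sk_mm_init(struct sk_mm *mm)
{
	atomic_flag_clear(&mm->write_lock);
	mm->exiting = false;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool sk_write_trylock(struct sk_mm *mm)
{
	return !atomic_flag_test_and_set(&mm->write_lock);
}

static void sk_write_unlock(struct sk_mm *mm)
{
	atomic_flag_clear(&mm->write_lock);
}

static enum sk_result sk_retract_one(struct sk_mm *mm, struct sk_vma *vma)
{
	enum sk_result res = SK_SKIPPED;

	assert(mm != NULL && vma != NULL);

	/* Never block here; a busy lock is reported instead of waited on. */
	if (!sk_write_trylock(mm))
		return SK_LOCK_BUSY;

	/* Both conditions are evaluated only while the lock is held. */
	if (!mm->exiting && !vma->uffd_wp) {
		vma->has_pgtable = false;
		res = SK_RETRACTED;
	}
	sk_write_unlock(mm);
	return res;
}
```

With this shape, a VMA that has `uffd_wp` set keeps its page table while other VMAs of the same file can still be retracted, which mirrors the intent stated in the added comment.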