Baolin Wang 8b3d838139 fs: improve dump_mapping() robustness
We hit a kernel crash while running stress-ng tests: the system crashes
when printing the dentry name in dump_mapping().

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
pc : dentry_name+0xd8/0x224
lr : pointer+0x22c/0x370
sp : ffff800025f134c0
......
Call trace:
  dentry_name+0xd8/0x224
  pointer+0x22c/0x370
  vsnprintf+0x1ec/0x730
  vscnprintf+0x2c/0x60
  vprintk_store+0x70/0x234
  vprintk_emit+0xe0/0x24c
  vprintk_default+0x3c/0x44
  vprintk_func+0x84/0x2d0
  printk+0x64/0x88
  __dump_page+0x52c/0x530
  dump_page+0x14/0x20
  set_migratetype_isolate+0x110/0x224
  start_isolate_page_range+0xc4/0x20c
  offline_pages+0x124/0x474
  memory_block_offline+0x44/0xf4
  memory_subsys_offline+0x3c/0x70
  device_offline+0xf0/0x120
  ......

The root cause is that one thread is doing page migration, and between
page unmap and page move the target page's ->mapping field is used to
save the 'anon_vma' pointer. At this point the target page is locked and
its refcount is 1.

Meanwhile, another stress-ng thread is performing memory hotplug and
attempts to offline the target page that is being migrated. It finds that
the refcount of the target page is 1, which prevents the offline operation,
so it proceeds to dump the page. However, page_mapping() of the target
page may return an incorrect file mapping and crash the system in
dump_mapping(), because the target page->mapping only holds the 'anon_vma'
pointer without the PAGE_MAPPING_ANON flag set.
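
As an illustration of the misinterpretation described above, here is a
minimal userspace sketch of a low-bit tagged ->mapping field. All sketch_*
names are hypothetical and are not kernel code; only the idea that the low
bit (0x1, as with PAGE_MAPPING_ANON) marks an anonymous mapping is taken
from the kernel. When the tag is missing, the lookup hands back the
'anon_vma' address as if it were a file mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PAGE_MAPPING_ANON (low bit of ->mapping). */
#define SKETCH_MAPPING_ANON 0x1UL

/* Hypothetical, simplified types; the real kernel structs are far larger. */
struct sketch_anon_vma { void *root; };
struct sketch_address_space { const void *host; };

struct sketch_page {
    uintptr_t mapping; /* pointer value, optionally tagged in the low bit */
};

static bool sketch_page_is_anon(const struct sketch_page *page)
{
    return (page->mapping & SKETCH_MAPPING_ANON) != 0;
}

/*
 * Returns the file mapping, or NULL when the page is tagged as anonymous.
 * An untagged anon_vma pointer passes the check and is returned as if it
 * were an address_space, which is the failure mode described above.
 */
static struct sketch_address_space *
sketch_page_mapping(const struct sketch_page *page)
{
    assert(page != NULL);
    if (sketch_page_is_anon(page))
        return NULL;
    return (struct sketch_address_space *)page->mapping;
}
```

In this sketch, a page whose ->mapping holds a tagged 'anon_vma' pointer
yields NULL, while the same pointer stored without the tag is returned
unchanged and would then be dereferenced as a file mapping.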

The page migration issue has been fixed by commit d1adb25df711 ("mm: migrate:
fix getting incorrect page mapping during page migration"). In addition,
Matthew suggested that we should also make dump_mapping() more robust so
that it is resilient against this kind of kernel crash [1].

By checking the 'dentry.parent' and 'dentry.d_name.name' fields used by
dentry_name(), dump_mapping() now prints the invalid dentry instead of
crashing the system when this issue is reproduced.
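
The check described above can be sketched in userspace roughly as follows.
This is an illustration under assumptions, not the actual patch: the
sketch_* types and function are hypothetical, the parent field is named
d_parent here, and the kernel-side step of safely copying the dentry before
inspecting it has no equivalent in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel's qstr and dentry. */
struct sketch_qstr { const unsigned char *name; };
struct sketch_dentry {
    struct sketch_dentry *d_parent;
    struct sketch_qstr d_name;
};

/*
 * Describe a dentry snapshot without dereferencing suspicious fields.
 * Returns 1 and writes the name when both fields are non-NULL; otherwise
 * returns 0 and writes only the raw pointer value.
 */
static int sketch_describe_dentry(const struct sketch_dentry *snapshot,
                                  const void *dentry_ptr,
                                  char *out, size_t len)
{
    assert(out != NULL && len > 0);
    if (snapshot == NULL || snapshot->d_parent == NULL ||
        snapshot->d_name.name == NULL) {
        snprintf(out, len, "invalid dentry:%p", dentry_ptr);
        return 0;
    }
    snprintf(out, len, "dentry name:\"%s\"",
             (const char *)snapshot->d_name.name);
    return 1;
}
```

The point of the design is that the fallback path prints only the pointer
value, so a bogus dentry costs one diagnostic line rather than a NULL
dereference while formatting the name.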

[12211.189128] page:fffff7de047741c0 refcount:1 mapcount:0 mapping:ffff989117f55ea0 index:0x1 pfn:0x211dd07
[12211.189144] aops:0x0 ino:1 invalid dentry:74786574206e6870
[12211.189148] flags: 0x57ffffc0000001(locked|node=1|zone=2|lastcpupid=0x1fffff)
[12211.189150] page_type: 0xffffffff()
[12211.189153] raw: 0057ffffc0000001 0000000000000000 dead000000000122 ffff989117f55ea0
[12211.189154] raw: 0000000000000001 0000000000000001 00000001ffffffff 0000000000000000
[12211.189155] page dumped because: unmovable page

[1] https://lore.kernel.org/all/ZXxn%2F0oixJxxAnpF@casper.infradead.org/

Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://lore.kernel.org/r/937ab1f87328516821d39be672b6bc18861d9d3e.1705391420.git.baolin.wang@linux.alibaba.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-01-22 15:33:37 +01:00