mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-01-10 07:00:48 +00:00
88a6f89944
Introduce the crash_hotplug attribute for memory and CPUs for use by
userspace. These attributes directly facilitate the udev rule for
managing userspace re-loading of the crash kernel upon hot un/plug
changes.

For memory, expose the crash_hotplug attribute to the
/sys/devices/system/memory directory. For example:

 # udevadm info --attribute-walk /sys/devices/system/memory/memory81
  looking at device '/devices/system/memory/memory81':
    KERNEL=="memory81"
    SUBSYSTEM=="memory"
    DRIVER==""
    ATTR{online}=="1"
    ATTR{phys_device}=="0"
    ATTR{phys_index}=="00000051"
    ATTR{removable}=="1"
    ATTR{state}=="online"
    ATTR{valid_zones}=="Movable"

  looking at parent device '/devices/system/memory':
    KERNELS=="memory"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{auto_online_blocks}=="offline"
    ATTRS{block_size_bytes}=="8000000"
    ATTRS{crash_hotplug}=="1"

For CPUs, expose the crash_hotplug attribute to the
/sys/devices/system/cpu directory. For example:

 # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
  looking at device '/devices/system/cpu/cpu0':
    KERNEL=="cpu0"
    SUBSYSTEM=="cpu"
    DRIVER=="processor"
    ATTR{crash_notes}=="277c38600"
    ATTR{crash_notes_size}=="368"
    ATTR{online}=="1"

  looking at parent device '/devices/system/cpu':
    KERNELS=="cpu"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{crash_hotplug}=="1"
    ATTRS{isolated}==""
    ATTRS{kernel_max}=="8191"
    ATTRS{nohz_full}==" (null)"
    ATTRS{offline}=="4-7"
    ATTRS{online}=="0-3"
    ATTRS{possible}=="0-7"
    ATTRS{present}=="0-3"

With these sysfs attributes in place, it is possible to efficiently
instruct the udev rule to skip crash kernel reloading for kernels
configured with crash hotplug support.

For example, the following is the proposed udev rule change for RHEL
system 98-kexec.rules (as the first lines of the rule file):

 # The kernel updates the crash elfcorehdr for CPU and memory changes
 SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
 SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

When examined in the context of 98-kexec.rules, the above rules test if
crash_hotplug is set, and if so, the userspace initiated
unload-then-reload of the crash kernel is skipped.

CPU and memory checks are separated in accordance with
CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG kernel config options. If
an architecture supports, for example, memory hotplug but not CPU
hotplug, then the /sys/devices/system/memory/crash_hotplug attribute
file is present, but the /sys/devices/system/cpu/crash_hotplug attribute
file will NOT be present. Thus the udev rule skips userspace processing
of memory hot un/plug events, but the udev rule will evaluate false for
CPU events, thus allowing userspace to process CPU hot un/plug events
(ie the unload-then-reload of the kdump capture kernel).

Link: https://lkml.kernel.org/r/20230814214446.6659-5-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
121 lines
4.6 KiB
Plaintext
What:		/sys/devices/system/memory
Date:		June 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
		The /sys/devices/system/memory directory contains a snapshot
		of the internal state of the kernel memory blocks. Files can
		be added or removed dynamically to represent hot-add/remove
		operations.
Users:		hotplug memory add/remove tools
		http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils

What:		/sys/devices/system/memory/memoryX/removable
Date:		June 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
		The file /sys/devices/system/memory/memoryX/removable is a
		legacy interface used to indicate whether a memory block is
		likely to be offlineable or not. Newer kernel versions return
		"1" if and only if the kernel supports memory offlining.
Users:		hotplug memory remove tools
		http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
		lsmem/chmem part of util-linux

What:		/sys/devices/system/memory/memoryX/phys_device
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
		The file /sys/devices/system/memory/memoryX/phys_device
		is read-only; it is a legacy interface only ever used on s390x
		to expose the covered storage increment.
Users:		Legacy s390-tools lsmem/chmem

What:		/sys/devices/system/memory/memoryX/phys_index
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
		The file /sys/devices/system/memory/memoryX/phys_index
		is read-only and contains the section ID in hexadecimal
		which is equivalent to decimal X contained in the
		memory section directory name.
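
The hexadecimal-to-decimal relationship described above can be checked with a small sketch. The function name phys_index_decimal is hypothetical, and the directory argument is an assumption made so that the helper can be pointed at any memoryX directory:

```shell
#!/bin/sh
# Hypothetical helper: print the phys_index of a memory block in decimal.
# $1: path to a memoryX directory, e.g. /sys/devices/system/memory/memory81
phys_index_decimal() {
    # phys_index holds the section ID in hexadecimal without a 0x prefix.
    hex=$(cat "$1/phys_index") || return 1
    # Prefixing 0x lets printf interpret the value as hexadecimal.
    printf '%d\n' "0x$hex"
}
```

For a block whose phys_index reads 00000051, the helper prints 81, which matches the X in the directory name memory81.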

What:		/sys/devices/system/memory/memoryX/state
Date:		September 2008
Contact:	Badari Pulavarty <pbadari@us.ibm.com>
Description:
		The file /sys/devices/system/memory/memoryX/state
		is read-write. When read, it returns the online/offline
		state of the memory block. When written, root can toggle
		the online/offline state of a memory block using the following
		commands::

			# echo online > /sys/devices/system/memory/memoryX/state
			# echo offline > /sys/devices/system/memory/memoryX/state

		On newer kernel versions, advanced states can be specified
		when onlining to select a target zone: "online_movable"
		selects the movable zone. "online_kernel" selects the
		applicable kernel zone (DMA, DMA32, or Normal). However,
		after successfully setting one of the advanced states,
		reading the file will return "online"; the zone information
		can be obtained via "valid_zones" instead.

		While onlining is unlikely to fail, there are no guarantees
		that offlining will succeed. Offlining is more likely to
		succeed if "valid_zones" indicates "Movable".
Users:		hotplug memory remove tools
		http://www.ibm.com/developerworks/wikis/display/LinuxP/powerpc-utils
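
Because offlining is not guaranteed to succeed, a script may want to check the result of the write and then read the state back. The following is a minimal sketch only; the function name set_block_state is hypothetical, and on a real system the write requires root:

```shell
#!/bin/sh
# Hypothetical helper: request a state change and print the resulting state.
# $1: path to a memoryX directory
# $2: requested state (online, offline, online_movable or online_kernel)
set_block_state() {
    block=$1
    request=$2
    # The write can fail; offlining in particular is not guaranteed.
    if ! echo "$request" > "$block/state"; then
        echo "request '$request' failed for $block" >&2
        return 1
    fi
    # As described above, the kernel reports "online" here even after
    # "online_movable" or "online_kernel" was written.
    cat "$block/state"
}
```

A caller could run set_block_state /sys/devices/system/memory/memoryX offline and treat a non-zero exit status as "block still online".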

What:		/sys/devices/system/memory/memoryX/valid_zones
Date:		July 2014
Contact:	Zhang Zhen <zhenzhang.zhang@huawei.com>
Description:
		The file /sys/devices/system/memory/memoryX/valid_zones is
		read-only.

		For online memory blocks, it returns the zone that manages
		the memory provided by the memory block. If multiple zones
		apply (not applicable for hotplugged memory), "None" is returned
		and the memory block cannot be offlined.

		For offline memory blocks, it returns the zones that can
		manage the memory provided by the memory block once it is
		onlined. The first returned zone ("default") will be used when
		setting the state of an offline memory block to "online". Only
		one of the kernel zones (DMA, DMA32, Normal) is applicable for
		a single memory block.
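
The "first returned zone is the default" rule can be sketched as follows. The function name default_zone is hypothetical, and the sketch assumes the zones are listed on one line separated by whitespace:

```shell
#!/bin/sh
# Hypothetical helper: print the first ("default") zone of a memory block.
# $1: path to a memoryX directory
# Returns non-zero if the file is empty or reports "None".
default_zone() {
    zones=$(cat "$1/valid_zones") || return 1
    # Split on whitespace; the first word is the default zone.
    set -- $zones
    [ "$#" -gt 0 ] || return 1
    [ "$1" != "None" ] || return 1
    echo "$1"
}
```

Treating "None" as a failure is a design choice of this sketch: it lets a caller skip blocks that the description above says cannot be offlined.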

What:		/sys/devices/system/memory/memoryX/nodeY
Date:		October 2009
Contact:	Linux Memory Management list <linux-mm@kvack.org>
Description:
		When CONFIG_NUMA is enabled,
		/sys/devices/system/memory/memoryX/nodeY is a symbolic link
		that points to the corresponding NUMA node directory.

		For example, the following symbolic link is created for
		memory section 9 on node0:

		/sys/devices/system/memory/memory9/node0 -> ../../node/node0
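
A script can use this link to find the node a memory block belongs to. The sketch below is illustrative only; the function name block_node is hypothetical, and it simply reports the first nodeY link it finds:

```shell
#!/bin/sh
# Hypothetical helper: print the NUMA node name (e.g. "node0") linked
# from a memory block directory.
# $1: path to a memoryX directory
block_node() {
    for link in "$1"/node*; do
        # Without a nodeY link (e.g. CONFIG_NUMA disabled) the glob stays
        # unexpanded and the test below rejects it.
        [ -L "$link" ] || continue
        basename "$link"
        return 0
    done
    return 1
}
```

With the example link above in place, block_node /sys/devices/system/memory/memory9 would print node0.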

What:		/sys/devices/system/node/nodeX/memoryY
Date:		September 2008
Contact:	Gary Hade <garyhade@us.ibm.com>
Description:
		When CONFIG_NUMA is enabled,
		/sys/devices/system/node/nodeX/memoryY is a symbolic link that
		points to the corresponding /sys/devices/system/memory/memoryY
		memory section directory. For example, the following symbolic
		link is created for memory section 9 on node0:

		/sys/devices/system/node/node0/memory9 -> ../../memory/memory9

What:		/sys/devices/system/memory/crash_hotplug
Date:		Aug 2023
Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description:
		(RO) indicates whether or not the kernel directly supports
		modifying the crash elfcorehdr for memory hot un/plug and/or
		on/offline changes.
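
A userspace script can apply the same test as the udev rule from the commit message: if the attribute is present and reads "1", the unload-then-reload of the crash kernel can be skipped. This is a minimal sketch under that reading; the function name needs_userspace_reload and the optional sysfs-root argument are assumptions made for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print "no" if the kernel updates the crash
# elfcorehdr itself for the given subsystem, "yes" otherwise.
# $1: subsystem, "memory" or "cpu"
# $2: sysfs root (optional, defaults to /sys/devices/system)
needs_userspace_reload() {
    attr="${2:-/sys/devices/system}/$1/crash_hotplug"
    # A missing attribute means the kernel offers no crash hotplug
    # support for this subsystem, so userspace must do the reload.
    if [ -r "$attr" ] && [ "$(cat "$attr")" = "1" ]; then
        echo no
    else
        echo yes
    fi
}
```

Checking each subsystem separately mirrors the split between CONFIG_MEMORY_HOTPLUG and CONFIG_HOTPLUG_CPU: an architecture may expose the memory attribute without the CPU one.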