linux/Documentation
Kaiyang Zhao f77f0c7514 mm,memcg: provide per-cgroup counters for NUMA balancing operations
The ability to observe the demotion and promotion decisions made by the
kernel on a per-cgroup basis is important for monitoring and tuning
containerized workloads on machines equipped with tiered memory.

Different containers in the system may experience drastically different
memory tiering actions that cannot be distinguished from the global
counters alone.

For example, a container running a workload that has much hotter memory
accesses will likely see more promotions and fewer demotions, potentially
depriving a colocated container of top tier memory to such an extent that
its performance degrades unacceptably.

For another example, some containers may exhibit longer periods between
data reuse, causing many more numa_hint_faults than numa_pages_migrated.
In this case, tuning hot_threshold_ms may be appropriate, but the signal
can easily be lost if only global counters are available.
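
As a rough illustration of the signal described above, the sketch below
compares how many hint faults occurred per migrated page between two
samples of one cgroup's counters. The helper name and the idea of diffing
two samples are assumptions made for illustration only; the patch defines
no such ratio, and what value would justify tuning hot_threshold_ms is
left to the operator.

```python
def hint_faults_per_migration(before, after):
    """Return hint faults per migrated page between two counter samples.

    `before` and `after` are dicts holding the cumulative values of
    numa_hint_faults and numa_pages_migrated for one cgroup.
    This helper is hypothetical and not part of the kernel patch.
    """
    faults = after["numa_hint_faults"] - before["numa_hint_faults"]
    migrated = after["numa_pages_migrated"] - before["numa_pages_migrated"]
    if migrated <= 0:
        # No pages migrated in the interval, so the ratio is undefined.
        return None
    return faults / migrated
```

A persistently high value for one cgroup while the others stay low is the
kind of difference that global counters would average away.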

In the long term, we hope to introduce per-cgroup control of promotion and
demotion actions to implement memory placement policies in tiering.

This patch set adds seven counters to memory.stat in a cgroup:
numa_pages_migrated, numa_pte_updates, numa_hint_faults, pgdemote_kswapd,
pgdemote_khugepaged, pgdemote_direct and pgpromote_success.  pgdemote_*
and pgpromote_success are also available in memory.numa_stat.
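
A minimal sketch of reading the seven counters from userspace follows. It
assumes memory.stat keeps its usual "name value" one-entry-per-line
layout; the function names and the cgroup directory argument are
hypothetical, and counters missing from a given kernel are simply
omitted from the result.

```python
# The seven counters named in this patch set.
NUMA_COUNTERS = (
    "numa_pages_migrated",
    "numa_pte_updates",
    "numa_hint_faults",
    "pgdemote_kswapd",
    "pgdemote_khugepaged",
    "pgdemote_direct",
    "pgpromote_success",
)


def parse_numa_counters(stat_text):
    """Extract the NUMA balancing counters from memory.stat contents."""
    found = {}
    for line in stat_text.splitlines():
        parts = line.split()
        # Each memory.stat entry is expected to be "<name> <value>".
        if len(parts) == 2 and parts[0] in NUMA_COUNTERS:
            found[parts[0]] = int(parts[1])
    return found


def read_numa_counters(cgroup_dir):
    """Read and parse memory.stat under a cgroup v2 directory (assumed path)."""
    with open(f"{cgroup_dir}/memory.stat") as f:
        return parse_numa_counters(f.read())
```

Calling read_numa_counters() on each container's cgroup directory gives
per-container values that can be compared against each other.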

count_memcg_events_mm() is added to count multiple event occurrences at
once, and get_mem_cgroup_from_folio() is added because we need to get a
reference to the memcg of a folio before it's migrated to track
numa_pages_migrated.  The accounting of PGDEMOTE_* is moved to
shrink_inactive_list() before being changed to per-cgroup.

[kaiyang2@cs.cmu.edu: add documentation of the memcg counters in cgroup-v2.rst]
  Link: https://lkml.kernel.org/r/20240814235122.252309-1-kaiyang2@cs.cmu.edu
Link: https://lkml.kernel.org/r/20240814174227.30639-1-kaiyang2@cs.cmu.edu
Signed-off-by: Kaiyang Zhao <kaiyang2@cs.cmu.edu>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Wei Xu <weixugc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-03 21:15:36 -07:00
ABI powerpc fixes for 6.11 #2 2024-08-17 19:23:02 -07:00
accel
accounting
admin-guide mm,memcg: provide per-cgroup counters for NUMA balancing operations 2024-09-03 21:15:36 -07:00
arch docs: move numa=fake description to kernel-parameters.txt 2024-09-03 21:15:32 -07:00
block block: fix spelling and grammar for in writeback_cache_control.rst 2024-06-20 06:53:14 -06:00
bpf bpf, docs: Address comments from IETF Area Directors 2024-06-23 09:10:26 -07:00
cdrom
core-api workqueue: doc: Fix function name, remove markers 2024-08-05 18:33:36 -10:00
cpu-freq
crypto docs: crypto: async-tx-api: fix broken code example 2024-06-12 15:41:09 -06:00
dev-tools kfence: introduce burst mode 2024-09-01 20:26:03 -07:00
devicetree USB fixes for 6.11-rc6 2024-09-01 07:06:28 +12:00
doc-guide doc-guide: kernel-doc: document Returns: spelling 2024-05-30 13:35:07 -06:00
driver-api thermal: core: Update thermal zone registration documentation 2024-08-02 13:22:37 +02:00
fault-injection
fb
features LoongArch: Add ARCH_HAS_DEBUG_VM_PGTABLE support 2024-07-20 22:40:59 +08:00
filesystems fs: remove calls to set and clear the folio error flag 2024-09-01 20:26:04 -07:00
firmware_class
firmware-guide
fpga
gpu Documentation/amdgpu: Fix duplicate declaration 2024-07-16 11:45:22 -04:00
hid HID: bpf: allow hid_device_event hooks to inject input reports on self 2024-06-27 11:00:48 +02:00
hwmon hwmon updates for v6.11-rc1 2024-07-15 17:39:13 -07:00
i2c This release includes significant updates, with the primary 2024-07-13 11:10:54 +02:00
iio Documentation: iio: Document high-speed DMABUF based API 2024-06-30 11:30:18 +01:00
images
infiniband
input
isdn
kbuild Documentation/llvm: turn make command for ccache into code block 2024-08-16 21:34:12 +09:00
kernel-hacking
leds docs: leds: leds-blinkm.rst: Fix 'dasy-chain' typo 2024-06-21 11:57:10 +01:00
litmus-tests
livepatch
locking hwspinlock: Introduce hwspin_lock_bust() 2024-05-29 12:52:26 -07:00
maintainer docs: maintainer: discourage taking conversations off-list 2024-07-16 11:08:26 -06:00
mhi
misc-devices misc: mrvl-cn10k-dpi: add Octeon CN10K DPI administrative driver 2024-07-10 14:58:29 +02:00
mm mm: remove follow_page() 2024-09-01 20:26:01 -07:00
netlabel
netlink ethtool: rss: echo the context number back 2024-07-25 16:23:47 -07:00
networking ethtool: rss: echo the context number back 2024-07-25 16:23:47 -07:00
nvdimm
nvme
PCI Merge branch 'pci/misc' 2024-07-19 10:10:33 -05:00
pcmcia
peci
power regulator: core: Add helper for allow HW access to enable/disable regulator 2024-06-26 18:17:05 +01:00
process net: drop special comment style 2024-08-23 10:21:02 +01:00
RCU Merge branches 'doc.2024.06.06a', 'fixes.2024.07.04a', 'mb.2024.06.28a', 'nocb.2024.06.03a', 'rcu-tasks.2024.06.06a', 'rcutorture.2024.06.06a' and 'srcu.2024.06.18a' into HEAD 2024-07-04 13:54:17 -07:00
rust Rust changes for v6.11 2024-07-27 13:44:54 -07:00
scheduler docs/sp_SP: Add translation for scheduler/sched-design-CFS.rst 2024-07-09 09:14:33 -06:00
scsi
security
sound
sphinx
sphinx-static
spi
staging Docs: Move magic-number from process to staging 2024-06-26 16:36:00 -06:00
target
tee
timers
tools Documentation/tools/rv: fix document header 2024-07-03 16:36:21 -06:00
trace ftrace: Rewrite of function graph tracer 2024-07-18 13:36:33 -07:00
translations pci-v6.11-changes 2024-07-19 19:03:18 -07:00
usb
userspace-api media: v4l: Fix missing tabular column hint for Y14P format 2024-07-30 08:36:29 +02:00
virt KVM/arm64 fixes for 6.11, round #1 2024-08-13 06:06:27 -04:00
w1
watchdog
wmi platform/x86: msi-wmi-platform: Fix spelling mistakes 2024-07-31 12:37:01 +03:00
.gitignore
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py
docutils.conf
dontdiff
index.rst
Kconfig
Makefile
memory-barriers.txt
SubmittingPatches
subsystem-apis.rst