When doing performance tuning or debugging performance regressions,
more and more cases are found to be related to false sharing [1][2][3],
and the situation can be worse for newer platforms with hundreds of
CPUs. There are already many commits in current kernel specially
for mitigating the performance degradation due to false sharing.
False sharing could harm the performance silently without being
noticed, due to reasons like:
* data members of a big data structure randomly sitting together
in one cache line
* global data of small size are linked compactly together
So it's better to make a simple document about the normal pattern
of false sharing, basic ways to mitigate it and call out to
developers to pay attention during code-writing.
[ Many thanks to Dave Hansen, Ying Huang, Tim Chen, Julie Du and
Yu Chen for their contributions ]
[1]. https://lore.kernel.org/lkml/20220619150456.GB34471@xsang-OptiPlex-9020/
[2]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
[3]. https://lore.kernel.org/lkml/20230307125538.818862491@linutronix.de/
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Link: https://lore.kernel.org/r/20230407041235.37886-1-feng.tang@intel.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>