linux-stable/arch
Feng Tang b4bac27931 x86/tsc: Use topology_max_packages() to get package number
Commit b50db7095f ("x86/tsc: Disable clocksource watchdog for TSC on
qualified platorms") was introduced to solve the problem that the TSC
clocksource is sometimes wrongly judged as unstable by watchdog
clocksources such as 'jiffies', HPET, etc.

In that commit, the number of hardware packages is a key factor in
deciding whether to disable the watchdog for TSC. 'nr_online_nodes' was
chosen because, at that time (kernel v5.1x), it was available in the early
boot phase, before the 'tsc-early' clocksource is registered and before
any non-boot CPU has been brought up.

Dave and Rui pointed out that there are many cases in which
'nr_online_nodes' is misleading and inaccurate, such as:

 * SNC (Sub-NUMA Cluster) mode enabled
 * NUMA emulation (numa=fake=8 etc.)
 * numa=off
 * platforms with CPU-less HBM nodes or CPU-less Optane memory nodes
 * 'maxcpus=' cmdline setup, where the excluded CPUs can be onlined later
 * 'nr_cpus=', 'possible_cpus=' cmdline setup, where the excluded CPUs
   cannot be onlined after boot

The SNC case is the most user-visible one, as many CSPs (Cloud Service
Providers) enable this feature in their server fleets. With SNC3 enabled,
a 2-socket machine appears to have 6 NUMA nodes, so it is affected by the
issue in practice.
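
The SNC arithmetic above can be sketched in a few lines of userspace C.
This is only an illustration, not the kernel's code: the helper names and
the threshold of 4 packages are assumptions made for the example.

```c
#include <stdbool.h>

/* Assumed threshold for illustration: at or below this many packages the
 * TSC watchdog would be skipped. */
#define MAX_PACKAGES_WITHOUT_WATCHDOG 4

/* Hypothetical helper: NUMA nodes reported when every socket is split
 * into 'snc_clusters' sub-NUMA clusters. */
static unsigned int numa_nodes_with_snc(unsigned int sockets,
					unsigned int snc_clusters)
{
	return sockets * snc_clusters;
}

/* Hypothetical helper: the decision taken on a given package estimate. */
static bool watchdog_can_be_disabled(unsigned int package_estimate)
{
	return package_estimate <= MAX_PACKAGES_WITHOUT_WATCHDOG;
}
```

Under these assumptions, a 2-socket SNC3 machine yields a node count of 6,
so a check based on the node count keeps the watchdog, while a check based
on the real package count of 2 would disable it.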

Thomas' recent patchset refactoring the x86 topology code greatly improves
topology_max_packages(), making it more accurate and available in the
early boot phase, so it works well in most of the above cases.

The only exceptions are the 'nr_cpus=' and 'possible_cpus=' setups, which
may cause the package number to be underestimated. During topology setup,
the boot CPU iterates through all enumerated APIC IDs and either accepts
or rejects each
APIC ID. For accepted IDs, it figures out which bits of the ID map to the
package number.  It tracks which package numbers have been seen in a
bitmap.  topology_max_packages() just returns the number of bits set in
that bitmap.
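
The bitmap bookkeeping described above can be sketched in userspace C as
follows. This is a simplified model, not the kernel implementation: the
struct, the function names, the fixed 64-package limit and the single
'pkg_shift' field are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical tracker: one bit per package number seen so far. */
struct pkg_tracker {
	uint64_t seen;          /* bit N set => package N has a CPU */
	unsigned int pkg_shift; /* APIC ID bits below the package field */
};

/* Record the package of an accepted APIC ID. Rejected IDs are simply
 * never passed to this function. */
static void pkg_tracker_accept(struct pkg_tracker *t, uint32_t apic_id)
{
	uint32_t pkg = apic_id >> t->pkg_shift;

	if (pkg < 64)
		t->seen |= UINT64_C(1) << pkg;
}

/* Number of packages seen: the count of bits set in the bitmap. */
static unsigned int pkg_tracker_max_packages(const struct pkg_tracker *t)
{
	uint64_t bits = t->seen;
	unsigned int n = 0;

	while (bits) {
		bits &= bits - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}
```

In this model, accepting several APIC IDs from the same package sets the
same bit repeatedly, so the count depends only on how many distinct
packages contributed at least one accepted ID.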

'nr_cpus=' and 'possible_cpus=' can cause more APIC IDs to be rejected and
can artificially lower the number of bits in the package bitmap and thus
topology_max_packages().  This means that, for example, a system with 8
physical packages might reject all the CPUs on 6 of those packages and be
left with only 2 packages and 2 bits set in the package bitmap. Such a
system needs the TSC watchdog, but would disable it anyway.  This isn't
ideal, but it
only happens for debug-oriented options. This is fixable by tracking the
package numbers for rejected CPUs.  But it's not worth the trouble for
debugging.
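
The 8-package example above can be reproduced with a small userspace
model. It assumes, purely for illustration, that APIC IDs are enumerated
package by package and that a CPU limit rejects every ID once the limit is
reached; the function name and parameters are hypothetical.

```c
#include <stdint.h>

/* Count the packages that end up with at least one accepted CPU when at
 * most 'cpu_limit' CPUs are accepted in enumeration order. */
static unsigned int packages_seen(unsigned int packages,
				  unsigned int cpus_per_pkg,
				  unsigned int cpu_limit)
{
	uint64_t seen = 0;
	unsigned int accepted = 0;
	unsigned int count = 0;

	for (unsigned int pkg = 0; pkg < packages && pkg < 64; pkg++) {
		for (unsigned int cpu = 0; cpu < cpus_per_pkg; cpu++) {
			if (accepted >= cpu_limit)
				continue; /* rejected: package not recorded */
			accepted++;
			seen |= UINT64_C(1) << pkg;
		}
	}

	while (seen) {
		seen &= seen - 1; /* clear the lowest set bit */
		count++;
	}
	return count;
}
```

With 8 packages of 4 CPUs each, a limit of 32 CPUs leaves all 8 packages
visible, while a limit of 8 CPUs leaves only 2, which is the underestimate
the commit message describes.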

So use topology_max_packages() to replace 'nr_online_nodes'.

Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/all/20240729021202.180955-1-feng.tang@intel.com
Closes: https://lore.kernel.org/lkml/a4860054-0f16-6513-f121-501048431086@intel.com/
2024-07-31 21:12:09 +02:00
..
alpha mseal: wire up mseal syscall 2024-05-23 19:40:26 -07:00
arc arc: convert to generic syscall table 2024-07-10 14:23:38 +02:00
arm Driver core changes for 6.11-rc1 2024-07-25 10:42:22 -07:00
arm64 RISC-V Patches for the 6.11 Merge Window, Part 2 2024-07-27 10:14:34 -07:00
csky ftrace: Rewrite of function graph tracer 2024-07-18 13:36:33 -07:00
hexagon hexagon: use new system call table 2024-07-10 14:23:38 +02:00
loongarch RISC-V Patches for the 6.11 Merge Window, Part 2 2024-07-27 10:14:34 -07:00
m68k Kbuild updates for v6.11 2024-07-23 14:32:21 -07:00
microblaze syscalls: mmap(): use unsigned offset type consistently 2024-06-25 15:57:38 +02:00
mips - Use improved timer sync for Loongson64 2024-07-25 12:41:53 -07:00
nios2 Kbuild updates for v6.11 2024-07-23 14:32:21 -07:00
openrisc openrisc: convert to generic syscall table 2024-07-10 14:23:38 +02:00
parisc parisc architecture fixes and updates for kernel v6.11-rc1: 2024-07-25 12:37:42 -07:00
powerpc Devicetree fixes for 6.11, part 1 2024-07-27 12:46:16 -07:00
riscv RISC-V Patches for the 6.11 Merge Window, Part 2 2024-07-27 10:14:34 -07:00
s390 more s390 updates for 6.11 merge window 2024-07-26 10:47:53 -07:00
sh sh updates for v6.11 2024-07-23 11:57:52 -07:00
sparc Driver core changes for 6.11-rc1 2024-07-25 10:42:22 -07:00
um This pull request contains the following changes for UML: 2024-07-25 12:33:08 -07:00
x86 x86/tsc: Use topology_max_packages() to get package number 2024-07-31 21:12:09 +02:00
xtensa - 875fa64577da ("mm/hugetlb_vmemmap: fix race with speculative PFN 2024-07-21 17:15:46 -07:00
.gitignore
Kconfig Revert "mm: mmap: allow for the maximum number of bits for randomizing mmap_base by default" 2024-06-17 12:57:03 -07:00