linux/include/asm-generic
Rong Xu d5dc958361 kbuild: Add Propeller configuration for kernel build
Add the build support for using Clang's Propeller optimizer. Like
AutoFDO, Propeller uses hardware sampling to gather information
about the frequency of execution of different code paths within a
binary. This information is then used to guide the compiler's
optimization decisions, resulting in a more efficient binary.

The support requires a Clang compiler LLVM 19 or later, and the
create_llvm_prof tool
(https://github.com/google/autofdo/releases/tag/v0.30.1). This
commit is limited to x86 platforms that support PMU features
like LBR on Intel machines and AMD Zen3 BRS.

Here is an example workflow for building an AutoFDO+Propeller
optimized kernel:

1) Build the kernel on the host machine, with AutoFDO and Propeller
   build config
      CONFIG_AUTOFDO_CLANG=y
      CONFIG_PROPELLER_CLANG=y
   then
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile>

“<autofdo_profile>” is the profile collected when doing a non-Propeller
AutoFDO build. This step builds a kernel that has the same optimization
level as AutoFDO, plus a metadata section that records basic block
information. This kernel image runs as fast as an AutoFDO optimized
kernel.

2) Install the kernel on test/production machines.

3) Run the load tests. The '-c' option in perf specifies the sample
   event period. We suggest using a suitable prime number,
   like 500009, for this purpose.
   For Intel platforms:
      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \
        -o <perf_file> -- <loadtest>
   For AMD platforms:
      The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2
      # To see if Zen3 support LBR:
      $ cat proc/cpuinfo | grep " brs"
      # To see if Zen4 support LBR:
      $ cat proc/cpuinfo | grep amd_lbr_v2
      # If the result is yes, then collect the profile using:
      $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \
        -N -b -c <count> -o <perf_file> -- <loadtest>

4) (Optional) Download the raw perf file to the host machine.

5) Generate Propeller profile:
   $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
     --format=propeller --propeller_output_module_name \
     --out=<propeller_profile_prefix>_cc_profile.txt \
     --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt

   “create_llvm_prof” is the profile conversion tool, and a prebuilt
   binary for linux can be found on
   https://github.com/google/autofdo/releases/tag/v0.30.1 (can also build
   from source).

   "<propeller_profile_prefix>" can be something like
   "/home/user/dir/any_string".

   This command generates a pair of Propeller profiles:
   "<propeller_profile_prefix>_cc_profile.txt" and
   "<propeller_profile_prefix>_ld_profile.txt".

6) Rebuild the kernel using the AutoFDO and Propeller profile files.
      CONFIG_AUTOFDO_CLANG=y
      CONFIG_PROPELLER_CLANG=y
   and
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile> \
        CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix>

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Stephane Eranian <eranian@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
..
bitops bitops: Change function return types from long to int 2024-05-03 17:04:50 +02:00
vdso lib/vdso: Avoid highres update if clocksource is not VDSO capable 2020-02-17 20:12:17 +01:00
access_ok.h uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
agp.h char/agp: introduce asm-generic/agp.h 2023-02-13 22:13:29 +01:00
archrandom.h random: handle archrandom with multiple longs 2022-07-25 13:26:14 +02:00
asm-offsets.h asm-generic: Add common asm-offsets.h 2015-06-23 13:35:49 +09:00
asm-prototypes.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
atomic64.h locking/atomic: delete !ARCH_ATOMIC remnants 2021-05-26 13:20:52 +02:00
atomic.h locking/atomic: make atomic*_{cmp,}xchg optional 2023-06-05 09:57:14 +02:00
audit_change_attr.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
audit_dir_write.h audit: Avoid build failures on systems without renameat 2018-01-30 19:07:54 -08:00
audit_read.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
audit_signal.h
audit_write.h audit/stable-4.15 PR 20171113 2017-11-15 13:28:48 -08:00
barrier.h sched: Add missing memory barrier in switch_mm_cid 2024-04-16 13:59:45 +02:00
bitops.h include: move find.h from asm_generic to linux 2022-01-15 08:47:31 -08:00
bitsperlong.h lib: extend the scope of small_const_nbits() macro 2021-05-06 19:24:11 -07:00
bug.h bug: Improve comment 2024-05-07 14:20:48 +02:00
cache.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cacheflush.h mm: Introduce flush_cache_vmap_early() 2023-12-14 00:23:17 -08:00
cfi.h cfi: Flip headers 2023-12-15 16:25:55 -08:00
checksum.h asm-generic: Improve csum_fold 2024-01-17 17:52:29 -08:00
cmpxchg-local.h asm-generic: Fix 32 bit __generic_cmpxchg_local 2024-01-05 23:19:14 +01:00
cmpxchg.h asm-generic: avoid __generic_cmpxchg_local warnings 2023-04-04 17:58:11 +02:00
codetag.lds.h lib: add allocation tagging support for memory allocation profiling 2024-04-25 20:55:52 -07:00
compat.h asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual() 2022-11-01 10:20:11 +11:00
current.h asm-generic: current: Don't include thread-info.h if building asm 2023-08-26 22:38:49 +02:00
delay.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
device.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
div64.h ARM: 9117/1: asm-generic: div64: Remove always-true __div64_const32_is_OK() 2021-08-20 11:39:28 +01:00
dma-mapping.h dma-mapping: no need to pass a bus_type into get_arch_dma_ops() 2023-02-15 12:35:20 +01:00
dma.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
early_ioremap.h mm/early_ioremap.c: remove redundant early_ioremap_shutdown() 2021-09-08 11:50:24 -07:00
emergency-restart.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
error-injection.h docs: fault-injection: add requirements of error injectable functions 2023-02-02 22:50:00 -08:00
exec.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
extable.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
fixmap.h fixmap: Remove unused set_fixmap_offset_io() 2024-07-11 17:41:23 +02:00
flat.h binfmt_flat: remove the persistent argument from flat_get_addr_from_rp 2019-06-24 09:16:47 +10:00
ftrace.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
futex.h futex: Fix additional regressions 2021-12-11 23:31:51 +01:00
getorder.h asm-generic: force inlining of get_order() to work around gcc10 poor decision 2020-12-15 22:46:15 -08:00
hardirq.h irqstat: Move declaration into asm-generic/hardirq.h 2020-11-23 10:31:06 +01:00
hugetlb.h mm: provide mm_struct and address to huge_ptep_get() 2024-07-12 15:52:15 -07:00
hw_irq.h
hyperv-tlfs.h hyperv-fixes for v6.9-rc4 2024-04-11 16:23:56 -07:00
int-ll64.h int-ll64.h: define u{8,16,32,64} and s{8,16,32,64} based on uapi header 2018-06-07 17:34:38 -07:00
io.h asm-generic/io.h: kill vmalloc.h dependency 2024-04-25 20:55:50 -07:00
ioctl.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
iomap.h asm-generic/iomap.h: remove ARCH_HAS_IOREMAP_xx macros 2023-08-18 10:12:32 -07:00
irq_regs.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
irq_work.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
irq.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
irqflags.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kbuild move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
kdebug.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kmap_size.h mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL 2020-11-24 14:42:08 +01:00
kprobes.h treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
kvm_para.h KVM: Introduce paravirtualization hints and KVM_HINTS_DEDICATED 2018-03-06 18:40:44 +01:00
kvm_types.h KVM: Move x86's version of struct kvm_mmu_memory_cache to common code 2020-07-09 13:29:42 -04:00
linkage.h
local64.h locking/generic: Wire up local{,64}_try_cmpxchg() 2023-04-29 09:09:09 +02:00
local.h locking/generic: Wire up local{,64}_try_cmpxchg() 2023-04-29 09:09:09 +02:00
logic_io.h logic_io instance of iounmap() needs volatile on argument 2021-12-21 21:31:08 +01:00
mcs_spinlock.h locking/mcs: Allow architecture specific asm files to be used for contended case 2014-02-09 21:18:52 +01:00
memory_model.h mm, arch: add generic implementation of pfn_valid() for FLATMEM 2023-02-09 16:51:41 -08:00
mm_hooks.h mm: remove arch_unmap() 2024-09-01 20:26:13 -07:00
mmiowb_types.h asm-generic/mmiowb: Add generic implementation of mmiowb() tracking 2019-04-08 11:59:39 +01:00
mmiowb.h asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible() 2020-07-17 10:02:03 +01:00
mmu_context.h asm-generic: add generic MMU versions of mmu context functions 2020-10-26 16:45:03 +01:00
mmu.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mmzone.h arch, mm: move definition of node_data to generic code 2024-09-03 21:15:28 -07:00
module.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
module.lds.h kbuild: preprocess module linker script 2020-09-25 00:36:41 +09:00
mshyperv.h hyperv-fixes for v6.9-rc4 2024-04-11 16:23:56 -07:00
msi.h genirq: Get rid of GENERIC_MSI_IRQ_DOMAIN 2022-11-17 15:15:20 +01:00
nommu_context.h asm-generic: add generic MMU versions of mmu context functions 2020-10-26 16:45:03 +01:00
numa.h arch_numa: switch over to numa_memblks 2024-09-03 21:15:32 -07:00
param.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
parport.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
pci_iomap.h PCI: Stub __pci_ioport_map() for arches that don't support it at all 2022-07-29 12:01:00 -05:00
pci.h asm-generic: Add new pci.h and use it 2022-07-22 17:34:57 -05:00
percpu.h percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg() 2023-06-08 10:28:39 +02:00
pgalloc.h mm: change inlined allocation helpers to account at the call site 2024-04-25 20:55:59 -07:00
pgtable_uffd.h userfaultfd: wp: add pmd_swp_*uffd_wp() helpers 2020-04-07 10:43:39 -07:00
pgtable-nop4d.h mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t * 2021-07-08 11:48:22 -07:00
pgtable-nopmd.h mm: recover pud_leaf() definitions in nopmd case 2024-03-13 12:12:21 -07:00
pgtable-nopud.h mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t * 2021-07-08 11:48:22 -07:00
preempt.h riscv: support PREEMPT_DYNAMIC with static keys 2023-08-31 00:18:34 -07:00
qrwlock_types.h locking/qrwlock: Change "queue rwlock" to "queued rwlock" 2022-05-11 16:27:04 +02:00
qrwlock.h asm-generic changes for 5.19 2022-05-26 10:50:30 -07:00
qspinlock_types.h locking/qspinlock: Do not include atomic.h from qspinlock_types.h 2020-07-29 16:14:19 +02:00
qspinlock.h asm-generic: qspinlock: fix queued_spin_value_unlocked() implementation 2023-11-22 09:32:49 -08:00
resource.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
runtime-const.h runtime constants: add default dummy infrastructure 2024-06-19 12:34:34 -07:00
rwonce.h asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h' 2020-07-21 10:50:36 +01:00
seccomp.h seccomp: Use -1 marker for end of mode 1 syscall list 2020-07-10 16:01:52 -07:00
sections.h jump_label,module: Don't alloc static_key_mod for __ro_after_init keys 2024-03-22 11:18:16 +01:00
serial.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
set_memory.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
shmparam.h treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers 2019-05-14 19:52:48 -07:00
signal.h asm-generic: Remove empty #ifdef SA_RESTORER 2022-09-10 09:56:53 +02:00
simd.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
softirq_stack.h asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig. 2022-09-05 17:20:55 +02:00
spinlock_types.h asm-generic: ticket-lock: New generic ticket-based spinlock 2022-05-11 11:49:38 -07:00
spinlock.h asm-generic: ticket-lock: Optimize arch_spin_value_unlocked() 2023-09-21 10:17:00 +02:00
statfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
string.h
switch_to.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
syscall.h ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h 2022-03-10 13:35:08 -06:00
syscalls.h syscalls: mmap(): use unsigned offset type consistently 2024-06-25 15:57:38 +02:00
timex.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
tlb.h mm/mmu_gather: add __tlb_remove_folio_pages() 2024-02-22 15:27:17 -08:00
tlbflush.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
topology.h mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA 2021-06-29 10:53:55 -07:00
trace_clock.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
uaccess.h move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
user.h
vermagic.h arch: split MODULE_ARCH_VERMAGIC definitions out to <asm/vermagic.h> 2020-04-23 10:50:26 +09:00
vga.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
video.h arch: Rename fbdev header and source files 2024-05-03 17:07:50 +02:00
vmlinux.lds.h kbuild: Add Propeller configuration for kernel build 2024-11-27 09:38:27 +09:00
word-at-a-time.h kernel.h: removed REPEAT_BYTE from kernel.h 2024-02-01 09:47:59 -08:00
xor.h lib/xor: make xor prototypes more friendly to compiler vectorization 2022-02-11 20:39:39 +11:00