linux/Documentation
Rong Xu d5dc958361 kbuild: Add Propeller configuration for kernel build
Add the build support for using Clang's Propeller optimizer. Like
AutoFDO, Propeller uses hardware sampling to gather information
about the frequency of execution of different code paths within a
binary. This information is then used to guide the compiler's
optimization decisions, resulting in a more efficient binary.

The support requires a Clang compiler LLVM 19 or later, and the
create_llvm_prof tool
(https://github.com/google/autofdo/releases/tag/v0.30.1). This
commit is limited to x86 platforms that support PMU features
like LBR on Intel machines and AMD Zen3 BRS.

Here is an example workflow for building an AutoFDO+Propeller
optimized kernel:

1) Build the kernel on the host machine, with AutoFDO and Propeller
   build config
      CONFIG_AUTOFDO_CLANG=y
      CONFIG_PROPELLER_CLANG=y
   then
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile>

“<autofdo_profile>” is the profile collected when doing a non-Propeller
AutoFDO build. This step builds a kernel that has the same optimization
level as AutoFDO, plus a metadata section that records basic block
information. This kernel image runs as fast as an AutoFDO optimized
kernel.

2) Install the kernel on test/production machines.

3) Run the load tests. The '-c' option in perf specifies the sample
   event period. We suggest using a suitable prime number,
   like 500009, for this purpose.
   For Intel platforms:
      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \
        -o <perf_file> -- <loadtest>
   For AMD platforms:
      The supported system are: Zen3 with BRS, or Zen4 with amd_lbr_v2
      # To see if Zen3 support LBR:
      $ cat proc/cpuinfo | grep " brs"
      # To see if Zen4 support LBR:
      $ cat proc/cpuinfo | grep amd_lbr_v2
      # If the result is yes, then collect the profile using:
      $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \
        -N -b -c <count> -o <perf_file> -- <loadtest>

4) (Optional) Download the raw perf file to the host machine.

5) Generate Propeller profile:
   $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
     --format=propeller --propeller_output_module_name \
     --out=<propeller_profile_prefix>_cc_profile.txt \
     --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt

   “create_llvm_prof” is the profile conversion tool, and a prebuilt
   binary for linux can be found on
   https://github.com/google/autofdo/releases/tag/v0.30.1 (can also build
   from source).

   "<propeller_profile_prefix>" can be something like
   "/home/user/dir/any_string".

   This command generates a pair of Propeller profiles:
   "<propeller_profile_prefix>_cc_profile.txt" and
   "<propeller_profile_prefix>_ld_profile.txt".

6) Rebuild the kernel using the AutoFDO and Propeller profile files.
      CONFIG_AUTOFDO_CLANG=y
      CONFIG_PROPELLER_CLANG=y
   and
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile> \
        CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix>

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Stephane Eranian <eranian@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
..
ABI Char/Misc and other driver changes for 6.12-rc1 2024-09-26 10:13:08 -07:00
accel drm next for 6.12-rc1 2024-09-19 10:18:15 +02:00
accounting
admin-guide cpufreq: docs: Reflect latency changes in docs 2024-10-21 13:20:03 +02:00
arch arm64 fixes for 6.12-rc2: 2024-10-04 12:20:09 -07:00
block docs: block: Fix grammar and spelling mistakes in bfq-iosched.rst 2024-09-05 14:38:10 -06:00
bpf docs/bpf: Add missing BPF program types to docs 2024-09-12 10:56:41 -07:00
cdrom
core-api arm64 fixes for -rc4 2024-10-17 09:51:03 -07:00
cpu-freq
crypto
dev-tools kbuild: Add Propeller configuration for kernel build 2024-11-27 09:38:27 +09:00
devicetree phy fixes for 6.12 2024-11-03 10:19:34 -10:00
doc-guide doc-guide: add help documentation checktransupdate.rst 2024-07-30 07:56:22 -06:00
driver-api platform/x86: wmi: Update WMI driver API documentation 2024-10-06 12:48:52 +02:00
fault-injection Fix typo "allocateed" to allocated 2024-08-26 15:37:25 -06:00
fb
features x86: remove PG_uncached 2024-09-03 21:15:46 -07:00
filesystems vfs-6.12-rc6.fixes 2024-11-01 07:37:10 -10:00
firmware_class
firmware-guide
fpga
gpu Short summary of fixes pull: 2024-10-01 08:15:55 +10:00
hid Documentation: hid: intel-ish-hid: Add vendor custom firmware loading 2024-08-19 21:12:27 +02:00
hwmon hwmon: Remove devm_hwmon_device_unregister() API function 2024-09-13 07:27:36 -07:00
i2c i2c: testunit: add SMBusAlert trigger 2024-08-26 15:15:48 +02:00
iio docs: iio: ad7380: fix supply for ad7380-4 2024-10-24 18:30:47 +01:00
images
infiniband
input
isdn
kbuild kconfig: remove support for "bool" prompt for choice entries 2024-11-04 17:53:09 +09:00
kernel-hacking
leds - Limited LED current based on thermal conditions in the QCOM flash LED driver. 2024-09-23 14:20:11 -07:00
litmus-tests
livepatch Documentation: livepatch: Correct release locks antonym 2024-09-04 13:42:27 +02:00
locking
maintainer
mhi
misc-devices
mm Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs 2024-10-17 00:28:09 -07:00
netlabel
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-09-12 17:11:24 -07:00
networking docs: networking: packet_mmap: replace dead links with archive.org links 2024-10-28 15:47:10 -07:00
nvdimm
nvme Remove duplicate "and" in 'Linux NVMe docs. 2024-09-10 15:44:20 -06:00
PCI Documentation: PCI: fix typo in pci.rst 2024-09-10 15:30:42 -06:00
pcmcia
peci
power Documentation: PM: Discourage use of deprecated macros 2024-09-04 14:37:57 +02:00
process soc: fixes for 6.12 2024-10-17 09:43:36 -07:00
RCU Merge branches 'context_tracking.15.08.24a', 'csd.lock.15.08.24a', 'nocb.09.09.24a', 'rcutorture.14.08.24a', 'rcustall.09.09.24a', 'srcu.12.08.24a', 'rcu.tasks.14.08.24a', 'rcu_scaling_tests.15.08.24a', 'fixes.12.08.24a' and 'misc.11.08.24a' into next.09.09.24a 2024-09-09 00:09:47 +05:30
rust RISC-V: disallow gcc + rust builds 2024-10-25 06:18:37 -07:00
scheduler sched_ext: Documentation: Update instructions for running example schedulers 2024-10-08 08:49:18 -10:00
scsi
security documentation: add IPE documentation 2024-08-20 14:03:47 -04:00
sound Docs/sound: Add documentation for userspace-driven ALSA timers 2024-08-18 09:55:54 +02:00
sphinx docs: kerneldoc-preamble.sty: Suppress extra spaces in CJK literal blocks 2024-09-05 14:16:41 -06:00
sphinx-static
spi spi: Enable controllers to extend the SPI protocol with MOSI idle configuration 2024-07-29 01:19:51 +01:00
staging xz: remove XZ_EXTERN and extern from functions 2024-09-01 20:43:27 -07:00
target
tee
timers treewide: Fix wrong singular form of jiffies in comments 2024-09-08 20:47:40 +02:00
tools
trace tracing/Documentation: Start a document on how to debug with tracing 2024-08-26 13:54:08 -04:00
translations move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
usb usb: gadget: f_uac1: Change volume name and remove alt names 2024-08-13 18:11:35 +02:00
userspace-api mseal: update mseal.rst 2024-10-28 21:40:41 -07:00
virt KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL 2024-10-20 07:31:05 -04:00
w1
watchdog [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
wmi platform/x86: dell-ddv: Fix typo in documentation 2024-10-06 12:47:40 +02:00
.gitignore
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py
docutils.conf
dontdiff Kbuild updates for v6.12 2024-09-24 13:02:06 -07:00
index.rst
Kconfig
Makefile
memory-barriers.txt docs/memory-barriers.txt: Remove left-over references to "CACHE COHERENCY" 2024-09-13 23:56:44 -07:00
SubmittingPatches
subsystem-apis.rst