1310773 Commits

Author SHA1 Message Date
Masahiro Yamada
e2ff1219a5 setlocalversion: add -e option
Set the -e option to ensure this script fails on any unexpected errors.

Without this change, the kernel build may continue running with an
incorrect string in include/config/kernel.release.

Currently, try_tag() returns 1 when the expected tag is not found as an
ancestor, but this is a case where the script should continue.
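
A minimal sketch of the pattern this implies, using a hypothetical
stand-in for try_tag() rather than the script's real helper: under
'set -e', a function that legitimately returns non-zero has to be called
in a context that tolerates failure, such as an 'if' condition.

```shell
#!/bin/sh
set -e

# Hypothetical stand-in: returns 1 when the tag is "not found".
try_tag() {
	[ "$1" = v6.12 ]
}

# Testing the status in an 'if' condition keeps 'set -e' from aborting
# the script when try_tag reports "not found".
describe() {
	if try_tag "$1"; then
		echo "matched $1"
	else
		echo "no match, trying next candidate"
	fi
}

describe v6.12
describe v6.11
```

Calling try_tag on a line of its own would instead terminate the script
at the first candidate that is not found.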

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-28 08:11:56 +09:00
Rasmus Villemoes
523f3dbc18 setlocalversion: work around "git describe" performance
Contrary to expectations, passing a single candidate tag to "git
describe" is slower than not passing any --match options.

  $ time git describe --debug
  ...
  traversed 10619 commits
  ...
  v6.12-rc5-63-g0fc810ae3ae1

  real    0m0.169s

  $ time git describe --match=v6.12-rc5 --debug
  ...
  traversed 1310024 commits
  v6.12-rc5-63-g0fc810ae3ae1

  real    0m1.281s

In fact, the --debug output shows that git traverses all or most of
history. For some repositories and/or git versions, those 1.3s are
actually 10-15 seconds.

This has been acknowledged as a performance bug in git [1], and a fix
is on its way [2]. However, no solution is yet in git.git, and even
when one lands, it will take quite a while before it finds its way to
a release and for $random_kernel_developer to pick that up.

So rewrite the logic to use plumbing commands. For each of the
candidate values of $tag, we ask: (1) is $tag even an annotated
tag? (2) Is it eligible to describe HEAD, i.e. an ancestor of
HEAD? (3) If so, how many commits are in $tag..HEAD?
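
The three questions can be sketched with plumbing commands roughly as
follows. This is an illustration only, not the code in setlocalversion:
the function name and the "<tag>-<count>" output format are assumptions
made for the example.

```shell
#!/bin/sh
# Hypothetical helper: print "<tag>-<count>" if $1 is an annotated tag
# that is an ancestor of HEAD; return non-zero otherwise.
describe_with_plumbing() {
	tag=$1

	# (1) Is $tag an annotated tag? A lightweight tag resolves to "commit".
	[ "$(git cat-file -t "$tag" 2>/dev/null)" = tag ] || return 1

	# (2) Is it an ancestor of HEAD?
	git merge-base --is-ancestor "$tag" HEAD 2>/dev/null || return 1

	# (3) How many commits are in $tag..HEAD?
	count=$(git rev-list --count "$tag..HEAD") || return 1

	echo "$tag-$count"
}

describe_with_plumbing v6.12-rc5 || echo "v6.12-rc5 does not describe HEAD"
```

Each step is a cheap, bounded query, so none of them needs to walk the
whole history the way the --match case above does.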

I have tested that this produces the same output as the current script
for ~700 random commits between v6.9..v6.10. For those 700 commits,
and in my git repo, the 'make -s kernelrelease' command is on average
~4 times faster with this patch applied (geometric mean of ratios).
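
For reference, a geometric mean of ratios can be computed as below. The
input values are made up for illustration and are not measurements from
this patch; the function name is likewise hypothetical.

```shell
#!/bin/sh
# Geometric mean of per-commit ratios (old time / new time), one per
# line on stdin: exp(mean(log(ratio))).
geomean() {
	awk '{ sum += log($1); n++ } END { if (n) printf "%.2f\n", exp(sum / n) }'
}

# Hypothetical ratios, not data from the patch.
printf '%s\n' 2 8 4 | geomean
```

Unlike an arithmetic mean, this treats a 2x speedup and a 2x slowdown as
cancelling out, which is why it suits averaging ratios.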

For the commit mentioned in Josh's original report [3], the
time-consuming part of setlocalversion goes from

$ time git describe --match=v6.12-rc5 c1e939a21e
v6.12-rc5-44-gc1e939a21eb1

real    0m1.210s

to

$ time git rev-list --count --left-right v6.12-rc5..c1e939a21eb1
0       44

real    0m0.037s

[1] https://lore.kernel.org/git/20241101113910.GA2301440@coredump.intra.peff.net/
[2] https://lore.kernel.org/git/20241106192236.GC880133@coredump.intra.peff.net/
[3] https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/

Reported-by: Sean Christopherson <seanjc@google.com>
Closes: https://lore.kernel.org/lkml/ZPtlxmdIJXOe0sEy@google.com/
Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Closes: https://lore.kernel.org/lkml/309549cafdcfe50c4fceac3263220cc3d8b109b2.1730337435.git.jpoimboe@kernel.org/
Tested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-28 08:11:56 +09:00
Parth Pancholi
e397a603e4 kbuild: switch from lz4c to lz4 for compression
Replace lz4c with lz4 for kernel image compression.
Although lz4 and lz4c are functionally similar, lz4c has been deprecated
upstream since 2018. Since as early as Ubuntu 16.04 and Fedora 25, lz4
and lz4c have been packaged together, making it safe to update the
requirement from lz4c to lz4.

Consequently, some distributions and build systems, such as OpenEmbedded,
have fully transitioned to using lz4. OpenEmbedded core adopted this
change in commit fe167e082cbd ("bitbake.conf: require lz4 instead of
lz4c"), causing compatibility issues when building the mainline kernel
in the latest OpenEmbedded environment, as seen in the errors below.

This change also updates the LZ4 compression commands to remain backward
compatible. The stdin and stdout keywords are replaced with '-': for
reasons that are unclear, the stdout keyword does not work with lz4, while
'-' works with both lz4 and lz4c. In addition, the legacy '-c1' option is
replaced with '-9', which is also accepted by both. This fixes the
mainline kernel build failures seen with the latest OpenEmbedded master
builds.

LZ4     arch/arm/boot/compressed/piggy_data
/bin/sh: 1: lz4c: not found
...
...
ERROR: oe_runmake failed
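
The command-line form described above can be sketched as a small shell
wrapper. This assumes an lz4 binary is installed; the function name and
the file names in the usage comment are hypothetical, and the exact
Kbuild rule may differ from this illustration.

```shell
#!/bin/sh
# Compress stdin to stdout in the legacy (-l) format at level 9.
# The two '-' arguments stand for stdin and stdout, replacing the
# 'stdin stdout' keywords, and '-9' replaces the legacy '-c1'.
compress_lz4_legacy() {
	lz4 -l -9 - -
}

# Usage (hypothetical file names):
#   compress_lz4_legacy < piggy_data.in > piggy_data
```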

Link: https://github.com/lz4/lz4/pull/553
Suggested-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: Parth Pancholi <parth.pancholi@toradex.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-28 08:11:56 +09:00
Masahiro Yamada
1b466b29a3 kbuild: re-enable KCSAN for autogenerated *.mod.c intermediaries
This reverts commit 54babdc034 ("kbuild: Disable KCSAN for
autogenerated *.mod.c intermediaries").

Now that objtool is enabled for *.mod.c, there is no need to filter
out CFLAGS_KCSAN.

I no longer see "Unpatched return thunk in use. This should not happen!"
error with KCSAN when loading a module.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
bede169618 kbuild: enable objtool for *.mod.o and additional kernel objects
Currently, objtool is disabled in scripts/Makefile.{modfinal,vmlinux}.

This commit moves rule_cc_o_c and rule_as_o_S to scripts/Makefile.lib
and sets objtool-enabled to y there.

With this change, *.mod.o, .module-common.o, builtin-dtb.o, and
vmlinux.export.o will now be covered by objtool.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
000e22a80d kbuild: move cmd_cc_o_c and cmd_as_o_S to scripts/Makefile.lib
The cmd_cc_o_c and cmd_as_o_S macros are duplicated in
scripts/Makefile.{build,modfinal,vmlinux}.

This commit factors them out to scripts/Makefile.lib.

No functional changes are intended.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
91ca8be3c4 kbuild: remove support for single %.symtypes build rule
This rule is unnecessary because you can generate foo/bar.symtypes
as a side effect using:

  $ make KBUILD_SYMTYPES=1 foo/bar.o

While compiling *.o is slower than preprocessing, the impact is
negligible. I prioritize keeping the code simpler.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
c2386abf55 kbuild: do not pass -r to genksyms when *.symref does not exist
There is no need to pass '-r /dev/null', which is a no-op.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
8cd07cc6c8 kbuild: allow to start building external modules in any directory
Unless an explicit O= option is provided, external module builds must
start from the kernel directory.

This can be achieved by using the -C option:

  $ make -C /path/to/kernel M=/path/to/external/module

This commit allows starting external module builds from any directory,
so you can also do the following:

  $ make -f /path/to/kernel/Makefile M=/path/to/external/module

The key difference is that the -C option changes the working directory
and parses the Makefile located there, while the -f option only
specifies the Makefile to use.
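
The difference can be observed with a throwaway Makefile that only
prints $(CURDIR). This sketch assumes GNU make; the directory, target,
and function names are invented for the demonstration.

```shell
#!/bin/sh
# Throwaway Makefile that only reports make's working directory.
demo_dir=$(mktemp -d)
printf 'show:\n\t@echo $(CURDIR)\n' > "$demo_dir/Makefile"

# -C: make changes into the directory, then reads the Makefile there.
run_with_C() {
	make -s --no-print-directory -C "$1" show
}

# -f: make stays in the current directory and only reads that Makefile.
run_with_f() {
	make -s -f "$1/Makefile" show
}

run_with_C "$demo_dir"   # prints the demo directory
run_with_f "$demo_dir"   # prints the directory this was run from
```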

As shown in the examples in Documentation/kbuild/modules.rst, external
modules usually have a wrapper Makefile that allows you to build them
without specifying any make arguments. The Makefile typically contains
a rule as follows:

    KDIR ?= /path/to/kernel
    default:
            $(MAKE) -C $(KDIR) M=$(CURDIR) $(MAKECMDGOALS)

The log will appear as follows:

    $ make
    make -C /path/to/kernel M=/path/to/external/module
    make[1]: Entering directory '/path/to/kernel'
    make[2]: Entering directory '/path/to/external/module'
      CC [M]  helloworld.o
      MODPOST Module.symvers
      CC [M]  helloworld.mod.o
      CC [M]  .module-common.o
      LD [M]  helloworld.ko
    make[2]: Leaving directory '/path/to/external/module'
    make[1]: Leaving directory '/path/to/kernel'

This changes the working directory twice because the -C option first
switches to the kernel directory, and then Kbuild internally recurses
back to the external module directory.

With this commit, the wrapper Makefile can directly include the kernel
Makefile:

    KDIR ?= /path/to/kernel
    export KBUILD_EXTMOD := $(realpath $(dir $(lastword $(MAKEFILE_LIST))))
    include $(KDIR)/Makefile

This avoids unnecessary sub-make invocations:

    $ make
      CC [M]  helloworld.o
      MODPOST Module.symvers
      CC [M]  helloworld.mod.o
      CC [M]  .module-common.o
      LD [M]  helloworld.ko

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
a2a45ebee0 kbuild: make wrapper Makefile more convenient for external modules
When Kbuild starts building in a separate output directory, it generates
a wrapper Makefile, allowing you to invoke 'make' from the output
directory.

This commit makes it more convenient, so you can invoke 'make' without
M= or MO=.

First, you need to build external modules in a separate directory:

  $ make M=/path/to/module/source/dir MO=/path/to/module/build/dir

Once the wrapper Makefile is generated in /path/to/module/build/dir,
you can proceed as follows:

  $ cd /path/to/module/build/dir
  $ make

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
822b11a74b kbuild: use absolute path in the generated wrapper Makefile
Keep the behavior consistent when this Makefile is invoked from another
directory.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
1d3730f001 kbuild: support -fmacro-prefix-map for external modules
This commit makes -fmacro-prefix-map work for external modules built in
a separate output directory. It improves the reproducibility of external
modules and provides the benefits described in commit a73619a845
("kbuild: use -fmacro-prefix-map to make __FILE__ a relative path").

When building_out_of_srctree is not defined (e.g., when the kernel or
external module is built in the source directory), this option is
unnecessary.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
11b3d5175e kbuild: support building external modules in a separate build directory
There has been a long-standing request to support building external
modules in a separate build directory.

This commit introduces a new environment variable, KBUILD_EXTMOD_OUTPUT,
and its shorthand Make variable, MO.

A simple usage:

 $ make -C <kernel-dir> M=<module-src-dir> MO=<module-build-dir>

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
bad6beb2c0 kbuild: remove extmod_prefix, MODORDER, MODULES_NSDEPS variables
With the previous changes, $(extmod_prefix), $(MODORDER), and
$(MODULES_NSDEPS) are constant. (empty, modules.order, and
modules.nsdeps, respectively).

Remove these variables and hard-code their values.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-28 08:11:55 +09:00
Masahiro Yamada
13b25489b6 kbuild: change working directory to external module directory with M=
Currently, Kbuild always operates in the output directory of the kernel,
even when building external modules. This increases the risk of external
module Makefiles attempting to write to the kernel directory.

This commit switches the working directory to the external module
directory, allowing the removal of the $(KBUILD_EXTMOD)/ prefix from
some build artifacts.

The command for building external modules maintains backward
compatibility, but Makefiles that rely on working in the kernel
directory may break. In such cases, $(objtree) and $(srctree) should
be used to refer to the output and source directories of the kernel.

The appearance of the build log will change as follows:

[Before]

  $ make -C /path/to/my/linux M=/path/to/my/external/module
  make: Entering directory '/path/to/my/linux'
    CC [M]  /path/to/my/external/module/helloworld.o
    MODPOST /path/to/my/external/module/Module.symvers
    CC [M]  /path/to/my/external/module/helloworld.mod.o
    CC [M]  /path/to/my/external/module/.module-common.o
    LD [M]  /path/to/my/external/module/helloworld.ko
  make: Leaving directory '/path/to/my/linux'

[After]

  $ make -C /path/to/my/linux M=/path/to/my/external/module
  make: Entering directory '/path/to/my/linux'
  make[1]: Entering directory '/path/to/my/external/module'
    CC [M]  helloworld.o
    MODPOST Module.symvers
    CC [M]  helloworld.mod.o
    CC [M]  .module-common.o
    LD [M]  helloworld.ko
  make[1]: Leaving directory '/path/to/my/external/module'
  make: Leaving directory '/path/to/my/linux'

Printing "Entering directory" twice is cumbersome. This will be
addressed later.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
2024-11-28 08:10:23 +09:00
Masahiro Yamada
d171136019 kbuild: use 'output' variable to create the output directory
$(KBUILD_OUTPUT) specifies the output directory of kernel builds.

Use a more generic name, 'output', to better reflect this code hunk in
the context of external module builds.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-27 09:38:27 +09:00
Masahiro Yamada
5ea1721654 kbuild: rename abs_objtree to abs_output
'objtree' refers to the top of the output directory of kernel builds.

Rename abs_objtree to a more generic name, to better reflect its use in
the context of external module builds.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-27 09:38:27 +09:00
Masahiro Yamada
214c0eea43 kbuild: add $(objtree)/ prefix to some in-kernel build artifacts
$(objtree) refers to the top of the output directory of kernel builds.

This commit adds the explicit $(objtree)/ prefix to build artifacts
needed for building external modules.

This change has no immediate impact, as the top-level Makefile
currently defines:

  objtree         := .

This commit prepares for supporting the building of external modules
in a different directory.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-11-27 09:38:27 +09:00
Masahiro Yamada
0afd73c5f5 kbuild: replace two $(abs_objtree) with $(CURDIR) in top Makefile
Kbuild changes the working directory until it matches $(abs_objtree).

When $(need-sub-make) is empty, $(abs_objtree) is the same as $(CURDIR).

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
2024-11-27 09:38:27 +09:00
Matt Fleming
bcbbf493f2 kbuild: deb-pkg: Don't fail if modules.order is missing
Kernels built without CONFIG_MODULES might still want to create -dbg deb
packages but install_linux_image_dbg() assumes modules.order always
exists. This obviously isn't true if no modules were built, so we should
skip reading modules.order in that case.

Fixes: 16c36f8864 ("kbuild: deb-pkg: use build ID instead of debug link for dbg package")
Signed-off-by: Matt Fleming <mfleming@cloudflare.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Masahiro Yamada
dbefa1f31a Rename .data.once to .data..once to fix resetting WARN*_ONCE
Commit b1fca27d38 ("kernel debug: support resetting WARN*_ONCE")
added support for clearing the state of once warnings. However,
it is not functional when CONFIG_LD_DEAD_CODE_DATA_ELIMINATION or
CONFIG_LTO_CLANG is enabled, because .data.once matches the
.data.[0-9a-zA-Z_]* pattern in the DATA_MAIN macro.

Commit cb87481ee8 ("kbuild: linker script do not match C names unless
LD_DEAD_CODE_DATA_ELIMINATION is configured") was introduced to suppress
the issue for the default CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n case,
providing a minimal fix for stable backporting. We were aware this did
not address the issue for CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y. The
plan was to apply correct fixes and then revert cb87481ee8. [1]

Seven years have passed since then, yet the #ifdef workaround remains in
place. Meanwhile, commit b1fca27d38 introduced the .data.once section,
and commit dc5723b02e ("kbuild: add support for Clang LTO") extended
the #ifdef.

Using a ".." separator in the section name fixes the issue for
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION and CONFIG_LTO_CLANG.
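
The effect of the ".." separator can be checked with an ordinary shell
glob, which behaves like the linker-script pattern for this purpose. The
function name is hypothetical; the pattern is the one quoted above.

```shell
#!/bin/sh
# Return 0 if the section name matches the DATA_MAIN-style glob
# .data.[0-9a-zA-Z_]* quoted in the commit message.
matches_data_main() {
	case $1 in
	.data.[0-9a-zA-Z_]*) return 0 ;;
	*) return 1 ;;
	esac
}

for sec in .data.once .data..once; do
	if matches_data_main "$sec"; then
		echo "$sec: swallowed by the generic pattern"
	else
		echo "$sec: left for its dedicated rule"
	fi
done
```

The character after ".data." must be alphanumeric or an underscore, so a
second dot takes the section name out of the generic pattern's reach.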

[1]: https://lore.kernel.org/linux-kbuild/CAK7LNASck6BfdLnESxXUeECYL26yUDm0cwRZuM4gmaWUkxjL5g@mail.gmail.com/

Fixes: b1fca27d38 ("kernel debug: support resetting WARN*_ONCE")
Fixes: dc5723b02e ("kbuild: add support for Clang LTO")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Masahiro Yamada
bb43a59944 Rename .data.unlikely to .data..unlikely
Commit 7ccaba5314 ("consolidate WARN_...ONCE() static variables")
was intended to collect all .data.unlikely sections into one chunk.
However, this has not worked when CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
or CONFIG_LTO_CLANG is enabled, because .data.unlikely matches the
.data.[0-9a-zA-Z_]* pattern in the DATA_MAIN macro.

Commit cb87481ee8 ("kbuild: linker script do not match C names unless
LD_DEAD_CODE_DATA_ELIMINATION is configured") was introduced to suppress
the issue for the default CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=n case,
providing a minimal fix for stable backporting. We were aware this did
not address the issue for CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y. The
plan was to apply correct fixes and then revert cb87481ee8. [1]

Seven years have passed since then, yet the #ifdef workaround remains in
place.

Using a ".." separator in the section name fixes the issue for
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION and CONFIG_LTO_CLANG.

[1]: https://lore.kernel.org/linux-kbuild/CAK7LNASck6BfdLnESxXUeECYL26yUDm0cwRZuM4gmaWUkxjL5g@mail.gmail.com/

Fixes: cb87481ee8 ("kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Rong Xu
d63b852430 kbuild: Fix Propeller build option
The '-fbasic-block-sections=labels' option has been deprecated in tip
of tree clang (20.0.0) [1]. While the option still works, a warning is
emitted:

  clang: warning: argument '-fbasic-block-sections=labels' is deprecated, use '-fbasic-block-address-map' instead [-Wdeprecated]

Add a version check to set the proper option.
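
The selection logic amounts to the following, shown here as a shell
sketch rather than the Makefile syntax the kernel actually uses. The
function name is hypothetical, and the cutoff at major version 20 is
taken from the deprecation note above.

```shell
#!/bin/sh
# Pick the basic-block option from the Clang major version.
propeller_bb_flag() {
	if [ "$1" -ge 20 ]; then
		echo "-fbasic-block-address-map"
	else
		echo "-fbasic-block-sections=labels"
	fi
}

propeller_bb_flag 19
propeller_bb_flag 20
```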

Link: https://github.com/llvm/llvm-project/pull/110039 [1]

Signed-off-by: Rong Xu <xur@google.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Rong Xu
d5dc958361 kbuild: Add Propeller configuration for kernel build
Add the build support for using Clang's Propeller optimizer. Like
AutoFDO, Propeller uses hardware sampling to gather information
about the frequency of execution of different code paths within a
binary. This information is then used to guide the compiler's
optimization decisions, resulting in a more efficient binary.

This support requires Clang from LLVM 19 or later, and the
create_llvm_prof tool
(https://github.com/google/autofdo/releases/tag/v0.30.1). This
commit is limited to x86 platforms that support PMU features
like LBR on Intel machines and AMD Zen3 BRS.

Here is an example workflow for building an AutoFDO+Propeller
optimized kernel:

1) Build the kernel on the host machine, with AutoFDO and Propeller
   build config
      CONFIG_AUTOFDO_CLANG=y
      CONFIG_PROPELLER_CLANG=y
   then
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile>

“<autofdo_profile>” is the profile collected when doing a non-Propeller
AutoFDO build. This step builds a kernel that has the same optimization
level as AutoFDO, plus a metadata section that records basic block
information. This kernel image runs as fast as an AutoFDO optimized
kernel.

2) Install the kernel on test/production machines.

3) Run the load tests. The '-c' option in perf specifies the sample
   event period. We suggest using a suitable prime number,
   like 500009, for this purpose.
   For Intel platforms:
      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \
        -o <perf_file> -- <loadtest>
   For AMD platforms:
      The supported systems are: Zen3 with BRS, or Zen4 with amd_lbr_v2
      # To see if Zen3 supports LBR:
      $ cat /proc/cpuinfo | grep " brs"
      # To see if Zen4 supports LBR:
      $ cat /proc/cpuinfo | grep amd_lbr_v2
      # If the result is yes, then collect the profile using:
      $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \
        -N -b -c <count> -o <perf_file> -- <loadtest>

4) (Optional) Download the raw perf file to the host machine.

5) Generate Propeller profile:
   $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
     --format=propeller --propeller_output_module_name \
     --out=<propeller_profile_prefix>_cc_profile.txt \
     --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt

   “create_llvm_prof” is the profile conversion tool. A prebuilt
   binary for Linux can be found at
   https://github.com/google/autofdo/releases/tag/v0.30.1 (it can also
   be built from source).

   "<propeller_profile_prefix>" can be something like
   "/home/user/dir/any_string".

   This command generates a pair of Propeller profiles:
   "<propeller_profile_prefix>_cc_profile.txt" and
   "<propeller_profile_prefix>_ld_profile.txt".

6) Rebuild the kernel using the AutoFDO and Propeller profile files.
      CONFIG_AUTOFDO_CLANG=y
      CONFIG_PROPELLER_CLANG=y
   and
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo_profile> \
        CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix>
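
The AMD capability check in step 3 can be wrapped in a small helper, as
sketched below. The function name and its output strings are
assumptions for illustration; it only looks for the two cpuinfo flags
named above.

```shell
#!/bin/sh
# Report which branch-sampling flag the given cpuinfo file advertises.
# Defaults to /proc/cpuinfo; a path can be passed for testing.
amd_branch_sampling() {
	cpuinfo=${1:-/proc/cpuinfo}
	if grep -qw amd_lbr_v2 "$cpuinfo"; then
		echo amd_lbr_v2
	elif grep -qw brs "$cpuinfo"; then
		echo brs
	else
		echo none
	fi
}

amd_branch_sampling
```

A result of "none" on an AMD machine means the perf command in step 3
is not expected to collect usable branch records.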

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Stephane Eranian <eranian@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Rong Xu
2fd65f7afd AutoFDO: Enable machine function split optimization for AutoFDO
Enable the machine function split optimization for AutoFDO in Clang.

Machine function split (MFS) is a pass in the Clang compiler that
splits a function into hot and cold parts. The linker groups all
cold blocks across functions together. This decreases hot code
fragmentation and improves iCache and iTLB utilization.

MFS requires a profile so this is enabled only for the AutoFDO builds.

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Yabin Cui <yabinc@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Rong Xu
0847420f5e AutoFDO: Enable -ffunction-sections for the AutoFDO build
Enable -ffunction-sections by default for the AutoFDO build.

With -ffunction-sections, the compiler places each function in its own
section named .text.function_name instead of placing all functions in
the .text section. In the AutoFDO build, this allows the linker to
utilize profile information to reorganize functions for improved
utilization of iCache and iTLB.

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Yabin Cui <yabinc@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Rong Xu
db0b2991ae vmlinux.lds.h: Add markers for text_unlikely and text_hot sections
Add markers like __hot_text_start, __hot_text_end, __unlikely_text_start,
and __unlikely_text_end which will be included in System.map. These markers
indicate how the compiler groups functions, providing valuable information
to developers about the layout and optimization of the code.

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Yabin Cui <yabinc@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Rong Xu
0043ecea23 vmlinux.lds.h: Adjust symbol ordering in text output section
When the -ffunction-sections compiler option is enabled, each function
is placed in a separate section named .text.function_name rather than
putting all functions in a single .text section.

However, using -ffunction-sections can cause problems with the
linker script. The comments in include/asm-generic/vmlinux.lds.h
note these issues:
  “TEXT_MAIN here will match .text.fixup and .text.unlikely if dead
   code elimination is enabled, so these sections should be converted
   to use ".." first.”

It is unclear whether there is a straightforward method for converting
a suffix to "..".

This patch modifies the order of subsections within the text output
section. Specifically, it changes the current order:
  .text.hot, .text, .text.unlikely, .text.unknown, .text.asan
to the new order:
  .text.asan, .text.unknown, .text.unlikely, .text.hot, .text

Here is the rationale behind the new layout:

The majority of the code resides in three sections: .text.hot, .text,
and .text.unlikely, with .text.unknown containing a negligible amount.
.text.asan is only generated in ASAN builds.

The primary goal is to group code segments based on their execution
frequency (hotness).

First, we want to place .text.hot adjacent to .text. Since placing
.text.hot after .text is problematic (due to constraints with
-ffunction-sections), we need to put .text.hot before .text.

Next comes .text.unlikely. We cannot put it after .text either (same
-ffunction-sections issue). Therefore, we position .text.unlikely
before .text.hot.

.text.unknown and .text.asan follow the same logic.

This revised ordering effectively reverses the original arrangement (for
.text.unlikely, .text.unknown, and .text.asan), maintaining a similar level
of affinity between sections.

It also places .text.hot section at the beginning of a page to better
utilize the TLB entry.

Note that the limitation arises because the linker script employs glob
patterns instead of regular expressions for string matching. While there
is a method to maintain the current order using complex patterns, this
significantly complicates the pattern and increases the likelihood of
errors.
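
The ordering constraint can be illustrated with a shell 'case'
statement, which, like the linker script, takes the first pattern that
matches. This is a simplified sketch: the function names are
hypothetical and the patterns are reduced versions of the ones in
vmlinux.lds.h.

```shell
#!/bin/sh
# New order: specific patterns first, generic .text glob last.
classify_new_order() {
	case $1 in
	.text.unlikely|.text.unlikely.*) echo unlikely ;;
	.text.hot|.text.hot.*) echo hot ;;
	.text|.text.[0-9a-zA-Z_]*) echo main ;;
	*) echo other ;;
	esac
}

# Generic glob first: it captures .text.hot before the hot rule is tried.
classify_main_first() {
	case $1 in
	.text|.text.[0-9a-zA-Z_]*) echo main ;;
	.text.hot|.text.hot.*) echo hot ;;
	*) echo other ;;
	esac
}

classify_new_order .text.hot
classify_main_first .text.hot
```

With the generic glob listed first, .text.hot is classified as ordinary
text, which is why the specific sections have to precede .text.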

This patch also changes vmlinux.lds.S for the sparc64 architecture to
accommodate specific symbol placement requirements.

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Yabin Cui <yabinc@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:38:27 +09:00
Rong Xu
52892ed6b0 MIPS: Place __kernel_entry at the beginning of text section
Mark __kernel_entry as ".head.text" and place HEAD_TEXT before
TEXT_TEXT in the linker script. This ensures that __kernel_entry
will be placed at the beginning of text section.

Drop mips from scripts/head-object-list.txt.

Signed-off-by: Rong Xu <xur@google.com>
Reported-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Closes: https://lore.kernel.org/lkml/c6719149-8531-4174-824e-a3caf4bc6d0e@alliedtelesis.co.nz/T/
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-27 09:36:01 +09:00
Rong Xu
18e885099f objtool: Fix unreachable instruction warnings for weak functions
In the presence of both weak and strong function definitions, the
linker drops the weak symbol in favor of a strong symbol, but
leaves the code in place. Code in ignore_unreachable_insn() has
some heuristics to suppress the warning, but it does not work when
-ffunction-sections is enabled.

Suppose function foo has both strong and weak definitions.
Case 1: The strong definition has an annotated section name,
like .init.text. Only the weak definition will be placed into
.text.foo. But since the section has no symbols, there will be no
"hole" in the section.

Case 2: Both sections are without an annotated section name.
Both will be placed into .text.foo section, but there will be only one
symbol (the strong one). If the weak code is before the strong code,
there is no "hole" as it fails to find the right-most symbol before
the offset.

The fix is to use the first node to compute the hole if hole.sym
is empty. If there is no symbol in the section, the first node
will be NULL, in which case, -1 is returned to skip the whole
section.

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Yabin Cui <yabinc@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 22:41:10 +09:00
Rong Xu
315ad8780a kbuild: Add AutoFDO support for Clang build
Add the build support for using Clang's AutoFDO. Building the kernel
with AutoFDO does not reduce the optimization level from the
compiler. AutoFDO uses hardware sampling to gather information about
the frequency of execution of different code paths within a binary.
This information is then used to guide the compiler's optimization
decisions, resulting in a more efficient binary. Experiments
showed that the kernel can improve up to 10% in latency.

The support requires a Clang compiler after LLVM 17. This submission
is limited to x86 platforms that support PMU features like LBR on
Intel machines and AMD Zen3 BRS. Support for SPE and BRBE on ARM is
part of planned future work.

Here is an example workflow for AutoFDO kernel:

1) Build the kernel on the host machine with LLVM enabled, for example,
       $ make menuconfig LLVM=1
    Turn on AutoFDO build config:
      CONFIG_AUTOFDO_CLANG=y
    With a configuration that has LLVM enabled, use the following
    command:
       scripts/config -e AUTOFDO_CLANG
    After getting the config, build with
      $ make LLVM=1

2) Install the kernel on the test machine.

3) Run the load tests. The '-c' option in perf specifies the sample
   event period. We suggest using a suitable prime number,
   like 500009, for this purpose.
   For Intel platforms:
      $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> \
        -o <perf_file> -- <loadtest>
   For AMD platforms:
      The supported systems are Zen3 with BRS, or Zen4 with amd_lbr_v2.
      For Zen3:
      $ cat /proc/cpuinfo | grep " brs"
      For Zen4:
      $ cat /proc/cpuinfo | grep amd_lbr_v2
      $ perf record --pfm-events RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a \
        -N -b -c <count> -o <perf_file> -- <loadtest>

4) (Optional) Download the raw perf file to the host machine.

5) To generate an AutoFDO profile, two offline tools are available:
   create_llvm_prof and llvm-profgen. The create_llvm_prof tool is part
   of the AutoFDO project and can be found on GitHub
   (https://github.com/google/autofdo), version v0.30.1 or later. The
   llvm-profgen tool is included in the LLVM compiler itself. Note that
   the version of llvm-profgen doesn't need to match the version of
   Clang. It needs to be the LLVM 19 release or later, or from the LLVM
   trunk.
      $ llvm-profgen --kernel --binary=<vmlinux> --perfdata=<perf_file> \
        -o <profile_file>
   or
      $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> \
        --format=extbinary --out=<profile_file>

   Note that multiple AutoFDO profile files can be merged into one via:
      $ llvm-profdata merge -o <profile_file> <profile_1> ... <profile_n>

6) Rebuild the kernel using the AutoFDO profile file with the same config
   as step 1 (note that CONFIG_AUTOFDO_CLANG needs to be enabled):
      $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file>
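
The command lines from steps 3, 5 and 6 can be assembled programmatically.
The following is only a sketch: the function names, the vendor strings
"intel" and "amd", and the file names in the demo are hypothetical, and only
the perf, llvm-profgen and make options are taken from the workflow above.

```python
import shlex

# Sample period suggested in step 3 (a prime number).
DEFAULT_PERIOD = 500009


def perf_record_cmd(vendor, perf_file, loadtest, period=DEFAULT_PERIOD):
    """Return the 'perf record' argv from step 3 for 'intel' or 'amd'."""
    if vendor == "intel":
        event = ["-e", "BR_INST_RETIRED.NEAR_TAKEN:k"]
    elif vendor == "amd":
        event = ["--pfm-events", "RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k"]
    else:
        raise ValueError(f"unsupported vendor: {vendor!r}")
    return (["perf", "record"] + event
            + ["-a", "-N", "-b", "-c", str(period), "-o", perf_file, "--"]
            + list(loadtest))


def profgen_cmd(vmlinux, perf_file, profile_file):
    """Return the llvm-profgen argv from step 5."""
    return ["llvm-profgen", "--kernel", f"--binary={vmlinux}",
            f"--perfdata={perf_file}", "-o", profile_file]


def rebuild_cmd(profile_file):
    """Return the make argv from step 6."""
    return ["make", "LLVM=1", f"CLANG_AUTOFDO_PROFILE={profile_file}"]


if __name__ == "__main__":
    # Placeholder file names, chosen for illustration only.
    for argv in (perf_record_cmd("intel", "perf.data", ["./loadtest"]),
                 profgen_cmd("vmlinux", "perf.data", "kernel.afdo"),
                 rebuild_cmd("kernel.afdo")):
        print(shlex.join(argv))
```

Building argument lists rather than shell strings avoids quoting problems
when the load test command itself contains spaces.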

Co-developed-by: Han Shen <shenhan@google.com>
Signed-off-by: Han Shen <shenhan@google.com>
Signed-off-by: Rong Xu <xur@google.com>
Suggested-by: Sriraman Tallam <tmsriram@google.com>
Suggested-by: Krzysztof Pszeniczny <kpszeniczny@google.com>
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Suggested-by: Stephane Eranian <eranian@google.com>
Tested-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: Yabin Cui <yabinc@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Tested-by: Peter Jung <ptr1337@cachyos.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 22:41:09 +09:00
Masahiro Yamada
397a479b51 kbuild: simplify rustfmt target
There is no need to prune the rust/alloc directory because it was
removed by commit 9d0441bab7 ("rust: alloc: remove our fork of the
`alloc` crate").

There is no need to prune the rust/test directory because no '*.rs'
files are generated within it.

To avoid forking the 'grep -Fv generated' process, filter out generated
files using the option, ! -name '*generated*'.

Now that the '-path ... -prune' option is no longer used, there is no
need to use the absolute path. Searching in $(srctree), which can be
a relative path, is sufficient.

The comment mentions the use case where $(srctree) is '..', that is,
$(objtree) is a sub-directory of $(srctree). In this scenario, all
'*.rs' files under $(objtree) are generated files and are filtered out by
the '*generated*' pattern.

Add $(RCS_FIND_IGNORE) as a shortcut. Although I do not believe '*.rs'
files would exist under the .git directory, there is no need for the
'find' command to traverse it.
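
The resulting selection logic can be modelled outside of 'find'. This is a
rough Python sketch, not the Makefile rule itself: the function name is
hypothetical, and the directory list is an assumption about what
$(RCS_FIND_IGNORE) prunes.

```python
import fnmatch
import os

# Assumed set of directory names pruned by $(RCS_FIND_IGNORE).
RCS_DIRS = {"SCCS", "BitKeeper", ".svn", "CVS", ".pc", ".hg", ".git"}


def rust_sources(srctree):
    """Yield '*.rs' files under srctree, skipping generated ones."""
    for dirpath, dirnames, filenames in os.walk(srctree):
        # Editing dirnames in place stops os.walk from descending into
        # revision-control directories, like '-prune' does for find.
        dirnames[:] = sorted(d for d in dirnames if d not in RCS_DIRS)
        for name in sorted(filenames):
            # Equivalent of: -name '*.rs' ! -name '*generated*'
            if (fnmatch.fnmatchcase(name, "*.rs")
                    and not fnmatch.fnmatchcase(name, "*generated*")):
                yield os.path.join(dirpath, name)
```

Because the filter only looks at file names, it works the same whether
srctree is given as an absolute or a relative path.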

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
2024-11-06 22:39:58 +09:00
Masahiro Yamada
a49401be4c kconfig: document the positional argument in the help message
The positional argument specifies the top-level Kconfig. Include this
information in the help message.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:35 +09:00
Masahiro Yamada
d6a91e28d1 kconfig: qconf: remove unnecessary mode check in ConfigItem::updateMenu()
The P_MENU entries ("menu" and "menuconfig") are never displayed in
symbolMode.

The condition, list->mode == symbolMode, is never met here.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
a914032b71 kconfig: qconf: refactor ConfigInfoView::clicked()
Most of the code in ConfigInfoView::clicked() is unnecessary.
There is no need to use the regular expression to search for a symbol.
Calling sym_find() is simpler and faster.

The hyperlink always begins with the "s" tag, and there is no other
tag used. Remove it.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
bce590f102 kconfig: add sym_get_prompt_menu() helper function
Split out the code that retrieves the menu entry with a prompt, so it
can be reused in other functions.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
929ce506d6 kconfig: qconf: remove non-functional href="m..." tag
The only functional tag is href="s<symbol_name>".

Commit c4f7398bee ("kconfig: qconf: make debug links work again")
changed prop->name to sym->name for this reference, but it did not
change the tag "m" to "s".

This tag is not functional at all.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
8e8ce9531e kconfig: qconf: remove redundant check in goBack()
The same check is performed in the configList->setParentMenu() call.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
572cd1d2a9 kconfig: qconf: avoid unnecessary parentSelected() when ESC is pressed
When the ESC key is pressed, the parentSelected() signal is currently
emitted for singleMode, menuMode, and symbolMode.

However, parentSelected() signal is functional only for singleMode.

In menuMode, the signal is connected to the goBack() slot, but nothing
occurs because configList->rootEntry is always &rootmenu.

In symbolMode (in the right pane), the parentSelected() signal is not
connected to any slot.

This commit prevents the unnecessary emission of the parentSelected()
signal.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
511ff539c3 kconfig: qconf: remove ConfigItem::visible member
The " (NEW)" string should be displayed regardless of the visibility
of the associated menu.

The ConfigItem::visible member is not used for any other purpose.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
76567f93b3 kconfig: qconf: do not show goParent button in split view
When a menu is selected in the split view, the right pane displays the
goParent button, but it is never functional.

This is unnecessary, as you can select a menu from the menu tree in the
left pane.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
b6962d8694 kconfig: qconf: convert the last old connection syntax to Qt5 style
Commit a2574c12df ("kconfig: qconf: convert to Qt5 new signal/slot
connection syntax") converted most of the old string-based connections,
but one more instance still remains. Convert it to the new style.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
375a4f4ea7 kconfig: qconf: remove unnecessary lastWindowClosed() signal connection
The default value of the quitOnLastWindowClosed property is true.

Hence, the application implicitly quits when the last window is closed.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
ac845932cb kconfig: qconf: remove unnecessary setRootIsDecorated() call
The default value of the rootIsDecorated property is true.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
0bab492cfe kconfig: qconf: remove redundant type check for choice members
Since commit fde192511b ("kconfig: remove tristate choice support"),
choice members are always boolean. The type check is redundant.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Masahiro Yamada
4a798a1e10 kconfig: qconf: remove mouse{Press,Move}Event() functions
These functions simply pass the event to the parent.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Rolf Eike Beer
8b36d3f2e6 kconfig: qconf: simplify character replacement
Replace the hand-crafted lookup table with a QHash. This has the nice benefit
that the added offsets cannot get out of sync with the length of the
replacement strings.
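
The idea can be illustrated with a Python dict standing in for the QHash.
This is a sketch only, not the qconf code: the mapping (HTML-escaping of '<',
'>' and '&') and the function name are assumptions made for illustration.

```python
# Assumed mapping, used purely for illustration.
REPLACEMENTS = {"<": "&lt;", ">": "&gt;", "&": "&amp;"}


def escape(text):
    """Replace mapped characters in place, scanning left to right."""
    i = 0
    while i < len(text):
        repl = REPLACEMENTS.get(text[i])
        if repl is None:
            i += 1
            continue
        text = text[:i] + repl + text[i + 1:]
        # The step comes from the replacement string itself, so there is
        # no separate offset constant that could drift out of sync.
        i += len(repl)
    return text
```

Skipping past the inserted text also ensures that the '&' of a freshly
inserted entity is not escaped a second time.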

Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Rolf Eike Beer
5a4bed0fad kconfig: qconf: use default platform shortcuts
This renames "Load" to "Open" and switches Ctrl-L to Ctrl-O for the default
platforms. This may break the workflow for those used to it, but will make it
actually work for everyone else who, like me, would just expect the default
behavior. Add some more standard shortcuts where available. Where they replace
the existing shortcuts, they have the same value in my case.

Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Rolf Eike Beer
cdb37fe66f kconfig: qconf: use QString to store path to configuration file
This is the native type used by the file dialogs and avoids any hassle with
filename encoding when converting this back and forth to a character array.

Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00
Thorsten Blum
cdb1e767c8 kconfig: nconf: Fix typo in function comment
s/handles/handled/

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-11-06 08:46:34 +09:00