linux/tools
Alexei Starovoitov a8b242d77b bpf: Introduce "volatile compare" macros
Compilers optimize conditional operators at will, but often bpf programmers
want to force compilers to keep the same operator in asm as it's written in C.
Introduce bpf_cmp_likely/unlikely(var1, conditional_op, var2) macros that can be used as:

-               if (seen >= 1000)
+               if (bpf_cmp_unlikely(seen, >=, 1000))

The macros take advantage of BPF assembly that is C like.

The macros check the sign of variable 'seen' and emits either
signed or unsigned compare.

For example:
int a;
bpf_cmp_unlikely(a, >, 0) will be translated to 'if rX s> 0 goto' in BPF assembly.

unsigned int a;
bpf_cmp_unlikely(a, >, 0) will be translated to 'if rX > 0 goto' in BPF assembly.

C type conversions coupled with comparison operator are tricky.
  int i = -1;
  unsigned int j = 1;
  if (i < j) // this is false.

  long i = -1;
  unsigned int j = 1;
  if (i < j) // this is true.

Make sure BPF program is compiled with -Wsign-compare then the macros will catch
the mistake.

The macros check LHS (left hand side) only to figure out the sign of compare.

'if 0 < rX goto' is not allowed in the assembly, so the users
have to use a variable on LHS anyway.

The patch updates few tests to demonstrate the use of the macros.

The macro allows to use BPF_JSET in C code, since LLVM doesn't generate it at
present. For example:

if (i & j) compiles into r0 &= r1; if r0 == 0 goto

while

if (bpf_cmp_unlikely(i, &, j)) compiles into if r0 & r1 goto

Note that the macros has to be careful with RHS assembly predicate.
Since:
u64 __rhs = 1ull << 42;
asm goto("if r0 < %[rhs] goto +1" :: [rhs] "ri" (__rhs));
LLVM will silently truncate 64-bit constant into s32 imm.

Note that [lhs] "r"((short)LHS) the type cast is a workaround for LLVM issue.
When LHS is exactly 32-bit LLVM emits redundant <<=32, >>=32 to zero upper 32-bits.
When LHS is 64 or 16 or 8-bit variable there are no shifts.
When LHS is 32-bit the (u64) cast doesn't help. Hence use (short) cast.
It does _not_ truncate the variable before it's assigned to a register.

Traditional likely()/unlikely() macros that use __builtin_expect(!!(x), 1 or 0)
have no effect on these macros, hence macros implement the logic manually.
bpf_cmp_unlikely() macro preserves compare operator as-is while
bpf_cmp_likely() macro flips the compare.

Consider two cases:
A.
  for() {
    if (foo >= 10) {
      bar += foo;
    }
    other code;
  }

B.
  for() {
    if (foo >= 10)
       break;
    other code;
  }

It's ok to use either bpf_cmp_likely or bpf_cmp_unlikely macros in both cases,
but consider that 'break' is effectively 'goto out_of_the_loop'.
Hence it's better to use bpf_cmp_unlikely in the B case.
While 'bar += foo' is better to keep as 'fallthrough' == likely code path in the A case.

When it's written as:
A.
  for() {
    if (bpf_cmp_likely(foo, >=, 10)) {
      bar += foo;
    }
    other code;
  }

B.
  for() {
    if (bpf_cmp_unlikely(foo, >=, 10))
       break;
    other code;
  }

The assembly will look like:
A.
  for() {
    if r1 < 10 goto L1;
      bar += foo;
  L1:
    other code;
  }

B.
  for() {
    if r1 >= 10 goto L2;
    other code;
  }
  L2:

The bpf_cmp_likely vs bpf_cmp_unlikely changes basic block layout, hence it will
greatly influence the verification process. The number of processed instructions
will be different, since the verifier walks the fallthrough first.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20231226191148.48536-3-alexei.starovoitov@gmail.com
2024-01-03 10:58:42 -08:00
..
accounting
arch perf tools fixes for v6.7: 1st batch 2023-12-01 10:17:16 +09:00
bootconfig
bpf bpftool: Add support to display uprobe_multi links 2023-11-28 21:50:09 -08:00
build perf tools fixes for v6.6: 2nd batch 2023-10-30 13:46:27 -07:00
certs
cgroup iocost_monitor: improve it by adding iocg wait_ms 2023-08-08 15:43:03 -06:00
counter
crypto/ccp crypto: ccp - Fix some unfused tests 2023-09-15 18:29:45 +08:00
debugging
edid
firewire
firmware
gpio
hv hv/hv_kvp_daemon: Some small fixes for handling NM keyfiles 2023-11-10 23:27:46 +00:00
iio tools: iio: iio_generic_buffer ensure alignment 2023-10-05 16:16:20 +01:00
include net/sched: Remove uapi support for CBQ qdisc 2024-01-02 14:25:51 +00:00
kvm/kvm_stat
laptop
leds
lib libbpf: Fix NULL pointer dereference in bpf_object__collect_prog_relos 2023-12-21 10:05:42 +01:00
memory-model
mm tools/mm: update the usage output to be more organized 2023-10-18 14:34:19 -07:00
net/ynl tools/net/ynl-gen-rst: Remove extra indentation from generated docs 2023-12-18 14:39:44 -08:00
objtool cred: get rid of CONFIG_DEBUG_CREDENTIALS 2023-12-15 14:19:48 -08:00
pci
pcmcia
perf perf list: Fix JSON segfault by setting the used skip_duplicate_pmus callback 2023-12-05 11:16:00 -08:00
power PM: tools: Fix sleepgraph syntax error 2023-11-20 17:59:58 +01:00
rcu
scripts tools/build: Fix -s detection code in tools/scripts/Makefile.include 2023-10-18 15:29:47 -07:00
spi
testing bpf: Introduce "volatile compare" macros 2024-01-03 10:58:42 -08:00
thermal tools/thermal: Remove unused 'mds' and 'nrhandler' variables 2023-10-15 23:40:10 +02:00
time
tracing rtla: Fix uninitialized variable found 2023-10-30 19:00:12 +01:00
usb
verification verification/dot2k: Delete duplicate imports 2023-10-30 16:59:12 +01:00
virtio tools/virtio: Add dma sync api for virtio test 2023-10-16 05:32:23 -04:00
wmi
workqueue workqueue: Implement non-strict affinity scope for unbound workqueues 2023-08-07 15:57:25 -10:00
Makefile