pull-request: bpf-next 2023-08-03

-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRdM/uy1Ege0+EN1fNar9k/UBDW4wUCZMvevwAKCRBar9k/UBDW
 42Z0AP90hLZ9OmoghYAlALHLl8zqXuHCV8OeFXR5auqG+kkcCwEAx6h99vnh4zgP
 Tngj6Yid60o39/IZXXblhV37HfSiyQ8=
 =/kVE
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2023-08-03

We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 84 files changed, 4026 insertions(+), 562 deletions(-).

The main changes are:

1) Add SO_REUSEPORT support for TC bpf_sk_assign from Lorenz Bauer,
   Daniel Borkmann

2) Support new insns from cpu v4 from Yonghong Song

3) Non-atomically allocate freelist during prefill from YiFei Zhu

4) Support defragmenting IPv(4|6) packets in BPF from Daniel Xu

5) Add tracepoint to xdp attaching failure from Leon Hwang

6) struct netdev_rx_queue and xdp.h reshuffling to reduce
   rebuild time from Jakub Kicinski

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
  net: invert the netdevice.h vs xdp.h dependency
  net: move struct netdev_rx_queue out of netdevice.h
  eth: add missing xdp.h includes in drivers
  selftests/bpf: Add testcase for xdp attaching failure tracepoint
  bpf, xdp: Add tracepoint to xdp attaching failure
  selftests/bpf: fix static assert compilation issue for test_cls_*.c
  bpf: fix bpf_probe_read_kernel prototype mismatch
  riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework
  libbpf: fix typos in Makefile
  tracing: bpf: use struct trace_entry in struct syscall_tp_t
  bpf, devmap: Remove unused dtab field from bpf_dtab_netdev
  bpf, cpumap: Remove unused cmap field from bpf_cpu_map_entry
  netfilter: bpf: Only define get_proto_defrag_hook() if necessary
  bpf: Fix an array-index-out-of-bounds issue in disasm.c
  net: remove duplicate INDIRECT_CALLABLE_DECLARE of udp[6]_ehashfn
  docs/bpf: Fix malformed documentation
  bpf: selftests: Add defrag selftests
  bpf: selftests: Support custom type and proto for client sockets
  bpf: selftests: Support not connecting client socket
  netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link
  ...
====================

Link: https://lore.kernel.org/r/20230803174845.825419-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski 2023-08-03 15:34:36 -07:00
commit d07b7b32da
84 changed files with 4023 additions and 559 deletions

View File

@ -140,11 +140,6 @@ A: Because if we picked one-to-one relationship to x64 it would have made
it more complicated to support on arm64 and other archs. Also it
needs div-by-zero runtime check.
Q: Why there is no BPF_SDIV for signed divide operation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because it would be rarely used. llvm errors in such case and
prints a suggestion to use unsigned divide instead.
Q: Why BPF has implicit prologue and epilogue?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because architectures like sparc have register windows and in general

View File

@ -154,24 +154,27 @@ otherwise identical operations.
The 'code' field encodes the operation as below, where 'src' and 'dst' refer
to the values of the source and destination registers, respectively.
======== ===== ==========================================================
code value description
======== ===== ==========================================================
BPF_ADD 0x00 dst += src
BPF_SUB 0x10 dst -= src
BPF_MUL 0x20 dst \*= src
BPF_DIV 0x30 dst = (src != 0) ? (dst / src) : 0
BPF_OR 0x40 dst \|= src
BPF_AND 0x50 dst &= src
BPF_LSH 0x60 dst <<= (src & mask)
BPF_RSH 0x70 dst >>= (src & mask)
BPF_NEG 0x80 dst = -src
BPF_MOD 0x90 dst = (src != 0) ? (dst % src) : dst
BPF_XOR 0xa0 dst ^= src
BPF_MOV 0xb0 dst = src
BPF_ARSH 0xc0 sign extending dst >>= (src & mask)
BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below)
======== ===== ==========================================================
========= ===== ======= ==========================================================
code value offset description
========= ===== ======= ==========================================================
BPF_ADD 0x00 0 dst += src
BPF_SUB 0x10 0 dst -= src
BPF_MUL 0x20 0 dst \*= src
BPF_DIV 0x30 0 dst = (src != 0) ? (dst / src) : 0
BPF_SDIV 0x30 1 dst = (src != 0) ? (dst s/ src) : 0
BPF_OR 0x40 0 dst \|= src
BPF_AND 0x50 0 dst &= src
BPF_LSH 0x60 0 dst <<= (src & mask)
BPF_RSH 0x70 0 dst >>= (src & mask)
BPF_NEG 0x80 0 dst = -dst
BPF_MOD 0x90 0 dst = (src != 0) ? (dst % src) : dst
BPF_SMOD 0x90 1 dst = (src != 0) ? (dst s% src) : dst
BPF_XOR 0xa0 0 dst ^= src
BPF_MOV 0xb0 0 dst = src
BPF_MOVSX 0xb0 8/16/32 dst = (s8,s16,s32)src
BPF_ARSH 0xc0 0 sign extending dst >>= (src & mask)
BPF_END 0xd0 0 byte swap operations (see `Byte swap instructions`_ below)
========= ===== ======= ==========================================================
Underflow and overflow are allowed during arithmetic operations, meaning
the 64-bit or 32-bit value will wrap. If eBPF program execution would
@ -198,33 +201,51 @@ where '(u32)' indicates that the upper 32 bits are zeroed.
dst = dst ^ imm32
Also note that the division and modulo operations are unsigned. Thus, for
``BPF_ALU``, 'imm' is first interpreted as an unsigned 32-bit value, whereas
for ``BPF_ALU64``, 'imm' is first sign extended to 64 bits and the result
interpreted as an unsigned 64-bit value. There are no instructions for
signed division or modulo.
Note that most instructions have an instruction offset of 0. Only three instructions
(``BPF_SDIV``, ``BPF_SMOD``, ``BPF_MOVSX``) have a non-zero offset.
The division and modulo operations support both unsigned and signed flavors.
For the unsigned operations (``BPF_DIV`` and ``BPF_MOD``), ``BPF_ALU`` interprets
'imm' as a 32-bit unsigned value, while ``BPF_ALU64`` first sign extends 'imm'
from 32 to 64 bits and then interprets it as a 64-bit unsigned value.
For the signed operations (``BPF_SDIV`` and ``BPF_SMOD``), ``BPF_ALU`` interprets
'imm' as a 32-bit signed value, while ``BPF_ALU64`` first sign extends 'imm'
from 32 to 64 bits and then interprets it as a 64-bit signed value.
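As an illustrative aside (not part of this patch), the 64-bit rules above can be
restated as a small C model; the helper names are invented, and the sketch does
not model the S64_MIN / -1 overflow corner case::

  #include <stdint.h>

  /* Hypothetical helpers restating the BPF_ALU64 division/modulo semantics. */
  static uint64_t alu64_div(uint64_t dst, uint64_t src)   /* BPF_DIV,  offset 0 */
  {
          return src ? dst / src : 0;
  }

  static uint64_t alu64_sdiv(uint64_t dst, uint64_t src)  /* BPF_SDIV, offset 1 */
  {
          return src ? (uint64_t)((int64_t)dst / (int64_t)src) : 0;
  }

  static uint64_t alu64_smod(uint64_t dst, uint64_t src)  /* BPF_SMOD, offset 1 */
  {
          return src ? (uint64_t)((int64_t)dst % (int64_t)src) : dst;
  }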
The ``BPF_MOVSX`` instruction does a move operation with sign extension.
``BPF_ALU | BPF_MOVSX`` sign extends 8-bit and 16-bit operands into 32-bit
operands, and zeroes the remaining upper 32 bits.
``BPF_ALU64 | BPF_MOVSX`` sign extends 8-bit, 16-bit, and 32-bit
operands into 64-bit operands.
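A corresponding C sketch of the ``BPF_MOVSX`` behaviour (illustrative only; the
helper names are invented)::

  #include <stdint.h>

  static uint64_t alu32_movsx8(uint64_t src)   /* BPF_ALU | BPF_MOVSX, offset 8 */
  {
          return (uint32_t)(int8_t)src;        /* sign extend s8 -> s32, upper 32 bits zeroed */
  }

  static uint64_t alu64_movsx32(uint64_t src)  /* BPF_ALU64 | BPF_MOVSX, offset 32 */
  {
          return (uint64_t)(int32_t)src;       /* sign extend s32 -> s64 */
  }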
Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
for 32-bit operations.
Byte swap instructions
~~~~~~~~~~~~~~~~~~~~~~
----------------------
The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
'code' field of ``BPF_END``.
The byte swap instructions use instruction classes of ``BPF_ALU`` and ``BPF_ALU64``
and a 4-bit 'code' field of ``BPF_END``.
The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.
The 1-bit source operand field in the opcode is used to select what byte
order the operation convert from or to:
For ``BPF_ALU``, the 1-bit source operand field in the opcode is used to
select what byte order the operation converts from or to. For
``BPF_ALU64``, the 1-bit source operand field in the opcode is reserved
and must be set to 0.
========= ===== =================================================
source value description
========= ===== =================================================
BPF_TO_LE 0x00 convert between host byte order and little endian
BPF_TO_BE 0x08 convert between host byte order and big endian
========= ===== =================================================
========= ========= ===== =================================================
class source value description
========= ========= ===== =================================================
BPF_ALU BPF_TO_LE 0x00 convert between host byte order and little endian
BPF_ALU BPF_TO_BE 0x08 convert between host byte order and big endian
BPF_ALU64 Reserved 0x00 do byte swap unconditionally
========= ========= ===== =================================================
The 'imm' field encodes the width of the swap operations. The following widths
are supported: 16, 32 and 64.
@ -239,6 +260,12 @@ Examples:
dst = htobe64(dst)
``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16/32/64 means::
dst = bswap16 dst
dst = bswap32 dst
dst = bswap64 dst
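To make the contrast concrete, here is a hedged C sketch (not part of the patch)
using the glibc endian helpers: the ``BPF_ALU`` flavor converts to or from a fixed
byte order, so it is a no-op when the host already uses that order, whereas the new
``BPF_ALU64`` flavor swaps unconditionally::

  #include <stdint.h>
  #include <endian.h>     /* htobe64() */
  #include <byteswap.h>   /* bswap_64() */

  static uint64_t alu_to_be64(uint64_t dst)    /* BPF_ALU | BPF_TO_BE | BPF_END, imm = 64 */
  {
          return htobe64(dst);                 /* identity on big-endian hosts */
  }

  static uint64_t alu64_bswap64(uint64_t dst)  /* BPF_ALU64 | BPF_END, imm = 64 */
  {
          return bswap_64(dst);                /* always swaps */
  }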
Jump instructions
-----------------
@ -249,7 +276,8 @@ The 'code' field encodes the operation as below:
======== ===== === =========================================== =========================================
code value src description notes
======== ===== === =========================================== =========================================
BPF_JA 0x0 0x0 PC += offset BPF_JMP only
BPF_JA 0x0 0x0 PC += offset BPF_JMP class
BPF_JA 0x0 0x0 PC += imm BPF_JMP32 class
BPF_JEQ 0x1 any PC += offset if dst == src
BPF_JGT 0x2 any PC += offset if dst > src unsigned
BPF_JGE 0x3 any PC += offset if dst >= src unsigned
@ -278,6 +306,19 @@ Example:
where 's>=' indicates a signed '>=' comparison.
``BPF_JA | BPF_K | BPF_JMP32`` (0x06) means::
gotol +imm
where 'imm' means the branch offset comes from insn 'imm' field.
Note that there are two flavors of ``BPF_JA`` instructions. The
``BPF_JMP`` class permits a 16-bit jump offset specified by the 'offset'
field, whereas the ``BPF_JMP32`` class permits a 32-bit jump offset
specified by the 'imm' field. A > 16-bit conditional jump may be
converted to a < 16-bit conditional jump plus a 32-bit unconditional
jump.
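For example, a consumer of the instruction stream would pick the jump delta from a
different field depending on the class; the helper below is hypothetical and only
restates the rule above (struct bpf_insn and the BPF_* macros come from the UAPI
header)::

  #include <linux/bpf.h>

  static int bpf_ja_delta(const struct bpf_insn *insn)
  {
          if (BPF_CLASS(insn->code) == BPF_JMP32)
                  return insn->imm;    /* gotol +imm, 32-bit range */
          return insn->off;            /* goto +off, 16-bit range */
  }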
Helper functions
~~~~~~~~~~~~~~~~
@ -320,6 +361,7 @@ The mode modifier is one of:
BPF_ABS 0x20 legacy BPF packet access (absolute) `Legacy BPF Packet access instructions`_
BPF_IND 0x40 legacy BPF packet access (indirect) `Legacy BPF Packet access instructions`_
BPF_MEM 0x60 regular load and store operations `Regular load and store operations`_
BPF_MEMSX 0x80 sign-extension load operations `Sign-extension load operations`_
BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_
============= ===== ==================================== =============
@ -350,9 +392,23 @@ instructions that transfer data between a register and memory.
``BPF_MEM | <size> | BPF_LDX`` means::
dst = *(size *) (src + offset)
dst = *(unsigned size *) (src + offset)
Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW`` and
'unsigned size' is one of u8, u16, u32 or u64.
Sign-extension load operations
------------------------------
The ``BPF_MEMSX`` mode modifier is used to encode sign-extension load
instructions that transfer data between a register and memory.
``BPF_MEMSX | <size> | BPF_LDX`` means::
dst = *(signed size *) (src + offset)
Where size is one of: ``BPF_B``, ``BPF_H`` or ``BPF_W``, and
'signed size' is one of s8, s16 or s32.
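A minimal C sketch (illustrative, helper names invented) of what distinguishes the
new sign-extension load from a ``BPF_MEM`` load of the same width::

  #include <stdint.h>
  #include <string.h>

  static uint64_t ldx_mem_w(const void *src, int16_t off)    /* BPF_LDX | BPF_MEM | BPF_W */
  {
          uint32_t v;

          memcpy(&v, (const char *)src + off, sizeof(v));
          return v;                            /* zero extended to 64 bits */
  }

  static uint64_t ldx_memsx_w(const void *src, int16_t off)  /* BPF_LDX | BPF_MEMSX | BPF_W */
  {
          int32_t v;

          memcpy(&v, (const char *)src + off, sizeof(v));
          return (uint64_t)(int64_t)v;         /* sign extended to 64 bits */
  }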
Atomic operations
-----------------

View File

@ -3704,7 +3704,7 @@ M: Daniel Borkmann <daniel@iogearbox.net>
M: Andrii Nakryiko <andrii@kernel.org>
R: Martin KaFai Lau <martin.lau@linux.dev>
R: Song Liu <song@kernel.org>
R: Yonghong Song <yhs@fb.com>
R: Yonghong Song <yonghong.song@linux.dev>
R: John Fastabend <john.fastabend@gmail.com>
R: KP Singh <kpsingh@kernel.org>
R: Stanislav Fomichev <sdf@google.com>
@ -3743,7 +3743,7 @@ F: tools/lib/bpf/
F: tools/testing/selftests/bpf/
BPF [ITERATOR]
M: Yonghong Song <yhs@fb.com>
M: Yonghong Song <yonghong.song@linux.dev>
L: bpf@vger.kernel.org
S: Maintained
F: kernel/bpf/*iter.c

View File

@ -13,6 +13,8 @@
#include <asm/patch.h>
#include "bpf_jit.h"
#define RV_FENTRY_NINSNS 2
#define RV_REG_TCC RV_REG_A6
#define RV_REG_TCC_SAVED RV_REG_S6 /* Store A6 in S6 if program do calls */
@ -241,7 +243,7 @@ static void __build_epilogue(bool is_tail_call, struct rv_jit_context *ctx)
if (!is_tail_call)
emit_mv(RV_REG_A0, RV_REG_A5, ctx);
emit_jalr(RV_REG_ZERO, is_tail_call ? RV_REG_T3 : RV_REG_RA,
is_tail_call ? 20 : 0, /* skip reserved nops and TCC init */
is_tail_call ? (RV_FENTRY_NINSNS + 1) * 4 : 0, /* skip reserved nops and TCC init */
ctx);
}
@ -618,32 +620,7 @@ static int add_exception_handler(const struct bpf_insn *insn,
return 0;
}
static int gen_call_or_nops(void *target, void *ip, u32 *insns)
{
s64 rvoff;
int i, ret;
struct rv_jit_context ctx;
ctx.ninsns = 0;
ctx.insns = (u16 *)insns;
if (!target) {
for (i = 0; i < 4; i++)
emit(rv_nop(), &ctx);
return 0;
}
rvoff = (s64)(target - (ip + 4));
emit(rv_sd(RV_REG_SP, -8, RV_REG_RA), &ctx);
ret = emit_jump_and_link(RV_REG_RA, rvoff, false, &ctx);
if (ret)
return ret;
emit(rv_ld(RV_REG_RA, -8, RV_REG_SP), &ctx);
return 0;
}
static int gen_jump_or_nops(void *target, void *ip, u32 *insns)
static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
{
s64 rvoff;
struct rv_jit_context ctx;
@ -658,38 +635,35 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns)
}
rvoff = (s64)(target - ip);
return emit_jump_and_link(RV_REG_ZERO, rvoff, false, &ctx);
return emit_jump_and_link(is_call ? RV_REG_T0 : RV_REG_ZERO, rvoff, false, &ctx);
}
int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
void *old_addr, void *new_addr)
{
u32 old_insns[4], new_insns[4];
u32 old_insns[RV_FENTRY_NINSNS], new_insns[RV_FENTRY_NINSNS];
bool is_call = poke_type == BPF_MOD_CALL;
int (*gen_insns)(void *target, void *ip, u32 *insns);
int ninsns = is_call ? 4 : 2;
int ret;
if (!is_bpf_text_address((unsigned long)ip))
if (!is_kernel_text((unsigned long)ip) &&
!is_bpf_text_address((unsigned long)ip))
return -ENOTSUPP;
gen_insns = is_call ? gen_call_or_nops : gen_jump_or_nops;
ret = gen_insns(old_addr, ip, old_insns);
ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
if (ret)
return ret;
if (memcmp(ip, old_insns, ninsns * 4))
if (memcmp(ip, old_insns, RV_FENTRY_NINSNS * 4))
return -EFAULT;
ret = gen_insns(new_addr, ip, new_insns);
ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
if (ret)
return ret;
cpus_read_lock();
mutex_lock(&text_mutex);
if (memcmp(ip, new_insns, ninsns * 4))
ret = patch_text(ip, new_insns, ninsns);
if (memcmp(ip, new_insns, RV_FENTRY_NINSNS * 4))
ret = patch_text(ip, new_insns, RV_FENTRY_NINSNS);
mutex_unlock(&text_mutex);
cpus_read_unlock();
@ -787,8 +761,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
int i, ret, offset;
int *branches_off = NULL;
int stack_size = 0, nregs = m->nr_args;
int retaddr_off, fp_off, retval_off, args_off;
int nregs_off, ip_off, run_ctx_off, sreg_off;
int retval_off, args_off, nregs_off, ip_off, run_ctx_off, sreg_off;
struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
@ -796,13 +769,27 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
bool save_ret;
u32 insn;
/* Generated trampoline stack layout:
/* Two types of generated trampoline stack layout:
*
* FP - 8 [ RA of parent func ] return address of parent
* 1. trampoline called from function entry
* --------------------------------------
* FP + 8 [ RA to parent func ] return address to parent
* function
* FP - retaddr_off [ RA of traced func ] return address of traced
* FP + 0 [ FP of parent func ] frame pointer of parent
* function
* FP - fp_off [ FP of parent func ]
* FP - 8 [ T0 to traced func ] return address of traced
* function
* FP - 16 [ FP of traced func ] frame pointer of traced
* function
* --------------------------------------
*
* 2. trampoline called directly
* --------------------------------------
* FP - 8 [ RA to caller func ] return address to caller
* function
* FP - 16 [ FP of caller func ] frame pointer of caller
* function
* --------------------------------------
*
* FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
* BPF_TRAMP_F_RET_FENTRY_RET
@ -833,14 +820,8 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
if (nregs > 8)
return -ENOTSUPP;
/* room for parent function return address */
stack_size += 8;
stack_size += 8;
retaddr_off = stack_size;
stack_size += 8;
fp_off = stack_size;
/* room of trampoline frame to store return address and frame pointer */
stack_size += 16;
save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
if (save_ret) {
@ -867,12 +848,29 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
stack_size = round_up(stack_size, 16);
emit_addi(RV_REG_SP, RV_REG_SP, -stack_size, ctx);
if (func_addr) {
/* For the trampoline called from function entry,
* the frame of traced function and the frame of
* trampoline need to be considered.
*/
emit_addi(RV_REG_SP, RV_REG_SP, -16, ctx);
emit_sd(RV_REG_SP, 8, RV_REG_RA, ctx);
emit_sd(RV_REG_SP, 0, RV_REG_FP, ctx);
emit_addi(RV_REG_FP, RV_REG_SP, 16, ctx);
emit_sd(RV_REG_SP, stack_size - retaddr_off, RV_REG_RA, ctx);
emit_sd(RV_REG_SP, stack_size - fp_off, RV_REG_FP, ctx);
emit_addi(RV_REG_FP, RV_REG_SP, stack_size, ctx);
emit_addi(RV_REG_SP, RV_REG_SP, -stack_size, ctx);
emit_sd(RV_REG_SP, stack_size - 8, RV_REG_T0, ctx);
emit_sd(RV_REG_SP, stack_size - 16, RV_REG_FP, ctx);
emit_addi(RV_REG_FP, RV_REG_SP, stack_size, ctx);
} else {
/* For the trampoline called directly, just handle
* the frame of trampoline.
*/
emit_addi(RV_REG_SP, RV_REG_SP, -stack_size, ctx);
emit_sd(RV_REG_SP, stack_size - 8, RV_REG_RA, ctx);
emit_sd(RV_REG_SP, stack_size - 16, RV_REG_FP, ctx);
emit_addi(RV_REG_FP, RV_REG_SP, stack_size, ctx);
}
/* callee saved register S1 to pass start time */
emit_sd(RV_REG_FP, -sreg_off, RV_REG_S1, ctx);
@ -890,7 +888,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
/* skip to actual body of traced function */
if (flags & BPF_TRAMP_F_SKIP_FRAME)
orig_call += 16;
orig_call += RV_FENTRY_NINSNS * 4;
if (flags & BPF_TRAMP_F_CALL_ORIG) {
emit_imm(RV_REG_A0, (const s64)im, ctx);
@ -967,17 +965,30 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx);
if (flags & BPF_TRAMP_F_SKIP_FRAME)
/* return address of parent function */
if (func_addr) {
/* trampoline called from function entry */
emit_ld(RV_REG_T0, stack_size - 8, RV_REG_SP, ctx);
emit_ld(RV_REG_FP, stack_size - 16, RV_REG_SP, ctx);
emit_addi(RV_REG_SP, RV_REG_SP, stack_size, ctx);
emit_ld(RV_REG_RA, 8, RV_REG_SP, ctx);
emit_ld(RV_REG_FP, 0, RV_REG_SP, ctx);
emit_addi(RV_REG_SP, RV_REG_SP, 16, ctx);
if (flags & BPF_TRAMP_F_SKIP_FRAME)
/* return to parent function */
emit_jalr(RV_REG_ZERO, RV_REG_RA, 0, ctx);
else
/* return to traced function */
emit_jalr(RV_REG_ZERO, RV_REG_T0, 0, ctx);
} else {
/* trampoline called directly */
emit_ld(RV_REG_RA, stack_size - 8, RV_REG_SP, ctx);
else
/* return address of traced function */
emit_ld(RV_REG_RA, stack_size - retaddr_off, RV_REG_SP, ctx);
emit_ld(RV_REG_FP, stack_size - 16, RV_REG_SP, ctx);
emit_addi(RV_REG_SP, RV_REG_SP, stack_size, ctx);
emit_ld(RV_REG_FP, stack_size - fp_off, RV_REG_SP, ctx);
emit_addi(RV_REG_SP, RV_REG_SP, stack_size, ctx);
emit_jalr(RV_REG_ZERO, RV_REG_RA, 0, ctx);
emit_jalr(RV_REG_ZERO, RV_REG_RA, 0, ctx);
}
ret = ctx->ninsns;
out:
@ -1691,8 +1702,8 @@ void bpf_jit_build_prologue(struct rv_jit_context *ctx)
store_offset = stack_adjust - 8;
/* reserve 4 nop insns */
for (i = 0; i < 4; i++)
/* nops reserved for auipc+jalr pair */
for (i = 0; i < RV_FENTRY_NINSNS; i++)
emit(rv_nop(), ctx);
/* First instruction is always setting the tail-call-counter

View File

@ -701,6 +701,38 @@ static void emit_mov_reg(u8 **pprog, bool is64, u32 dst_reg, u32 src_reg)
*pprog = prog;
}
static void emit_movsx_reg(u8 **pprog, int num_bits, bool is64, u32 dst_reg,
u32 src_reg)
{
u8 *prog = *pprog;
if (is64) {
/* movs[b,w,l]q dst, src */
if (num_bits == 8)
EMIT4(add_2mod(0x48, src_reg, dst_reg), 0x0f, 0xbe,
add_2reg(0xC0, src_reg, dst_reg));
else if (num_bits == 16)
EMIT4(add_2mod(0x48, src_reg, dst_reg), 0x0f, 0xbf,
add_2reg(0xC0, src_reg, dst_reg));
else if (num_bits == 32)
EMIT3(add_2mod(0x48, src_reg, dst_reg), 0x63,
add_2reg(0xC0, src_reg, dst_reg));
} else {
/* movs[b,w]l dst, src */
if (num_bits == 8) {
EMIT4(add_2mod(0x40, src_reg, dst_reg), 0x0f, 0xbe,
add_2reg(0xC0, src_reg, dst_reg));
} else if (num_bits == 16) {
if (is_ereg(dst_reg) || is_ereg(src_reg))
EMIT1(add_2mod(0x40, src_reg, dst_reg));
EMIT3(add_2mod(0x0f, src_reg, dst_reg), 0xbf,
add_2reg(0xC0, src_reg, dst_reg));
}
}
*pprog = prog;
}
/* Emit the suffix (ModR/M etc) for addressing *(ptr_reg + off) and val_reg */
static void emit_insn_suffix(u8 **pprog, u32 ptr_reg, u32 val_reg, int off)
{
@ -779,6 +811,29 @@ static void emit_ldx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
*pprog = prog;
}
/* LDSX: dst_reg = *(s8*)(src_reg + off) */
static void emit_ldsx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
{
u8 *prog = *pprog;
switch (size) {
case BPF_B:
/* Emit 'movsx rax, byte ptr [rax + off]' */
EMIT3(add_2mod(0x48, src_reg, dst_reg), 0x0F, 0xBE);
break;
case BPF_H:
/* Emit 'movsx rax, word ptr [rax + off]' */
EMIT3(add_2mod(0x48, src_reg, dst_reg), 0x0F, 0xBF);
break;
case BPF_W:
/* Emit 'movsx rax, dword ptr [rax+0x14]' */
EMIT2(add_2mod(0x48, src_reg, dst_reg), 0x63);
break;
}
emit_insn_suffix(&prog, src_reg, dst_reg, off);
*pprog = prog;
}
/* STX: *(u8*)(dst_reg + off) = src_reg */
static void emit_stx(u8 **pprog, u32 size, u32 dst_reg, u32 src_reg, int off)
{
@ -1028,9 +1083,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
case BPF_ALU64 | BPF_MOV | BPF_X:
case BPF_ALU | BPF_MOV | BPF_X:
emit_mov_reg(&prog,
BPF_CLASS(insn->code) == BPF_ALU64,
dst_reg, src_reg);
if (insn->off == 0)
emit_mov_reg(&prog,
BPF_CLASS(insn->code) == BPF_ALU64,
dst_reg, src_reg);
else
emit_movsx_reg(&prog, insn->off,
BPF_CLASS(insn->code) == BPF_ALU64,
dst_reg, src_reg);
break;
/* neg dst */
@ -1134,15 +1194,26 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
/* mov rax, dst_reg */
emit_mov_reg(&prog, is64, BPF_REG_0, dst_reg);
/*
* xor edx, edx
* equivalent to 'xor rdx, rdx', but one byte less
*/
EMIT2(0x31, 0xd2);
if (insn->off == 0) {
/*
* xor edx, edx
* equivalent to 'xor rdx, rdx', but one byte less
*/
EMIT2(0x31, 0xd2);
/* div src_reg */
maybe_emit_1mod(&prog, src_reg, is64);
EMIT2(0xF7, add_1reg(0xF0, src_reg));
/* div src_reg */
maybe_emit_1mod(&prog, src_reg, is64);
EMIT2(0xF7, add_1reg(0xF0, src_reg));
} else {
if (BPF_CLASS(insn->code) == BPF_ALU)
EMIT1(0x99); /* cdq */
else
EMIT2(0x48, 0x99); /* cqo */
/* idiv src_reg */
maybe_emit_1mod(&prog, src_reg, is64);
EMIT2(0xF7, add_1reg(0xF8, src_reg));
}
if (BPF_OP(insn->code) == BPF_MOD &&
dst_reg != BPF_REG_3)
@ -1262,6 +1333,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
break;
case BPF_ALU | BPF_END | BPF_FROM_BE:
case BPF_ALU64 | BPF_END | BPF_FROM_LE:
switch (imm32) {
case 16:
/* Emit 'ror %ax, 8' to swap lower 2 bytes */
@ -1370,9 +1442,17 @@ st: if (is_imm8(insn->off))
case BPF_LDX | BPF_PROBE_MEM | BPF_W:
case BPF_LDX | BPF_MEM | BPF_DW:
case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
/* LDXS: dst_reg = *(s8*)(src_reg + off) */
case BPF_LDX | BPF_MEMSX | BPF_B:
case BPF_LDX | BPF_MEMSX | BPF_H:
case BPF_LDX | BPF_MEMSX | BPF_W:
case BPF_LDX | BPF_PROBE_MEMSX | BPF_B:
case BPF_LDX | BPF_PROBE_MEMSX | BPF_H:
case BPF_LDX | BPF_PROBE_MEMSX | BPF_W:
insn_off = insn->off;
if (BPF_MODE(insn->code) == BPF_PROBE_MEM) {
if (BPF_MODE(insn->code) == BPF_PROBE_MEM ||
BPF_MODE(insn->code) == BPF_PROBE_MEMSX) {
/* Conservatively check that src_reg + insn->off is a kernel address:
* src_reg + insn->off >= TASK_SIZE_MAX + PAGE_SIZE
* src_reg is used as scratch for src_reg += insn->off and restored
@ -1415,8 +1495,13 @@ st: if (is_imm8(insn->off))
start_of_ldx = prog;
end_of_jmp[-1] = start_of_ldx - end_of_jmp;
}
emit_ldx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
if (BPF_MODE(insn->code) == BPF_PROBE_MEM) {
if (BPF_MODE(insn->code) == BPF_PROBE_MEMSX ||
BPF_MODE(insn->code) == BPF_MEMSX)
emit_ldsx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
else
emit_ldx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
if (BPF_MODE(insn->code) == BPF_PROBE_MEM ||
BPF_MODE(insn->code) == BPF_PROBE_MEMSX) {
struct exception_table_entry *ex;
u8 *_insn = image + proglen + (start_of_ldx - temp);
s64 delta;
@ -1730,16 +1815,24 @@ st: if (is_imm8(insn->off))
break;
case BPF_JMP | BPF_JA:
if (insn->off == -1)
/* -1 jmp instructions will always jump
* backwards two bytes. Explicitly handling
* this case avoids wasting too many passes
* when there are long sequences of replaced
* dead code.
*/
jmp_offset = -2;
else
jmp_offset = addrs[i + insn->off] - addrs[i];
case BPF_JMP32 | BPF_JA:
if (BPF_CLASS(insn->code) == BPF_JMP) {
if (insn->off == -1)
/* -1 jmp instructions will always jump
* backwards two bytes. Explicitly handling
* this case avoids wasting too many passes
* when there are long sequences of replaced
* dead code.
*/
jmp_offset = -2;
else
jmp_offset = addrs[i + insn->off] - addrs[i];
} else {
if (insn->imm == -1)
jmp_offset = -2;
else
jmp_offset = addrs[i + insn->imm] - addrs[i];
}
if (!jmp_offset) {
/*

View File

@ -90,6 +90,7 @@
#include <net/tls.h>
#endif
#include <net/ip6_route.h>
#include <net/xdp.h>
#include "bonding_priv.h"

View File

@ -14,6 +14,7 @@
#include <linux/interrupt.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <net/xdp.h>
#include <uapi/linux/bpf.h>
#include "ena_com.h"

View File

@ -14,6 +14,7 @@
#include <linux/net_tstamp.h>
#include <linux/ptp_clock_kernel.h>
#include <linux/miscdevice.h>
#include <net/xdp.h>
#define TSNEP "tsnep"

View File

@ -12,6 +12,7 @@
#include <linux/fsl/mc.h>
#include <linux/net_tstamp.h>
#include <net/devlink.h>
#include <net/xdp.h>
#include <soc/fsl/dpaa2-io.h>
#include <soc/fsl/dpaa2-fd.h>

View File

@ -11,6 +11,7 @@
#include <linux/if_vlan.h>
#include <linux/phylink.h>
#include <linux/dim.h>
#include <net/xdp.h>
#include "enetc_hw.h"

View File

@ -22,6 +22,7 @@
#include <linux/timecounter.h>
#include <dt-bindings/firmware/imx/rsrc.h>
#include <linux/firmware/imx/sci.h>
#include <net/xdp.h>
#if defined(CONFIG_M523x) || defined(CONFIG_M527x) || defined(CONFIG_M528x) || \
defined(CONFIG_M520x) || defined(CONFIG_M532x) || defined(CONFIG_ARM) || \

View File

@ -5,6 +5,7 @@
#include <linux/netdevice.h>
#include <linux/u64_stats_sync.h>
#include <net/xdp.h>
/* Tx descriptor size */
#define FUNETH_SQE_SIZE 64U

View File

@ -11,6 +11,7 @@
#include <linux/netdevice.h>
#include <linux/pci.h>
#include <linux/u64_stats_sync.h>
#include <net/xdp.h>
#include "gve_desc.h"
#include "gve_desc_dqo.h"

View File

@ -15,6 +15,7 @@
#include <linux/net_tstamp.h>
#include <linux/bitfield.h>
#include <linux/hrtimer.h>
#include <net/xdp.h>
#include "igc_hw.h"

View File

@ -14,6 +14,7 @@
#include <net/pkt_cls.h>
#include <net/pkt_sched.h>
#include <net/switchdev.h>
#include <net/xdp.h>
#include <vcap_api.h>
#include <vcap_api_client.h>

View File

@ -11,6 +11,7 @@
#include <net/checksum.h>
#include <net/ip6_checksum.h>
#include <net/xdp.h>
#include <net/mana/mana.h>
#include <net/mana/mana_auxiliary.h>

View File

@ -22,6 +22,7 @@
#include <linux/net_tstamp.h>
#include <linux/reset.h>
#include <net/page_pool.h>
#include <net/xdp.h>
#include <uapi/linux/bpf.h>
struct stmmac_resources {

View File

@ -6,6 +6,7 @@
#ifndef DRIVERS_NET_ETHERNET_TI_CPSW_PRIV_H_
#define DRIVERS_NET_ETHERNET_TI_CPSW_PRIV_H_
#include <net/xdp.h>
#include <uapi/linux/bpf.h>
#include "davinci_cpdma.h"

View File

@ -16,6 +16,7 @@
#include <linux/hyperv.h>
#include <linux/rndis.h>
#include <linux/jhash.h>
#include <net/xdp.h>
/* RSS related */
#define OID_GEN_RECEIVE_SCALE_CAPABILITIES 0x00010203 /* query only */

View File

@ -22,6 +22,7 @@
#include <net/net_namespace.h>
#include <net/rtnetlink.h>
#include <net/sock.h>
#include <net/xdp.h>
#include <linux/virtio_net.h>
#include <linux/skb_array.h>

View File

@ -22,6 +22,7 @@
#include <net/route.h>
#include <net/xdp.h>
#include <net/net_failover.h>
#include <net/netdev_rx_queue.h>
static int napi_weight = NAPI_POLL_WEIGHT;
module_param(napi_weight, int, 0444);

View File

@ -2661,6 +2661,18 @@ static inline void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr)
}
#endif /* CONFIG_BPF_SYSCALL */
static __always_inline int
bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr)
{
int ret = -EFAULT;
if (IS_ENABLED(CONFIG_BPF_EVENTS))
ret = copy_from_kernel_nofault(dst, unsafe_ptr, size);
if (unlikely(ret < 0))
memset(dst, 0, size);
return ret;
}
void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
struct btf_mod_pair *used_btfs, u32 len);

View File

@ -69,6 +69,9 @@ struct ctl_table_header;
/* unused opcode to mark special load instruction. Same as BPF_ABS */
#define BPF_PROBE_MEM 0x20
/* unused opcode to mark special ldsx instruction. Same as BPF_IND */
#define BPF_PROBE_MEMSX 0x40
/* unused opcode to mark call to interpreter with arguments */
#define BPF_CALL_ARGS 0xe0
@ -90,22 +93,28 @@ struct ctl_table_header;
/* ALU ops on registers, bpf_add|sub|...: dst_reg += src_reg */
#define BPF_ALU64_REG(OP, DST, SRC) \
#define BPF_ALU64_REG_OFF(OP, DST, SRC, OFF) \
((struct bpf_insn) { \
.code = BPF_ALU64 | BPF_OP(OP) | BPF_X, \
.dst_reg = DST, \
.src_reg = SRC, \
.off = 0, \
.off = OFF, \
.imm = 0 })
#define BPF_ALU32_REG(OP, DST, SRC) \
#define BPF_ALU64_REG(OP, DST, SRC) \
BPF_ALU64_REG_OFF(OP, DST, SRC, 0)
#define BPF_ALU32_REG_OFF(OP, DST, SRC, OFF) \
((struct bpf_insn) { \
.code = BPF_ALU | BPF_OP(OP) | BPF_X, \
.dst_reg = DST, \
.src_reg = SRC, \
.off = 0, \
.off = OFF, \
.imm = 0 })
#define BPF_ALU32_REG(OP, DST, SRC) \
BPF_ALU32_REG_OFF(OP, DST, SRC, 0)
/* ALU ops on immediates, bpf_add|sub|...: dst_reg += imm32 */
#define BPF_ALU64_IMM(OP, DST, IMM) \
@ -765,23 +774,6 @@ DECLARE_STATIC_KEY_FALSE(bpf_master_redirect_enabled_key);
u32 xdp_master_redirect(struct xdp_buff *xdp);
static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
struct xdp_buff *xdp)
{
/* Driver XDP hooks are invoked within a single NAPI poll cycle and thus
* under local_bh_disable(), which provides the needed RCU protection
* for accessing map entries.
*/
u32 act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp));
if (static_branch_unlikely(&bpf_master_redirect_enabled_key)) {
if (act == XDP_TX && netif_is_bond_slave(xdp->rxq->dev))
act = xdp_master_redirect(xdp);
}
return act;
}
void bpf_prog_change_xdp(struct bpf_prog *prev_prog, struct bpf_prog *prog);
static inline u32 bpf_prog_insn_size(const struct bpf_prog *prog)

View File

@ -40,7 +40,6 @@
#include <net/dcbnl.h>
#endif
#include <net/netprio_cgroup.h>
#include <net/xdp.h>
#include <linux/netdev_features.h>
#include <linux/neighbour.h>
@ -77,8 +76,12 @@ struct udp_tunnel_nic_info;
struct udp_tunnel_nic;
struct bpf_prog;
struct xdp_buff;
struct xdp_frame;
struct xdp_metadata_ops;
struct xdp_md;
typedef u32 xdp_features_t;
void synchronize_net(void);
void netdev_set_default_ethtool_ops(struct net_device *dev,
const struct ethtool_ops *ops);
@ -783,32 +786,6 @@ bool rps_may_expire_flow(struct net_device *dev, u16 rxq_index, u32 flow_id,
#endif
#endif /* CONFIG_RPS */
/* This structure contains an instance of an RX queue. */
struct netdev_rx_queue {
struct xdp_rxq_info xdp_rxq;
#ifdef CONFIG_RPS
struct rps_map __rcu *rps_map;
struct rps_dev_flow_table __rcu *rps_flow_table;
#endif
struct kobject kobj;
struct net_device *dev;
netdevice_tracker dev_tracker;
#ifdef CONFIG_XDP_SOCKETS
struct xsk_buff_pool *pool;
#endif
} ____cacheline_aligned_in_smp;
/*
* RX queue sysfs structures and functions.
*/
struct rx_queue_attribute {
struct attribute attr;
ssize_t (*show)(struct netdev_rx_queue *queue, char *buf);
ssize_t (*store)(struct netdev_rx_queue *queue,
const char *buf, size_t len);
};
/* XPS map type and offset of the xps map within net_device->xps_maps[]. */
enum xps_map_type {
XPS_CPUS = 0,
@ -1670,12 +1647,6 @@ struct net_device_ops {
struct netlink_ext_ack *extack);
};
struct xdp_metadata_ops {
int (*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
int (*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
enum xdp_rss_hash_type *rss_type);
};
/**
* enum netdev_priv_flags - &struct net_device priv_flags
*
@ -3851,24 +3822,6 @@ static inline int netif_set_real_num_rx_queues(struct net_device *dev,
int netif_set_real_num_queues(struct net_device *dev,
unsigned int txq, unsigned int rxq);
static inline struct netdev_rx_queue *
__netif_get_rx_queue(struct net_device *dev, unsigned int rxq)
{
return dev->_rx + rxq;
}
#ifdef CONFIG_SYSFS
static inline unsigned int get_netdev_rx_queue_index(
struct netdev_rx_queue *queue)
{
struct net_device *dev = queue->dev;
int index = queue - dev->_rx;
BUG_ON(index >= dev->num_rx_queues);
return index;
}
#endif
int netif_get_num_default_rss_queues(void);
void dev_kfree_skb_irq_reason(struct sk_buff *skb, enum skb_drop_reason reason);

View File

@ -11,6 +11,7 @@
#include <linux/wait.h>
#include <linux/list.h>
#include <linux/static_key.h>
#include <linux/module.h>
#include <linux/netfilter_defs.h>
#include <linux/netdevice.h>
#include <linux/sockptr.h>
@ -481,6 +482,15 @@ struct nfnl_ct_hook {
};
extern const struct nfnl_ct_hook __rcu *nfnl_ct_hook;
struct nf_defrag_hook {
struct module *owner;
int (*enable)(struct net *net);
void (*disable)(struct net *net);
};
extern const struct nf_defrag_hook __rcu *nf_defrag_v4_hook;
extern const struct nf_defrag_hook __rcu *nf_defrag_v6_hook;
/*
* nf_skb_duplicated - TEE target has sent a packet
*

View File

@ -16,6 +16,7 @@
#include <linux/sched/clock.h>
#include <linux/sched/signal.h>
#include <net/ip.h>
#include <net/xdp.h>
/* 0 - Reserved to indicate value not set
* 1..NR_CPUS - Reserved for sender_cpu

View File

@ -48,6 +48,22 @@ struct sock *__inet6_lookup_established(struct net *net,
const u16 hnum, const int dif,
const int sdif);
typedef u32 (inet6_ehashfn_t)(const struct net *net,
const struct in6_addr *laddr, const u16 lport,
const struct in6_addr *faddr, const __be16 fport);
inet6_ehashfn_t inet6_ehashfn;
INDIRECT_CALLABLE_DECLARE(inet6_ehashfn_t udp6_ehashfn);
struct sock *inet6_lookup_reuseport(struct net *net, struct sock *sk,
struct sk_buff *skb, int doff,
const struct in6_addr *saddr,
__be16 sport,
const struct in6_addr *daddr,
unsigned short hnum,
inet6_ehashfn_t *ehashfn);
struct sock *inet6_lookup_listener(struct net *net,
struct inet_hashinfo *hashinfo,
struct sk_buff *skb, int doff,
@ -57,6 +73,15 @@ struct sock *inet6_lookup_listener(struct net *net,
const unsigned short hnum,
const int dif, const int sdif);
struct sock *inet6_lookup_run_sk_lookup(struct net *net,
int protocol,
struct sk_buff *skb, int doff,
const struct in6_addr *saddr,
const __be16 sport,
const struct in6_addr *daddr,
const u16 hnum, const int dif,
inet6_ehashfn_t *ehashfn);
static inline struct sock *__inet6_lookup(struct net *net,
struct inet_hashinfo *hashinfo,
struct sk_buff *skb, int doff,
@ -78,6 +103,46 @@ static inline struct sock *__inet6_lookup(struct net *net,
daddr, hnum, dif, sdif);
}
static inline
struct sock *inet6_steal_sock(struct net *net, struct sk_buff *skb, int doff,
const struct in6_addr *saddr, const __be16 sport,
const struct in6_addr *daddr, const __be16 dport,
bool *refcounted, inet6_ehashfn_t *ehashfn)
{
struct sock *sk, *reuse_sk;
bool prefetched;
sk = skb_steal_sock(skb, refcounted, &prefetched);
if (!sk)
return NULL;
if (!prefetched)
return sk;
if (sk->sk_protocol == IPPROTO_TCP) {
if (sk->sk_state != TCP_LISTEN)
return sk;
} else if (sk->sk_protocol == IPPROTO_UDP) {
if (sk->sk_state != TCP_CLOSE)
return sk;
} else {
return sk;
}
reuse_sk = inet6_lookup_reuseport(net, sk, skb, doff,
saddr, sport, daddr, ntohs(dport),
ehashfn);
if (!reuse_sk)
return sk;
/* We've chosen a new reuseport sock which is never refcounted. This
* implies that sk also isn't refcounted.
*/
WARN_ON_ONCE(*refcounted);
return reuse_sk;
}
static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo,
struct sk_buff *skb, int doff,
const __be16 sport,
@ -85,14 +150,20 @@ static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo,
int iif, int sdif,
bool *refcounted)
{
struct sock *sk = skb_steal_sock(skb, refcounted);
struct net *net = dev_net(skb_dst(skb)->dev);
const struct ipv6hdr *ip6h = ipv6_hdr(skb);
struct sock *sk;
sk = inet6_steal_sock(net, skb, doff, &ip6h->saddr, sport, &ip6h->daddr, dport,
refcounted, inet6_ehashfn);
if (IS_ERR(sk))
return NULL;
if (sk)
return sk;
return __inet6_lookup(dev_net(skb_dst(skb)->dev), hashinfo, skb,
doff, &ipv6_hdr(skb)->saddr, sport,
&ipv6_hdr(skb)->daddr, ntohs(dport),
return __inet6_lookup(net, hashinfo, skb,
doff, &ip6h->saddr, sport,
&ip6h->daddr, ntohs(dport),
iif, sdif, refcounted);
}

View File

@ -379,6 +379,27 @@ struct sock *__inet_lookup_established(struct net *net,
const __be32 daddr, const u16 hnum,
const int dif, const int sdif);
typedef u32 (inet_ehashfn_t)(const struct net *net,
const __be32 laddr, const __u16 lport,
const __be32 faddr, const __be16 fport);
inet_ehashfn_t inet_ehashfn;
INDIRECT_CALLABLE_DECLARE(inet_ehashfn_t udp_ehashfn);
struct sock *inet_lookup_reuseport(struct net *net, struct sock *sk,
struct sk_buff *skb, int doff,
__be32 saddr, __be16 sport,
__be32 daddr, unsigned short hnum,
inet_ehashfn_t *ehashfn);
struct sock *inet_lookup_run_sk_lookup(struct net *net,
int protocol,
struct sk_buff *skb, int doff,
__be32 saddr, __be16 sport,
__be32 daddr, u16 hnum, const int dif,
inet_ehashfn_t *ehashfn);
static inline struct sock *
inet_lookup_established(struct net *net, struct inet_hashinfo *hashinfo,
const __be32 saddr, const __be16 sport,
@ -428,6 +449,46 @@ static inline struct sock *inet_lookup(struct net *net,
return sk;
}
static inline
struct sock *inet_steal_sock(struct net *net, struct sk_buff *skb, int doff,
const __be32 saddr, const __be16 sport,
const __be32 daddr, const __be16 dport,
bool *refcounted, inet_ehashfn_t *ehashfn)
{
struct sock *sk, *reuse_sk;
bool prefetched;
sk = skb_steal_sock(skb, refcounted, &prefetched);
if (!sk)
return NULL;
if (!prefetched)
return sk;
if (sk->sk_protocol == IPPROTO_TCP) {
if (sk->sk_state != TCP_LISTEN)
return sk;
} else if (sk->sk_protocol == IPPROTO_UDP) {
if (sk->sk_state != TCP_CLOSE)
return sk;
} else {
return sk;
}
reuse_sk = inet_lookup_reuseport(net, sk, skb, doff,
saddr, sport, daddr, ntohs(dport),
ehashfn);
if (!reuse_sk)
return sk;
/* We've chosen a new reuseport sock which is never refcounted. This
* implies that sk also isn't refcounted.
*/
WARN_ON_ONCE(*refcounted);
return reuse_sk;
}
static inline struct sock *__inet_lookup_skb(struct inet_hashinfo *hashinfo,
struct sk_buff *skb,
int doff,
@ -436,22 +497,23 @@ static inline struct sock *__inet_lookup_skb(struct inet_hashinfo *hashinfo,
const int sdif,
bool *refcounted)
{
struct sock *sk = skb_steal_sock(skb, refcounted);
struct net *net = dev_net(skb_dst(skb)->dev);
const struct iphdr *iph = ip_hdr(skb);
struct sock *sk;
sk = inet_steal_sock(net, skb, doff, iph->saddr, sport, iph->daddr, dport,
refcounted, inet_ehashfn);
if (IS_ERR(sk))
return NULL;
if (sk)
return sk;
return __inet_lookup(dev_net(skb_dst(skb)->dev), hashinfo, skb,
return __inet_lookup(net, hashinfo, skb,
doff, iph->saddr, sport,
iph->daddr, dport, inet_iif(skb), sdif,
refcounted);
}
u32 inet6_ehashfn(const struct net *net,
const struct in6_addr *laddr, const u16 lport,
const struct in6_addr *faddr, const __be16 fport);
static inline void sk_daddr_set(struct sock *sk, __be32 addr)
{
sk->sk_daddr = addr; /* alias of inet_daddr */

View File

@ -4,6 +4,8 @@
#ifndef _MANA_H
#define _MANA_H
#include <net/xdp.h>
#include "gdma.h"
#include "hw_channel.h"

View File

@ -0,0 +1,53 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_NETDEV_RX_QUEUE_H
#define _LINUX_NETDEV_RX_QUEUE_H
#include <linux/kobject.h>
#include <linux/netdevice.h>
#include <linux/sysfs.h>
#include <net/xdp.h>
/* This structure contains an instance of an RX queue. */
struct netdev_rx_queue {
struct xdp_rxq_info xdp_rxq;
#ifdef CONFIG_RPS
struct rps_map __rcu *rps_map;
struct rps_dev_flow_table __rcu *rps_flow_table;
#endif
struct kobject kobj;
struct net_device *dev;
netdevice_tracker dev_tracker;
#ifdef CONFIG_XDP_SOCKETS
struct xsk_buff_pool *pool;
#endif
} ____cacheline_aligned_in_smp;
/*
* RX queue sysfs structures and functions.
*/
struct rx_queue_attribute {
struct attribute attr;
ssize_t (*show)(struct netdev_rx_queue *queue, char *buf);
ssize_t (*store)(struct netdev_rx_queue *queue,
const char *buf, size_t len);
};
static inline struct netdev_rx_queue *
__netif_get_rx_queue(struct net_device *dev, unsigned int rxq)
{
return dev->_rx + rxq;
}
#ifdef CONFIG_SYSFS
static inline unsigned int
get_netdev_rx_queue_index(struct netdev_rx_queue *queue)
{
struct net_device *dev = queue->dev;
int index = queue - dev->_rx;
BUG_ON(index >= dev->num_rx_queues);
return index;
}
#endif
#endif

View File

@ -2815,20 +2815,23 @@ sk_is_refcounted(struct sock *sk)
* skb_steal_sock - steal a socket from an sk_buff
* @skb: sk_buff to steal the socket from
* @refcounted: is set to true if the socket is reference-counted
* @prefetched: is set to true if the socket was assigned from bpf
*/
static inline struct sock *
skb_steal_sock(struct sk_buff *skb, bool *refcounted)
skb_steal_sock(struct sk_buff *skb, bool *refcounted, bool *prefetched)
{
if (skb->sk) {
struct sock *sk = skb->sk;
*refcounted = true;
if (skb_sk_is_prefetched(skb))
*prefetched = skb_sk_is_prefetched(skb);
if (*prefetched)
*refcounted = sk_is_refcounted(sk);
skb->destructor = NULL;
skb->sk = NULL;
return sk;
}
*prefetched = false;
*refcounted = false;
return NULL;
}

View File

@ -6,9 +6,10 @@
#ifndef __LINUX_NET_XDP_H__
#define __LINUX_NET_XDP_H__
#include <linux/skbuff.h> /* skb_shared_info */
#include <uapi/linux/netdev.h>
#include <linux/bitfield.h>
#include <linux/filter.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h> /* skb_shared_info */
/**
* DOC: XDP RX-queue information
@ -45,8 +46,6 @@ enum xdp_mem_type {
MEM_TYPE_MAX,
};
typedef u32 xdp_features_t;
/* XDP flags for ndo_xdp_xmit */
#define XDP_XMIT_FLUSH (1U << 0) /* doorbell signal consumer */
#define XDP_XMIT_FLAGS_MASK XDP_XMIT_FLUSH
@ -443,6 +442,12 @@ enum xdp_rss_hash_type {
XDP_RSS_TYPE_L4_IPV6_SCTP_EX = XDP_RSS_TYPE_L4_IPV6_SCTP | XDP_RSS_L3_DYNHDR,
};
struct xdp_metadata_ops {
int (*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp);
int (*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash,
enum xdp_rss_hash_type *rss_type);
};
#ifdef CONFIG_NET
u32 bpf_xdp_metadata_kfunc_id(int id);
bool bpf_dev_bound_kfunc_id(u32 btf_id);
@ -474,4 +479,20 @@ static inline void xdp_clear_features_flag(struct net_device *dev)
xdp_set_features_flag(dev, 0);
}
static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
struct xdp_buff *xdp)
{
/* Driver XDP hooks are invoked within a single NAPI poll cycle and thus
* under local_bh_disable(), which provides the needed RCU protection
* for accessing map entries.
*/
u32 act = __bpf_prog_run(prog, xdp, BPF_DISPATCHER_FUNC(xdp));
if (static_branch_unlikely(&bpf_master_redirect_enabled_key)) {
if (act == XDP_TX && netif_is_bond_slave(xdp->rxq->dev))
act = xdp_master_redirect(xdp);
}
return act;
}
#endif /* __LINUX_NET_XDP_H__ */

View File

@ -9,6 +9,7 @@
#include <linux/filter.h>
#include <linux/tracepoint.h>
#include <linux/bpf.h>
#include <net/xdp.h>
#define __XDP_ACT_MAP(FN) \
FN(ABORTED) \
@ -404,6 +405,23 @@ TRACE_EVENT(mem_return_failed,
)
);
TRACE_EVENT(bpf_xdp_link_attach_failed,
TP_PROTO(const char *msg),
TP_ARGS(msg),
TP_STRUCT__entry(
__string(msg, msg)
),
TP_fast_assign(
__assign_str(msg, msg);
),
TP_printk("errmsg=%s", __get_str(msg))
);
#endif /* _TRACE_XDP_H */
#include <trace/define_trace.h>

View File

@ -19,6 +19,7 @@
/* ld/ldx fields */
#define BPF_DW 0x18 /* double word (64-bit) */
#define BPF_MEMSX 0x80 /* load with sign extension */
#define BPF_ATOMIC 0xc0 /* atomic memory ops - op type in immediate */
#define BPF_XADD 0xc0 /* exclusive add - legacy name */
@ -1187,6 +1188,11 @@ enum bpf_perf_event_type {
*/
#define BPF_F_KPROBE_MULTI_RETURN (1U << 0)
/* link_create.netfilter.flags used in LINK_CREATE command for
* BPF_PROG_TYPE_NETFILTER to enable IP packet defragmentation.
*/
#define BPF_F_NETFILTER_IP_DEFRAG (1U << 0)
/* When BPF ldimm64's insn[0].src_reg != 0 then this can have
* the following extensions:
*
@ -4198,9 +4204,6 @@ union bpf_attr {
* **-EOPNOTSUPP** if the operation is not supported, for example
* a call from outside of TC ingress.
*
* **-ESOCKTNOSUPPORT** if the socket type is not supported
* (reuseport).
*
* long bpf_sk_assign(struct bpf_sk_lookup *ctx, struct bpf_sock *sk, u64 flags)
* Description
* Helper is overloaded depending on BPF program type. This

View File

@ -29,6 +29,7 @@
#include <net/netfilter/nf_bpf_link.h>
#include <net/sock.h>
#include <net/xdp.h>
#include "../tools/lib/bpf/relo_core.h"
/* BTF (BPF Type Format) is the meta data format which describes

View File

@ -61,6 +61,7 @@
#define AX regs[BPF_REG_AX]
#define ARG1 regs[BPF_REG_ARG1]
#define CTX regs[BPF_REG_CTX]
#define OFF insn->off
#define IMM insn->imm
struct bpf_mem_alloc bpf_global_ma;
@ -372,7 +373,12 @@ static int bpf_adj_delta_to_off(struct bpf_insn *insn, u32 pos, s32 end_old,
{
const s32 off_min = S16_MIN, off_max = S16_MAX;
s32 delta = end_new - end_old;
s32 off = insn->off;
s32 off;
if (insn->code == (BPF_JMP32 | BPF_JA))
off = insn->imm;
else
off = insn->off;
if (curr < pos && curr + off + 1 >= end_old)
off += delta;
@ -380,8 +386,12 @@ static int bpf_adj_delta_to_off(struct bpf_insn *insn, u32 pos, s32 end_old,
off -= delta;
if (off < off_min || off > off_max)
return -ERANGE;
if (!probe_pass)
insn->off = off;
if (!probe_pass) {
if (insn->code == (BPF_JMP32 | BPF_JA))
insn->imm = off;
else
insn->off = off;
}
return 0;
}
@ -1271,7 +1281,7 @@ static int bpf_jit_blind_insn(const struct bpf_insn *from,
case BPF_ALU | BPF_MOD | BPF_K:
*to++ = BPF_ALU32_IMM(BPF_MOV, BPF_REG_AX, imm_rnd ^ from->imm);
*to++ = BPF_ALU32_IMM(BPF_XOR, BPF_REG_AX, imm_rnd);
*to++ = BPF_ALU32_REG(from->code, from->dst_reg, BPF_REG_AX);
*to++ = BPF_ALU32_REG_OFF(from->code, from->dst_reg, BPF_REG_AX, from->off);
break;
case BPF_ALU64 | BPF_ADD | BPF_K:
@ -1285,7 +1295,7 @@ static int bpf_jit_blind_insn(const struct bpf_insn *from,
case BPF_ALU64 | BPF_MOD | BPF_K:
*to++ = BPF_ALU64_IMM(BPF_MOV, BPF_REG_AX, imm_rnd ^ from->imm);
*to++ = BPF_ALU64_IMM(BPF_XOR, BPF_REG_AX, imm_rnd);
*to++ = BPF_ALU64_REG(from->code, from->dst_reg, BPF_REG_AX);
*to++ = BPF_ALU64_REG_OFF(from->code, from->dst_reg, BPF_REG_AX, from->off);
break;
case BPF_JMP | BPF_JEQ | BPF_K:
@ -1523,6 +1533,7 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
INSN_3(ALU64, DIV, X), \
INSN_3(ALU64, MOD, X), \
INSN_2(ALU64, NEG), \
INSN_3(ALU64, END, TO_LE), \
/* Immediate based. */ \
INSN_3(ALU64, ADD, K), \
INSN_3(ALU64, SUB, K), \
@ -1591,6 +1602,7 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
INSN_3(JMP, JSLE, K), \
INSN_3(JMP, JSET, K), \
INSN_2(JMP, JA), \
INSN_2(JMP32, JA), \
/* Store instructions. */ \
/* Register based. */ \
INSN_3(STX, MEM, B), \
@ -1610,6 +1622,9 @@ EXPORT_SYMBOL_GPL(__bpf_call_base);
INSN_3(LDX, MEM, H), \
INSN_3(LDX, MEM, W), \
INSN_3(LDX, MEM, DW), \
INSN_3(LDX, MEMSX, B), \
INSN_3(LDX, MEMSX, H), \
INSN_3(LDX, MEMSX, W), \
/* Immediate based. */ \
INSN_3(LD, IMM, DW)
@ -1635,12 +1650,6 @@ bool bpf_opcode_in_insntable(u8 code)
}
#ifndef CONFIG_BPF_JIT_ALWAYS_ON
u64 __weak bpf_probe_read_kernel(void *dst, u32 size, const void *unsafe_ptr)
{
memset(dst, 0, size);
return -EFAULT;
}
/**
* ___bpf_prog_run - run eBPF program on a given context
* @regs: is the array of MAX_BPF_EXT_REG eBPF pseudo-registers
@ -1666,6 +1675,9 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
[BPF_LDX | BPF_PROBE_MEM | BPF_H] = &&LDX_PROBE_MEM_H,
[BPF_LDX | BPF_PROBE_MEM | BPF_W] = &&LDX_PROBE_MEM_W,
[BPF_LDX | BPF_PROBE_MEM | BPF_DW] = &&LDX_PROBE_MEM_DW,
[BPF_LDX | BPF_PROBE_MEMSX | BPF_B] = &&LDX_PROBE_MEMSX_B,
[BPF_LDX | BPF_PROBE_MEMSX | BPF_H] = &&LDX_PROBE_MEMSX_H,
[BPF_LDX | BPF_PROBE_MEMSX | BPF_W] = &&LDX_PROBE_MEMSX_W,
};
#undef BPF_INSN_3_LBL
#undef BPF_INSN_2_LBL
@ -1733,13 +1745,36 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
DST = -DST;
CONT;
ALU_MOV_X:
DST = (u32) SRC;
switch (OFF) {
case 0:
DST = (u32) SRC;
break;
case 8:
DST = (u32)(s8) SRC;
break;
case 16:
DST = (u32)(s16) SRC;
break;
}
CONT;
ALU_MOV_K:
DST = (u32) IMM;
CONT;
ALU64_MOV_X:
DST = SRC;
switch (OFF) {
case 0:
DST = SRC;
break;
case 8:
DST = (s8) SRC;
break;
case 16:
DST = (s16) SRC;
break;
case 32:
DST = (s32) SRC;
break;
}
CONT;
ALU64_MOV_K:
DST = IMM;
@ -1761,36 +1796,114 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
(*(s64 *) &DST) >>= IMM;
CONT;
ALU64_MOD_X:
div64_u64_rem(DST, SRC, &AX);
DST = AX;
switch (OFF) {
case 0:
div64_u64_rem(DST, SRC, &AX);
DST = AX;
break;
case 1:
AX = div64_s64(DST, SRC);
DST = DST - AX * SRC;
break;
}
CONT;
ALU_MOD_X:
AX = (u32) DST;
DST = do_div(AX, (u32) SRC);
switch (OFF) {
case 0:
AX = (u32) DST;
DST = do_div(AX, (u32) SRC);
break;
case 1:
AX = abs((s32)DST);
AX = do_div(AX, abs((s32)SRC));
if ((s32)DST < 0)
DST = (u32)-AX;
else
DST = (u32)AX;
break;
}
CONT;
ALU64_MOD_K:
div64_u64_rem(DST, IMM, &AX);
DST = AX;
switch (OFF) {
case 0:
div64_u64_rem(DST, IMM, &AX);
DST = AX;
break;
case 1:
AX = div64_s64(DST, IMM);
DST = DST - AX * IMM;
break;
}
CONT;
ALU_MOD_K:
AX = (u32) DST;
DST = do_div(AX, (u32) IMM);
switch (OFF) {
case 0:
AX = (u32) DST;
DST = do_div(AX, (u32) IMM);
break;
case 1:
AX = abs((s32)DST);
AX = do_div(AX, abs((s32)IMM));
if ((s32)DST < 0)
DST = (u32)-AX;
else
DST = (u32)AX;
break;
}
CONT;
ALU64_DIV_X:
DST = div64_u64(DST, SRC);
switch (OFF) {
case 0:
DST = div64_u64(DST, SRC);
break;
case 1:
DST = div64_s64(DST, SRC);
break;
}
CONT;
ALU_DIV_X:
AX = (u32) DST;
do_div(AX, (u32) SRC);
DST = (u32) AX;
switch (OFF) {
case 0:
AX = (u32) DST;
do_div(AX, (u32) SRC);
DST = (u32) AX;
break;
case 1:
AX = abs((s32)DST);
do_div(AX, abs((s32)SRC));
if (((s32)DST < 0) == ((s32)SRC < 0))
DST = (u32)AX;
else
DST = (u32)-AX;
break;
}
CONT;
ALU64_DIV_K:
DST = div64_u64(DST, IMM);
switch (OFF) {
case 0:
DST = div64_u64(DST, IMM);
break;
case 1:
DST = div64_s64(DST, IMM);
break;
}
CONT;
ALU_DIV_K:
AX = (u32) DST;
do_div(AX, (u32) IMM);
DST = (u32) AX;
switch (OFF) {
case 0:
AX = (u32) DST;
do_div(AX, (u32) IMM);
DST = (u32) AX;
break;
case 1:
AX = abs((s32)DST);
do_div(AX, abs((s32)IMM));
if (((s32)DST < 0) == ((s32)IMM < 0))
DST = (u32)AX;
else
DST = (u32)-AX;
break;
}
CONT;
ALU_END_TO_BE:
switch (IMM) {
@ -1818,6 +1931,19 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
break;
}
CONT;
ALU64_END_TO_LE:
switch (IMM) {
case 16:
DST = (__force u16) __swab16(DST);
break;
case 32:
DST = (__force u32) __swab32(DST);
break;
case 64:
DST = (__force u64) __swab64(DST);
break;
}
CONT;
/* CALL */
JMP_CALL:
@ -1867,6 +1993,9 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
JMP_JA:
insn += insn->off;
CONT;
JMP32_JA:
insn += insn->imm;
CONT;
JMP_EXIT:
return BPF_R0;
/* JMP */
@ -1931,8 +2060,8 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
DST = *(SIZE *)(unsigned long) (SRC + insn->off); \
CONT; \
LDX_PROBE_MEM_##SIZEOP: \
bpf_probe_read_kernel(&DST, sizeof(SIZE), \
(const void *)(long) (SRC + insn->off)); \
bpf_probe_read_kernel_common(&DST, sizeof(SIZE), \
(const void *)(long) (SRC + insn->off)); \
DST = *((SIZE *)&DST); \
CONT;
@ -1942,6 +2071,21 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
LDST(DW, u64)
#undef LDST
#define LDSX(SIZEOP, SIZE) \
LDX_MEMSX_##SIZEOP: \
DST = *(SIZE *)(unsigned long) (SRC + insn->off); \
CONT; \
LDX_PROBE_MEMSX_##SIZEOP: \
bpf_probe_read_kernel_common(&DST, sizeof(SIZE), \
(const void *)(long) (SRC + insn->off)); \
DST = *((SIZE *)&DST); \
CONT;
LDSX(B, s8)
LDSX(H, s16)
LDSX(W, s32)
#undef LDSX
#define ATOMIC_ALU_OP(BOP, KOP) \
case BOP: \
if (BPF_SIZE(insn->code) == BPF_W) \

View File

@ -61,8 +61,6 @@ struct bpf_cpu_map_entry {
/* XDP can run multiple RX-ring queues, need __percpu enqueue store */
struct xdp_bulk_queue __percpu *bulkq;
struct bpf_cpu_map *cmap;
/* Queue with potential multi-producers, and single-consumer kthread */
struct ptr_ring *queue;
struct task_struct *kthread;
@ -595,7 +593,6 @@ static long cpu_map_update_elem(struct bpf_map *map, void *key, void *value,
rcpu = __cpu_map_entry_alloc(map, &cpumap_value, key_cpu);
if (!rcpu)
return -ENOMEM;
rcpu->cmap = cmap;
}
rcu_read_lock();
__cpu_map_entry_replace(cmap, key_cpu, rcpu);

View File

@ -65,7 +65,6 @@ struct xdp_dev_bulk_queue {
struct bpf_dtab_netdev {
struct net_device *dev; /* must be first member, due to tracepoint */
struct hlist_node index_hlist;
struct bpf_dtab *dtab;
struct bpf_prog *xdp_prog;
struct rcu_head rcu;
unsigned int idx;
@ -874,7 +873,6 @@ static struct bpf_dtab_netdev *__dev_map_alloc_node(struct net *net,
}
dev->idx = idx;
dev->dtab = dtab;
if (prog) {
dev->xdp_prog = prog;
dev->val.bpf_prog.id = prog->aux->id;

View File

@ -87,6 +87,17 @@ const char *const bpf_alu_string[16] = {
[BPF_END >> 4] = "endian",
};
const char *const bpf_alu_sign_string[16] = {
[BPF_DIV >> 4] = "s/=",
[BPF_MOD >> 4] = "s%=",
};
const char *const bpf_movsx_string[4] = {
[0] = "(s8)",
[1] = "(s16)",
[3] = "(s32)",
};
static const char *const bpf_atomic_alu_string[16] = {
[BPF_ADD >> 4] = "add",
[BPF_AND >> 4] = "and",
@ -101,6 +112,12 @@ static const char *const bpf_ldst_string[] = {
[BPF_DW >> 3] = "u64",
};
static const char *const bpf_ldsx_string[] = {
[BPF_W >> 3] = "s32",
[BPF_H >> 3] = "s16",
[BPF_B >> 3] = "s8",
};
static const char *const bpf_jmp_string[16] = {
[BPF_JA >> 4] = "jmp",
[BPF_JEQ >> 4] = "==",
@ -128,6 +145,27 @@ static void print_bpf_end_insn(bpf_insn_print_t verbose,
insn->imm, insn->dst_reg);
}
static void print_bpf_bswap_insn(bpf_insn_print_t verbose,
void *private_data,
const struct bpf_insn *insn)
{
verbose(private_data, "(%02x) r%d = bswap%d r%d\n",
insn->code, insn->dst_reg,
insn->imm, insn->dst_reg);
}
static bool is_sdiv_smod(const struct bpf_insn *insn)
{
return (BPF_OP(insn->code) == BPF_DIV || BPF_OP(insn->code) == BPF_MOD) &&
insn->off == 1;
}
static bool is_movsx(const struct bpf_insn *insn)
{
return BPF_OP(insn->code) == BPF_MOV &&
(insn->off == 8 || insn->off == 16 || insn->off == 32);
}
void print_bpf_insn(const struct bpf_insn_cbs *cbs,
const struct bpf_insn *insn,
bool allow_ptr_leaks)
@ -138,7 +176,7 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
if (class == BPF_ALU || class == BPF_ALU64) {
if (BPF_OP(insn->code) == BPF_END) {
if (class == BPF_ALU64)
verbose(cbs->private_data, "BUG_alu64_%02x\n", insn->code);
print_bpf_bswap_insn(verbose, cbs->private_data, insn);
else
print_bpf_end_insn(verbose, cbs->private_data, insn);
} else if (BPF_OP(insn->code) == BPF_NEG) {
@ -147,17 +185,20 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
insn->dst_reg, class == BPF_ALU ? 'w' : 'r',
insn->dst_reg);
} else if (BPF_SRC(insn->code) == BPF_X) {
verbose(cbs->private_data, "(%02x) %c%d %s %c%d\n",
verbose(cbs->private_data, "(%02x) %c%d %s %s%c%d\n",
insn->code, class == BPF_ALU ? 'w' : 'r',
insn->dst_reg,
bpf_alu_string[BPF_OP(insn->code) >> 4],
is_sdiv_smod(insn) ? bpf_alu_sign_string[BPF_OP(insn->code) >> 4]
: bpf_alu_string[BPF_OP(insn->code) >> 4],
is_movsx(insn) ? bpf_movsx_string[(insn->off >> 3) - 1] : "",
class == BPF_ALU ? 'w' : 'r',
insn->src_reg);
} else {
verbose(cbs->private_data, "(%02x) %c%d %s %d\n",
insn->code, class == BPF_ALU ? 'w' : 'r',
insn->dst_reg,
bpf_alu_string[BPF_OP(insn->code) >> 4],
is_sdiv_smod(insn) ? bpf_alu_sign_string[BPF_OP(insn->code) >> 4]
: bpf_alu_string[BPF_OP(insn->code) >> 4],
insn->imm);
}
} else if (class == BPF_STX) {
@ -218,13 +259,15 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
verbose(cbs->private_data, "BUG_st_%02x\n", insn->code);
}
} else if (class == BPF_LDX) {
if (BPF_MODE(insn->code) != BPF_MEM) {
if (BPF_MODE(insn->code) != BPF_MEM && BPF_MODE(insn->code) != BPF_MEMSX) {
verbose(cbs->private_data, "BUG_ldx_%02x\n", insn->code);
return;
}
verbose(cbs->private_data, "(%02x) r%d = *(%s *)(r%d %+d)\n",
insn->code, insn->dst_reg,
bpf_ldst_string[BPF_SIZE(insn->code) >> 3],
BPF_MODE(insn->code) == BPF_MEM ?
bpf_ldst_string[BPF_SIZE(insn->code) >> 3] :
bpf_ldsx_string[BPF_SIZE(insn->code) >> 3],
insn->src_reg, insn->off);
} else if (class == BPF_LD) {
if (BPF_MODE(insn->code) == BPF_ABS) {
@ -279,6 +322,9 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
} else if (insn->code == (BPF_JMP | BPF_JA)) {
verbose(cbs->private_data, "(%02x) goto pc%+d\n",
insn->code, insn->off);
} else if (insn->code == (BPF_JMP32 | BPF_JA)) {
verbose(cbs->private_data, "(%02x) gotol pc%+d\n",
insn->code, insn->imm);
} else if (insn->code == (BPF_JMP | BPF_EXIT)) {
verbose(cbs->private_data, "(%02x) exit\n", insn->code);
} else if (BPF_SRC(insn->code) == BPF_X) {
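Illustrative aside (not part of the patch): the encodings handled above come from the cpu v4 instruction set. Assuming a clang new enough to accept -mcpu=v4, a snippet like the one below is a typical source of the sign-extending load and register cast that the updated disassembler now prints; exact codegen may vary.

/* Illustrative only -- not from this series. Built with
 * clang -O2 --target=bpf -mcpu=v4, the compiler may emit the new
 * sign-extending load (BPF_LDX | BPF_MEMSX) and 32->64 bit movsx
 * (BPF_ALU64 | BPF_MOV with off=32), which disasm.c now renders as
 * e.g. "r0 = *(s8 *)(r1 +0)" and "r0 = (s32)r1".
 */
long read_one(const signed char *p, int v)
{
	long x = *p;	/* candidate for the new s8 sign-extending load  */
	long y = v;	/* candidate for the new (s32) register movsx    */

	return x + y;
}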

View File

@ -183,11 +183,11 @@ static void inc_active(struct bpf_mem_cache *c, unsigned long *flags)
WARN_ON_ONCE(local_inc_return(&c->active) != 1);
}
static void dec_active(struct bpf_mem_cache *c, unsigned long flags)
static void dec_active(struct bpf_mem_cache *c, unsigned long *flags)
{
local_dec(&c->active);
if (IS_ENABLED(CONFIG_PREEMPT_RT))
local_irq_restore(flags);
local_irq_restore(*flags);
}
static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
@ -197,16 +197,20 @@ static void add_obj_to_free_list(struct bpf_mem_cache *c, void *obj)
inc_active(c, &flags);
__llist_add(obj, &c->free_llist);
c->free_cnt++;
dec_active(c, flags);
dec_active(c, &flags);
}
/* Mostly runs from irq_work except __init phase. */
static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node, bool atomic)
{
struct mem_cgroup *memcg = NULL, *old_memcg;
gfp_t gfp;
void *obj;
int i;
gfp = __GFP_NOWARN | __GFP_ACCOUNT;
gfp |= atomic ? GFP_NOWAIT : GFP_KERNEL;
for (i = 0; i < cnt; i++) {
/*
* For every 'c' llist_del_first(&c->free_by_rcu_ttrace); is
@ -238,7 +242,7 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
* will allocate from the current numa node which is what we
* want here.
*/
obj = __alloc(c, node, GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT);
obj = __alloc(c, node, gfp);
if (!obj)
break;
add_obj_to_free_list(c, obj);
@ -344,7 +348,7 @@ static void free_bulk(struct bpf_mem_cache *c)
cnt = --c->free_cnt;
else
cnt = 0;
dec_active(c, flags);
dec_active(c, &flags);
if (llnode)
enque_to_free(tgt, llnode);
} while (cnt > (c->high_watermark + c->low_watermark) / 2);
@ -384,7 +388,7 @@ static void check_free_by_rcu(struct bpf_mem_cache *c)
llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra_rcu))
if (__llist_add(llnode, &c->free_by_rcu))
c->free_by_rcu_tail = llnode;
dec_active(c, flags);
dec_active(c, &flags);
}
if (llist_empty(&c->free_by_rcu))
@ -408,7 +412,7 @@ static void check_free_by_rcu(struct bpf_mem_cache *c)
inc_active(c, &flags);
WRITE_ONCE(c->waiting_for_gp.first, __llist_del_all(&c->free_by_rcu));
c->waiting_for_gp_tail = c->free_by_rcu_tail;
dec_active(c, flags);
dec_active(c, &flags);
if (unlikely(READ_ONCE(c->draining))) {
free_all(llist_del_all(&c->waiting_for_gp), !!c->percpu_size);
@ -429,7 +433,7 @@ static void bpf_mem_refill(struct irq_work *work)
/* irq_work runs on this cpu and kmalloc will allocate
* from the current numa node which is what we want here.
*/
alloc_bulk(c, c->batch, NUMA_NO_NODE);
alloc_bulk(c, c->batch, NUMA_NO_NODE, true);
else if (cnt > c->high_watermark)
free_bulk(c);
@ -477,7 +481,7 @@ static void prefill_mem_cache(struct bpf_mem_cache *c, int cpu)
* prog won't be doing more than 4 map_update_elem from
* irq disabled region
*/
alloc_bulk(c, c->unit_size <= 256 ? 4 : 1, cpu_to_node(cpu));
alloc_bulk(c, c->unit_size <= 256 ? 4 : 1, cpu_to_node(cpu), false);
}
/* When size != 0 bpf_mem_cache for each cpu.

View File

@ -25,6 +25,7 @@
#include <linux/rhashtable.h>
#include <linux/rtnetlink.h>
#include <linux/rwsem.h>
#include <net/xdp.h>
/* Protects offdevs, members of bpf_offload_netdev and offload members
* of all progs.

View File

@ -26,6 +26,7 @@
#include <linux/poison.h>
#include <linux/module.h>
#include <linux/cpumask.h>
#include <net/xdp.h>
#include "disasm.h"
@ -2855,7 +2856,10 @@ static int check_subprogs(struct bpf_verifier_env *env)
goto next;
if (BPF_OP(code) == BPF_EXIT || BPF_OP(code) == BPF_CALL)
goto next;
off = i + insn[i].off + 1;
if (code == (BPF_JMP32 | BPF_JA))
off = i + insn[i].imm + 1;
else
off = i + insn[i].off + 1;
if (off < subprog_start || off >= subprog_end) {
verbose(env, "jump out of range from insn %d to %d\n", i, off);
return -EINVAL;
@ -2867,6 +2871,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
* or unconditional jump back
*/
if (code != (BPF_JMP | BPF_EXIT) &&
code != (BPF_JMP32 | BPF_JA) &&
code != (BPF_JMP | BPF_JA)) {
verbose(env, "last insn is not an exit or jmp\n");
return -EINVAL;
@ -3012,8 +3017,10 @@ static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
}
if (class == BPF_ALU64 && op == BPF_END && (insn->imm == 16 || insn->imm == 32))
return false;
if (class == BPF_ALU64 || class == BPF_JMP ||
/* BPF_END always use BPF_ALU class. */
(class == BPF_ALU && op == BPF_END && insn->imm == 64))
return true;
@ -3421,7 +3428,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
return 0;
if (opcode == BPF_MOV) {
if (BPF_SRC(insn->code) == BPF_X) {
/* dreg = sreg
/* dreg = sreg or dreg = (s8, s16, s32)sreg
* dreg needs precision after this insn
* sreg needs precision before this insn
*/
@ -5827,6 +5834,147 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
__reg_combine_64_into_32(reg);
}
static void set_sext64_default_val(struct bpf_reg_state *reg, int size)
{
if (size == 1) {
reg->smin_value = reg->s32_min_value = S8_MIN;
reg->smax_value = reg->s32_max_value = S8_MAX;
} else if (size == 2) {
reg->smin_value = reg->s32_min_value = S16_MIN;
reg->smax_value = reg->s32_max_value = S16_MAX;
} else {
/* size == 4 */
reg->smin_value = reg->s32_min_value = S32_MIN;
reg->smax_value = reg->s32_max_value = S32_MAX;
}
reg->umin_value = reg->u32_min_value = 0;
reg->umax_value = U64_MAX;
reg->u32_max_value = U32_MAX;
reg->var_off = tnum_unknown;
}
static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
{
s64 init_s64_max, init_s64_min, s64_max, s64_min, u64_cval;
u64 top_smax_value, top_smin_value;
u64 num_bits = size * 8;
if (tnum_is_const(reg->var_off)) {
u64_cval = reg->var_off.value;
if (size == 1)
reg->var_off = tnum_const((s8)u64_cval);
else if (size == 2)
reg->var_off = tnum_const((s16)u64_cval);
else
/* size == 4 */
reg->var_off = tnum_const((s32)u64_cval);
u64_cval = reg->var_off.value;
reg->smax_value = reg->smin_value = u64_cval;
reg->umax_value = reg->umin_value = u64_cval;
reg->s32_max_value = reg->s32_min_value = u64_cval;
reg->u32_max_value = reg->u32_min_value = u64_cval;
return;
}
top_smax_value = ((u64)reg->smax_value >> num_bits) << num_bits;
top_smin_value = ((u64)reg->smin_value >> num_bits) << num_bits;
if (top_smax_value != top_smin_value)
goto out;
/* find the s64_max and s64_min after sign extension */
if (size == 1) {
init_s64_max = (s8)reg->smax_value;
init_s64_min = (s8)reg->smin_value;
} else if (size == 2) {
init_s64_max = (s16)reg->smax_value;
init_s64_min = (s16)reg->smin_value;
} else {
init_s64_max = (s32)reg->smax_value;
init_s64_min = (s32)reg->smin_value;
}
s64_max = max(init_s64_max, init_s64_min);
s64_min = min(init_s64_max, init_s64_min);
/* both of s64_max/s64_min positive or negative */
if ((s64_max >= 0) == (s64_min >= 0)) {
reg->smin_value = reg->s32_min_value = s64_min;
reg->smax_value = reg->s32_max_value = s64_max;
reg->umin_value = reg->u32_min_value = s64_min;
reg->umax_value = reg->u32_max_value = s64_max;
reg->var_off = tnum_range(s64_min, s64_max);
return;
}
out:
set_sext64_default_val(reg, size);
}
static void set_sext32_default_val(struct bpf_reg_state *reg, int size)
{
if (size == 1) {
reg->s32_min_value = S8_MIN;
reg->s32_max_value = S8_MAX;
} else {
/* size == 2 */
reg->s32_min_value = S16_MIN;
reg->s32_max_value = S16_MAX;
}
reg->u32_min_value = 0;
reg->u32_max_value = U32_MAX;
}
static void coerce_subreg_to_size_sx(struct bpf_reg_state *reg, int size)
{
s32 init_s32_max, init_s32_min, s32_max, s32_min, u32_val;
u32 top_smax_value, top_smin_value;
u32 num_bits = size * 8;
if (tnum_is_const(reg->var_off)) {
u32_val = reg->var_off.value;
if (size == 1)
reg->var_off = tnum_const((s8)u32_val);
else
reg->var_off = tnum_const((s16)u32_val);
u32_val = reg->var_off.value;
reg->s32_min_value = reg->s32_max_value = u32_val;
reg->u32_min_value = reg->u32_max_value = u32_val;
return;
}
top_smax_value = ((u32)reg->s32_max_value >> num_bits) << num_bits;
top_smin_value = ((u32)reg->s32_min_value >> num_bits) << num_bits;
if (top_smax_value != top_smin_value)
goto out;
/* find the s32_max and s32_min after sign extension */
if (size == 1) {
init_s32_max = (s8)reg->s32_max_value;
init_s32_min = (s8)reg->s32_min_value;
} else {
/* size == 2 */
init_s32_max = (s16)reg->s32_max_value;
init_s32_min = (s16)reg->s32_min_value;
}
s32_max = max(init_s32_max, init_s32_min);
s32_min = min(init_s32_max, init_s32_min);
if ((s32_min >= 0) == (s32_max >= 0)) {
reg->s32_min_value = s32_min;
reg->s32_max_value = s32_max;
reg->u32_min_value = (u32)s32_min;
reg->u32_max_value = (u32)s32_max;
return;
}
out:
set_sext32_default_val(reg, size);
}
static bool bpf_map_is_rdonly(const struct bpf_map *map)
{
/* A map is considered read-only if the following condition are true:
@ -5847,7 +5995,8 @@ static bool bpf_map_is_rdonly(const struct bpf_map *map)
!bpf_map_write_active(map);
}
static int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val)
static int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val,
bool is_ldsx)
{
void *ptr;
u64 addr;
@ -5860,13 +6009,13 @@ static int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val)
switch (size) {
case sizeof(u8):
*val = (u64)*(u8 *)ptr;
*val = is_ldsx ? (s64)*(s8 *)ptr : (u64)*(u8 *)ptr;
break;
case sizeof(u16):
*val = (u64)*(u16 *)ptr;
*val = is_ldsx ? (s64)*(s16 *)ptr : (u64)*(u16 *)ptr;
break;
case sizeof(u32):
*val = (u64)*(u32 *)ptr;
*val = is_ldsx ? (s64)*(s32 *)ptr : (u64)*(u32 *)ptr;
break;
case sizeof(u64):
*val = *(u64 *)ptr;
@ -6285,7 +6434,7 @@ static int check_stack_access_within_bounds(
*/
static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regno,
int off, int bpf_size, enum bpf_access_type t,
int value_regno, bool strict_alignment_once)
int value_regno, bool strict_alignment_once, bool is_ldsx)
{
struct bpf_reg_state *regs = cur_regs(env);
struct bpf_reg_state *reg = regs + regno;
@ -6346,7 +6495,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
u64 val = 0;
err = bpf_map_direct_read(map, map_off, size,
&val);
&val, is_ldsx);
if (err)
return err;
@ -6516,8 +6665,11 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
if (!err && size < BPF_REG_SIZE && value_regno >= 0 && t == BPF_READ &&
regs[value_regno].type == SCALAR_VALUE) {
/* b/h/w load zero-extends, mark upper bits as known 0 */
coerce_reg_to_size(&regs[value_regno], size);
if (!is_ldsx)
/* b/h/w load zero-extends, mark upper bits as known 0 */
coerce_reg_to_size(&regs[value_regno], size);
else
coerce_reg_to_size_sx(&regs[value_regno], size);
}
return err;
}
@ -6609,17 +6761,17 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i
* case to simulate the register fill.
*/
err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
BPF_SIZE(insn->code), BPF_READ, -1, true);
BPF_SIZE(insn->code), BPF_READ, -1, true, false);
if (!err && load_reg >= 0)
err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
BPF_SIZE(insn->code), BPF_READ, load_reg,
true);
true, false);
if (err)
return err;
/* Check whether we can write into the same memory. */
err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
BPF_SIZE(insn->code), BPF_WRITE, -1, true);
BPF_SIZE(insn->code), BPF_WRITE, -1, true, false);
if (err)
return err;
@ -6865,7 +7017,7 @@ static int check_helper_mem_access(struct bpf_verifier_env *env, int regno,
return zero_size_allowed ? 0 : -EACCES;
return check_mem_access(env, env->insn_idx, regno, offset, BPF_B,
atype, -1, false);
atype, -1, false, false);
}
fallthrough;
@ -7237,7 +7389,7 @@ static int process_dynptr_func(struct bpf_verifier_env *env, int regno, int insn
/* we write BPF_DW bits (8 bytes) at a time */
for (i = 0; i < BPF_DYNPTR_SIZE; i += 8) {
err = check_mem_access(env, insn_idx, regno,
i, BPF_DW, BPF_WRITE, -1, false);
i, BPF_DW, BPF_WRITE, -1, false, false);
if (err)
return err;
}
@ -7330,7 +7482,7 @@ static int process_iter_arg(struct bpf_verifier_env *env, int regno, int insn_id
for (i = 0; i < nr_slots * 8; i += BPF_REG_SIZE) {
err = check_mem_access(env, insn_idx, regno,
i, BPF_DW, BPF_WRITE, -1, false);
i, BPF_DW, BPF_WRITE, -1, false, false);
if (err)
return err;
}
@ -9474,7 +9626,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
*/
for (i = 0; i < meta.access_size; i++) {
err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
BPF_WRITE, -1, false);
BPF_WRITE, -1, false, false);
if (err)
return err;
}
@ -12931,7 +13083,8 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
} else {
if (insn->src_reg != BPF_REG_0 || insn->off != 0 ||
(insn->imm != 16 && insn->imm != 32 && insn->imm != 64) ||
BPF_CLASS(insn->code) == BPF_ALU64) {
(BPF_CLASS(insn->code) == BPF_ALU64 &&
BPF_SRC(insn->code) != BPF_TO_LE)) {
verbose(env, "BPF_END uses reserved fields\n");
return -EINVAL;
}
@ -12956,11 +13109,24 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
} else if (opcode == BPF_MOV) {
if (BPF_SRC(insn->code) == BPF_X) {
if (insn->imm != 0 || insn->off != 0) {
if (insn->imm != 0) {
verbose(env, "BPF_MOV uses reserved fields\n");
return -EINVAL;
}
if (BPF_CLASS(insn->code) == BPF_ALU) {
if (insn->off != 0 && insn->off != 8 && insn->off != 16) {
verbose(env, "BPF_MOV uses reserved fields\n");
return -EINVAL;
}
} else {
if (insn->off != 0 && insn->off != 8 && insn->off != 16 &&
insn->off != 32) {
verbose(env, "BPF_MOV uses reserved fields\n");
return -EINVAL;
}
}
/* check src operand */
err = check_reg_arg(env, insn->src_reg, SRC_OP);
if (err)
@ -12984,18 +13150,33 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
!tnum_is_const(src_reg->var_off);
if (BPF_CLASS(insn->code) == BPF_ALU64) {
/* case: R1 = R2
* copy register state to dest reg
*/
if (need_id)
/* Assign src and dst registers the same ID
* that will be used by find_equal_scalars()
* to propagate min/max range.
if (insn->off == 0) {
/* case: R1 = R2
* copy register state to dest reg
*/
src_reg->id = ++env->id_gen;
copy_register_state(dst_reg, src_reg);
dst_reg->live |= REG_LIVE_WRITTEN;
dst_reg->subreg_def = DEF_NOT_SUBREG;
if (need_id)
/* Assign src and dst registers the same ID
* that will be used by find_equal_scalars()
* to propagate min/max range.
*/
src_reg->id = ++env->id_gen;
copy_register_state(dst_reg, src_reg);
dst_reg->live |= REG_LIVE_WRITTEN;
dst_reg->subreg_def = DEF_NOT_SUBREG;
} else {
/* case: R1 = (s8, s16, s32)R2 */
bool no_sext;
no_sext = src_reg->umax_value < (1ULL << (insn->off - 1));
if (no_sext && need_id)
src_reg->id = ++env->id_gen;
copy_register_state(dst_reg, src_reg);
if (!no_sext)
dst_reg->id = 0;
coerce_reg_to_size_sx(dst_reg, insn->off >> 3);
dst_reg->live |= REG_LIVE_WRITTEN;
dst_reg->subreg_def = DEF_NOT_SUBREG;
}
} else {
/* R1 = (u32) R2 */
if (is_pointer_value(env, insn->src_reg)) {
@ -13004,19 +13185,33 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
insn->src_reg);
return -EACCES;
} else if (src_reg->type == SCALAR_VALUE) {
bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;
if (insn->off == 0) {
bool is_src_reg_u32 = src_reg->umax_value <= U32_MAX;
if (is_src_reg_u32 && need_id)
src_reg->id = ++env->id_gen;
copy_register_state(dst_reg, src_reg);
/* Make sure ID is cleared if src_reg is not in u32 range otherwise
* dst_reg min/max could be incorrectly
* propagated into src_reg by find_equal_scalars()
*/
if (!is_src_reg_u32)
dst_reg->id = 0;
dst_reg->live |= REG_LIVE_WRITTEN;
dst_reg->subreg_def = env->insn_idx + 1;
if (is_src_reg_u32 && need_id)
src_reg->id = ++env->id_gen;
copy_register_state(dst_reg, src_reg);
/* Make sure ID is cleared if src_reg is not in u32
* range otherwise dst_reg min/max could be incorrectly
* propagated into src_reg by find_equal_scalars()
*/
if (!is_src_reg_u32)
dst_reg->id = 0;
dst_reg->live |= REG_LIVE_WRITTEN;
dst_reg->subreg_def = env->insn_idx + 1;
} else {
/* case: W1 = (s8, s16)W2 */
bool no_sext = src_reg->umax_value < (1ULL << (insn->off - 1));
if (no_sext && need_id)
src_reg->id = ++env->id_gen;
copy_register_state(dst_reg, src_reg);
if (!no_sext)
dst_reg->id = 0;
dst_reg->live |= REG_LIVE_WRITTEN;
dst_reg->subreg_def = env->insn_idx + 1;
coerce_subreg_to_size_sx(dst_reg, insn->off >> 3);
}
} else {
mark_reg_unknown(env, regs,
insn->dst_reg);
@ -13047,7 +13242,8 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
} else { /* all other ALU ops: and, sub, xor, add, ... */
if (BPF_SRC(insn->code) == BPF_X) {
if (insn->imm != 0 || insn->off != 0) {
if (insn->imm != 0 || insn->off > 1 ||
(insn->off == 1 && opcode != BPF_MOD && opcode != BPF_DIV)) {
verbose(env, "BPF_ALU uses reserved fields\n");
return -EINVAL;
}
@ -13056,7 +13252,8 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
if (err)
return err;
} else {
if (insn->src_reg != BPF_REG_0 || insn->off != 0) {
if (insn->src_reg != BPF_REG_0 || insn->off > 1 ||
(insn->off == 1 && opcode != BPF_MOD && opcode != BPF_DIV)) {
verbose(env, "BPF_ALU uses reserved fields\n");
return -EINVAL;
}
@ -14600,7 +14797,7 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns,
static int visit_insn(int t, struct bpf_verifier_env *env)
{
struct bpf_insn *insns = env->prog->insnsi, *insn = &insns[t];
int ret;
int ret, off;
if (bpf_pseudo_func(insn))
return visit_func_call_insn(t, insns, env, true);
@ -14648,14 +14845,19 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
if (BPF_SRC(insn->code) != BPF_K)
return -EINVAL;
if (BPF_CLASS(insn->code) == BPF_JMP)
off = insn->off;
else
off = insn->imm;
/* unconditional jump with single edge */
ret = push_insn(t, t + insn->off + 1, FALLTHROUGH, env,
ret = push_insn(t, t + off + 1, FALLTHROUGH, env,
true);
if (ret)
return ret;
mark_prune_point(env, t + insn->off + 1);
mark_jmp_point(env, t + insn->off + 1);
mark_prune_point(env, t + off + 1);
mark_jmp_point(env, t + off + 1);
return ret;
@ -16202,7 +16404,7 @@ static int save_aux_ptr_type(struct bpf_verifier_env *env, enum bpf_reg_type typ
* Have to support a use case when one path through
* the program yields TRUSTED pointer while another
* is UNTRUSTED. Fallback to UNTRUSTED to generate
* BPF_PROBE_MEM.
* BPF_PROBE_MEM/BPF_PROBE_MEMSX.
*/
*prev_type = PTR_TO_BTF_ID | PTR_UNTRUSTED;
} else {
@ -16343,7 +16545,8 @@ static int do_check(struct bpf_verifier_env *env)
*/
err = check_mem_access(env, env->insn_idx, insn->src_reg,
insn->off, BPF_SIZE(insn->code),
BPF_READ, insn->dst_reg, false);
BPF_READ, insn->dst_reg, false,
BPF_MODE(insn->code) == BPF_MEMSX);
if (err)
return err;
@ -16380,7 +16583,7 @@ static int do_check(struct bpf_verifier_env *env)
/* check that memory (dst_reg + off) is writeable */
err = check_mem_access(env, env->insn_idx, insn->dst_reg,
insn->off, BPF_SIZE(insn->code),
BPF_WRITE, insn->src_reg, false);
BPF_WRITE, insn->src_reg, false, false);
if (err)
return err;
@ -16405,7 +16608,7 @@ static int do_check(struct bpf_verifier_env *env)
/* check that memory (dst_reg + off) is writeable */
err = check_mem_access(env, env->insn_idx, insn->dst_reg,
insn->off, BPF_SIZE(insn->code),
BPF_WRITE, -1, false);
BPF_WRITE, -1, false, false);
if (err)
return err;
@ -16450,15 +16653,18 @@ static int do_check(struct bpf_verifier_env *env)
mark_reg_scratched(env, BPF_REG_0);
} else if (opcode == BPF_JA) {
if (BPF_SRC(insn->code) != BPF_K ||
insn->imm != 0 ||
insn->src_reg != BPF_REG_0 ||
insn->dst_reg != BPF_REG_0 ||
class == BPF_JMP32) {
(class == BPF_JMP && insn->imm != 0) ||
(class == BPF_JMP32 && insn->off != 0)) {
verbose(env, "BPF_JA uses reserved fields\n");
return -EINVAL;
}
env->insn_idx += insn->off + 1;
if (class == BPF_JMP)
env->insn_idx += insn->off + 1;
else
env->insn_idx += insn->imm + 1;
continue;
} else if (opcode == BPF_EXIT) {
@ -16833,7 +17039,8 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
for (i = 0; i < insn_cnt; i++, insn++) {
if (BPF_CLASS(insn->code) == BPF_LDX &&
(BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0)) {
((BPF_MODE(insn->code) != BPF_MEM && BPF_MODE(insn->code) != BPF_MEMSX) ||
insn->imm != 0)) {
verbose(env, "BPF_LDX uses reserved fields\n");
return -EINVAL;
}
@ -17304,13 +17511,13 @@ static bool insn_is_cond_jump(u8 code)
{
u8 op;
op = BPF_OP(code);
if (BPF_CLASS(code) == BPF_JMP32)
return true;
return op != BPF_JA;
if (BPF_CLASS(code) != BPF_JMP)
return false;
op = BPF_OP(code);
return op != BPF_JA && op != BPF_EXIT && op != BPF_CALL;
}
@ -17527,11 +17734,15 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
for (i = 0; i < insn_cnt; i++, insn++) {
bpf_convert_ctx_access_t convert_ctx_access;
u8 mode;
if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
insn->code == (BPF_LDX | BPF_MEM | BPF_DW)) {
insn->code == (BPF_LDX | BPF_MEM | BPF_DW) ||
insn->code == (BPF_LDX | BPF_MEMSX | BPF_B) ||
insn->code == (BPF_LDX | BPF_MEMSX | BPF_H) ||
insn->code == (BPF_LDX | BPF_MEMSX | BPF_W)) {
type = BPF_READ;
} else if (insn->code == (BPF_STX | BPF_MEM | BPF_B) ||
insn->code == (BPF_STX | BPF_MEM | BPF_H) ||
@ -17590,8 +17801,12 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
*/
case PTR_TO_BTF_ID | MEM_ALLOC | PTR_UNTRUSTED:
if (type == BPF_READ) {
insn->code = BPF_LDX | BPF_PROBE_MEM |
BPF_SIZE((insn)->code);
if (BPF_MODE(insn->code) == BPF_MEM)
insn->code = BPF_LDX | BPF_PROBE_MEM |
BPF_SIZE((insn)->code);
else
insn->code = BPF_LDX | BPF_PROBE_MEMSX |
BPF_SIZE((insn)->code);
env->prog->aux->num_exentries++;
}
continue;
@ -17601,6 +17816,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
ctx_field_size = env->insn_aux_data[i + delta].ctx_field_size;
size = BPF_LDST_BYTES(insn);
mode = BPF_MODE(insn->code);
/* If the read access is a narrower load of the field,
* convert to a 4/8-byte load, to minimum program type specific
@ -17660,6 +17876,10 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
(1ULL << size * 8) - 1);
}
}
if (mode == BPF_MEMSX)
insn_buf[cnt++] = BPF_RAW_INSN(BPF_ALU64 | BPF_MOV | BPF_X,
insn->dst_reg, insn->dst_reg,
size * 8, 0);
new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
if (!new_prog)
@ -17779,7 +17999,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
insn = func[i]->insnsi;
for (j = 0; j < func[i]->len; j++, insn++) {
if (BPF_CLASS(insn->code) == BPF_LDX &&
BPF_MODE(insn->code) == BPF_PROBE_MEM)
(BPF_MODE(insn->code) == BPF_PROBE_MEM ||
BPF_MODE(insn->code) == BPF_PROBE_MEMSX))
num_exentries++;
}
func[i]->aux->num_exentries = num_exentries;
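Illustrative aside (not part of the patch): the bounds logic added above in coerce_reg_to_size_sx() can be replayed in plain userspace C. The sketch below covers only the (s8) case and mirrors the rule shown in the hunk: keep the range when the upper bits agree and the sign-extended endpoints share a sign, otherwise fall back to the full [S8_MIN, S8_MAX] interval.

/* Standalone userspace sketch, not kernel code. */
#include <stdint.h>
#include <stdio.h>

static void sext8_range(int64_t smin, int64_t smax)
{
	uint64_t top_min = ((uint64_t)smin >> 8) << 8;
	uint64_t top_max = ((uint64_t)smax >> 8) << 8;
	int64_t lo = (int8_t)smin, hi = (int8_t)smax;
	int64_t min = lo < hi ? lo : hi, max = lo < hi ? hi : lo;

	if (top_min == top_max && (min >= 0) == (max >= 0))
		printf("[%lld, %lld] -> (s8) -> [%lld, %lld]\n",
		       (long long)smin, (long long)smax,
		       (long long)min, (long long)max);
	else
		printf("[%lld, %lld] -> (s8) -> unknown, use [-128, 127]\n",
		       (long long)smin, (long long)smax);
}

int main(void)
{
	sext8_range(0x105, 0x164);	/* upper bits equal, same sign: stays [5, 100] */
	sext8_range(100, 200);		/* 200 sign-extends to -56: fall back          */
	return 0;
}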

View File

@ -223,17 +223,6 @@ const struct bpf_func_proto bpf_probe_read_user_str_proto = {
.arg3_type = ARG_ANYTHING,
};
static __always_inline int
bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr)
{
int ret;
ret = copy_from_kernel_nofault(dst, unsafe_ptr, size);
if (unlikely(ret < 0))
memset(dst, 0, size);
return ret;
}
BPF_CALL_3(bpf_probe_read_kernel, void *, dst, u32, size,
const void *, unsafe_ptr)
{

View File

@ -555,12 +555,15 @@ static int perf_call_bpf_enter(struct trace_event_call *call, struct pt_regs *re
struct syscall_trace_enter *rec)
{
struct syscall_tp_t {
unsigned long long regs;
struct trace_entry ent;
unsigned long syscall_nr;
unsigned long args[SYSCALL_DEFINE_MAXARGS];
} param;
} __aligned(8) param;
int i;
BUILD_BUG_ON(sizeof(param.ent) < sizeof(void *));
/* bpf prog requires 'regs' to be the first member in the ctx (a.k.a. &param) */
*(struct pt_regs **)&param = regs;
param.syscall_nr = rec->nr;
for (i = 0; i < sys_data->nb_args; i++)
@ -657,11 +660,12 @@ static int perf_call_bpf_exit(struct trace_event_call *call, struct pt_regs *reg
struct syscall_trace_exit *rec)
{
struct syscall_tp_t {
unsigned long long regs;
struct trace_entry ent;
unsigned long syscall_nr;
unsigned long ret;
} param;
} __aligned(8) param;
/* bpf prog requires 'regs' to be the first member in the ctx (a.k.a. &param) */
*(struct pt_regs **)&param = regs;
param.syscall_nr = rec->nr;
param.ret = rec->ret;

View File

@ -20,6 +20,7 @@
#include <linux/smp.h>
#include <linux/sock_diag.h>
#include <linux/netfilter.h>
#include <net/netdev_rx_queue.h>
#include <net/xdp.h>
#include <net/netfilter/nf_bpf_link.h>

View File

@ -133,6 +133,7 @@
#include <trace/events/net.h>
#include <trace/events/skb.h>
#include <trace/events/qdisc.h>
#include <trace/events/xdp.h>
#include <linux/inetdevice.h>
#include <linux/cpu_rmap.h>
#include <linux/static_key.h>
@ -151,6 +152,7 @@
#include <linux/pm_runtime.h>
#include <linux/prandom.h>
#include <linux/once_lite.h>
#include <net/netdev_rx_queue.h>
#include "dev.h"
#include "net-sysfs.h"
@ -9475,6 +9477,7 @@ int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
{
struct net *net = current->nsproxy->net_ns;
struct bpf_link_primer link_primer;
struct netlink_ext_ack extack = {};
struct bpf_xdp_link *link;
struct net_device *dev;
int err, fd;
@ -9502,12 +9505,13 @@ int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
goto unlock;
}
err = dev_xdp_attach_link(dev, NULL, link);
err = dev_xdp_attach_link(dev, &extack, link);
rtnl_unlock();
if (err) {
link->dev = NULL;
bpf_link_cleanup(&link_primer);
trace_bpf_xdp_link_attach_failed(extack._msg);
goto out_put_dev;
}
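Illustrative aside (not part of the patch): a minimal BPF consumer of the new tracepoint could look like the sketch below. The single-string argument layout is inferred from the trace_bpf_xdp_link_attach_failed() call above; vmlinux.h and the usual skeleton plumbing are assumed.

/* Illustrative tp_btf consumer; not from this series. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("tp_btf/bpf_xdp_link_attach_failed")
int BPF_PROG(log_xdp_attach_failure, const char *msg)
{
	char buf[64] = {};

	/* msg is a kernel string (the extack message), copy it out safely */
	bpf_probe_read_kernel_str(buf, sizeof(buf), msg);
	bpf_printk("xdp link attach failed: %s", buf);
	return 0;
}

char _license[] SEC("license") = "GPL";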

View File

@ -7351,8 +7351,8 @@ BPF_CALL_3(bpf_sk_assign, struct sk_buff *, skb, struct sock *, sk, u64, flags)
return -EOPNOTSUPP;
if (unlikely(dev_net(skb->dev) != sock_net(sk)))
return -ENETUNREACH;
if (unlikely(sk_fullsock(sk) && sk->sk_reuseport))
return -ESOCKTNOSUPPORT;
if (sk_unhashed(sk))
return -EOPNOTSUPP;
if (sk_is_refcounted(sk) &&
unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
return -ENOENT;
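Illustrative aside (not part of the patch): with the -ESOCKTNOSUPPORT check dropped, a TC ingress program may now assign packets to a listener that belongs to a SO_REUSEPORT group. A hedged sketch of that flow follows; the hard-coded address, port, and section name are assumptions for illustration only.

/* Illustrative TC program using bpf_sk_assign(); not from this series. */
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("tc")
int steer_to_listener(struct __sk_buff *skb)
{
	struct bpf_sock_tuple tuple = {
		.ipv4.daddr = bpf_htonl(0x7f000001),	/* assumed listener: 127.0.0.1 */
		.ipv4.dport = bpf_htons(8080),		/* assumed listener port        */
	};
	struct bpf_sock *sk;

	sk = bpf_sk_lookup_tcp(skb, &tuple, sizeof(tuple.ipv4),
			       BPF_F_CURRENT_NETNS, 0);
	if (!sk)
		return TC_ACT_OK;

	/* With this series the lookup result may be a reuseport listener. */
	bpf_sk_assign(skb, sk, 0);
	bpf_sk_release(sk);
	return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";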

View File

@ -23,6 +23,7 @@
#include <linux/of.h>
#include <linux/of_net.h>
#include <linux/cpu.h>
#include <net/netdev_rx_queue.h>
#include "dev.h"
#include "net-sysfs.h"

View File

@ -28,9 +28,9 @@
#include <net/tcp.h>
#include <net/sock_reuseport.h>
static u32 inet_ehashfn(const struct net *net, const __be32 laddr,
const __u16 lport, const __be32 faddr,
const __be16 fport)
u32 inet_ehashfn(const struct net *net, const __be32 laddr,
const __u16 lport, const __be32 faddr,
const __be16 fport)
{
static u32 inet_ehash_secret __read_mostly;
@ -39,6 +39,7 @@ static u32 inet_ehashfn(const struct net *net, const __be32 laddr,
return __inet_ehashfn(laddr, lport, faddr, fport,
inet_ehash_secret + net_hash_mix(net));
}
EXPORT_SYMBOL_GPL(inet_ehashfn);
/* This function handles inet_sock, but also timewait and request sockets
* for IPv4/IPv6.
@ -332,20 +333,38 @@ static inline int compute_score(struct sock *sk, struct net *net,
return score;
}
static inline struct sock *lookup_reuseport(struct net *net, struct sock *sk,
struct sk_buff *skb, int doff,
__be32 saddr, __be16 sport,
__be32 daddr, unsigned short hnum)
/**
* inet_lookup_reuseport() - execute reuseport logic on AF_INET socket if necessary.
* @net: network namespace.
* @sk: AF_INET socket, must be in TCP_LISTEN state for TCP or TCP_CLOSE for UDP.
* @skb: context for a potential SK_REUSEPORT program.
* @doff: header offset.
* @saddr: source address.
* @sport: source port.
* @daddr: destination address.
* @hnum: destination port in host byte order.
* @ehashfn: hash function used to generate the fallback hash.
*
* Return: NULL if sk doesn't have SO_REUSEPORT set, otherwise a pointer to
* the selected sock or an error.
*/
struct sock *inet_lookup_reuseport(struct net *net, struct sock *sk,
struct sk_buff *skb, int doff,
__be32 saddr, __be16 sport,
__be32 daddr, unsigned short hnum,
inet_ehashfn_t *ehashfn)
{
struct sock *reuse_sk = NULL;
u32 phash;
if (sk->sk_reuseport) {
phash = inet_ehashfn(net, daddr, hnum, saddr, sport);
phash = INDIRECT_CALL_2(ehashfn, udp_ehashfn, inet_ehashfn,
net, daddr, hnum, saddr, sport);
reuse_sk = reuseport_select_sock(sk, phash, skb, doff);
}
return reuse_sk;
}
EXPORT_SYMBOL_GPL(inet_lookup_reuseport);
/*
* Here are some nice properties to exploit here. The BSD API
@ -369,8 +388,8 @@ static struct sock *inet_lhash2_lookup(struct net *net,
sk_nulls_for_each_rcu(sk, node, &ilb2->nulls_head) {
score = compute_score(sk, net, hnum, daddr, dif, sdif);
if (score > hiscore) {
result = lookup_reuseport(net, sk, skb, doff,
saddr, sport, daddr, hnum);
result = inet_lookup_reuseport(net, sk, skb, doff,
saddr, sport, daddr, hnum, inet_ehashfn);
if (result)
return result;
@ -382,24 +401,23 @@ static struct sock *inet_lhash2_lookup(struct net *net,
return result;
}
static inline struct sock *inet_lookup_run_bpf(struct net *net,
struct inet_hashinfo *hashinfo,
struct sk_buff *skb, int doff,
__be32 saddr, __be16 sport,
__be32 daddr, u16 hnum, const int dif)
struct sock *inet_lookup_run_sk_lookup(struct net *net,
int protocol,
struct sk_buff *skb, int doff,
__be32 saddr, __be16 sport,
__be32 daddr, u16 hnum, const int dif,
inet_ehashfn_t *ehashfn)
{
struct sock *sk, *reuse_sk;
bool no_reuseport;
if (hashinfo != net->ipv4.tcp_death_row.hashinfo)
return NULL; /* only TCP is supported */
no_reuseport = bpf_sk_lookup_run_v4(net, IPPROTO_TCP, saddr, sport,
no_reuseport = bpf_sk_lookup_run_v4(net, protocol, saddr, sport,
daddr, hnum, dif, &sk);
if (no_reuseport || IS_ERR_OR_NULL(sk))
return sk;
reuse_sk = lookup_reuseport(net, sk, skb, doff, saddr, sport, daddr, hnum);
reuse_sk = inet_lookup_reuseport(net, sk, skb, doff, saddr, sport, daddr, hnum,
ehashfn);
if (reuse_sk)
sk = reuse_sk;
return sk;
@ -417,9 +435,11 @@ struct sock *__inet_lookup_listener(struct net *net,
unsigned int hash2;
/* Lookup redirect from BPF */
if (static_branch_unlikely(&bpf_sk_lookup_enabled)) {
result = inet_lookup_run_bpf(net, hashinfo, skb, doff,
saddr, sport, daddr, hnum, dif);
if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
hashinfo == net->ipv4.tcp_death_row.hashinfo) {
result = inet_lookup_run_sk_lookup(net, IPPROTO_TCP, skb, doff,
saddr, sport, daddr, hnum, dif,
inet_ehashfn);
if (result)
goto done;
}

View File

@ -7,6 +7,7 @@
#include <linux/ip.h>
#include <linux/netfilter.h>
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/skbuff.h>
#include <net/netns/generic.h>
#include <net/route.h>
@ -113,17 +114,31 @@ static void __net_exit defrag4_net_exit(struct net *net)
}
}
static const struct nf_defrag_hook defrag_hook = {
.owner = THIS_MODULE,
.enable = nf_defrag_ipv4_enable,
.disable = nf_defrag_ipv4_disable,
};
static struct pernet_operations defrag4_net_ops = {
.exit = defrag4_net_exit,
};
static int __init nf_defrag_init(void)
{
return register_pernet_subsys(&defrag4_net_ops);
int err;
err = register_pernet_subsys(&defrag4_net_ops);
if (err)
return err;
rcu_assign_pointer(nf_defrag_v4_hook, &defrag_hook);
return err;
}
static void __exit nf_defrag_fini(void)
{
rcu_assign_pointer(nf_defrag_v4_hook, NULL);
unregister_pernet_subsys(&defrag4_net_ops);
}

View File

@ -407,9 +407,9 @@ static int compute_score(struct sock *sk, struct net *net,
return score;
}
static u32 udp_ehashfn(const struct net *net, const __be32 laddr,
const __u16 lport, const __be32 faddr,
const __be16 fport)
INDIRECT_CALLABLE_SCOPE
u32 udp_ehashfn(const struct net *net, const __be32 laddr, const __u16 lport,
const __be32 faddr, const __be16 fport)
{
static u32 udp_ehash_secret __read_mostly;
@ -419,22 +419,6 @@ static u32 udp_ehashfn(const struct net *net, const __be32 laddr,
udp_ehash_secret + net_hash_mix(net));
}
static struct sock *lookup_reuseport(struct net *net, struct sock *sk,
struct sk_buff *skb,
__be32 saddr, __be16 sport,
__be32 daddr, unsigned short hnum)
{
struct sock *reuse_sk = NULL;
u32 hash;
if (sk->sk_reuseport && sk->sk_state != TCP_ESTABLISHED) {
hash = udp_ehashfn(net, daddr, hnum, saddr, sport);
reuse_sk = reuseport_select_sock(sk, hash, skb,
sizeof(struct udphdr));
}
return reuse_sk;
}
/* called with rcu_read_lock() */
static struct sock *udp4_lib_lookup2(struct net *net,
__be32 saddr, __be16 sport,
@ -452,42 +436,36 @@ static struct sock *udp4_lib_lookup2(struct net *net,
score = compute_score(sk, net, saddr, sport,
daddr, hnum, dif, sdif);
if (score > badness) {
result = lookup_reuseport(net, sk, skb,
saddr, sport, daddr, hnum);
badness = score;
if (sk->sk_state == TCP_ESTABLISHED) {
result = sk;
continue;
}
result = inet_lookup_reuseport(net, sk, skb, sizeof(struct udphdr),
saddr, sport, daddr, hnum, udp_ehashfn);
if (!result) {
result = sk;
continue;
}
/* Fall back to scoring if group has connections */
if (result && !reuseport_has_conns(sk))
if (!reuseport_has_conns(sk))
return result;
result = result ? : sk;
badness = score;
/* Reuseport logic returned an error, keep original score. */
if (IS_ERR(result))
continue;
badness = compute_score(result, net, saddr, sport,
daddr, hnum, dif, sdif);
}
}
return result;
}
static struct sock *udp4_lookup_run_bpf(struct net *net,
struct udp_table *udptable,
struct sk_buff *skb,
__be32 saddr, __be16 sport,
__be32 daddr, u16 hnum, const int dif)
{
struct sock *sk, *reuse_sk;
bool no_reuseport;
if (udptable != net->ipv4.udp_table)
return NULL; /* only UDP is supported */
no_reuseport = bpf_sk_lookup_run_v4(net, IPPROTO_UDP, saddr, sport,
daddr, hnum, dif, &sk);
if (no_reuseport || IS_ERR_OR_NULL(sk))
return sk;
reuse_sk = lookup_reuseport(net, sk, skb, saddr, sport, daddr, hnum);
if (reuse_sk)
sk = reuse_sk;
return sk;
}
/* UDP is nearly always wildcards out the wazoo, it makes no sense to try
* harder than this. -DaveM
*/
@ -512,9 +490,11 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr,
goto done;
/* Lookup redirect from BPF */
if (static_branch_unlikely(&bpf_sk_lookup_enabled)) {
sk = udp4_lookup_run_bpf(net, udptable, skb,
saddr, sport, daddr, hnum, dif);
if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
udptable == net->ipv4.udp_table) {
sk = inet_lookup_run_sk_lookup(net, IPPROTO_UDP, skb, sizeof(struct udphdr),
saddr, sport, daddr, hnum, dif,
udp_ehashfn);
if (sk) {
result = sk;
goto done;
@ -2412,7 +2392,11 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
if (udp4_csum_init(skb, uh, proto))
goto csum_error;
sk = skb_steal_sock(skb, &refcounted);
sk = inet_steal_sock(net, skb, sizeof(struct udphdr), saddr, uh->source, daddr, uh->dest,
&refcounted, udp_ehashfn);
if (IS_ERR(sk))
goto no_sk;
if (sk) {
struct dst_entry *dst = skb_dst(skb);
int ret;
@ -2433,7 +2417,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
sk = __udp4_lib_lookup_skb(skb, uh->source, uh->dest, udptable);
if (sk)
return udp_unicast_rcv_skb(sk, skb, uh);
no_sk:
if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
goto drop;
nf_reset_ct(skb);

View File

@ -39,6 +39,7 @@ u32 inet6_ehashfn(const struct net *net,
return __inet6_ehashfn(lhash, lport, fhash, fport,
inet6_ehash_secret + net_hash_mix(net));
}
EXPORT_SYMBOL_GPL(inet6_ehashfn);
/*
* Sockets in TCP_CLOSE state are _always_ taken out of the hash, so
@ -111,22 +112,40 @@ static inline int compute_score(struct sock *sk, struct net *net,
return score;
}
static inline struct sock *lookup_reuseport(struct net *net, struct sock *sk,
struct sk_buff *skb, int doff,
const struct in6_addr *saddr,
__be16 sport,
const struct in6_addr *daddr,
unsigned short hnum)
/**
* inet6_lookup_reuseport() - execute reuseport logic on AF_INET6 socket if necessary.
* @net: network namespace.
* @sk: AF_INET6 socket, must be in TCP_LISTEN state for TCP or TCP_CLOSE for UDP.
* @skb: context for a potential SK_REUSEPORT program.
* @doff: header offset.
* @saddr: source address.
* @sport: source port.
* @daddr: destination address.
* @hnum: destination port in host byte order.
* @ehashfn: hash function used to generate the fallback hash.
*
* Return: NULL if sk doesn't have SO_REUSEPORT set, otherwise a pointer to
* the selected sock or an error.
*/
struct sock *inet6_lookup_reuseport(struct net *net, struct sock *sk,
struct sk_buff *skb, int doff,
const struct in6_addr *saddr,
__be16 sport,
const struct in6_addr *daddr,
unsigned short hnum,
inet6_ehashfn_t *ehashfn)
{
struct sock *reuse_sk = NULL;
u32 phash;
if (sk->sk_reuseport) {
phash = inet6_ehashfn(net, daddr, hnum, saddr, sport);
phash = INDIRECT_CALL_INET(ehashfn, udp6_ehashfn, inet6_ehashfn,
net, daddr, hnum, saddr, sport);
reuse_sk = reuseport_select_sock(sk, phash, skb, doff);
}
return reuse_sk;
}
EXPORT_SYMBOL_GPL(inet6_lookup_reuseport);
/* called with rcu_read_lock() */
static struct sock *inet6_lhash2_lookup(struct net *net,
@ -143,8 +162,8 @@ static struct sock *inet6_lhash2_lookup(struct net *net,
sk_nulls_for_each_rcu(sk, node, &ilb2->nulls_head) {
score = compute_score(sk, net, hnum, daddr, dif, sdif);
if (score > hiscore) {
result = lookup_reuseport(net, sk, skb, doff,
saddr, sport, daddr, hnum);
result = inet6_lookup_reuseport(net, sk, skb, doff,
saddr, sport, daddr, hnum, inet6_ehashfn);
if (result)
return result;
@ -156,30 +175,30 @@ static struct sock *inet6_lhash2_lookup(struct net *net,
return result;
}
static inline struct sock *inet6_lookup_run_bpf(struct net *net,
struct inet_hashinfo *hashinfo,
struct sk_buff *skb, int doff,
const struct in6_addr *saddr,
const __be16 sport,
const struct in6_addr *daddr,
const u16 hnum, const int dif)
struct sock *inet6_lookup_run_sk_lookup(struct net *net,
int protocol,
struct sk_buff *skb, int doff,
const struct in6_addr *saddr,
const __be16 sport,
const struct in6_addr *daddr,
const u16 hnum, const int dif,
inet6_ehashfn_t *ehashfn)
{
struct sock *sk, *reuse_sk;
bool no_reuseport;
if (hashinfo != net->ipv4.tcp_death_row.hashinfo)
return NULL; /* only TCP is supported */
no_reuseport = bpf_sk_lookup_run_v6(net, IPPROTO_TCP, saddr, sport,
no_reuseport = bpf_sk_lookup_run_v6(net, protocol, saddr, sport,
daddr, hnum, dif, &sk);
if (no_reuseport || IS_ERR_OR_NULL(sk))
return sk;
reuse_sk = lookup_reuseport(net, sk, skb, doff, saddr, sport, daddr, hnum);
reuse_sk = inet6_lookup_reuseport(net, sk, skb, doff,
saddr, sport, daddr, hnum, ehashfn);
if (reuse_sk)
sk = reuse_sk;
return sk;
}
EXPORT_SYMBOL_GPL(inet6_lookup_run_sk_lookup);
struct sock *inet6_lookup_listener(struct net *net,
struct inet_hashinfo *hashinfo,
@ -193,9 +212,11 @@ struct sock *inet6_lookup_listener(struct net *net,
unsigned int hash2;
/* Lookup redirect from BPF */
if (static_branch_unlikely(&bpf_sk_lookup_enabled)) {
result = inet6_lookup_run_bpf(net, hashinfo, skb, doff,
saddr, sport, daddr, hnum, dif);
if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
hashinfo == net->ipv4.tcp_death_row.hashinfo) {
result = inet6_lookup_run_sk_lookup(net, IPPROTO_TCP, skb, doff,
saddr, sport, daddr, hnum, dif,
inet6_ehashfn);
if (result)
goto done;
}

View File

@ -10,6 +10,7 @@
#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/icmp.h>
#include <linux/rcupdate.h>
#include <linux/sysctl.h>
#include <net/ipv6_frag.h>
@ -96,6 +97,12 @@ static void __net_exit defrag6_net_exit(struct net *net)
}
}
static const struct nf_defrag_hook defrag_hook = {
.owner = THIS_MODULE,
.enable = nf_defrag_ipv6_enable,
.disable = nf_defrag_ipv6_disable,
};
static struct pernet_operations defrag6_net_ops = {
.exit = defrag6_net_exit,
};
@ -114,6 +121,9 @@ static int __init nf_defrag_init(void)
pr_err("nf_defrag_ipv6: can't register pernet ops\n");
goto cleanup_frag6;
}
rcu_assign_pointer(nf_defrag_v6_hook, &defrag_hook);
return ret;
cleanup_frag6:
@ -124,6 +134,7 @@ static int __init nf_defrag_init(void)
static void __exit nf_defrag_fini(void)
{
rcu_assign_pointer(nf_defrag_v6_hook, NULL);
unregister_pernet_subsys(&defrag6_net_ops);
nf_ct_frag6_cleanup();
}

View File

@ -72,11 +72,12 @@ int udpv6_init_sock(struct sock *sk)
return 0;
}
static u32 udp6_ehashfn(const struct net *net,
const struct in6_addr *laddr,
const u16 lport,
const struct in6_addr *faddr,
const __be16 fport)
INDIRECT_CALLABLE_SCOPE
u32 udp6_ehashfn(const struct net *net,
const struct in6_addr *laddr,
const u16 lport,
const struct in6_addr *faddr,
const __be16 fport)
{
static u32 udp6_ehash_secret __read_mostly;
static u32 udp_ipv6_hash_secret __read_mostly;
@ -161,24 +162,6 @@ static int compute_score(struct sock *sk, struct net *net,
return score;
}
static struct sock *lookup_reuseport(struct net *net, struct sock *sk,
struct sk_buff *skb,
const struct in6_addr *saddr,
__be16 sport,
const struct in6_addr *daddr,
unsigned int hnum)
{
struct sock *reuse_sk = NULL;
u32 hash;
if (sk->sk_reuseport && sk->sk_state != TCP_ESTABLISHED) {
hash = udp6_ehashfn(net, daddr, hnum, saddr, sport);
reuse_sk = reuseport_select_sock(sk, hash, skb,
sizeof(struct udphdr));
}
return reuse_sk;
}
/* called with rcu_read_lock() */
static struct sock *udp6_lib_lookup2(struct net *net,
const struct in6_addr *saddr, __be16 sport,
@ -195,44 +178,35 @@ static struct sock *udp6_lib_lookup2(struct net *net,
score = compute_score(sk, net, saddr, sport,
daddr, hnum, dif, sdif);
if (score > badness) {
result = lookup_reuseport(net, sk, skb,
saddr, sport, daddr, hnum);
badness = score;
if (sk->sk_state == TCP_ESTABLISHED) {
result = sk;
continue;
}
result = inet6_lookup_reuseport(net, sk, skb, sizeof(struct udphdr),
saddr, sport, daddr, hnum, udp6_ehashfn);
if (!result) {
result = sk;
continue;
}
/* Fall back to scoring if group has connections */
if (result && !reuseport_has_conns(sk))
if (!reuseport_has_conns(sk))
return result;
result = result ? : sk;
badness = score;
/* Reuseport logic returned an error, keep original score. */
if (IS_ERR(result))
continue;
badness = compute_score(sk, net, saddr, sport,
daddr, hnum, dif, sdif);
}
}
return result;
}
static inline struct sock *udp6_lookup_run_bpf(struct net *net,
struct udp_table *udptable,
struct sk_buff *skb,
const struct in6_addr *saddr,
__be16 sport,
const struct in6_addr *daddr,
u16 hnum, const int dif)
{
struct sock *sk, *reuse_sk;
bool no_reuseport;
if (udptable != net->ipv4.udp_table)
return NULL; /* only UDP is supported */
no_reuseport = bpf_sk_lookup_run_v6(net, IPPROTO_UDP, saddr, sport,
daddr, hnum, dif, &sk);
if (no_reuseport || IS_ERR_OR_NULL(sk))
return sk;
reuse_sk = lookup_reuseport(net, sk, skb, saddr, sport, daddr, hnum);
if (reuse_sk)
sk = reuse_sk;
return sk;
}
/* rcu_read_lock() must be held */
struct sock *__udp6_lib_lookup(struct net *net,
const struct in6_addr *saddr, __be16 sport,
@ -257,9 +231,11 @@ struct sock *__udp6_lib_lookup(struct net *net,
goto done;
/* Lookup redirect from BPF */
if (static_branch_unlikely(&bpf_sk_lookup_enabled)) {
sk = udp6_lookup_run_bpf(net, udptable, skb,
saddr, sport, daddr, hnum, dif);
if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
udptable == net->ipv4.udp_table) {
sk = inet6_lookup_run_sk_lookup(net, IPPROTO_UDP, skb, sizeof(struct udphdr),
saddr, sport, daddr, hnum, dif,
udp6_ehashfn);
if (sk) {
result = sk;
goto done;
@ -992,7 +968,11 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
goto csum_error;
/* Check if the socket is already available, e.g. due to early demux */
sk = skb_steal_sock(skb, &refcounted);
sk = inet6_steal_sock(net, skb, sizeof(struct udphdr), saddr, uh->source, daddr, uh->dest,
&refcounted, udp6_ehashfn);
if (IS_ERR(sk))
goto no_sk;
if (sk) {
struct dst_entry *dst = skb_dst(skb);
int ret;
@ -1026,7 +1006,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
goto report_csum_error;
return udp6_unicast_rcv_skb(sk, skb, uh);
}
no_sk:
reason = SKB_DROP_REASON_NO_SOCKET;
if (!uh->check)

View File

@ -680,6 +680,12 @@ EXPORT_SYMBOL_GPL(nfnl_ct_hook);
const struct nf_ct_hook __rcu *nf_ct_hook __read_mostly;
EXPORT_SYMBOL_GPL(nf_ct_hook);
const struct nf_defrag_hook __rcu *nf_defrag_v4_hook __read_mostly;
EXPORT_SYMBOL_GPL(nf_defrag_v4_hook);
const struct nf_defrag_hook __rcu *nf_defrag_v6_hook __read_mostly;
EXPORT_SYMBOL_GPL(nf_defrag_v6_hook);
#if IS_ENABLED(CONFIG_NF_CONNTRACK)
u8 nf_ctnetlink_has_listener;
EXPORT_SYMBOL_GPL(nf_ctnetlink_has_listener);

View File

@ -1,6 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <linux/filter.h>
#include <linux/kmod.h>
#include <linux/module.h>
#include <linux/netfilter.h>
#include <net/netfilter/nf_bpf_link.h>
@ -23,8 +25,90 @@ struct bpf_nf_link {
struct nf_hook_ops hook_ops;
struct net *net;
u32 dead;
const struct nf_defrag_hook *defrag_hook;
};
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4) || IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
static const struct nf_defrag_hook *
get_proto_defrag_hook(struct bpf_nf_link *link,
const struct nf_defrag_hook __rcu *global_hook,
const char *mod)
{
const struct nf_defrag_hook *hook;
int err;
/* RCU protects us from races against module unloading */
rcu_read_lock();
hook = rcu_dereference(global_hook);
if (!hook) {
rcu_read_unlock();
err = request_module(mod);
if (err)
return ERR_PTR(err < 0 ? err : -EINVAL);
rcu_read_lock();
hook = rcu_dereference(global_hook);
}
if (hook && try_module_get(hook->owner)) {
/* Once we have a refcnt on the module, we no longer need RCU */
hook = rcu_pointer_handoff(hook);
} else {
WARN_ONCE(!hook, "%s has bad registration", mod);
hook = ERR_PTR(-ENOENT);
}
rcu_read_unlock();
if (!IS_ERR(hook)) {
err = hook->enable(link->net);
if (err) {
module_put(hook->owner);
hook = ERR_PTR(err);
}
}
return hook;
}
#endif
static int bpf_nf_enable_defrag(struct bpf_nf_link *link)
{
const struct nf_defrag_hook __maybe_unused *hook;
switch (link->hook_ops.pf) {
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV4)
case NFPROTO_IPV4:
hook = get_proto_defrag_hook(link, nf_defrag_v4_hook, "nf_defrag_ipv4");
if (IS_ERR(hook))
return PTR_ERR(hook);
link->defrag_hook = hook;
return 0;
#endif
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
case NFPROTO_IPV6:
hook = get_proto_defrag_hook(link, nf_defrag_v6_hook, "nf_defrag_ipv6");
if (IS_ERR(hook))
return PTR_ERR(hook);
link->defrag_hook = hook;
return 0;
#endif
default:
return -EAFNOSUPPORT;
}
}
static void bpf_nf_disable_defrag(struct bpf_nf_link *link)
{
const struct nf_defrag_hook *hook = link->defrag_hook;
if (!hook)
return;
hook->disable(link->net);
module_put(hook->owner);
}
static void bpf_nf_link_release(struct bpf_link *link)
{
struct bpf_nf_link *nf_link = container_of(link, struct bpf_nf_link, link);
@ -32,11 +116,11 @@ static void bpf_nf_link_release(struct bpf_link *link)
if (nf_link->dead)
return;
/* prevent hook-not-found warning splat from netfilter core when
* .detach was already called
*/
if (!cmpxchg(&nf_link->dead, 0, 1))
/* do not double release in case .detach was already called */
if (!cmpxchg(&nf_link->dead, 0, 1)) {
nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops);
bpf_nf_disable_defrag(nf_link);
}
}
static void bpf_nf_link_dealloc(struct bpf_link *link)
@ -92,6 +176,8 @@ static const struct bpf_link_ops bpf_nf_link_lops = {
static int bpf_nf_check_pf_and_hooks(const union bpf_attr *attr)
{
int prio;
switch (attr->link_create.netfilter.pf) {
case NFPROTO_IPV4:
case NFPROTO_IPV6:
@ -102,19 +188,18 @@ static int bpf_nf_check_pf_and_hooks(const union bpf_attr *attr)
return -EAFNOSUPPORT;
}
if (attr->link_create.netfilter.flags)
if (attr->link_create.netfilter.flags & ~BPF_F_NETFILTER_IP_DEFRAG)
return -EOPNOTSUPP;
/* make sure conntrack confirm is always last.
*
* In the future, if userspace can e.g. request defrag, then
* "defrag_requested && prio before NF_IP_PRI_CONNTRACK_DEFRAG"
* should fail.
*/
switch (attr->link_create.netfilter.priority) {
case NF_IP_PRI_FIRST: return -ERANGE; /* sabotage_in and other warts */
case NF_IP_PRI_LAST: return -ERANGE; /* e.g. conntrack confirm */
}
/* make sure conntrack confirm is always last */
prio = attr->link_create.netfilter.priority;
if (prio == NF_IP_PRI_FIRST)
return -ERANGE; /* sabotage_in and other warts */
else if (prio == NF_IP_PRI_LAST)
return -ERANGE; /* e.g. conntrack confirm */
else if ((attr->link_create.netfilter.flags & BPF_F_NETFILTER_IP_DEFRAG) &&
prio <= NF_IP_PRI_CONNTRACK_DEFRAG)
return -ERANGE; /* cannot use defrag if prog runs before nf_defrag */
return 0;
}
@ -149,6 +234,7 @@ int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
link->net = net;
link->dead = false;
link->defrag_hook = NULL;
err = bpf_link_prime(&link->link, &link_primer);
if (err) {
@ -156,8 +242,17 @@ int bpf_nf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
return err;
}
if (attr->link_create.netfilter.flags & BPF_F_NETFILTER_IP_DEFRAG) {
err = bpf_nf_enable_defrag(link);
if (err) {
bpf_link_cleanup(&link_primer);
return err;
}
}
err = nf_register_net_hook(net, &link->hook_ops);
if (err) {
bpf_nf_disable_defrag(link);
bpf_link_cleanup(&link_primer);
return err;
}
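Illustrative aside (not part of the patch): userspace opts into defragmentation by passing the new flag at link creation time. A hedged sketch, assuming a libbpf that already exposes bpf_program__attach_netfilter() and struct bpf_netfilter_opts, and an already loaded SEC("netfilter") program:

/* Illustrative userspace attach with the new flag; not from this series. */
#include <errno.h>
#include <stdio.h>
#include <bpf/libbpf.h>
#include <linux/bpf.h>
#include <linux/netfilter.h>

static struct bpf_link *attach_with_defrag(struct bpf_program *prog)
{
	LIBBPF_OPTS(bpf_netfilter_opts, opts,
		.pf = NFPROTO_IPV4,
		.hooknum = NF_INET_PRE_ROUTING,
		/* must stay above NF_IP_PRI_CONNTRACK_DEFRAG, see the check above */
		.priority = 10,
		.flags = BPF_F_NETFILTER_IP_DEFRAG);
	struct bpf_link *link;

	link = bpf_program__attach_netfilter(prog, &opts);
	if (!link)
		fprintf(stderr, "netfilter attach failed: %d\n", -errno);
	return link;
}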

View File

@ -14,6 +14,7 @@
#include <linux/types.h>
#include <linux/btf_ids.h>
#include <linux/net_namespace.h>
#include <net/xdp.h>
#include <net/netfilter/nf_conntrack_bpf.h>
#include <net/netfilter/nf_conntrack_core.h>

View File

@ -25,6 +25,7 @@
#include <linux/vmalloc.h>
#include <net/xdp_sock_drv.h>
#include <net/busy_poll.h>
#include <net/netdev_rx_queue.h>
#include <net/xdp.h>
#include "xsk_queue.h"

View File

@ -19,6 +19,7 @@
/* ld/ldx fields */
#define BPF_DW 0x18 /* double word (64-bit) */
#define BPF_MEMSX 0x80 /* load with sign extension */
#define BPF_ATOMIC 0xc0 /* atomic memory ops - op type in immediate */
#define BPF_XADD 0xc0 /* exclusive add - legacy name */
@ -1187,6 +1188,11 @@ enum bpf_perf_event_type {
*/
#define BPF_F_KPROBE_MULTI_RETURN (1U << 0)
/* link_create.netfilter.flags used in LINK_CREATE command for
* BPF_PROG_TYPE_NETFILTER to enable IP packet defragmentation.
*/
#define BPF_F_NETFILTER_IP_DEFRAG (1U << 0)
/* When BPF ldimm64's insn[0].src_reg != 0 then this can have
* the following extensions:
*
@ -4198,9 +4204,6 @@ union bpf_attr {
* **-EOPNOTSUPP** if the operation is not supported, for example
* a call from outside of TC ingress.
*
* **-ESOCKTNOSUPPORT** if the socket type is not supported
* (reuseport).
*
* long bpf_sk_assign(struct bpf_sk_lookup *ctx, struct bpf_sock *sk, u64 flags)
* Description
* Helper is overloaded depending on BPF program type. This

View File

@ -293,11 +293,11 @@ help:
@echo ' HINT: use "V=1" to enable verbose build'
@echo ' all - build libraries and pkgconfig'
@echo ' clean - remove all generated files'
@echo ' check - check abi and version info'
@echo ' check - check ABI and version info'
@echo ''
@echo 'libbpf install targets:'
@echo ' HINT: use "prefix"(defaults to "/usr/local") or "DESTDIR" (defaults to "/")'
@echo ' to adjust target desitantion, e.g. "make prefix=/usr/local install"'
@echo ' to adjust target destination, e.g. "make prefix=/usr/local install"'
@echo ' install - build and install all headers, libraries and pkgconfig'
@echo ' install_headers - install only headers to include/bpf'
@echo ''

View File

@ -13,6 +13,7 @@ test_dev_cgroup
/test_progs
/test_progs-no_alu32
/test_progs-bpf_gcc
/test_progs-cpuv4
test_verifier_log
feature
test_sock
@ -36,6 +37,7 @@ test_cpp
*.lskel.h
/no_alu32
/bpf_gcc
/cpuv4
/host-tools
/tools
/runqslower

View File

@ -33,11 +33,16 @@ CFLAGS += -g -O0 -rdynamic -Wall -Werror $(GENFLAGS) $(SAN_CFLAGS) \
LDFLAGS += $(SAN_LDFLAGS)
LDLIBS += -lelf -lz -lrt -lpthread
# Silence some warnings when compiled with clang
ifneq ($(LLVM),)
# Silence some warnings when compiled with clang
CFLAGS += -Wno-unused-command-line-argument
endif
# Check whether bpf cpu=v4 is supported or not by clang
ifneq ($(shell $(CLANG) --target=bpf -mcpu=help 2>&1 | grep 'v4'),)
CLANG_CPUV4 := 1
endif
# Order correspond to 'make run_tests' order
TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
test_dev_cgroup \
@ -51,6 +56,10 @@ ifneq ($(BPF_GCC),)
TEST_GEN_PROGS += test_progs-bpf_gcc
endif
ifneq ($(CLANG_CPUV4),)
TEST_GEN_PROGS += test_progs-cpuv4
endif
TEST_GEN_FILES = test_lwt_ip_encap.bpf.o test_tc_edt.bpf.o
TEST_FILES = xsk_prereqs.sh $(wildcard progs/btf_dump_test_case_*.c)
@ -383,6 +392,11 @@ define CLANG_NOALU32_BPF_BUILD_RULE
$(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2)
$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v2 -o $2
endef
# Similar to CLANG_BPF_BUILD_RULE, but with cpu-v4
define CLANG_CPUV4_BPF_BUILD_RULE
$(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2)
$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v4 -o $2
endef
# Build BPF object using GCC
define GCC_BPF_BUILD_RULE
$(call msg,GCC-BPF,$(TRUNNER_BINARY),$2)
@ -425,7 +439,7 @@ LINKED_BPF_SRCS := $(patsubst %.bpf.o,%.c,$(foreach skel,$(LINKED_SKELS),$($(ske
# $eval()) and pass control to DEFINE_TEST_RUNNER_RULES.
# Parameters:
# $1 - test runner base binary name (e.g., test_progs)
# $2 - test runner extra "flavor" (e.g., no_alu32, gcc-bpf, etc)
# $2 - test runner extra "flavor" (e.g., no_alu32, cpuv4, gcc-bpf, etc)
define DEFINE_TEST_RUNNER
TRUNNER_OUTPUT := $(OUTPUT)$(if $2,/)$2
@ -453,7 +467,7 @@ endef
# Using TRUNNER_XXX variables, provided by callers of DEFINE_TEST_RUNNER and
# set up by DEFINE_TEST_RUNNER itself, create test runner build rules with:
# $1 - test runner base binary name (e.g., test_progs)
# $2 - test runner extra "flavor" (e.g., no_alu32, gcc-bpf, etc)
# $2 - test runner extra "flavor" (e.g., no_alu32, cpuv4, gcc-bpf, etc)
define DEFINE_TEST_RUNNER_RULES
ifeq ($($(TRUNNER_OUTPUT)-dir),)
@ -565,8 +579,8 @@ TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \
network_helpers.c testing_helpers.c \
btf_helpers.c flow_dissector_load.h \
cap_helpers.c test_loader.c xsk.c disasm.c \
json_writer.c unpriv_helpers.c
json_writer.c unpriv_helpers.c \
ip_check_defrag_frags.h
TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read $(OUTPUT)/bpf_testmod.ko \
$(OUTPUT)/liburandom_read.so \
$(OUTPUT)/xdp_synproxy \
@ -584,6 +598,13 @@ TRUNNER_BPF_BUILD_RULE := CLANG_NOALU32_BPF_BUILD_RULE
TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS)
$(eval $(call DEFINE_TEST_RUNNER,test_progs,no_alu32))
# Define test_progs-cpuv4 test runner.
ifneq ($(CLANG_CPUV4),)
TRUNNER_BPF_BUILD_RULE := CLANG_CPUV4_BPF_BUILD_RULE
TRUNNER_BPF_CFLAGS := $(BPF_CFLAGS) $(CLANG_CFLAGS)
$(eval $(call DEFINE_TEST_RUNNER,test_progs,cpuv4))
endif
# Define test_progs BPF-GCC-flavored test runner.
ifneq ($(BPF_GCC),)
TRUNNER_BPF_BUILD_RULE := GCC_BPF_BUILD_RULE
@ -681,7 +702,7 @@ EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \
prog_tests/tests.h map_tests/tests.h verifier/tests.h \
feature bpftool \
$(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h *.subskel.h \
no_alu32 bpf_gcc bpf_testmod.ko \
no_alu32 cpuv4 bpf_gcc bpf_testmod.ko \
liburandom_read.so)
.PHONY: docs docs-clean

View File

@ -98,6 +98,12 @@ bpf_testmod_test_struct_arg_8(u64 a, void *b, short c, int d, void *e,
return bpf_testmod_test_struct_arg_result;
}
noinline int
bpf_testmod_test_arg_ptr_to_struct(struct bpf_testmod_struct_arg_1 *a) {
bpf_testmod_test_struct_arg_result = a->a;
return bpf_testmod_test_struct_arg_result;
}
__bpf_kfunc void
bpf_testmod_test_mod_kfunc(int i)
{
@ -240,7 +246,7 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj,
.off = off,
.len = len,
};
struct bpf_testmod_struct_arg_1 struct_arg1 = {10};
struct bpf_testmod_struct_arg_1 struct_arg1 = {10}, struct_arg1_2 = {-1};
struct bpf_testmod_struct_arg_2 struct_arg2 = {2, 3};
struct bpf_testmod_struct_arg_3 *struct_arg3;
struct bpf_testmod_struct_arg_4 struct_arg4 = {21, 22};
@ -259,6 +265,7 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj,
(void)bpf_testmod_test_struct_arg_8(16, (void *)17, 18, 19,
(void *)20, struct_arg4, 23);
(void)bpf_testmod_test_arg_ptr_to_struct(&struct_arg1_2);
struct_arg3 = kmalloc((sizeof(struct bpf_testmod_struct_arg_3) +
sizeof(int)), GFP_KERNEL);

View File

@ -0,0 +1,90 @@
#!/bin/env python3
# SPDX-License-Identifier: GPL-2.0
"""
This script helps generate fragmented UDP packets.
While it is technically possible to dynamically generate
fragmented packets in C, it is much harder to read and write
said code. `scapy` is relatively industry standard and really
easy to read / write.
So we choose to write this script that generates a valid C
header. Rerun script and commit generated file after any
modifications.
"""
import argparse
import os
from scapy.all import *
# These constants must stay in sync with `ip_check_defrag.c`
VETH1_ADDR = "172.16.1.200"
VETH0_ADDR6 = "fc00::100"
VETH1_ADDR6 = "fc00::200"
CLIENT_PORT = 48878
SERVER_PORT = 48879
MAGIC_MESSAGE = "THIS IS THE ORIGINAL MESSAGE, PLEASE REASSEMBLE ME"
def print_header(f):
f.write("// SPDX-License-Identifier: GPL-2.0\n")
f.write("/* DO NOT EDIT -- this file is generated */\n")
f.write("\n")
f.write("#ifndef _IP_CHECK_DEFRAG_FRAGS_H\n")
f.write("#define _IP_CHECK_DEFRAG_FRAGS_H\n")
f.write("\n")
f.write("#include <stdint.h>\n")
f.write("\n")
def print_frags(f, frags, v6):
for idx, frag in enumerate(frags):
# 10 bytes per line to keep width in check
chunks = [frag[i : i + 10] for i in range(0, len(frag), 10)]
chunks_fmted = [", ".join([str(hex(b)) for b in chunk]) for chunk in chunks]
suffix = "6" if v6 else ""
f.write(f"static uint8_t frag{suffix}_{idx}[] = {{\n")
for chunk in chunks_fmted:
f.write(f"\t{chunk},\n")
f.write(f"}};\n")
def print_trailer(f):
f.write("\n")
f.write("#endif /* _IP_CHECK_DEFRAG_FRAGS_H */\n")
def main(f):
# srcip of 0 is filled in by IP_HDRINCL
sip = "0.0.0.0"
sip6 = VETH0_ADDR6
dip = VETH1_ADDR
dip6 = VETH1_ADDR6
sport = CLIENT_PORT
dport = SERVER_PORT
payload = MAGIC_MESSAGE.encode()
# Disable UDPv4 checksums to keep code simpler
pkt = IP(src=sip,dst=dip) / UDP(sport=sport,dport=dport,chksum=0) / Raw(load=payload)
# UDPv6 requires a checksum
# Also pin the ipv6 fragment header ID, otherwise it's a random value
pkt6 = IPv6(src=sip6,dst=dip6) / IPv6ExtHdrFragment(id=0xBEEF) / UDP(sport=sport,dport=dport) / Raw(load=payload)
frags = [f.build() for f in pkt.fragment(24)]
frags6 = [f.build() for f in fragment6(pkt6, 72)]
print_header(f)
print_frags(f, frags, False)
print_frags(f, frags6, True)
print_trailer(f)
if __name__ == "__main__":
dir = os.path.dirname(os.path.realpath(__file__))
header = f"{dir}/ip_check_defrag_frags.h"
with open(header, "w") as f:
main(f)

View File

@ -0,0 +1,57 @@
// SPDX-License-Identifier: GPL-2.0
/* DO NOT EDIT -- this file is generated */
#ifndef _IP_CHECK_DEFRAG_FRAGS_H
#define _IP_CHECK_DEFRAG_FRAGS_H
#include <stdint.h>
static uint8_t frag_0[] = {
0x45, 0x0, 0x0, 0x2c, 0x0, 0x1, 0x20, 0x0, 0x40, 0x11,
0xac, 0xe8, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8,
0xbe, 0xee, 0xbe, 0xef, 0x0, 0x3a, 0x0, 0x0, 0x54, 0x48,
0x49, 0x53, 0x20, 0x49, 0x53, 0x20, 0x54, 0x48, 0x45, 0x20,
0x4f, 0x52, 0x49, 0x47,
};
static uint8_t frag_1[] = {
0x45, 0x0, 0x0, 0x2c, 0x0, 0x1, 0x20, 0x3, 0x40, 0x11,
0xac, 0xe5, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8,
0x49, 0x4e, 0x41, 0x4c, 0x20, 0x4d, 0x45, 0x53, 0x53, 0x41,
0x47, 0x45, 0x2c, 0x20, 0x50, 0x4c, 0x45, 0x41, 0x53, 0x45,
0x20, 0x52, 0x45, 0x41,
};
static uint8_t frag_2[] = {
0x45, 0x0, 0x0, 0x1e, 0x0, 0x1, 0x0, 0x6, 0x40, 0x11,
0xcc, 0xf0, 0x0, 0x0, 0x0, 0x0, 0xac, 0x10, 0x1, 0xc8,
0x53, 0x53, 0x45, 0x4d, 0x42, 0x4c, 0x45, 0x20, 0x4d, 0x45,
};
static uint8_t frag6_0[] = {
0x60, 0x0, 0x0, 0x0, 0x0, 0x20, 0x2c, 0x40, 0xfc, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0,
0x11, 0x0, 0x0, 0x1, 0x0, 0x0, 0xbe, 0xef, 0xbe, 0xee,
0xbe, 0xef, 0x0, 0x3a, 0xd0, 0xf8, 0x54, 0x48, 0x49, 0x53,
0x20, 0x49, 0x53, 0x20, 0x54, 0x48, 0x45, 0x20, 0x4f, 0x52,
0x49, 0x47,
};
static uint8_t frag6_1[] = {
0x60, 0x0, 0x0, 0x0, 0x0, 0x20, 0x2c, 0x40, 0xfc, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0,
0x11, 0x0, 0x0, 0x19, 0x0, 0x0, 0xbe, 0xef, 0x49, 0x4e,
0x41, 0x4c, 0x20, 0x4d, 0x45, 0x53, 0x53, 0x41, 0x47, 0x45,
0x2c, 0x20, 0x50, 0x4c, 0x45, 0x41, 0x53, 0x45, 0x20, 0x52,
0x45, 0x41,
};
static uint8_t frag6_2[] = {
0x60, 0x0, 0x0, 0x0, 0x0, 0x12, 0x2c, 0x40, 0xfc, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x1, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x0,
0x11, 0x0, 0x0, 0x30, 0x0, 0x0, 0xbe, 0xef, 0x53, 0x53,
0x45, 0x4d, 0x42, 0x4c, 0x45, 0x20, 0x4d, 0x45,
};
#endif /* _IP_CHECK_DEFRAG_FRAGS_H */

View File

@ -270,14 +270,23 @@ int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts)
opts = &default_opts;
optlen = sizeof(type);
if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) {
log_err("getsockopt(SOL_TYPE)");
return -1;
if (opts->type) {
type = opts->type;
} else {
if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) {
log_err("getsockopt(SOL_TYPE)");
return -1;
}
}
if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) {
log_err("getsockopt(SOL_PROTOCOL)");
return -1;
if (opts->proto) {
protocol = opts->proto;
} else {
if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) {
log_err("getsockopt(SOL_PROTOCOL)");
return -1;
}
}
addrlen = sizeof(addr);
@ -301,8 +310,9 @@ int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts)
strlen(opts->cc) + 1))
goto error_close;
if (connect_fd_to_addr(fd, &addr, addrlen, opts->must_fail))
goto error_close;
if (!opts->noconnect)
if (connect_fd_to_addr(fd, &addr, addrlen, opts->must_fail))
goto error_close;
return fd;
@ -423,6 +433,9 @@ struct nstoken *open_netns(const char *name)
void close_netns(struct nstoken *token)
{
if (!token)
return;
ASSERT_OK(setns(token->orig_netns_fd, CLONE_NEWNET), "setns");
close(token->orig_netns_fd);
free(token);

View File

@ -21,6 +21,9 @@ struct network_helper_opts {
const char *cc;
int timeout_ms;
bool must_fail;
bool noconnect;
int type;
int proto;
};
/* ipv4 test vector */

View File

@ -0,0 +1,199 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2023 Isovalent */
#include <uapi/linux/if_link.h>
#include <test_progs.h>
#include <netinet/tcp.h>
#include <netinet/udp.h>
#include "network_helpers.h"
#include "test_assign_reuse.skel.h"
#define NS_TEST "assign_reuse"
#define LOOPBACK 1
#define PORT 4443
static int attach_reuseport(int sock_fd, int prog_fd)
{
return setsockopt(sock_fd, SOL_SOCKET, SO_ATTACH_REUSEPORT_EBPF,
&prog_fd, sizeof(prog_fd));
}
static __u64 cookie(int fd)
{
__u64 cookie = 0;
socklen_t cookie_len = sizeof(cookie);
int ret;
ret = getsockopt(fd, SOL_SOCKET, SO_COOKIE, &cookie, &cookie_len);
ASSERT_OK(ret, "cookie");
ASSERT_GT(cookie, 0, "cookie_invalid");
return cookie;
}
static int echo_test_udp(int fd_sv)
{
struct sockaddr_storage addr = {};
socklen_t len = sizeof(addr);
char buff[1] = {};
int fd_cl = -1, ret;
fd_cl = connect_to_fd(fd_sv, 100);
ASSERT_GT(fd_cl, 0, "create_client");
ASSERT_EQ(getsockname(fd_cl, (void *)&addr, &len), 0, "getsockname");
ASSERT_EQ(send(fd_cl, buff, sizeof(buff), 0), 1, "send_client");
ret = recv(fd_sv, buff, sizeof(buff), 0);
if (ret < 0) {
close(fd_cl);
return errno;
}
ASSERT_EQ(ret, 1, "recv_server");
ASSERT_EQ(sendto(fd_sv, buff, sizeof(buff), 0, (void *)&addr, len), 1, "send_server");
ASSERT_EQ(recv(fd_cl, buff, sizeof(buff), 0), 1, "recv_client");
close(fd_cl);
return 0;
}
static int echo_test_tcp(int fd_sv)
{
char buff[1] = {};
int fd_cl = -1, fd_sv_cl = -1;
fd_cl = connect_to_fd(fd_sv, 100);
if (fd_cl < 0)
return errno;
fd_sv_cl = accept(fd_sv, NULL, NULL);
ASSERT_GE(fd_sv_cl, 0, "accept_fd");
ASSERT_EQ(send(fd_cl, buff, sizeof(buff), 0), 1, "send_client");
ASSERT_EQ(recv(fd_sv_cl, buff, sizeof(buff), 0), 1, "recv_server");
ASSERT_EQ(send(fd_sv_cl, buff, sizeof(buff), 0), 1, "send_server");
ASSERT_EQ(recv(fd_cl, buff, sizeof(buff), 0), 1, "recv_client");
close(fd_sv_cl);
close(fd_cl);
return 0;
}
void run_assign_reuse(int family, int sotype, const char *ip, __u16 port)
{
DECLARE_LIBBPF_OPTS(bpf_tc_hook, tc_hook,
.ifindex = LOOPBACK,
.attach_point = BPF_TC_INGRESS,
);
DECLARE_LIBBPF_OPTS(bpf_tc_opts, tc_opts,
.handle = 1,
.priority = 1,
);
bool hook_created = false, tc_attached = false;
int ret, fd_tc, fd_accept, fd_drop, fd_map;
int *fd_sv = NULL;
__u64 fd_val;
struct test_assign_reuse *skel;
const int zero = 0;
skel = test_assign_reuse__open();
if (!ASSERT_OK_PTR(skel, "skel_open"))
goto cleanup;
skel->rodata->dest_port = port;
ret = test_assign_reuse__load(skel);
if (!ASSERT_OK(ret, "skel_load"))
goto cleanup;
ASSERT_EQ(skel->bss->sk_cookie_seen, 0, "cookie_init");
fd_tc = bpf_program__fd(skel->progs.tc_main);
fd_accept = bpf_program__fd(skel->progs.reuse_accept);
fd_drop = bpf_program__fd(skel->progs.reuse_drop);
fd_map = bpf_map__fd(skel->maps.sk_map);
fd_sv = start_reuseport_server(family, sotype, ip, port, 100, 1);
if (!ASSERT_NEQ(fd_sv, NULL, "start_reuseport_server"))
goto cleanup;
ret = attach_reuseport(*fd_sv, fd_drop);
if (!ASSERT_OK(ret, "attach_reuseport"))
goto cleanup;
fd_val = *fd_sv;
ret = bpf_map_update_elem(fd_map, &zero, &fd_val, BPF_NOEXIST);
if (!ASSERT_OK(ret, "bpf_sk_map"))
goto cleanup;
ret = bpf_tc_hook_create(&tc_hook);
if (ret == 0)
hook_created = true;
ret = ret == -EEXIST ? 0 : ret;
if (!ASSERT_OK(ret, "bpf_tc_hook_create"))
goto cleanup;
tc_opts.prog_fd = fd_tc;
ret = bpf_tc_attach(&tc_hook, &tc_opts);
if (!ASSERT_OK(ret, "bpf_tc_attach"))
goto cleanup;
tc_attached = true;
if (sotype == SOCK_STREAM)
ASSERT_EQ(echo_test_tcp(*fd_sv), ECONNREFUSED, "drop_tcp");
else
ASSERT_EQ(echo_test_udp(*fd_sv), EAGAIN, "drop_udp");
ASSERT_EQ(skel->bss->reuseport_executed, 1, "program executed once");
skel->bss->sk_cookie_seen = 0;
skel->bss->reuseport_executed = 0;
ASSERT_OK(attach_reuseport(*fd_sv, fd_accept), "attach_reuseport(accept)");
if (sotype == SOCK_STREAM)
ASSERT_EQ(echo_test_tcp(*fd_sv), 0, "echo_tcp");
else
ASSERT_EQ(echo_test_udp(*fd_sv), 0, "echo_udp");
ASSERT_EQ(skel->bss->sk_cookie_seen, cookie(*fd_sv),
"cookie_mismatch");
ASSERT_EQ(skel->bss->reuseport_executed, 1, "program executed once");
cleanup:
if (tc_attached) {
tc_opts.flags = tc_opts.prog_fd = tc_opts.prog_id = 0;
ret = bpf_tc_detach(&tc_hook, &tc_opts);
ASSERT_OK(ret, "bpf_tc_detach");
}
if (hook_created) {
tc_hook.attach_point = BPF_TC_INGRESS | BPF_TC_EGRESS;
bpf_tc_hook_destroy(&tc_hook);
}
test_assign_reuse__destroy(skel);
free_fds(fd_sv, 1);
}
void test_assign_reuse(void)
{
struct nstoken *tok = NULL;
SYS(out, "ip netns add %s", NS_TEST);
SYS(cleanup, "ip -net %s link set dev lo up", NS_TEST);
tok = open_netns(NS_TEST);
if (!ASSERT_OK_PTR(tok, "netns token"))
return;
if (test__start_subtest("tcpv4"))
run_assign_reuse(AF_INET, SOCK_STREAM, "127.0.0.1", PORT);
if (test__start_subtest("tcpv6"))
run_assign_reuse(AF_INET6, SOCK_STREAM, "::1", PORT);
if (test__start_subtest("udpv4"))
run_assign_reuse(AF_INET, SOCK_DGRAM, "127.0.0.1", PORT);
if (test__start_subtest("udpv6"))
run_assign_reuse(AF_INET6, SOCK_DGRAM, "::1", PORT);
cleanup:
close_netns(tok);
SYS_NOFAIL("ip netns delete %s", NS_TEST);
out:
return;
}

View File

@ -0,0 +1,283 @@
// SPDX-License-Identifier: GPL-2.0
#include <test_progs.h>
#include <net/if.h>
#include <linux/netfilter.h>
#include <network_helpers.h>
#include "ip_check_defrag.skel.h"
#include "ip_check_defrag_frags.h"
/*
* This selftest spins up a client and an echo server, each in their own
* network namespace. The client will send a fragmented message to the server.
* The prog attached to the server will shoot down any fragments. Thus, if
* the server is able to correctly echo back the message to the client, we will
* have verified that netfilter is reassembling packets for us.
*
* Topology:
* =========
* NS0 | NS1
* |
* client | server
* ---------- | ----------
* | veth0 | --------- | veth1 |
* ---------- peer ----------
* |
* | with bpf
*/
#define NS0 "defrag_ns0"
#define NS1 "defrag_ns1"
#define VETH0 "veth0"
#define VETH1 "veth1"
#define VETH0_ADDR "172.16.1.100"
#define VETH0_ADDR6 "fc00::100"
/* The following constants must stay in sync with `generate_udp_fragments.py` */
#define VETH1_ADDR "172.16.1.200"
#define VETH1_ADDR6 "fc00::200"
#define CLIENT_PORT 48878
#define SERVER_PORT 48879
#define MAGIC_MESSAGE "THIS IS THE ORIGINAL MESSAGE, PLEASE REASSEMBLE ME"
static int setup_topology(bool ipv6)
{
bool up;
int i;
SYS(fail, "ip netns add " NS0);
SYS(fail, "ip netns add " NS1);
SYS(fail, "ip link add " VETH0 " netns " NS0 " type veth peer name " VETH1 " netns " NS1);
if (ipv6) {
SYS(fail, "ip -6 -net " NS0 " addr add " VETH0_ADDR6 "/64 dev " VETH0 " nodad");
SYS(fail, "ip -6 -net " NS1 " addr add " VETH1_ADDR6 "/64 dev " VETH1 " nodad");
} else {
SYS(fail, "ip -net " NS0 " addr add " VETH0_ADDR "/24 dev " VETH0);
SYS(fail, "ip -net " NS1 " addr add " VETH1_ADDR "/24 dev " VETH1);
}
SYS(fail, "ip -net " NS0 " link set dev " VETH0 " up");
SYS(fail, "ip -net " NS1 " link set dev " VETH1 " up");
/* Wait for up to 5s for links to come up */
for (i = 0; i < 5; ++i) {
if (ipv6)
up = !system("ip netns exec " NS0 " ping -6 -c 1 -W 1 " VETH1_ADDR6 " &>/dev/null");
else
up = !system("ip netns exec " NS0 " ping -c 1 -W 1 " VETH1_ADDR " &>/dev/null");
if (up)
break;
}
return 0;
fail:
return -1;
}
static void cleanup_topology(void)
{
SYS_NOFAIL("test -f /var/run/netns/" NS0 " && ip netns delete " NS0);
SYS_NOFAIL("test -f /var/run/netns/" NS1 " && ip netns delete " NS1);
}
static int attach(struct ip_check_defrag *skel, bool ipv6)
{
LIBBPF_OPTS(bpf_netfilter_opts, opts,
.pf = ipv6 ? NFPROTO_IPV6 : NFPROTO_IPV4,
.priority = 42,
.flags = BPF_F_NETFILTER_IP_DEFRAG);
struct nstoken *nstoken;
int err = -1;
nstoken = open_netns(NS1);
skel->links.defrag = bpf_program__attach_netfilter(skel->progs.defrag, &opts);
if (!ASSERT_OK_PTR(skel->links.defrag, "program attach"))
goto out;
err = 0;
out:
close_netns(nstoken);
return err;
}
static int send_frags(int client)
{
struct sockaddr_storage saddr;
struct sockaddr *saddr_p;
socklen_t saddr_len;
int err;
saddr_p = (struct sockaddr *)&saddr;
err = make_sockaddr(AF_INET, VETH1_ADDR, SERVER_PORT, &saddr, &saddr_len);
if (!ASSERT_OK(err, "make_sockaddr"))
return -1;
err = sendto(client, frag_0, sizeof(frag_0), 0, saddr_p, saddr_len);
if (!ASSERT_GE(err, 0, "sendto frag_0"))
return -1;
err = sendto(client, frag_1, sizeof(frag_1), 0, saddr_p, saddr_len);
if (!ASSERT_GE(err, 0, "sendto frag_1"))
return -1;
err = sendto(client, frag_2, sizeof(frag_2), 0, saddr_p, saddr_len);
if (!ASSERT_GE(err, 0, "sendto frag_2"))
return -1;
return 0;
}
static int send_frags6(int client)
{
struct sockaddr_storage saddr;
struct sockaddr *saddr_p;
socklen_t saddr_len;
int err;
saddr_p = (struct sockaddr *)&saddr;
/* Port needs to be set to 0 for raw ipv6 socket for some reason */
err = make_sockaddr(AF_INET6, VETH1_ADDR6, 0, &saddr, &saddr_len);
if (!ASSERT_OK(err, "make_sockaddr"))
return -1;
err = sendto(client, frag6_0, sizeof(frag6_0), 0, saddr_p, saddr_len);
if (!ASSERT_GE(err, 0, "sendto frag6_0"))
return -1;
err = sendto(client, frag6_1, sizeof(frag6_1), 0, saddr_p, saddr_len);
if (!ASSERT_GE(err, 0, "sendto frag6_1"))
return -1;
err = sendto(client, frag6_2, sizeof(frag6_2), 0, saddr_p, saddr_len);
if (!ASSERT_GE(err, 0, "sendto frag6_2"))
return -1;
return 0;
}
void test_bpf_ip_check_defrag_ok(bool ipv6)
{
struct network_helper_opts rx_opts = {
.timeout_ms = 1000,
.noconnect = true,
};
struct network_helper_opts tx_ops = {
.timeout_ms = 1000,
.type = SOCK_RAW,
.proto = IPPROTO_RAW,
.noconnect = true,
};
struct sockaddr_storage caddr;
struct ip_check_defrag *skel;
struct nstoken *nstoken;
int client_tx_fd = -1;
int client_rx_fd = -1;
socklen_t caddr_len;
int srv_fd = -1;
char buf[1024];
int len, err;
skel = ip_check_defrag__open_and_load();
if (!ASSERT_OK_PTR(skel, "skel_open"))
return;
if (!ASSERT_OK(setup_topology(ipv6), "setup_topology"))
goto out;
if (!ASSERT_OK(attach(skel, ipv6), "attach"))
goto out;
/* Start server in ns1 */
nstoken = open_netns(NS1);
if (!ASSERT_OK_PTR(nstoken, "setns ns1"))
goto out;
srv_fd = start_server(ipv6 ? AF_INET6 : AF_INET, SOCK_DGRAM, NULL, SERVER_PORT, 0);
close_netns(nstoken);
if (!ASSERT_GE(srv_fd, 0, "start_server"))
goto out;
/* Open tx raw socket in ns0 */
nstoken = open_netns(NS0);
if (!ASSERT_OK_PTR(nstoken, "setns ns0"))
goto out;
client_tx_fd = connect_to_fd_opts(srv_fd, &tx_ops);
close_netns(nstoken);
if (!ASSERT_GE(client_tx_fd, 0, "connect_to_fd_opts"))
goto out;
/* Open rx socket in ns0 */
nstoken = open_netns(NS0);
if (!ASSERT_OK_PTR(nstoken, "setns ns0"))
goto out;
client_rx_fd = connect_to_fd_opts(srv_fd, &rx_opts);
close_netns(nstoken);
if (!ASSERT_GE(client_rx_fd, 0, "connect_to_fd_opts"))
goto out;
/* Bind rx socket to a premeditated port */
memset(&caddr, 0, sizeof(caddr));
nstoken = open_netns(NS0);
if (!ASSERT_OK_PTR(nstoken, "setns ns0"))
goto out;
if (ipv6) {
struct sockaddr_in6 *c = (struct sockaddr_in6 *)&caddr;
c->sin6_family = AF_INET6;
inet_pton(AF_INET6, VETH0_ADDR6, &c->sin6_addr);
c->sin6_port = htons(CLIENT_PORT);
err = bind(client_rx_fd, (struct sockaddr *)c, sizeof(*c));
} else {
struct sockaddr_in *c = (struct sockaddr_in *)&caddr;
c->sin_family = AF_INET;
inet_pton(AF_INET, VETH0_ADDR, &c->sin_addr);
c->sin_port = htons(CLIENT_PORT);
err = bind(client_rx_fd, (struct sockaddr *)c, sizeof(*c));
}
close_netns(nstoken);
if (!ASSERT_OK(err, "bind"))
goto out;
/* Send message in fragments */
if (ipv6) {
if (!ASSERT_OK(send_frags6(client_tx_fd), "send_frags6"))
goto out;
} else {
if (!ASSERT_OK(send_frags(client_tx_fd), "send_frags"))
goto out;
}
if (!ASSERT_EQ(skel->bss->shootdowns, 0, "shootdowns"))
goto out;
/* Receive reassembled msg on server and echo back to client */
caddr_len = sizeof(caddr);
len = recvfrom(srv_fd, buf, sizeof(buf), 0, (struct sockaddr *)&caddr, &caddr_len);
if (!ASSERT_GE(len, 0, "server recvfrom"))
goto out;
len = sendto(srv_fd, buf, len, 0, (struct sockaddr *)&caddr, caddr_len);
if (!ASSERT_GE(len, 0, "server sendto"))
goto out;
/* Expect reassembled message to be echoed back */
len = recvfrom(client_rx_fd, buf, sizeof(buf), 0, NULL, NULL);
if (!ASSERT_EQ(len, sizeof(MAGIC_MESSAGE) - 1, "client short read"))
goto out;
out:
if (client_rx_fd != -1)
close(client_rx_fd);
if (client_tx_fd != -1)
close(client_tx_fd);
if (srv_fd != -1)
close(srv_fd);
cleanup_topology();
ip_check_defrag__destroy(skel);
}
void test_bpf_ip_check_defrag(void)
{
if (test__start_subtest("v4"))
test_bpf_ip_check_defrag_ok(false);
if (test__start_subtest("v6"))
test_bpf_ip_check_defrag_ok(true);
}

View File

@ -0,0 +1,139 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates.*/
#include <test_progs.h>
#include <network_helpers.h>
#include "test_ldsx_insn.skel.h"
static void test_map_val_and_probed_memory(void)
{
struct test_ldsx_insn *skel;
int err;
skel = test_ldsx_insn__open();
if (!ASSERT_OK_PTR(skel, "test_ldsx_insn__open"))
return;
if (skel->rodata->skip) {
test__skip();
goto out;
}
bpf_program__set_autoload(skel->progs.rdonly_map_prog, true);
bpf_program__set_autoload(skel->progs.map_val_prog, true);
bpf_program__set_autoload(skel->progs.test_ptr_struct_arg, true);
err = test_ldsx_insn__load(skel);
if (!ASSERT_OK(err, "test_ldsx_insn__load"))
goto out;
err = test_ldsx_insn__attach(skel);
if (!ASSERT_OK(err, "test_ldsx_insn__attach"))
goto out;
ASSERT_OK(trigger_module_test_read(256), "trigger_read");
ASSERT_EQ(skel->bss->done1, 1, "done1");
ASSERT_EQ(skel->bss->ret1, 1, "ret1");
ASSERT_EQ(skel->bss->done2, 1, "done2");
ASSERT_EQ(skel->bss->ret2, 1, "ret2");
ASSERT_EQ(skel->bss->int_member, -1, "int_member");
out:
test_ldsx_insn__destroy(skel);
}
static void test_ctx_member_sign_ext(void)
{
struct test_ldsx_insn *skel;
int err, fd, cgroup_fd;
char buf[16] = {0};
socklen_t optlen;
cgroup_fd = test__join_cgroup("/ldsx_test");
if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup /ldsx_test"))
return;
skel = test_ldsx_insn__open();
if (!ASSERT_OK_PTR(skel, "test_ldsx_insn__open"))
goto close_cgroup_fd;
if (skel->rodata->skip) {
test__skip();
goto destroy_skel;
}
bpf_program__set_autoload(skel->progs._getsockopt, true);
err = test_ldsx_insn__load(skel);
if (!ASSERT_OK(err, "test_ldsx_insn__load"))
goto destroy_skel;
skel->links._getsockopt =
bpf_program__attach_cgroup(skel->progs._getsockopt, cgroup_fd);
if (!ASSERT_OK_PTR(skel->links._getsockopt, "getsockopt_link"))
goto destroy_skel;
fd = socket(AF_INET, SOCK_STREAM, 0);
if (!ASSERT_GE(fd, 0, "socket"))
goto destroy_skel;
optlen = sizeof(buf);
(void)getsockopt(fd, SOL_IP, IP_TTL, buf, &optlen);
ASSERT_EQ(skel->bss->set_optlen, -1, "optlen");
ASSERT_EQ(skel->bss->set_retval, -1, "retval");
close(fd);
destroy_skel:
test_ldsx_insn__destroy(skel);
close_cgroup_fd:
close(cgroup_fd);
}
static void test_ctx_member_narrow_sign_ext(void)
{
struct test_ldsx_insn *skel;
struct __sk_buff skb = {};
LIBBPF_OPTS(bpf_test_run_opts, topts,
.data_in = &pkt_v4,
.data_size_in = sizeof(pkt_v4),
.ctx_in = &skb,
.ctx_size_in = sizeof(skb),
);
int err, prog_fd;
skel = test_ldsx_insn__open();
if (!ASSERT_OK_PTR(skel, "test_ldsx_insn__open"))
return;
if (skel->rodata->skip) {
test__skip();
goto out;
}
bpf_program__set_autoload(skel->progs._tc, true);
err = test_ldsx_insn__load(skel);
if (!ASSERT_OK(err, "test_ldsx_insn__load"))
goto out;
prog_fd = bpf_program__fd(skel->progs._tc);
err = bpf_prog_test_run_opts(prog_fd, &topts);
ASSERT_OK(err, "test_run");
ASSERT_EQ(skel->bss->set_mark, -2, "set_mark");
out:
test_ldsx_insn__destroy(skel);
}
void test_ldsx_insn(void)
{
if (test__start_subtest("map_val and probed_memory"))
test_map_val_and_probed_memory();
if (test__start_subtest("ctx_member_sign_ext"))
test_ctx_member_sign_ext();
if (test__start_subtest("ctx_member_narrow_sign_ext"))
test_ctx_member_narrow_sign_ext();
}

View File

@ -11,6 +11,7 @@
#include "verifier_bounds_deduction_non_const.skel.h"
#include "verifier_bounds_mix_sign_unsign.skel.h"
#include "verifier_bpf_get_stack.skel.h"
#include "verifier_bswap.skel.h"
#include "verifier_btf_ctx_access.skel.h"
#include "verifier_cfg.skel.h"
#include "verifier_cgroup_inv_retcode.skel.h"
@ -24,6 +25,7 @@
#include "verifier_direct_stack_access_wraparound.skel.h"
#include "verifier_div0.skel.h"
#include "verifier_div_overflow.skel.h"
#include "verifier_gotol.skel.h"
#include "verifier_helper_access_var_len.skel.h"
#include "verifier_helper_packet_access.skel.h"
#include "verifier_helper_restricted.skel.h"
@ -31,6 +33,7 @@
#include "verifier_int_ptr.skel.h"
#include "verifier_jeq_infer_not_null.skel.h"
#include "verifier_ld_ind.skel.h"
#include "verifier_ldsx.skel.h"
#include "verifier_leak_ptr.skel.h"
#include "verifier_loops1.skel.h"
#include "verifier_lwt.skel.h"
@ -40,6 +43,7 @@
#include "verifier_map_ret_val.skel.h"
#include "verifier_masking.skel.h"
#include "verifier_meta_access.skel.h"
#include "verifier_movsx.skel.h"
#include "verifier_netfilter_ctx.skel.h"
#include "verifier_netfilter_retcode.skel.h"
#include "verifier_prevent_map_lookup.skel.h"
@ -51,6 +55,7 @@
#include "verifier_ringbuf.skel.h"
#include "verifier_runtime_jit.skel.h"
#include "verifier_scalar_ids.skel.h"
#include "verifier_sdiv.skel.h"
#include "verifier_search_pruning.skel.h"
#include "verifier_sock.skel.h"
#include "verifier_spill_fill.skel.h"
@ -113,6 +118,7 @@ void test_verifier_bounds_deduction(void) { RUN(verifier_bounds_deduction);
void test_verifier_bounds_deduction_non_const(void) { RUN(verifier_bounds_deduction_non_const); }
void test_verifier_bounds_mix_sign_unsign(void) { RUN(verifier_bounds_mix_sign_unsign); }
void test_verifier_bpf_get_stack(void) { RUN(verifier_bpf_get_stack); }
void test_verifier_bswap(void) { RUN(verifier_bswap); }
void test_verifier_btf_ctx_access(void) { RUN(verifier_btf_ctx_access); }
void test_verifier_cfg(void) { RUN(verifier_cfg); }
void test_verifier_cgroup_inv_retcode(void) { RUN(verifier_cgroup_inv_retcode); }
@ -126,6 +132,7 @@ void test_verifier_direct_packet_access(void) { RUN(verifier_direct_packet_acces
void test_verifier_direct_stack_access_wraparound(void) { RUN(verifier_direct_stack_access_wraparound); }
void test_verifier_div0(void) { RUN(verifier_div0); }
void test_verifier_div_overflow(void) { RUN(verifier_div_overflow); }
void test_verifier_gotol(void) { RUN(verifier_gotol); }
void test_verifier_helper_access_var_len(void) { RUN(verifier_helper_access_var_len); }
void test_verifier_helper_packet_access(void) { RUN(verifier_helper_packet_access); }
void test_verifier_helper_restricted(void) { RUN(verifier_helper_restricted); }
@ -133,6 +140,7 @@ void test_verifier_helper_value_access(void) { RUN(verifier_helper_value_access
void test_verifier_int_ptr(void) { RUN(verifier_int_ptr); }
void test_verifier_jeq_infer_not_null(void) { RUN(verifier_jeq_infer_not_null); }
void test_verifier_ld_ind(void) { RUN(verifier_ld_ind); }
void test_verifier_ldsx(void) { RUN(verifier_ldsx); }
void test_verifier_leak_ptr(void) { RUN(verifier_leak_ptr); }
void test_verifier_loops1(void) { RUN(verifier_loops1); }
void test_verifier_lwt(void) { RUN(verifier_lwt); }
@ -142,6 +150,7 @@ void test_verifier_map_ptr_mixing(void) { RUN(verifier_map_ptr_mixing); }
void test_verifier_map_ret_val(void) { RUN(verifier_map_ret_val); }
void test_verifier_masking(void) { RUN(verifier_masking); }
void test_verifier_meta_access(void) { RUN(verifier_meta_access); }
void test_verifier_movsx(void) { RUN(verifier_movsx); }
void test_verifier_netfilter_ctx(void) { RUN(verifier_netfilter_ctx); }
void test_verifier_netfilter_retcode(void) { RUN(verifier_netfilter_retcode); }
void test_verifier_prevent_map_lookup(void) { RUN(verifier_prevent_map_lookup); }
@ -153,6 +162,7 @@ void test_verifier_regalloc(void) { RUN(verifier_regalloc); }
void test_verifier_ringbuf(void) { RUN(verifier_ringbuf); }
void test_verifier_runtime_jit(void) { RUN(verifier_runtime_jit); }
void test_verifier_scalar_ids(void) { RUN(verifier_scalar_ids); }
void test_verifier_sdiv(void) { RUN(verifier_sdiv); }
void test_verifier_search_pruning(void) { RUN(verifier_search_pruning); }
void test_verifier_sock(void) { RUN(verifier_sock); }
void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }

View File

@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
#include <test_progs.h>
#include "test_xdp_attach_fail.skel.h"
#define IFINDEX_LO 1
#define XDP_FLAGS_REPLACE (1U << 4)
@ -85,10 +86,74 @@ static void test_xdp_attach(const char *file)
bpf_object__close(obj1);
}
#define ERRMSG_LEN 64
struct xdp_errmsg {
char msg[ERRMSG_LEN];
};
static void on_xdp_errmsg(void *ctx, int cpu, void *data, __u32 size)
{
struct xdp_errmsg *ctx_errmg = ctx, *tp_errmsg = data;
memcpy(&ctx_errmg->msg, &tp_errmsg->msg, ERRMSG_LEN);
}
static const char tgt_errmsg[] = "Invalid XDP flags for BPF link attachment";
static void test_xdp_attach_fail(const char *file)
{
struct test_xdp_attach_fail *skel = NULL;
struct xdp_errmsg errmsg = {};
struct perf_buffer *pb = NULL;
struct bpf_object *obj = NULL;
int err, fd_xdp;
LIBBPF_OPTS(bpf_link_create_opts, opts);
skel = test_xdp_attach_fail__open_and_load();
if (!ASSERT_OK_PTR(skel, "test_xdp_attach_fail__open_and_load"))
goto out_close;
err = test_xdp_attach_fail__attach(skel);
if (!ASSERT_EQ(err, 0, "test_xdp_attach_fail__attach"))
goto out_close;
/* set up perf buffer */
pb = perf_buffer__new(bpf_map__fd(skel->maps.xdp_errmsg_pb), 1,
on_xdp_errmsg, NULL, &errmsg, NULL);
if (!ASSERT_OK_PTR(pb, "perf_buffer__new"))
goto out_close;
err = bpf_prog_test_load(file, BPF_PROG_TYPE_XDP, &obj, &fd_xdp);
if (!ASSERT_EQ(err, 0, "bpf_prog_test_load"))
goto out_close;
opts.flags = 0xFF; // invalid flags to fail to attach XDP prog
err = bpf_link_create(fd_xdp, IFINDEX_LO, BPF_XDP, &opts);
if (!ASSERT_EQ(err, -EINVAL, "bpf_link_create"))
goto out_close;
/* read perf buffer */
err = perf_buffer__poll(pb, 100);
if (!ASSERT_GT(err, -1, "perf_buffer__poll"))
goto out_close;
ASSERT_STRNEQ((const char *) errmsg.msg, tgt_errmsg,
42 /* strlen(tgt_errmsg) + 1 */, "check error message");
out_close:
perf_buffer__free(pb);
bpf_object__close(obj);
test_xdp_attach_fail__destroy(skel);
}
void serial_test_xdp_attach(void)
{
if (test__start_subtest("xdp_attach"))
test_xdp_attach("./test_xdp.bpf.o");
if (test__start_subtest("xdp_attach_dynptr"))
test_xdp_attach("./test_xdp_dynptr.bpf.o");
if (test__start_subtest("xdp_attach_failed"))
test_xdp_attach_fail("./xdp_dummy.bpf.o");
}

View File

@ -0,0 +1,104 @@
// SPDX-License-Identifier: GPL-2.0-only
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>
#include "bpf_tracing_net.h"
#define NF_DROP 0
#define NF_ACCEPT 1
#define ETH_P_IP 0x0800
#define ETH_P_IPV6 0x86DD
#define IP_MF 0x2000
#define IP_OFFSET 0x1FFF
#define NEXTHDR_FRAGMENT 44
extern int bpf_dynptr_from_skb(struct sk_buff *skb, __u64 flags,
struct bpf_dynptr *ptr__uninit) __ksym;
extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, uint32_t offset,
void *buffer, uint32_t buffer__sz) __ksym;
volatile int shootdowns = 0;
static bool is_frag_v4(struct iphdr *iph)
{
int offset;
int flags;
offset = bpf_ntohs(iph->frag_off);
flags = offset & ~IP_OFFSET;
offset &= IP_OFFSET;
offset <<= 3;
return (flags & IP_MF) || offset;
}
static bool is_frag_v6(struct ipv6hdr *ip6h)
{
/* Simplifying assumption that there are no extension headers
* between fixed header and fragmentation header. This assumption
* is only valid in this test case. It saves us the hassle of
* searching all potential extension headers.
*/
return ip6h->nexthdr == NEXTHDR_FRAGMENT;
}
static int handle_v4(struct sk_buff *skb)
{
struct bpf_dynptr ptr;
u8 iph_buf[20] = {};
struct iphdr *iph;
if (bpf_dynptr_from_skb(skb, 0, &ptr))
return NF_DROP;
iph = bpf_dynptr_slice(&ptr, 0, iph_buf, sizeof(iph_buf));
if (!iph)
return NF_DROP;
/* Shootdown any frags */
if (is_frag_v4(iph)) {
shootdowns++;
return NF_DROP;
}
return NF_ACCEPT;
}
static int handle_v6(struct sk_buff *skb)
{
struct bpf_dynptr ptr;
struct ipv6hdr *ip6h;
u8 ip6h_buf[40] = {};
if (bpf_dynptr_from_skb(skb, 0, &ptr))
return NF_DROP;
ip6h = bpf_dynptr_slice(&ptr, 0, ip6h_buf, sizeof(ip6h_buf));
if (!ip6h)
return NF_DROP;
/* Shootdown any frags */
if (is_frag_v6(ip6h)) {
shootdowns++;
return NF_DROP;
}
return NF_ACCEPT;
}
SEC("netfilter")
int defrag(struct bpf_nf_ctx *ctx)
{
struct sk_buff *skb = ctx->skb;
switch (bpf_ntohs(skb->protocol)) {
case ETH_P_IP:
return handle_v4(skb);
case ETH_P_IPV6:
return handle_v6(skb);
default:
return NF_ACCEPT;
}
}
char _license[] SEC("license") = "GPL";

View File

@ -0,0 +1,142 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2023 Isovalent */
#include <stdbool.h>
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <linux/tcp.h>
#include <linux/udp.h>
#include <bpf/bpf_endian.h>
#include <bpf/bpf_helpers.h>
#include <linux/pkt_cls.h>
char LICENSE[] SEC("license") = "GPL";
__u64 sk_cookie_seen;
__u64 reuseport_executed;
union {
struct tcphdr tcp;
struct udphdr udp;
} headers;
const volatile __u16 dest_port;
struct {
__uint(type, BPF_MAP_TYPE_SOCKMAP);
__uint(max_entries, 1);
__type(key, __u32);
__type(value, __u64);
} sk_map SEC(".maps");
SEC("sk_reuseport")
int reuse_accept(struct sk_reuseport_md *ctx)
{
reuseport_executed++;
if (ctx->ip_protocol == IPPROTO_TCP) {
if (ctx->data + sizeof(headers.tcp) > ctx->data_end)
return SK_DROP;
if (__builtin_memcmp(&headers.tcp, ctx->data, sizeof(headers.tcp)) != 0)
return SK_DROP;
} else if (ctx->ip_protocol == IPPROTO_UDP) {
if (ctx->data + sizeof(headers.udp) > ctx->data_end)
return SK_DROP;
if (__builtin_memcmp(&headers.udp, ctx->data, sizeof(headers.udp)) != 0)
return SK_DROP;
} else {
return SK_DROP;
}
sk_cookie_seen = bpf_get_socket_cookie(ctx->sk);
return SK_PASS;
}
SEC("sk_reuseport")
int reuse_drop(struct sk_reuseport_md *ctx)
{
reuseport_executed++;
sk_cookie_seen = 0;
return SK_DROP;
}
static int
assign_sk(struct __sk_buff *skb)
{
int zero = 0, ret = 0;
struct bpf_sock *sk;
sk = bpf_map_lookup_elem(&sk_map, &zero);
if (!sk)
return TC_ACT_SHOT;
ret = bpf_sk_assign(skb, sk, 0);
bpf_sk_release(sk);
return ret ? TC_ACT_SHOT : TC_ACT_OK;
}
static bool
maybe_assign_tcp(struct __sk_buff *skb, struct tcphdr *th)
{
if (th + 1 > (void *)(long)(skb->data_end))
return TC_ACT_SHOT;
if (!th->syn || th->ack || th->dest != bpf_htons(dest_port))
return TC_ACT_OK;
__builtin_memcpy(&headers.tcp, th, sizeof(headers.tcp));
return assign_sk(skb);
}
static bool
maybe_assign_udp(struct __sk_buff *skb, struct udphdr *uh)
{
if (uh + 1 > (void *)(long)(skb->data_end))
return TC_ACT_SHOT;
if (uh->dest != bpf_htons(dest_port))
return TC_ACT_OK;
__builtin_memcpy(&headers.udp, uh, sizeof(headers.udp));
return assign_sk(skb);
}
SEC("tc")
int tc_main(struct __sk_buff *skb)
{
void *data_end = (void *)(long)skb->data_end;
void *data = (void *)(long)skb->data;
struct ethhdr *eth;
eth = (struct ethhdr *)(data);
if (eth + 1 > data_end)
return TC_ACT_SHOT;
if (eth->h_proto == bpf_htons(ETH_P_IP)) {
struct iphdr *iph = (struct iphdr *)(data + sizeof(*eth));
if (iph + 1 > data_end)
return TC_ACT_SHOT;
if (iph->protocol == IPPROTO_TCP)
return maybe_assign_tcp(skb, (struct tcphdr *)(iph + 1));
else if (iph->protocol == IPPROTO_UDP)
return maybe_assign_udp(skb, (struct udphdr *)(iph + 1));
else
return TC_ACT_SHOT;
} else {
struct ipv6hdr *ip6h = (struct ipv6hdr *)(data + sizeof(*eth));
if (ip6h + 1 > data_end)
return TC_ACT_SHOT;
if (ip6h->nexthdr == IPPROTO_TCP)
return maybe_assign_tcp(skb, (struct tcphdr *)(ip6h + 1));
else if (ip6h->nexthdr == IPPROTO_UDP)
return maybe_assign_udp(skb, (struct udphdr *)(ip6h + 1));
else
return TC_ACT_SHOT;
}
}

View File

@ -12,6 +12,15 @@
#include <linux/ipv6.h>
#include <linux/udp.h>
/* offsetof() is used in static asserts, and the libbpf-redefined CO-RE
* friendly version breaks compilation for older clang versions <= 15
* when invoked in a static assert. Restore original here.
*/
#ifdef offsetof
#undef offsetof
#define offsetof(type, member) __builtin_offsetof(type, member)
#endif
struct gre_base_hdr {
uint16_t flags;
uint16_t protocol;
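The restored builtin matters because this header feeds static asserts. A minimal sketch of the kind of assert involved (hypothetical, not part of the patch, relying only on the two fields shown above):
/* Only an integer-constant offsetof() is allowed in a static assert, which is
 * why the CO-RE-friendly redefinition is undone above. */
_Static_assert(offsetof(struct gre_base_hdr, protocol) == sizeof(uint16_t),
	       "protocol must immediately follow flags");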

View File

@ -0,0 +1,118 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#if defined(__TARGET_ARCH_x86) && __clang_major__ >= 18
const volatile int skip = 0;
#else
const volatile int skip = 1;
#endif
volatile const short val1 = -1;
volatile const int val2 = -1;
short val3 = -1;
int val4 = -1;
int done1, done2, ret1, ret2;
SEC("?raw_tp/sys_enter")
int rdonly_map_prog(const void *ctx)
{
if (done1)
return 0;
done1 = 1;
/* val1/val2 readonly map */
if (val1 == val2)
ret1 = 1;
return 0;
}
SEC("?raw_tp/sys_enter")
int map_val_prog(const void *ctx)
{
if (done2)
return 0;
done2 = 1;
/* val1/val2 regular read/write map */
if (val3 == val4)
ret2 = 1;
return 0;
}
struct bpf_testmod_struct_arg_1 {
int a;
};
long long int_member;
SEC("?fentry/bpf_testmod_test_arg_ptr_to_struct")
int BPF_PROG2(test_ptr_struct_arg, struct bpf_testmod_struct_arg_1 *, p)
{
/* probed memory access */
int_member = p->a;
return 0;
}
long long set_optlen, set_retval;
SEC("?cgroup/getsockopt")
int _getsockopt(volatile struct bpf_sockopt *ctx)
{
int old_optlen, old_retval;
old_optlen = ctx->optlen;
old_retval = ctx->retval;
ctx->optlen = -1;
ctx->retval = -1;
/* sign extension for ctx member */
set_optlen = ctx->optlen;
set_retval = ctx->retval;
ctx->optlen = old_optlen;
ctx->retval = old_retval;
return 0;
}
long long set_mark;
SEC("?tc")
int _tc(volatile struct __sk_buff *skb)
{
long long tmp_mark;
int old_mark;
old_mark = skb->mark;
skb->mark = 0xf6fe;
/* narrowed sign extension for ctx member */
#if __clang_major__ >= 18
/* force narrow one-byte signed load. Otherwise, compiler may
* generate a 32-bit unsigned load followed by an s8 movsx.
*/
asm volatile ("r1 = *(s8 *)(%[ctx] + %[off_mark])\n\t"
"%[tmp_mark] = r1"
: [tmp_mark]"=r"(tmp_mark)
: [ctx]"r"(skb),
[off_mark]"i"(offsetof(struct __sk_buff, mark))
: "r1");
#else
tmp_mark = (char)skb->mark;
#endif
set_mark = tmp_mark;
skb->mark = old_mark;
return 0;
}
char _license[] SEC("license") = "GPL";

View File

@ -0,0 +1,54 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright Leon Hwang */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#define ERRMSG_LEN 64
struct xdp_errmsg {
char msg[ERRMSG_LEN];
};
struct {
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
__type(key, int);
__type(value, int);
} xdp_errmsg_pb SEC(".maps");
struct xdp_attach_error_ctx {
unsigned long unused;
/*
* bpf does not support tracepoint __data_loc directly.
*
* Actually, this field is a 32 bit integer whose value encodes
* information on where to find the actual data. The first 2 bytes is
* the size of the data. The last 2 bytes is the offset from the start
* of the tracepoint struct where the data begins.
* -- https://github.com/iovisor/bpftrace/pull/1542
*/
__u32 msg; // __data_loc char[] msg;
};
/*
* Catch the error message at the tracepoint.
*/
SEC("tp/xdp/bpf_xdp_link_attach_failed")
int tp__xdp__bpf_xdp_link_attach_failed(struct xdp_attach_error_ctx *ctx)
{
char *msg = (void *)(__u64) ((void *) ctx + (__u16) ctx->msg);
struct xdp_errmsg errmsg = {};
bpf_probe_read_kernel_str(&errmsg.msg, ERRMSG_LEN, msg);
bpf_perf_event_output(ctx, &xdp_errmsg_pb, BPF_F_CURRENT_CPU, &errmsg,
ERRMSG_LEN);
return 0;
}
/*
* Reuse the XDP program in xdp_dummy.c.
*/
char LICENSE[] SEC("license") = "GPL";
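As a side note on the __data_loc layout described in the comment above, a hypothetical helper (not part of the patch) that decodes both halves could look like this; the program above only needs the offset, hence the bare (__u16) cast of ctx->msg.
/* Sketch only: low 16 bits hold the offset of the data from the start of the
 * tracepoint record, high 16 bits hold its length. */
static inline void data_loc_decode(__u32 loc, __u16 *off, __u16 *len)
{
	*off = loc & 0xffff;
	*len = loc >> 16;
}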

View File

@ -0,0 +1,59 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
#if defined(__TARGET_ARCH_x86) && __clang_major__ >= 18
SEC("socket")
__description("BSWAP, 16")
__success __success_unpriv __retval(0x23ff)
__naked void bswap_16(void)
{
asm volatile (" \
r0 = 0xff23; \
r0 = bswap16 r0; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("BSWAP, 32")
__success __success_unpriv __retval(0x23ff0000)
__naked void bswap_32(void)
{
asm volatile (" \
r0 = 0xff23; \
r0 = bswap32 r0; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("BSWAP, 64")
__success __success_unpriv __retval(0x34ff12ff)
__naked void bswap_64(void)
{
asm volatile (" \
r0 = %[u64_val] ll; \
r0 = bswap64 r0; \
exit; \
" :
: [u64_val]"i"(0xff12ff34ff56ff78ull)
: __clobber_all);
}
#else
SEC("socket")
__description("cpuv4 is not supported by compiler or jit, use a dummy test")
__success
int dummy_test(void)
{
return 0;
}
#endif
char _license[] SEC("license") = "GPL";
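For reference, a rough host-C analogue of what these tests pin down (a sketch, not part of the patch): the cpu v4 bswap forms are unconditional byte swaps, independent of host endianness, so the expected return values follow directly, e.g. bswap16(0xff23) == 0x23ff and bswap32(0xff23) == 0x23ff0000.
#include <stdint.h>
/* Sketch only: unconditional swaps, unlike the legacy BPF_END to-LE/to-BE ops. */
static inline uint64_t bswap16_reg(uint64_t r) { return __builtin_bswap16((uint16_t)r); }
static inline uint64_t bswap32_reg(uint64_t r) { return __builtin_bswap32((uint32_t)r); }
static inline uint64_t bswap64_reg(uint64_t r) { return __builtin_bswap64(r); }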

View File

@ -0,0 +1,44 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
#if defined(__TARGET_ARCH_x86) && __clang_major__ >= 18
SEC("socket")
__description("gotol, small_imm")
__success __success_unpriv __retval(1)
__naked void gotol_small_imm(void)
{
asm volatile (" \
call %[bpf_ktime_get_ns]; \
if r0 == 0 goto l0_%=; \
gotol l1_%=; \
l2_%=: \
gotol l3_%=; \
l1_%=: \
r0 = 1; \
gotol l2_%=; \
l0_%=: \
r0 = 2; \
l3_%=: \
exit; \
" :
: __imm(bpf_ktime_get_ns)
: __clobber_all);
}
#else
SEC("socket")
__description("cpuv4 is not supported by compiler or jit, use a dummy test")
__success
int dummy_test(void)
{
return 0;
}
#endif
char _license[] SEC("license") = "GPL";

View File

@ -0,0 +1,131 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
#if defined(__TARGET_ARCH_x86) && __clang_major__ >= 18
SEC("socket")
__description("LDSX, S8")
__success __success_unpriv __retval(-2)
__naked void ldsx_s8(void)
{
asm volatile (" \
r1 = 0x3fe; \
*(u64 *)(r10 - 8) = r1; \
r0 = *(s8 *)(r10 - 8); \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("LDSX, S16")
__success __success_unpriv __retval(-2)
__naked void ldsx_s16(void)
{
asm volatile (" \
r1 = 0x3fffe; \
*(u64 *)(r10 - 8) = r1; \
r0 = *(s16 *)(r10 - 8); \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("LDSX, S32")
__success __success_unpriv __retval(-1)
__naked void ldsx_s32(void)
{
asm volatile (" \
r1 = 0xfffffffe; \
*(u64 *)(r10 - 8) = r1; \
r0 = *(s32 *)(r10 - 8); \
r0 >>= 1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("LDSX, S8 range checking, privileged")
__log_level(2) __success __retval(1)
__msg("R1_w=scalar(smin=-128,smax=127)")
__naked void ldsx_s8_range_priv(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
*(u64 *)(r10 - 8) = r0; \
r1 = *(s8 *)(r10 - 8); \
/* r1 with s8 range */ \
if r1 s> 0x7f goto l0_%=; \
if r1 s< -0x80 goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 2; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
SEC("socket")
__description("LDSX, S16 range checking")
__success __success_unpriv __retval(1)
__naked void ldsx_s16_range(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
*(u64 *)(r10 - 8) = r0; \
r1 = *(s16 *)(r10 - 8); \
/* r1 with s16 range */ \
if r1 s> 0x7fff goto l0_%=; \
if r1 s< -0x8000 goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 2; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
SEC("socket")
__description("LDSX, S32 range checking")
__success __success_unpriv __retval(1)
__naked void ldsx_s32_range(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
*(u64 *)(r10 - 8) = r0; \
r1 = *(s32 *)(r10 - 8); \
/* r1 with s32 range */ \
if r1 s> 0x7fffFFFF goto l0_%=; \
if r1 s< -0x80000000 goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 2; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
#else
SEC("socket")
__description("cpuv4 is not supported by compiler or jit, use a dummy test")
__success
int dummy_test(void)
{
return 0;
}
#endif
char _license[] SEC("license") = "GPL";
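Likewise for the sign-extending loads above, a C sketch (not part of the patch): on a little-endian host, storing 0x3fe and reloading the low byte as s8 yields -2, matching the "LDSX, S8" expectation.
#include <stdint.h>
/* Sketch only: load the narrow value and sign-extend it to 64 bits. */
static inline int64_t ldsx_s8(const void *p)  { return *(const int8_t *)p; }
static inline int64_t ldsx_s16(const void *p) { return *(const int16_t *)p; }
static inline int64_t ldsx_s32(const void *p) { return *(const int32_t *)p; }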

View File

@ -0,0 +1,213 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
#if defined(__TARGET_ARCH_x86) && __clang_major__ >= 18
SEC("socket")
__description("MOV32SX, S8")
__success __success_unpriv __retval(0x23)
__naked void mov32sx_s8(void)
{
asm volatile (" \
w0 = 0xff23; \
w0 = (s8)w0; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("MOV32SX, S16")
__success __success_unpriv __retval(0xFFFFff23)
__naked void mov32sx_s16(void)
{
asm volatile (" \
w0 = 0xff23; \
w0 = (s16)w0; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("MOV64SX, S8")
__success __success_unpriv __retval(-2)
__naked void mov64sx_s8(void)
{
asm volatile (" \
r0 = 0x1fe; \
r0 = (s8)r0; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("MOV64SX, S16")
__success __success_unpriv __retval(0xf23)
__naked void mov64sx_s16(void)
{
asm volatile (" \
r0 = 0xf0f23; \
r0 = (s16)r0; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("MOV64SX, S32")
__success __success_unpriv __retval(-1)
__naked void mov64sx_s32(void)
{
asm volatile (" \
r0 = 0xfffffffe; \
r0 = (s32)r0; \
r0 >>= 1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("MOV32SX, S8, range_check")
__success __success_unpriv __retval(1)
__naked void mov32sx_s8_range(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
w1 = (s8)w0; \
/* w1 with s8 range */ \
if w1 s> 0x7f goto l0_%=; \
if w1 s< -0x80 goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 2; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
SEC("socket")
__description("MOV32SX, S16, range_check")
__success __success_unpriv __retval(1)
__naked void mov32sx_s16_range(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
w1 = (s16)w0; \
/* w1 with s16 range */ \
if w1 s> 0x7fff goto l0_%=; \
if w1 s< -0x80ff goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 2; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
SEC("socket")
__description("MOV32SX, S16, range_check 2")
__success __success_unpriv __retval(1)
__naked void mov32sx_s16_range_2(void)
{
asm volatile (" \
r1 = 65535; \
w2 = (s16)w1; \
r2 >>= 1; \
if r2 != 0x7fffFFFF goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 0; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
SEC("socket")
__description("MOV64SX, S8, range_check")
__success __success_unpriv __retval(1)
__naked void mov64sx_s8_range(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
r1 = (s8)r0; \
/* r1 with s8 range */ \
if r1 s> 0x7f goto l0_%=; \
if r1 s< -0x80 goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 2; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
SEC("socket")
__description("MOV64SX, S16, range_check")
__success __success_unpriv __retval(1)
__naked void mov64sx_s16_range(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
r1 = (s16)r0; \
/* r1 with s16 range */ \
if r1 s> 0x7fff goto l0_%=; \
if r1 s< -0x8000 goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 2; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
SEC("socket")
__description("MOV64SX, S32, range_check")
__success __success_unpriv __retval(1)
__naked void mov64sx_s32_range(void)
{
asm volatile (" \
call %[bpf_get_prandom_u32]; \
r1 = (s32)r0; \
/* r1 with s32 range */ \
if r1 s> 0x7fffffff goto l0_%=; \
if r1 s< -0x80000000 goto l0_%=; \
r0 = 1; \
l1_%=: \
exit; \
l0_%=: \
r0 = 2; \
goto l1_%=; \
" :
: __imm(bpf_get_prandom_u32)
: __clobber_all);
}
#else
SEC("socket")
__description("cpuv4 is not supported by compiler or jit, use a dummy test")
__success
int dummy_test(void)
{
return 0;
}
#endif
char _license[] SEC("license") = "GPL";
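A C sketch of the cast-style mov instructions (not part of the patch): the 32-bit forms sign-extend within a 32-bit result and zero the upper half, the 64-bit forms sign-extend to the full register, which is why 0xff23 becomes 0x23 under (s8) and 0xFFFFff23 under (s16) in the first two tests.
#include <stdint.h>
/* Sketch only. */
static inline uint64_t mov32sx_s8(uint64_t w)  { return (uint32_t)(int32_t)(int8_t)w; }
static inline uint64_t mov32sx_s16(uint64_t w) { return (uint32_t)(int32_t)(int16_t)w; }
static inline int64_t  mov64sx_s32(uint64_t r) { return (int32_t)r; }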

View File

@ -0,0 +1,781 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
#if defined(__TARGET_ARCH_x86) && __clang_major__ >= 18
SEC("socket")
__description("SDIV32, non-zero imm divisor, check 1")
__success __success_unpriv __retval(-20)
__naked void sdiv32_non_zero_imm_1(void)
{
asm volatile (" \
w0 = -41; \
w0 s/= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero imm divisor, check 2")
__success __success_unpriv __retval(-20)
__naked void sdiv32_non_zero_imm_2(void)
{
asm volatile (" \
w0 = 41; \
w0 s/= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero imm divisor, check 3")
__success __success_unpriv __retval(20)
__naked void sdiv32_non_zero_imm_3(void)
{
asm volatile (" \
w0 = -41; \
w0 s/= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero imm divisor, check 4")
__success __success_unpriv __retval(-21)
__naked void sdiv32_non_zero_imm_4(void)
{
asm volatile (" \
w0 = -42; \
w0 s/= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero imm divisor, check 5")
__success __success_unpriv __retval(-21)
__naked void sdiv32_non_zero_imm_5(void)
{
asm volatile (" \
w0 = 42; \
w0 s/= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero imm divisor, check 6")
__success __success_unpriv __retval(21)
__naked void sdiv32_non_zero_imm_6(void)
{
asm volatile (" \
w0 = -42; \
w0 s/= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero imm divisor, check 7")
__success __success_unpriv __retval(21)
__naked void sdiv32_non_zero_imm_7(void)
{
asm volatile (" \
w0 = 42; \
w0 s/= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero imm divisor, check 8")
__success __success_unpriv __retval(20)
__naked void sdiv32_non_zero_imm_8(void)
{
asm volatile (" \
w0 = 41; \
w0 s/= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero reg divisor, check 1")
__success __success_unpriv __retval(-20)
__naked void sdiv32_non_zero_reg_1(void)
{
asm volatile (" \
w0 = -41; \
w1 = 2; \
w0 s/= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero reg divisor, check 2")
__success __success_unpriv __retval(-20)
__naked void sdiv32_non_zero_reg_2(void)
{
asm volatile (" \
w0 = 41; \
w1 = -2; \
w0 s/= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero reg divisor, check 3")
__success __success_unpriv __retval(20)
__naked void sdiv32_non_zero_reg_3(void)
{
asm volatile (" \
w0 = -41; \
w1 = -2; \
w0 s/= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero reg divisor, check 4")
__success __success_unpriv __retval(-21)
__naked void sdiv32_non_zero_reg_4(void)
{
asm volatile (" \
w0 = -42; \
w1 = 2; \
w0 s/= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero reg divisor, check 5")
__success __success_unpriv __retval(-21)
__naked void sdiv32_non_zero_reg_5(void)
{
asm volatile (" \
w0 = 42; \
w1 = -2; \
w0 s/= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero reg divisor, check 6")
__success __success_unpriv __retval(21)
__naked void sdiv32_non_zero_reg_6(void)
{
asm volatile (" \
w0 = -42; \
w1 = -2; \
w0 s/= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero reg divisor, check 7")
__success __success_unpriv __retval(21)
__naked void sdiv32_non_zero_reg_7(void)
{
asm volatile (" \
w0 = 42; \
w1 = 2; \
w0 s/= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, non-zero reg divisor, check 8")
__success __success_unpriv __retval(20)
__naked void sdiv32_non_zero_reg_8(void)
{
asm volatile (" \
w0 = 41; \
w1 = 2; \
w0 s/= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero imm divisor, check 1")
__success __success_unpriv __retval(-20)
__naked void sdiv64_non_zero_imm_1(void)
{
asm volatile (" \
r0 = -41; \
r0 s/= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero imm divisor, check 2")
__success __success_unpriv __retval(-20)
__naked void sdiv64_non_zero_imm_2(void)
{
asm volatile (" \
r0 = 41; \
r0 s/= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero imm divisor, check 3")
__success __success_unpriv __retval(20)
__naked void sdiv64_non_zero_imm_3(void)
{
asm volatile (" \
r0 = -41; \
r0 s/= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero imm divisor, check 4")
__success __success_unpriv __retval(-21)
__naked void sdiv64_non_zero_imm_4(void)
{
asm volatile (" \
r0 = -42; \
r0 s/= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero imm divisor, check 5")
__success __success_unpriv __retval(-21)
__naked void sdiv64_non_zero_imm_5(void)
{
asm volatile (" \
r0 = 42; \
r0 s/= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero imm divisor, check 6")
__success __success_unpriv __retval(21)
__naked void sdiv64_non_zero_imm_6(void)
{
asm volatile (" \
r0 = -42; \
r0 s/= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero reg divisor, check 1")
__success __success_unpriv __retval(-20)
__naked void sdiv64_non_zero_reg_1(void)
{
asm volatile (" \
r0 = -41; \
r1 = 2; \
r0 s/= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero reg divisor, check 2")
__success __success_unpriv __retval(-20)
__naked void sdiv64_non_zero_reg_2(void)
{
asm volatile (" \
r0 = 41; \
r1 = -2; \
r0 s/= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero reg divisor, check 3")
__success __success_unpriv __retval(20)
__naked void sdiv64_non_zero_reg_3(void)
{
asm volatile (" \
r0 = -41; \
r1 = -2; \
r0 s/= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero reg divisor, check 4")
__success __success_unpriv __retval(-21)
__naked void sdiv64_non_zero_reg_4(void)
{
asm volatile (" \
r0 = -42; \
r1 = 2; \
r0 s/= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero reg divisor, check 5")
__success __success_unpriv __retval(-21)
__naked void sdiv64_non_zero_reg_5(void)
{
asm volatile (" \
r0 = 42; \
r1 = -2; \
r0 s/= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, non-zero reg divisor, check 6")
__success __success_unpriv __retval(21)
__naked void sdiv64_non_zero_reg_6(void)
{
asm volatile (" \
r0 = -42; \
r1 = -2; \
r0 s/= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero imm divisor, check 1")
__success __success_unpriv __retval(-1)
__naked void smod32_non_zero_imm_1(void)
{
asm volatile (" \
w0 = -41; \
w0 s%%= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero imm divisor, check 2")
__success __success_unpriv __retval(1)
__naked void smod32_non_zero_imm_2(void)
{
asm volatile (" \
w0 = 41; \
w0 s%%= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero imm divisor, check 3")
__success __success_unpriv __retval(-1)
__naked void smod32_non_zero_imm_3(void)
{
asm volatile (" \
w0 = -41; \
w0 s%%= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero imm divisor, check 4")
__success __success_unpriv __retval(0)
__naked void smod32_non_zero_imm_4(void)
{
asm volatile (" \
w0 = -42; \
w0 s%%= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero imm divisor, check 5")
__success __success_unpriv __retval(0)
__naked void smod32_non_zero_imm_5(void)
{
asm volatile (" \
w0 = 42; \
w0 s%%= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero imm divisor, check 6")
__success __success_unpriv __retval(0)
__naked void smod32_non_zero_imm_6(void)
{
asm volatile (" \
w0 = -42; \
w0 s%%= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero reg divisor, check 1")
__success __success_unpriv __retval(-1)
__naked void smod32_non_zero_reg_1(void)
{
asm volatile (" \
w0 = -41; \
w1 = 2; \
w0 s%%= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero reg divisor, check 2")
__success __success_unpriv __retval(1)
__naked void smod32_non_zero_reg_2(void)
{
asm volatile (" \
w0 = 41; \
w1 = -2; \
w0 s%%= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero reg divisor, check 3")
__success __success_unpriv __retval(-1)
__naked void smod32_non_zero_reg_3(void)
{
asm volatile (" \
w0 = -41; \
w1 = -2; \
w0 s%%= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero reg divisor, check 4")
__success __success_unpriv __retval(0)
__naked void smod32_non_zero_reg_4(void)
{
asm volatile (" \
w0 = -42; \
w1 = 2; \
w0 s%%= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero reg divisor, check 5")
__success __success_unpriv __retval(0)
__naked void smod32_non_zero_reg_5(void)
{
asm volatile (" \
w0 = 42; \
w1 = -2; \
w0 s%%= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, non-zero reg divisor, check 6")
__success __success_unpriv __retval(0)
__naked void smod32_non_zero_reg_6(void)
{
asm volatile (" \
w0 = -42; \
w1 = -2; \
w0 s%%= w1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero imm divisor, check 1")
__success __success_unpriv __retval(-1)
__naked void smod64_non_zero_imm_1(void)
{
asm volatile (" \
r0 = -41; \
r0 s%%= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero imm divisor, check 2")
__success __success_unpriv __retval(1)
__naked void smod64_non_zero_imm_2(void)
{
asm volatile (" \
r0 = 41; \
r0 s%%= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero imm divisor, check 3")
__success __success_unpriv __retval(-1)
__naked void smod64_non_zero_imm_3(void)
{
asm volatile (" \
r0 = -41; \
r0 s%%= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero imm divisor, check 4")
__success __success_unpriv __retval(0)
__naked void smod64_non_zero_imm_4(void)
{
asm volatile (" \
r0 = -42; \
r0 s%%= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero imm divisor, check 5")
__success __success_unpriv __retval(-0)
__naked void smod64_non_zero_imm_5(void)
{
asm volatile (" \
r0 = 42; \
r0 s%%= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero imm divisor, check 6")
__success __success_unpriv __retval(0)
__naked void smod64_non_zero_imm_6(void)
{
asm volatile (" \
r0 = -42; \
r0 s%%= -2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero imm divisor, check 7")
__success __success_unpriv __retval(0)
__naked void smod64_non_zero_imm_7(void)
{
asm volatile (" \
r0 = 42; \
r0 s%%= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero imm divisor, check 8")
__success __success_unpriv __retval(1)
__naked void smod64_non_zero_imm_8(void)
{
asm volatile (" \
r0 = 41; \
r0 s%%= 2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero reg divisor, check 1")
__success __success_unpriv __retval(-1)
__naked void smod64_non_zero_reg_1(void)
{
asm volatile (" \
r0 = -41; \
r1 = 2; \
r0 s%%= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero reg divisor, check 2")
__success __success_unpriv __retval(1)
__naked void smod64_non_zero_reg_2(void)
{
asm volatile (" \
r0 = 41; \
r1 = -2; \
r0 s%%= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero reg divisor, check 3")
__success __success_unpriv __retval(-1)
__naked void smod64_non_zero_reg_3(void)
{
asm volatile (" \
r0 = -41; \
r1 = -2; \
r0 s%%= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero reg divisor, check 4")
__success __success_unpriv __retval(0)
__naked void smod64_non_zero_reg_4(void)
{
asm volatile (" \
r0 = -42; \
r1 = 2; \
r0 s%%= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero reg divisor, check 5")
__success __success_unpriv __retval(0)
__naked void smod64_non_zero_reg_5(void)
{
asm volatile (" \
r0 = 42; \
r1 = -2; \
r0 s%%= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero reg divisor, check 6")
__success __success_unpriv __retval(0)
__naked void smod64_non_zero_reg_6(void)
{
asm volatile (" \
r0 = -42; \
r1 = -2; \
r0 s%%= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero reg divisor, check 7")
__success __success_unpriv __retval(0)
__naked void smod64_non_zero_reg_7(void)
{
asm volatile (" \
r0 = 42; \
r1 = 2; \
r0 s%%= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, non-zero reg divisor, check 8")
__success __success_unpriv __retval(1)
__naked void smod64_non_zero_reg_8(void)
{
asm volatile (" \
r0 = 41; \
r1 = 2; \
r0 s%%= r1; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV32, zero divisor")
__success __success_unpriv __retval(0)
__naked void sdiv32_zero_divisor(void)
{
asm volatile (" \
w0 = 42; \
w1 = 0; \
w2 = -1; \
w2 s/= w1; \
w0 = w2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SDIV64, zero divisor")
__success __success_unpriv __retval(0)
__naked void sdiv64_zero_divisor(void)
{
asm volatile (" \
r0 = 42; \
r1 = 0; \
r2 = -1; \
r2 s/= r1; \
r0 = r2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD32, zero divisor")
__success __success_unpriv __retval(-1)
__naked void smod32_zero_divisor(void)
{
asm volatile (" \
w0 = 42; \
w1 = 0; \
w2 = -1; \
w2 s%%= w1; \
w0 = w2; \
exit; \
" ::: __clobber_all);
}
SEC("socket")
__description("SMOD64, zero divisor")
__success __success_unpriv __retval(-1)
__naked void smod64_zero_divisor(void)
{
asm volatile (" \
r0 = 42; \
r1 = 0; \
r2 = -1; \
r2 s%%= r1; \
r0 = r2; \
exit; \
" ::: __clobber_all);
}
#else
SEC("socket")
__description("cpuv4 is not supported by compiler or jit, use a dummy test")
__success
int dummy_test(void)
{
return 0;
}
#endif
char _license[] SEC("license") = "GPL";
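Taken together, these tests pin down the cpu v4 signed divide/modulo semantics: truncation toward zero, a remainder that carries the dividend's sign, and defined results for a zero divisor (quotient 0, remainder equal to the dividend). A C sketch, not part of the patch, ignoring the INT64_MIN / -1 overflow corner:
#include <stdint.h>
static inline int64_t bpf_sdiv(int64_t a, int64_t b) { return b ? a / b : 0; }
static inline int64_t bpf_smod(int64_t a, int64_t b) { return b ? a % b : a; }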

View File

@ -176,11 +176,11 @@
.retval = 1,
},
{
"invalid 64-bit BPF_END",
"invalid 64-bit BPF_END with BPF_TO_BE",
.insns = {
BPF_MOV32_IMM(BPF_REG_0, 0),
{
.code = BPF_ALU64 | BPF_END | BPF_TO_LE,
.code = BPF_ALU64 | BPF_END | BPF_TO_BE,
.dst_reg = BPF_REG_0,
.src_reg = 0,
.off = 0,
@ -188,7 +188,7 @@
},
BPF_EXIT_INSN(),
},
.errstr = "unknown opcode d7",
.errstr = "unknown opcode df",
.result = REJECT,
},
{

View File

@ -2076,7 +2076,7 @@ static void init_iface(struct ifobject *ifobj, const char *dst_mac, const char *
err = bpf_xdp_query(ifobj->ifindex, XDP_FLAGS_DRV_MODE, &query_opts);
if (err) {
ksft_print_msg("Error querrying XDP capabilities\n");
ksft_print_msg("Error querying XDP capabilities\n");
exit_with_error(-err);
}
if (query_opts.feature_flags & NETDEV_XDP_ACT_RX_SG)