Probes updates for v6.11:

Uprobes:
 - x86/shstk: Make return uprobe work with shadow stack.
 - Add uretprobe syscall, which makes uretprobes 10-30% faster. This
   syscall is used automatically by the user-space trampolines that
   the uretprobe generates. If a normal user program calls this syscall
   directly, it receives SIGILL. Note that this is currently
   implemented only on x86_64.
   (This also has 2 fixes for adjusting the syscall number to avoid conflict
    with new *attrat syscalls.)
 - uprobes/perf: fix user stack traces in the presence of pending uretprobes.
   This replaces the uretprobe trampoline address in the stack trace with
   the correct return address.
 - selftests/x86: Add a return uprobe with shadow stack test.
 - selftests/bpf: Add uretprobe syscall related tests.
   . test case for register integrity check.
   . test case for register changes.
   . test case for uretprobe syscall without uprobes (expected to fail).
   . test case for uretprobe with shadow stack.
 - selftests/bpf: add test validating uprobe/uretprobe stack traces
 - MAINTAINERS: Add uprobes entry. This does not specify a tree; it only
   clarifies who maintains and reviews uprobes.
 
 Kprobes:
 - tracing/kprobes: Test case cleanups. Replace redundant WARN_ON_ONCE() +
   pr_warn() with WARN_ONCE() and remove unnecessary code from selftest.
 - tracing/kprobes: Add symbol counting check when module loads. This
   checks the uniqueness of the probed symbol in modules. The same check
   is already done for kernel symbols.
   (This also has a fix for build error with CONFIG_MODULES=n)
 
 Cleanup:
 - Add MODULE_DESCRIPTION() macros for fprobe and kprobe examples.

Merge tag 'probes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:
 "Uprobes:

   - x86/shstk: Make return uprobe work with shadow stack

   - Add uretprobe syscall, which makes uretprobes 10-30% faster.
     This syscall is used automatically by the user-space trampolines
     that the uretprobe generates. If a normal user program calls this
     syscall directly, it receives SIGILL. Note that this is currently
     implemented only on x86_64.

     (This also has two fixes for adjusting the syscall number to avoid
     conflict with new *attrat syscalls.)

   - uprobes/perf: fix user stack traces in the presence of pending
     uretprobes. This replaces the uretprobe trampoline address in the
     stack trace with the correct return address

   - selftests/x86: Add a return uprobe with shadow stack test

   - selftests/bpf: Add uretprobe syscall related tests.
      - test case for register integrity check
      - test case for register changes
      - test case for uretprobe syscall without uprobes (expected to fail)
      - test case for uretprobe with shadow stack

   - selftests/bpf: add test validating uprobe/uretprobe stack traces

   - MAINTAINERS: Add uprobes entry. This does not specify a tree; it
     only clarifies who maintains and reviews uprobes

  Kprobes:

   - tracing/kprobes: Test case cleanups.

     Replace redundant WARN_ON_ONCE() + pr_warn() with WARN_ONCE() and
     remove unnecessary code from selftest

   - tracing/kprobes: Add symbol counting check when module loads.

     This checks the uniqueness of the probed symbol in modules. The
     same check is already done for kernel symbols

     (This also has a fix for build error with CONFIG_MODULES=n)

  Cleanup:

   - Add MODULE_DESCRIPTION() macros for fprobe and kprobe examples"

* tag 'probes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  MAINTAINERS: Add uprobes entry
  selftests/bpf: Change uretprobe syscall number in uprobe_syscall test
  uprobe: Change uretprobe syscall scope and number
  tracing/kprobes: Fix build error when find_module() is not available
  tracing/kprobes: Add symbol counting check when module loads
  selftests/bpf: add test validating uprobe/uretprobe stack traces
  perf,uprobes: fix user stack traces in the presence of pending uretprobes
  tracing/kprobe: Remove cleanup code unrelated to selftest
  tracing/kprobe: Integrate test warnings into WARN_ONCE
  selftests/bpf: Add uretprobe shadow stack test
  selftests/bpf: Add uretprobe syscall call from user space test
  selftests/bpf: Add uretprobe syscall test for regs changes
  selftests/bpf: Add uretprobe syscall test for regs integrity
  selftests/x86: Add return uprobe shadow stack test
  uprobe: Add uretprobe syscall to speed up return probe
  uprobe: Wire up uretprobe system call
  x86/shstk: Make return uprobe work with shadow stack
  samples: kprobes: add missing MODULE_DESCRIPTION() macros
  fprobe: add missing MODULE_DESCRIPTION() macro
Linus Torvalds 2024-07-18 12:19:20 -07:00
commit 91bd008d4e
23 changed files with 1320 additions and 92 deletions

View File

@@ -23367,6 +23367,19 @@ F: drivers/mtd/ubi/
 F:	include/linux/mtd/ubi.h
 F:	include/uapi/mtd/ubi-user.h
 
+UPROBES
+M:	Masami Hiramatsu <mhiramat@kernel.org>
+M:	Oleg Nesterov <oleg@redhat.com>
+M:	Peter Zijlstra <peterz@infradead.org>
+L:	linux-kernel@vger.kernel.org
+L:	linux-trace-kernel@vger.kernel.org
+S:	Maintained
+F:	arch/*/include/asm/uprobes.h
+F:	arch/*/kernel/probes/uprobes.c
+F:	arch/*/kernel/uprobes.c
+F:	include/linux/uprobes.h
+F:	kernel/events/uprobes.c
+
 USB "USBNET" DRIVER FRAMEWORK
 M:	Oliver Neukum <oneukum@suse.com>
 L:	netdev@vger.kernel.org

View File

@@ -385,6 +385,7 @@
 460	common	lsm_set_self_attr	sys_lsm_set_self_attr
 461	common	lsm_list_modules	sys_lsm_list_modules
 462	common	mseal			sys_mseal
+467	common	uretprobe		sys_uretprobe
 
 #
 # Due to a historical design error, certain syscalls are numbered differently

View File

@@ -21,6 +21,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clon
 void shstk_free(struct task_struct *p);
 int setup_signal_shadow_stack(struct ksignal *ksig);
 int restore_signal_shadow_stack(void);
+int shstk_update_last_frame(unsigned long val);
+bool shstk_is_enabled(void);
 #else
 static inline long shstk_prctl(struct task_struct *task, int option,
 			       unsigned long arg2) { return -EINVAL; }
@@ -31,6 +33,8 @@ static inline unsigned long shstk_alloc_thread_stack(struct task_struct *p,
 static inline void shstk_free(struct task_struct *p) {}
 static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; }
 static inline int restore_signal_shadow_stack(void) { return 0; }
+static inline int shstk_update_last_frame(unsigned long val) { return 0; }
+static inline bool shstk_is_enabled(void) { return false; }
 #endif /* CONFIG_X86_USER_SHADOW_STACK */
 
 #endif /* __ASSEMBLY__ */

View File

@@ -577,3 +577,19 @@ long shstk_prctl(struct task_struct *task, int option, unsigned long arg2)
 		return wrss_control(true);
 	return -EINVAL;
 }
+
+int shstk_update_last_frame(unsigned long val)
+{
+	unsigned long ssp;
+
+	if (!features_enabled(ARCH_SHSTK_SHSTK))
+		return 0;
+
+	ssp = get_user_shstk_addr();
+	return write_user_shstk_64((u64 __user *)ssp, (u64)val);
+}
+
+bool shstk_is_enabled(void)
+{
+	return features_enabled(ARCH_SHSTK_SHSTK);
+}
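
Why the return uprobe needs shstk_update_last_frame() can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct sim_thread`, `hijack_return_addr()` and `return_is_allowed()` are hypothetical names, and the model reduces each stack to its top return address. With a shadow stack enabled, a return is allowed only when the return address on the regular stack matches the copy on the shadow stack, so rewriting one without the other makes the return fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: each stack is reduced to its top return address. */
struct sim_thread {
	uint64_t ret_addr;	/* return address on the regular stack */
	uint64_t shstk_ret;	/* copy kept on the shadow stack */
	bool shstk_enabled;
};

/*
 * Replace the return address with the trampoline address and return the
 * previous one. When update_shstk is set, the shadow copy is rewritten
 * as well, which is the role shstk_update_last_frame() plays in the patch.
 */
static uint64_t hijack_return_addr(struct sim_thread *t, uint64_t tramp,
				   bool update_shstk)
{
	uint64_t orig = t->ret_addr;

	t->ret_addr = tramp;
	if (t->shstk_enabled && update_shstk)
		t->shstk_ret = tramp;
	return orig;
}

/* A return succeeds if the shadow stack is off or both copies agree. */
static bool return_is_allowed(const struct sim_thread *t)
{
	return !t->shstk_enabled || t->ret_addr == t->shstk_ret;
}

/* Without the shadow stack update the hijacked return would fault. */
static void demo_shstk(void)
{
	struct sim_thread t = { 0x1000, 0x1000, true };

	assert(hijack_return_addr(&t, 0x7000, false) == 0x1000);
	assert(!return_is_allowed(&t));
}
```

The model ignores everything else the hardware checks; it only shows why both copies of the return address have to change together.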

View File

@@ -12,6 +12,7 @@
 #include <linux/ptrace.h>
 #include <linux/uprobes.h>
 #include <linux/uaccess.h>
+#include <linux/syscalls.h>
 
 #include <linux/kdebug.h>
 #include <asm/processor.h>
@@ -308,6 +309,122 @@ static int uprobe_init_insn(struct arch_uprobe *auprobe, struct insn *insn, bool
 }
 
 #ifdef CONFIG_X86_64
+
+asm (
+	".pushsection .rodata\n"
+	".global uretprobe_trampoline_entry\n"
+	"uretprobe_trampoline_entry:\n"
+	"pushq %rax\n"
+	"pushq %rcx\n"
+	"pushq %r11\n"
+	"movq $" __stringify(__NR_uretprobe) ", %rax\n"
+	"syscall\n"
+	".global uretprobe_syscall_check\n"
+	"uretprobe_syscall_check:\n"
+	"popq %r11\n"
+	"popq %rcx\n"
+
+	/* The uretprobe syscall replaces stored %rax value with final
+	 * return address, so we don't restore %rax in here and just
+	 * call ret.
+	 */
+	"retq\n"
+	".global uretprobe_trampoline_end\n"
+	"uretprobe_trampoline_end:\n"
+	".popsection\n"
+);
+
+extern u8 uretprobe_trampoline_entry[];
+extern u8 uretprobe_trampoline_end[];
+extern u8 uretprobe_syscall_check[];
+
+void *arch_uprobe_trampoline(unsigned long *psize)
+{
+	static uprobe_opcode_t insn = UPROBE_SWBP_INSN;
+	struct pt_regs *regs = task_pt_regs(current);
+
+	/*
+	 * At the moment the uretprobe syscall trampoline is supported
+	 * only for native 64-bit process, the compat process still uses
+	 * standard breakpoint.
+	 */
+	if (user_64bit_mode(regs)) {
+		*psize = uretprobe_trampoline_end - uretprobe_trampoline_entry;
+		return uretprobe_trampoline_entry;
+	}
+
+	*psize = UPROBE_SWBP_INSN_SIZE;
+	return &insn;
+}
+
+static unsigned long trampoline_check_ip(void)
+{
+	unsigned long tramp = uprobe_get_trampoline_vaddr();
+
+	return tramp + (uretprobe_syscall_check - uretprobe_trampoline_entry);
+}
+
+SYSCALL_DEFINE0(uretprobe)
+{
+	struct pt_regs *regs = task_pt_regs(current);
+	unsigned long err, ip, sp, r11_cx_ax[3];
+
+	if (regs->ip != trampoline_check_ip())
+		goto sigill;
+
+	err = copy_from_user(r11_cx_ax, (void __user *)regs->sp, sizeof(r11_cx_ax));
+	if (err)
+		goto sigill;
+
+	/* expose the "right" values of r11/cx/ax/sp to uprobe_consumer/s */
+	regs->r11 = r11_cx_ax[0];
+	regs->cx = r11_cx_ax[1];
+	regs->ax = r11_cx_ax[2];
+	regs->sp += sizeof(r11_cx_ax);
+	regs->orig_ax = -1;
+
+	ip = regs->ip;
+	sp = regs->sp;
+
+	uprobe_handle_trampoline(regs);
+
+	/*
+	 * Some of the uprobe consumers has changed sp, we can do nothing,
+	 * just return via iret.
+	 * .. or shadow stack is enabled, in which case we need to skip
+	 * return through the user space stack address.
+	 */
+	if (regs->sp != sp || shstk_is_enabled())
+		return regs->ax;
+	regs->sp -= sizeof(r11_cx_ax);
+
+	/* for the case uprobe_consumer has changed r11/cx */
+	r11_cx_ax[0] = regs->r11;
+	r11_cx_ax[1] = regs->cx;
+
+	/*
+	 * ax register is passed through as return value, so we can use
+	 * its space on stack for ip value and jump to it through the
+	 * trampoline's ret instruction
+	 */
+	r11_cx_ax[2] = regs->ip;
+	regs->ip = ip;
+
+	err = copy_to_user((void __user *)regs->sp, r11_cx_ax, sizeof(r11_cx_ax));
+	if (err)
+		goto sigill;
+
+	/* ensure sysret, see do_syscall_64() */
+	regs->r11 = regs->flags;
+	regs->cx = regs->ip;
+
+	return regs->ax;
+
+sigill:
+	force_sig(SIGILL);
+	return -1;
+}
+
 /*
  * If arch_uprobe->insn doesn't use rip-relative addressing, return
  * immediately. Otherwise, rewrite the instruction so that it accesses
@@ -1076,8 +1193,13 @@ arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs
 		return orig_ret_vaddr;
 
 	nleft = copy_to_user((void __user *)regs->sp, &trampoline_vaddr, rasize);
-	if (likely(!nleft))
+	if (likely(!nleft)) {
+		if (shstk_update_last_frame(trampoline_vaddr)) {
+			force_sig(SIGSEGV);
+			return -1;
+		}
 		return orig_ret_vaddr;
+	}
 
 	if (nleft != rasize) {
 		pr_err("return address clobbered: pid=%d, %%sp=%#lx, %%ip=%#lx\n",
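
The stack handshake between the trampoline and the uretprobe syscall can be followed in a small user-space simulation. This is a sketch under stated assumptions, not the kernel code: `struct sim_regs`, `sim_uretprobe()` and `sim_trampoline_tail()` are hypothetical names, the stack is an array indexed from its top, and the ip check, the shadow stack path and the sysret handling are left out. It shows only how the three saved words are read, and how the slot that held rax is reused to carry the final return address to the trampoline's `retq`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register file; sp indexes the top word of the simulated stack. */
struct sim_regs {
	uint64_t ax, cx, r11;
	uint64_t ip;
	size_t sp;
};

/*
 * Model of the syscall body. The trampoline pushed rax, rcx and r11, so the
 * words at the top of the stack are r11, cx, ax in that order. orig_ret
 * stands in for the address the uretprobe handlers would resolve.
 */
static void sim_uretprobe(uint64_t *stack, struct sim_regs *regs,
			  uint64_t orig_ret)
{
	uint64_t r11_cx_ax[3];

	memcpy(r11_cx_ax, &stack[regs->sp], sizeof(r11_cx_ax));
	regs->r11 = r11_cx_ax[0];
	regs->cx = r11_cx_ax[1];
	regs->ax = r11_cx_ax[2];

	/* Stand-in for the handlers: the real return address is now known. */
	regs->ip = orig_ret;

	/* Write r11/cx back and reuse the ax slot for the return address. */
	r11_cx_ax[0] = regs->r11;
	r11_cx_ax[1] = regs->cx;
	r11_cx_ax[2] = regs->ip;
	memcpy(&stack[regs->sp], r11_cx_ax, sizeof(r11_cx_ax));
}

/* Model of the trampoline tail: popq %r11; popq %rcx; retq. */
static uint64_t sim_trampoline_tail(const uint64_t *stack, struct sim_regs *regs)
{
	regs->r11 = stack[regs->sp++];
	regs->cx = stack[regs->sp++];
	regs->ip = stack[regs->sp++];
	return regs->ip;
}

/* The saved rax travels back in regs->ax; its stack slot carries the ip. */
static void demo_uretprobe_stack(void)
{
	uint64_t stack[4] = { 0x11, 0x22, 0x33, 0x99 };
	struct sim_regs regs = { 0 };

	sim_uretprobe(stack, &regs, 0x401000);
	assert(regs.ax == 0x33);
	assert(sim_trampoline_tail(stack, &regs) == 0x401000);
}
```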

View File

@@ -979,6 +979,8 @@ asmlinkage long sys_lsm_list_modules(u64 __user *ids, u32 __user *size, u32 flag
 /* x86 */
 asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int on);
 
+asmlinkage long sys_uretprobe(void);
+
 /* pciconfig: alpha, arm, arm64, ia64, sparc */
 asmlinkage long sys_pciconfig_read(unsigned long bus, unsigned long dfn,
 				unsigned long off, unsigned long len,

View File

@@ -138,6 +138,9 @@ extern bool arch_uretprobe_is_alive(struct return_instance *ret, enum rp_check c
 extern bool arch_uprobe_ignore(struct arch_uprobe *aup, struct pt_regs *regs);
 extern void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
 					 void *src, unsigned long len);
+extern void uprobe_handle_trampoline(struct pt_regs *regs);
+extern void *arch_uprobe_trampoline(unsigned long *psize);
+extern unsigned long uprobe_get_trampoline_vaddr(void);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };

View File

@@ -841,8 +841,11 @@ __SYSCALL(__NR_lsm_list_modules, sys_lsm_list_modules)
 #define __NR_mseal 462
 __SYSCALL(__NR_mseal, sys_mseal)
 
+#define __NR_uretprobe 463
+__SYSCALL(__NR_uretprobe, sys_uretprobe)
+
 #undef __NR_syscalls
-#define __NR_syscalls 463
+#define __NR_syscalls 464
 
 /*
  * 32 bit systems traditionally used different

View File

@@ -11,6 +11,7 @@
 #include <linux/perf_event.h>
 #include <linux/slab.h>
 #include <linux/sched/task_stack.h>
+#include <linux/uprobes.h>
 
 #include "internal.h"
 
@@ -176,13 +177,51 @@ put_callchain_entry(int rctx)
 	put_recursion_context(this_cpu_ptr(callchain_recursion), rctx);
 }
 
+static void fixup_uretprobe_trampoline_entries(struct perf_callchain_entry *entry,
+					       int start_entry_idx)
+{
+#ifdef CONFIG_UPROBES
+	struct uprobe_task *utask = current->utask;
+	struct return_instance *ri;
+	__u64 *cur_ip, *last_ip, tramp_addr;
+
+	if (likely(!utask || !utask->return_instances))
+		return;
+
+	cur_ip = &entry->ip[start_entry_idx];
+	last_ip = &entry->ip[entry->nr - 1];
+	ri = utask->return_instances;
+	tramp_addr = uprobe_get_trampoline_vaddr();
+
+	/*
+	 * If there are pending uretprobes for the current thread, they are
+	 * recorded in a list inside utask->return_instances; each such
+	 * pending uretprobe replaces traced user function's return address on
+	 * the stack, so when stack trace is captured, instead of seeing
+	 * actual function's return address, we'll have one or many uretprobe
+	 * trampoline addresses in the stack trace, which are not helpful and
+	 * misleading to users.
+	 * So here we go over the pending list of uretprobes, and each
+	 * encountered trampoline address is replaced with actual return
+	 * address.
+	 */
+	while (ri && cur_ip <= last_ip) {
+		if (*cur_ip == tramp_addr) {
+			*cur_ip = ri->orig_ret_vaddr;
+			ri = ri->next;
+		}
+		cur_ip++;
+	}
+#endif
+}
+
 struct perf_callchain_entry *
 get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
 		   u32 max_stack, bool crosstask, bool add_mark)
 {
 	struct perf_callchain_entry *entry;
 	struct perf_callchain_entry_ctx ctx;
-	int rctx;
+	int rctx, start_entry_idx;
 
 	entry = get_callchain_entry(&rctx);
 	if (!entry)
@@ -215,7 +254,9 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
 			if (add_mark)
 				perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
 
+			start_entry_idx = entry->nr;
 			perf_callchain_user(&ctx, regs);
+			fixup_uretprobe_trampoline_entries(entry, start_entry_idx);
 		}
 	}
 
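
The stack trace fix-up can be tried outside the kernel with a simplified model. The following is a minimal sketch, assuming a hypothetical `struct pending_ret` in place of `struct return_instance` and made-up addresses; `fixup_trampoline_entries()` is an illustrative name, not a kernel function. It walks the captured user entries and swaps each trampoline address for the original return address of the next pending uretprobe, innermost first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct return_instance. */
struct pending_ret {
	uint64_t orig_ret_vaddr;		/* return address the uretprobe replaced */
	const struct pending_ret *next;		/* next pending uretprobe, one frame out */
};

/*
 * Walk ips[start..nr-1] and replace every occurrence of tramp_addr with the
 * original return address of the next pending uretprobe, innermost first.
 * Returns the number of entries that were rewritten.
 */
static size_t fixup_trampoline_entries(uint64_t *ips, size_t nr, size_t start,
				       uint64_t tramp_addr,
				       const struct pending_ret *ri)
{
	size_t fixed = 0;

	for (size_t i = start; ri && i < nr; i++) {
		if (ips[i] == tramp_addr) {
			ips[i] = ri->orig_ret_vaddr;
			ri = ri->next;
			fixed++;
		}
	}
	return fixed;
}

/* Two nested uretprobes are pending; entry 0 precedes the user entries. */
static void demo_fixup(void)
{
	const struct pending_ret outer = { 0x4000, NULL };
	const struct pending_ret inner = { 0x3000, &outer };
	uint64_t ips[] = { 0xfffffe00, 0x1000, 0x7000, 0x2000, 0x7000 };

	assert(fixup_trampoline_entries(ips, 5, 1, 0x7000, &inner) == 2);
	assert(ips[2] == 0x3000 && ips[4] == 0x4000);
}
```

Passing a start index mirrors how the patch records `entry->nr` before the user callchain is collected, so that entries stored earlier are never rewritten.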

View File

@@ -1474,11 +1474,20 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
 	return ret;
 }
 
+void * __weak arch_uprobe_trampoline(unsigned long *psize)
+{
+	static uprobe_opcode_t insn = UPROBE_SWBP_INSN;
+
+	*psize = UPROBE_SWBP_INSN_SIZE;
+	return &insn;
+}
+
 static struct xol_area *__create_xol_area(unsigned long vaddr)
 {
 	struct mm_struct *mm = current->mm;
-	uprobe_opcode_t insn = UPROBE_SWBP_INSN;
+	unsigned long insns_size;
 	struct xol_area *area;
+	void *insns;
 
 	area = kmalloc(sizeof(*area), GFP_KERNEL);
 	if (unlikely(!area))
@@ -1502,7 +1511,8 @@ static struct xol_area *__create_xol_area(unsigned long vaddr)
 	/* Reserve the 1st slot for get_trampoline_vaddr() */
 	set_bit(0, area->bitmap);
 	atomic_set(&area->slot_count, 1);
-	arch_uprobe_copy_ixol(area->pages[0], 0, &insn, UPROBE_SWBP_INSN_SIZE);
+	insns = arch_uprobe_trampoline(&insns_size);
+	arch_uprobe_copy_ixol(area->pages[0], 0, insns, insns_size);
 
 	if (!xol_add_vma(mm, area))
 		return area;
@@ -1827,7 +1837,7 @@ void uprobe_copy_process(struct task_struct *t, unsigned long flags)
  *
  * Returns -1 in case the xol_area is not allocated.
  */
-static unsigned long get_trampoline_vaddr(void)
+unsigned long uprobe_get_trampoline_vaddr(void)
 {
 	struct xol_area *area;
 	unsigned long trampoline_vaddr = -1;
@@ -1878,7 +1888,7 @@ static void prepare_uretprobe(struct uprobe *uprobe, struct pt_regs *regs)
 	if (!ri)
 		return;
 
-	trampoline_vaddr = get_trampoline_vaddr();
+	trampoline_vaddr = uprobe_get_trampoline_vaddr();
 	orig_ret_vaddr = arch_uretprobe_hijack_return_addr(trampoline_vaddr, regs);
 	if (orig_ret_vaddr == -1)
 		goto fail;
@@ -2123,7 +2133,7 @@ static struct return_instance *find_next_ret_chain(struct return_instance *ri)
 	return ri;
 }
 
-static void handle_trampoline(struct pt_regs *regs)
+void uprobe_handle_trampoline(struct pt_regs *regs)
 {
 	struct uprobe_task *utask;
 	struct return_instance *ri, *next;
@@ -2149,6 +2159,15 @@ static void handle_trampoline(struct pt_regs *regs)
 	instruction_pointer_set(regs, ri->orig_ret_vaddr);
 
 	do {
+		/* pop current instance from the stack of pending return instances,
+		 * as it's not pending anymore: we just fixed up original
+		 * instruction pointer in regs and are about to call handlers;
+		 * this allows fixup_uretprobe_trampoline_entries() to properly fix up
+		 * captured stack traces from uretprobe handlers, in which pending
+		 * trampoline addresses on the stack are replaced with correct
+		 * original return addresses
+		 */
+		utask->return_instances = ri->next;
 		if (valid)
 			handle_uretprobe_chain(ri, regs);
 		ri = free_ret_instance(ri);
@@ -2187,8 +2206,8 @@ static void handle_swbp(struct pt_regs *regs)
 	int is_swbp;
 
 	bp_vaddr = uprobe_get_swbp_addr(regs);
-	if (bp_vaddr == get_trampoline_vaddr())
-		return handle_trampoline(regs);
+	if (bp_vaddr == uprobe_get_trampoline_vaddr())
+		return uprobe_handle_trampoline(regs);
 
 	uprobe = find_active_uprobe(bp_vaddr, &is_swbp);
 	if (!uprobe) {
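
The generic code now asks the architecture for the trampoline bytes and their size instead of hard-coding a single breakpoint instruction. A minimal user-space sketch of that pattern follows, assuming GCC or Clang for `__attribute__((weak))`; `trampoline_insns()` and `install_trampoline()` are hypothetical names, and `0xcc` is the x86 `int3` opcode used as the default. A strong definition of `trampoline_insns()` in another object file would replace the weak default at link time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SWBP_INSN 0xcc	/* x86 int3 breakpoint opcode */

/*
 * Weak default: the trampoline is a single breakpoint instruction.
 * An architecture-specific object may provide a strong definition that
 * returns a longer instruction sequence and its size instead.
 */
__attribute__((weak)) const void *trampoline_insns(size_t *psize)
{
	static const uint8_t insn = SWBP_INSN;

	*psize = sizeof(insn);
	return &insn;
}

/*
 * Copy whatever trampoline the (possibly overridden) provider returns into
 * the reserved slot. Returns the number of bytes written, 0 if it won't fit.
 */
static size_t install_trampoline(uint8_t *slot, size_t slot_size)
{
	size_t size = 0;
	const void *insns = trampoline_insns(&size);

	if (size > slot_size)
		return 0;
	memcpy(slot, insns, size);
	return size;
}

/* With only the weak default linked in, one int3 byte is installed. */
static void demo_install(void)
{
	uint8_t slot[16] = { 0 };

	assert(install_trampoline(slot, sizeof(slot)) == 1);
	assert(slot[0] == SWBP_INSN);
}
```

Returning the size through an out parameter keeps the caller independent of how long the trampoline is.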

View File

@@ -390,3 +390,5 @@ COND_SYSCALL(setuid16);
 
 /* restartable sequence */
 COND_SYSCALL(rseq);
+
+COND_SYSCALL(uretprobe);

View File

@@ -678,6 +678,21 @@ end:
 }
 
 #ifdef CONFIG_MODULES
+static int validate_module_probe_symbol(const char *modname, const char *symbol);
+
+static int register_module_trace_kprobe(struct module *mod, struct trace_kprobe *tk)
+{
+	const char *p;
+	int ret = 0;
+
+	p = strchr(trace_kprobe_symbol(tk), ':');
+	if (p)
+		ret = validate_module_probe_symbol(module_name(mod), p + 1);
+	if (!ret)
+		ret = __register_trace_kprobe(tk);
+	return ret;
+}
+
 /* Module notifier call back, checking event on the module */
 static int trace_kprobe_module_callback(struct notifier_block *nb,
 				       unsigned long val, void *data)
@@ -696,7 +711,7 @@ static int trace_kprobe_module_callback(struct notifier_block *nb,
 		if (trace_kprobe_within_module(tk, mod)) {
 			/* Don't need to check busy - this should have gone. */
 			__unregister_trace_kprobe(tk);
-			ret = __register_trace_kprobe(tk);
+			ret = register_module_trace_kprobe(mod, tk);
 			if (ret)
 				pr_warn("Failed to re-register probe %s on %s: %d\n",
 					trace_probe_name(&tk->tp),
@@ -747,17 +762,81 @@ static int count_mod_symbols(void *data, const char *name, unsigned long unused)
 	return 0;
 }
 
-static unsigned int number_of_same_symbols(char *func_name)
+static unsigned int number_of_same_symbols(const char *mod, const char *func_name)
 {
 	struct sym_count_ctx ctx = { .count = 0, .name = func_name };
 
-	kallsyms_on_each_match_symbol(count_symbols, func_name, &ctx.count);
+	if (!mod)
+		kallsyms_on_each_match_symbol(count_symbols, func_name, &ctx.count);
 
-	module_kallsyms_on_each_symbol(NULL, count_mod_symbols, &ctx);
+	module_kallsyms_on_each_symbol(mod, count_mod_symbols, &ctx);
 
 	return ctx.count;
 }
 
+static int validate_module_probe_symbol(const char *modname, const char *symbol)
+{
+	unsigned int count = number_of_same_symbols(modname, symbol);
+
+	if (count > 1) {
+		/*
+		 * Users should use ADDR to remove the ambiguity of
+		 * using KSYM only.
+		 */
+		return -EADDRNOTAVAIL;
+	} else if (count == 0) {
+		/*
+		 * We can return ENOENT earlier than when register the
+		 * kprobe.
+		 */
+		return -ENOENT;
+	}
+	return 0;
+}
+
+#ifdef CONFIG_MODULES
+/* Return NULL if the module is not loaded or under unloading. */
+static struct module *try_module_get_by_name(const char *name)
+{
+	struct module *mod;
+
+	rcu_read_lock_sched();
+	mod = find_module(name);
+	if (mod && !try_module_get(mod))
+		mod = NULL;
+	rcu_read_unlock_sched();
+
+	return mod;
+}
+#else
+#define try_module_get_by_name(name) (NULL)
+#endif
+
+static int validate_probe_symbol(char *symbol)
+{
+	struct module *mod = NULL;
+	char *modname = NULL, *p;
+	int ret = 0;
+
+	p = strchr(symbol, ':');
+	if (p) {
+		modname = symbol;
+		symbol = p + 1;
+		*p = '\0';
+		mod = try_module_get_by_name(modname);
+		if (!mod)
+			goto out;
+	}
+
+	ret = validate_module_probe_symbol(modname, symbol);
+out:
+	if (p)
+		*p = ':';
+	if (mod)
+		module_put(mod);
+	return ret;
+}
+
 static int trace_kprobe_entry_handler(struct kretprobe_instance *ri,
 				      struct pt_regs *regs);
 
@@ -881,6 +960,14 @@ static int __trace_kprobe_create(int argc, const char *argv[])
 			trace_probe_log_err(0, BAD_PROBE_ADDR);
 			goto parse_error;
 		}
+		ret = validate_probe_symbol(symbol);
+		if (ret) {
+			if (ret == -EADDRNOTAVAIL)
+				trace_probe_log_err(0, NON_UNIQ_SYMBOL);
+			else
+				trace_probe_log_err(0, BAD_PROBE_ADDR);
+			goto parse_error;
+		}
 		if (is_return)
 			ctx.flags |= TPARG_FL_RETURN;
 		ret = kprobe_on_func_entry(NULL, symbol, offset);
@@ -893,31 +980,6 @@ static int __trace_kprobe_create(int argc, const char *argv[])
 		}
 	}
 
-	if (symbol && !strchr(symbol, ':')) {
-		unsigned int count;
-
-		count = number_of_same_symbols(symbol);
-		if (count > 1) {
-			/*
-			 * Users should use ADDR to remove the ambiguity of
-			 * using KSYM only.
-			 */
-			trace_probe_log_err(0, NON_UNIQ_SYMBOL);
-			ret = -EADDRNOTAVAIL;
-
-			goto error;
-		} else if (count == 0) {
-			/*
-			 * We can return ENOENT earlier than when register the
-			 * kprobe.
-			 */
-			trace_probe_log_err(0, BAD_PROBE_ADDR);
-			ret = -ENOENT;
-
-			goto error;
-		}
-	}
-
 	trace_probe_log_set_index(0);
 	if (event) {
 		ret = traceprobe_parse_event_name(&event, &group, gbuf,
@@ -1835,21 +1897,9 @@ create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
 	char *event;
 
 	if (func) {
-		unsigned int count;
-
-		count = number_of_same_symbols(func);
-		if (count > 1)
-			/*
-			 * Users should use addr to remove the ambiguity of
-			 * using func only.
-			 */
-			return ERR_PTR(-EADDRNOTAVAIL);
-		else if (count == 0)
-			/*
-			 * We can return ENOENT earlier than when register the
-			 * kprobe.
-			 */
-			return ERR_PTR(-ENOENT);
+		ret = validate_probe_symbol(func);
+		if (ret)
+			return ERR_PTR(ret);
 	}
 
 	/*
@@ -2023,19 +2073,16 @@ static __init int kprobe_trace_self_tests_init(void)
 	pr_info("Testing kprobe tracing: ");
 
 	ret = create_or_delete_trace_kprobe("p:testprobe kprobe_trace_selftest_target $stack $stack0 +0($stack)");
-	if (WARN_ON_ONCE(ret)) {
-		pr_warn("error on probing function entry.\n");
+	if (WARN_ONCE(ret, "error on probing function entry.")) {
 		warn++;
 	} else {
 		/* Enable trace point */
 		tk = find_trace_kprobe("testprobe", KPROBE_EVENT_SYSTEM);
-		if (WARN_ON_ONCE(tk == NULL)) {
-			pr_warn("error on getting new probe.\n");
+		if (WARN_ONCE(tk == NULL, "error on probing function entry.")) {
 			warn++;
 		} else {
 			file = find_trace_probe_file(tk, top_trace_array());
-			if (WARN_ON_ONCE(file == NULL)) {
-				pr_warn("error on getting probe file.\n");
+			if (WARN_ONCE(file == NULL, "error on getting probe file.")) {
 				warn++;
 			} else
 				enable_trace_kprobe(
@@ -2044,19 +2091,16 @@ static __init int kprobe_trace_self_tests_init(void)
 	}
 
 	ret = create_or_delete_trace_kprobe("r:testprobe2 kprobe_trace_selftest_target $retval");
-	if (WARN_ON_ONCE(ret)) {
-		pr_warn("error on probing function return.\n");
+	if (WARN_ONCE(ret, "error on probing function return.")) {
 		warn++;
 	} else {
 		/* Enable trace point */
 		tk = find_trace_kprobe("testprobe2", KPROBE_EVENT_SYSTEM);
-		if (WARN_ON_ONCE(tk == NULL)) {
-			pr_warn("error on getting 2nd new probe.\n");
+		if (WARN_ONCE(tk == NULL, "error on getting 2nd new probe.")) {
 			warn++;
 		} else {
 			file = find_trace_probe_file(tk, top_trace_array());
-			if (WARN_ON_ONCE(file == NULL)) {
-				pr_warn("error on getting probe file.\n");
+			if (WARN_ONCE(file == NULL, "error on getting probe file.")) {
 				warn++;
 			} else
 				enable_trace_kprobe(
@@ -2079,18 +2123,15 @@ static __init int kprobe_trace_self_tests_init(void)
 
 	/* Disable trace points before removing it */
 	tk = find_trace_kprobe("testprobe", KPROBE_EVENT_SYSTEM);
-	if (WARN_ON_ONCE(tk == NULL)) {
-		pr_warn("error on getting test probe.\n");
+	if (WARN_ONCE(tk == NULL, "error on getting test probe.")) {
 		warn++;
 	} else {
-		if (trace_kprobe_nhit(tk) != 1) {
-			pr_warn("incorrect number of testprobe hits\n");
+		if (WARN_ONCE(trace_kprobe_nhit(tk) != 1,
+				 "incorrect number of testprobe hits."))
 			warn++;
-		}
 
 		file = find_trace_probe_file(tk, top_trace_array());
-		if (WARN_ON_ONCE(file == NULL)) {
-			pr_warn("error on getting probe file.\n");
+		if (WARN_ONCE(file == NULL, "error on getting probe file.")) {
 			warn++;
 		} else
 			disable_trace_kprobe(
@@ -2098,18 +2139,15 @@ static __init int kprobe_trace_self_tests_init(void)
 	}
 
 	tk = find_trace_kprobe("testprobe2", KPROBE_EVENT_SYSTEM);
-	if (WARN_ON_ONCE(tk == NULL)) {
-		pr_warn("error on getting 2nd test probe.\n");
+	if (WARN_ONCE(tk == NULL, "error on getting 2nd test probe.")) {
 		warn++;
 	} else {
-		if (trace_kprobe_nhit(tk) != 1) {
-			pr_warn("incorrect number of testprobe2 hits\n");
+		if (WARN_ONCE(trace_kprobe_nhit(tk) != 1,
+				 "incorrect number of testprobe2 hits."))
 			warn++;
-		}
 
 		file = find_trace_probe_file(tk, top_trace_array());
-		if (WARN_ON_ONCE(file == NULL)) {
-			pr_warn("error on getting probe file.\n");
+		if (WARN_ONCE(file == NULL, "error on getting probe file.")) {
 			warn++;
 		} else
 			disable_trace_kprobe(
@@ -2117,23 +2155,15 @@ static __init int kprobe_trace_self_tests_init(void)
 	}
 
 	ret = create_or_delete_trace_kprobe("-:testprobe");
-	if (WARN_ON_ONCE(ret)) {
-		pr_warn("error on deleting a probe.\n");
+	if (WARN_ONCE(ret, "error on deleting a probe."))
 		warn++;
-	}
 
 	ret = create_or_delete_trace_kprobe("-:testprobe2");
-	if (WARN_ON_ONCE(ret)) {
-		pr_warn("error on deleting a probe.\n");
+	if (WARN_ONCE(ret, "error on deleting a probe."))
 		warn++;
-	}
+
 
 end:
-	ret = dyn_events_release_all(&trace_kprobe_ops);
-	if (WARN_ON_ONCE(ret)) {
-		pr_warn("error on cleaning up probes.\n");
-		warn++;
-	}
 	/*
 	 * Wait for the optimizer work to finish. Otherwise it might fiddle
 	 * with probes in already freed __init text.
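
The symbol check that this file now shares between kernel and module symbols comes down to counting name matches and mapping the count to an error code. Below is a minimal user-space sketch, assuming a flat array of names as a hypothetical stand-in for kallsyms; `count_same_symbols()` and `validate_symbol()` are illustrative names, and only the two error codes are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Count how many entries of a (hypothetical) flat symbol table match name. */
static unsigned int count_same_symbols(const char *const *symtab, size_t n,
				       const char *name)
{
	unsigned int count = 0;

	for (size_t i = 0; i < n; i++)
		if (strcmp(symtab[i], name) == 0)
			count++;
	return count;
}

/*
 * Map the match count to a result: an ambiguous name is rejected with
 * -EADDRNOTAVAIL, a missing name with -ENOENT, a unique name returns 0.
 */
static int validate_symbol(const char *const *symtab, size_t n,
			   const char *name)
{
	unsigned int count = count_same_symbols(symtab, n, name);

	if (count > 1)
		return -EADDRNOTAVAIL;
	if (count == 0)
		return -ENOENT;
	return 0;
}

/* "foo" appears twice, so a probe on it would be ambiguous. */
static void demo_validate(void)
{
	const char *const tab[] = { "foo", "bar", "foo", "baz" };

	assert(validate_symbol(tab, 4, "bar") == 0);
	assert(validate_symbol(tab, 4, "foo") == -EADDRNOTAVAIL);
}
```

Rejecting an ambiguous name before registration is what lets the caller report a non-unique symbol instead of silently probing one of the matches.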

View File

@@ -150,4 +150,5 @@ static void __exit fprobe_exit(void)
 
 module_init(fprobe_init)
 module_exit(fprobe_exit)
+MODULE_DESCRIPTION("sample kernel module showing the use of fprobe");
 MODULE_LICENSE("GPL");

View File

@@ -125,4 +125,5 @@ static void __exit kprobe_exit(void)
 
 module_init(kprobe_init)
 module_exit(kprobe_exit)
+MODULE_DESCRIPTION("sample kernel module showing the use of kprobes");
 MODULE_LICENSE("GPL");

View File

@@ -104,4 +104,5 @@ static void __exit kretprobe_exit(void)
 
 module_init(kretprobe_init)
 module_exit(kretprobe_exit)
+MODULE_DESCRIPTION("sample kernel module showing the use of return probes");
 MODULE_LICENSE("GPL");


@@ -62,6 +62,10 @@
 #define __nocf_check __attribute__((nocf_check))
 #endif
+#ifndef __naked
+#define __naked __attribute__((__naked__))
+#endif
 /* Are two types/vars the same type (ignoring qualifiers)? */
 #ifndef __same_type
 # define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))


@@ -18,6 +18,7 @@
 #include <linux/in6.h>
 #include <linux/un.h>
 #include <net/sock.h>
+#include <linux/namei.h>
 #include "bpf_testmod.h"
 #include "bpf_testmod_kfunc.h"
@@ -413,6 +414,119 @@ static struct bin_attribute bin_attr_bpf_testmod_file __ro_after_init = {
 .write = bpf_testmod_test_write,
 };
+/* bpf_testmod_uprobe sysfs attribute is so far enabled for x86_64 only,
+ * please see test_uretprobe_regs_change test
+ */
+#ifdef __x86_64__
+static int
+uprobe_ret_handler(struct uprobe_consumer *self, unsigned long func,
+struct pt_regs *regs)
+{
+regs->ax = 0x12345678deadbeef;
+regs->cx = 0x87654321feebdaed;
+regs->r11 = (u64) -1;
+return true;
+}
+struct testmod_uprobe {
+struct path path;
+loff_t offset;
+struct uprobe_consumer consumer;
+};
+static DEFINE_MUTEX(testmod_uprobe_mutex);
+static struct testmod_uprobe uprobe = {
+.consumer.ret_handler = uprobe_ret_handler,
+};
+static int testmod_register_uprobe(loff_t offset)
+{
+int err = -EBUSY;
+if (uprobe.offset)
+return -EBUSY;
+mutex_lock(&testmod_uprobe_mutex);
+if (uprobe.offset)
+goto out;
+err = kern_path("/proc/self/exe", LOOKUP_FOLLOW, &uprobe.path);
+if (err)
+goto out;
+err = uprobe_register_refctr(d_real_inode(uprobe.path.dentry),
+offset, 0, &uprobe.consumer);
+if (err)
+path_put(&uprobe.path);
+else
+uprobe.offset = offset;
+out:
+mutex_unlock(&testmod_uprobe_mutex);
+return err;
+}
+static void testmod_unregister_uprobe(void)
+{
+mutex_lock(&testmod_uprobe_mutex);
+if (uprobe.offset) {
+uprobe_unregister(d_real_inode(uprobe.path.dentry),
+uprobe.offset, &uprobe.consumer);
+uprobe.offset = 0;
+}
+mutex_unlock(&testmod_uprobe_mutex);
+}
+static ssize_t
+bpf_testmod_uprobe_write(struct file *file, struct kobject *kobj,
+struct bin_attribute *bin_attr,
+char *buf, loff_t off, size_t len)
+{
+unsigned long offset = 0;
+int err = 0;
+if (kstrtoul(buf, 0, &offset))
+return -EINVAL;
+if (offset)
+err = testmod_register_uprobe(offset);
+else
+testmod_unregister_uprobe();
+return err ?: strlen(buf);
+}
+static struct bin_attribute bin_attr_bpf_testmod_uprobe_file __ro_after_init = {
+.attr = { .name = "bpf_testmod_uprobe", .mode = 0666, },
+.write = bpf_testmod_uprobe_write,
+};
+static int register_bpf_testmod_uprobe(void)
+{
+return sysfs_create_bin_file(kernel_kobj, &bin_attr_bpf_testmod_uprobe_file);
+}
+static void unregister_bpf_testmod_uprobe(void)
+{
+testmod_unregister_uprobe();
+sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_uprobe_file);
+}
+#else
+static int register_bpf_testmod_uprobe(void)
+{
+return 0;
+}
+static void unregister_bpf_testmod_uprobe(void) { }
+#endif
 BTF_KFUNCS_START(bpf_testmod_common_kfunc_ids)
 BTF_ID_FLAGS(func, bpf_iter_testmod_seq_new, KF_ITER_NEW)
 BTF_ID_FLAGS(func, bpf_iter_testmod_seq_next, KF_ITER_NEXT | KF_RET_NULL)
@@ -983,7 +1097,13 @@ static int bpf_testmod_init(void)
 return -EINVAL;
 sock = NULL;
 mutex_init(&sock_lock);
-return sysfs_create_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
+ret = sysfs_create_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
+if (ret < 0)
+return ret;
+ret = register_bpf_testmod_uprobe();
+if (ret < 0)
+return ret;
+return 0;
 }
 static void bpf_testmod_exit(void)
@@ -998,6 +1118,7 @@ static void bpf_testmod_exit(void)
 bpf_kfunc_close_sock();
 sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
+unregister_bpf_testmod_uprobe();
 }
 module_init(bpf_testmod_init);


@@ -0,0 +1,385 @@
// SPDX-License-Identifier: GPL-2.0
#include <test_progs.h>
#ifdef __x86_64__
#include <unistd.h>
#include <asm/ptrace.h>
#include <linux/compiler.h>
#include <linux/stringify.h>
#include <sys/wait.h>
#include <sys/syscall.h>
#include <sys/prctl.h>
#include <asm/prctl.h>
#include "uprobe_syscall.skel.h"
#include "uprobe_syscall_executed.skel.h"
__naked unsigned long uretprobe_regs_trigger(void)
{
asm volatile (
"movq $0xdeadbeef, %rax\n"
"ret\n"
);
}
__naked void uretprobe_regs(struct pt_regs *before, struct pt_regs *after)
{
asm volatile (
"movq %r15, 0(%rdi)\n"
"movq %r14, 8(%rdi)\n"
"movq %r13, 16(%rdi)\n"
"movq %r12, 24(%rdi)\n"
"movq %rbp, 32(%rdi)\n"
"movq %rbx, 40(%rdi)\n"
"movq %r11, 48(%rdi)\n"
"movq %r10, 56(%rdi)\n"
"movq %r9, 64(%rdi)\n"
"movq %r8, 72(%rdi)\n"
"movq %rax, 80(%rdi)\n"
"movq %rcx, 88(%rdi)\n"
"movq %rdx, 96(%rdi)\n"
"movq %rsi, 104(%rdi)\n"
"movq %rdi, 112(%rdi)\n"
"movq $0, 120(%rdi)\n" /* orig_rax */
"movq $0, 128(%rdi)\n" /* rip */
"movq $0, 136(%rdi)\n" /* cs */
"pushf\n"
"pop %rax\n"
"movq %rax, 144(%rdi)\n" /* eflags */
"movq %rsp, 152(%rdi)\n" /* rsp */
"movq $0, 160(%rdi)\n" /* ss */
/* save 2nd argument */
"pushq %rsi\n"
"call uretprobe_regs_trigger\n"
/* save return value and load 2nd argument pointer to rax */
"pushq %rax\n"
"movq 8(%rsp), %rax\n"
"movq %r15, 0(%rax)\n"
"movq %r14, 8(%rax)\n"
"movq %r13, 16(%rax)\n"
"movq %r12, 24(%rax)\n"
"movq %rbp, 32(%rax)\n"
"movq %rbx, 40(%rax)\n"
"movq %r11, 48(%rax)\n"
"movq %r10, 56(%rax)\n"
"movq %r9, 64(%rax)\n"
"movq %r8, 72(%rax)\n"
"movq %rcx, 88(%rax)\n"
"movq %rdx, 96(%rax)\n"
"movq %rsi, 104(%rax)\n"
"movq %rdi, 112(%rax)\n"
"movq $0, 120(%rax)\n" /* orig_rax */
"movq $0, 128(%rax)\n" /* rip */
"movq $0, 136(%rax)\n" /* cs */
/* restore return value and 2nd argument */
"pop %rax\n"
"pop %rsi\n"
"movq %rax, 80(%rsi)\n"
"pushf\n"
"pop %rax\n"
"movq %rax, 144(%rsi)\n" /* eflags */
"movq %rsp, 152(%rsi)\n" /* rsp */
"movq $0, 160(%rsi)\n" /* ss */
"ret\n"
);
}
static void test_uretprobe_regs_equal(void)
{
struct uprobe_syscall *skel = NULL;
struct pt_regs before = {}, after = {};
unsigned long *pb = (unsigned long *) &before;
unsigned long *pa = (unsigned long *) &after;
unsigned long *pp;
unsigned int i, cnt;
int err;
skel = uprobe_syscall__open_and_load();
if (!ASSERT_OK_PTR(skel, "uprobe_syscall__open_and_load"))
goto cleanup;
err = uprobe_syscall__attach(skel);
if (!ASSERT_OK(err, "uprobe_syscall__attach"))
goto cleanup;
uretprobe_regs(&before, &after);
pp = (unsigned long *) &skel->bss->regs;
cnt = sizeof(before)/sizeof(*pb);
for (i = 0; i < cnt; i++) {
unsigned int offset = i * sizeof(unsigned long);
/*
* Check register before and after uretprobe_regs_trigger call
* that triggers the uretprobe.
*/
switch (offset) {
case offsetof(struct pt_regs, rax):
ASSERT_EQ(pa[i], 0xdeadbeef, "return value");
break;
default:
if (!ASSERT_EQ(pb[i], pa[i], "register before-after value check"))
fprintf(stdout, "failed register offset %u\n", offset);
}
/*
* Check register seen from bpf program and register after
* uretprobe_regs_trigger call
*/
switch (offset) {
/*
* These values will be different (not set in uretprobe_regs),
* we don't care.
*/
case offsetof(struct pt_regs, orig_rax):
case offsetof(struct pt_regs, rip):
case offsetof(struct pt_regs, cs):
case offsetof(struct pt_regs, rsp):
case offsetof(struct pt_regs, ss):
break;
default:
if (!ASSERT_EQ(pp[i], pa[i], "register prog-after value check"))
fprintf(stdout, "failed register offset %u\n", offset);
}
}
cleanup:
uprobe_syscall__destroy(skel);
}
#define BPF_TESTMOD_UPROBE_TEST_FILE "/sys/kernel/bpf_testmod_uprobe"
static int write_bpf_testmod_uprobe(unsigned long offset)
{
size_t n, ret;
char buf[30];
int fd;
n = sprintf(buf, "%lu", offset);
fd = open(BPF_TESTMOD_UPROBE_TEST_FILE, O_WRONLY);
if (fd < 0)
return -errno;
ret = write(fd, buf, n);
close(fd);
return ret != n ? (int) ret : 0;
}
static void test_uretprobe_regs_change(void)
{
struct pt_regs before = {}, after = {};
unsigned long *pb = (unsigned long *) &before;
unsigned long *pa = (unsigned long *) &after;
unsigned long cnt = sizeof(before)/sizeof(*pb);
unsigned int i, err, offset;
offset = get_uprobe_offset(uretprobe_regs_trigger);
err = write_bpf_testmod_uprobe(offset);
if (!ASSERT_OK(err, "register_uprobe"))
return;
uretprobe_regs(&before, &after);
err = write_bpf_testmod_uprobe(0);
if (!ASSERT_OK(err, "unregister_uprobe"))
return;
for (i = 0; i < cnt; i++) {
unsigned int offset = i * sizeof(unsigned long);
switch (offset) {
case offsetof(struct pt_regs, rax):
ASSERT_EQ(pa[i], 0x12345678deadbeef, "rax");
break;
case offsetof(struct pt_regs, rcx):
ASSERT_EQ(pa[i], 0x87654321feebdaed, "rcx");
break;
case offsetof(struct pt_regs, r11):
ASSERT_EQ(pa[i], (__u64) -1, "r11");
break;
default:
if (!ASSERT_EQ(pa[i], pb[i], "register before-after value check"))
fprintf(stdout, "failed register offset %u\n", offset);
}
}
}
#ifndef __NR_uretprobe
#define __NR_uretprobe 467
#endif
__naked unsigned long uretprobe_syscall_call_1(void)
{
/*
* Pretend we are uretprobe trampoline to trigger the return
* probe invocation in order to verify we get SIGILL.
*/
asm volatile (
"pushq %rax\n"
"pushq %rcx\n"
"pushq %r11\n"
"movq $" __stringify(__NR_uretprobe) ", %rax\n"
"syscall\n"
"popq %r11\n"
"popq %rcx\n"
"retq\n"
);
}
__naked unsigned long uretprobe_syscall_call(void)
{
asm volatile (
"call uretprobe_syscall_call_1\n"
"retq\n"
);
}
static void test_uretprobe_syscall_call(void)
{
LIBBPF_OPTS(bpf_uprobe_multi_opts, opts,
.retprobe = true,
);
struct uprobe_syscall_executed *skel;
int pid, status, err, go[2], c;
if (!ASSERT_OK(pipe(go), "pipe"))
return;
skel = uprobe_syscall_executed__open_and_load();
if (!ASSERT_OK_PTR(skel, "uprobe_syscall_executed__open_and_load"))
goto cleanup;
pid = fork();
if (!ASSERT_GE(pid, 0, "fork"))
goto cleanup;
/* child */
if (pid == 0) {
close(go[1]);
/* wait for parent's kick */
err = read(go[0], &c, 1);
if (err != 1)
exit(-1);
uretprobe_syscall_call();
_exit(0);
}
skel->links.test = bpf_program__attach_uprobe_multi(skel->progs.test, pid,
"/proc/self/exe",
"uretprobe_syscall_call", &opts);
if (!ASSERT_OK_PTR(skel->links.test, "bpf_program__attach_uprobe_multi"))
goto cleanup;
/* kick the child */
write(go[1], &c, 1);
err = waitpid(pid, &status, 0);
ASSERT_EQ(err, pid, "waitpid");
/* verify the child got killed with SIGILL */
ASSERT_EQ(WIFSIGNALED(status), 1, "WIFSIGNALED");
ASSERT_EQ(WTERMSIG(status), SIGILL, "WTERMSIG");
/* verify the uretprobe program wasn't called */
ASSERT_EQ(skel->bss->executed, 0, "executed");
cleanup:
uprobe_syscall_executed__destroy(skel);
close(go[1]);
close(go[0]);
}
/*
* Borrowed from tools/testing/selftests/x86/test_shadow_stack.c.
*
* For use in inline enablement of shadow stack.
*
* The program can't return from the point where shadow stack gets enabled
* because there will be no address on the shadow stack. So it can't use
* syscall() for enablement, since it is a function.
*
* Based on code from nolibc.h. Keep a copy here because this can't pull
* in all of nolibc.h.
*/
#define ARCH_PRCTL(arg1, arg2) \
({ \
long _ret; \
register long _num asm("eax") = __NR_arch_prctl; \
register long _arg1 asm("rdi") = (long)(arg1); \
register long _arg2 asm("rsi") = (long)(arg2); \
\
asm volatile ( \
"syscall\n" \
: "=a"(_ret) \
: "r"(_arg1), "r"(_arg2), \
"0"(_num) \
: "rcx", "r11", "memory", "cc" \
); \
_ret; \
})
#ifndef ARCH_SHSTK_ENABLE
#define ARCH_SHSTK_ENABLE 0x5001
#define ARCH_SHSTK_DISABLE 0x5002
#define ARCH_SHSTK_SHSTK (1ULL << 0)
#endif
static void test_uretprobe_shadow_stack(void)
{
if (ARCH_PRCTL(ARCH_SHSTK_ENABLE, ARCH_SHSTK_SHSTK)) {
test__skip();
return;
}
/* Run all of the uretprobe tests. */
test_uretprobe_regs_equal();
test_uretprobe_regs_change();
test_uretprobe_syscall_call();
ARCH_PRCTL(ARCH_SHSTK_DISABLE, ARCH_SHSTK_SHSTK);
}
#else
static void test_uretprobe_regs_equal(void)
{
test__skip();
}
static void test_uretprobe_regs_change(void)
{
test__skip();
}
static void test_uretprobe_syscall_call(void)
{
test__skip();
}
static void test_uretprobe_shadow_stack(void)
{
test__skip();
}
#endif
void test_uprobe_syscall(void)
{
if (test__start_subtest("uretprobe_regs_equal"))
test_uretprobe_regs_equal();
if (test__start_subtest("uretprobe_regs_change"))
test_uretprobe_regs_change();
if (test__start_subtest("uretprobe_syscall_call"))
test_uretprobe_syscall_call();
if (test__start_subtest("uretprobe_shadow_stack"))
test_uretprobe_shadow_stack();
}
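
test_uretprobe_regs_change() above hands the result of get_uprobe_offset() to the test module because the module registers its uprobe by inode and file offset, not by virtual address. The translation is taken from the matching executable mapping in /proc/self/maps as addr - start + base. Below is a minimal sketch of that arithmetic on a single maps-format line. The helper name maps_line_to_offset() and the sample line used with it are hypothetical and are not part of this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): given one line in
 * /proc/<pid>/maps format and an address, return the file offset of
 * that address, or -1 if the line is not an executable mapping that
 * covers it.
 */
static long maps_line_to_offset(const char *line, uintptr_t addr)
{
	unsigned long start, end, base;
	char perms[5];

	/* "start-end perms offset ..." is the leading part of a maps line */
	if (sscanf(line, "%lx-%lx %4s %lx", &start, &end, perms, &base) != 4)
		return -1;
	/* only an executable mapping that contains addr is of interest */
	if (perms[2] != 'x' || addr < start || addr >= end)
		return -1;
	/* distance into the mapping plus the mapping's own file offset */
	return (long)(addr - start + base);
}
```

For a mapping that starts at 0x400000 with file offset 0x1000, the address 0x401234 translates to file offset 0x2234. That is the kind of value the test writes to /sys/kernel/bpf_testmod_uprobe.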


@@ -0,0 +1,186 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
#include <test_progs.h>
#include "uretprobe_stack.skel.h"
#include "../sdt.h"
/* We set up target_1() -> target_2() -> target_3() -> target_4() -> USDT()
* call chain, each being traced by our BPF program. On entry or return from
* each target_*() we are capturing user stack trace and recording it in
* global variable, so that user space part of the test can validate it.
*
* Note, we put each target function into a custom section to get those
* __start_XXX/__stop_XXX symbols, generated by linker for us, which allow us
* to know address range of those functions
*/
__attribute__((section("uprobe__target_4")))
__weak int target_4(void)
{
STAP_PROBE1(uretprobe_stack, target, 42);
return 42;
}
extern const void *__start_uprobe__target_4;
extern const void *__stop_uprobe__target_4;
__attribute__((section("uprobe__target_3")))
__weak int target_3(void)
{
return target_4();
}
extern const void *__start_uprobe__target_3;
extern const void *__stop_uprobe__target_3;
__attribute__((section("uprobe__target_2")))
__weak int target_2(void)
{
return target_3();
}
extern const void *__start_uprobe__target_2;
extern const void *__stop_uprobe__target_2;
__attribute__((section("uprobe__target_1")))
__weak int target_1(int depth)
{
if (depth < 1)
return 1 + target_1(depth + 1);
else
return target_2();
}
extern const void *__start_uprobe__target_1;
extern const void *__stop_uprobe__target_1;
extern const void *__start_uretprobe_stack_sec;
extern const void *__stop_uretprobe_stack_sec;
struct range {
long start;
long stop;
};
static struct range targets[] = {
{}, /* we want target_1 to map to targets[1], so need 1-based indexing */
{ (long)&__start_uprobe__target_1, (long)&__stop_uprobe__target_1 },
{ (long)&__start_uprobe__target_2, (long)&__stop_uprobe__target_2 },
{ (long)&__start_uprobe__target_3, (long)&__stop_uprobe__target_3 },
{ (long)&__start_uprobe__target_4, (long)&__stop_uprobe__target_4 },
};
static struct range caller = {
(long)&__start_uretprobe_stack_sec,
(long)&__stop_uretprobe_stack_sec,
};
static void validate_stack(__u64 *ips, int stack_len, int cnt, ...)
{
int i, j;
va_list args;
if (!ASSERT_GT(stack_len, 0, "stack_len"))
return;
stack_len /= 8;
/* check if we have enough entries to satisfy test expectations */
if (!ASSERT_GE(stack_len, cnt, "stack_len2"))
return;
if (env.verbosity >= VERBOSE_NORMAL) {
printf("caller: %#lx - %#lx\n", caller.start, caller.stop);
for (i = 1; i < ARRAY_SIZE(targets); i++)
printf("target_%d: %#lx - %#lx\n", i, targets[i].start, targets[i].stop);
for (i = 0; i < stack_len; i++) {
for (j = 1; j < ARRAY_SIZE(targets); j++) {
if (ips[i] >= targets[j].start && ips[i] < targets[j].stop)
break;
}
if (j < ARRAY_SIZE(targets)) { /* found target match */
printf("ENTRY #%d: %#lx (in target_%d)\n", i, (long)ips[i], j);
} else if (ips[i] >= caller.start && ips[i] < caller.stop) {
printf("ENTRY #%d: %#lx (in caller)\n", i, (long)ips[i]);
} else {
printf("ENTRY #%d: %#lx\n", i, (long)ips[i]);
}
}
}
va_start(args, cnt);
for (i = cnt - 1; i >= 0; i--) {
/* most recent entry is the deepest target function */
const struct range *t = va_arg(args, const struct range *);
ASSERT_GE(ips[i], t->start, "addr_start");
ASSERT_LT(ips[i], t->stop, "addr_stop");
}
va_end(args);
}
/* __weak prevents inlining */
__attribute__((section("uretprobe_stack_sec")))
__weak void test_uretprobe_stack(void)
{
LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
struct uretprobe_stack *skel;
int err;
skel = uretprobe_stack__open_and_load();
if (!ASSERT_OK_PTR(skel, "skel_open"))
return;
err = uretprobe_stack__attach(skel);
if (!ASSERT_OK(err, "skel_attach"))
goto cleanup;
/* trigger */
ASSERT_EQ(target_1(0), 42 + 1, "trigger_return");
/*
* Stacks captured on ENTRY uprobes
*/
/* (uprobe 1) target_1 in stack trace */
validate_stack(skel->bss->entry_stack1, skel->bss->entry1_len,
2, &caller, &targets[1]);
/* (uprobe 1, recursed) */
validate_stack(skel->bss->entry_stack1_recur, skel->bss->entry1_recur_len,
3, &caller, &targets[1], &targets[1]);
/* (uprobe 2) caller -> target_1 -> target_1 -> target_2 */
validate_stack(skel->bss->entry_stack2, skel->bss->entry2_len,
4, &caller, &targets[1], &targets[1], &targets[2]);
/* (uprobe 3) */
validate_stack(skel->bss->entry_stack3, skel->bss->entry3_len,
5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
/* (uprobe 4) caller -> target_1 -> target_1 -> target_2 -> target_3 -> target_4 */
validate_stack(skel->bss->entry_stack4, skel->bss->entry4_len,
6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
/* (USDT): full caller -> target_1 -> target_1 -> target_2 (uretprobed)
* -> target_3 -> target_4 (uretprobes) chain
*/
validate_stack(skel->bss->usdt_stack, skel->bss->usdt_len,
6, &caller, &targets[1], &targets[1], &targets[2], &targets[3], &targets[4]);
/*
* Now stacks captured on the way out in EXIT uprobes
*/
/* (uretprobe 4) everything up to target_4, but excluding it */
validate_stack(skel->bss->exit_stack4, skel->bss->exit4_len,
5, &caller, &targets[1], &targets[1], &targets[2], &targets[3]);
/* we didn't install uretprobes on target_2 and target_3 */
/* (uretprobe 1, recur) first target_1 call only */
validate_stack(skel->bss->exit_stack1_recur, skel->bss->exit1_recur_len,
2, &caller, &targets[1]);
/* (uretprobe 1) just a caller in the stack trace */
validate_stack(skel->bss->exit_stack1, skel->bss->exit1_len,
1, &caller);
cleanup:
uretprobe_stack__destroy(skel);
}
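
The test above places each target_*() function in its own section so that the linker-generated __start_XXX/__stop_XXX symbols bound the function's address range. The following is a minimal standalone sketch of that trick, assuming a GNU-compatible toolchain on an ELF target and a section name that is a valid C identifier. The names demo_target, demo_target_sec and addr_in_demo_target are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical example function placed in its own section. The section
 * name must be a valid C identifier, otherwise the linker does not
 * provide the __start_/__stop_ symbols used below.
 */
__attribute__((section("demo_target_sec"), noinline))
int demo_target(void)
{
	return 42;
}

/* Provided by the linker: first byte of the section and one past its end */
extern const char __start_demo_target_sec[];
extern const char __stop_demo_target_sec[];

/* Returns nonzero if addr lies inside [__start, __stop) of the section */
static int addr_in_demo_target(uintptr_t addr)
{
	return addr >= (uintptr_t)__start_demo_target_sec &&
	       addr < (uintptr_t)__stop_demo_target_sec;
}
```

validate_stack() applies the same half-open range check to each captured instruction pointer to decide which target function, if any, a stack entry belongs to.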


@@ -0,0 +1,15 @@
// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <string.h>
struct pt_regs regs;
char _license[] SEC("license") = "GPL";
SEC("uretprobe//proc/self/exe:uretprobe_regs_trigger")
int uretprobe(struct pt_regs *ctx)
{
__builtin_memcpy(&regs, ctx, sizeof(regs));
return 0;
}


@@ -0,0 +1,17 @@
// SPDX-License-Identifier: GPL-2.0
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <string.h>
struct pt_regs regs;
char _license[] SEC("license") = "GPL";
int executed = 0;
SEC("uretprobe.multi")
int test(struct pt_regs *regs)
{
executed = 1;
return 0;
}


@@ -0,0 +1,96 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/usdt.bpf.h>
char _license[] SEC("license") = "GPL";
__u64 entry_stack1[32], exit_stack1[32];
__u64 entry_stack1_recur[32], exit_stack1_recur[32];
__u64 entry_stack2[32];
__u64 entry_stack3[32];
__u64 entry_stack4[32], exit_stack4[32];
__u64 usdt_stack[32];
int entry1_len, exit1_len;
int entry1_recur_len, exit1_recur_len;
int entry2_len, exit2_len;
int entry3_len, exit3_len;
int entry4_len, exit4_len;
int usdt_len;
#define SZ sizeof(usdt_stack)
SEC("uprobe//proc/self/exe:target_1")
int BPF_UPROBE(uprobe_1)
{
/* target_1 is recursive with depth of 2, so we capture two separate
* stack traces, depending on which occurrence it is
*/
static bool recur = false;
if (!recur)
entry1_len = bpf_get_stack(ctx, &entry_stack1, SZ, BPF_F_USER_STACK);
else
entry1_recur_len = bpf_get_stack(ctx, &entry_stack1_recur, SZ, BPF_F_USER_STACK);
recur = true;
return 0;
}
SEC("uretprobe//proc/self/exe:target_1")
int BPF_URETPROBE(uretprobe_1)
{
/* see above, target_1 is recursive */
static bool recur = false;
/* NOTE: order of returns is reversed to order of entries */
if (!recur)
exit1_recur_len = bpf_get_stack(ctx, &exit_stack1_recur, SZ, BPF_F_USER_STACK);
else
exit1_len = bpf_get_stack(ctx, &exit_stack1, SZ, BPF_F_USER_STACK);
recur = true;
return 0;
}
SEC("uprobe//proc/self/exe:target_2")
int BPF_UPROBE(uprobe_2)
{
entry2_len = bpf_get_stack(ctx, &entry_stack2, SZ, BPF_F_USER_STACK);
return 0;
}
/* no uretprobe for target_2 */
SEC("uprobe//proc/self/exe:target_3")
int BPF_UPROBE(uprobe_3)
{
entry3_len = bpf_get_stack(ctx, &entry_stack3, SZ, BPF_F_USER_STACK);
return 0;
}
/* no uretprobe for target_3 */
SEC("uprobe//proc/self/exe:target_4")
int BPF_UPROBE(uprobe_4)
{
entry4_len = bpf_get_stack(ctx, &entry_stack4, SZ, BPF_F_USER_STACK);
return 0;
}
SEC("uretprobe//proc/self/exe:target_4")
int BPF_URETPROBE(uretprobe_4)
{
exit4_len = bpf_get_stack(ctx, &exit_stack4, SZ, BPF_F_USER_STACK);
return 0;
}
SEC("usdt//proc/self/exe:uretprobe_stack:target")
int BPF_USDT(usdt_probe)
{
usdt_len = bpf_get_stack(ctx, &usdt_stack, SZ, BPF_F_USER_STACK);
return 0;
}


@@ -34,6 +34,7 @@
 #include <sys/ptrace.h>
 #include <sys/signal.h>
 #include <linux/elf.h>
+#include <linux/perf_event.h>
 /*
 * Define the ABI defines if needed, so people can run the tests
@@ -734,6 +735,144 @@ int test_32bit(void)
 return !segv_triggered;
 }
+static int parse_uint_from_file(const char *file, const char *fmt)
+{
+int err, ret;
+FILE *f;
+f = fopen(file, "re");
+if (!f) {
+err = -errno;
+printf("failed to open '%s': %d\n", file, err);
+return err;
+}
+err = fscanf(f, fmt, &ret);
+if (err != 1) {
+err = err == EOF ? -EIO : -errno;
+printf("failed to parse '%s': %d\n", file, err);
+fclose(f);
+return err;
+}
+fclose(f);
+return ret;
+}
+static int determine_uprobe_perf_type(void)
+{
+const char *file = "/sys/bus/event_source/devices/uprobe/type";
+return parse_uint_from_file(file, "%d\n");
+}
+static int determine_uprobe_retprobe_bit(void)
+{
+const char *file = "/sys/bus/event_source/devices/uprobe/format/retprobe";
+return parse_uint_from_file(file, "config:%d\n");
+}
+static ssize_t get_uprobe_offset(const void *addr)
+{
+size_t start, end, base;
+char buf[256];
+bool found = false;
+FILE *f;
+f = fopen("/proc/self/maps", "r");
+if (!f)
+return -errno;
+while (fscanf(f, "%zx-%zx %s %zx %*[^\n]\n", &start, &end, buf, &base) == 4) {
+if (buf[2] == 'x' && (uintptr_t)addr >= start && (uintptr_t)addr < end) {
+found = true;
+break;
+}
+}
+fclose(f);
+if (!found)
+return -ESRCH;
+return (uintptr_t)addr - start + base;
+}
+static __attribute__((noinline)) void uretprobe_trigger(void)
+{
+asm volatile ("");
+}
+/*
+ * This test sets up a return uprobe, which is sensitive to shadow stack
+ * (crashes without extra fix). After executing the uretprobe we fail
+ * the test if we receive SIGSEGV, no crash means we're good.
+ *
+ * Helper functions above borrowed from bpf selftests.
+ */
+static int test_uretprobe(void)
+{
+const size_t attr_sz = sizeof(struct perf_event_attr);
+const char *file = "/proc/self/exe";
+int bit, fd = 0, type, err = 1;
+struct perf_event_attr attr;
+struct sigaction sa = {};
+ssize_t offset;
+type = determine_uprobe_perf_type();
+if (type < 0) {
+if (type == -ENOENT)
+printf("[SKIP]\tUretprobe test, uprobes are not available\n");
+return 0;
+}
+offset = get_uprobe_offset(uretprobe_trigger);
+if (offset < 0)
+return 1;
+bit = determine_uprobe_retprobe_bit();
+if (bit < 0)
+return 1;
+sa.sa_sigaction = segv_gp_handler;
+sa.sa_flags = SA_SIGINFO;
+if (sigaction(SIGSEGV, &sa, NULL))
+return 1;
+/* Setup return uprobe through perf event interface. */
+memset(&attr, 0, attr_sz);
+attr.size = attr_sz;
+attr.type = type;
+attr.config = 1 << bit;
+attr.config1 = (__u64) (unsigned long) file;
+attr.config2 = offset;
+fd = syscall(__NR_perf_event_open, &attr, 0 /* pid */, -1 /* cpu */,
+-1 /* group_fd */, PERF_FLAG_FD_CLOEXEC);
+if (fd < 0)
+goto out;
+if (sigsetjmp(jmp_buffer, 1))
+goto out;
+ARCH_PRCTL(ARCH_SHSTK_ENABLE, ARCH_SHSTK_SHSTK);
+/*
+ * This either segfaults and goes through sigsetjmp above
+ * or succeeds and we're good.
+ */
+uretprobe_trigger();
+printf("[OK]\tUretprobe test\n");
+err = 0;
+out:
+ARCH_PRCTL(ARCH_SHSTK_DISABLE, ARCH_SHSTK_SHSTK);
+signal(SIGSEGV, SIG_DFL);
+if (fd)
+close(fd);
+return err;
+}
 void segv_handler_ptrace(int signum, siginfo_t *si, void *uc)
 {
 /* The SSP adjustment caused a segfault. */
@@ -926,6 +1065,12 @@ int main(int argc, char *argv[])
 goto out;
 }
+if (test_uretprobe()) {
+ret = 1;
+printf("[FAIL]\turetprobe test\n");
+goto out;
+}
 return ret;
 out: