This pull request has a new feature to ftrace, namely the trace event

triggers by Tom Zanussi. A trigger is a way to enable an action when an
 event is hit. The actions are:
 
  o  trace on/off - enable or disable tracing
  o  snapshot     - save the current trace buffer in the snapshot
  o  stacktrace   - dump the current stack trace to the ringbuffer
  o  enable/disable events - enable or disable another event
 
 Namhyung Kim added updates to the tracing uprobes code. Having the
 uprobes add support for fetch methods.
 
 The rest are various bug fixes with the new code, and minor ones for
 the old code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJS3Z9fAAoJEKQekfcNnQGuFf0H/0CteaN+BJjpif6Tnxia15Sp
 pcftzU0lgqfNzsfitmbjiVTgXWqCghoZo8UI9tQZvBZ9wmDIxeXQR73uoBgVlSCQ
 ovyBO/R8r+lq+7EsDCwntZvrLbcdn6s/jzoruRvt7r35ghK5pH81DNR1BOzTQBhW
 x+361Xtc13aok7N7JN8KR96VDUP9f8KU6PWqJ5lgS2Zl+wbVw6b0p8OV8IMCHczP
 MdYrx8y4Jv4QWW7rMShAAVBe9qJQ56JWiWA17ysa4kY8BkKQ7QtlEFr+r1YY0nX5
 67brXiL8u0NFzRx5y2VRpGc25BbImnVBFpoLQ5Itluq9OdZE3aOQubzXlY70R6g=
 =Hkho
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "This pull request has a new feature to ftrace, namely the trace event
  triggers by Tom Zanussi.  A trigger is a way to enable an action when
  an event is hit.  The actions are:

   o  trace on/off - enable or disable tracing
   o  snapshot     - save the current trace buffer in the snapshot
   o  stacktrace   - dump the current stack trace to the ringbuffer
   o  enable/disable events - enable or disable another event

  Namhyung Kim added updates to the tracing uprobes code.  Having the
  uprobes add support for fetch methods.

  The rest are various bug fixes with the new code, and minor ones for
  the old code"

* tag 'trace-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (38 commits)
  tracing: Fix buggered tee(2) on tracing_pipe
  tracing: Have trace buffer point back to trace_array
  ftrace: Fix synchronization location disabling and freeing ftrace_ops
  ftrace: Have function graph only trace based on global_ops filters
  ftrace: Synchronize setting function_trace_op with ftrace_trace_function
  tracing: Show available event triggers when no trigger is set
  tracing: Consolidate event trigger code
  tracing: Fix counter for traceon/off event triggers
  tracing: Remove double-underscore naming in syscall trigger invocations
  tracing/kprobes: Add trace event trigger invocations
  tracing/probes: Fix build break on !CONFIG_KPROBE_EVENT
  tracing/uprobes: Add @+file_offset fetch method
  uprobes: Allocate ->utask before handler_chain() for tracing handlers
  tracing/uprobes: Add support for full argument access methods
  tracing/uprobes: Fetch args before reserving a ring buffer
  tracing/uprobes: Pass 'is_return' to traceprobe_parse_probe_arg()
  tracing/probes: Implement 'memory' fetch method for uprobes
  tracing/probes: Add fetch{,_size} member into deref fetch method
  tracing/probes: Move 'symbol' fetch method to kprobes
  tracing/probes: Implement 'stack' fetch method for uprobes
  ...
This commit is contained in:
Linus Torvalds 2014-01-22 16:35:21 -08:00
commit 60eaa0190f
19 changed files with 3464 additions and 924 deletions

View File

@ -287,3 +287,210 @@ their old filters):
prev_pid == 0
# cat sched_wakeup/filter
common_pid == 0
6. Event triggers
=================
Trace events can be made to conditionally invoke trigger 'commands'
which can take various forms and are described in detail below;
examples would be enabling or disabling other trace events or invoking
a stack trace whenever the trace event is hit. Whenever a trace event
with attached triggers is invoked, the set of trigger commands
associated with that event is invoked. Any given trigger can
additionally have an event filter of the same form as described in
section 5 (Event filtering) associated with it - the command will only
be invoked if the event being invoked passes the associated filter.
If no filter is associated with the trigger, it always passes.
Triggers are added to and removed from a particular event by writing
trigger expressions to the 'trigger' file for the given event.
A given event can have any number of triggers associated with it,
subject to any restrictions that individual commands may have in that
regard.
Event triggers are implemented on top of "soft" mode, which means that
whenever a trace event has one or more triggers associated with it,
the event is activated even if it isn't actually enabled, but is
disabled in a "soft" mode. That is, the tracepoint will be called,
but just will not be traced, unless of course it's actually enabled.
This scheme allows triggers to be invoked even for events that aren't
enabled, and also allows the current event filter implementation to be
used for conditionally invoking triggers.
The syntax for event triggers is roughly based on the syntax for
set_ftrace_filter 'ftrace filter commands' (see the 'Filter commands'
section of Documentation/trace/ftrace.txt), but there are major
differences and the implementation isn't currently tied to it in any
way, so beware about making generalizations between the two.
6.1 Expression syntax
---------------------
Triggers are added by echoing the command to the 'trigger' file:
# echo 'command[:count] [if filter]' > trigger
Triggers are removed by echoing the same command but starting with '!'
to the 'trigger' file:
# echo '!command[:count] [if filter]' > trigger
The [if filter] part isn't used in matching commands when removing, so
leaving that off in a '!' command will accomplish the same thing as
having it in.
The filter syntax is the same as that described in the 'Event
filtering' section above.
For ease of use, writing to the trigger file using '>' currently just
adds or removes a single trigger and there's no explicit '>>' support
('>' actually behaves like '>>') or truncation support to remove all
triggers (you have to use '!' for each one added.)
6.2 Supported trigger commands
------------------------------
The following commands are supported:
- enable_event/disable_event
These commands can enable or disable another trace event whenever
the triggering event is hit. When these commands are registered,
the other trace event is activated, but disabled in a "soft" mode.
That is, the tracepoint will be called, but just will not be traced.
The event tracepoint stays in this mode as long as there's a trigger
in effect that can trigger it.
For example, the following trigger causes kmalloc events to be
traced when a read system call is entered, and the :1 at the end
specifies that this enablement happens only once:
# echo 'enable_event:kmem:kmalloc:1' > \
/sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
The following trigger causes kmalloc events to stop being traced
when a read system call exits. This disablement happens on every
read system call exit:
# echo 'disable_event:kmem:kmalloc' > \
/sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
The format is:
enable_event:<system>:<event>[:count]
disable_event:<system>:<event>[:count]
To remove the above commands:
# echo '!enable_event:kmem:kmalloc:1' > \
/sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
# echo '!disable_event:kmem:kmalloc' > \
/sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
Note that there can be any number of enable/disable_event triggers
per triggering event, but there can only be one trigger per
triggered event. e.g. sys_enter_read can have triggers enabling both
kmem:kmalloc and sched:sched_switch, but can't have two kmem:kmalloc
versions such as kmem:kmalloc and kmem:kmalloc:1 or 'kmem:kmalloc if
bytes_req == 256' and 'kmem:kmalloc if bytes_alloc == 256' (they
could be combined into a single filter on kmem:kmalloc though).
- stacktrace
This command dumps a stacktrace in the trace buffer whenever the
triggering event occurs.
For example, the following trigger dumps a stacktrace every time the
kmalloc tracepoint is hit:
# echo 'stacktrace' > \
/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
The following trigger dumps a stacktrace the first 5 times a kmalloc
request happens with a size >= 64K
# echo 'stacktrace:5 if bytes_req >= 65536' > \
/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
The format is:
stacktrace[:count]
To remove the above commands:
# echo '!stacktrace' > \
/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
# echo '!stacktrace:5 if bytes_req >= 65536' > \
/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
The latter can also be removed more simply by the following (without
the filter):
# echo '!stacktrace:5' > \
/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
Note that there can be only one stacktrace trigger per triggering
event.
- snapshot
This command causes a snapshot to be triggered whenever the
triggering event occurs.
The following command creates a snapshot every time a block request
queue is unplugged with a depth > 1. If you were tracing a set of
events or functions at the time, the snapshot trace buffer would
capture those events when the trigger event occured:
# echo 'snapshot if nr_rq > 1' > \
/sys/kernel/debug/tracing/events/block/block_unplug/trigger
To only snapshot once:
# echo 'snapshot:1 if nr_rq > 1' > \
/sys/kernel/debug/tracing/events/block/block_unplug/trigger
To remove the above commands:
# echo '!snapshot if nr_rq > 1' > \
/sys/kernel/debug/tracing/events/block/block_unplug/trigger
# echo '!snapshot:1 if nr_rq > 1' > \
/sys/kernel/debug/tracing/events/block/block_unplug/trigger
Note that there can be only one snapshot trigger per triggering
event.
- traceon/traceoff
These commands turn tracing on and off when the specified events are
hit. The parameter determines how many times the tracing system is
turned on and off. If unspecified, there is no limit.
The following command turns tracing off the first time a block
request queue is unplugged with a depth > 1. If you were tracing a
set of events or functions at the time, you could then examine the
trace buffer to see the sequence of events that led up to the
trigger event:
# echo 'traceoff:1 if nr_rq > 1' > \
/sys/kernel/debug/tracing/events/block/block_unplug/trigger
To always disable tracing when nr_rq > 1 :
# echo 'traceoff if nr_rq > 1' > \
/sys/kernel/debug/tracing/events/block/block_unplug/trigger
To remove the above commands:
# echo '!traceoff:1 if nr_rq > 1' > \
/sys/kernel/debug/tracing/events/block/block_unplug/trigger
# echo '!traceoff if nr_rq > 1' > \
/sys/kernel/debug/tracing/events/block/block_unplug/trigger
Note that there can be only one traceon or traceoff trigger per
triggering event.

View File

@ -19,18 +19,44 @@ user to calculate the offset of the probepoint in the object.
Synopsis of uprobe_tracer
-------------------------
p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a uprobe
r[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS] : Set a return uprobe (uretprobe)
p[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a uprobe
r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS] : Set a return uprobe (uretprobe)
-:[GRP/]EVENT : Clear uprobe or uretprobe event
GRP : Group name. If omitted, "uprobes" is the default value.
EVENT : Event name. If omitted, the event name is generated based
on SYMBOL+offs.
on PATH+OFFSET.
PATH : Path to an executable or a library.
SYMBOL[+offs] : Symbol+offset where the probe is inserted.
OFFSET : Offset where the probe is inserted.
FETCHARGS : Arguments. Each probe can have up to 128 args.
%REG : Fetch register REG
@ADDR : Fetch memory at ADDR (ADDR should be in userspace)
@+OFFSET : Fetch memory at OFFSET (OFFSET from same file as PATH)
$stackN : Fetch Nth entry of stack (N >= 0)
$stack : Fetch stack address.
$retval : Fetch return value.(*)
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
(u8/u16/u32/u64/s8/s16/s32/s64), "string" and bitfield
are supported.
(*) only for return probe.
(**) this is useful for fetching a field of data structures.
Types
-----
Several types are supported for fetch-args. Uprobe tracer will access memory
by given type. Prefix 's' and 'u' means those types are signed and unsigned
respectively. Traced arguments are shown in decimal (signed) or hex (unsigned).
String type is a special type, which fetches a "null-terminated" string from
user space.
Bitfield is another special type, which takes 3 parameters, bit-width, bit-
offset, and container-size (usually 32). The syntax is;
b<bit-width>@<bit-offset>/<container-size>
Event Profiling
---------------

View File

@ -570,8 +570,6 @@ static inline int
ftrace_regex_release(struct inode *inode, struct file *file) { return -ENODEV; }
#endif /* CONFIG_DYNAMIC_FTRACE */
loff_t ftrace_filter_lseek(struct file *file, loff_t offset, int whence);
/* totally disable ftrace - can not re-enable after this */
void ftrace_kill(void);

View File

@ -1,3 +1,4 @@
#ifndef _LINUX_FTRACE_EVENT_H
#define _LINUX_FTRACE_EVENT_H
@ -264,6 +265,8 @@ enum {
FTRACE_EVENT_FL_NO_SET_FILTER_BIT,
FTRACE_EVENT_FL_SOFT_MODE_BIT,
FTRACE_EVENT_FL_SOFT_DISABLED_BIT,
FTRACE_EVENT_FL_TRIGGER_MODE_BIT,
FTRACE_EVENT_FL_TRIGGER_COND_BIT,
};
/*
@ -275,6 +278,8 @@ enum {
* SOFT_MODE - The event is enabled/disabled by SOFT_DISABLED
* SOFT_DISABLED - When set, do not trace the event (even though its
* tracepoint may be enabled)
* TRIGGER_MODE - When set, invoke the triggers associated with the event
* TRIGGER_COND - When set, one or more triggers has an associated filter
*/
enum {
FTRACE_EVENT_FL_ENABLED = (1 << FTRACE_EVENT_FL_ENABLED_BIT),
@ -283,6 +288,8 @@ enum {
FTRACE_EVENT_FL_NO_SET_FILTER = (1 << FTRACE_EVENT_FL_NO_SET_FILTER_BIT),
FTRACE_EVENT_FL_SOFT_MODE = (1 << FTRACE_EVENT_FL_SOFT_MODE_BIT),
FTRACE_EVENT_FL_SOFT_DISABLED = (1 << FTRACE_EVENT_FL_SOFT_DISABLED_BIT),
FTRACE_EVENT_FL_TRIGGER_MODE = (1 << FTRACE_EVENT_FL_TRIGGER_MODE_BIT),
FTRACE_EVENT_FL_TRIGGER_COND = (1 << FTRACE_EVENT_FL_TRIGGER_COND_BIT),
};
struct ftrace_event_file {
@ -292,6 +299,7 @@ struct ftrace_event_file {
struct dentry *dir;
struct trace_array *tr;
struct ftrace_subsystem_dir *system;
struct list_head triggers;
/*
* 32 bit flags:
@ -299,6 +307,7 @@ struct ftrace_event_file {
* bit 1: enabled cmd record
* bit 2: enable/disable with the soft disable bit
* bit 3: soft disabled
* bit 4: trigger enabled
*
* Note: The bits must be set atomically to prevent races
* from other writers. Reads of flags do not need to be in
@ -310,6 +319,7 @@ struct ftrace_event_file {
*/
unsigned long flags;
atomic_t sm_ref; /* soft-mode reference counter */
atomic_t tm_ref; /* trigger-mode reference counter */
};
#define __TRACE_EVENT_FLAGS(name, value) \
@ -337,6 +347,14 @@ struct ftrace_event_file {
#define MAX_FILTER_STR_VAL 256 /* Should handle KSYM_SYMBOL_LEN */
enum event_trigger_type {
ETT_NONE = (0),
ETT_TRACE_ONOFF = (1 << 0),
ETT_SNAPSHOT = (1 << 1),
ETT_STACKTRACE = (1 << 2),
ETT_EVENT_ENABLE = (1 << 3),
};
extern void destroy_preds(struct ftrace_event_file *file);
extern void destroy_call_preds(struct ftrace_event_call *call);
extern int filter_match_preds(struct event_filter *filter, void *rec);
@ -347,6 +365,127 @@ extern int filter_check_discard(struct ftrace_event_file *file, void *rec,
extern int call_filter_check_discard(struct ftrace_event_call *call, void *rec,
struct ring_buffer *buffer,
struct ring_buffer_event *event);
extern enum event_trigger_type event_triggers_call(struct ftrace_event_file *file,
void *rec);
extern void event_triggers_post_call(struct ftrace_event_file *file,
enum event_trigger_type tt);
/**
* ftrace_trigger_soft_disabled - do triggers and test if soft disabled
* @file: The file pointer of the event to test
*
* If any triggers without filters are attached to this event, they
* will be called here. If the event is soft disabled and has no
* triggers that require testing the fields, it will return true,
* otherwise false.
*/
static inline bool
ftrace_trigger_soft_disabled(struct ftrace_event_file *file)
{
unsigned long eflags = file->flags;
if (!(eflags & FTRACE_EVENT_FL_TRIGGER_COND)) {
if (eflags & FTRACE_EVENT_FL_TRIGGER_MODE)
event_triggers_call(file, NULL);
if (eflags & FTRACE_EVENT_FL_SOFT_DISABLED)
return true;
}
return false;
}
/*
* Helper function for event_trigger_unlock_commit{_regs}().
* If there are event triggers attached to this event that requires
* filtering against its fields, then they wil be called as the
* entry already holds the field information of the current event.
*
* It also checks if the event should be discarded or not.
* It is to be discarded if the event is soft disabled and the
* event was only recorded to process triggers, or if the event
* filter is active and this event did not match the filters.
*
* Returns true if the event is discarded, false otherwise.
*/
static inline bool
__event_trigger_test_discard(struct ftrace_event_file *file,
struct ring_buffer *buffer,
struct ring_buffer_event *event,
void *entry,
enum event_trigger_type *tt)
{
unsigned long eflags = file->flags;
if (eflags & FTRACE_EVENT_FL_TRIGGER_COND)
*tt = event_triggers_call(file, entry);
if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &file->flags))
ring_buffer_discard_commit(buffer, event);
else if (!filter_check_discard(file, entry, buffer, event))
return false;
return true;
}
/**
* event_trigger_unlock_commit - handle triggers and finish event commit
* @file: The file pointer assoctiated to the event
* @buffer: The ring buffer that the event is being written to
* @event: The event meta data in the ring buffer
* @entry: The event itself
* @irq_flags: The state of the interrupts at the start of the event
* @pc: The state of the preempt count at the start of the event.
*
* This is a helper function to handle triggers that require data
* from the event itself. It also tests the event against filters and
* if the event is soft disabled and should be discarded.
*/
static inline void
event_trigger_unlock_commit(struct ftrace_event_file *file,
struct ring_buffer *buffer,
struct ring_buffer_event *event,
void *entry, unsigned long irq_flags, int pc)
{
enum event_trigger_type tt = ETT_NONE;
if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
trace_buffer_unlock_commit(buffer, event, irq_flags, pc);
if (tt)
event_triggers_post_call(file, tt);
}
/**
* event_trigger_unlock_commit_regs - handle triggers and finish event commit
* @file: The file pointer assoctiated to the event
* @buffer: The ring buffer that the event is being written to
* @event: The event meta data in the ring buffer
* @entry: The event itself
* @irq_flags: The state of the interrupts at the start of the event
* @pc: The state of the preempt count at the start of the event.
*
* This is a helper function to handle triggers that require data
* from the event itself. It also tests the event against filters and
* if the event is soft disabled and should be discarded.
*
* Same as event_trigger_unlock_commit() but calls
* trace_buffer_unlock_commit_regs() instead of trace_buffer_unlock_commit().
*/
static inline void
event_trigger_unlock_commit_regs(struct ftrace_event_file *file,
struct ring_buffer *buffer,
struct ring_buffer_event *event,
void *entry, unsigned long irq_flags, int pc,
struct pt_regs *regs)
{
enum event_trigger_type tt = ETT_NONE;
if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
trace_buffer_unlock_commit_regs(buffer, event,
irq_flags, pc, regs);
if (tt)
event_triggers_post_call(file, tt);
}
enum {
FILTER_OTHER = 0,

View File

@ -418,6 +418,8 @@ static inline notrace int ftrace_get_offsets_##call( \
* struct ftrace_event_file *ftrace_file = __data;
* struct ftrace_event_call *event_call = ftrace_file->event_call;
* struct ftrace_data_offsets_<call> __maybe_unused __data_offsets;
* unsigned long eflags = ftrace_file->flags;
* enum event_trigger_type __tt = ETT_NONE;
* struct ring_buffer_event *event;
* struct ftrace_raw_<call> *entry; <-- defined in stage 1
* struct ring_buffer *buffer;
@ -425,9 +427,12 @@ static inline notrace int ftrace_get_offsets_##call( \
* int __data_size;
* int pc;
*
* if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT,
* &ftrace_file->flags))
* if (!(eflags & FTRACE_EVENT_FL_TRIGGER_COND)) {
* if (eflags & FTRACE_EVENT_FL_TRIGGER_MODE)
* event_triggers_call(ftrace_file, NULL);
* if (eflags & FTRACE_EVENT_FL_SOFT_DISABLED)
* return;
* }
*
* local_save_flags(irq_flags);
* pc = preempt_count();
@ -445,8 +450,17 @@ static inline notrace int ftrace_get_offsets_##call( \
* { <assign>; } <-- Here we assign the entries by the __field and
* __array macros.
*
* if (!filter_check_discard(ftrace_file, entry, buffer, event))
* if (eflags & FTRACE_EVENT_FL_TRIGGER_COND)
* __tt = event_triggers_call(ftrace_file, entry);
*
* if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT,
* &ftrace_file->flags))
* ring_buffer_discard_commit(buffer, event);
* else if (!filter_check_discard(ftrace_file, entry, buffer, event))
* trace_buffer_unlock_commit(buffer, event, irq_flags, pc);
*
* if (__tt)
* event_triggers_post_call(ftrace_file, __tt);
* }
*
* static struct trace_event ftrace_event_type_<call> = {
@ -539,8 +553,7 @@ ftrace_raw_event_##call(void *__data, proto) \
int __data_size; \
int pc; \
\
if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, \
&ftrace_file->flags)) \
if (ftrace_trigger_soft_disabled(ftrace_file)) \
return; \
\
local_save_flags(irq_flags); \
@ -560,8 +573,8 @@ ftrace_raw_event_##call(void *__data, proto) \
\
{ assign; } \
\
if (!filter_check_discard(ftrace_file, entry, buffer, event)) \
trace_buffer_unlock_commit(buffer, event, irq_flags, pc); \
event_trigger_unlock_commit(ftrace_file, buffer, event, entry, \
irq_flags, pc); \
}
/*
* The ftrace_test_probe is compiled out, it is only here as a build time check

View File

@ -1854,6 +1854,10 @@ static void handle_swbp(struct pt_regs *regs)
if (unlikely(!test_bit(UPROBE_COPY_INSN, &uprobe->flags)))
goto out;
/* Tracing handlers use ->utask to communicate with fetch methods */
if (!get_utask())
goto out;
handler_chain(uprobe, regs);
if (can_skip_sstep(uprobe, regs))
goto out;

View File

@ -50,6 +50,7 @@ ifeq ($(CONFIG_PERF_EVENTS),y)
obj-$(CONFIG_EVENT_TRACING) += trace_event_perf.o
endif
obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
obj-$(CONFIG_EVENT_TRACING) += trace_events_trigger.o
obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
obj-$(CONFIG_TRACEPOINTS) += power-traces.o
ifeq ($(CONFIG_PM_RUNTIME),y)

View File

@ -85,6 +85,8 @@ int function_trace_stop __read_mostly;
/* Current function tracing op */
struct ftrace_ops *function_trace_op __read_mostly = &ftrace_list_end;
/* What to set function_trace_op to */
static struct ftrace_ops *set_function_trace_op;
/* List for set_ftrace_pid's pids. */
LIST_HEAD(ftrace_pids);
@ -278,6 +280,29 @@ static void update_global_ops(void)
global_ops.func = func;
}
static void ftrace_sync(struct work_struct *work)
{
/*
* This function is just a stub to implement a hard force
* of synchronize_sched(). This requires synchronizing
* tasks even in userspace and idle.
*
* Yes, function tracing is rude.
*/
}
static void ftrace_sync_ipi(void *data)
{
/* Probably not needed, but do it anyway */
smp_rmb();
}
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
static void update_function_graph_func(void);
#else
static inline void update_function_graph_func(void) { }
#endif
static void update_ftrace_function(void)
{
ftrace_func_t func;
@ -296,16 +321,61 @@ static void update_ftrace_function(void)
!FTRACE_FORCE_LIST_FUNC)) {
/* Set the ftrace_ops that the arch callback uses */
if (ftrace_ops_list == &global_ops)
function_trace_op = ftrace_global_list;
set_function_trace_op = ftrace_global_list;
else
function_trace_op = ftrace_ops_list;
set_function_trace_op = ftrace_ops_list;
func = ftrace_ops_list->func;
} else {
/* Just use the default ftrace_ops */
function_trace_op = &ftrace_list_end;
set_function_trace_op = &ftrace_list_end;
func = ftrace_ops_list_func;
}
/* If there's no change, then do nothing more here */
if (ftrace_trace_function == func)
return;
update_function_graph_func();
/*
* If we are using the list function, it doesn't care
* about the function_trace_ops.
*/
if (func == ftrace_ops_list_func) {
ftrace_trace_function = func;
/*
* Don't even bother setting function_trace_ops,
* it would be racy to do so anyway.
*/
return;
}
#ifndef CONFIG_DYNAMIC_FTRACE
/*
* For static tracing, we need to be a bit more careful.
* The function change takes affect immediately. Thus,
* we need to coorditate the setting of the function_trace_ops
* with the setting of the ftrace_trace_function.
*
* Set the function to the list ops, which will call the
* function we want, albeit indirectly, but it handles the
* ftrace_ops and doesn't depend on function_trace_op.
*/
ftrace_trace_function = ftrace_ops_list_func;
/*
* Make sure all CPUs see this. Yes this is slow, but static
* tracing is slow and nasty to have enabled.
*/
schedule_on_each_cpu(ftrace_sync);
/* Now all cpus are using the list ops. */
function_trace_op = set_function_trace_op;
/* Make sure the function_trace_op is visible on all CPUs */
smp_wmb();
/* Nasty way to force a rmb on all cpus */
smp_call_function(ftrace_sync_ipi, NULL, 1);
/* OK, we are all set to update the ftrace_trace_function now! */
#endif /* !CONFIG_DYNAMIC_FTRACE */
ftrace_trace_function = func;
}
@ -410,17 +480,6 @@ static int __register_ftrace_function(struct ftrace_ops *ops)
return 0;
}
static void ftrace_sync(struct work_struct *work)
{
/*
* This function is just a stub to implement a hard force
* of synchronize_sched(). This requires synchronizing
* tasks even in userspace and idle.
*
* Yes, function tracing is rude.
*/
}
static int __unregister_ftrace_function(struct ftrace_ops *ops)
{
int ret;
@ -439,20 +498,6 @@ static int __unregister_ftrace_function(struct ftrace_ops *ops)
} else if (ops->flags & FTRACE_OPS_FL_CONTROL) {
ret = remove_ftrace_list_ops(&ftrace_control_list,
&control_ops, ops);
if (!ret) {
/*
* The ftrace_ops is now removed from the list,
* so there'll be no new users. We must ensure
* all current users are done before we free
* the control data.
* Note synchronize_sched() is not enough, as we
* use preempt_disable() to do RCU, but the function
* tracer can be called where RCU is not active
* (before user_exit()).
*/
schedule_on_each_cpu(ftrace_sync);
control_ops_free(ops);
}
} else
ret = remove_ftrace_ops(&ftrace_ops_list, ops);
@ -462,17 +507,6 @@ static int __unregister_ftrace_function(struct ftrace_ops *ops)
if (ftrace_enabled)
update_ftrace_function();
/*
* Dynamic ops may be freed, we must make sure that all
* callers are done before leaving this function.
*
* Again, normal synchronize_sched() is not good enough.
* We need to do a hard force of sched synchronization.
*/
if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
schedule_on_each_cpu(ftrace_sync);
return 0;
}
@ -1082,19 +1116,6 @@ static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
static struct pid * const ftrace_swapper_pid = &init_struct_pid;
loff_t
ftrace_filter_lseek(struct file *file, loff_t offset, int whence)
{
loff_t ret;
if (file->f_mode & FMODE_READ)
ret = seq_lseek(file, offset, whence);
else
file->f_pos = ret = 1;
return ret;
}
#ifdef CONFIG_DYNAMIC_FTRACE
#ifndef CONFIG_FTRACE_MCOUNT_RECORD
@ -1992,8 +2013,14 @@ void ftrace_modify_all_code(int command)
else if (command & FTRACE_DISABLE_CALLS)
ftrace_replace_code(0);
if (update && ftrace_trace_function != ftrace_ops_list_func)
if (update && ftrace_trace_function != ftrace_ops_list_func) {
function_trace_op = set_function_trace_op;
smp_wmb();
/* If irqs are disabled, we are in stop machine */
if (!irqs_disabled())
smp_call_function(ftrace_sync_ipi, NULL, 1);
ftrace_update_ftrace_func(ftrace_trace_function);
}
if (command & FTRACE_START_FUNC_RET)
ftrace_enable_ftrace_graph_caller();
@ -2156,10 +2183,41 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
command |= FTRACE_UPDATE_TRACE_FUNC;
}
if (!command || !ftrace_enabled)
if (!command || !ftrace_enabled) {
/*
* If these are control ops, they still need their
* per_cpu field freed. Since, function tracing is
* not currently active, we can just free them
* without synchronizing all CPUs.
*/
if (ops->flags & FTRACE_OPS_FL_CONTROL)
control_ops_free(ops);
return 0;
}
ftrace_run_update_code(command);
/*
* Dynamic ops may be freed, we must make sure that all
* callers are done before leaving this function.
* The same goes for freeing the per_cpu data of the control
* ops.
*
* Again, normal synchronize_sched() is not good enough.
* We need to do a hard force of sched synchronization.
* This is because we use preempt_disable() to do RCU, but
* the function tracers can be called where RCU is not watching
* (like before user_exit()). We can not rely on the RCU
* infrastructure to do the synchronization, thus we must do it
* ourselves.
*/
if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_CONTROL)) {
schedule_on_each_cpu(ftrace_sync);
if (ops->flags & FTRACE_OPS_FL_CONTROL)
control_ops_free(ops);
}
return 0;
}
@ -2739,7 +2797,7 @@ static void ftrace_filter_reset(struct ftrace_hash *hash)
* routine, you can use ftrace_filter_write() for the write
* routine if @flag has FTRACE_ITER_FILTER set, or
* ftrace_notrace_write() if @flag has FTRACE_ITER_NOTRACE set.
* ftrace_filter_lseek() should be used as the lseek routine, and
* tracing_lseek() should be used as the lseek routine, and
* release must call ftrace_regex_release().
*/
int
@ -3767,7 +3825,7 @@ static const struct file_operations ftrace_filter_fops = {
.open = ftrace_filter_open,
.read = seq_read,
.write = ftrace_filter_write,
.llseek = ftrace_filter_lseek,
.llseek = tracing_lseek,
.release = ftrace_regex_release,
};
@ -3775,7 +3833,7 @@ static const struct file_operations ftrace_notrace_fops = {
.open = ftrace_notrace_open,
.read = seq_read,
.write = ftrace_notrace_write,
.llseek = ftrace_filter_lseek,
.llseek = tracing_lseek,
.release = ftrace_regex_release,
};
@ -4038,7 +4096,7 @@ static const struct file_operations ftrace_graph_fops = {
.open = ftrace_graph_open,
.read = seq_read,
.write = ftrace_graph_write,
.llseek = ftrace_filter_lseek,
.llseek = tracing_lseek,
.release = ftrace_graph_release,
};
@ -4046,7 +4104,7 @@ static const struct file_operations ftrace_graph_notrace_fops = {
.open = ftrace_graph_notrace_open,
.read = seq_read,
.write = ftrace_graph_write,
.llseek = ftrace_filter_lseek,
.llseek = tracing_lseek,
.release = ftrace_graph_release,
};
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
@ -4719,7 +4777,7 @@ static const struct file_operations ftrace_pid_fops = {
.open = ftrace_pid_open,
.write = ftrace_pid_write,
.read = seq_read,
.llseek = ftrace_filter_lseek,
.llseek = tracing_lseek,
.release = ftrace_pid_release,
};
@ -4862,6 +4920,7 @@ int ftrace_graph_entry_stub(struct ftrace_graph_ent *trace)
trace_func_graph_ret_t ftrace_graph_return =
(trace_func_graph_ret_t)ftrace_stub;
trace_func_graph_ent_t ftrace_graph_entry = ftrace_graph_entry_stub;
static trace_func_graph_ent_t __ftrace_graph_entry = ftrace_graph_entry_stub;
/* Try to assign a return stack array on FTRACE_RETSTACK_ALLOC_SIZE tasks. */
static int alloc_retstack_tasklist(struct ftrace_ret_stack **ret_stack_list)
@ -5003,6 +5062,30 @@ static struct ftrace_ops fgraph_ops __read_mostly = {
FTRACE_OPS_FL_RECURSION_SAFE,
};
static int ftrace_graph_entry_test(struct ftrace_graph_ent *trace)
{
if (!ftrace_ops_test(&global_ops, trace->func, NULL))
return 0;
return __ftrace_graph_entry(trace);
}
/*
* The function graph tracer should only trace the functions defined
* by set_ftrace_filter and set_ftrace_notrace. If another function
* tracer ops is registered, the graph tracer requires testing the
* function against the global ops, and not just trace any function
* that any ftrace_ops registered.
*/
static void update_function_graph_func(void)
{
if (ftrace_ops_list == &ftrace_list_end ||
(ftrace_ops_list == &global_ops &&
global_ops.next == &ftrace_list_end))
ftrace_graph_entry = __ftrace_graph_entry;
else
ftrace_graph_entry = ftrace_graph_entry_test;
}
int register_ftrace_graph(trace_func_graph_ret_t retfunc,
trace_func_graph_ent_t entryfunc)
{
@ -5027,7 +5110,16 @@ int register_ftrace_graph(trace_func_graph_ret_t retfunc,
}
ftrace_graph_return = retfunc;
ftrace_graph_entry = entryfunc;
/*
* Update the indirect function to the entryfunc, and the
* function that gets called to the entry_test first. Then
* call the update fgraph entry function to determine if
* the entryfunc should be called directly or not.
*/
__ftrace_graph_entry = entryfunc;
ftrace_graph_entry = ftrace_graph_entry_test;
update_function_graph_func();
ret = ftrace_startup(&fgraph_ops, FTRACE_START_FUNC_RET);
@ -5046,6 +5138,7 @@ void unregister_ftrace_graph(void)
ftrace_graph_active--;
ftrace_graph_return = (trace_func_graph_ret_t)ftrace_stub;
ftrace_graph_entry = ftrace_graph_entry_stub;
__ftrace_graph_entry = ftrace_graph_entry_stub;
ftrace_shutdown(&fgraph_ops, FTRACE_STOP_FUNC_RET);
unregister_pm_notifier(&ftrace_suspend_notifier);
unregister_trace_sched_switch(ftrace_graph_probe_sched_switch, NULL);

View File

@ -594,6 +594,28 @@ void free_snapshot(struct trace_array *tr)
tr->allocated_snapshot = false;
}
/**
* tracing_alloc_snapshot - allocate snapshot buffer.
*
* This only allocates the snapshot buffer if it isn't already
* allocated - it doesn't also take a snapshot.
*
* This is meant to be used in cases where the snapshot buffer needs
* to be set up for events that can't sleep but need to be able to
* trigger a snapshot.
*/
int tracing_alloc_snapshot(void)
{
struct trace_array *tr = &global_trace;
int ret;
ret = alloc_snapshot(tr);
WARN_ON(ret < 0);
return ret;
}
EXPORT_SYMBOL_GPL(tracing_alloc_snapshot);
/**
* trace_snapshot_alloc - allocate and take a snapshot of the current buffer.
*
@ -607,11 +629,10 @@ void free_snapshot(struct trace_array *tr)
*/
void tracing_snapshot_alloc(void)
{
struct trace_array *tr = &global_trace;
int ret;
ret = alloc_snapshot(tr);
if (WARN_ON(ret < 0))
ret = tracing_alloc_snapshot();
if (ret < 0)
return;
tracing_snapshot();
@ -623,6 +644,12 @@ void tracing_snapshot(void)
WARN_ONCE(1, "Snapshot feature not enabled, but internal snapshot used");
}
EXPORT_SYMBOL_GPL(tracing_snapshot);
int tracing_alloc_snapshot(void)
{
WARN_ONCE(1, "Snapshot feature not enabled, but snapshot allocation used");
return -ENODEV;
}
EXPORT_SYMBOL_GPL(tracing_alloc_snapshot);
void tracing_snapshot_alloc(void)
{
/* Give warning */
@ -3156,19 +3183,23 @@ tracing_write_stub(struct file *filp, const char __user *ubuf,
return count;
}
static loff_t tracing_seek(struct file *file, loff_t offset, int origin)
loff_t tracing_lseek(struct file *file, loff_t offset, int whence)
{
int ret;
if (file->f_mode & FMODE_READ)
return seq_lseek(file, offset, origin);
ret = seq_lseek(file, offset, whence);
else
return 0;
file->f_pos = ret = 0;
return ret;
}
static const struct file_operations tracing_fops = {
.open = tracing_open,
.read = seq_read,
.write = tracing_write_stub,
.llseek = tracing_seek,
.llseek = tracing_lseek,
.release = tracing_release,
};
@ -4212,12 +4243,6 @@ out:
return sret;
}
static void tracing_pipe_buf_release(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
{
__free_page(buf->page);
}
static void tracing_spd_release_pipe(struct splice_pipe_desc *spd,
unsigned int idx)
{
@ -4229,7 +4254,7 @@ static const struct pipe_buf_operations tracing_pipe_buf_ops = {
.map = generic_pipe_buf_map,
.unmap = generic_pipe_buf_unmap,
.confirm = generic_pipe_buf_confirm,
.release = tracing_pipe_buf_release,
.release = generic_pipe_buf_release,
.steal = generic_pipe_buf_steal,
.get = generic_pipe_buf_get,
};
@ -4913,7 +4938,7 @@ static const struct file_operations snapshot_fops = {
.open = tracing_snapshot_open,
.read = seq_read,
.write = tracing_snapshot_write,
.llseek = tracing_seek,
.llseek = tracing_lseek,
.release = tracing_snapshot_release,
};
@ -5883,6 +5908,8 @@ allocate_trace_buffer(struct trace_array *tr, struct trace_buffer *buf, int size
rb_flags = trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0;
buf->tr = tr;
buf->buffer = ring_buffer_alloc(size, rb_flags);
if (!buf->buffer)
return -ENOMEM;

View File

@ -1,3 +1,4 @@
#ifndef _LINUX_KERNEL_TRACE_H
#define _LINUX_KERNEL_TRACE_H
@ -587,6 +588,8 @@ void tracing_start_sched_switch_record(void);
int register_tracer(struct tracer *type);
int is_tracing_stopped(void);
loff_t tracing_lseek(struct file *file, loff_t offset, int whence);
extern cpumask_var_t __read_mostly tracing_buffer_mask;
#define for_each_tracing_cpu(cpu) \
@ -1020,6 +1023,10 @@ extern int apply_subsystem_event_filter(struct ftrace_subsystem_dir *dir,
extern void print_subsystem_event_filter(struct event_subsystem *system,
struct trace_seq *s);
extern int filter_assign_type(const char *type);
extern int create_event_filter(struct ftrace_event_call *call,
char *filter_str, bool set_str,
struct event_filter **filterp);
extern void free_event_filter(struct event_filter *filter);
struct ftrace_event_field *
trace_find_event_field(struct ftrace_event_call *call, char *name);
@ -1028,9 +1035,195 @@ extern void trace_event_enable_cmd_record(bool enable);
extern int event_trace_add_tracer(struct dentry *parent, struct trace_array *tr);
extern int event_trace_del_tracer(struct trace_array *tr);
extern struct ftrace_event_file *find_event_file(struct trace_array *tr,
const char *system,
const char *event);
static inline void *event_file_data(struct file *filp)
{
return ACCESS_ONCE(file_inode(filp)->i_private);
}
extern struct mutex event_mutex;
extern struct list_head ftrace_events;
extern const struct file_operations event_trigger_fops;
extern int register_trigger_cmds(void);
extern void clear_event_triggers(struct trace_array *tr);
struct event_trigger_data {
unsigned long count;
int ref;
struct event_trigger_ops *ops;
struct event_command *cmd_ops;
struct event_filter __rcu *filter;
char *filter_str;
void *private_data;
struct list_head list;
};
/**
* struct event_trigger_ops - callbacks for trace event triggers
*
* The methods in this structure provide per-event trigger hooks for
* various trigger operations.
*
* All the methods below, except for @init() and @free(), must be
* implemented.
*
* @func: The trigger 'probe' function called when the triggering
* event occurs. The data passed into this callback is the data
* that was supplied to the event_command @reg() function that
* registered the trigger (see struct event_command).
*
* @init: An optional initialization function called for the trigger
* when the trigger is registered (via the event_command reg()
* function). This can be used to perform per-trigger
* initialization such as incrementing a per-trigger reference
* count, for instance. This is usually implemented by the
* generic utility function @event_trigger_init() (see
* trace_event_triggers.c).
*
* @free: An optional de-initialization function called for the
* trigger when the trigger is unregistered (via the
* event_command @reg() function). This can be used to perform
* per-trigger de-initialization such as decrementing a
* per-trigger reference count and freeing corresponding trigger
* data, for instance. This is usually implemented by the
* generic utility function @event_trigger_free() (see
* trace_event_triggers.c).
*
* @print: The callback function invoked to have the trigger print
* itself. This is usually implemented by a wrapper function
* that calls the generic utility function @event_trigger_print()
* (see trace_event_triggers.c).
*/
struct event_trigger_ops {
void (*func)(struct event_trigger_data *data);
int (*init)(struct event_trigger_ops *ops,
struct event_trigger_data *data);
void (*free)(struct event_trigger_ops *ops,
struct event_trigger_data *data);
int (*print)(struct seq_file *m,
struct event_trigger_ops *ops,
struct event_trigger_data *data);
};
/**
* struct event_command - callbacks and data members for event commands
*
* Event commands are invoked by users by writing the command name
* into the 'trigger' file associated with a trace event. The
* parameters associated with a specific invocation of an event
* command are used to create an event trigger instance, which is
* added to the list of trigger instances associated with that trace
* event. When the event is hit, the set of triggers associated with
* that event is invoked.
*
* The data members in this structure provide per-event command data
* for various event commands.
*
* All the data members below, except for @post_trigger, must be set
* for each event command.
*
* @name: The unique name that identifies the event command. This is
* the name used when setting triggers via trigger files.
*
* @trigger_type: A unique id that identifies the event command
* 'type'. This value has two purposes, the first to ensure that
* only one trigger of the same type can be set at a given time
* for a particular event e.g. it doesn't make sense to have both
* a traceon and traceoff trigger attached to a single event at
* the same time, so traceon and traceoff have the same type
* though they have different names. The @trigger_type value is
* also used as a bit value for deferring the actual trigger
* action until after the current event is finished. Some
* commands need to do this if they themselves log to the trace
* buffer (see the @post_trigger() member below). @trigger_type
* values are defined by adding new values to the trigger_type
* enum in include/linux/ftrace_event.h.
*
* @post_trigger: A flag that says whether or not this command needs
* to have its action delayed until after the current event has
* been closed. Some triggers need to avoid being invoked while
* an event is currently in the process of being logged, since
* the trigger may itself log data into the trace buffer. Thus
* we make sure the current event is committed before invoking
* those triggers. To do that, the trigger invocation is split
* in two - the first part checks the filter using the current
* trace record; if a command has the @post_trigger flag set, it
* sets a bit for itself in the return value, otherwise it
* directly invokes the trigger. Once all commands have been
* either invoked or set their return flag, the current record is
* either committed or discarded. At that point, if any commands
* have deferred their triggers, those commands are finally
* invoked following the close of the current event. In other
* words, if the event_trigger_ops @func() probe implementation
* itself logs to the trace buffer, this flag should be set,
* otherwise it can be left unspecified.
*
* All the methods below, except for @set_filter(), must be
* implemented.
*
* @func: The callback function responsible for parsing and
* registering the trigger written to the 'trigger' file by the
* user. It allocates the trigger instance and registers it with
* the appropriate trace event. It makes use of the other
* event_command callback functions to orchestrate this, and is
* usually implemented by the generic utility function
* @event_trigger_callback() (see trace_event_triggers.c).
*
* @reg: Adds the trigger to the list of triggers associated with the
* event, and enables the event trigger itself, after
* initializing it (via the event_trigger_ops @init() function).
* This is also where commands can use the @trigger_type value to
* make the decision as to whether or not multiple instances of
* the trigger should be allowed. This is usually implemented by
* the generic utility function @register_trigger() (see
* trace_event_triggers.c).
*
* @unreg: Removes the trigger from the list of triggers associated
* with the event, and disables the event trigger itself, after
* initializing it (via the event_trigger_ops @free() function).
* This is usually implemented by the generic utility function
* @unregister_trigger() (see trace_event_triggers.c).
*
* @set_filter: An optional function called to parse and set a filter
* for the trigger. If no @set_filter() method is set for the
* event command, filters set by the user for the command will be
* ignored. This is usually implemented by the generic utility
* function @set_trigger_filter() (see trace_event_triggers.c).
*
* @get_trigger_ops: The callback function invoked to retrieve the
* event_trigger_ops implementation associated with the command.
*/
struct event_command {
struct list_head list;
char *name;
enum event_trigger_type trigger_type;
bool post_trigger;
int (*func)(struct event_command *cmd_ops,
struct ftrace_event_file *file,
char *glob, char *cmd, char *params);
int (*reg)(char *glob,
struct event_trigger_ops *ops,
struct event_trigger_data *data,
struct ftrace_event_file *file);
void (*unreg)(char *glob,
struct event_trigger_ops *ops,
struct event_trigger_data *data,
struct ftrace_event_file *file);
int (*set_filter)(char *filter_str,
struct event_trigger_data *data,
struct ftrace_event_file *file);
struct event_trigger_ops *(*get_trigger_ops)(char *cmd, char *param);
};
extern int trace_event_enable_disable(struct ftrace_event_file *file,
int enable, int soft_disable);
extern int tracing_alloc_snapshot(void);
extern const char *__start___trace_bprintk_fmt[];
extern const char *__stop___trace_bprintk_fmt[];

View File

@ -342,6 +342,12 @@ static int __ftrace_event_enable_disable(struct ftrace_event_file *file,
return ret;
}
int trace_event_enable_disable(struct ftrace_event_file *file,
int enable, int soft_disable)
{
return __ftrace_event_enable_disable(file, enable, soft_disable);
}
static int ftrace_event_enable_disable(struct ftrace_event_file *file,
int enable)
{
@ -421,11 +427,6 @@ static void remove_subsystem(struct ftrace_subsystem_dir *dir)
}
}
static void *event_file_data(struct file *filp)
{
return ACCESS_ONCE(file_inode(filp)->i_private);
}
static void remove_event_file_dir(struct ftrace_event_file *file)
{
struct dentry *dir = file->dir;
@ -1549,6 +1550,9 @@ event_create_dir(struct dentry *parent, struct ftrace_event_file *file)
trace_create_file("filter", 0644, file->dir, file,
&ftrace_event_filter_fops);
trace_create_file("trigger", 0644, file->dir, file,
&event_trigger_fops);
trace_create_file("format", 0444, file->dir, call,
&ftrace_event_format_fops);
@ -1645,6 +1649,8 @@ trace_create_new_event(struct ftrace_event_call *call,
file->event_call = call;
file->tr = tr;
atomic_set(&file->sm_ref, 0);
atomic_set(&file->tm_ref, 0);
INIT_LIST_HEAD(&file->triggers);
list_add(&file->list, &tr->events);
return file;
@ -1849,20 +1855,7 @@ __trace_add_event_dirs(struct trace_array *tr)
}
}
#ifdef CONFIG_DYNAMIC_FTRACE
/* Avoid typos */
#define ENABLE_EVENT_STR "enable_event"
#define DISABLE_EVENT_STR "disable_event"
struct event_probe_data {
struct ftrace_event_file *file;
unsigned long count;
int ref;
bool enable;
};
static struct ftrace_event_file *
struct ftrace_event_file *
find_event_file(struct trace_array *tr, const char *system, const char *event)
{
struct ftrace_event_file *file;
@ -1885,6 +1878,19 @@ find_event_file(struct trace_array *tr, const char *system, const char *event)
return NULL;
}
#ifdef CONFIG_DYNAMIC_FTRACE
/* Avoid typos */
#define ENABLE_EVENT_STR "enable_event"
#define DISABLE_EVENT_STR "disable_event"
struct event_probe_data {
struct ftrace_event_file *file;
unsigned long count;
int ref;
bool enable;
};
static void
event_enable_probe(unsigned long ip, unsigned long parent_ip, void **_data)
{
@ -2311,6 +2317,9 @@ int event_trace_del_tracer(struct trace_array *tr)
{
mutex_lock(&event_mutex);
/* Disable any event triggers and associated soft-disabled events */
clear_event_triggers(tr);
/* Disable any running events */
__ftrace_set_clr_event_nolock(tr, NULL, NULL, NULL, 0);
@ -2377,6 +2386,8 @@ static __init int event_trace_enable(void)
register_event_cmds();
register_trigger_cmds();
return 0;
}

View File

@ -799,6 +799,11 @@ static void __free_filter(struct event_filter *filter)
kfree(filter);
}
void free_event_filter(struct event_filter *filter)
{
__free_filter(filter);
}
void destroy_call_preds(struct ftrace_event_call *call)
{
__free_filter(call->filter);
@ -1938,6 +1943,13 @@ static int create_filter(struct ftrace_event_call *call,
return err;
}
int create_event_filter(struct ftrace_event_call *call,
char *filter_str, bool set_str,
struct event_filter **filterp)
{
return create_filter(call, filter_str, set_str, filterp);
}
/**
* create_system_filter - create a filter for an event_subsystem
* @system: event_subsystem to create a filter for

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -35,46 +35,27 @@ const char *reserved_field_names[] = {
FIELD_STRING_FUNC,
};
/* Printing function type */
#define PRINT_TYPE_FUNC_NAME(type) print_type_##type
#define PRINT_TYPE_FMT_NAME(type) print_type_format_##type
/* Printing in basic type function template */
#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt, cast) \
static __kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt) \
__kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
const char *name, \
void *data, void *ent)\
void *data, void *ent) \
{ \
return trace_seq_printf(s, " %s=" fmt, name, (cast)*(type *)data);\
return trace_seq_printf(s, " %s=" fmt, name, *(type *)data); \
} \
static const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
const char PRINT_TYPE_FMT_NAME(type)[] = fmt;
DEFINE_BASIC_PRINT_TYPE_FUNC(u8, "%x", unsigned int)
DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "%x", unsigned int)
DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "%lx", unsigned long)
DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "%llx", unsigned long long)
DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d", int)
DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d", int)
DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%ld", long)
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%lld", long long)
static inline void *get_rloc_data(u32 *dl)
{
return (u8 *)dl + get_rloc_offs(*dl);
}
/* For data_loc conversion */
static inline void *get_loc_data(u32 *dl, void *ent)
{
return (u8 *)ent + get_rloc_offs(*dl);
}
/* For defining macros, define string/string_size types */
typedef u32 string;
typedef u32 string_size;
DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "0x%Lx")
DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%Ld")
/* Print type function for string type */
static __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
__kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
const char *name,
void *data, void *ent)
{
@ -87,18 +68,7 @@ static __kprobes int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s,
(const char *)get_loc_data(data, ent));
}
static const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
#define FETCH_FUNC_NAME(method, type) fetch_##method##_##type
/*
* Define macro for basic types - we don't need to define s* types, because
* we have to care only about bitwidth at recording time.
*/
#define DEFINE_BASIC_FETCH_FUNCS(method) \
DEFINE_FETCH_##method(u8) \
DEFINE_FETCH_##method(u16) \
DEFINE_FETCH_##method(u32) \
DEFINE_FETCH_##method(u64)
const char PRINT_TYPE_FMT_NAME(string)[] = "\\\"%s\\\"";
#define CHECK_FETCH_FUNCS(method, fn) \
(((FETCH_FUNC_NAME(method, u8) == fn) || \
@ -111,7 +81,7 @@ DEFINE_FETCH_##method(u64)
/* Data fetch function templates */
#define DEFINE_FETCH_reg(type) \
static __kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
__kprobes void FETCH_FUNC_NAME(reg, type)(struct pt_regs *regs, \
void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_register(regs, \
@ -122,20 +92,8 @@ DEFINE_BASIC_FETCH_FUNCS(reg)
#define fetch_reg_string NULL
#define fetch_reg_string_size NULL
#define DEFINE_FETCH_stack(type) \
static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
void *offset, void *dest) \
{ \
*(type *)dest = (type)regs_get_kernel_stack_nth(regs, \
(unsigned int)((unsigned long)offset)); \
}
DEFINE_BASIC_FETCH_FUNCS(stack)
/* No string on the stack entry */
#define fetch_stack_string NULL
#define fetch_stack_string_size NULL
#define DEFINE_FETCH_retval(type) \
static __kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs,\
__kprobes void FETCH_FUNC_NAME(retval, type)(struct pt_regs *regs, \
void *dummy, void *dest) \
{ \
*(type *)dest = (type)regs_return_value(regs); \
@ -145,150 +103,16 @@ DEFINE_BASIC_FETCH_FUNCS(retval)
#define fetch_retval_string NULL
#define fetch_retval_string_size NULL
#define DEFINE_FETCH_memory(type) \
static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
void *addr, void *dest) \
{ \
type retval; \
if (probe_kernel_address(addr, retval)) \
*(type *)dest = 0; \
else \
*(type *)dest = retval; \
}
DEFINE_BASIC_FETCH_FUNCS(memory)
/*
* Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
* length and relative data location.
*/
static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
void *addr, void *dest)
{
long ret;
int maxlen = get_rloc_len(*(u32 *)dest);
u8 *dst = get_rloc_data(dest);
u8 *src = addr;
mm_segment_t old_fs = get_fs();
if (!maxlen)
return;
/*
* Try to get string again, since the string can be changed while
* probing.
*/
set_fs(KERNEL_DS);
pagefault_disable();
do
ret = __copy_from_user_inatomic(dst++, src++, 1);
while (dst[-1] && ret == 0 && src - (u8 *)addr < maxlen);
dst[-1] = '\0';
pagefault_enable();
set_fs(old_fs);
if (ret < 0) { /* Failed to fetch string */
((u8 *)get_rloc_data(dest))[0] = '\0';
*(u32 *)dest = make_data_rloc(0, get_rloc_offs(*(u32 *)dest));
} else {
*(u32 *)dest = make_data_rloc(src - (u8 *)addr,
get_rloc_offs(*(u32 *)dest));
}
}
/* Return the length of string -- including null terminal byte */
static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
void *addr, void *dest)
{
mm_segment_t old_fs;
int ret, len = 0;
u8 c;
old_fs = get_fs();
set_fs(KERNEL_DS);
pagefault_disable();
do {
ret = __copy_from_user_inatomic(&c, (u8 *)addr + len, 1);
len++;
} while (c && ret == 0 && len < MAX_STRING_SIZE);
pagefault_enable();
set_fs(old_fs);
if (ret < 0) /* Failed to check the length */
*(u32 *)dest = 0;
else
*(u32 *)dest = len;
}
/* Memory fetching by symbol */
struct symbol_cache {
char *symbol;
long offset;
unsigned long addr;
};
static unsigned long update_symbol_cache(struct symbol_cache *sc)
{
sc->addr = (unsigned long)kallsyms_lookup_name(sc->symbol);
if (sc->addr)
sc->addr += sc->offset;
return sc->addr;
}
static void free_symbol_cache(struct symbol_cache *sc)
{
kfree(sc->symbol);
kfree(sc);
}
static struct symbol_cache *alloc_symbol_cache(const char *sym, long offset)
{
struct symbol_cache *sc;
if (!sym || strlen(sym) == 0)
return NULL;
sc = kzalloc(sizeof(struct symbol_cache), GFP_KERNEL);
if (!sc)
return NULL;
sc->symbol = kstrdup(sym, GFP_KERNEL);
if (!sc->symbol) {
kfree(sc);
return NULL;
}
sc->offset = offset;
update_symbol_cache(sc);
return sc;
}
#define DEFINE_FETCH_symbol(type) \
static __kprobes void FETCH_FUNC_NAME(symbol, type)(struct pt_regs *regs,\
void *data, void *dest) \
{ \
struct symbol_cache *sc = data; \
if (sc->addr) \
fetch_memory_##type(regs, (void *)sc->addr, dest); \
else \
*(type *)dest = 0; \
}
DEFINE_BASIC_FETCH_FUNCS(symbol)
DEFINE_FETCH_symbol(string)
DEFINE_FETCH_symbol(string_size)
/* Dereference memory access function */
struct deref_fetch_param {
struct fetch_param orig;
long offset;
fetch_func_t fetch;
fetch_func_t fetch_size;
};
#define DEFINE_FETCH_deref(type) \
static __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs,\
__kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs, \
void *data, void *dest) \
{ \
struct deref_fetch_param *dprm = data; \
@ -296,13 +120,26 @@ static __kprobes void FETCH_FUNC_NAME(deref, type)(struct pt_regs *regs,\
call_fetch(&dprm->orig, regs, &addr); \
if (addr) { \
addr += dprm->offset; \
fetch_memory_##type(regs, (void *)addr, dest); \
dprm->fetch(regs, (void *)addr, dest); \
} else \
*(type *)dest = 0; \
}
DEFINE_BASIC_FETCH_FUNCS(deref)
DEFINE_FETCH_deref(string)
DEFINE_FETCH_deref(string_size)
__kprobes void FETCH_FUNC_NAME(deref, string_size)(struct pt_regs *regs,
void *data, void *dest)
{
struct deref_fetch_param *dprm = data;
unsigned long addr;
call_fetch(&dprm->orig, regs, &addr);
if (addr && dprm->fetch_size) {
addr += dprm->offset;
dprm->fetch_size(regs, (void *)addr, dest);
} else
*(string_size *)dest = 0;
}
static __kprobes void update_deref_fetch_param(struct deref_fetch_param *data)
{
@ -329,7 +166,7 @@ struct bitfield_fetch_param {
};
#define DEFINE_FETCH_bitfield(type) \
static __kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs,\
__kprobes void FETCH_FUNC_NAME(bitfield, type)(struct pt_regs *regs, \
void *data, void *dest) \
{ \
struct bitfield_fetch_param *bprm = data; \
@ -374,58 +211,8 @@ free_bitfield_fetch_param(struct bitfield_fetch_param *data)
kfree(data);
}
/* Default (unsigned long) fetch type */
#define __DEFAULT_FETCH_TYPE(t) u##t
#define _DEFAULT_FETCH_TYPE(t) __DEFAULT_FETCH_TYPE(t)
#define DEFAULT_FETCH_TYPE _DEFAULT_FETCH_TYPE(BITS_PER_LONG)
#define DEFAULT_FETCH_TYPE_STR __stringify(DEFAULT_FETCH_TYPE)
#define ASSIGN_FETCH_FUNC(method, type) \
[FETCH_MTD_##method] = FETCH_FUNC_NAME(method, type)
#define __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, _fmttype) \
{.name = _name, \
.size = _size, \
.is_signed = sign, \
.print = PRINT_TYPE_FUNC_NAME(ptype), \
.fmt = PRINT_TYPE_FMT_NAME(ptype), \
.fmttype = _fmttype, \
.fetch = { \
ASSIGN_FETCH_FUNC(reg, ftype), \
ASSIGN_FETCH_FUNC(stack, ftype), \
ASSIGN_FETCH_FUNC(retval, ftype), \
ASSIGN_FETCH_FUNC(memory, ftype), \
ASSIGN_FETCH_FUNC(symbol, ftype), \
ASSIGN_FETCH_FUNC(deref, ftype), \
ASSIGN_FETCH_FUNC(bitfield, ftype), \
} \
}
#define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \
__ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, #ptype)
#define FETCH_TYPE_STRING 0
#define FETCH_TYPE_STRSIZE 1
/* Fetch type information table */
static const struct fetch_type fetch_type_table[] = {
/* Special types */
[FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string,
sizeof(u32), 1, "__data_loc char[]"),
[FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32,
string_size, sizeof(u32), 0, "u32"),
/* Basic types */
ASSIGN_FETCH_TYPE(u8, u8, 0),
ASSIGN_FETCH_TYPE(u16, u16, 0),
ASSIGN_FETCH_TYPE(u32, u32, 0),
ASSIGN_FETCH_TYPE(u64, u64, 0),
ASSIGN_FETCH_TYPE(s8, u8, 1),
ASSIGN_FETCH_TYPE(s16, u16, 1),
ASSIGN_FETCH_TYPE(s32, u32, 1),
ASSIGN_FETCH_TYPE(s64, u64, 1),
};
static const struct fetch_type *find_fetch_type(const char *type)
static const struct fetch_type *find_fetch_type(const char *type,
const struct fetch_type *ftbl)
{
int i;
@ -446,44 +233,52 @@ static const struct fetch_type *find_fetch_type(const char *type)
switch (bs) {
case 8:
return find_fetch_type("u8");
return find_fetch_type("u8", ftbl);
case 16:
return find_fetch_type("u16");
return find_fetch_type("u16", ftbl);
case 32:
return find_fetch_type("u32");
return find_fetch_type("u32", ftbl);
case 64:
return find_fetch_type("u64");
return find_fetch_type("u64", ftbl);
default:
goto fail;
}
}
for (i = 0; i < ARRAY_SIZE(fetch_type_table); i++)
if (strcmp(type, fetch_type_table[i].name) == 0)
return &fetch_type_table[i];
for (i = 0; ftbl[i].name; i++) {
if (strcmp(type, ftbl[i].name) == 0)
return &ftbl[i];
}
fail:
return NULL;
}
/* Special function : only accept unsigned long */
static __kprobes void fetch_stack_address(struct pt_regs *regs,
static __kprobes void fetch_kernel_stack_address(struct pt_regs *regs,
void *dummy, void *dest)
{
*(unsigned long *)dest = kernel_stack_pointer(regs);
}
static __kprobes void fetch_user_stack_address(struct pt_regs *regs,
void *dummy, void *dest)
{
*(unsigned long *)dest = user_stack_pointer(regs);
}
static fetch_func_t get_fetch_size_function(const struct fetch_type *type,
fetch_func_t orig_fn)
fetch_func_t orig_fn,
const struct fetch_type *ftbl)
{
int i;
if (type != &fetch_type_table[FETCH_TYPE_STRING])
if (type != &ftbl[FETCH_TYPE_STRING])
return NULL; /* Only string type needs size function */
for (i = 0; i < FETCH_MTD_END; i++)
if (type->fetch[i] == orig_fn)
return fetch_type_table[FETCH_TYPE_STRSIZE].fetch[i];
return ftbl[FETCH_TYPE_STRSIZE].fetch[i];
WARN_ON(1); /* This should not happen */
@ -516,7 +311,8 @@ int traceprobe_split_symbol_offset(char *symbol, unsigned long *offset)
#define PARAM_MAX_STACK (THREAD_SIZE / sizeof(unsigned long))
static int parse_probe_vars(char *arg, const struct fetch_type *t,
struct fetch_param *f, bool is_return)
struct fetch_param *f, bool is_return,
bool is_kprobe)
{
int ret = 0;
unsigned long param;
@ -528,13 +324,16 @@ static int parse_probe_vars(char *arg, const struct fetch_type *t,
ret = -EINVAL;
} else if (strncmp(arg, "stack", 5) == 0) {
if (arg[5] == '\0') {
if (strcmp(t->name, DEFAULT_FETCH_TYPE_STR) == 0)
f->fn = fetch_stack_address;
if (strcmp(t->name, DEFAULT_FETCH_TYPE_STR))
return -EINVAL;
if (is_kprobe)
f->fn = fetch_kernel_stack_address;
else
ret = -EINVAL;
f->fn = fetch_user_stack_address;
} else if (isdigit(arg[5])) {
ret = kstrtoul(arg + 5, 10, &param);
if (ret || param > PARAM_MAX_STACK)
if (ret || (is_kprobe && param > PARAM_MAX_STACK))
ret = -EINVAL;
else {
f->fn = t->fetch[FETCH_MTD_stack];
@ -552,20 +351,18 @@ static int parse_probe_vars(char *arg, const struct fetch_type *t,
static int parse_probe_arg(char *arg, const struct fetch_type *t,
struct fetch_param *f, bool is_return, bool is_kprobe)
{
const struct fetch_type *ftbl;
unsigned long param;
long offset;
char *tmp;
int ret;
int ret = 0;
ret = 0;
/* Until uprobe_events supports only reg arguments */
if (!is_kprobe && arg[0] != '%')
return -EINVAL;
ftbl = is_kprobe ? kprobes_fetch_type_table : uprobes_fetch_type_table;
BUG_ON(ftbl == NULL);
switch (arg[0]) {
case '$':
ret = parse_probe_vars(arg + 1, t, f, is_return);
ret = parse_probe_vars(arg + 1, t, f, is_return, is_kprobe);
break;
case '%': /* named register */
@ -577,7 +374,7 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
}
break;
case '@': /* memory or symbol */
case '@': /* memory, file-offset or symbol */
if (isdigit(arg[1])) {
ret = kstrtoul(arg + 1, 0, &param);
if (ret)
@ -585,7 +382,22 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
f->fn = t->fetch[FETCH_MTD_memory];
f->data = (void *)param;
} else if (arg[1] == '+') {
/* kprobes don't support file offsets */
if (is_kprobe)
return -EINVAL;
ret = kstrtol(arg + 2, 0, &offset);
if (ret)
break;
f->fn = t->fetch[FETCH_MTD_file_offset];
f->data = (void *)offset;
} else {
/* uprobes don't support symbols */
if (!is_kprobe)
return -EINVAL;
ret = traceprobe_split_symbol_offset(arg + 1, &offset);
if (ret)
break;
@ -616,7 +428,7 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
struct deref_fetch_param *dprm;
const struct fetch_type *t2;
t2 = find_fetch_type(NULL);
t2 = find_fetch_type(NULL, ftbl);
*tmp = '\0';
dprm = kzalloc(sizeof(struct deref_fetch_param), GFP_KERNEL);
@ -624,6 +436,9 @@ static int parse_probe_arg(char *arg, const struct fetch_type *t,
return -ENOMEM;
dprm->offset = offset;
dprm->fetch = t->fetch[FETCH_MTD_memory];
dprm->fetch_size = get_fetch_size_function(t,
dprm->fetch, ftbl);
ret = parse_probe_arg(arg, t2, &dprm->orig, is_return,
is_kprobe);
if (ret)
@ -685,9 +500,13 @@ static int __parse_bitfield_probe_arg(const char *bf,
int traceprobe_parse_probe_arg(char *arg, ssize_t *size,
struct probe_arg *parg, bool is_return, bool is_kprobe)
{
const struct fetch_type *ftbl;
const char *t;
int ret;
ftbl = is_kprobe ? kprobes_fetch_type_table : uprobes_fetch_type_table;
BUG_ON(ftbl == NULL);
if (strlen(arg) > MAX_ARGSTR_LEN) {
pr_info("Argument is too long.: %s\n", arg);
return -ENOSPC;
@ -702,7 +521,7 @@ int traceprobe_parse_probe_arg(char *arg, ssize_t *size,
arg[t - parg->comm] = '\0';
t++;
}
parg->type = find_fetch_type(t);
parg->type = find_fetch_type(t, ftbl);
if (!parg->type) {
pr_info("Unsupported type: %s\n", t);
return -EINVAL;
@ -716,7 +535,8 @@ int traceprobe_parse_probe_arg(char *arg, ssize_t *size,
if (ret >= 0) {
parg->fetch_size.fn = get_fetch_size_function(parg->type,
parg->fetch.fn);
parg->fetch.fn,
ftbl);
parg->fetch_size.data = parg->fetch.data;
}
@ -837,3 +657,65 @@ out:
return ret;
}
static int __set_print_fmt(struct trace_probe *tp, char *buf, int len,
bool is_return)
{
int i;
int pos = 0;
const char *fmt, *arg;
if (!is_return) {
fmt = "(%lx)";
arg = "REC->" FIELD_STRING_IP;
} else {
fmt = "(%lx <- %lx)";
arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
}
/* When len=0, we just calculate the needed length */
#define LEN_OR_ZERO (len ? len - pos : 0)
pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
for (i = 0; i < tp->nr_args; i++) {
pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
tp->args[i].name, tp->args[i].type->fmt);
}
pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
for (i = 0; i < tp->nr_args; i++) {
if (strcmp(tp->args[i].type->name, "string") == 0)
pos += snprintf(buf + pos, LEN_OR_ZERO,
", __get_str(%s)",
tp->args[i].name);
else
pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
tp->args[i].name);
}
#undef LEN_OR_ZERO
/* return the length of print_fmt */
return pos;
}
int set_print_fmt(struct trace_probe *tp, bool is_return)
{
int len;
char *print_fmt;
/* First: called with 0 length to calculate the needed length */
len = __set_print_fmt(tp, NULL, 0, is_return);
print_fmt = kmalloc(len + 1, GFP_KERNEL);
if (!print_fmt)
return -ENOMEM;
/* Second: actually write the @print_fmt */
__set_print_fmt(tp, print_fmt, len + 1, is_return);
tp->call.print_fmt = print_fmt;
return 0;
}

View File

@ -81,6 +81,17 @@
*/
#define convert_rloc_to_loc(dl, offs) ((u32)(dl) + (offs))
static inline void *get_rloc_data(u32 *dl)
{
return (u8 *)dl + get_rloc_offs(*dl);
}
/* For data_loc conversion */
static inline void *get_loc_data(u32 *dl, void *ent)
{
return (u8 *)ent + get_rloc_offs(*dl);
}
/* Data fetch function type */
typedef void (*fetch_func_t)(struct pt_regs *, void *, void *);
/* Printing function type */
@ -95,6 +106,7 @@ enum {
FETCH_MTD_symbol,
FETCH_MTD_deref,
FETCH_MTD_bitfield,
FETCH_MTD_file_offset,
FETCH_MTD_END,
};
@ -115,6 +127,148 @@ struct fetch_param {
void *data;
};
/* For defining macros, define string/string_size types */
typedef u32 string;
typedef u32 string_size;
#define PRINT_TYPE_FUNC_NAME(type) print_type_##type
#define PRINT_TYPE_FMT_NAME(type) print_type_format_##type
/* Printing in basic type function template */
#define DECLARE_BASIC_PRINT_TYPE_FUNC(type) \
__kprobes int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, \
const char *name, \
void *data, void *ent); \
extern const char PRINT_TYPE_FMT_NAME(type)[]
DECLARE_BASIC_PRINT_TYPE_FUNC(u8);
DECLARE_BASIC_PRINT_TYPE_FUNC(u16);
DECLARE_BASIC_PRINT_TYPE_FUNC(u32);
DECLARE_BASIC_PRINT_TYPE_FUNC(u64);
DECLARE_BASIC_PRINT_TYPE_FUNC(s8);
DECLARE_BASIC_PRINT_TYPE_FUNC(s16);
DECLARE_BASIC_PRINT_TYPE_FUNC(s32);
DECLARE_BASIC_PRINT_TYPE_FUNC(s64);
DECLARE_BASIC_PRINT_TYPE_FUNC(string);
#define FETCH_FUNC_NAME(method, type) fetch_##method##_##type
/* Declare macro for basic types */
#define DECLARE_FETCH_FUNC(method, type) \
extern void FETCH_FUNC_NAME(method, type)(struct pt_regs *regs, \
void *data, void *dest)
#define DECLARE_BASIC_FETCH_FUNCS(method) \
DECLARE_FETCH_FUNC(method, u8); \
DECLARE_FETCH_FUNC(method, u16); \
DECLARE_FETCH_FUNC(method, u32); \
DECLARE_FETCH_FUNC(method, u64)
DECLARE_BASIC_FETCH_FUNCS(reg);
#define fetch_reg_string NULL
#define fetch_reg_string_size NULL
DECLARE_BASIC_FETCH_FUNCS(retval);
#define fetch_retval_string NULL
#define fetch_retval_string_size NULL
DECLARE_BASIC_FETCH_FUNCS(symbol);
DECLARE_FETCH_FUNC(symbol, string);
DECLARE_FETCH_FUNC(symbol, string_size);
DECLARE_BASIC_FETCH_FUNCS(deref);
DECLARE_FETCH_FUNC(deref, string);
DECLARE_FETCH_FUNC(deref, string_size);
DECLARE_BASIC_FETCH_FUNCS(bitfield);
#define fetch_bitfield_string NULL
#define fetch_bitfield_string_size NULL
/*
* Define macro for basic types - we don't need to define s* types, because
* we have to care only about bitwidth at recording time.
*/
#define DEFINE_BASIC_FETCH_FUNCS(method) \
DEFINE_FETCH_##method(u8) \
DEFINE_FETCH_##method(u16) \
DEFINE_FETCH_##method(u32) \
DEFINE_FETCH_##method(u64)
/* Default (unsigned long) fetch type */
#define __DEFAULT_FETCH_TYPE(t) u##t
#define _DEFAULT_FETCH_TYPE(t) __DEFAULT_FETCH_TYPE(t)
#define DEFAULT_FETCH_TYPE _DEFAULT_FETCH_TYPE(BITS_PER_LONG)
#define DEFAULT_FETCH_TYPE_STR __stringify(DEFAULT_FETCH_TYPE)
#define ASSIGN_FETCH_FUNC(method, type) \
[FETCH_MTD_##method] = FETCH_FUNC_NAME(method, type)
#define __ASSIGN_FETCH_TYPE(_name, ptype, ftype, _size, sign, _fmttype) \
{.name = _name, \
.size = _size, \
.is_signed = sign, \
.print = PRINT_TYPE_FUNC_NAME(ptype), \
.fmt = PRINT_TYPE_FMT_NAME(ptype), \
.fmttype = _fmttype, \
.fetch = { \
ASSIGN_FETCH_FUNC(reg, ftype), \
ASSIGN_FETCH_FUNC(stack, ftype), \
ASSIGN_FETCH_FUNC(retval, ftype), \
ASSIGN_FETCH_FUNC(memory, ftype), \
ASSIGN_FETCH_FUNC(symbol, ftype), \
ASSIGN_FETCH_FUNC(deref, ftype), \
ASSIGN_FETCH_FUNC(bitfield, ftype), \
ASSIGN_FETCH_FUNC(file_offset, ftype), \
} \
}
#define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \
__ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, #ptype)
#define ASSIGN_FETCH_TYPE_END {}
#define FETCH_TYPE_STRING 0
#define FETCH_TYPE_STRSIZE 1
/*
* Fetch type information table.
* It's declared as a weak symbol due to conditional compilation.
*/
extern __weak const struct fetch_type kprobes_fetch_type_table[];
extern __weak const struct fetch_type uprobes_fetch_type_table[];
#ifdef CONFIG_KPROBE_EVENT
struct symbol_cache;
unsigned long update_symbol_cache(struct symbol_cache *sc);
void free_symbol_cache(struct symbol_cache *sc);
struct symbol_cache *alloc_symbol_cache(const char *sym, long offset);
#else
/* uprobes do not support symbol fetch methods */
#define fetch_symbol_u8 NULL
#define fetch_symbol_u16 NULL
#define fetch_symbol_u32 NULL
#define fetch_symbol_u64 NULL
#define fetch_symbol_string NULL
#define fetch_symbol_string_size NULL
struct symbol_cache {
};
static inline unsigned long __used update_symbol_cache(struct symbol_cache *sc)
{
return 0;
}
static inline void __used free_symbol_cache(struct symbol_cache *sc)
{
}
static inline struct symbol_cache * __used
alloc_symbol_cache(const char *sym, long offset)
{
return NULL;
}
#endif /* CONFIG_KPROBE_EVENT */
struct probe_arg {
struct fetch_param fetch;
struct fetch_param fetch_size;
@ -124,6 +278,26 @@ struct probe_arg {
const struct fetch_type *type; /* Type of this argument */
};
struct trace_probe {
unsigned int flags; /* For TP_FLAG_* */
struct ftrace_event_class class;
struct ftrace_event_call call;
struct list_head files;
ssize_t size; /* trace entry size */
unsigned int nr_args;
struct probe_arg args[];
};
static inline bool trace_probe_is_enabled(struct trace_probe *tp)
{
return !!(tp->flags & (TP_FLAG_TRACE | TP_FLAG_PROFILE));
}
static inline bool trace_probe_is_registered(struct trace_probe *tp)
{
return !!(tp->flags & TP_FLAG_REGISTERED);
}
static inline __kprobes void call_fetch(struct fetch_param *fprm,
struct pt_regs *regs, void *dest)
{
@ -158,3 +332,53 @@ extern ssize_t traceprobe_probes_write(struct file *file,
int (*createfn)(int, char**));
extern int traceprobe_command(const char *buf, int (*createfn)(int, char**));
/* Sum up total data length for dynamic arraies (strings) */
static inline __kprobes int
__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
{
int i, ret = 0;
u32 len;
for (i = 0; i < tp->nr_args; i++)
if (unlikely(tp->args[i].fetch_size.fn)) {
call_fetch(&tp->args[i].fetch_size, regs, &len);
ret += len;
}
return ret;
}
/* Store the value of each argument */
static inline __kprobes void
store_trace_args(int ent_size, struct trace_probe *tp, struct pt_regs *regs,
u8 *data, int maxlen)
{
int i;
u32 end = tp->size;
u32 *dl; /* Data (relative) location */
for (i = 0; i < tp->nr_args; i++) {
if (unlikely(tp->args[i].fetch_size.fn)) {
/*
* First, we set the relative location and
* maximum data length to *dl
*/
dl = (u32 *)(data + tp->args[i].offset);
*dl = make_data_rloc(maxlen, end - tp->args[i].offset);
/* Then try to fetch string or dynamic array data */
call_fetch(&tp->args[i].fetch, regs, dl);
/* Reduce maximum length */
end += get_rloc_len(*dl);
maxlen -= get_rloc_len(*dl);
/* Trick here, convert data_rloc to data_loc */
*dl = convert_rloc_to_loc(*dl,
ent_size + tp->args[i].offset);
} else
/* Just fetching data normally */
call_fetch(&tp->args[i].fetch, regs,
data + tp->args[i].offset);
}
}
extern int set_print_fmt(struct trace_probe *tp, bool is_return);

View File

@ -382,7 +382,7 @@ static const struct file_operations stack_trace_filter_fops = {
.open = stack_trace_filter_open,
.read = seq_read,
.write = ftrace_filter_write,
.llseek = ftrace_filter_lseek,
.llseek = tracing_lseek,
.release = ftrace_regex_release,
};

View File

@ -321,7 +321,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
if (!ftrace_file)
return;
if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &ftrace_file->flags))
if (ftrace_trigger_soft_disabled(ftrace_file))
return;
sys_data = syscall_nr_to_meta(syscall_nr);
@ -343,8 +343,7 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
entry->nr = syscall_nr;
syscall_get_arguments(current, regs, 0, sys_data->nb_args, entry->args);
if (!filter_check_discard(ftrace_file, entry, buffer, event))
trace_current_buffer_unlock_commit(buffer, event,
event_trigger_unlock_commit(ftrace_file, buffer, event, entry,
irq_flags, pc);
}
@ -369,7 +368,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
if (!ftrace_file)
return;
if (test_bit(FTRACE_EVENT_FL_SOFT_DISABLED_BIT, &ftrace_file->flags))
if (ftrace_trigger_soft_disabled(ftrace_file))
return;
sys_data = syscall_nr_to_meta(syscall_nr);
@ -390,8 +389,7 @@ static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret)
entry->nr = syscall_nr;
entry->ret = syscall_get_return_value(current, regs);
if (!filter_check_discard(ftrace_file, entry, buffer, event))
trace_current_buffer_unlock_commit(buffer, event,
event_trigger_unlock_commit(ftrace_file, buffer, event, entry,
irq_flags, pc);
}

View File

@ -51,22 +51,17 @@ struct trace_uprobe_filter {
*/
struct trace_uprobe {
struct list_head list;
struct ftrace_event_class class;
struct ftrace_event_call call;
struct trace_uprobe_filter filter;
struct uprobe_consumer consumer;
struct inode *inode;
char *filename;
unsigned long offset;
unsigned long nhit;
unsigned int flags; /* For TP_FLAG_* */
ssize_t size; /* trace entry size */
unsigned int nr_args;
struct probe_arg args[];
struct trace_probe tp;
};
#define SIZEOF_TRACE_UPROBE(n) \
(offsetof(struct trace_uprobe, args) + \
(offsetof(struct trace_uprobe, tp.args) + \
(sizeof(struct probe_arg) * (n)))
static int register_uprobe_event(struct trace_uprobe *tu);
@ -75,10 +70,151 @@ static int unregister_uprobe_event(struct trace_uprobe *tu);
static DEFINE_MUTEX(uprobe_lock);
static LIST_HEAD(uprobe_list);
struct uprobe_dispatch_data {
struct trace_uprobe *tu;
unsigned long bp_addr;
};
static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs);
static int uretprobe_dispatcher(struct uprobe_consumer *con,
unsigned long func, struct pt_regs *regs);
#ifdef CONFIG_STACK_GROWSUP
static unsigned long adjust_stack_addr(unsigned long addr, unsigned int n)
{
return addr - (n * sizeof(long));
}
#else
static unsigned long adjust_stack_addr(unsigned long addr, unsigned int n)
{
return addr + (n * sizeof(long));
}
#endif
static unsigned long get_user_stack_nth(struct pt_regs *regs, unsigned int n)
{
unsigned long ret;
unsigned long addr = user_stack_pointer(regs);
addr = adjust_stack_addr(addr, n);
if (copy_from_user(&ret, (void __force __user *) addr, sizeof(ret)))
return 0;
return ret;
}
/*
* Uprobes-specific fetch functions
*/
#define DEFINE_FETCH_stack(type) \
static __kprobes void FETCH_FUNC_NAME(stack, type)(struct pt_regs *regs,\
void *offset, void *dest) \
{ \
*(type *)dest = (type)get_user_stack_nth(regs, \
((unsigned long)offset)); \
}
DEFINE_BASIC_FETCH_FUNCS(stack)
/* No string on the stack entry */
#define fetch_stack_string NULL
#define fetch_stack_string_size NULL
#define DEFINE_FETCH_memory(type) \
static __kprobes void FETCH_FUNC_NAME(memory, type)(struct pt_regs *regs,\
void *addr, void *dest) \
{ \
type retval; \
void __user *vaddr = (void __force __user *) addr; \
\
if (copy_from_user(&retval, vaddr, sizeof(type))) \
*(type *)dest = 0; \
else \
*(type *) dest = retval; \
}
DEFINE_BASIC_FETCH_FUNCS(memory)
/*
* Fetch a null-terminated string. Caller MUST set *(u32 *)dest with max
* length and relative data location.
*/
static __kprobes void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
void *addr, void *dest)
{
long ret;
u32 rloc = *(u32 *)dest;
int maxlen = get_rloc_len(rloc);
u8 *dst = get_rloc_data(dest);
void __user *src = (void __force __user *) addr;
if (!maxlen)
return;
ret = strncpy_from_user(dst, src, maxlen);
if (ret < 0) { /* Failed to fetch string */
((u8 *)get_rloc_data(dest))[0] = '\0';
*(u32 *)dest = make_data_rloc(0, get_rloc_offs(rloc));
} else {
*(u32 *)dest = make_data_rloc(ret, get_rloc_offs(rloc));
}
}
static __kprobes void FETCH_FUNC_NAME(memory, string_size)(struct pt_regs *regs,
void *addr, void *dest)
{
int len;
void __user *vaddr = (void __force __user *) addr;
len = strnlen_user(vaddr, MAX_STRING_SIZE);
if (len == 0 || len > MAX_STRING_SIZE) /* Failed to check length */
*(u32 *)dest = 0;
else
*(u32 *)dest = len;
}
static unsigned long translate_user_vaddr(void *file_offset)
{
unsigned long base_addr;
struct uprobe_dispatch_data *udd;
udd = (void *) current->utask->vaddr;
base_addr = udd->bp_addr - udd->tu->offset;
return base_addr + (unsigned long)file_offset;
}
#define DEFINE_FETCH_file_offset(type) \
static __kprobes void FETCH_FUNC_NAME(file_offset, type)(struct pt_regs *regs,\
void *offset, void *dest) \
{ \
void *vaddr = (void *)translate_user_vaddr(offset); \
\
FETCH_FUNC_NAME(memory, type)(regs, vaddr, dest); \
}
DEFINE_BASIC_FETCH_FUNCS(file_offset)
DEFINE_FETCH_file_offset(string)
DEFINE_FETCH_file_offset(string_size)
/* Fetch type information table */
const struct fetch_type uprobes_fetch_type_table[] = {
/* Special types */
[FETCH_TYPE_STRING] = __ASSIGN_FETCH_TYPE("string", string, string,
sizeof(u32), 1, "__data_loc char[]"),
[FETCH_TYPE_STRSIZE] = __ASSIGN_FETCH_TYPE("string_size", u32,
string_size, sizeof(u32), 0, "u32"),
/* Basic types */
ASSIGN_FETCH_TYPE(u8, u8, 0),
ASSIGN_FETCH_TYPE(u16, u16, 0),
ASSIGN_FETCH_TYPE(u32, u32, 0),
ASSIGN_FETCH_TYPE(u64, u64, 0),
ASSIGN_FETCH_TYPE(s8, u8, 1),
ASSIGN_FETCH_TYPE(s16, u16, 1),
ASSIGN_FETCH_TYPE(s32, u32, 1),
ASSIGN_FETCH_TYPE(s64, u64, 1),
ASSIGN_FETCH_TYPE_END
};
static inline void init_trace_uprobe_filter(struct trace_uprobe_filter *filter)
{
rwlock_init(&filter->rwlock);
@ -114,13 +250,13 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
if (!tu)
return ERR_PTR(-ENOMEM);
tu->call.class = &tu->class;
tu->call.name = kstrdup(event, GFP_KERNEL);
if (!tu->call.name)
tu->tp.call.class = &tu->tp.class;
tu->tp.call.name = kstrdup(event, GFP_KERNEL);
if (!tu->tp.call.name)
goto error;
tu->class.system = kstrdup(group, GFP_KERNEL);
if (!tu->class.system)
tu->tp.class.system = kstrdup(group, GFP_KERNEL);
if (!tu->tp.class.system)
goto error;
INIT_LIST_HEAD(&tu->list);
@ -128,11 +264,11 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
if (is_ret)
tu->consumer.ret_handler = uretprobe_dispatcher;
init_trace_uprobe_filter(&tu->filter);
tu->call.flags |= TRACE_EVENT_FL_USE_CALL_FILTER;
tu->tp.call.flags |= TRACE_EVENT_FL_USE_CALL_FILTER;
return tu;
error:
kfree(tu->call.name);
kfree(tu->tp.call.name);
kfree(tu);
return ERR_PTR(-ENOMEM);
@ -142,12 +278,12 @@ static void free_trace_uprobe(struct trace_uprobe *tu)
{
int i;
for (i = 0; i < tu->nr_args; i++)
traceprobe_free_probe_arg(&tu->args[i]);
for (i = 0; i < tu->tp.nr_args; i++)
traceprobe_free_probe_arg(&tu->tp.args[i]);
iput(tu->inode);
kfree(tu->call.class->system);
kfree(tu->call.name);
kfree(tu->tp.call.class->system);
kfree(tu->tp.call.name);
kfree(tu->filename);
kfree(tu);
}
@ -157,8 +293,8 @@ static struct trace_uprobe *find_probe_event(const char *event, const char *grou
struct trace_uprobe *tu;
list_for_each_entry(tu, &uprobe_list, list)
if (strcmp(tu->call.name, event) == 0 &&
strcmp(tu->call.class->system, group) == 0)
if (strcmp(tu->tp.call.name, event) == 0 &&
strcmp(tu->tp.call.class->system, group) == 0)
return tu;
return NULL;
@ -181,16 +317,16 @@ static int unregister_trace_uprobe(struct trace_uprobe *tu)
/* Register a trace_uprobe and probe_event */
static int register_trace_uprobe(struct trace_uprobe *tu)
{
struct trace_uprobe *old_tp;
struct trace_uprobe *old_tu;
int ret;
mutex_lock(&uprobe_lock);
/* register as an event */
old_tp = find_probe_event(tu->call.name, tu->call.class->system);
if (old_tp) {
old_tu = find_probe_event(tu->tp.call.name, tu->tp.call.class->system);
if (old_tu) {
/* delete old event */
ret = unregister_trace_uprobe(old_tp);
ret = unregister_trace_uprobe(old_tu);
if (ret)
goto end;
}
@ -211,7 +347,7 @@ end:
/*
* Argument syntax:
* - Add uprobe: p|r[:[GRP/]EVENT] PATH:SYMBOL [FETCHARGS]
* - Add uprobe: p|r[:[GRP/]EVENT] PATH:OFFSET [FETCHARGS]
*
* - Remove uprobe: -:[GRP/]EVENT
*/
@ -360,34 +496,36 @@ static int create_trace_uprobe(int argc, char **argv)
/* parse arguments */
ret = 0;
for (i = 0; i < argc && i < MAX_TRACE_ARGS; i++) {
struct probe_arg *parg = &tu->tp.args[i];
/* Increment count for freeing args in error case */
tu->nr_args++;
tu->tp.nr_args++;
/* Parse argument name */
arg = strchr(argv[i], '=');
if (arg) {
*arg++ = '\0';
tu->args[i].name = kstrdup(argv[i], GFP_KERNEL);
parg->name = kstrdup(argv[i], GFP_KERNEL);
} else {
arg = argv[i];
/* If argument name is omitted, set "argN" */
snprintf(buf, MAX_EVENT_NAME_LEN, "arg%d", i + 1);
tu->args[i].name = kstrdup(buf, GFP_KERNEL);
parg->name = kstrdup(buf, GFP_KERNEL);
}
if (!tu->args[i].name) {
if (!parg->name) {
pr_info("Failed to allocate argument[%d] name.\n", i);
ret = -ENOMEM;
goto error;
}
if (!is_good_name(tu->args[i].name)) {
pr_info("Invalid argument[%d] name: %s\n", i, tu->args[i].name);
if (!is_good_name(parg->name)) {
pr_info("Invalid argument[%d] name: %s\n", i, parg->name);
ret = -EINVAL;
goto error;
}
if (traceprobe_conflict_field_name(tu->args[i].name, tu->args, i)) {
if (traceprobe_conflict_field_name(parg->name, tu->tp.args, i)) {
pr_info("Argument[%d] name '%s' conflicts with "
"another field.\n", i, argv[i]);
ret = -EINVAL;
@ -395,7 +533,8 @@ static int create_trace_uprobe(int argc, char **argv)
}
/* Parse fetch argument */
ret = traceprobe_parse_probe_arg(arg, &tu->size, &tu->args[i], false, false);
ret = traceprobe_parse_probe_arg(arg, &tu->tp.size, parg,
is_return, false);
if (ret) {
pr_info("Parse error at argument[%d]. (%d)\n", i, ret);
goto error;
@ -459,11 +598,11 @@ static int probes_seq_show(struct seq_file *m, void *v)
char c = is_ret_probe(tu) ? 'r' : 'p';
int i;
seq_printf(m, "%c:%s/%s", c, tu->call.class->system, tu->call.name);
seq_printf(m, "%c:%s/%s", c, tu->tp.call.class->system, tu->tp.call.name);
seq_printf(m, " %s:0x%p", tu->filename, (void *)tu->offset);
for (i = 0; i < tu->nr_args; i++)
seq_printf(m, " %s=%s", tu->args[i].name, tu->args[i].comm);
for (i = 0; i < tu->tp.nr_args; i++)
seq_printf(m, " %s=%s", tu->tp.args[i].name, tu->tp.args[i].comm);
seq_printf(m, "\n");
return 0;
@ -509,7 +648,7 @@ static int probes_profile_seq_show(struct seq_file *m, void *v)
{
struct trace_uprobe *tu = v;
seq_printf(m, " %s %-44s %15lu\n", tu->filename, tu->call.name, tu->nhit);
seq_printf(m, " %s %-44s %15lu\n", tu->filename, tu->tp.call.name, tu->nhit);
return 0;
}
@ -533,22 +672,118 @@ static const struct file_operations uprobe_profile_ops = {
.release = seq_release,
};
struct uprobe_cpu_buffer {
struct mutex mutex;
void *buf;
};
static struct uprobe_cpu_buffer __percpu *uprobe_cpu_buffer;
static int uprobe_buffer_refcnt;
static int uprobe_buffer_init(void)
{
int cpu, err_cpu;
uprobe_cpu_buffer = alloc_percpu(struct uprobe_cpu_buffer);
if (uprobe_cpu_buffer == NULL)
return -ENOMEM;
for_each_possible_cpu(cpu) {
struct page *p = alloc_pages_node(cpu_to_node(cpu),
GFP_KERNEL, 0);
if (p == NULL) {
err_cpu = cpu;
goto err;
}
per_cpu_ptr(uprobe_cpu_buffer, cpu)->buf = page_address(p);
mutex_init(&per_cpu_ptr(uprobe_cpu_buffer, cpu)->mutex);
}
return 0;
err:
for_each_possible_cpu(cpu) {
if (cpu == err_cpu)
break;
free_page((unsigned long)per_cpu_ptr(uprobe_cpu_buffer, cpu)->buf);
}
free_percpu(uprobe_cpu_buffer);
return -ENOMEM;
}
static int uprobe_buffer_enable(void)
{
int ret = 0;
BUG_ON(!mutex_is_locked(&event_mutex));
if (uprobe_buffer_refcnt++ == 0) {
ret = uprobe_buffer_init();
if (ret < 0)
uprobe_buffer_refcnt--;
}
return ret;
}
static void uprobe_buffer_disable(void)
{
BUG_ON(!mutex_is_locked(&event_mutex));
if (--uprobe_buffer_refcnt == 0) {
free_percpu(uprobe_cpu_buffer);
uprobe_cpu_buffer = NULL;
}
}
static struct uprobe_cpu_buffer *uprobe_buffer_get(void)
{
struct uprobe_cpu_buffer *ucb;
int cpu;
cpu = raw_smp_processor_id();
ucb = per_cpu_ptr(uprobe_cpu_buffer, cpu);
/*
* Use per-cpu buffers for fastest access, but we might migrate
* so the mutex makes sure we have sole access to it.
*/
mutex_lock(&ucb->mutex);
return ucb;
}
static void uprobe_buffer_put(struct uprobe_cpu_buffer *ucb)
{
mutex_unlock(&ucb->mutex);
}
static void uprobe_trace_print(struct trace_uprobe *tu,
unsigned long func, struct pt_regs *regs)
{
struct uprobe_trace_entry_head *entry;
struct ring_buffer_event *event;
struct ring_buffer *buffer;
struct uprobe_cpu_buffer *ucb;
void *data;
int size, i;
struct ftrace_event_call *call = &tu->call;
int size, dsize, esize;
struct ftrace_event_call *call = &tu->tp.call;
size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
size + tu->size, 0, 0);
if (!event)
dsize = __get_data_size(&tu->tp, regs);
esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
if (WARN_ON_ONCE(!uprobe_cpu_buffer || tu->tp.size + dsize > PAGE_SIZE))
return;
ucb = uprobe_buffer_get();
store_trace_args(esize, &tu->tp, regs, ucb->buf, dsize);
size = esize + tu->tp.size + dsize;
event = trace_current_buffer_lock_reserve(&buffer, call->event.type,
size, 0, 0);
if (!event)
goto out;
entry = ring_buffer_event_data(event);
if (is_ret_probe(tu)) {
entry->vaddr[0] = func;
@ -559,11 +794,13 @@ static void uprobe_trace_print(struct trace_uprobe *tu,
data = DATAOF_TRACE_ENTRY(entry, false);
}
for (i = 0; i < tu->nr_args; i++)
call_fetch(&tu->args[i].fetch, regs, data + tu->args[i].offset);
memcpy(data, ucb->buf, tu->tp.size + dsize);
if (!call_filter_check_discard(call, entry, buffer, event))
trace_buffer_unlock_commit(buffer, event, 0, 0);
out:
uprobe_buffer_put(ucb);
}
/* uprobe handler */
@ -591,23 +828,24 @@ print_uprobe_event(struct trace_iterator *iter, int flags, struct trace_event *e
int i;
entry = (struct uprobe_trace_entry_head *)iter->ent;
tu = container_of(event, struct trace_uprobe, call.event);
tu = container_of(event, struct trace_uprobe, tp.call.event);
if (is_ret_probe(tu)) {
if (!trace_seq_printf(s, "%s: (0x%lx <- 0x%lx)", tu->call.name,
if (!trace_seq_printf(s, "%s: (0x%lx <- 0x%lx)", tu->tp.call.name,
entry->vaddr[1], entry->vaddr[0]))
goto partial;
data = DATAOF_TRACE_ENTRY(entry, true);
} else {
if (!trace_seq_printf(s, "%s: (0x%lx)", tu->call.name,
if (!trace_seq_printf(s, "%s: (0x%lx)", tu->tp.call.name,
entry->vaddr[0]))
goto partial;
data = DATAOF_TRACE_ENTRY(entry, false);
}
for (i = 0; i < tu->nr_args; i++) {
if (!tu->args[i].type->print(s, tu->args[i].name,
data + tu->args[i].offset, entry))
for (i = 0; i < tu->tp.nr_args; i++) {
struct probe_arg *parg = &tu->tp.args[i];
if (!parg->type->print(s, parg->name, data + parg->offset, entry))
goto partial;
}
@ -618,11 +856,6 @@ partial:
return TRACE_TYPE_PARTIAL_LINE;
}
static inline bool is_trace_uprobe_enabled(struct trace_uprobe *tu)
{
return tu->flags & (TP_FLAG_TRACE | TP_FLAG_PROFILE);
}
typedef bool (*filter_func_t)(struct uprobe_consumer *self,
enum uprobe_filter_ctx ctx,
struct mm_struct *mm);
@ -632,29 +865,35 @@ probe_event_enable(struct trace_uprobe *tu, int flag, filter_func_t filter)
{
int ret = 0;
if (is_trace_uprobe_enabled(tu))
if (trace_probe_is_enabled(&tu->tp))
return -EINTR;
ret = uprobe_buffer_enable();
if (ret < 0)
return ret;
WARN_ON(!uprobe_filter_is_empty(&tu->filter));
tu->flags |= flag;
tu->tp.flags |= flag;
tu->consumer.filter = filter;
ret = uprobe_register(tu->inode, tu->offset, &tu->consumer);
if (ret)
tu->flags &= ~flag;
tu->tp.flags &= ~flag;
return ret;
}
static void probe_event_disable(struct trace_uprobe *tu, int flag)
{
if (!is_trace_uprobe_enabled(tu))
if (!trace_probe_is_enabled(&tu->tp))
return;
WARN_ON(!uprobe_filter_is_empty(&tu->filter));
uprobe_unregister(tu->inode, tu->offset, &tu->consumer);
tu->flags &= ~flag;
tu->tp.flags &= ~flag;
uprobe_buffer_disable();
}
static int uprobe_event_define_fields(struct ftrace_event_call *event_call)
@ -672,12 +911,12 @@ static int uprobe_event_define_fields(struct ftrace_event_call *event_call)
size = SIZEOF_TRACE_ENTRY(false);
}
/* Set argument names as fields */
for (i = 0; i < tu->nr_args; i++) {
ret = trace_define_field(event_call, tu->args[i].type->fmttype,
tu->args[i].name,
size + tu->args[i].offset,
tu->args[i].type->size,
tu->args[i].type->is_signed,
for (i = 0; i < tu->tp.nr_args; i++) {
struct probe_arg *parg = &tu->tp.args[i];
ret = trace_define_field(event_call, parg->type->fmttype,
parg->name, size + parg->offset,
parg->type->size, parg->type->is_signed,
FILTER_OTHER);
if (ret)
@ -686,59 +925,6 @@ static int uprobe_event_define_fields(struct ftrace_event_call *event_call)
return 0;
}
#define LEN_OR_ZERO (len ? len - pos : 0)
static int __set_print_fmt(struct trace_uprobe *tu, char *buf, int len)
{
const char *fmt, *arg;
int i;
int pos = 0;
if (is_ret_probe(tu)) {
fmt = "(%lx <- %lx)";
arg = "REC->" FIELD_STRING_FUNC ", REC->" FIELD_STRING_RETIP;
} else {
fmt = "(%lx)";
arg = "REC->" FIELD_STRING_IP;
}
/* When len=0, we just calculate the needed length */
pos += snprintf(buf + pos, LEN_OR_ZERO, "\"%s", fmt);
for (i = 0; i < tu->nr_args; i++) {
pos += snprintf(buf + pos, LEN_OR_ZERO, " %s=%s",
tu->args[i].name, tu->args[i].type->fmt);
}
pos += snprintf(buf + pos, LEN_OR_ZERO, "\", %s", arg);
for (i = 0; i < tu->nr_args; i++) {
pos += snprintf(buf + pos, LEN_OR_ZERO, ", REC->%s",
tu->args[i].name);
}
return pos; /* return the length of print_fmt */
}
#undef LEN_OR_ZERO
static int set_print_fmt(struct trace_uprobe *tu)
{
char *print_fmt;
int len;
/* First: called with 0 length to calculate the needed length */
len = __set_print_fmt(tu, NULL, 0);
print_fmt = kmalloc(len + 1, GFP_KERNEL);
if (!print_fmt)
return -ENOMEM;
/* Second: actually write the @print_fmt */
__set_print_fmt(tu, print_fmt, len + 1);
tu->call.print_fmt = print_fmt;
return 0;
}
#ifdef CONFIG_PERF_EVENTS
static bool
__uprobe_perf_filter(struct trace_uprobe_filter *filter, struct mm_struct *mm)
@ -831,14 +1017,27 @@ static bool uprobe_perf_filter(struct uprobe_consumer *uc,
static void uprobe_perf_print(struct trace_uprobe *tu,
unsigned long func, struct pt_regs *regs)
{
struct ftrace_event_call *call = &tu->call;
struct ftrace_event_call *call = &tu->tp.call;
struct uprobe_trace_entry_head *entry;
struct hlist_head *head;
struct uprobe_cpu_buffer *ucb;
void *data;
int size, rctx, i;
int size, dsize, esize;
int rctx;
size = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
size = ALIGN(size + tu->size + sizeof(u32), sizeof(u64)) - sizeof(u32);
dsize = __get_data_size(&tu->tp, regs);
esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
if (WARN_ON_ONCE(!uprobe_cpu_buffer))
return;
size = esize + tu->tp.size + dsize;
size = ALIGN(size + sizeof(u32), sizeof(u64)) - sizeof(u32);
if (WARN_ONCE(size > PERF_MAX_TRACE_SIZE, "profile buffer not large enough"))
return;
ucb = uprobe_buffer_get();
store_trace_args(esize, &tu->tp, regs, ucb->buf, dsize);
preempt_disable();
head = this_cpu_ptr(call->perf_events);
@ -858,12 +1057,18 @@ static void uprobe_perf_print(struct trace_uprobe *tu,
data = DATAOF_TRACE_ENTRY(entry, false);
}
for (i = 0; i < tu->nr_args; i++)
call_fetch(&tu->args[i].fetch, regs, data + tu->args[i].offset);
memcpy(data, ucb->buf, tu->tp.size + dsize);
if (size - esize > tu->tp.size + dsize) {
int len = tu->tp.size + dsize;
memset(data + len, 0, size - esize - len);
}
perf_trace_buf_submit(entry, size, rctx, 0, 1, regs, head, NULL);
out:
preempt_enable();
uprobe_buffer_put(ucb);
}
/* uprobe profile handler */
@ -921,16 +1126,22 @@ int trace_uprobe_register(struct ftrace_event_call *event, enum trace_reg type,
static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs)
{
struct trace_uprobe *tu;
struct uprobe_dispatch_data udd;
int ret = 0;
tu = container_of(con, struct trace_uprobe, consumer);
tu->nhit++;
if (tu->flags & TP_FLAG_TRACE)
udd.tu = tu;
udd.bp_addr = instruction_pointer(regs);
current->utask->vaddr = (unsigned long) &udd;
if (tu->tp.flags & TP_FLAG_TRACE)
ret |= uprobe_trace_func(tu, regs);
#ifdef CONFIG_PERF_EVENTS
if (tu->flags & TP_FLAG_PROFILE)
if (tu->tp.flags & TP_FLAG_PROFILE)
ret |= uprobe_perf_func(tu, regs);
#endif
return ret;
@ -940,14 +1151,20 @@ static int uretprobe_dispatcher(struct uprobe_consumer *con,
unsigned long func, struct pt_regs *regs)
{
struct trace_uprobe *tu;
struct uprobe_dispatch_data udd;
tu = container_of(con, struct trace_uprobe, consumer);
if (tu->flags & TP_FLAG_TRACE)
udd.tu = tu;
udd.bp_addr = func;
current->utask->vaddr = (unsigned long) &udd;
if (tu->tp.flags & TP_FLAG_TRACE)
uretprobe_trace_func(tu, func, regs);
#ifdef CONFIG_PERF_EVENTS
if (tu->flags & TP_FLAG_PROFILE)
if (tu->tp.flags & TP_FLAG_PROFILE)
uretprobe_perf_func(tu, func, regs);
#endif
return 0;
@ -959,7 +1176,7 @@ static struct trace_event_functions uprobe_funcs = {
static int register_uprobe_event(struct trace_uprobe *tu)
{
struct ftrace_event_call *call = &tu->call;
struct ftrace_event_call *call = &tu->tp.call;
int ret;
/* Initialize ftrace_event_call */
@ -967,7 +1184,7 @@ static int register_uprobe_event(struct trace_uprobe *tu)
call->event.funcs = &uprobe_funcs;
call->class->define_fields = uprobe_event_define_fields;
if (set_print_fmt(tu) < 0)
if (set_print_fmt(&tu->tp, is_ret_probe(tu)) < 0)
return -ENOMEM;
ret = register_ftrace_event(&call->event);
@ -994,11 +1211,11 @@ static int unregister_uprobe_event(struct trace_uprobe *tu)
int ret;
/* tu->event is unregistered in trace_remove_event_call() */
ret = trace_remove_event_call(&tu->call);
ret = trace_remove_event_call(&tu->tp.call);
if (ret)
return ret;
kfree(tu->call.print_fmt);
tu->call.print_fmt = NULL;
kfree(tu->tp.call.print_fmt);
tu->tp.call.print_fmt = NULL;
return 0;
}