linux-next/samples
Alexei Starovoitov 95ff141e52 samples/bpf: add map_lookup microbenchmark
$ map_perf_test 128
speed of HASH bpf_map_lookup_elem() in lookups per second
	w/o JIT		w/JIT
before	46M		58M
after	42M		74M

perf report
before:
    54.23%  map_perf_test  [kernel.kallsyms]  [k] __htab_map_lookup_elem
    14.24%  map_perf_test  [kernel.kallsyms]  [k] lookup_elem_raw
     8.84%  map_perf_test  [kernel.kallsyms]  [k] htab_map_lookup_elem
     5.93%  map_perf_test  [kernel.kallsyms]  [k] bpf_map_lookup_elem
     2.30%  map_perf_test  [kernel.kallsyms]  [k] bpf_prog_da4fc6a3f41761a2
     1.49%  map_perf_test  [kernel.kallsyms]  [k] kprobe_ftrace_handler

after:
    60.03%  map_perf_test  [kernel.kallsyms]  [k] __htab_map_lookup_elem
    18.07%  map_perf_test  [kernel.kallsyms]  [k] lookup_elem_raw
     2.91%  map_perf_test  [kernel.kallsyms]  [k] bpf_prog_da4fc6a3f41761a2
     1.94%  map_perf_test  [kernel.kallsyms]  [k] _einittext
     1.90%  map_perf_test  [kernel.kallsyms]  [k] __audit_syscall_exit
     1.72%  map_perf_test  [kernel.kallsyms]  [k] kprobe_ftrace_handler

Notice that bpf_map_lookup_elem() and htab_map_lookup_elem() are trivial
functions, yet together they take about 15% of CPU time (5.93% and 8.84%
in the "before" profile).
htab_map_gen_lookup() removes bpf_map_lookup_elem() and converts
htab_map_lookup_elem() into three BPF insns, which causes the CPU time
for bpf_prog_da4fc6a3f41761a2() to increase slightly (2.30% -> 2.91%).
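
As a rough user-space illustration of what the three inlined insns compute,
here is a minimal sketch. It assumes the three steps are: call the internal
lookup, skip on NULL, and advance the returned element pointer to its value.
struct elem, lookup_elem(), inlined_lookup(), KEY_SIZE and VALUE_SIZE are
hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE     4u               /* assumed key size for this sketch */
#define VALUE_SIZE   8u               /* assumed value size for this sketch */
#define ROUND_UP8(x) (((x) + 7u) & ~7u)

/* Hypothetical element layout: chain pointer, then key, then the value
 * stored at the next 8-byte boundary after the key. */
struct elem {
	struct elem *next;
	unsigned char key[ROUND_UP8(KEY_SIZE) + VALUE_SIZE];
};

/* Stand-in for the internal lookup: returns the element, not the value. */
static struct elem *lookup_elem(struct elem *head, const void *key)
{
	for (struct elem *e = head; e != NULL; e = e->next)
		if (memcmp(e->key, key, KEY_SIZE) == 0)
			return e;
	return NULL;
}

/* The three steps the inlined sequence is assumed to perform. */
static void *inlined_lookup(struct elem *head, const void *key)
{
	assert(key != NULL);
	unsigned char *ret = (unsigned char *)lookup_elem(head, key); /* 1: call */
	if (ret == NULL)                                              /* 2: miss, skip the add */
		return NULL;
	return ret + offsetof(struct elem, key) + ROUND_UP8(KEY_SIZE); /* 3: point at the value */
}
```

The point of the sketch is that only the call to the real hash walk remains;
the two wrapper functions collapse into a NULL check and one addition.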

$ map_perf_test 256
speed of ARRAY bpf_map_lookup_elem() in lookups per second
	w/o JIT		w/JIT
before	97M		174M
after	64M		280M

before:
    37.33%  map_perf_test  [kernel.kallsyms]  [k] array_map_lookup_elem
    13.95%  map_perf_test  [kernel.kallsyms]  [k] bpf_map_lookup_elem
     6.54%  map_perf_test  [kernel.kallsyms]  [k] bpf_prog_da4fc6a3f41761a2
     4.57%  map_perf_test  [kernel.kallsyms]  [k] kprobe_ftrace_handler

after:
    32.86%  map_perf_test  [kernel.kallsyms]  [k] bpf_prog_da4fc6a3f41761a2
     6.54%  map_perf_test  [kernel.kallsyms]  [k] kprobe_ftrace_handler

array_map_gen_lookup() removes calls to array_map_lookup_elem()
and bpf_map_lookup_elem() and replaces them with 7 BPF insns.
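
For illustration, the logic an inlined array lookup has to cover can be
sketched in plain C: load the 32-bit index, bounds-check it against
max_entries, then scale by the element size and add the base of the value
area. struct array_map and inlined_array_lookup() are hypothetical
user-space stand-ins, and the mapping of these steps onto the 7 insns is
an assumption, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an array map: a flat value area of
 * max_entries * elem_size bytes. */
struct array_map {
	uint32_t max_entries;
	uint32_t elem_size;
	unsigned char *value;
};

/* Returns a pointer to the element at *key, or NULL when out of range. */
static void *inlined_array_lookup(const struct array_map *map, const uint32_t *key)
{
	assert(map != NULL && key != NULL);
	uint32_t index = *key;                  /* load the index */
	if (index >= map->max_entries)          /* bounds check */
		return NULL;
	return map->value + (size_t)index * map->elem_size; /* scale, add base */
}
```

With no hash walk involved, nothing is left to call, which is consistent
with array_map_lookup_elem() disappearing from the "after" profile.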

Without JIT the lookups get slower (HASH 46M -> 42M, ARRAY 97M -> 64M),
since executing the extra insns in the interpreter is slower than running
native C code. With JIT the gains are clear (HASH 58M -> 74M,
ARRAY 174M -> 280M), since the native C->x86 code is replaced with fewer
bpf->x86 instructions.
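
To put the two tables in relative terms, a small helper can compute the
percentage change for each before/after pair. This is only a worked
arithmetic sketch; pct_change() and print_changes() are hypothetical names
and the inputs are the rounded "M lookups per second" figures quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage change from `before` to `after` (same unit for both). */
static double pct_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* Print the relative change for each row of the two tables. */
static void print_changes(void)
{
	static const struct {
		const char *name;
		double before, after;   /* millions of lookups per second */
	} rows[] = {
		{ "HASH  w/o JIT", 46, 42 },
		{ "HASH  w/JIT",   58, 74 },
		{ "ARRAY w/o JIT", 97, 64 },
		{ "ARRAY w/JIT",  174, 280 },
	};

	for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%s\t%+.1f%%\n", rows[i].name,
		       pct_change(rows[i].before, rows[i].after));
}
```

Calling print_changes() gives roughly -8.7% and +27.6% for HASH, and
-34.0% and +60.9% for ARRAY.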

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-16 20:44:12 -07:00
Name           Last updated                Last commit
auxdisplay     2016-09-23 11:52:32 -06:00  samples: move auxdisplay example code from Documentation
blackfin       2016-10-10 07:12:02 -06:00  samples: move blackfin gptimers-example from Documentation
bpf            2017-03-16 20:44:12 -07:00  samples/bpf: add map_lookup microbenchmark
configfs       2015-10-13 22:17:57 -07:00  configfs: remove old API
connector      2016-12-11 12:12:56 +01:00  make use of make variable CURDIR instead of calling pwd
hidraw         2015-03-15 10:11:21 -04:00  HID: samples/hidraw: make it possible to select device
hw_breakpoint  2011-07-01 11:06:38 +02:00  perf: Add context field to perf_event
kdb            2010-10-29 13:14:39 -05:00  kdb: Add kdb kernel module sample
kfifo          2013-11-15 09:32:23 +09:00  kfifo API type safety
kobject        2015-03-25 13:41:42 +01:00  samples/kobject: be explicit in the module license
kprobes        2016-08-04 08:50:07 -04:00  samples/kretprobe: fix the wrong type
livepatch      2016-04-01 15:00:11 +02:00  livepatch: reuse module loader code to write relocations
mei            2016-09-23 11:51:43 -06:00  samples: move misc-devices/mei example code from Documentation
mic/mpssd      2016-09-20 12:38:48 -06:00  samples: move mic/mpssd example code from Documentation
pktgen         2016-07-20 22:16:02 -07:00  samples: Add an IPv6 '-6' option to the pktgen scripts
rpmsg          2016-09-08 22:15:25 -07:00  rpmsg: Allow callback to return errors
seccomp        2017-01-09 17:22:03 +11:00  samples/seccomp: fix 64-bit comparison macros
statx          2017-03-02 20:51:15 -05:00  statx: Add a system call to make enhanced file info available
timers         2016-09-23 11:51:58 -06:00  samples: move timers example code from Documentation
trace_events   2017-03-02 08:42:24 +01:00  sched/core: Remove the tsk_cpus_allowed() wrapper
trace_printk   2016-06-20 09:54:21 -04:00  tracing: Add trace_printk sample code
uhid           2013-09-04 11:35:14 +02:00  HID: uhid: improve uhid example client
v4l            2016-07-08 14:45:07 -03:00  [media] vb2: replace void *alloc_ctxs by struct device *alloc_devs
vfio-mdev      2017-01-11 12:12:37 -07:00  vfio-mdev: remove some dead code
watchdog       2016-09-23 11:52:14 -06:00  samples: move watchdog example code from Documentation
Kconfig        2017-03-02 20:51:15 -05:00  statx: Add a system call to make enhanced file info available
Makefile       2017-03-02 20:51:15 -05:00  statx: Add a system call to make enhanced file info available