Denis Vlasenko 4277eedd79 vsprintf.c: optimizing, part 2: base 10 conversion speedup, v2
Optimize integer-to-string conversion in vsprintf.c for base 10.  This is
by far the most used conversion, and in some use cases it impacts
performance.  For example, top reads /proc/$PID/stat for every process, and
with 4000 processes decimal conversion alone takes noticeable time.

Using code from

http://www.cs.uiowa.edu/~jones/bcd/decimal.html
(with permission from the author, Douglas W. Jones)

binary-to-decimal-string conversion is done in groups of five digits at
once, using only additions/subtractions/shifts (with -O2; -Os throws in
some multiply instructions).
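
As a rough illustration of the five-digit grouping only (this is not the
code from the patch): the sketch below splits the value into chunks below
100000 and emits five digits per chunk. It uses plain / and % for clarity,
where the patch uses the shift/add arithmetic from the page above. The names
put_five, put_trunc and u64_to_dec are hypothetical.

```c
#include <stdio.h>

/* Emit exactly five decimal digits of v (v < 100000), least significant
 * digit first. Used for every chunk except the most significant one. */
static char *put_five(char *p, unsigned v)
{
    int i;

    for (i = 0; i < 5; i++) {
        *p++ = (char)('0' + v % 10);
        v /= 10;
    }
    return p;
}

/* Emit the digits of v (v < 100000) without leading zeros, least
 * significant digit first. Used for the most significant chunk. */
static char *put_trunc(char *p, unsigned v)
{
    do {
        *p++ = (char)('0' + v % 10);
        v /= 10;
    } while (v);
    return p;
}

/* Convert n to a NUL-terminated decimal string in buf.
 * buf must hold at least 21 bytes (20 digits plus the terminator). */
static char *u64_to_dec(unsigned long long n, char *buf)
{
    char tmp[25];   /* digits in reverse order; 4 chunks of 5 is enough */
    char *p = tmp;
    char *out = buf;

    /* Peel off five-digit groups, lowest group first. */
    while (n >= 100000) {
        unsigned chunk = (unsigned)(n % 100000);

        n /= 100000;
        p = put_five(p, chunk);
    }
    p = put_trunc(p, (unsigned)n);

    /* Digits were produced backwards; reverse them into buf. */
    while (p != tmp)
        *out++ = *--p;
    *out = '\0';
    return buf;
}
```

With a 21-byte buffer buf, u64_to_dec(100000, buf) yields "100000": the low
chunk is written as "00000" and the remaining 1 is written without padding.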

On i386, gcc 4.1.2 -O2 generates ~500 bytes of code.

This patch is run-tested. A userspace benchmark/test is also attached.
I tested it on a PIII and an AMD64, and the new code is generally ~2.5 times
faster. On AMD64:

# ./vsprintf_verify-O2
Original decimal conv: .......... 151 ns per iteration
Patched decimal conv:  .......... 62 ns per iteration
Testing correctness
12895992590592 ok...        [Ctrl-C]
# ./vsprintf_verify-O2
Original decimal conv: .......... 151 ns per iteration
Patched decimal conv:  .......... 62 ns per iteration
Testing correctness
26025406464 ok...        [Ctrl-C]

A more realistic test: top from the busybox project was modified to
report how many microseconds it took to scan /proc (this does not account
for any processing done afterwards, such as sorting the process list),
and then tested with 4000 processes:

#!/bin/sh
i=4000
while test $i != 0; do
    sleep 30 &
    let i--
done
busybox top -b -n3 >/dev/null

on unpatched kernel:

top: 4120 processes took 102864 microseconds to scan
top: 4120 processes took 91757 microseconds to scan
top: 4120 processes took 92517 microseconds to scan
top: 4120 processes took 92581 microseconds to scan

on patched kernel:

top: 4120 processes took 75460 microseconds to scan
top: 4120 processes took 66451 microseconds to scan
top: 4120 processes took 67267 microseconds to scan
top: 4120 processes took 67618 microseconds to scan

The speedup comes from much faster generation of /proc/PID/stat
by sprintf() calls inside the kernel.

Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:52 -07:00