linux_dsm_epyc7002/lib
Denys Vlasenko 133fd9f5cd vsprintf: further optimize decimal conversion
Previous code was using optimizations which were developed to work well
even on narrow-word CPUs (by today's standards).  But Linux runs only on
32-bit and wider CPUs.  We can use that.

First: using 32x32->64 multiply and trivial 32-bit shift, we can correctly
divide by 10 much larger numbers, and thus we can print groups of 9 digits
instead of groups of 5 digits.

Next: there are two algorithms to print larger numbers.  One is generic:
divide by 1000000000 and repeatedly print groups of (up to) 9 digits.
It's conceptually simple, but requires an (unsigned long long) /
1000000000 division.

Second algorithm splits 64-bit unsigned long long into 16-bit chunks,
manipulates them cleverly and generates groups of 4 decimal digits.  It so
happens that it does NOT require long long division.

If long is > 32 bits, division of 64-bit values is relatively easy, and we
will use the first algorithm.  If long long is > 64 bits (strange
architecture with VERY large long long), second algorithm can't be used,
and we again use the first one.

Else (if long is 32 bits and long long is 64 bits) we use second one.

And third: there is a simple optimization which takes fast path not only
for zero as was done before, but for all one-digit numbers.

In all tested cases new code is faster than old one, in many cases by 30%,
in few cases by more than 50% (for example, on x86-32, conversion of
12345678).  Code growth is ~0 in 32-bit case and ~130 bytes in 64-bit
case.

This patch is based upon an original from Michal Nazarewicz.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31 17:49:27 -07:00
..
lzo lib: add support for LZO-compressed kernels 2010-01-11 09:34:04 -08:00
mpi mpi: Avoid using freed pointer in mpi_lshift_limbs() 2012-04-18 12:14:28 +10:00
raid6 lib/raid6: cleanup gen_syndrome function selection 2012-05-22 13:54:24 +10:00
reed_solomon lib: Remove unnecessary inclusions of asm/semaphore.h 2008-04-18 22:17:17 -04:00
xz XZ: Fix incorrect XZ_BUF_ERROR 2011-09-21 13:39:59 -07:00
zlib_deflate zlib: slim down zlib_deflate() workspace when possible 2011-03-22 17:44:17 -07:00
zlib_inflate inflate_fast: sout is already a short so ptr arith was off by one. 2010-03-12 15:52:44 -08:00
.gitignore
argv_split.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
atomic64_test.c bug.h: add include of it to various implicit C users 2012-02-29 17:15:08 -05:00
atomic64.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
audit.c audit: support the "standard" <asm-generic/unistd.h> 2011-05-04 14:41:28 -04:00
average.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
bcd.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
bch.c lib: add shared BCH ECC library 2011-03-11 14:25:50 +00:00
bitmap.c lib/bitmap.c: fix documentation for scnprintf() functions 2012-05-29 16:22:32 -07:00
bitrev.c lib: export bitrev16 2008-06-06 11:29:10 -07:00
bsearch.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
btree.c btree: export btree_get_prev() so modules can use btree_for_each 2012-01-10 16:30:49 -08:00
bug.c bugs, x86: Fix printk levels for panic, softlockups and stack dumps 2012-01-26 21:28:45 +01:00
bust_spinlocks.c oops handling: ensure that any oops is flushed to the mtdoops console 2009-01-06 15:59:11 -08:00
check_signature.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
checksum.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
clz_tab.c lib: Fix multiple definitions of clz_tab 2012-02-02 10:34:23 +11:00
cmdline.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
cordic.c Docs: wording: functions -> algorithm 2011-10-29 21:20:22 +02:00
cpu_rmap.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
cpu-notifier-error-inject.c fault-injection: add CPU notifier error injection module 2010-05-27 09:12:48 -07:00
cpumask.c lib/cpumask.c: remove __any_online_cpu() 2012-03-28 17:14:35 -07:00
crc7.c
crc8.c lib: crc8: add new library module providing crc8 algorithm 2011-06-03 15:01:06 -04:00
crc16.c
crc32.c crc32: add self-test code for crc32c 2012-03-23 16:58:38 -07:00
crc32defs.h crc32: select an algorithm via Kconfig 2012-03-23 16:58:38 -07:00
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c [SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC 2008-07-12 08:22:32 -05:00
ctype.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
debug_locks.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
debugobjects.c debugobjects: Fill_pool() returns void now 2012-04-18 13:38:48 +02:00
dec_and_lock.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
decompress_bunzip2.c decompress_bunzip2: remove invalid vi modeline 2011-12-06 10:00:05 +01:00
decompress_inflate.c decompressors: check input size in decompress_inflate.c 2011-01-13 08:03:25 -08:00
decompress_unlzma.c treewide: Fix comment and string typo 'bufer' 2011-12-06 09:53:40 +01:00
decompress_unlzo.c unlzo: fix input buffer free 2012-01-12 20:13:13 -08:00
decompress_unxz.c Fix common misspellings 2011-03-31 11:26:23 -03:00
decompress.c decompressors: add boot-time XZ support 2011-01-13 08:03:25 -08:00
devres.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
digsig.c lib/digsig: checks for NULL return value 2012-02-02 00:24:04 +11:00
div64.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
dma-debug.c Remove useless get_driver()/put_driver() calls 2012-01-24 16:00:35 -08:00
dump_stack.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
dynamic_debug.c Revert "dynamic_debug: remove unneeded includes" 2012-05-07 16:47:32 -07:00
dynamic_queue_limits.c dql: Fix undefined jiffies 2012-03-11 19:59:43 -07:00
extable.c module: trim exception table on init free. 2009-06-12 21:47:04 +09:30
fault-inject.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
find_last_bit.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
find_next_bit.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
flex_array.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
gcd.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
gen_crc32table.c crc32: bolt on crc32c 2012-03-23 16:58:38 -07:00
genalloc.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
halfmd4.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
hexdump.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
hweight.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
idr.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
inflate.c MN10300: Don't try and #include <linux/slab.h> in lib/inflate.c from bootloader 2010-08-12 09:51:35 -07:00
int_sqrt.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
iomap_copy.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
iomap.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
iommu-helper.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
ioremap.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
irq_regs.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
is_single_threaded.c kernel: is_current_single_threaded: don't use ->mmap_sem 2009-07-17 09:11:31 +10:00
jedec_ddr_data.c ddr: add LPDDR2 data from JESD209-2 2012-05-02 00:04:06 -07:00
kasprintf.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
Kconfig Merge branch 'generic-string-functions' 2012-05-26 16:57:16 -07:00
Kconfig.debug Merge branches 'x86-asm-for-linus', 'x86-cleanups-for-linus', 'x86-cpu-for-linus', 'x86-debug-for-linus' and 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-05-23 10:09:50 -07:00
Kconfig.kgdb mips,kgdb: kdb low level trap catch and stack trace 2010-05-20 21:04:26 -05:00
Kconfig.kmemcheck kmemcheck: depend on HAVE_ARCH_KMEMCHECK 2009-07-01 22:28:44 +02:00
klist.c Revert "driver core: check start node in klist_iter_init_node" 2012-04-19 19:17:30 -07:00
kobject_uevent.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
kobject.c kobject: fix the uncorrect comment 2012-05-07 16:51:19 -07:00
kstrtox.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
kstrtox.h lib/kstrtox: common code between kstrto*() and simple_strto*() functions 2011-10-31 17:30:56 -07:00
lcm.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
libcrc32c.c libcrc32c: Fix "crc32c undefined" compilation error 2008-12-25 11:01:42 +11:00
list_debug.c list_debug: WARN for adding something already in the list 2012-05-29 16:22:32 -07:00
list_sort.c lib/list_sort: test: check element addresses 2010-10-26 16:52:19 -07:00
llist.c Disintegrate and delete asm/system.h 2012-03-28 15:58:21 -07:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
lru_cache.c lru_cache: use correct type in sizeof for allocation 2011-05-25 08:39:52 -07:00
Makefile Merge branch 'generic-string-functions' 2012-05-26 16:57:16 -07:00
md5.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
nlattr.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
parser.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
pci_iomap.c lib: add NO_GENERIC_PCI_IOPORT_MAP 2012-01-31 23:19:47 +02:00
percpu_counter.c lib/percpu_counter.c: enclose hotplug only variables in hotplug ifdef 2011-10-31 17:30:56 -07:00
plist.c bug.h: add include of it to various implicit C users 2012-02-29 17:15:08 -05:00
prio_heap.c lib: fix sparse shadowed variable warning 2009-01-06 15:59:11 -08:00
prio_tree.c prio_tree: introduce prio_set_parent() 2012-03-23 16:58:36 -07:00
proportions.c locking, lib/proportions: Annotate prop_local_percpu::lock as raw 2011-09-13 11:11:50 +02:00
radix-tree.c radix-tree: fix preload vector size 2012-05-29 16:22:33 -07:00
random32.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
ratelimit.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
rational.c lib: Change mail address of Oskar Schirmer 2012-05-17 15:18:37 +02:00
rbtree.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
reciprocal_div.c sch_red: Adaptative RED AQM 2011-12-08 19:52:43 -05:00
rwsem-spinlock.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
rwsem.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
scatterlist.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
sha1.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
show_mem.c arch, mm: filter disallowed nodes from arch specific show_mem functions 2011-05-25 08:39:03 -07:00
smp_processor_id.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
sort.c generic swap(): lib/sort.c: rename swap to swap_func 2009-01-08 08:31:14 -08:00
spinlock_debug.c spinlock_debug: print kallsyms name for lock 2012-05-29 16:22:33 -07:00
stmp_device.c lib: add support for stmp-style devices 2012-04-20 23:27:08 +02:00
string_helpers.c lib/string_helpers.c: make arrays static 2012-05-29 16:22:32 -07:00
string.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
strncpy_from_user.c word-at-a-time: make the interfaces truly generic 2012-05-26 11:33:40 -07:00
strnlen_user.c lib: Fix generic strnlen_user for 32-bit big-endian machines 2012-05-27 20:59:46 -07:00
swiotlb.c swiotlb: print physical addresses consistently with other parts of kernel 2012-05-29 16:22:21 -07:00
syscall.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
test-kstrtox.c lib/test-kstrtox.c: mark const init data with __initconst instead of __initdata 2012-05-29 16:22:32 -07:00
textsearch.c textsearch: doc - fix spelling in lib/textsearch.c. 2011-01-24 23:33:30 -08:00
timerqueue.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
ts_bm.c textsearch: ts_bm: support case insensitive searching in Boyer-Moore algorithm 2008-07-08 02:37:54 -07:00
ts_fsm.c textsearch: ts_fsm: return error on request for case insensitive search 2008-07-08 02:38:27 -07:00
ts_kmp.c textsearch: ts_kmp: support case insensitive searching in Knuth-Morris-Pratt algorithm 2008-07-08 02:38:09 -07:00
uuid.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
vsprintf.c vsprintf: further optimize decimal conversion 2012-05-31 17:49:27 -07:00