linux_dsm_epyc7002/arch/x86
Denys Vlasenko 2a4e90b18c x86: Force inlining of atomic ops
With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously
doesn't inline very small functions we expect to be inlined:

$ nm --size-sort vmlinux | grep -iF ' t ' | uniq -c | grep -v '^
*1 ' | sort -rn     473 000000000000000b t spin_unlock_irqrestore
    449 000000000000005f t rcu_read_unlock
    355 0000000000000009 t atomic_inc                <== THIS
    353 000000000000006e t rcu_read_lock
    350 0000000000000075 t rcu_read_lock_sched_held
    291 000000000000000b t spin_unlock
    266 0000000000000019 t arch_local_irq_restore
    215 000000000000000b t spin_lock
    180 0000000000000011 t kzalloc
    165 0000000000000012 t list_add_tail
    161 0000000000000019 t arch_local_save_flags
    153 0000000000000016 t test_and_set_bit
    134 000000000000000b t spin_unlock_irq
    134 0000000000000009 t atomic_dec                <== THIS
    130 000000000000000b t spin_unlock_bh
    122 0000000000000010 t brelse
    120 0000000000000016 t test_and_clear_bit
    120 000000000000000b t spin_lock_irq
    119 000000000000001e t get_dma_ops
    117 0000000000000053 t cpumask_next
    116 0000000000000036 t kref_get
    114 000000000000001a t schedule_work
    106 000000000000000b t spin_lock_bh
    103 0000000000000019 t arch_local_irq_disable
...

Note sizes of marked functions. They are merely 9 bytes long!
Selecting function with 'atomic' in their names:

    355 0000000000000009 t atomic_inc
    134 0000000000000009 t atomic_dec
     98 0000000000000014 t atomic_dec_and_test
     31 000000000000000e t atomic_add_return
     27 000000000000000a t atomic64_inc
     26 000000000000002f t kmap_atomic
     24 0000000000000009 t atomic_add
     12 0000000000000009 t atomic_sub
     10 0000000000000021 t __atomic_add_unless
     10 000000000000000a t atomic64_add
      5 000000000000001f t __atomic_add_unless.constprop.7
      5 000000000000000a t atomic64_dec
      4 000000000000001f t __atomic_add_unless.constprop.18
      4 000000000000001f t __atomic_add_unless.constprop.12
      4 000000000000001f t __atomic_add_unless.constprop.10
      3 000000000000001f t __atomic_add_unless.constprop.13
      3 0000000000000011 t atomic64_add_return
      2 000000000000001f t __atomic_add_unless.constprop.9
      2 000000000000001f t __atomic_add_unless.constprop.8
      2 000000000000001f t __atomic_add_unless.constprop.6
      2 000000000000001f t __atomic_add_unless.constprop.5
      2 000000000000001f t __atomic_add_unless.constprop.3
      2 000000000000001f t __atomic_add_unless.constprop.22
      2 000000000000001f t __atomic_add_unless.constprop.14
      2 000000000000001f t __atomic_add_unless.constprop.11
      2 000000000000001e t atomic_dec_if_positive
      2 0000000000000014 t atomic_inc_and_test
      2 0000000000000011 t atomic_add_return.constprop.4
      2 0000000000000011 t atomic_add_return.constprop.17
      2 0000000000000011 t atomic_add_return.constprop.16
      2 000000000000000d t atomic_inc.constprop.4
      2 000000000000000c t atomic_cmpxchg

This patch fixes this for x86 atomic ops via
s/inline/__always_inline/. This decreases allyesconfig kernel by
about 25k:

    text     data      bss       dec     hex filename
82399481 22255416 20627456 125282353 777a831 vmlinux.before
82375570 22255544 20627456 125258570 7774b4a vmlinux

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1431080762-17797-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-08 12:55:50 +02:00
..
boot Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:19:10 -07:00
configs x86/build/defconfig: Enable USB_EHCI_TT_NEWSCHED=y 2015-02-19 02:21:14 +01:00
crypto x86/asm: Replace "MOVQ $imm, %reg" with MOVL 2015-04-01 13:17:39 +02:00
ia32 x86/asm/entry/32: Update -ENOSYS handling to match the 64-bit logic 2015-04-22 08:41:44 +02:00
include x86: Force inlining of atomic ops 2015-05-08 12:55:50 +02:00
kernel x86/asm/entry/64: Clean up usage of TEST insns 2015-05-08 11:07:32 +02:00
kvm Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 11:08:28 -07:00
lguest x86/asm/entry/irq: Simplify interrupt dispatch table (IDT) layout 2015-04-08 09:02:13 +02:00
lib Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:16:36 -07:00
math-emu
mm Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:34:46 -07:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-12-10 15:48:20 -05:00
oprofile x86/asm/entry: Change all 'user_mode_vm()' calls to 'user_mode()' 2015-03-23 11:14:17 +01:00
pci Revert "x86/PCI: Refine the way to release PCI IRQ resources" 2015-03-20 14:56:19 +01:00
platform Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:32:35 -07:00
power x86/asm, x86/power/hibernate: Use local labels in asm 2015-04-15 11:37:51 +02:00
purgatory Merge branches 'x86-build-for-linus', 'x86-cleanups-for-linus' and 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-12-10 12:35:46 -08:00
realmode Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-02-16 14:58:12 -08:00
syscalls x86/asm/entry/64: Remove stub_iopl 2015-03-10 13:56:10 +01:00
tools x86, build: replace Perl script with Shell script 2015-01-26 13:37:18 -08:00
um x86/asm/entry/64: Remove stub_iopl 2015-03-10 13:56:10 +01:00
vdso Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:36:45 -07:00
video
xen x86, paravirt, xen: Remove the 64-bit ->irq_enable_sysexit() pvop 2015-04-22 08:07:45 +02:00
.gitignore
Kbuild
Kconfig Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:31:32 -07:00
Kconfig.cpu
Kconfig.debug x86/intel/quark: Add Isolated Memory Regions for Quark X1000 2015-02-18 23:22:47 +01:00
Makefile x86/asm: Use -mskip-rax-setup if supported 2015-05-06 11:11:01 +02:00
Makefile_32.cpu
Makefile.um kbuild: do not add $(call ...) to invoke cc-version or cc-fullversion 2015-01-09 17:25:44 +01:00