linux_dsm_epyc7002/arch
Peter Zijlstra 0b9ccc0a9b x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
Nadav Amit reported that commit:

  b59167ac7b ("x86/percpu: Fix this_cpu_read()")

added a bunch of constraints to all sorts of code; and while some of
that was correct and desired, some of that seems superfluous.

The thing is, the this_cpu_*() operations are defined to be IRQ-safe;
this means the values are subject to change from IRQs, and thus must
be reloaded.

Also, the generic form:

  local_irq_save()
  __this_cpu_read()
  local_irq_restore()

would not allow the re-use of previous values either; if nothing else,
the barrier()s implied by local_irq_*() prevent it.
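
The generic form above can be sketched in userspace C. This is a minimal illustration only, not kernel code: every `sketch_` name is hypothetical, the variable is an ordinary global rather than real per-CPU data, and the IRQ helpers here contain only the compiler barrier, whereas the real local_irq_save()/local_irq_restore() also mask and unmask interrupts.

```c
#include <assert.h>

/* Hypothetical stand-in for one CPU's copy of a per-CPU variable. */
int sketch_pcpu_var = 1;

/* Compiler-only barrier (GCC/Clang): the compiler must assume memory changed. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

/* Assumption: only the barrier part of the real IRQ helpers is modelled. */
#define sketch_local_irq_save()    sketch_barrier()
#define sketch_local_irq_restore() sketch_barrier()

/* Generic form: a plain read bracketed by the "IRQ-off" section. */
static int sketch_generic_this_cpu_read(void)
{
    int val;

    sketch_local_irq_save();
    val = sketch_pcpu_var;      /* plays the role of __this_cpu_read() */
    sketch_local_irq_restore();
    return val;
}

/* Two back-to-back reads: the barriers between them stop the compiler
 * from carrying the first loaded value over to the second read. */
int sketch_read_twice(void)
{
    int a = sketch_generic_this_cpu_read();
    int b = sketch_generic_this_cpu_read();

    return a + b;
}

/* Small self-check; the value cannot change here because nothing writes it. */
void sketch_generic_selftest(void)
{
    assert(sketch_read_twice() == 2);
}
```

Comparing the optimized assembly of sketch_read_twice() with and without the barriers should show the point of the paragraph: with them, each read is expected to issue its own load.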

This raises the point that percpu_from_op() and the others also need
that volatile.

OTOH __this_cpu_*() operations are not IRQ-safe and assume external
preempt/IRQ disabling and could thus be allowed more room for
optimization.

This makes the this_cpu_*() vs __this_cpu_*() behaviour more
consistent with other architectures.
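
The difference between the two flavours can be illustrated with a portable userspace analogy. This is a sketch under stated assumptions: the `sketch_` macros and the global below are hypothetical, and a volatile-qualified C access stands in for the volatile qualifier that the commit message says the x86 accessors need; it is not the actual x86 implementation.

```c
#include <assert.h>

/* Hypothetical stand-in for a per-CPU variable. */
int sketch_counter = 21;

/* Analogue of this_cpu_read(): the volatile access forces a fresh load
 * on every use, because an IRQ may have changed the value in between. */
#define sketch_this_cpu_read(var)     (*(volatile __typeof__(var) *)&(var))

/* Analogue of __this_cpu_read(): a plain access; the caller guarantees
 * preemption/IRQs are off, so the compiler may reuse an earlier value. */
#define sketch_raw_this_cpu_read(var) (var)

/* Must perform two loads of sketch_counter. */
int sketch_sum_irq_safe(void)
{
    int a = sketch_this_cpu_read(sketch_counter);
    int b = sketch_this_cpu_read(sketch_counter);

    return a + b;
}

/* May be compiled to a single load followed by an add. */
int sketch_sum_relaxed(void)
{
    int a = sketch_raw_this_cpu_read(sketch_counter);
    int b = sketch_raw_this_cpu_read(sketch_counter);

    return a + b;
}

/* Both variants return the same value when nothing modifies the variable. */
void sketch_flavour_selftest(void)
{
    assert(sketch_sum_irq_safe() == sketch_sum_relaxed());
}
```

Both functions compute the same result in this single-threaded sketch; the difference is only in how much freedom the compiler has, which is the kind of effect the size comparison below measures.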

  $ ./compare.sh defconfig-build defconfig-build1 vmlinux.o
  x86_pmu_cancel_txn                                         80         71   -9,+0
  __text_poke                                               919        964   +45,+0
  do_user_addr_fault                                       1082       1058   -24,+0
  __do_page_fault                                          1194       1178   -16,+0
  do_exit                                                  2995       3027   -43,+75
  process_one_work                                         1008        989   -67,+48
  finish_task_switch                                        524        505   -19,+0
  __schedule_bug                                            103         98   -59,+54
  __sched_setscheduler                                     2015       2030   +15,+0
  freeze_processes                                          203        230   +31,-4
  rcu_gp_kthread_wake                                       106         99   -7,+0
  rcu_core                                                 1841       1834   -7,+0
  call_timer_fn                                             298        286   -12,+0
  can_stop_idle_tick                                        146        139   -31,+24
  perf_pending_event                                        253        239   -14,+0
  shmem_alloc_page                                          209        213   +4,+0
  __alloc_pages_slowpath                                   3284       3269   -15,+0
  umount_tree                                               671        694   +23,+0
  advance_transaction                                       803        798   -5,+0
  con_put_char                                               71         51   -20,+0
  xhci_urb_enqueue                                         1302       1295   -7,+0
  tcp_sacktag_write_queue                                  2130       2075   -55,+0
  tcp_try_undo_loss                                         229        208   -21,+0
  tcp_v4_inbound_md5_hash                                   438        411   -31,+4
  tcp_v6_inbound_md5_hash                                   469        411   -33,-25
  restricted_pointer                                        434        420   -14,+0
  irq_exit                                                  162        154   -8,+0
  get_perf_callchain                                        638        624   -14,+0
  rt_mutex_trylock                                          169        156   -13,+0
  avc_has_extended_perms                                   1092       1089   -3,+0
  avc_has_perm_noaudit                                      309        306   -3,+0
  __perf_sw_event                                           138        122   -16,+0
  perf_swevent_get_recursion_context                        116        102   -14,+0
  __local_bh_enable_ip                                       93         72   -21,+0
  xfrm_input                                               4175       4161   -14,+0
  avc_has_perm                                              446        443   -3,+0
  vm_events_fold_cpu                                         57         56   -1,+0
  vfree                                                      68         61   -7,+0
  _local_bh_enable                                           44         30   -14,+0
  ip_do_fragment                                           1982       1944   -38,+0
  __do_softirq                                              742        724   -18,+0
  cpu_init                                                 1510       1489   -21,+0
  account_system_time                                        80         79   -1,+0
                                               total   12985281   12984819   -742,+280

Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20181206112433.GB13675@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-06-17 12:43:40 +02:00
alpha Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
arc Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
arm Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
arm64 Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
c6x treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
csky treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
h8300 treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
hexagon treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 267 2019-06-05 17:30:29 +02:00
ia64 Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
m68k treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 285 2019-06-05 17:36:37 +02:00
microblaze treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
mips Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
nds32 nds32 patches for 5.2-rc3 2019-06-03 10:23:41 -07:00
nios2 treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
openrisc treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
parisc SPDX update for 5.2-rc4 2019-06-08 12:52:42 -07:00
powerpc Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
riscv Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
s390 Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
sh treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 211 2019-05-30 11:29:53 -07:00
sparc Linux 5.2-rc5 2019-06-17 12:06:34 +02:00
um treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
unicore32 treewide: Add SPDX license identifier - Kbuild 2019-05-30 11:32:33 -07:00
x86 x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}() 2019-06-17 12:43:40 +02:00
xtensa Xtensa fixes for v5.2-rc4 2019-06-07 13:06:00 -07:00
.gitignore
Kconfig Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-05-16 11:00:20 -07:00