linux_dsm_epyc7002/drivers/cpuidle
Ying Huang 966a967116 smp: Avoid using two cache lines for struct call_single_data
struct call_single_data is used in IPIs to transfer information between
CPUs.  Its size is bigger than sizeof(unsigned long) and less than the
cache line size.  Currently it is not allocated with any explicit alignment
requirements.  This makes it possible for an allocated call_single_data to
cross two cache lines, which doubles the number of cache lines that need
to be transferred among CPUs.
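
As a rough illustration of the straddling problem, the following userspace
sketch counts how many cache lines an object touches for a given start
offset.  The helper names, the 64-byte line size and the 32-byte object
size are assumptions chosen for the example, not values taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of cache lines covered by an object of
 * `size` bytes starting at byte `offset`, for lines of `line_size` bytes.
 */
static size_t demo_cache_lines_touched(size_t offset, size_t size,
				       size_t line_size)
{
	size_t first = offset / line_size;
	size_t last = (offset + size - 1) / line_size;

	return last - first + 1;
}

/* Assumed numbers: a 32-byte object and 64-byte cache lines. */
static void demo_straddle_selfcheck(void)
{
	/* Naturally (8-byte) aligned at offset 48: crosses into the next line. */
	assert(demo_cache_lines_touched(48, 32, 64) == 2);
	/* Aligned to its own size at offset 32: stays within one line. */
	assert(demo_cache_lines_touched(32, 32, 64) == 1);
}
```

With size alignment, every valid start offset is a multiple of the object
size, so an object no larger than a cache line can never cross a line
boundary.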

This can be fixed by requiring call_single_data to be aligned with the
size of call_single_data. Currently the size of call_single_data is a
power of 2.  If we add new fields to call_single_data, we may need to
add padding to make sure the size of the new definition is a power of 2
as well.

Fortunately, this is enforced by GCC, which will report bad sizes.

To set the alignment requirement of call_single_data to the size of
call_single_data, a struct definition and a typedef are used.
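
A minimal userspace sketch of that struct-plus-typedef pattern is shown
below.  The demo_* names and the field layout are illustrative stand-ins,
not the kernel's actual definitions, and the GCC/Clang aligned attribute
is spelled out directly rather than through a kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; names are illustrative. */
struct demo_llist_node {
	struct demo_llist_node *next;
};

typedef void (*demo_call_func_t)(void *info);

/* Plain struct: only the natural alignment of its members. */
struct demo_csd {
	struct demo_llist_node llist;
	demo_call_func_t func;
	void *info;
	unsigned int flags;
};

/* Typedef carrying the stricter alignment: the struct's own size. */
typedef struct demo_csd demo_csd_t
	__attribute__((aligned(sizeof(struct demo_csd))));

/* The aligned attribute needs a power of 2; make that explicit here. */
_Static_assert((sizeof(struct demo_csd) & (sizeof(struct demo_csd) - 1)) == 0,
	       "size of struct demo_csd must be a power of 2");

/* Returns 1 if the object starts on a multiple of its own size. */
static int demo_csd_is_size_aligned(const demo_csd_t *csd)
{
	return ((uintptr_t)csd % sizeof(struct demo_csd)) == 0;
}

/* Runtime check for objects declared through the typedef. */
static void demo_csd_check(const demo_csd_t *csd)
{
	assert(demo_csd_is_size_aligned(csd));
}
```

Objects declared as demo_csd_t pick up the size alignment, while code that
only needs a pointer to the plain struct is unaffected.  If a new field
made the size a non-power of 2, the compiler would reject the aligned
attribute, which is the enforcement mentioned above.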

To test the effect of the patch, I used the vm-scalability multi-threaded
swap test case (swap-w-seq-mt).  The test creates multiple threads, and
each thread eats memory until all RAM and part of swap is used, so that
a huge number of IPIs are triggered when unmapping memory.  In the test,
memory write throughput improves by ~5% compared with misaligned
call_single_data, because of faster IPIs.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Huang, Ying <ying.huang@intel.com>
[ Add call_single_data_t and align with size of call_single_data. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29 15:14:38 +02:00
governors cpuidle: menu: allow state 0 to be disabled 2017-06-29 22:59:17 +02:00
coupled.c smp: Avoid using two cache lines for struct call_single_data 2017-08-29 15:14:38 +02:00
cpuidle-arm.c ARM: cpuidle: Support asymmetric idle definition 2017-06-24 01:51:00 +02:00
cpuidle-at91.c
cpuidle-big_little.c
cpuidle-calxeda.c
cpuidle-clps711x.c
cpuidle-cps.c cpuidle: cpuidle-cps: remove unused variable 2017-04-19 23:04:54 +02:00
cpuidle-exynos.c
cpuidle-kirkwood.c
cpuidle-mvebu-v7.c
cpuidle-powernv.c powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails 2017-08-08 20:21:23 +10:00
cpuidle-pseries.c cpuidle: powerpc: no memory barrier after break from idle 2017-06-28 13:08:12 +10:00
cpuidle-ux500.c
cpuidle-zynq.c
cpuidle.c cpuidle: Fix idle time tracking 2017-05-15 10:15:20 +02:00
cpuidle.h
driver.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/idle.h> 2017-03-02 08:42:26 +01:00
dt_idle_states.c cpuidle: dt: Add missing 'of_node_put()' 2017-06-12 14:36:13 +02:00
dt_idle_states.h
governor.c cpuidle: governors: Remove remaining old module code 2016-10-21 14:49:51 +02:00
Kconfig
Kconfig.arm ARM: cpuidle: Support asymmetric idle definition 2017-06-24 01:51:00 +02:00
Kconfig.mips cpuidle: cpuidle-cps: Enable use with MIPSr6 CPUs. 2016-10-04 16:13:57 +02:00
Kconfig.powerpc
Makefile
sysfs.c cpuidle: Validate cpu_dev in cpuidle_add_sysfs() 2017-03-21 22:26:37 +01:00