linux_dsm_epyc7002/block
Ying Huang 966a967116 smp: Avoid using two cache lines for struct call_single_data
struct call_single_data is used in IPIs to transfer information between
CPUs.  Its size is bigger than sizeof(unsigned long) and less than
cache line size.  Currently it is not allocated with any explicit alignment
requirements.  This makes it possible for allocated call_single_data to
cross two cache lines, which results in double the number of the cache lines
that need to be transferred among CPUs.

This can be fixed by requiring call_single_data to be aligned with the
size of call_single_data. Currently the size of call_single_data is the
power of 2.  If we add new fields to call_single_data, we may need to
add padding to make sure the size of new definition is the power of 2
as well.

Fortunately, this is enforced by GCC, which will report bad sizes.

To set alignment requirements of call_single_data to the size of
call_single_data, a struct definition and a typedef is used.

To test the effect of the patch, I used the vm-scalability multiple
thread swap test case (swap-w-seq-mt).  The test will create multiple
threads and each thread will eat memory until all RAM and part of swap
is used, so that huge number of IPIs are triggered when unmapping
memory.  In the test, the throughput of memory writing improves ~5%
compared with misaligned call_single_data, because of faster IPIs.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Huang, Ying <ying.huang@intel.com>
[ Add call_single_data_t and align with size of call_single_data. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Aaron Lu <aaron.lu@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/87bmnqd6lz.fsf@yhuang-mobile.sh.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29 15:14:38 +02:00
..
partitions partitions/ldm: switch to use uuid_t 2017-06-05 16:59:14 +02:00
badblocks.c block: Add fallthrough markers to switch statements 2017-06-21 11:46:07 -06:00
bfq-cgroup.c block, bfq: access and cache blkg data only when safe 2017-06-08 09:51:10 -06:00
bfq-iosched.c bfq: dispatch request to prevent queue stalling after the request completion 2017-07-12 08:32:04 -06:00
bfq-iosched.h block, bfq: consider also in_service_entity to state whether an entity is active 2017-07-29 15:32:49 -06:00
bfq-wf2q.c block, bfq: consider also in_service_entity to state whether an entity is active 2017-07-29 15:32:49 -06:00
bio-integrity.c bio-integrity: only verify integrity on the lowest stacked driver 2017-08-09 20:24:36 -06:00
bio.c block: call bio_uninit in bio_endio 2017-07-10 12:43:33 -06:00
blk-cgroup.c block: Avoid that blk_exit_rl() triggers a use-after-free 2017-06-01 13:07:55 -06:00
blk-core.c block: disable runtime-pm for blk-mq 2017-07-24 08:46:40 -06:00
blk-exec.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
blk-flush.c block: Check locking assumptions at runtime 2017-06-20 19:27:14 -06:00
blk-integrity.c block: switch bios to blk_status_t 2017-06-09 09:27:32 -06:00
blk-ioc.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2017-03-03 10:53:35 -08:00
blk-lib.c block: Fix __blkdev_issue_zeroout loop 2017-07-06 09:43:20 -06:00
blk-map.c blk-map: call blk_queue_bounce from blk_rq_append_bio 2017-06-27 12:13:21 -06:00
blk-merge.c block: add support for write hints in a bio 2017-06-27 12:05:27 -06:00
blk-mq-cpumap.c blk-mq: map queues to all present CPUs 2017-07-24 10:01:31 -06:00
blk-mq-debugfs.c blk-mq: expose write hints through debugfs 2017-06-27 12:05:31 -06:00
blk-mq-debugfs.h mq-deadline: add debugfs attributes 2017-05-04 08:25:17 -06:00
blk-mq-pci.c blk-mq-pci: add a fallback when pci_irq_get_affinity returns NULL 2017-08-18 08:08:14 -06:00
blk-mq-sched.c blk-mq-sched: fix performance regression of mq-deadline 2017-07-03 16:54:09 -06:00
blk-mq-sched.h Merge commit '8e8320c9315c' into for-4.13/block 2017-06-22 21:55:24 -06:00
blk-mq-sysfs.c blk-mq: untangle debugfs and sysfs 2017-05-04 08:24:13 -06:00
blk-mq-tag.c blk-mq: add shallow depth option for blk_mq_get_tag() 2017-04-14 14:06:54 -06:00
blk-mq-tag.h blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset 2017-03-02 08:56:04 -07:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c blk-mq: Fix queue usage on failed request allocation 2017-08-15 07:22:34 -06:00
blk-mq.h Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-07-03 16:50:31 -07:00
blk-settings.c block: don't bother with bounce limits for make_request drivers 2017-06-27 12:13:45 -06:00
blk-softirq.c smp: Avoid using two cache lines for struct call_single_data 2017-08-29 15:14:38 +02:00
blk-stat.c blk-stat: don't use this_cpu_ptr() in a preemptable section 2017-05-10 07:40:18 -06:00
blk-stat.h blk-stat: kill blk_stat_rq_ddir() 2017-04-21 07:56:23 -06:00
blk-sysfs.c block: Fix a blk_exit_rl() regression 2017-06-14 13:27:50 -06:00
blk-tag.c block: Check locking assumptions at runtime 2017-06-20 19:27:14 -06:00
blk-throttle.c blk-throttle: set default latency baseline for harddisk 2017-06-07 09:09:32 -06:00
blk-timeout.c block: Check locking assumptions at runtime 2017-06-20 19:27:14 -06:00
blk-wbt.c sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming 2017-06-20 12:19:14 +02:00
blk-wbt.h block: Make writeback throttling defaults consistent for SQ devices 2017-04-19 08:49:03 -06:00
blk-zoned.c block: Rename blk_queue_zone_size and bdev_zone_size 2017-01-12 07:58:32 -07:00
blk.h bio-integrity: stop abusing bi_end_io 2017-07-03 17:00:59 -06:00
bounce.c block: remove the queue_bounce_pfn helper 2017-06-27 12:13:45 -06:00
bsg-lib.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
bsg.c block: Make most scsi_req_init() calls implicit 2017-06-20 19:27:14 -06:00
cfq-iosched.c Linux 4.12-rc5 2017-06-12 08:30:13 -06:00
cmdline-parser.c block: remove unrelated header files and export symbol 2014-01-21 20:18:26 -08:00
compat_ioctl.c compat_hdio_ioctl: get rid of set_fs() 2017-06-29 18:17:52 -04:00
deadline-iosched.c block: enumify ELEVATOR_*_MERGE 2017-02-08 13:43:06 -07:00
elevator.c block: Add fallthrough markers to switch statements 2017-06-21 11:46:07 -06:00
genhd.c block: Constify disk_type 2017-06-20 19:27:14 -06:00
ioctl.c block: remove the discard_zeroes_data flag 2017-04-08 11:25:38 -06:00
ioprio.c block: Add fallthrough markers to switch statements 2017-06-21 11:46:07 -06:00
Kconfig Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm 2017-05-12 15:43:10 -07:00
Kconfig.iosched block, bfq: add full hierarchical scheduling and cgroups support 2017-04-19 08:30:26 -06:00
kyber-iosched.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-07-03 13:08:04 -07:00
Makefile block, bfq: split bfq-iosched.c into multiple source files 2017-04-19 08:48:24 -06:00
mq-deadline.c mq-deadline: add debugfs attributes 2017-05-04 08:25:17 -06:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block/sed-opal: allocate struct opal_dev dynamically 2017-02-17 12:41:47 -07:00
partition-generic.c block: fix an error code in add_partition() 2017-05-23 08:41:59 -06:00
scsi_ioctl.c block: Change argument type of scsi_req_init() 2017-06-20 19:27:14 -06:00
sed-opal.c block: sed-opal: Tone down all the pr_* to debugs 2017-04-07 14:24:16 -06:00
t10-pi.c t10-pi: Move opencoded contants to common header 2017-07-03 16:56:25 -06:00