linux_dsm_epyc7002/include
Ravikiran G Thirumalai e073ae1b34 [PATCH] x86-64: Set HASHDIST_DEFAULT to 1 for x86_64 NUMA
Enable system hashtable memory to be distributed among nodes on x86_64 NUMA

Forcing the kernel to use node interleaved vmalloc instead of bootmem for
the system hashtable memory (alloc_large_system_hash) reduces the memory
imbalance on node 0 by around 40MB on a 8 node x86_64 NUMA box:

Before the following patch, on bootup of a 8 node box:

Node 0 MemTotal:      3407488 kB
Node 0 MemFree:       3206296 kB
Node 0 MemUsed:        201192 kB
Node 0 Active:           7012 kB
Node 0 Inactive:          512 kB
Node 0 Dirty:               0 kB
Node 0 Writeback:           0 kB
Node 0 FilePages:        1912 kB
Node 0 Mapped:            420 kB
Node 0 AnonPages:        5612 kB
Node 0 PageTables:        468 kB
Node 0 NFS_Unstable:        0 kB
Node 0 Bounce:              0 kB
Node 0 Slab:             5408 kB
Node 0 SReclaimable:      644 kB
Node 0 SUnreclaim:       4764 kB

After the patch (or using hashdist=1 on the kernel command line):

Node 0 MemTotal:      3407488 kB
Node 0 MemFree:       3247608 kB
Node 0 MemUsed:        159880 kB
Node 0 Active:           3012 kB
Node 0 Inactive:          616 kB
Node 0 Dirty:               0 kB
Node 0 Writeback:           0 kB
Node 0 FilePages:        2424 kB
Node 0 Mapped:            380 kB
Node 0 AnonPages:        1200 kB
Node 0 PageTables:        396 kB
Node 0 NFS_Unstable:        0 kB
Node 0 Bounce:              0 kB
Node 0 Slab:             6304 kB
Node 0 SReclaimable:     1596 kB
Node 0 SUnreclaim:       4708 kB

I guess it is a good idea to keep HASHDIST_DEFAULT "on" for x86_64 NUMA
since x86_64 has no dearth of vmalloc space?  Or maybe enable hash
distribution for all 64bit NUMA arches?  The following patch does it only
for x86_64.

I ran a HPC MPI benchmark -- 'Ansys wingsolid', which takes up quite a bit of
memory and uses up tlb entries.  This was on a 4 way, 2 socket
Tyan AMD box (non vsmp), with 8G total memory (4G pernode).

The results with and without hash distribution are:

1. Vanilla - runtime of 1188.000s
2. With hashdist=1 runtime of 1154.000s

Oprofile output for the duration of run is:

1. Vanilla:
PU: AMD64 processors, speed 2411.16 MHz (estimated)
Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
mask of 0x00 (No unit mask) count 500
samples  %        app name                 symbol name
163054    6.5513  libansys1.so             MultiFront::decompose(int, int,
Elemset *, int *, int, int, int)
162061    6.5114  libansys3.so             blockSaxpy6L_fd
162042    6.5107  libansys3.so             blockInnerProduct6L_fd
156286    6.2794  libansys3.so             maxb33_
87879     3.5309  libansys1.so             elmatrixmultpcg_
84857     3.4095  libansys4.so             saxpy_pcg
58637     2.3560  libansys4.so             .st4560
46612     1.8728  libansys4.so             .st4282
43043     1.7294  vmlinux-t                copy_user_generic_string
41326     1.6604  libansys3.so             blockSaxpyBackSolve6L_fd
41288     1.6589  libansys3.so             blockInnerProductBackSolve6L_fd

2. With hashdist=1
CPU: AMD64 processors, speed 2411.13 MHz (estimated)
Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
mask of 0x00 (No unit mask) count 500
samples  %        app name                 symbol name
162993    6.9814  libansys1.so             MultiFront::decompose(int, int,
Elemset *, int *, int, int, int)
160799    6.8874  libansys3.so             blockInnerProduct6L_fd
160459    6.8729  libansys3.so             blockSaxpy6L_fd
156018    6.6826  libansys3.so             maxb33_
84700     3.6279  libansys4.so             saxpy_pcg
83434     3.5737  libansys1.so             elmatrixmultpcg_
58074     2.4875  libansys4.so             .st4560
46000     1.9703  libansys4.so             .st4282
41166     1.7632  libansys3.so             blockSaxpyBackSolve6L_fd
41033     1.7575  libansys3.so             blockInnerProductBackSolve6L_fd
35762     1.5318  libansys1.so             inner_product_sub
35591     1.5245  libansys1.so             inner_product_sub2
28259     1.2104  libansys4.so             addVectors

Signed-off-by: Pravin B. Shelar <pravin.shelar@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Christoph Lameter <clameter@engr.sgi.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:08 +02:00
..
acpi ACPI: Disable MSI on request of FADT 2007-04-25 01:13:47 -04:00
asm-alpha Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 2007-04-27 09:29:04 -07:00
asm-arm [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-arm26 [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-avr32 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2007-04-27 09:26:46 -07:00
asm-cris [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-frv [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-generic Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2007-04-27 09:26:46 -07:00
asm-h8300 [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-i386 [PATCH] i386: clean up mach_reboot_fixups 2007-05-02 19:27:06 +02:00
asm-ia64 [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-m32r [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-m68k [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-m68knommu [PATCH] m68knommu: GPIO line defines for the ColdFire 5282 2007-03-06 18:08:38 -08:00
asm-mips Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2007-04-27 09:26:46 -07:00
asm-parisc [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-powerpc Merge branch 'for-2.6.22' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc 2007-04-30 08:10:12 -07:00
asm-ppc [POWERPC] Stop using ppc_sys for Xilinx Virtex boards 2007-04-30 11:02:04 +10:00
asm-s390 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2007-04-27 09:26:46 -07:00
asm-sh Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 2007-04-27 09:29:04 -07:00
asm-sh64 [NET]: Introduce SIOCGSTAMPNS ioctl to get timestamps with nanosec resolution 2007-04-25 22:24:04 -07:00
asm-sparc Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 2007-04-27 09:29:04 -07:00
asm-sparc64 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 2007-04-27 09:29:04 -07:00
asm-um [NET]: div64_64 consolidate (rev3) 2007-04-25 22:23:33 -07:00
asm-v850 [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
asm-x86_64 [PATCH] x86-64: build-time checking 2007-05-02 19:27:08 +02:00
asm-xtensa [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
crypto
keys [AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both 2007-04-26 15:48:28 -07:00
linux [PATCH] x86-64: Set HASHDIST_DEFAULT to 1 for x86_64 NUMA 2007-05-02 19:27:08 +02:00
math-emu
media V4L/DVB (5355): Add VIDIOC_G_CHIP_IDENT to various i2c modules 2007-04-27 15:43:50 -03:00
mtd UBI: Unsorted Block Images 2007-04-27 14:23:33 +03:00
net Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2007-04-30 08:14:42 -07:00
pcmcia serial: Add PCMCIA IDs for Quatech DSP-100 dual RS232 adapter. 2007-02-16 15:19:16 -08:00
rdma RDMA/cma: Add multicast communication support 2007-02-16 14:29:07 -08:00
rxrpc [AF_RXRPC]: Delete the old RxRPC code. 2007-04-26 15:55:48 -07:00
scsi Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 2007-02-19 13:32:28 -08:00
sound [ALSA] version 1.0.14rc3 2007-03-14 08:25:52 +01:00
video [PATCH] Video: fb, add true ref_count atomicity 2007-02-12 09:48:42 -08:00
Kbuild