linux_dsm_epyc7002/arch
Ram Pai 9d2edb1848 powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6,
in the 4K backed HPTE pages.These bits continue to be used
for 64K backed HPTE pages in this patch, but will be freed
up in the next patch. The bit numbers are big-endian as
defined in the ISA3.0

The patch does the following change to the 4k HTPE backed
64K PTE's format.

H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure
		below)
V0 which occupied bit 4 is not used anymore.
V1 which occupied bit 5 is not used anymore.
V2 which occupied bit 6 is not used anymore.
V3 which occupied bit 7 is not used anymore.

Before the patch, the 4k backed 64k PTE format was as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x|B|V0|V1|V2|V3|x| | |x|x|................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

After the patch, the 4k backed 64k PTE format is as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |  |  |  |  |x|B| |x|x|................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

the four bits S,G,I,X (one quadruplet per 4k HPTE) that
cache the hash-bucket slot value, is initialized to
1,1,1,1 indicating -- an invalid slot. If a HPTE gets
cached in a 1111 slot(i.e 7th slot of secondary hash
bucket), it is released immediately. In other words,
even though 1111 is a valid slot value in the hash
bucket, we consider it invalid and release the slot and
the HPTE. This gives us the opportunity to determine
the validity of S,G,I,X bits based on its contents and
not on any of the bits V0,V1,V2 or V3 in the primary PTE

When we release a HPTE cached in the 1111 slot
we also release a legitimate slot in the primary
hash bucket and unmap its corresponding HPTE. This
is to ensure that we do get a HPTE cached in a slot
of the primary hash bucket, the next time we retry.

Though treating 1111 slot as invalid, reduces the
number of available slots in the hash bucket and may
have an effect on the performance, the probabilty of
hitting a 1111 slot is extermely low.

Compared to the current scheme, the above scheme
reduces the number of false hash table updates
significantly and has the added advantage of releasing
four valuable PTE bits for other purpose.

NOTE:even though bits 3, 4, 5, 6, 7 are not used when
the 64K PTE is backed by 4k HPTE, they continue to be
used if the PTE gets backed by 64k HPTE. The next
patch will decouple that aswell, and truely release the
bits.

This idea was jointly developed by Paul Mackerras,
Aneesh, Michael Ellermen and myself.

4K PTE format remains unchanged currently.

The patch does the following code changes
a) PTE flags are split between 64k and 4k header files.
b) __hash_page_4K() is reimplemented to reflect the
 above logic.

Acked-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-20 18:57:20 +11:00
..
alpha treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
arc ARC updates for 4.15-rc1 2017-11-25 08:21:54 -10:00
arm Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm 2017-12-03 10:51:08 -05:00
arm64 arm64 fixes: 2017-12-01 19:37:03 -05:00
blackfin treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
c6x Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
cris pci-v4.15-changes 2017-11-15 15:01:28 -08:00
frv Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
h8300 mm, arch: remove empty_bad_page* 2017-11-15 18:21:03 -08:00
hexagon Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
ia64 arch/ia64/include/asm/topology.h: remove unused parent_node() macro 2017-11-17 16:10:04 -08:00
m32r m32r: fix endianness constraints 2017-11-15 18:21:00 -08:00
m68k m68k/macboing: Fix missed timer callback assignment 2017-11-24 16:19:40 +01:00
metag DeviceTree for 4.15: 2017-11-14 18:25:40 -08:00
microblaze Microblaze patch for 4.15-rc2 2017-11-29 14:19:22 -08:00
mips * x86 bugfixes: APIC, nested virtualization, IOAPIC 2017-11-30 08:15:19 -08:00
mn10300 bug: define the "cut here" string in a single place 2017-11-17 16:10:01 -08:00
nios2 DeviceTree for 4.15: 2017-11-14 18:25:40 -08:00
openrisc kmemcheck: remove annotations 2017-11-15 18:21:04 -08:00
parisc treewide: Switch DEFINE_TIMER callbacks to struct timer_list * 2017-11-21 15:57:05 -08:00
powerpc powerpc: Free up four 64K PTE bits in 4K backed HPTE pages 2017-12-20 18:57:20 +11:00
riscv RISC-V: Fixes for clean allmodconfig build 2017-12-01 13:31:31 -08:00
s390 * x86 bugfixes: APIC, nested virtualization, IOAPIC 2017-11-30 08:15:19 -08:00
score License cleanup: add SPDX license identifier to uapi header files with no license 2017-11-02 11:19:54 +01:00
sh treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
sparc Merge branch 'akpm' (patches from Andrew) 2017-11-29 19:12:44 -08:00
tile mm: switch to 'define pmd_write' instead of __HAVE_ARCH_PMD_WRITE 2017-11-29 18:40:42 -08:00
um This pull request contains the following core changes: 2017-11-22 20:46:06 -10:00
unicore32 kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK 2017-11-15 18:21:04 -08:00
x86 * x86 bugfixes: APIC, nested virtualization, IOAPIC 2017-11-30 08:15:19 -08:00
xtensa libnvdimm for 4.15 2017-11-17 09:51:57 -08:00
.gitignore
Kconfig bpf: Revert bpf_overrid_function() helper changes. 2017-11-11 18:24:55 +09:00