// SPDX-License-Identifier: GPL-2.0-or-later
powerpc/8xx: Map linear kernel RAM with 8M pages
On a live running system (VoIP gateway for Air Traffic Control), over
a 10-minute period (with 277s idle), we get 87 million DTLB misses
and approximately 35 seconds are spent in the DTLB handler.
This represents 5.8% of the overall time and even 10.8% of the
non-idle time.
Among those 87 million DTLB misses, 15% are on user addresses and
85% are on kernel addresses. And within the kernel addresses, 93%
are on addresses from the linear address space and only 7% are on
addresses from the virtual address space.
MPC8xx has no BATs but it has an 8MB page size. This patch implements
mapping of kernel RAM using 8MB pages, on the same model as what is
done on the 40x.
In 4k pages mode, each PGD entry maps a 4MB area: we map every two
entries to the same 8MB physical page. In each second entry, we add
4MB to the page physical address to ease the life of the FixupDAR
routine. This is just ignored by HW.
In 16k pages mode, each PGD entry maps a 64MB area: each PGD entry
will point to the first page of the area. The DTLB handler adds
the 3 bits from EPN to map the correct page.
With this patch applied, we now get only 13 million TLB misses
during the 10-minute period. The idle time has increased to 313s
and the overall time spent in the DTLB miss handler is 6.3s, which
represents 1% of the overall time and 2.2% of non-idle time.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
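The percentages quoted in the message can be re-derived from its own figures, assuming the 10-minute window is taken as exactly 600 s. The helper below is a hypothetical illustration (its name and signature are not part of the kernel); it only restates the arithmetic.

```c
#include <assert.h>

/*
 * Percentage of a measurement window spent in the DTLB miss handler.
 * Hypothetical helper, used only to check the commit message figures.
 * Pass idle_s = 0 for the share of the overall time, or the idle time
 * in seconds for the share of the non-idle time.
 */
static double dtlb_handler_share_pct(double handler_s, double window_s,
				     double idle_s)
{
	assert(window_s > idle_s);	/* the non-idle time must be positive */
	return 100.0 * handler_s / (window_s - idle_s);
}
```

Under that assumption, 35 s of handler time is about 5.8% of 600 s and about 10.8% of the 323 s of non-idle time; after the patch, 6.3 s is about 1% of 600 s and about 2.2% of the 287 s of non-idle time.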
2016-02-09 23:07:50 +07:00
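The 4k-pages scheme described in the message (two consecutive PGD entries sharing one 8M page, the second one carrying an extra 4M that the hardware ignores) can be sketched as follows. This is a user-space illustration only: the `demo_` names are hypothetical, RAM is assumed to start at physical address 0, and the index is counted from the first PGD entry of the linear mapping. It does not reproduce how the kernel actually builds the entries.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PGD_AREA_4M	(UINT32_C(1) << 22)	/* covered by one PGD entry with 4k pages */
#define DEMO_PAGE_8M		(UINT32_C(1) << 23)	/* size of one large page */
#define DEMO_NR_PGD_ENTRIES	1024u			/* 4 GiB / 4 MiB */

/* Physical address stored in PGD entry 'index' of the linear mapping. */
static uint32_t demo_pgd_entry_phys(uint32_t index)
{
	assert(index < DEMO_NR_PGD_ENTRIES);
	/* Even entries hold the 8M-aligned base, odd entries hold base + 4M. */
	return index * DEMO_PGD_AREA_4M;
}

/* Page base as seen by the hardware, which ignores the low 23 bits. */
static uint32_t demo_hw_8m_base(uint32_t entry_phys)
{
	return entry_phys & ~(DEMO_PAGE_8M - 1);
}
```

In this model, entries 2 and 3 hold 0x00800000 and 0x00C00000 respectively, and both resolve to the same 8M page at 0x00800000.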
/*
 * This file contains the routines for initializing the MMU
 * on the 8xx series of chips.
 *  -- christophe
 *
 *  Derived from arch/powerpc/mm/40x_mmu.c:
 */

#include <linux/memblock.h>
#include <linux/mmu_context.h>
#include <asm/fixmap.h>
#include <asm/code-patching.h>
#include <asm/inst.h>

#include <mm/mmu_decl.h>

#define IMMR_SIZE (FIX_IMMR_SIZE << PAGE_SHIFT)

extern int __map_without_ltlbs;

static unsigned long block_mapped_ram;

/*
 * Return PA for this VA if it is in an area mapped with LTLBs or fixmap.
 * Otherwise, returns 0
 */
phys_addr_t v_block_mapped(unsigned long va)
{
	unsigned long p = PHYS_IMMR_BASE;

	if (va >= VIRT_IMMR_BASE && va < VIRT_IMMR_BASE + IMMR_SIZE)
		return p + va - VIRT_IMMR_BASE;
	if (__map_without_ltlbs)
		return 0;
	if (va >= PAGE_OFFSET && va < PAGE_OFFSET + block_mapped_ram)
		return __pa(va);
	return 0;
}

/*
 * Return VA for a given PA mapped with LTLBs or fixmap
 * Return 0 if not mapped
 */
unsigned long p_block_mapped(phys_addr_t pa)
{
	unsigned long p = PHYS_IMMR_BASE;

	if (pa >= p && pa < p + IMMR_SIZE)
		return VIRT_IMMR_BASE + pa - p;
	if (__map_without_ltlbs)
		return 0;
	if (pa < block_mapped_ram)
		return (unsigned long)__va(pa);
	return 0;
}
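
The two lookups above are inverse range checks. A minimal user-space model is sketched below; all `DEMO_` constants are hypothetical values picked for illustration (the real ones depend on the platform and configuration), and the `__map_without_ltlbs` case is modelled simply by passing `block_mapped_ram = 0`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, chosen for illustration only. */
#define DEMO_PAGE_OFFSET	UINT32_C(0xC0000000)
#define DEMO_PHYS_IMMR_BASE	UINT32_C(0xFF000000)
#define DEMO_VIRT_IMMR_BASE	UINT32_C(0xFFF00000)
#define DEMO_IMMR_SIZE		UINT32_C(0x00080000)

/* Model of v_block_mapped(): VA -> PA, or 0 when not block mapped. */
static uint32_t demo_v_block_mapped(uint32_t va, uint32_t block_mapped_ram)
{
	/* The linear block must not reach the IMMR window in this model. */
	assert(block_mapped_ram <= DEMO_VIRT_IMMR_BASE - DEMO_PAGE_OFFSET);
	if (va >= DEMO_VIRT_IMMR_BASE && va - DEMO_VIRT_IMMR_BASE < DEMO_IMMR_SIZE)
		return DEMO_PHYS_IMMR_BASE + (va - DEMO_VIRT_IMMR_BASE);
	if (va >= DEMO_PAGE_OFFSET && va - DEMO_PAGE_OFFSET < block_mapped_ram)
		return va - DEMO_PAGE_OFFSET;
	return 0;
}

/* Model of p_block_mapped(): PA -> VA, or 0 when not block mapped. */
static uint32_t demo_p_block_mapped(uint32_t pa, uint32_t block_mapped_ram)
{
	assert(block_mapped_ram <= DEMO_VIRT_IMMR_BASE - DEMO_PAGE_OFFSET);
	if (pa >= DEMO_PHYS_IMMR_BASE && pa - DEMO_PHYS_IMMR_BASE < DEMO_IMMR_SIZE)
		return DEMO_VIRT_IMMR_BASE + (pa - DEMO_PHYS_IMMR_BASE);
	if (pa < block_mapped_ram)
		return DEMO_PAGE_OFFSET + pa;
	return 0;
}
```

With 16 MiB block mapped, a virtual address 1 MiB above `DEMO_PAGE_OFFSET` translates to physical 0x00100000 and back, while the first address past the block yields 0. Note that in this model, as in the functions above, a return value of 0 from the VA-to-PA lookup cannot be told apart from physical address 0.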

#define LARGE_PAGE_SIZE_8M (1<<23)

/*
 * MMU_init_hw does the chip-specific initialization of the MMU hardware.
 */
void __init MMU_init_hw(void)
{
	/* PIN up to the 3 first 8Mb after IMMR in DTLB table */
	if (IS_ENABLED(CONFIG_PIN_TLB_DATA)) {
		unsigned long ctr = mfspr(SPRN_MD_CTR) & 0xfe000000;
		unsigned long flags = 0xf0 | MD_SPS16K | _PAGE_SH | _PAGE_DIRTY;
		int i = IS_ENABLED(CONFIG_PIN_TLB_IMMR) ? 29 : 28;
		unsigned long addr = 0;
		unsigned long mem = total_lowmem;

		for (; i < 32 && mem >= LARGE_PAGE_SIZE_8M; i++) {
			mtspr(SPRN_MD_CTR, ctr | (i << 8));
			mtspr(SPRN_MD_EPN, (unsigned long)__va(addr) | MD_EVALID);
			mtspr(SPRN_MD_TWC, MD_PS8MEG | MD_SVALID);
			mtspr(SPRN_MD_RPN, addr | flags | _PAGE_PRESENT);
			addr += LARGE_PAGE_SIZE_8M;
			mem -= LARGE_PAGE_SIZE_8M;
		}
	}
}

static void __init mmu_mapin_immr(void)
{
	unsigned long p = PHYS_IMMR_BASE;
	unsigned long v = VIRT_IMMR_BASE;
	int offset;

	for (offset = 0; offset < IMMR_SIZE; offset += PAGE_SIZE)
		map_kernel_page(v + offset, p + offset, PAGE_KERNEL_NCG);
}

static void mmu_patch_cmp_limit(s32 *site, unsigned long mapped)
{
	modify_instruction_site(site, 0xffff, (unsigned long)__va(mapped) >> 16);
}

static void mmu_patch_addis(s32 *site, long simm)
{
	unsigned int instr = *(unsigned int *)patch_site_addr(site);

	instr &= 0xffff0000;
	instr |= ((unsigned long)simm) >> 16;
	patch_instruction_site(site, ppc_inst(instr));
}

static void mmu_mapin_ram_chunk(unsigned long offset, unsigned long top, pgprot_t prot)
{
	unsigned long s = offset;
	unsigned long v = PAGE_OFFSET + s;
	phys_addr_t p = memstart_addr + s;

	for (; s < top; s += PAGE_SIZE) {
		map_kernel_page(v, p, prot);
		v += PAGE_SIZE;
		p += PAGE_SIZE;
	}
}

unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
{
	unsigned long mapped;

	if (__map_without_ltlbs) {
		mapped = 0;
		mmu_mapin_immr();
		if (!IS_ENABLED(CONFIG_PIN_TLB_IMMR))
			patch_instruction_site(&patch__dtlbmiss_immr_jmp, ppc_inst(PPC_INST_NOP));
		if (!IS_ENABLED(CONFIG_PIN_TLB_TEXT))
			mmu_patch_cmp_limit(&patch__itlbmiss_linmem_top, 0);
	} else {
		unsigned long einittext8 = ALIGN(__pa(_einittext), SZ_8M);

		mapped = top & ~(LARGE_PAGE_SIZE_8M - 1);
		if (!IS_ENABLED(CONFIG_PIN_TLB_TEXT))
			mmu_patch_cmp_limit(&patch__itlbmiss_linmem_top, einittext8);

		/*
		 * Populate page tables to:
		 * - have them appear in /sys/kernel/debug/kernel_page_tables
		 * - allow the BDI to find the pages when they are not PINNED
		 */
		mmu_mapin_ram_chunk(0, einittext8, PAGE_KERNEL_X);
		mmu_mapin_ram_chunk(einittext8, mapped, PAGE_KERNEL);
		mmu_mapin_immr();
	}

	mmu_patch_cmp_limit(&patch__dtlbmiss_linmem_top, mapped);
	mmu_patch_cmp_limit(&patch__fixupdar_linmem_top, mapped);

	/* If the size of RAM is not an exact power of two, we may not
	 * have covered RAM in its entirety with 8 MiB
	 * pages. Consequently, restrict the top end of RAM currently
	 * allocable so that calls to the MEMBLOCK to allocate PTEs for "tail"
	 * coverage with normal-sized pages (or other reasons) do not
	 * attempt to allocate outside the allowed range.
	 */
	if (mapped)
		memblock_set_current_limit(mapped);

	block_mapped_ram = mapped;

	return mapped;
}
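
The two 8M roundings used above (rounding `top` down for `mapped`, and rounding the end of init text up for `einittext8`) can be checked in isolation. The sketch below is a user-space illustration with hypothetical `demo_` names; it assumes 32-bit addresses and mirrors only the bit arithmetic, not the page table or memblock side effects.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SZ_8M	(UINT32_C(1) << 23)

/* Round down to an 8M boundary, like 'top & ~(LARGE_PAGE_SIZE_8M - 1)'. */
static uint32_t demo_align_down_8m(uint32_t addr)
{
	return addr & ~(DEMO_SZ_8M - 1);
}

/* Round up to an 8M boundary, like ALIGN(x, SZ_8M). */
static uint32_t demo_align_up_8m(uint32_t addr)
{
	assert(addr <= UINT32_MAX - (DEMO_SZ_8M - 1));	/* avoid wrap-around */
	return (addr + DEMO_SZ_8M - 1) & ~(DEMO_SZ_8M - 1);
}
```

For instance, with `top` at 28 MiB (0x01C00000) the rounded-down value is 24 MiB (0x01800000), which leaves a 4 MiB tail outside the 8M block mapping; this is the situation the comment about restricting the memblock limit refers to.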

void mmu_mark_initmem_nx(void)
{
	if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX) && CONFIG_ETEXT_SHIFT < 23)
		mmu_patch_addis(&patch__itlbmiss_linmem_top8,
				-((long)_etext & ~(LARGE_PAGE_SIZE_8M - 1)));
	if (!IS_ENABLED(CONFIG_PIN_TLB_TEXT)) {
		unsigned long einittext8 = ALIGN(__pa(_einittext), SZ_8M);
		unsigned long etext8 = ALIGN(__pa(_etext), SZ_8M);
		unsigned long etext = __pa(_etext);

		mmu_patch_cmp_limit(&patch__itlbmiss_linmem_top, __pa(_etext));

		/* Update page tables for PTDUMP and BDI */
		mmu_mapin_ram_chunk(0, einittext8, __pgprot(0));
		if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)) {
			mmu_mapin_ram_chunk(0, etext, PAGE_KERNEL_TEXT);
			mmu_mapin_ram_chunk(etext, einittext8, PAGE_KERNEL);
		} else {
			mmu_mapin_ram_chunk(0, etext8, PAGE_KERNEL_TEXT);
			mmu_mapin_ram_chunk(etext8, einittext8, PAGE_KERNEL);
		}
	}
}

#ifdef CONFIG_STRICT_KERNEL_RWX
void mmu_mark_rodata_ro(void)
{
	unsigned long sinittext = __pa(_sinittext);
	unsigned long etext = __pa(_etext);

	if (CONFIG_DATA_SHIFT < 23)
		mmu_patch_addis(&patch__dtlbmiss_romem_top8,
				-__pa(((unsigned long)_sinittext) &
				      ~(LARGE_PAGE_SIZE_8M - 1)));
	mmu_patch_addis(&patch__dtlbmiss_romem_top, -__pa(_sinittext));

	/* Update page tables for PTDUMP and BDI */
	mmu_mapin_ram_chunk(0, sinittext, __pgprot(0));
	mmu_mapin_ram_chunk(0, etext, PAGE_KERNEL_ROX);
	mmu_mapin_ram_chunk(etext, sinittext, PAGE_KERNEL_RO);
}
#endif

void __init setup_initial_memory_limit(phys_addr_t first_memblock_base,
				       phys_addr_t first_memblock_size)
{
	/* We don't currently support the first MEMBLOCK not mapping 0
	 * physical on those processors
	 */
	BUG_ON(first_memblock_base != 0);

	/* 8xx can only access 32MB at the moment */
	memblock_set_current_limit(min_t(u64, first_memblock_size, 0x02000000));
}

/*
 * Set up to use a given MMU context.
 * id is context number, pgd is PGD pointer.
 *
 * We place the physical address of the new task page directory loaded
 * into the MMU base register, and set the ASID compare register with
 * the new "context."
 */
void set_context(unsigned long id, pgd_t *pgd)
{
	s16 offset = (s16)(__pa(swapper_pg_dir));

	/* Context switch the PTE pointer for the Abatron BDI2000.
	 * The PGDIR is passed as second argument.
	 */
	if (IS_ENABLED(CONFIG_BDI_SWITCH))
		abatron_pteptrs[1] = pgd;

	/* Register M_TWB will contain base address of level 1 table minus the
	 * lower part of the kernel PGDIR base address, so that all accesses to
	 * level 1 table are done relative to lower part of kernel PGDIR base
	 * address.
	 */
	mtspr(SPRN_M_TWB, __pa(pgd) - offset);

	/* Update context */
powerpc/mm/slice: Fix hugepage allocation at hint address on 8xx
On the 8xx, the page size is set in the PMD entry and applies to
all pages of the page table pointed to by the said PMD entry.
When an app has some regular pages allocated (e.g. see below) and tries
to mmap() a huge page at a hint address covered by the same PMD entry,
the kernel accepts the hint although the 8xx cannot handle different
page sizes in the same PMD entry.
10000000-10001000 r-xp 00000000 00:0f 2597 /root/malloc
10010000-10011000 rwxp 00000000 00:0f 2597 /root/malloc
mmap(0x10080000, 524288, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x10080000
This results in the app remaining forever in do_page_fault()/hugetlb_fault()
and when interrupting that app, we get the following warning:
[162980.035629] WARNING: CPU: 0 PID: 2777 at arch/powerpc/mm/hugetlbpage.c:354 hugetlb_free_pgd_range+0xc8/0x1e4
[162980.035699] CPU: 0 PID: 2777 Comm: malloc Tainted: G W 4.14.6 #85
[162980.035744] task: c67e2c00 task.stack: c668e000
[162980.035783] NIP: c000fe18 LR: c00e1eec CTR: c00f90c0
[162980.035830] REGS: c668fc20 TRAP: 0700 Tainted: G W (4.14.6)
[162980.035854] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 24044224 XER: 20000000
[162980.036003]
[162980.036003] GPR00: c00e1eec c668fcd0 c67e2c00 00000010 c6869410 10080000 00000000 77fb4000
[162980.036003] GPR08: ffff0001 0683c001 00000000 ffffff80 44028228 10018a34 00004008 418004fc
[162980.036003] GPR16: c668e000 00040100 c668e000 c06c0000 c668fe78 c668e000 c6835ba0 c668fd48
[162980.036003] GPR24: 00000000 73ffffff 74000000 00000001 77fb4000 100fffff 10100000 10100000
[162980.036743] NIP [c000fe18] hugetlb_free_pgd_range+0xc8/0x1e4
[162980.036839] LR [c00e1eec] free_pgtables+0x12c/0x150
[162980.036861] Call Trace:
[162980.036939] [c668fcd0] [c00f0774] unlink_anon_vmas+0x1c4/0x214 (unreliable)
[162980.037040] [c668fd10] [c00e1eec] free_pgtables+0x12c/0x150
[162980.037118] [c668fd40] [c00eabac] exit_mmap+0xe8/0x1b4
[162980.037210] [c668fda0] [c0019710] mmput.part.9+0x20/0xd8
[162980.037301] [c668fdb0] [c001ecb0] do_exit+0x1f0/0x93c
[162980.037386] [c668fe00] [c001f478] do_group_exit+0x40/0xcc
[162980.037479] [c668fe10] [c002a76c] get_signal+0x47c/0x614
[162980.037570] [c668fe70] [c0007840] do_signal+0x54/0x244
[162980.037654] [c668ff30] [c0007ae8] do_notify_resume+0x34/0x88
[162980.037744] [c668ff40] [c000dae8] do_user_signal+0x74/0xc4
[162980.037781] Instruction dump:
[162980.037821] 7fdff378 81370000 54a3463a 80890020 7d24182e 7c841a14 712a0004 4082ff94
[162980.038014] 2f890000 419e0010 712a0ff0 408200e0 <0fe00000> 54a9000a 7f984840 419d0094
[162980.038216] ---[ end trace c0ceeca8e7a5800a ]---
[162980.038754] BUG: non-zero nr_ptes on freeing mm: 1
[162985.363322] BUG: non-zero nr_ptes on freeing mm: -1
In order to fix this, this patch uses the address space "slices"
implemented for BOOK3S/64 and enhanced to support PPC32 by the
preceding patch.
This patch modifies the context.id on the 8xx to be in the range
[1:16] instead of [0:15] in order to identify context.id == 0 as
an uninitialised context, as done on BOOK3S.
This patch activates CONFIG_PPC_MM_SLICES when CONFIG_HUGETLB_PAGE is
selected for the 8xx.
Although we could in theory have as many slices as PMD entries, the
current slices implementation limits the number of low slices to 16.
This limitation does not prevent us from fixing the initial issue, although
it is suboptimal. It will be cured in a subsequent patch.
Fixes: 4b91428699477 ("powerpc/8xx: Implement support of hugepages")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-22 21:27:26 +07:00
	mtspr(SPRN_M_CASID, id - 1);
	/* sync */
	mb();
}
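
The M_TWB value written by set_context() is the physical address of the PGD minus the sign-extended low 16 bits of the kernel PGDIR physical address. The sketch below models only that arithmetic in user space; the `demo_` names and the sample addresses are hypothetical, and it makes no claim about how the TLB miss handlers consume the register.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extended low 16 bits of an address, like the (s16) cast in set_context(). */
static int32_t demo_low16_signed(uint32_t pa)
{
	uint32_t low = pa & 0xFFFFu;

	return low >= 0x8000u ? (int32_t)low - 0x10000 : (int32_t)low;
}

/* Value written to M_TWB: PGD physical address minus that signed offset. */
static uint32_t demo_m_twb(uint32_t pgd_pa, uint32_t swapper_pg_dir_pa)
{
	return pgd_pa - (uint32_t)demo_low16_signed(swapper_pg_dir_pa);
}

/* Adding the signed offset back to M_TWB recovers the PGD address. */
static void demo_check_round_trip(uint32_t pgd_pa, uint32_t swapper_pg_dir_pa)
{
	uint32_t twb = demo_m_twb(pgd_pa, swapper_pg_dir_pa);

	assert(twb + (uint32_t)demo_low16_signed(swapper_pg_dir_pa) == pgd_pa);
}
```

For example, with a hypothetical kernel PGDIR at 0x00219000 the signed offset is -0x7000, so a task PGD at 0x01234000 gives an M_TWB value of 0x0123B000.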

void flush_instruction_cache(void)
{
	isync();
	mtspr(SPRN_IC_CST, IDC_INVALL);
	isync();
}

#ifdef CONFIG_PPC_KUEP
void __init setup_kuep(bool disabled)
{
	if (disabled)
		return;

	pr_info("Activating Kernel Userspace Execution Prevention\n");

	mtspr(SPRN_MI_AP, MI_APG_KUEP);
}
#endif

#ifdef CONFIG_PPC_KUAP
void __init setup_kuap(bool disabled)
{
	pr_info("Activating Kernel Userspace Access Protection\n");

	if (disabled)
		pr_warn("KUAP cannot be disabled yet on 8xx when compiled in\n");

	mtspr(SPRN_MD_AP, MD_APG_KUAP);
}
#endif