powerpc updates for 5.10

- A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting it for
    powerpc, as well as a related fix for sparc.
 
  - Remove support for PowerPC 601.
 
  - Some fixes for watchpoints & addition of a new ptrace flag for detecting ISA
    v3.1 (Power10) watchpoint features.
 
  - A fix for kernels using 4K pages and the hash MMU on bare metal Power9
    systems with > 16TB of RAM, or RAM on the 2nd node.
 
  - A basic idle driver for shallow stop states on Power10.
 
  - Tweaks to our sched domains code to better inform the scheduler about the
    hardware topology on Power9/10, where two SMT4 cores can be presented by
    firmware as an SMT8 core.
 
  - A series doing further reworks & cleanups of our EEH code.
 
  - Addition of a filter for RTAS (firmware) calls done via sys_rtas(), to
    prevent root from overwriting kernel memory.
 
  - Other smaller features, fixes & cleanups.
 
 Thanks to:
   Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Biwen
   Li, Cameron Berkenpas, Cédric Le Goater, Christophe Leroy, Christoph Hellwig,
   Colin Ian King, Daniel Axtens, David Dai, Finn Thain, Frederic Barrat, Gautham
   R. Shenoy, Greg Kurz, Gustavo Romero, Ira Weiny, Jason Yan, Joel Stanley,
   Jordan Niethe, Kajol Jain, Konrad Rzeszutek Wilk, Laurent Dufour, Leonardo
   Bras, Liu Shixin, Luca Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar,
   Nathan Lynch, Nicholas Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver
   O'Halloran, Pedro Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai,
   Qinglang Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott
   Cheloha, Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt,
   Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain,
   Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang
   Yingliang, zhengbin.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl+JBQoTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgJJAD/0e3tsFP+9rFlxKSJlDcMW3w7kXDRXE
 tG40F1ubYFLU8wtFVR0De3njTRsz5HyaNU6SI8CwPq48mCa7OFn1D1OeHonHXDX9
 w6v3GE2S1uXXQnjm+czcfdjWQut0IwWBLx007/S23WcPff3Abc2irupKLNu+Gx29
 b/yxJHZSRJVX59jSV94HkdJS75mDHQ3oUOlFGXtuGcUZDufpD1ynRcQOjr0V/8JU
 F4WAblFSe7hiczHGqIvfhFVJ+OikEhnj2aEMAL8U7vxzrAZ7RErKCN9s/0Tf0Ktx
 FzNEFNLHZGqh+qNDpKKmM+RnaeO2Lcoc9qVn7vMHOsXPzx9F5LJwkI/DgPjtgAq/
 mFvGnQB/FapATnQeMluViC/qhEe5bQXLUfPP5i2+QOjK0QqwyFlUMgaVNfsY8jRW
 0Q/sNA72Opzst4WUTveCd4SOInlUuat09e5nLooCRLW7u7/jIiXNRSFNvpOiwkfF
 EcIPJsi6FUQ4SNbqpRSNEO9fK5JZrrUtmr0pg8I7fZhHYGcxEjqPR6IWCs3DTsak
 4/KhjhhTnP/IWJRw6qKAyNhEyEwpWqYZ97SIQbvSb1g/bS47AIdQdJRb0eEoRjhx
 sbbnnYFwPFkG4c1yQSIFanT9wNDQ2hFx/c/mRfbd7J+ordx9JsoqXjqrGuhsU/pH
 GttJLmkJ5FH+pQ==
 =akeX
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting
   it for powerpc, as well as a related fix for sparc.

 - Remove support for PowerPC 601.

 - Some fixes for watchpoints & addition of a new ptrace flag for
   detecting ISA v3.1 (Power10) watchpoint features.

 - A fix for kernels using 4K pages and the hash MMU on bare metal
   Power9 systems with > 16TB of RAM, or RAM on the 2nd node.

 - A basic idle driver for shallow stop states on Power10.

 - Tweaks to our sched domains code to better inform the scheduler about
   the hardware topology on Power9/10, where two SMT4 cores can be
   presented by firmware as an SMT8 core.

 - A series doing further reworks & cleanups of our EEH code.

 - Addition of a filter for RTAS (firmware) calls done via sys_rtas(),
   to prevent root from overwriting kernel memory.

 - Other smaller features, fixes & cleanups.

Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe
Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn
Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero,
Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad
Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca
Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas
Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro
Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang
Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha,
Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt,
Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain,
Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang
Yingliang, zhengbin.

* tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits)
  Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed"
  selftests/powerpc: Fix eeh-basic.sh exit codes
  cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier
  powerpc/time: Make get_tb() common to PPC32 and PPC64
  powerpc/time: Make get_tbl() common to PPC32 and PPC64
  powerpc/time: Remove get_tbu()
  powerpc/time: Avoid using get_tbl() and get_tbu() internally
  powerpc/time: Make mftb() common to PPC32 and PPC64
  powerpc/time: Rename mftbl() to mftb()
  powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S
  powerpc/32s: Rename head_32.S to head_book3s_32.S
  powerpc/32s: Setup the early hash table at all time.
  powerpc/time: Remove ifdef in get_dec() and set_dec()
  powerpc: Remove get_tb_or_rtc()
  powerpc: Remove __USE_RTC()
  powerpc: Tidy up a bit after removal of PowerPC 601.
  powerpc: Remove support for PowerPC 601
  powerpc: Remove PowerPC 601
  powerpc: Drop SYNC_601() ISYNC_601() and SYNC()
  powerpc: Remove CONFIG_PPC601_SYNC_FIX
  ...
This commit is contained in:
Linus Torvalds 2020-10-16 12:21:15 -07:00
commit 96685f8666
223 changed files with 3245 additions and 2491 deletions

View File

@ -1,3 +1,28 @@
What: /sys/bus/event_source/devices/hv_24x7/format
Date: September 2020
Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
Description: Read-only. Attribute group to describe the magic bits
that go into perf_event_attr.config for a particular pmu.
(See ABI/testing/sysfs-bus-event_source-devices-format).
Each attribute under this group defines a bit range of the
perf_event_attr.config. All supported attributes are listed
below.
chip = "config:16-31"
core = "config:16-31"
domain = "config:0-3"
lpar = "config:0-15"
offset = "config:32-63"
vcpu = "config:16-31"
For example,
PM_PB_CYC = "domain=1,offset=0x80,chip=?,lpar=0x0"
In this event, '?' after chip specifies that
this value will be provided by user while running this event.
What: /sys/bus/event_source/devices/hv_24x7/interface/catalog
Date: February 2014
Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>

View File

@ -1,3 +1,34 @@
What: /sys/bus/event_source/devices/hv_gpci/format
Date: September 2020
Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
Description: Read-only. Attribute group to describe the magic bits
that go into perf_event_attr.config for a particular pmu.
(See ABI/testing/sysfs-bus-event_source-devices-format).
Each attribute under this group defines a bit range of the
perf_event_attr.config. All supported attributes are listed
below.
counter_info_version = "config:16-23"
length = "config:24-31"
partition_id = "config:32-63"
request = "config:0-31"
sibling_part_id = "config:32-63"
hw_chip_id = "config:32-63"
offset = "config:32-63"
phys_processor_idx = "config:32-63"
secondary_index = "config:0-15"
starting_index = "config:32-63"
For example,
processor_core_utilization_instructions_completed = "request=0x94,
phys_processor_idx=?,counter_info_version=0x8,
length=8,offset=0x18"
In this event, '?' after phys_processor_idx specifies this value
this value will be provided by user while running this event.
What: /sys/bus/event_source/devices/hv_gpci/interface/collect_privileged
Date: February 2014
Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
@ -41,3 +72,10 @@ Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
Description:
A number indicating the latest version of the gpci interface
that the kernel is aware of.
What: /sys/devices/hv_gpci/cpumask
Date: October 2020
Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
Description: read only
This sysfs file exposes the cpumask which is designated to make
HCALLs to retrieve hv-gpci pmu event counter data.

View File

@ -7,6 +7,7 @@ Mapping of some CPU versions to relevant ISA versions.
========= ====================================================================
CPU Architecture version
========= ====================================================================
Power10 Power ISA v3.1
Power9 Power ISA v3.0B
Power8 Power ISA v2.07
Power7 Power ISA v2.06
@ -32,6 +33,7 @@ Key Features
========== ==================
CPU VMX (aka. Altivec)
========== ==================
Power10 Yes
Power9 Yes
Power8 Yes
Power7 Yes
@ -47,6 +49,7 @@ PPC970 Yes
========== ====
CPU VSX
========== ====
Power10 Yes
Power9 Yes
Power8 Yes
Power7 Yes
@ -62,6 +65,7 @@ PPC970 No
========== ====================================
CPU Transactional Memory
========== ====================================
Power10 No (* see Power ISA v3.1, "Appendix A. Notes on the Removal of Transactional Memory from the Architecture")
Power9 Yes (* see transactional_memory.txt)
Power8 Yes
Power7 No

View File

@ -46,6 +46,7 @@ features will have bits indicating whether there is support for::
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
#define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 0x20
2. PTRACE_SETHWDEBUG

View File

@ -420,6 +420,13 @@ config MMU_GATHER_NO_GATHER
bool
depends on MMU_GATHER_TABLE_FREE
config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
bool
help
Temporary select until all architectures can be converted to have
irqs disabled over activate_mm. Architectures that do IPI based TLB
shootdowns should enable this.
config ARCH_HAVE_NMI_SAFE_CMPXCHG
bool

View File

@ -59,7 +59,10 @@ config HAVE_SETUP_PER_CPU_AREA
def_bool PPC64
config NEED_PER_CPU_EMBED_FIRST_CHUNK
def_bool PPC64
def_bool y if PPC64
config NEED_PER_CPU_PAGE_FIRST_CHUNK
def_bool y if PPC64
config NR_IRQS
int "Number of virtual interrupt numbers"
@ -148,6 +151,7 @@ config PPC
select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS
select ARCH_WANT_IPC_PARSE_VERSION
select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
select ARCH_WEAK_RELEASE_ACQUIRE
select BINFMT_ELF
select BUILDTIME_TABLE_SORT
@ -964,7 +968,7 @@ config PPC_MEM_KEYS
config PPC_SECURE_BOOT
prompt "Enable secure boot support"
bool
depends on PPC_POWERNV
depends on PPC_POWERNV || PPC_PSERIES
depends on IMA_ARCH_POLICY
imply IMA_SECURE_AND_OR_TRUSTED_BOOT
help
@ -984,6 +988,19 @@ config PPC_SECVAR_SYSFS
read/write operations on these variables. Say Y if you have
secure boot enabled and want to expose variables to userspace.
config PPC_RTAS_FILTER
bool "Enable filtering of RTAS syscalls"
default y
depends on PPC_RTAS
help
The RTAS syscall API has security issues that could be used to
compromise system integrity. This option enforces restrictions on the
RTAS calls and arguments passed by userspace programs to mitigate
these issues.
Say Y unless you know what you are doing and the filter is causing
problems for you.
endmenu
config ISA_DMA_API

View File

@ -264,7 +264,8 @@ KBUILD_CFLAGS += $(cpu-as-y)
KBUILD_AFLAGS += $(aflags-y)
KBUILD_CFLAGS += $(cflags-y)
head-y := arch/powerpc/kernel/head_$(BITS).o
head-$(CONFIG_PPC64) := arch/powerpc/kernel/head_64.o
head-$(CONFIG_PPC_BOOK3S_32) := arch/powerpc/kernel/head_book3s_32.o
head-$(CONFIG_PPC_8xx) := arch/powerpc/kernel/head_8xx.o
head-$(CONFIG_40x) := arch/powerpc/kernel/head_40x.o
head-$(CONFIG_44x) := arch/powerpc/kernel/head_44x.o

View File

@ -18,7 +18,7 @@ quiet_cmd_relocs_check = CHKREL $@
ifdef CONFIG_PPC_BOOK3S_64
cmd_relocs_check = \
$(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh "$(OBJDUMP)" "$(NM)" "$@" ; \
$(BASH) $(srctree)/arch/powerpc/tools/unrel_branch_check.sh "$(OBJDUMP)" "$@"
$(BASH) $(srctree)/arch/powerpc/tools/unrel_branch_check.sh "$(OBJDUMP)" "$(NM)" "$@"
else
cmd_relocs_check = \
$(CONFIG_SHELL) $(srctree)/arch/powerpc/tools/relocs_check.sh "$(OBJDUMP)" "$(NM)" "$@"

View File

@ -7,7 +7,7 @@
# Based on coffboot by Paul Mackerras
# Simplified for ppc64 by Todd Inglett
#
# NOTE: this code is built for 32 bit in ELF32 format even though
# NOTE: this code may be built for 32 bit in ELF32 format even though
# it packages a 64 bit kernel. We do this to simplify the
# bootloader and increase compatibility with OpenFirmware.
#

View File

@ -161,7 +161,6 @@ eeprom@50 {
rtc@68 {
compatible = "dallas,ds1339";
reg = <0x68>;
interrupts = <0x1 0x1 0 0>;
};
};

View File

@ -144,7 +144,6 @@ eeprom@56 {
rtc@68 {
compatible = "dallas,ds1374";
reg = <0x68>;
interrupts = <0x1 0x1 0 0>;
};
};

View File

@ -18,7 +18,7 @@
.text
/* udelay (on non-601 processors) needs to know the period of the
/* udelay needs to know the period of the
* timebase in nanoseconds. This used to be hardcoded to be 60ns
* (period of 66MHz/4). Now a variable is used that is initialized to
* 60 for backward compatibility, but it can be overridden as necessary
@ -37,19 +37,6 @@ timebase_period_ns:
*/
.globl udelay
udelay:
mfspr r4,SPRN_PVR
srwi r4,r4,16
cmpwi 0,r4,1 /* 601 ? */
bne .Ludelay_not_601
00: li r0,86 /* Instructions / microsecond? */
mtctr r0
10: addi r0,r0,0 /* NOP */
bdnz 10b
subic. r3,r3,1
bne 00b
blr
.Ludelay_not_601:
mulli r4,r3,1000 /* nanoseconds */
/* Change r4 to be the number of ticks using:
* (nanoseconds + (timebase_period_ns - 1 )) / timebase_period_ns

View File

@ -29,9 +29,9 @@ CONFIG_SYN_COOKIES=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=32768
CONFIG_IDE=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_ATA=y
CONFIG_ATA_GENERIC=y
CONFIG_PATA_VIA=y
CONFIG_NETDEVICES=y
CONFIG_GIANFAR=y
CONFIG_E1000=y

View File

@ -30,9 +30,9 @@ CONFIG_MTD_CFI_AMDSTD=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=32768
CONFIG_IDE=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_ATA=y
CONFIG_ATA_GENERIC=y
CONFIG_PATA_VIA=y
CONFIG_NETDEVICES=y
CONFIG_GIANFAR=y
CONFIG_E100=y

View File

@ -30,9 +30,9 @@ CONFIG_MTD_CFI_AMDSTD=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=32768
CONFIG_IDE=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_ATA=y
CONFIG_ATA_GENERIC=y
CONFIG_PATA_VIA=y
CONFIG_NETDEVICES=y
CONFIG_GIANFAR=y
CONFIG_E100=y

View File

@ -30,9 +30,9 @@ CONFIG_MTD_CFI_AMDSTD=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=32768
CONFIG_IDE=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_ATA=y
CONFIG_ATA_GENERIC=y
CONFIG_PATA_VIA=y
CONFIG_NETDEVICES=y
CONFIG_GIANFAR=y
CONFIG_E100=y

View File

@ -30,9 +30,9 @@ CONFIG_MTD_CFI_AMDSTD=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=32768
CONFIG_IDE=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_ATA=y
CONFIG_ATA_GENERIC=y
CONFIG_PATA_VIA=y
CONFIG_NETDEVICES=y
CONFIG_GIANFAR=y
CONFIG_E100=y

View File

@ -67,6 +67,7 @@ void single_step_exception(struct pt_regs *regs);
void program_check_exception(struct pt_regs *regs);
void alignment_exception(struct pt_regs *regs);
void StackOverflow(struct pt_regs *regs);
void stack_overflow_exception(struct pt_regs *regs);
void kernel_fp_unavailable_exception(struct pt_regs *regs);
void altivec_unavailable_exception(struct pt_regs *regs);
void vsx_unavailable_exception(struct pt_regs *regs);
@ -144,7 +145,9 @@ void _kvmppc_restore_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr);
void _kvmppc_save_tm_pr(struct kvm_vcpu *vcpu, u64 guest_msr);
/* Patch sites */
extern s32 patch__call_flush_branch_caches;
extern s32 patch__call_flush_branch_caches1;
extern s32 patch__call_flush_branch_caches2;
extern s32 patch__call_flush_branch_caches3;
extern s32 patch__flush_count_cache_return;
extern s32 patch__flush_link_stack_return;
extern s32 patch__call_kvm_flush_link_stack;

View File

@ -13,20 +13,24 @@
*/
#define MAX_EA_BITS_PER_CONTEXT 46
#define REGION_SHIFT (MAX_EA_BITS_PER_CONTEXT - 2)
/*
* Our page table limit us to 64TB. Hence for the kernel mapping,
* each MAP area is limited to 16 TB.
* The four map areas are: linear mapping, vmap, IO and vmemmap
* Our page table limit us to 64TB. For 64TB physical memory, we only need 64GB
* of vmemmap space. To better support sparse memory layout, we use 61TB
* linear map range, 1TB of vmalloc, 1TB of I/O and 1TB of vmememmap.
*/
#define REGION_SHIFT (40)
#define H_KERN_MAP_SIZE (ASM_CONST(1) << REGION_SHIFT)
/*
* Define the address range of the kernel non-linear virtual area
* 16TB
* Limits the linear mapping range
*/
#define H_KERN_VIRT_START ASM_CONST(0xc000100000000000)
#define H_MAX_PHYSMEM_BITS 46
/*
* Define the address range of the kernel non-linear virtual area (61TB)
*/
#define H_KERN_VIRT_START ASM_CONST(0xc0003d0000000000)
#ifndef __ASSEMBLY__
#define H_PTE_TABLE_SIZE (sizeof(pte_t) << H_PTE_INDEX_SIZE)

View File

@ -7,6 +7,19 @@
#define H_PUD_INDEX_SIZE 10 // size: 8B << 10 = 8KB, maps 2^10 x 16GB = 16TB
#define H_PGD_INDEX_SIZE 8 // size: 8B << 8 = 2KB, maps 2^8 x 16TB = 4PB
/*
* If we store section details in page->flags we can't increase the MAX_PHYSMEM_BITS
* if we increase SECTIONS_WIDTH we will not store node details in page->flags and
* page_to_nid does a page->section->node lookup
* Hence only increase for VMEMMAP. Further depending on SPARSEMEM_EXTREME reduce
* memory requirements with large number of sections.
* 51 bits is the max physical real address on POWER9
*/
#if defined(CONFIG_SPARSEMEM_VMEMMAP) && defined(CONFIG_SPARSEMEM_EXTREME)
#define H_MAX_PHYSMEM_BITS 51
#else
#define H_MAX_PHYSMEM_BITS 46
#endif
/*
* Each context is 512TB size. SLB miss for first context/default context

View File

@ -577,8 +577,8 @@ extern void slb_set_size(u16 size);
* For vmalloc and memmap, we use just one context with 512TB. With 64 byte
* struct page size, we need ony 32 TB in memmap for 2PB (51 bits (MAX_PHYSMEM_BITS)).
*/
#if (MAX_PHYSMEM_BITS > MAX_EA_BITS_PER_CONTEXT)
#define MAX_KERNEL_CTX_CNT (1UL << (MAX_PHYSMEM_BITS - MAX_EA_BITS_PER_CONTEXT))
#if (H_MAX_PHYSMEM_BITS > MAX_EA_BITS_PER_CONTEXT)
#define MAX_KERNEL_CTX_CNT (1UL << (H_MAX_PHYSMEM_BITS - MAX_EA_BITS_PER_CONTEXT))
#else
#define MAX_KERNEL_CTX_CNT 1
#endif

View File

@ -27,21 +27,6 @@ struct mmu_psize_def {
extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
#endif /* __ASSEMBLY__ */
/*
* If we store section details in page->flags we can't increase the MAX_PHYSMEM_BITS
* if we increase SECTIONS_WIDTH we will not store node details in page->flags and
* page_to_nid does a page->section->node lookup
* Hence only increase for VMEMMAP. Further depending on SPARSEMEM_EXTREME reduce
* memory requirements with large number of sections.
* 51 bits is the max physical real address on POWER9
*/
#if defined(CONFIG_SPARSEMEM_VMEMMAP) && defined(CONFIG_SPARSEMEM_EXTREME) && \
defined(CONFIG_PPC_64K_PAGES)
#define MAX_PHYSMEM_BITS 51
#else
#define MAX_PHYSMEM_BITS 46
#endif
/* 64-bit classic hash table MMU */
#include <asm/book3s/64/mmu-hash.h>
@ -85,7 +70,7 @@ extern unsigned int mmu_base_pid;
/*
* memory block size used with radix translation.
*/
extern unsigned int __ro_after_init radix_mem_block_size;
extern unsigned long __ro_after_init radix_mem_block_size;
#define PRTB_SIZE_SHIFT (mmu_pid_bits + 4)
#define PRTB_ENTRIES (1ul << mmu_pid_bits)

View File

@ -294,6 +294,13 @@ extern unsigned long pci_io_base;
#include <asm/book3s/64/hash.h>
#include <asm/book3s/64/radix.h>
#if H_MAX_PHYSMEM_BITS > R_MAX_PHYSMEM_BITS
#define MAX_PHYSMEM_BITS H_MAX_PHYSMEM_BITS
#else
#define MAX_PHYSMEM_BITS R_MAX_PHYSMEM_BITS
#endif
#ifdef CONFIG_PPC_64K_PAGES
#include <asm/book3s/64/pgtable-64k.h>
#else

View File

@ -91,6 +91,22 @@
* +------------------------------+ Kernel linear (0xc.....)
*/
/*
* If we store section details in page->flags we can't increase the MAX_PHYSMEM_BITS
* if we increase SECTIONS_WIDTH we will not store node details in page->flags and
* page_to_nid does a page->section->node lookup
* Hence only increase for VMEMMAP. Further depending on SPARSEMEM_EXTREME reduce
* memory requirements with large number of sections.
* 51 bits is the max physical real address on POWER9
*/
#if defined(CONFIG_SPARSEMEM_VMEMMAP) && defined(CONFIG_SPARSEMEM_EXTREME)
#define R_MAX_PHYSMEM_BITS 51
#else
#define R_MAX_PHYSMEM_BITS 46
#endif
#define RADIX_KERN_VIRT_START ASM_CONST(0xc008000000000000)
/*
* 49 = MAX_EA_BITS_PER_CONTEXT (hash specific). To make sure we pick

View File

@ -98,6 +98,16 @@ static inline void invalidate_dcache_range(unsigned long start,
mb(); /* sync */
}
#ifdef CONFIG_4xx
static inline void flush_instruction_cache(void)
{
iccci((void *)KERNELBASE);
isync();
}
#else
void flush_instruction_cache(void);
#endif
#include <asm-generic/cacheflush.h>
#endif /* _ASM_POWERPC_CACHEFLUSH_H */

View File

@ -9,11 +9,6 @@
#ifndef __ASSEMBLY__
/*
* Added to include __machine_check_early_realmode_* functions
*/
#include <asm/mce.h>
/* This structure can grow, it's real size is used by head.S code
* via the mkdefs mechanism.
*/
@ -170,6 +165,7 @@ static inline void cpu_feature_keys_init(void) { }
#else /* CONFIG_PPC32 */
/* Define these to 0 for the sake of tests in common code */
#define CPU_FTR_PPC_LE (0)
#define CPU_FTR_SPE (0)
#endif
/*
@ -299,8 +295,6 @@ static inline void cpu_feature_keys_init(void) { }
#define CPU_FTR_MAYBE_CAN_NAP 0
#endif
#define CPU_FTRS_PPC601 (CPU_FTR_COMMON | \
CPU_FTR_COHERENT_ICACHE)
#define CPU_FTRS_603 (CPU_FTR_COMMON | CPU_FTR_MAYBE_CAN_DOZE | \
CPU_FTR_MAYBE_CAN_NAP | CPU_FTR_PPC_LE | CPU_FTR_NOEXECUTE)
#define CPU_FTRS_604 (CPU_FTR_COMMON | CPU_FTR_PPC_LE)
@ -516,10 +510,8 @@ static inline void cpu_feature_keys_init(void) { }
#else
enum {
CPU_FTRS_POSSIBLE =
#ifdef CONFIG_PPC_BOOK3S_601
CPU_FTRS_PPC601 |
#elif defined(CONFIG_PPC_BOOK3S_32)
CPU_FTRS_PPC601 | CPU_FTRS_603 | CPU_FTRS_604 | CPU_FTRS_740_NOTAU |
#ifdef CONFIG_PPC_BOOK3S_32
CPU_FTRS_603 | CPU_FTRS_604 | CPU_FTRS_740_NOTAU |
CPU_FTRS_740 | CPU_FTRS_750 | CPU_FTRS_750FX1 |
CPU_FTRS_750FX2 | CPU_FTRS_750FX | CPU_FTRS_750GX |
CPU_FTRS_7400_NOTAU | CPU_FTRS_7400 | CPU_FTRS_7450_20 |
@ -594,9 +586,7 @@ enum {
#else
enum {
CPU_FTRS_ALWAYS =
#ifdef CONFIG_PPC_BOOK3S_601
CPU_FTRS_PPC601 &
#elif defined(CONFIG_PPC_BOOK3S_32)
#ifdef CONFIG_PPC_BOOK3S_32
CPU_FTRS_603 & CPU_FTRS_604 & CPU_FTRS_740_NOTAU &
CPU_FTRS_740 & CPU_FTRS_750 & CPU_FTRS_750FX1 &
CPU_FTRS_750FX2 & CPU_FTRS_750FX & CPU_FTRS_750GX &

View File

@ -23,7 +23,6 @@
extern int threads_per_core;
extern int threads_per_subcore;
extern int threads_shift;
extern bool has_big_cores;
extern cpumask_t threads_core_mask;
#else
#define threads_per_core 1

View File

@ -54,7 +54,7 @@ extern void udelay(unsigned long usecs);
({ \
typeof(condition) __ret; \
unsigned long __loops = tb_ticks_per_usec * timeout; \
unsigned long __start = get_tbl(); \
unsigned long __start = mftb(); \
\
if (delay) { \
while (!(__ret = (condition)) && \

View File

@ -8,26 +8,39 @@
#ifndef _ASM_POWERPC_LMB_H
#define _ASM_POWERPC_LMB_H
#include <linux/sched.h>
struct drmem_lmb {
u64 base_addr;
u32 drc_index;
u32 aa_index;
u32 flags;
#ifdef CONFIG_MEMORY_HOTPLUG
int nid;
#endif
};
struct drmem_lmb_info {
struct drmem_lmb *lmbs;
int n_lmbs;
u32 lmb_size;
u64 lmb_size;
};
extern struct drmem_lmb_info *drmem_info;
static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb,
const struct drmem_lmb *start)
{
/*
* DLPAR code paths can take several milliseconds per element
* when interacting with firmware. Ensure that we don't
* unfairly monopolize the CPU.
*/
if (((++lmb - start) % 16) == 0)
cond_resched();
return lmb;
}
#define for_each_drmem_lmb_in_range(lmb, start, end) \
for ((lmb) = (start); (lmb) < (end); (lmb)++)
for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb, start))
#define for_each_drmem_lmb(lmb) \
for_each_drmem_lmb_in_range((lmb), \
@ -67,7 +80,7 @@ struct of_drconf_cell_v2 {
#define DRCONF_MEM_RESERVED 0x00000080
#define DRCONF_MEM_HOTREMOVABLE 0x00000100
static inline u32 drmem_lmb_size(void)
static inline u64 drmem_lmb_size(void)
{
return drmem_info->lmb_size;
}
@ -105,22 +118,4 @@ static inline void invalidate_lmb_associativity_index(struct drmem_lmb *lmb)
lmb->aa_index = 0xffffffff;
}
#ifdef CONFIG_MEMORY_HOTPLUG
static inline void lmb_set_nid(struct drmem_lmb *lmb)
{
lmb->nid = memory_add_physaddr_to_nid(lmb->base_addr);
}
static inline void lmb_clear_nid(struct drmem_lmb *lmb)
{
lmb->nid = -1;
}
#else
static inline void lmb_set_nid(struct drmem_lmb *lmb)
{
}
static inline void lmb_clear_nid(struct drmem_lmb *lmb)
{
}
#endif
#endif /* _ASM_POWERPC_LMB_H */

View File

@ -27,7 +27,6 @@ struct pci_dn;
#define EEH_FORCE_DISABLED 0x02 /* EEH disabled */
#define EEH_PROBE_MODE_DEV 0x04 /* From PCI device */
#define EEH_PROBE_MODE_DEVTREE 0x08 /* From device tree */
#define EEH_VALID_PE_ZERO 0x10 /* PE#0 is valid */
#define EEH_ENABLE_IO_FOR_LOG 0x20 /* Enable IO for log */
#define EEH_EARLY_DUMP_LOG 0x40 /* Dump log immediately */
@ -74,7 +73,6 @@ struct pci_dn;
struct eeh_pe {
int type; /* PE type: PHB/Bus/Device */
int state; /* PE EEH dependent mode */
int config_addr; /* Traditional PCI address */
int addr; /* PE configuration address */
struct pci_controller *phb; /* Associated PHB */
struct pci_bus *bus; /* Top PCI bus for bus PE */
@ -216,7 +214,6 @@ enum {
struct eeh_ops {
char *name;
int (*init)(void);
struct eeh_dev *(*probe)(struct pci_dev *pdev);
int (*set_option)(struct eeh_pe *pe, int option);
int (*get_state)(struct eeh_pe *pe, int *delay);
@ -281,8 +278,7 @@ int eeh_phb_pe_create(struct pci_controller *phb);
int eeh_wait_state(struct eeh_pe *pe, int max_wait);
struct eeh_pe *eeh_phb_pe_get(struct pci_controller *phb);
struct eeh_pe *eeh_pe_next(struct eeh_pe *pe, struct eeh_pe *root);
struct eeh_pe *eeh_pe_get(struct pci_controller *phb,
int pe_no, int config_addr);
struct eeh_pe *eeh_pe_get(struct pci_controller *phb, int pe_no);
int eeh_pe_tree_insert(struct eeh_dev *edev, struct eeh_pe *new_pe_parent);
int eeh_pe_tree_remove(struct eeh_dev *edev);
void eeh_pe_update_time_stamp(struct eeh_pe *pe);
@ -295,8 +291,7 @@ const char *eeh_pe_loc_get(struct eeh_pe *pe);
struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe);
void eeh_show_enabled(void);
int __init eeh_ops_register(struct eeh_ops *ops);
int __exit eeh_ops_unregister(const char *name);
int __init eeh_init(struct eeh_ops *ops);
int eeh_check_failure(const volatile void __iomem *token);
int eeh_dev_check_failure(struct eeh_dev *edev);
void eeh_addr_cache_init(void);

View File

@ -375,11 +375,13 @@
#define H_CPU_CHAR_THREAD_RECONFIG_CTRL (1ull << 57) // IBM bit 6
#define H_CPU_CHAR_COUNT_CACHE_DISABLED (1ull << 56) // IBM bit 7
#define H_CPU_CHAR_BCCTR_FLUSH_ASSIST (1ull << 54) // IBM bit 9
#define H_CPU_CHAR_BCCTR_LINK_FLUSH_ASSIST (1ull << 52) // IBM bit 11
#define H_CPU_BEHAV_FAVOUR_SECURITY (1ull << 63) // IBM bit 0
#define H_CPU_BEHAV_L1D_FLUSH_PR (1ull << 62) // IBM bit 1
#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR (1ull << 61) // IBM bit 2
#define H_CPU_BEHAV_FLUSH_COUNT_CACHE (1ull << 58) // IBM bit 5
#define H_CPU_BEHAV_FLUSH_LINK_STACK (1ull << 57) // IBM bit 6
/* Flag values used in H_REGISTER_PROC_TBL hcall */
#define PROC_TABLE_OP_MASK 0x18
@ -560,6 +562,42 @@ struct hv_guest_state {
/* Latest version of hv_guest_state structure */
#define HV_GUEST_STATE_VERSION 1
/*
* From the document "H_GetPerformanceCounterInfo Interface" v1.07
*
* H_GET_PERF_COUNTER_INFO argument
*/
struct hv_get_perf_counter_info_params {
__be32 counter_request; /* I */
__be32 starting_index; /* IO */
__be16 secondary_index; /* IO */
__be16 returned_values; /* O */
__be32 detail_rc; /* O, only needed when called via *_norets() */
/*
* O, size each of counter_value element in bytes, only set for version
* >= 0x3
*/
__be16 cv_element_size;
/* I, 0 (zero) for versions < 0x3 */
__u8 counter_info_version_in;
/* O, 0 (zero) if version < 0x3. Must be set to 0 when making hcall */
__u8 counter_info_version_out;
__u8 reserved[0xC];
__u8 counter_value[];
} __packed;
#define HGPCI_REQ_BUFFER_SIZE 4096
#define HGPCI_MAX_DATA_BYTES \
(HGPCI_REQ_BUFFER_SIZE - sizeof(struct hv_get_perf_counter_info_params))
struct hv_gpci_request_buffer {
struct hv_get_perf_counter_info_params params;
uint8_t bytes[HGPCI_MAX_DATA_BYTES];
} __packed;
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_HVCALL_H */

View File

@ -10,6 +10,7 @@
#define _PPC_BOOK3S_64_HW_BREAKPOINT_H
#include <asm/cpu_has_feature.h>
#include <asm/inst.h>
#ifdef __KERNEL__
struct arch_hw_breakpoint {
@ -17,6 +18,7 @@ struct arch_hw_breakpoint {
u16 type;
u16 len; /* length of the target data symbol */
u16 hw_len; /* length programmed in hw */
u8 flags;
};
/* Note: Don't change the first 6 bits below as they are in the same order
@ -36,12 +38,15 @@ struct arch_hw_breakpoint {
#define HW_BRK_TYPE_PRIV_ALL (HW_BRK_TYPE_USER | HW_BRK_TYPE_KERNEL | \
HW_BRK_TYPE_HYP)
#define HW_BRK_FLAG_DISABLED 0x1
/* Minimum granularity */
#ifdef CONFIG_PPC_8xx
#define HW_BREAKPOINT_SIZE 0x4
#else
#define HW_BREAKPOINT_SIZE 0x8
#endif
#define HW_BREAKPOINT_SIZE_QUADWORD 0x10
#define DABR_MAX_LEN 8
#define DAWR_MAX_LEN 512
@ -51,6 +56,13 @@ static inline int nr_wp_slots(void)
return cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
}
bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr,
unsigned long ea, int type, int size,
struct arch_hw_breakpoint *info);
void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
int *type, int *size, unsigned long *ea);
#ifdef CONFIG_HAVE_HW_BREAKPOINT
#include <linux/kdebug.h>
#include <asm/reg.h>

View File

@ -25,9 +25,8 @@
#define PACA_IRQ_DBELL 0x02
#define PACA_IRQ_EE 0x04
#define PACA_IRQ_DEC 0x08 /* Or FIT */
#define PACA_IRQ_EE_EDGE 0x10 /* BookE only */
#define PACA_IRQ_HMI 0x20
#define PACA_IRQ_PMI 0x40
#define PACA_IRQ_HMI 0x10
#define PACA_IRQ_PMI 0x20
/*
* Some soft-masked interrupts must be hard masked until they are replayed
@ -369,12 +368,6 @@ static inline void may_hard_irq_enable(void) { }
#define ARCH_IRQ_INIT_FLAGS IRQ_NOREQUEST
/*
* interrupt-retrigger: should we handle this via lost interrupts and IPIs
* or should we not care like we do now ? --BenH.
*/
struct irq_chip;
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_HW_IRQ_H */

View File

@ -156,8 +156,7 @@ struct coprocessor_request_block {
u8 reserved[32];
struct coprocessor_status_block csb;
} __packed;
} __aligned(128);
/* RFC02167 Initiate Coprocessor Instructions document
* Chapter 8.2.1.1.1 RS
@ -188,6 +187,9 @@ static inline int icswx(__be32 ccw, struct coprocessor_request_block *crb)
__be64 ccw_reg = ccw;
u32 cr;
/* NB: the same structures are used by VAS-NX */
BUILD_BUG_ON(sizeof(*crb) != 128);
__asm__ __volatile__(
PPC_ICSWX(%1,0,%2) "\n"
"mfcr %0\n"

View File

@ -35,7 +35,6 @@ static __inline__ int irq_canonicalize(int irq)
extern int distribute_irqs;
struct irqaction;
struct pt_regs;
#define __ARCH_HAS_DO_SOFTIRQ

View File

@ -65,7 +65,6 @@ struct machdep_calls {
void __noreturn (*restart)(char *cmd);
void __noreturn (*halt)(void);
void (*panic)(char *str);
void (*cpu_die)(void);
long (*time_init)(void); /* Optional, may be NULL */
@ -222,8 +221,6 @@ struct machdep_calls {
extern void e500_idle(void);
extern void power4_idle(void);
extern void power7_idle(void);
extern void power9_idle(void);
extern void ppc6xx_idle(void);
extern void book3e_idle(void);

View File

@ -244,7 +244,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
*/
static inline void activate_mm(struct mm_struct *prev, struct mm_struct *next)
{
switch_mm(prev, next, current);
switch_mm_irqs_off(prev, next, current);
}
/* We don't currently use enter_lazy_tlb() for anything */

View File

@ -65,4 +65,18 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
pte_update(mm, addr, ptep, clr, set, 1);
}
#ifdef CONFIG_PPC_4K_PAGES
static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
struct page *page, int writable)
{
size_t size = huge_page_size(hstate_vma(vma));
if (size == SZ_16K)
return __pte(pte_val(entry) & ~_PAGE_HUGE);
else
return entry;
}
#define arch_make_huge_pte arch_make_huge_pte
#endif
#endif /* _ASM_POWERPC_NOHASH_32_HUGETLB_8XX_H */

View File

@ -227,6 +227,19 @@ static inline void pmd_clear(pmd_t *pmdp)
*/
#ifdef CONFIG_PPC_8xx
static pmd_t *pmd_off(struct mm_struct *mm, unsigned long addr);
static int hugepd_ok(hugepd_t hpd);
static int number_of_cells_per_pte(pmd_t *pmd, pte_basic_t val, int huge)
{
if (!huge)
return PAGE_SIZE / SZ_4K;
else if (hugepd_ok(*((hugepd_t *)pmd)))
return 1;
else if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && !(val & _PAGE_HUGE))
return SZ_16K / SZ_4K;
else
return SZ_512K / SZ_4K;
}
static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *p,
unsigned long clr, unsigned long set, int huge)
@ -237,12 +250,7 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p
int num, i;
pmd_t *pmd = pmd_off(mm, addr);
if (!huge)
num = PAGE_SIZE / SZ_4K;
else if ((pmd_val(*pmd) & _PMD_PAGE_MASK) != _PMD_PAGE_8M)
num = SZ_512K / SZ_4K;
else
num = 1;
num = number_of_cells_per_pte(pmd, new, huge);
for (i = 0; i < num; i++, entry++, new += SZ_4K)
*entry = new;

View File

@ -28,7 +28,4 @@ int pnv_ocxl_spa_setup(struct pci_dev *dev, void *spa_mem, int PE_mask, void **p
void pnv_ocxl_spa_release(void *platform_data);
int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle);
int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr);
void pnv_ocxl_free_xive_irq(u32 irq);
#endif /* _ASM_PNV_OCXL_H */

View File

@ -382,16 +382,6 @@ GLUE(.,name):
#endif
/* various errata or part fixups */
#ifdef CONFIG_PPC601_SYNC_FIX
#define SYNC sync; isync
#define SYNC_601 sync
#define ISYNC_601 isync
#else
#define SYNC
#define SYNC_601
#define ISYNC_601
#endif
#if defined(CONFIG_PPC_CELL) || defined(CONFIG_PPC_FSL_BOOK3E)
#define MFTB(dest) \
90: mfspr dest, SPRN_TBRL; \
@ -411,8 +401,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_CELL_TB_BUG, CPU_FTR_CELL_TB_BUG, 96)
#define MFTBU(dest) mfspr dest, SPRN_TBRU
#endif
/* tlbsync is not implemented on 601 */
#if !defined(CONFIG_SMP) || defined(CONFIG_PPC_BOOK3S_601)
#ifndef CONFIG_SMP
#define TLBSYNC
#else
#define TLBSYNC tlbsync; sync

View File

@ -220,6 +220,7 @@ struct thread_struct {
unsigned long tm_tar;
unsigned long tm_ppr;
unsigned long tm_dscr;
unsigned long tm_amr;
/*
* Checkpointed FP and VSX 0-31 register set.
@ -432,16 +433,10 @@ enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
extern int powersave_nap; /* set if nap mode can be used in idle loop */
extern void power7_idle_type(unsigned long type);
extern void power9_idle_type(unsigned long stop_psscr_val,
extern void arch300_idle_type(unsigned long stop_psscr_val,
unsigned long stop_psscr_mask);
extern void flush_instruction_cache(void);
extern void hard_reset_now(void);
extern void poweroff_now(void);
extern int fix_alignment(struct pt_regs *);
extern void cvt_fd(float *from, double *to);
extern void cvt_df(double *from, float *to);
extern void _nmask_and_or_msr(unsigned long nmask, unsigned long or_val);
#ifdef CONFIG_PPC64
/*

View File

@ -243,11 +243,7 @@ static inline void set_trap_norestart(struct pt_regs *regs)
}
#define arch_has_single_step() (1)
#ifndef CONFIG_PPC_BOOK3S_601
#define arch_has_block_step() (true)
#else
#define arch_has_block_step() (false)
#endif
#define ARCH_HAS_USER_SINGLE_STEP_REPORT
/*

View File

@ -521,6 +521,8 @@
#define SPRN_TSCR 0x399 /* Thread Switch Control Register */
#define SPRN_DEC 0x016 /* Decrement Register */
#define SPRN_PIT 0x3DB /* Programmable Interval Timer (40x/BOOKE) */
#define SPRN_DER 0x095 /* Debug Enable Register */
#define DER_RSTE 0x40000000 /* Reset Interrupt */
#define DER_CHSTPE 0x20000000 /* Check Stop */
@ -817,7 +819,7 @@
#define THRM1_TIN (1 << 31)
#define THRM1_TIV (1 << 30)
#define THRM1_THRES(x) ((x&0x7f)<<23)
#define THRM3_SITV(x) ((x&0x3fff)<<1)
#define THRM3_SITV(x) ((x & 0x1fff) << 1)
#define THRM1_TID (1<<2)
#define THRM1_TIE (1<<1)
#define THRM1_V (1<<0)
@ -1353,6 +1355,7 @@
#define PVR_POWER8NVL 0x004C
#define PVR_POWER8 0x004D
#define PVR_POWER9 0x004E
#define PVR_POWER10 0x0080
#define PVR_BE 0x0070
#define PVR_PA6T 0x0090
@ -1416,8 +1419,7 @@ static inline void msr_check_and_clear(unsigned long bits)
__msr_check_and_clear(bits);
}
#ifdef __powerpc64__
#if defined(CONFIG_PPC_CELL) || defined(CONFIG_PPC_FSL_BOOK3E)
#if defined(CONFIG_PPC_CELL) || defined(CONFIG_E500)
#define mftb() ({unsigned long rval; \
asm volatile( \
"90: mfspr %0, %2;\n" \
@ -1427,29 +1429,23 @@ static inline void msr_check_and_clear(unsigned long bits)
: "=r" (rval) \
: "i" (CPU_FTR_CELL_TB_BUG), "i" (SPRN_TBRL) : "cr0"); \
rval;})
#elif defined(CONFIG_PPC_8xx)
#define mftb() ({unsigned long rval; \
asm volatile("mftbl %0" : "=r" (rval)); rval;})
#else
#define mftb() ({unsigned long rval; \
asm volatile("mfspr %0, %1" : \
"=r" (rval) : "i" (SPRN_TBRL)); rval;})
#endif /* !CONFIG_PPC_CELL */
#else /* __powerpc64__ */
#if defined(CONFIG_PPC_8xx)
#define mftbl() ({unsigned long rval; \
asm volatile("mftbl %0" : "=r" (rval)); rval;})
#define mftbu() ({unsigned long rval; \
asm volatile("mftbu %0" : "=r" (rval)); rval;})
#else
#define mftbl() ({unsigned long rval; \
asm volatile("mfspr %0, %1" : "=r" (rval) : \
"i" (SPRN_TBRL)); rval;})
#define mftbu() ({unsigned long rval; \
asm volatile("mfspr %0, %1" : "=r" (rval) : \
"i" (SPRN_TBRU)); rval;})
#endif
#define mftb() mftbl()
#endif /* !__powerpc64__ */
#define mttbl(v) asm volatile("mttbl %0":: "r"(v))
#define mttbu(v) asm volatile("mttbu %0":: "r"(v))

View File

@ -174,7 +174,6 @@
#define SPRN_L1CSR1 0x3F3 /* L1 Cache Control and Status Register 1 */
#define SPRN_MMUCSR0 0x3F4 /* MMU Control and Status Register 0 */
#define SPRN_MMUCFG 0x3F7 /* MMU Configuration Register */
#define SPRN_PIT 0x3DB /* Programmable Interval Timer */
#define SPRN_BUCSR 0x3F5 /* Branch Unit Control and Status */
#define SPRN_L2CSR0 0x3F9 /* L2 Data Cache Control and Status Register 0 */
#define SPRN_L2CSR1 0x3FA /* L2 Data Cache Control and Status Register 1 */

View File

@ -28,8 +28,8 @@
extern int boot_cpuid;
extern int spinning_secondaries;
extern u32 *cpu_to_phys_id;
extern bool coregroup_enabled;
extern void cpu_die(void);
extern int cpu_to_chip_id(int cpu);
#ifdef CONFIG_SMP
@ -50,6 +50,9 @@ struct smp_ops_t {
int (*cpu_disable)(void);
void (*cpu_die)(unsigned int nr);
int (*cpu_bootable)(unsigned int nr);
#ifdef CONFIG_HOTPLUG_CPU
void (*cpu_offline_self)(void);
#endif
};
extern int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us);
@ -118,11 +121,6 @@ static inline struct cpumask *cpu_sibling_mask(int cpu)
return per_cpu(cpu_sibling_map, cpu);
}
static inline struct cpumask *cpu_core_mask(int cpu)
{
return per_cpu(cpu_core_map, cpu);
}
static inline struct cpumask *cpu_l2_cache_mask(int cpu)
{
return per_cpu(cpu_l2_cache_map, cpu);
@ -135,6 +133,19 @@ static inline struct cpumask *cpu_smallcore_mask(int cpu)
extern int cpu_to_core_id(int cpu);
extern bool has_big_cores;
#define cpu_smt_mask cpu_smt_mask
#ifdef CONFIG_SCHED_SMT
static inline const struct cpumask *cpu_smt_mask(int cpu)
{
if (has_big_cores)
return per_cpu(cpu_smallcore_map, cpu);
return per_cpu(cpu_sibling_map, cpu);
}
#endif /* CONFIG_SCHED_SMT */
/* Since OpenPIC has only 4 IPIs, we use slightly different message numbers.
*
* Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up
@ -243,7 +254,6 @@ extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
* 64-bit but defining them all here doesn't harm
*/
extern void generic_secondary_smp_init(void);
extern void generic_secondary_thread_init(void);
extern unsigned long __secondary_hold_spinloop;
extern unsigned long __secondary_hold_acknowledge;
extern char __secondary_hold;

View File

@ -15,6 +15,8 @@ static inline bool is_secure_guest(void)
return mfmsr() & MSR_S;
}
void __init svm_swiotlb_init(void);
void dtl_cache_ctor(void *addr);
#define get_dtl_cache_ctor() (is_secure_guest() ? dtl_cache_ctor : NULL)
@ -25,6 +27,8 @@ static inline bool is_secure_guest(void)
return false;
}
static inline void svm_swiotlb_init(void) {}
#define get_dtl_cache_ctor() NULL
#endif /* CONFIG_PPC_SVM */

View File

@ -3,8 +3,9 @@
#define _ASM_POWERPC_SYNCH_H
#ifdef __KERNEL__
#include <asm/cputable.h>
#include <asm/feature-fixups.h>
#include <asm/asm-const.h>
#include <asm/ppc-opcode.h>
#ifndef __ASSEMBLY__
extern unsigned int __start___lwsync_fixup, __stop___lwsync_fixup;
@ -20,6 +21,22 @@ static inline void isync(void)
{
__asm__ __volatile__ ("isync" : : : "memory");
}
static inline void ppc_after_tlbiel_barrier(void)
{
asm volatile("ptesync": : :"memory");
/*
* POWER9, POWER10 need a cp_abort after tlbiel to ensure the copy is
* invalidated correctly. If this is not done, the paste can take data
* from the physical address that was translated at copy time.
*
* POWER9 in practice does not need this, because address spaces with
* accelerators mapped will use tlbie (which does invalidate the copy)
* to invalidate translations. It's not possible to limit POWER10 this
* way due to local copy-paste.
*/
asm volatile(ASM_FTR_IFSET(PPC_CP_ABORT, "", %0) : : "i" (CPU_FTR_ARCH_31) : "memory");
}
#endif /* __ASSEMBLY__ */
#if defined(__powerpc64__)

View File

@ -38,44 +38,10 @@ struct div_result {
u64 result_low;
};
/* Accessor functions for the timebase (RTC on 601) registers. */
#define __USE_RTC() (IS_ENABLED(CONFIG_PPC_BOOK3S_601))
#ifdef CONFIG_PPC64
/* For compatibility, get_tbl() is defined as get_tb() on ppc64 */
#define get_tbl get_tb
#else
static inline unsigned long get_tbl(void)
{
return mftbl();
}
static inline unsigned int get_tbu(void)
{
return mftbu();
}
#endif /* !CONFIG_PPC64 */
static inline unsigned int get_rtcl(void)
{
unsigned int rtcl;
asm volatile("mfrtcl %0" : "=r" (rtcl));
return rtcl;
}
static inline u64 get_rtc(void)
{
unsigned int hi, lo, hi2;
do {
asm volatile("mfrtcu %0; mfrtcl %1; mfrtcu %2"
: "=r" (hi), "=r" (lo), "=r" (hi2));
} while (hi2 != hi);
return (u64)hi * 1000000000 + lo;
return mftb();
}
static inline u64 get_vtb(void)
@ -87,30 +53,21 @@ static inline u64 get_vtb(void)
return 0;
}
#ifdef CONFIG_PPC64
static inline u64 get_tb(void)
{
return mftb();
}
#else /* CONFIG_PPC64 */
static inline u64 get_tb(void)
{
unsigned int tbhi, tblo, tbhi2;
if (IS_ENABLED(CONFIG_PPC64))
return mftb();
do {
tbhi = get_tbu();
tblo = get_tbl();
tbhi2 = get_tbu();
tbhi = mftbu();
tblo = mftb();
tbhi2 = mftbu();
} while (tbhi != tbhi2);
return ((u64)tbhi << 32) | tblo;
}
#endif /* !CONFIG_PPC64 */
static inline u64 get_tb_or_rtc(void)
{
return __USE_RTC() ? get_rtc() : get_tb();
}
static inline void set_tb(unsigned int upper, unsigned int lower)
{
@ -127,11 +84,10 @@ static inline void set_tb(unsigned int upper, unsigned int lower)
*/
static inline u64 get_dec(void)
{
#if defined(CONFIG_40x)
return (mfspr(SPRN_PIT));
#else
return (mfspr(SPRN_DEC));
#endif
if (IS_ENABLED(CONFIG_40x))
return mfspr(SPRN_PIT);
return mfspr(SPRN_DEC);
}
/*
@ -141,23 +97,17 @@ static inline u64 get_dec(void)
*/
static inline void set_dec(u64 val)
{
#if defined(CONFIG_40x)
mtspr(SPRN_PIT, (u32) val);
#else
#ifndef CONFIG_BOOKE
--val;
#endif
mtspr(SPRN_DEC, val);
#endif /* not 40x */
if (IS_ENABLED(CONFIG_40x))
mtspr(SPRN_PIT, (u32)val);
else if (IS_ENABLED(CONFIG_BOOKE))
mtspr(SPRN_DEC, val);
else
mtspr(SPRN_DEC, val - 1);
}
static inline unsigned long tb_ticks_since(unsigned long tstamp)
{
if (__USE_RTC()) {
int delta = get_rtcl() - (unsigned int) tstamp;
return delta < 0 ? delta + 1000000000 : delta;
}
return get_tbl() - tstamp;
return mftb() - tstamp;
}
#define mulhwu(x,y) \

View File

@ -17,9 +17,6 @@ typedef unsigned long cycles_t;
static inline cycles_t get_cycles(void)
{
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601))
return 0;
return mftb();
}

View File

@ -66,19 +66,6 @@ static inline int mm_is_thread_local(struct mm_struct *mm)
return false;
return cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm));
}
static inline void mm_reset_thread_local(struct mm_struct *mm)
{
WARN_ON(atomic_read(&mm->context.copros) > 0);
/*
* It's possible for mm_access to take a reference on mm_users to
* access the remote mm from another thread, but it's not allowed
* to set mm_cpumask, so mm_users may be > 1 here.
*/
WARN_ON(current->mm != mm);
atomic_set(&mm->context.active_cpus, 1);
cpumask_clear(mm_cpumask(mm));
cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
}
#else /* CONFIG_PPC_BOOK3S_64 */
static inline int mm_is_thread_local(struct mm_struct *mm)
{

View File

@ -86,14 +86,27 @@ static inline int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
#endif /* CONFIG_NUMA */
struct drmem_lmb;
int of_drconf_to_nid_single(struct drmem_lmb *lmb);
#if defined(CONFIG_NUMA) && defined(CONFIG_PPC_SPLPAR)
extern int find_and_online_cpu_nid(int cpu);
extern int cpu_to_coregroup_id(int cpu);
#else
static inline int find_and_online_cpu_nid(int cpu)
{
return 0;
}
static inline int cpu_to_coregroup_id(int cpu)
{
#ifdef CONFIG_SMP
return cpu_to_core_id(cpu);
#else
return 0;
#endif
}
#endif /* CONFIG_NUMA && CONFIG_PPC_SPLPAR */
#include <asm-generic/topology.h>
@ -104,15 +117,10 @@ static inline int find_and_online_cpu_nid(int cpu)
#ifdef CONFIG_PPC64
#include <asm/smp.h>
#ifdef CONFIG_PPC_SPLPAR
int get_physical_package_id(int cpu);
#define topology_physical_package_id(cpu) (get_physical_package_id(cpu))
#else
#define topology_physical_package_id(cpu) (cpu_to_chip_id(cpu))
#endif
#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_core_cpumask(cpu) (cpu_cpu_mask(cpu))
#define topology_core_id(cpu) (cpu_to_core_id(cpu))
#endif

View File

@ -151,52 +151,16 @@ static inline int __access_ok(unsigned long addr, unsigned long size,
extern long __put_user_bad(void);
/*
* We don't tell gcc that we are accessing memory, but this is OK
* because we do not write to any memory gcc knows about, so there
* are no aliasing issues.
*/
#define __put_user_asm(x, addr, err, op) \
__asm__ __volatile__( \
"1: " op " %1,0(%2) # put_user\n" \
"2:\n" \
".section .fixup,\"ax\"\n" \
"3: li %0,%3\n" \
" b 2b\n" \
".previous\n" \
EX_TABLE(1b, 3b) \
: "=r" (err) \
: "r" (x), "b" (addr), "i" (-EFAULT), "0" (err))
#ifdef __powerpc64__
#define __put_user_asm2(x, ptr, retval) \
__put_user_asm(x, ptr, retval, "std")
#else /* __powerpc64__ */
#define __put_user_asm2(x, addr, err) \
__asm__ __volatile__( \
"1: stw %1,0(%2)\n" \
"2: stw %1+1,4(%2)\n" \
"3:\n" \
".section .fixup,\"ax\"\n" \
"4: li %0,%3\n" \
" b 3b\n" \
".previous\n" \
EX_TABLE(1b, 4b) \
EX_TABLE(2b, 4b) \
: "=r" (err) \
: "r" (x), "b" (addr), "i" (-EFAULT), "0" (err))
#endif /* __powerpc64__ */
#define __put_user_size_allowed(x, ptr, size, retval) \
do { \
__label__ __pu_failed; \
\
retval = 0; \
switch (size) { \
case 1: __put_user_asm(x, ptr, retval, "stb"); break; \
case 2: __put_user_asm(x, ptr, retval, "sth"); break; \
case 4: __put_user_asm(x, ptr, retval, "stw"); break; \
case 8: __put_user_asm2(x, ptr, retval); break; \
default: __put_user_bad(); \
} \
__put_user_size_goto(x, ptr, size, __pu_failed); \
break; \
\
__pu_failed: \
retval = -EFAULT; \
} while (0)
#define __put_user_size(x, ptr, size, retval) \
@ -249,12 +213,17 @@ do { \
})
/*
* We don't tell gcc that we are accessing memory, but this is OK
* because we do not write to any memory gcc knows about, so there
* are no aliasing issues.
*/
#define __put_user_asm_goto(x, addr, label, op) \
asm volatile goto( \
"1: " op "%U1%X1 %0,%1 # put_user\n" \
EX_TABLE(1b, %l2) \
: \
: "r" (x), "m" (*addr) \
: "r" (x), "m<>" (*addr) \
: \
: label)
@ -316,7 +285,7 @@ extern long __get_user_bad(void);
#define __get_user_asm(x, addr, err, op) \
__asm__ __volatile__( \
"1: "op" %1,0(%2) # get_user\n" \
"1: "op"%U2%X2 %1, %2 # get_user\n" \
"2:\n" \
".section .fixup,\"ax\"\n" \
"3: li %0,%3\n" \
@ -325,7 +294,7 @@ extern long __get_user_bad(void);
".previous\n" \
EX_TABLE(1b, 3b) \
: "=r" (err), "=r" (x) \
: "b" (addr), "i" (-EFAULT), "0" (err))
: "m<>" (*addr), "i" (-EFAULT), "0" (err))
#ifdef __powerpc64__
#define __get_user_asm2(x, addr, err) \
@ -333,8 +302,8 @@ extern long __get_user_bad(void);
#else /* __powerpc64__ */
#define __get_user_asm2(x, addr, err) \
__asm__ __volatile__( \
"1: lwz %1,0(%2)\n" \
"2: lwz %1+1,4(%2)\n" \
"1: lwz%X2 %1, %2\n" \
"2: lwz%X2 %L1, %L2\n" \
"3:\n" \
".section .fixup,\"ax\"\n" \
"4: li %0,%3\n" \
@ -345,7 +314,7 @@ extern long __get_user_bad(void);
EX_TABLE(1b, 4b) \
EX_TABLE(2b, 4b) \
: "=r" (err), "=&r" (x) \
: "b" (addr), "i" (-EFAULT), "0" (err))
: "m" (*addr), "i" (-EFAULT), "0" (err))
#endif /* __powerpc64__ */
#define __get_user_size_allowed(x, ptr, size, retval) \
@ -355,10 +324,10 @@ do { \
if (size > sizeof(x)) \
(x) = __get_user_bad(); \
switch (size) { \
case 1: __get_user_asm(x, ptr, retval, "lbz"); break; \
case 2: __get_user_asm(x, ptr, retval, "lhz"); break; \
case 4: __get_user_asm(x, ptr, retval, "lwz"); break; \
case 8: __get_user_asm2(x, ptr, retval); break; \
case 1: __get_user_asm(x, (u8 __user *)ptr, retval, "lbz"); break; \
case 2: __get_user_asm(x, (u16 __user *)ptr, retval, "lhz"); break; \
case 4: __get_user_asm(x, (u32 __user *)ptr, retval, "lwz"); break; \
case 8: __get_user_asm2(x, (u64 __user *)ptr, retval); break; \
default: (x) = __get_user_bad(); \
} \
} while (0)

View File

@ -222,6 +222,7 @@ struct ppc_debug_info {
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x0000000000000004
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x0000000000000008
#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x0000000000000010
#define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 0x0000000000000020
#ifndef __ASSEMBLY__

View File

@ -45,7 +45,8 @@ obj-y := cputable.o syscalls.o \
signal.o sysfs.o cacheinfo.o time.o \
prom.o traps.o setup-common.o \
udbg.o misc.o io.o misc_$(BITS).o \
of_platform.o prom_parse.o firmware.o
of_platform.o prom_parse.o firmware.o \
hw_breakpoint_constraints.o
obj-y += ptrace/
obj-$(CONFIG_PPC64) += setup_64.o \
paca.o nvram_64.o note.o syscall_64.o
@ -94,7 +95,8 @@ obj-$(CONFIG_PPC_FSL_BOOK3E) += cpu_setup_fsl_booke.o
obj-$(CONFIG_PPC_DOORBELL) += dbell.o
obj-$(CONFIG_JUMP_LABEL) += jump_label.o
extra-y := head_$(BITS).o
extra-$(CONFIG_PPC64) := head_64.o
extra-$(CONFIG_PPC_BOOK3S_32) := head_book3s_32.o
extra-$(CONFIG_40x) := head_40x.o
extra-$(CONFIG_44x) := head_44x.o
extra-$(CONFIG_FSL_BOOKE) := head_fsl_booke.o

View File

@ -176,6 +176,7 @@ int main(void)
OFFSET(THREAD_TM_TAR, thread_struct, tm_tar);
OFFSET(THREAD_TM_PPR, thread_struct, tm_ppr);
OFFSET(THREAD_TM_DSCR, thread_struct, tm_dscr);
OFFSET(THREAD_TM_AMR, thread_struct, tm_amr);
OFFSET(PT_CKPT_REGS, thread_struct, ckpt_regs);
OFFSET(THREAD_CKVRSTATE, thread_struct, ckvr_state.vr);
OFFSET(THREAD_CKVRSAVE, thread_struct, ckvrsave);

View File

@ -95,19 +95,10 @@ void __init btext_prepare_BAT(void)
boot_text_mapped = 0;
return;
}
if (PVR_VER(mfspr(SPRN_PVR)) != 1) {
/* 603, 604, G3, G4, ... */
lowbits = addr & ~0xFF000000UL;
addr &= 0xFF000000UL;
disp_BAT[0] = vaddr | (BL_16M<<2) | 2;
disp_BAT[1] = addr | (_PAGE_NO_CACHE | _PAGE_GUARDED | BPP_RW);
} else {
/* 601 */
lowbits = addr & ~0xFF800000UL;
addr &= 0xFF800000UL;
disp_BAT[0] = vaddr | (_PAGE_NO_CACHE | PP_RWXX) | 4;
disp_BAT[1] = addr | BL_8M | 0x40;
}
lowbits = addr & ~0xFF000000UL;
addr &= 0xFF000000UL;
disp_BAT[0] = vaddr | (BL_16M<<2) | 2;
disp_BAT[1] = addr | (_PAGE_NO_CACHE | _PAGE_GUARDED | BPP_RW);
logicalDisplayBase = (void *) (vaddr + lowbits);
}
#endif

View File

@ -16,6 +16,7 @@
#include <asm/oprofile_impl.h>
#include <asm/cputable.h>
#include <asm/prom.h> /* for PTRRELOC on ARCH=ppc */
#include <asm/mce.h>
#include <asm/mmu.h>
#include <asm/setup.h>
@ -608,21 +609,6 @@ static struct cpu_spec __initdata cpu_specs[] = {
#endif /* CONFIG_PPC_BOOK3S_64 */
#ifdef CONFIG_PPC32
#ifdef CONFIG_PPC_BOOK3S_601
{ /* 601 */
.pvr_mask = 0xffff0000,
.pvr_value = 0x00010000,
.cpu_name = "601",
.cpu_features = CPU_FTRS_PPC601,
.cpu_user_features = COMMON_USER | PPC_FEATURE_601_INSTR |
PPC_FEATURE_UNIFIED_CACHE | PPC_FEATURE_NO_TB,
.mmu_features = MMU_FTR_HPTE_TABLE,
.icache_bsize = 32,
.dcache_bsize = 32,
.machine_check = machine_check_generic,
.platform = "ppc601",
},
#endif /* CONFIG_PPC_BOOK3S_601 */
#ifdef CONFIG_PPC_BOOK3S_6xx
{ /* 603 */
.pvr_mask = 0xffff0000,

View File

@ -17,6 +17,7 @@
#include <asm/cputable.h>
#include <asm/dt_cpu_ftrs.h>
#include <asm/mce.h>
#include <asm/mmu.h>
#include <asm/oprofile_impl.h>
#include <asm/prom.h>

View File

@ -466,7 +466,7 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
return 0;
}
if (!pe->addr && !pe->config_addr) {
if (!pe->addr) {
eeh_stats.no_cfg_addr++;
return 0;
}
@ -929,56 +929,6 @@ void eeh_save_bars(struct eeh_dev *edev)
edev->config_space[1] |= PCI_COMMAND_MASTER;
}
/**
* eeh_ops_register - Register platform dependent EEH operations
* @ops: platform dependent EEH operations
*
* Register the platform dependent EEH operation callback
* functions. The platform should call this function before
* any other EEH operations.
*/
int __init eeh_ops_register(struct eeh_ops *ops)
{
if (!ops->name) {
pr_warn("%s: Invalid EEH ops name for %p\n",
__func__, ops);
return -EINVAL;
}
if (eeh_ops && eeh_ops != ops) {
pr_warn("%s: EEH ops of platform %s already existing (%s)\n",
__func__, eeh_ops->name, ops->name);
return -EEXIST;
}
eeh_ops = ops;
return 0;
}
/**
* eeh_ops_unregister - Unreigster platform dependent EEH operations
* @name: name of EEH platform operations
*
* Unregister the platform dependent EEH operation callback
* functions.
*/
int __exit eeh_ops_unregister(const char *name)
{
if (!name || !strlen(name)) {
pr_warn("%s: Invalid EEH ops name\n",
__func__);
return -EINVAL;
}
if (eeh_ops && !strcmp(eeh_ops->name, name)) {
eeh_ops = NULL;
return 0;
}
return -EEXIST;
}
static int eeh_reboot_notifier(struct notifier_block *nb,
unsigned long action, void *unused)
{
@ -990,54 +940,6 @@ static struct notifier_block eeh_reboot_nb = {
.notifier_call = eeh_reboot_notifier,
};
/**
* eeh_init - EEH initialization
*
* Initialize EEH by trying to enable it for all of the adapters in the system.
* As a side effect we can determine here if eeh is supported at all.
* Note that we leave EEH on so failed config cycles won't cause a machine
* check. If a user turns off EEH for a particular adapter they are really
* telling Linux to ignore errors. Some hardware (e.g. POWER5) won't
* grant access to a slot if EEH isn't enabled, and so we always enable
* EEH for all slots/all devices.
*
* The eeh-force-off option disables EEH checking globally, for all slots.
* Even if force-off is set, the EEH hardware is still enabled, so that
* newer systems can boot.
*/
static int eeh_init(void)
{
struct pci_controller *hose, *tmp;
int ret = 0;
/* Register reboot notifier */
ret = register_reboot_notifier(&eeh_reboot_nb);
if (ret) {
pr_warn("%s: Failed to register notifier (%d)\n",
__func__, ret);
return ret;
}
/* call platform initialization function */
if (!eeh_ops) {
pr_warn("%s: Platform EEH operation not found\n",
__func__);
return -EEXIST;
} else if ((ret = eeh_ops->init()))
return ret;
/* Initialize PHB PEs */
list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
eeh_phb_pe_create(hose);
eeh_addr_cache_init();
/* Initialize EEH event */
return eeh_event_init();
}
core_initcall_sync(eeh_init);
static int eeh_device_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@ -1062,12 +964,47 @@ static struct notifier_block eeh_device_nb = {
.notifier_call = eeh_device_notifier,
};
static __init int eeh_set_bus_notifier(void)
/**
* eeh_init - System wide EEH initialization
*
* It's the platform's job to call this from an arch_initcall().
*/
int eeh_init(struct eeh_ops *ops)
{
bus_register_notifier(&pci_bus_type, &eeh_device_nb);
return 0;
struct pci_controller *hose, *tmp;
int ret = 0;
/* the platform should only initialise EEH once */
if (WARN_ON(eeh_ops))
return -EEXIST;
if (WARN_ON(!ops))
return -ENOENT;
eeh_ops = ops;
/* Register reboot notifier */
ret = register_reboot_notifier(&eeh_reboot_nb);
if (ret) {
pr_warn("%s: Failed to register reboot notifier (%d)\n",
__func__, ret);
return ret;
}
ret = bus_register_notifier(&pci_bus_type, &eeh_device_nb);
if (ret) {
pr_warn("%s: Failed to register bus notifier (%d)\n",
__func__, ret);
return ret;
}
/* Initialize PHB PEs */
list_for_each_entry_safe(hose, tmp, &hose_list, list_node)
eeh_phb_pe_create(hose);
eeh_addr_cache_init();
/* Initialize EEH event */
return eeh_event_init();
}
arch_initcall(eeh_set_bus_notifier);
/**
* eeh_probe_device() - Perform EEH initialization for the indicated pci device
@ -1720,7 +1657,7 @@ static ssize_t eeh_force_recover_write(struct file *filp,
return -ENODEV;
/* Retrieve PE */
pe = eeh_pe_get(hose, pe_no, 0);
pe = eeh_pe_get(hose, pe_no);
if (!pe)
return -ENODEV;

View File

@ -251,43 +251,21 @@ void eeh_pe_dev_traverse(struct eeh_pe *root,
/**
* __eeh_pe_get - Check the PE address
* @data: EEH PE
* @flag: EEH device
*
* For one particular PE, it can be identified by PE address
* or tranditional BDF address. BDF address is composed of
* Bus/Device/Function number. The extra data referred by flag
* indicates which type of address should be used.
*/
struct eeh_pe_get_flag {
int pe_no;
int config_addr;
};
static void *__eeh_pe_get(struct eeh_pe *pe, void *flag)
{
struct eeh_pe_get_flag *tmp = (struct eeh_pe_get_flag *) flag;
int *target_pe = flag;
/* Unexpected PHB PE */
/* PHB PEs are special and should be ignored */
if (pe->type & EEH_PE_PHB)
return NULL;
/*
* We prefer PE address. For most cases, we should
* have non-zero PE address
*/
if (eeh_has_flag(EEH_VALID_PE_ZERO)) {
if (tmp->pe_no == pe->addr)
return pe;
} else {
if (tmp->pe_no &&
(tmp->pe_no == pe->addr))
return pe;
}
/* Try BDF address */
if (tmp->config_addr &&
(tmp->config_addr == pe->config_addr))
if (*target_pe == pe->addr)
return pe;
return NULL;
@ -297,7 +275,6 @@ static void *__eeh_pe_get(struct eeh_pe *pe, void *flag)
* eeh_pe_get - Search PE based on the given address
* @phb: PCI controller
* @pe_no: PE number
* @config_addr: Config address
*
* Search the corresponding PE based on the specified address which
* is included in the eeh device. The function is used to check if
@ -306,16 +283,11 @@ static void *__eeh_pe_get(struct eeh_pe *pe, void *flag)
* which is composed of PCI bus/device/function number, or unified
* PE address.
*/
struct eeh_pe *eeh_pe_get(struct pci_controller *phb,
int pe_no, int config_addr)
struct eeh_pe *eeh_pe_get(struct pci_controller *phb, int pe_no)
{
struct eeh_pe *root = eeh_phb_pe_get(phb);
struct eeh_pe_get_flag tmp = { pe_no, config_addr };
struct eeh_pe *pe;
pe = eeh_pe_traverse(root, __eeh_pe_get, &tmp);
return pe;
return eeh_pe_traverse(root, __eeh_pe_get, &pe_no);
}
/**
@ -336,19 +308,13 @@ int eeh_pe_tree_insert(struct eeh_dev *edev, struct eeh_pe *new_pe_parent)
struct pci_controller *hose = edev->controller;
struct eeh_pe *pe, *parent;
/* Check if the PE number is valid */
if (!eeh_has_flag(EEH_VALID_PE_ZERO) && !edev->pe_config_addr) {
eeh_edev_err(edev, "PE#0 is invalid for this PHB!\n");
return -EINVAL;
}
/*
* Search the PE has been existing or not according
* to the PE address. If that has been existing, the
* PE should be composed of PCI bus and its subordinate
* components.
*/
pe = eeh_pe_get(hose, edev->pe_config_addr, edev->bdfn);
pe = eeh_pe_get(hose, edev->pe_config_addr);
if (pe) {
if (pe->type & EEH_PE_INVALID) {
list_add_tail(&edev->entry, &pe->edevs);
@ -388,8 +354,8 @@ int eeh_pe_tree_insert(struct eeh_dev *edev, struct eeh_pe *new_pe_parent)
pr_err("%s: out of memory!\n", __func__);
return -ENOMEM;
}
pe->addr = edev->pe_config_addr;
pe->config_addr = edev->bdfn;
pe->addr = edev->pe_config_addr;
/*
* Put the new EEH PE into hierarchy tree. If the parent

View File

@ -234,7 +234,6 @@ transfer_to_handler_cont:
mtspr SPRN_SRR0,r11
mtspr SPRN_SRR1,r10
mtlr r9
SYNC
RFI /* jump to handler, enable MMU */
#if defined (CONFIG_PPC_BOOK3S_32) || defined(CONFIG_E500)
@ -264,7 +263,6 @@ _ASM_NOKPROBE_SYMBOL(transfer_to_handler_cont)
LOAD_REG_IMMEDIATE(r0, MSR_KERNEL)
mtspr SPRN_SRR0,r12
mtspr SPRN_SRR1,r0
SYNC
RFI
reenable_mmu:
@ -323,7 +321,6 @@ stack_ovf:
#endif
mtspr SPRN_SRR0,r9
mtspr SPRN_SRR1,r10
SYNC
RFI
_ASM_NOKPROBE_SYMBOL(stack_ovf)
#endif
@ -411,7 +408,6 @@ ret_from_syscall:
/* disable interrupts so current_thread_info()->flags can't change */
LOAD_REG_IMMEDIATE(r10,MSR_KERNEL) /* doesn't include MSR_EE */
/* Note: We don't bother telling lockdep about it */
SYNC
mtmsr r10
lwz r9,TI_FLAGS(r2)
li r8,-MAX_ERRNO
@ -474,7 +470,6 @@ syscall_exit_finish:
#endif
mtspr SPRN_SRR0,r7
mtspr SPRN_SRR1,r8
SYNC
RFI
_ASM_NOKPROBE_SYMBOL(syscall_exit_finish)
#ifdef CONFIG_44x
@ -567,7 +562,6 @@ syscall_exit_work:
* lockdep as we are supposed to have IRQs on at this point
*/
ori r10,r10,MSR_EE
SYNC
mtmsr r10
/* Save NVGPRS if they're not saved already */
@ -606,7 +600,6 @@ ret_from_kernel_syscall:
#endif
mtspr SPRN_SRR0, r9
mtspr SPRN_SRR1, r10
SYNC
RFI
_ASM_NOKPROBE_SYMBOL(ret_from_kernel_syscall)
@ -810,7 +803,6 @@ fast_exception_return:
REST_GPR(9, r11)
REST_GPR(12, r11)
lwz r11,GPR11(r11)
SYNC
RFI
_ASM_NOKPROBE_SYMBOL(fast_exception_return)
@ -819,19 +811,11 @@ _ASM_NOKPROBE_SYMBOL(fast_exception_return)
1: lis r3,exc_exit_restart_end@ha
addi r3,r3,exc_exit_restart_end@l
cmplw r12,r3
#ifdef CONFIG_PPC_BOOK3S_601
bge 2b
#else
bge 3f
#endif
lis r4,exc_exit_restart@ha
addi r4,r4,exc_exit_restart@l
cmplw r12,r4
#ifdef CONFIG_PPC_BOOK3S_601
blt 2b
#else
blt 3f
#endif
lis r3,fee_restarts@ha
tophys(r3,r3)
lwz r5,fee_restarts@l(r3)
@ -848,7 +832,6 @@ fee_restarts:
/* aargh, a nonrecoverable interrupt, panic */
/* aargh, we don't know which trap this is */
/* but the 601 doesn't implement the RI bit, so assume it's OK */
3:
li r10,-1
stw r10,_TRAP(r11)
@ -872,7 +855,6 @@ ret_from_except:
* from the interrupt. */
/* Note: We don't bother telling lockdep about it */
LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)
SYNC /* Some chip revs have problems here... */
mtmsr r10 /* disable interrupts */
lwz r3,_MSR(r1) /* Returning to user mode? */
@ -1035,7 +1017,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_NEED_PAIRED_STWCX)
* exc_exit_restart below. -- paulus
*/
LOAD_REG_IMMEDIATE(r10,MSR_KERNEL & ~MSR_RI)
SYNC
mtmsr r10 /* clear the RI bit */
.globl exc_exit_restart
exc_exit_restart:
@ -1046,7 +1027,6 @@ exc_exit_restart:
lwz r1,GPR1(r1)
.globl exc_exit_restart_end
exc_exit_restart_end:
SYNC
RFI
_ASM_NOKPROBE_SYMBOL(exc_exit_restart)
_ASM_NOKPROBE_SYMBOL(exc_exit_restart_end)
@ -1274,7 +1254,6 @@ do_resched: /* r10 contains MSR_KERNEL here */
mfmsr r10
#endif
ori r10,r10,MSR_EE
SYNC
mtmsr r10 /* hard-enable interrupts */
bl schedule
recheck:
@ -1283,7 +1262,6 @@ recheck:
* TI_FLAGS aren't advertised.
*/
LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)
SYNC
mtmsr r10 /* disable interrupts */
lwz r9,TI_FLAGS(r2)
andi. r0,r9,_TIF_NEED_RESCHED
@ -1292,7 +1270,6 @@ recheck:
beq restore_user
do_user_signal: /* r10 contains MSR_KERNEL here */
ori r10,r10,MSR_EE
SYNC
mtmsr r10 /* hard-enable interrupts */
/* save r13-r31 in the exception frame, if not already done */
lwz r3,_TRAP(r1)
@ -1316,19 +1293,11 @@ nonrecoverable:
lis r10,exc_exit_restart_end@ha
addi r10,r10,exc_exit_restart_end@l
cmplw r12,r10
#ifdef CONFIG_PPC_BOOK3S_601
bgelr
#else
bge 3f
#endif
lis r11,exc_exit_restart@ha
addi r11,r11,exc_exit_restart@l
cmplw r12,r11
#ifdef CONFIG_PPC_BOOK3S_601
bltlr
#else
blt 3f
#endif
lis r10,ee_restarts@ha
lwz r12,ee_restarts@l(r10)
addi r12,r12,1
@ -1336,7 +1305,6 @@ nonrecoverable:
mr r12,r11 /* restart at exc_exit_restart */
blr
3: /* OK, we can't recover, kill this process */
/* but the 601 doesn't implement the RI bit, so assume it's OK */
lwz r3,_TRAP(r1)
andi. r0,r3,1
beq 5f
@ -1382,8 +1350,7 @@ _GLOBAL(enter_rtas)
mfmsr r9
stw r9,8(r1)
LOAD_REG_IMMEDIATE(r0,MSR_KERNEL)
SYNC /* disable interrupts so SRR0/1 */
mtmsr r0 /* don't get trashed */
mtmsr r0 /* disable interrupts so SRR0/1 don't get trashed */
li r9,MSR_KERNEL & ~(MSR_IR|MSR_DR)
mtlr r6
stw r7, THREAD + RTAS_SP(r2)

View File

@ -430,7 +430,11 @@ _ASM_NOKPROBE_SYMBOL(save_nvgprs);
#define FLUSH_COUNT_CACHE \
1: nop; \
patch_site 1b, patch__call_flush_branch_caches
patch_site 1b, patch__call_flush_branch_caches1; \
1: nop; \
patch_site 1b, patch__call_flush_branch_caches2; \
1: nop; \
patch_site 1b, patch__call_flush_branch_caches3
.macro nops number
.rept \number
@ -512,7 +516,7 @@ _GLOBAL(_switch)
kuap_check_amr r9, r10
FLUSH_COUNT_CACHE
FLUSH_COUNT_CACHE /* Clobbers r9, ctr */
/*
* On SMP kernels, care must be taken because a task may be

View File

@ -988,7 +988,6 @@ kernel_dbg_exc:
.endm
masked_interrupt_book3e_0x500:
// XXX When adding support for EPR, use PACA_IRQ_EE_EDGE
masked_interrupt_book3e PACA_IRQ_EE 1
masked_interrupt_book3e_0x900:
@ -1303,16 +1302,6 @@ fast_exception_return:
addi r3,r1,STACK_FRAME_OVERHEAD;
bl do_IRQ
b ret_from_except
1: cmpwi cr0,r3,0xf00
bne 1f
addi r3,r1,STACK_FRAME_OVERHEAD;
bl performance_monitor_exception
b ret_from_except
1: cmpwi cr0,r3,0xe60
bne 1f
addi r3,r1,STACK_FRAME_OVERHEAD;
bl handle_hmi_exception
b ret_from_except
1: cmpwi cr0,r3,0x900
bne 1f
addi r3,r1,STACK_FRAME_OVERHEAD;

View File

@ -754,10 +754,8 @@ u32 *fadump_regs_to_elf_notes(u32 *buf, struct pt_regs *regs)
void fadump_update_elfcore_header(char *bufp)
{
struct elfhdr *elf;
struct elf_phdr *phdr;
elf = (struct elfhdr *)bufp;
bufp += sizeof(struct elfhdr);
/* First note is a place holder for cpu notes info. */

View File

@ -87,7 +87,6 @@ BEGIN_FTR_SECTION
oris r5,r5,MSR_VSX@h
END_FTR_SECTION_IFSET(CPU_FTR_VSX)
#endif
SYNC
MTMSRD(r5) /* enable use of fpu now */
isync
/* enable use of FP after return */
@ -134,18 +133,3 @@ _GLOBAL(save_fpu)
mffs fr0
stfd fr0,FPSTATE_FPSCR(r6)
blr
/*
* These are used in the alignment trap handler when emulating
* single-precision loads and stores.
*/
_GLOBAL(cvt_fd)
lfs 0,0(r3)
stfd 0,0(r4)
blr
_GLOBAL(cvt_df)
lfd 0,0(r3)
stfs 0,0(r4)
blr

View File

@ -40,48 +40,52 @@
.macro EXCEPTION_PROLOG_1 for_rtas=0
#ifdef CONFIG_VMAP_STACK
.ifeq \for_rtas
li r11, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
mtmsr r11
isync
.endif
subi r11, r1, INT_FRAME_SIZE /* use r1 if kernel */
mr r11, r1
subi r1, r1, INT_FRAME_SIZE /* use r1 if kernel */
beq 1f
mfspr r1,SPRN_SPRG_THREAD
lwz r1,TASK_STACK-THREAD(r1)
addi r1, r1, THREAD_SIZE - INT_FRAME_SIZE
#else
tophys(r11,r1) /* use tophys(r1) if kernel */
subi r11, r11, INT_FRAME_SIZE /* alloc exc. frame */
#endif
subi r11, r1, INT_FRAME_SIZE /* use r1 if kernel */
beq 1f
mfspr r11,SPRN_SPRG_THREAD
tovirt_vmstack r11, r11
lwz r11,TASK_STACK-THREAD(r11)
addi r11, r11, THREAD_SIZE - INT_FRAME_SIZE
tophys_novmstack r11, r11
#endif
1:
tophys_novmstack r11, r11
#ifdef CONFIG_VMAP_STACK
mtcrf 0x7f, r11
mtcrf 0x7f, r1
bt 32 - THREAD_ALIGN_SHIFT, stack_overflow
#endif
.endm
.macro EXCEPTION_PROLOG_2 handle_dar_dsisr=0
#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_PPC_BOOK3S)
BEGIN_MMU_FTR_SECTION
#ifdef CONFIG_VMAP_STACK
mtcr r10
FTR_SECTION_ELSE
stw r10, _CCR(r11)
ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
li r10, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
mtmsr r10
isync
#else
stw r10,_CCR(r11) /* save registers */
#endif
mfspr r10, SPRN_SPRG_SCRATCH0
#ifdef CONFIG_VMAP_STACK
stw r11,GPR1(r1)
stw r11,0(r1)
mr r11, r1
#else
stw r1,GPR1(r11)
stw r1,0(r11)
tovirt(r1, r11) /* set new kernel sp */
#endif
stw r12,GPR12(r11)
stw r9,GPR9(r11)
stw r10,GPR10(r11)
#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_PPC_BOOK3S)
BEGIN_MMU_FTR_SECTION
#ifdef CONFIG_VMAP_STACK
mfcr r10
stw r10, _CCR(r11)
END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
#endif
mfspr r12,SPRN_SPRG_SCRATCH1
stw r12,GPR11(r11)
@ -97,19 +101,12 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
stw r10, _DSISR(r11)
.endif
lwz r9, SRR1(r12)
#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_PPC_BOOK3S)
BEGIN_MMU_FTR_SECTION
andi. r10, r9, MSR_PR
END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
#endif
lwz r12, SRR0(r12)
#else
mfspr r12,SPRN_SRR0
mfspr r9,SPRN_SRR1
#endif
stw r1,GPR1(r11)
stw r1,0(r11)
tovirt_novmstack r1, r11 /* set new kernel sp */
#ifdef CONFIG_40x
rlwinm r9,r9,0,14,12 /* clear MSR_WE (necessary?) */
#else
@ -225,7 +222,6 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
#endif
mtspr SPRN_SRR1,r10
mtspr SPRN_SRR0,r11
SYNC
RFI /* jump to handler, enable MMU */
99: b ret_from_kernel_syscall
.endm
@ -327,20 +323,19 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
.macro vmap_stack_overflow_exception
#ifdef CONFIG_VMAP_STACK
#ifdef CONFIG_SMP
mfspr r11, SPRN_SPRG_THREAD
tovirt(r11, r11)
lwz r11, TASK_CPU - THREAD(r11)
slwi r11, r11, 3
addis r11, r11, emergency_ctx@ha
mfspr r1, SPRN_SPRG_THREAD
lwz r1, TASK_CPU - THREAD(r1)
slwi r1, r1, 3
addis r1, r1, emergency_ctx@ha
#else
lis r11, emergency_ctx@ha
lis r1, emergency_ctx@ha
#endif
lwz r11, emergency_ctx@l(r11)
cmpwi cr1, r11, 0
lwz r1, emergency_ctx@l(r1)
cmpwi cr1, r1, 0
bne cr1, 1f
lis r11, init_thread_union@ha
addi r11, r11, init_thread_union@l
1: addi r11, r11, THREAD_SIZE - INT_FRAME_SIZE
lis r1, init_thread_union@ha
addi r1, r1, init_thread_union@l
1: addi r1, r1, THREAD_SIZE - INT_FRAME_SIZE
EXCEPTION_PROLOG_2
SAVE_NVGPRS(r11)
addi r3, r1, STACK_FRAME_OVERHEAD

View File

@ -72,7 +72,6 @@ turn_on_mmu:
lis r0,start_here@h
ori r0,r0,start_here@l
mtspr SPRN_SRR0,r0
SYNC
rfi /* enables MMU */
b . /* prevent prefetch past rfi */

View File

@ -300,9 +300,6 @@ _GLOBAL(fsl_secondary_thread_init)
rlwimi r3, r3, 30, 2, 30
mtspr SPRN_PIR, r3
1:
#endif
_GLOBAL(generic_secondary_thread_init)
mr r24,r3
/* turn on 64-bit mode */
@ -312,13 +309,13 @@ _GLOBAL(generic_secondary_thread_init)
bl relative_toc
tovirt(r2,r2)
#ifdef CONFIG_PPC_BOOK3E
/* Book3E initialization */
mr r3,r24
bl book3e_secondary_thread_init
#endif
b generic_secondary_common_init
#endif /* CONFIG_PPC_BOOK3E */
/*
* On pSeries and most other platforms, secondary processors spin
* in the following code.

View File

@ -34,16 +34,6 @@
#include "head_32.h"
/* 601 only have IBAT */
#ifdef CONFIG_PPC_BOOK3S_601
#define LOAD_BAT(n, reg, RA, RB) \
li RA,0; \
mtspr SPRN_IBAT##n##U,RA; \
lwz RA,(n*16)+0(reg); \
lwz RB,(n*16)+4(reg); \
mtspr SPRN_IBAT##n##U,RA; \
mtspr SPRN_IBAT##n##L,RB
#else
#define LOAD_BAT(n, reg, RA, RB) \
/* see the comment for clear_bats() -- Cort */ \
li RA,0; \
@ -57,11 +47,10 @@
lwz RB,(n*16)+12(reg); \
mtspr SPRN_DBAT##n##U,RA; \
mtspr SPRN_DBAT##n##L,RB
#endif
__HEAD
.stabs "arch/powerpc/kernel/",N_SO,0,0,0f
.stabs "head_32.S",N_SO,0,0,0f
.stabs "head_book3s_32.S",N_SO,0,0,0f
0:
_ENTRY(_stext);
@ -166,9 +155,9 @@ __after_mmu_off:
bl initial_bats
bl load_segment_registers
#ifdef CONFIG_KASAN
BEGIN_MMU_FTR_SECTION
bl early_hash_table
#endif
END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
#if defined(CONFIG_BOOTX_TEXT)
bl setup_disp_bat
#endif
@ -185,10 +174,8 @@ __after_mmu_off:
bl reloc_offset
li r24,0 /* cpu# */
bl call_setup_cpu /* Call setup_cpu for this CPU */
#ifdef CONFIG_PPC_BOOK3S_32
bl reloc_offset
bl init_idle_6xx
#endif /* CONFIG_PPC_BOOK3S_32 */
/*
@ -219,7 +206,6 @@ turn_on_mmu:
lis r0,start_here@h
ori r0,r0,start_here@l
mtspr SPRN_SRR0,r0
SYNC
RFI /* enables MMU */
/*
@ -274,14 +260,8 @@ __secondary_hold_acknowledge:
DO_KVM 0x200
MachineCheck:
EXCEPTION_PROLOG_0
#ifdef CONFIG_VMAP_STACK
li r11, MSR_KERNEL & ~(MSR_IR | MSR_RI) /* can take DTLB miss */
mtmsr r11
isync
#endif
#ifdef CONFIG_PPC_CHRP
mfspr r11, SPRN_SPRG_THREAD
tovirt_vmstack r11, r11
lwz r11, RTAS_SP(r11)
cmpwi cr1, r11, 0
bne cr1, 7f
@ -439,7 +419,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_FPU_UNAVAILABLE)
SystemCall:
SYSCALL_ENTRY 0xc00
/* Single step - not used on 601 */
EXCEPTION(0xd00, SingleStep, single_step_exception, EXC_XFER_STD)
EXCEPTION(0xe00, Trap_0e, unknown_exception, EXC_XFER_STD)
@ -790,14 +769,12 @@ fast_hash_page_return:
mtcr r11
lwz r11, THR11(r10)
mfspr r10, SPRN_SPRG_SCRATCH0
SYNC
RFI
1: /* ISI */
mtcr r11
mfspr r11, SPRN_SPRG_SCRATCH1
mfspr r10, SPRN_SPRG_SCRATCH0
SYNC
RFI
stack_overflow:
@ -888,7 +865,6 @@ __secondary_start_pmac_0:
set to map the 0xf0000000 - 0xffffffff region */
mfmsr r0
rlwinm r0,r0,0,28,26 /* clear DR (0x10) */
SYNC
mtmsr r0
isync
@ -900,10 +876,8 @@ __secondary_start:
lis r3,-KERNELBASE@h
mr r4,r24
bl call_setup_cpu /* Call setup_cpu for this CPU */
#ifdef CONFIG_PPC_BOOK3S_32
lis r3,-KERNELBASE@h
bl init_idle_6xx
#endif /* CONFIG_PPC_BOOK3S_32 */
/* get current's stack and current */
lis r2,secondary_current@ha
@ -936,7 +910,6 @@ __secondary_start:
ori r3,r3,start_secondary@l
mtspr SPRN_SRR0,r3
mtspr SPRN_SRR1,r4
SYNC
RFI
#endif /* CONFIG_SMP */
@ -944,22 +917,10 @@ __secondary_start:
#include "../kvm/book3s_rmhandlers.S"
#endif
/*
* Those generic dummy functions are kept for CPUs not
* included in CONFIG_PPC_BOOK3S_32
*/
#if !defined(CONFIG_PPC_BOOK3S_32)
_ENTRY(__save_cpu_setup)
blr
_ENTRY(__restore_cpu_setup)
blr
#endif /* !defined(CONFIG_PPC_BOOK3S_32) */
/*
* Load stuff into the MMU. Intended to be called with
* IR=0 and DR=0.
*/
#ifdef CONFIG_KASAN
early_hash_table:
sync /* Force all PTE updates to finish */
isync
@ -970,8 +931,10 @@ early_hash_table:
lis r6, early_hash - PAGE_OFFSET@h
ori r6, r6, 3 /* 256kB table */
mtspr SPRN_SDR1, r6
lis r6, early_hash@h
lis r3, Hash@ha
stw r6, Hash@l(r3)
blr
#endif
load_up_mmu:
sync /* Force all PTE updates to finish */
@ -985,8 +948,7 @@ load_up_mmu:
lwz r6,_SDR1@l(r6)
mtspr SPRN_SDR1,r6
/* Load the BAT registers with the values set up by MMU_init.
MMU_init takes care of whether we're on a 601 or not. */
/* Load the BAT registers with the values set up by MMU_init. */
lis r3,BATS@ha
addi r3,r3,BATS@l
tophys(r3,r3)
@ -1002,7 +964,7 @@ BEGIN_MMU_FTR_SECTION
END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS)
blr
load_segment_registers:
_GLOBAL(load_segment_registers)
li r0, NUM_USER_SEGMENTS /* load up user segment register values */
mtctr r0 /* for context 0 */
li r3, 0 /* Kp = 0, Ks = 0, VSID = 0 */
@ -1061,11 +1023,7 @@ start_here:
bl machine_init
bl __save_cpu_setup
bl MMU_init
#ifdef CONFIG_KASAN
BEGIN_MMU_FTR_SECTION
bl MMU_init_hw_patch
END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
#endif
/*
* Go back to running unmapped so we can load up new values
@ -1080,7 +1038,6 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
.align 4
mtspr SPRN_SRR0,r4
mtspr SPRN_SRR1,r3
SYNC
RFI
/* Load up the kernel context */
2: bl load_up_mmu
@ -1092,7 +1049,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
*/
lis r5, abatron_pteptrs@h
ori r5, r5, abatron_pteptrs@l
stw r5, 0xf0(r0) /* This much match your Abatron config */
stw r5, 0xf0(0) /* This much match your Abatron config */
lis r6, swapper_pg_dir@h
ori r6, r6, swapper_pg_dir@l
tophys(r5, r5)
@ -1105,7 +1062,6 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
ori r3,r3,start_kernel@l
mtspr SPRN_SRR0,r3
mtspr SPRN_SRR1,r4
SYNC
RFI
/*
@ -1165,7 +1121,6 @@ EXPORT_SYMBOL(switch_mmu_context)
clear_bats:
li r10,0
#ifndef CONFIG_PPC_BOOK3S_601
mtspr SPRN_DBAT0U,r10
mtspr SPRN_DBAT0L,r10
mtspr SPRN_DBAT1U,r10
@ -1174,7 +1129,6 @@ clear_bats:
mtspr SPRN_DBAT2L,r10
mtspr SPRN_DBAT3U,r10
mtspr SPRN_DBAT3L,r10
#endif
mtspr SPRN_IBAT0U,r10
mtspr SPRN_IBAT0L,r10
mtspr SPRN_IBAT1U,r10
@ -1223,7 +1177,6 @@ _ENTRY(update_bats)
.align 4
mtspr SPRN_SRR0, r4
mtspr SPRN_SRR1, r3
SYNC
RFI
1: bl clear_bats
lis r3, BATS@ha
@ -1243,7 +1196,6 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_HIGH_BATS)
mtmsr r3
mtspr SPRN_SRR0, r7
mtspr SPRN_SRR1, r6
SYNC
RFI
flush_tlbs:
@ -1267,26 +1219,9 @@ mmu_off:
sync
RFI
/*
* On 601, we use 3 BATs to map up to 24M of RAM at _PAGE_OFFSET
* (we keep one for debugging) and on others, we use one 256M BAT.
*/
/* We use one BAT to map up to 256M of RAM at _PAGE_OFFSET */
initial_bats:
lis r11,PAGE_OFFSET@h
#ifdef CONFIG_PPC_BOOK3S_601
ori r11,r11,4 /* set up BAT registers for 601 */
li r8,0x7f /* valid, block length = 8MB */
mtspr SPRN_IBAT0U,r11 /* N.B. 601 has valid bit in */
mtspr SPRN_IBAT0L,r8 /* lower BAT register */
addis r11,r11,0x800000@h
addis r8,r8,0x800000@h
mtspr SPRN_IBAT1U,r11
mtspr SPRN_IBAT1L,r8
addis r11,r11,0x800000@h
addis r8,r8,0x800000@h
mtspr SPRN_IBAT2U,r11
mtspr SPRN_IBAT2L,r8
#else
tophys(r8,r11)
#ifdef CONFIG_SMP
ori r8,r8,0x12 /* R/W access, M=1 */
@ -1295,11 +1230,10 @@ initial_bats:
#endif /* CONFIG_SMP */
ori r11,r11,BL_256M<<2|0x2 /* set up BAT registers for 604 */
mtspr SPRN_DBAT0L,r8 /* N.B. 6xx (not 601) have valid */
mtspr SPRN_DBAT0L,r8 /* N.B. 6xx have valid */
mtspr SPRN_DBAT0U,r11 /* bit in upper BAT register */
mtspr SPRN_IBAT0L,r8
mtspr SPRN_IBAT0U,r11
#endif
isync
blr
@ -1317,13 +1251,8 @@ setup_disp_bat:
beqlr
lwz r11,0(r8)
lwz r8,4(r8)
#ifndef CONFIG_PPC_BOOK3S_601
mtspr SPRN_DBAT3L,r8
mtspr SPRN_DBAT3U,r11
#else
mtspr SPRN_IBAT3L,r8
mtspr SPRN_IBAT3U,r11
#endif
blr
#endif /* CONFIG_BOOTX_TEXT */

View File

@ -176,7 +176,6 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
#endif
mtspr SPRN_SRR1,r10
mtspr SPRN_SRR0,r11
SYNC
RFI /* jump to handler, enable MMU */
99: b ret_from_kernel_syscall
.endm

View File

@ -494,151 +494,6 @@ void thread_change_pc(struct task_struct *tsk, struct pt_regs *regs)
}
}
static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint *info)
{
return ((info->address <= dar) && (dar - info->address < info->len));
}
static bool ea_user_range_overlaps(unsigned long ea, int size,
struct arch_hw_breakpoint *info)
{
return ((ea < info->address + info->len) &&
(ea + size > info->address));
}
static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
{
unsigned long hw_start_addr, hw_end_addr;
hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
return ((hw_start_addr <= dar) && (hw_end_addr > dar));
}
static bool ea_hw_range_overlaps(unsigned long ea, int size,
struct arch_hw_breakpoint *info)
{
unsigned long hw_start_addr, hw_end_addr;
hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
}
/*
* If hw has multiple DAWR registers, we also need to check all
* dawrx constraint bits to confirm this is _really_ a valid event.
* If type is UNKNOWN, but privilege level matches, consider it as
* a positive match.
*/
static bool check_dawrx_constraints(struct pt_regs *regs, int type,
struct arch_hw_breakpoint *info)
{
if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ))
return false;
/*
* The Cache Management instructions other than dcbz never
* cause a match. i.e. if type is CACHEOP, the instruction
* is dcbz, and dcbz is treated as Store.
*/
if ((OP_IS_STORE(type) || type == CACHEOP) && !(info->type & HW_BRK_TYPE_WRITE))
return false;
if (is_kernel_addr(regs->nip) && !(info->type & HW_BRK_TYPE_KERNEL))
return false;
if (user_mode(regs) && !(info->type & HW_BRK_TYPE_USER))
return false;
return true;
}
/*
* Return true if the event is valid wrt dawr configuration,
* including extraneous exception. Otherwise return false.
*/
static bool check_constraints(struct pt_regs *regs, struct ppc_inst instr,
unsigned long ea, int type, int size,
struct arch_hw_breakpoint *info)
{
bool in_user_range = dar_in_user_range(regs->dar, info);
bool dawrx_constraints;
/*
* 8xx supports only one breakpoint and thus we can
* unconditionally return true.
*/
if (IS_ENABLED(CONFIG_PPC_8xx)) {
if (!in_user_range)
info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
return true;
}
if (unlikely(ppc_inst_equal(instr, ppc_inst(0)))) {
if (cpu_has_feature(CPU_FTR_ARCH_31) &&
!dar_in_hw_range(regs->dar, info))
return false;
return true;
}
dawrx_constraints = check_dawrx_constraints(regs, type, info);
if (type == UNKNOWN) {
if (cpu_has_feature(CPU_FTR_ARCH_31) &&
!dar_in_hw_range(regs->dar, info))
return false;
return dawrx_constraints;
}
if (ea_user_range_overlaps(ea, size, info))
return dawrx_constraints;
if (ea_hw_range_overlaps(ea, size, info)) {
if (dawrx_constraints) {
info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
return true;
}
}
return false;
}
static int cache_op_size(void)
{
#ifdef __powerpc64__
return ppc64_caches.l1d.block_size;
#else
return L1_CACHE_BYTES;
#endif
}
static void get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
int *type, int *size, unsigned long *ea)
{
struct instruction_op op;
if (__get_user_instr_inatomic(*instr, (void __user *)regs->nip))
return;
analyse_instr(&op, regs, *instr);
*type = GETTYPE(op.type);
*ea = op.ea;
#ifdef __powerpc64__
if (!(regs->msr & MSR_64BIT))
*ea &= 0xffffffffUL;
#endif
*size = GETSIZE(op.type);
if (*type == CACHEOP) {
*size = cache_op_size();
*ea &= ~(*size - 1);
}
}
static bool is_larx_stcx_instr(int type)
{
return type == LARX || type == STCX;
@ -722,7 +577,7 @@ int hw_breakpoint_handler(struct die_args *args)
rcu_read_lock();
if (!IS_ENABLED(CONFIG_PPC_8xx))
get_instr_detail(regs, &instr, &type, &size, &ea);
wp_get_instr_detail(regs, &instr, &type, &size, &ea);
for (i = 0; i < nr_wp_slots(); i++) {
bp[i] = __this_cpu_read(bp_per_reg[i]);
@ -732,7 +587,7 @@ int hw_breakpoint_handler(struct die_args *args)
info[i] = counter_arch_bp(bp[i]);
info[i]->type &= ~HW_BRK_TYPE_EXTRANEOUS_IRQ;
if (check_constraints(regs, instr, ea, type, size, info[i])) {
if (wp_check_constraints(regs, instr, ea, type, size, info[i])) {
if (!IS_ENABLED(CONFIG_PPC_8xx) &&
ppc_inst_equal(instr, ppc_inst(0))) {
handler_error(bp[i], info[i]);

View File

@ -0,0 +1,162 @@
// SPDX-License-Identifier: GPL-2.0+
#include <linux/kernel.h>
#include <linux/uaccess.h>
#include <linux/sched.h>
#include <asm/hw_breakpoint.h>
#include <asm/sstep.h>
#include <asm/cache.h>
static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint *info)
{
return ((info->address <= dar) && (dar - info->address < info->len));
}
static bool ea_user_range_overlaps(unsigned long ea, int size,
struct arch_hw_breakpoint *info)
{
return ((ea < info->address + info->len) &&
(ea + size > info->address));
}
static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
{
unsigned long hw_start_addr, hw_end_addr;
hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
return ((hw_start_addr <= dar) && (hw_end_addr > dar));
}
static bool ea_hw_range_overlaps(unsigned long ea, int size,
struct arch_hw_breakpoint *info)
{
unsigned long hw_start_addr, hw_end_addr;
unsigned long align_size = HW_BREAKPOINT_SIZE;
/*
* On p10 predecessors, quadword is handle differently then
* other instructions.
*/
if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16)
align_size = HW_BREAKPOINT_SIZE_QUADWORD;
hw_start_addr = ALIGN_DOWN(info->address, align_size);
hw_end_addr = ALIGN(info->address + info->len, align_size);
return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
}
/*
* If hw has multiple DAWR registers, we also need to check all
* dawrx constraint bits to confirm this is _really_ a valid event.
* If type is UNKNOWN, but privilege level matches, consider it as
* a positive match.
*/
static bool check_dawrx_constraints(struct pt_regs *regs, int type,
struct arch_hw_breakpoint *info)
{
if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ))
return false;
/*
* The Cache Management instructions other than dcbz never
* cause a match. i.e. if type is CACHEOP, the instruction
* is dcbz, and dcbz is treated as Store.
*/
if ((OP_IS_STORE(type) || type == CACHEOP) && !(info->type & HW_BRK_TYPE_WRITE))
return false;
if (is_kernel_addr(regs->nip) && !(info->type & HW_BRK_TYPE_KERNEL))
return false;
if (user_mode(regs) && !(info->type & HW_BRK_TYPE_USER))
return false;
return true;
}
/*
* Return true if the event is valid wrt dawr configuration,
* including extraneous exception. Otherwise return false.
*/
bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr,
unsigned long ea, int type, int size,
struct arch_hw_breakpoint *info)
{
bool in_user_range = dar_in_user_range(regs->dar, info);
bool dawrx_constraints;
/*
* 8xx supports only one breakpoint and thus we can
* unconditionally return true.
*/
if (IS_ENABLED(CONFIG_PPC_8xx)) {
if (!in_user_range)
info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
return true;
}
if (unlikely(ppc_inst_equal(instr, ppc_inst(0)))) {
if (cpu_has_feature(CPU_FTR_ARCH_31) &&
!dar_in_hw_range(regs->dar, info))
return false;
return true;
}
dawrx_constraints = check_dawrx_constraints(regs, type, info);
if (type == UNKNOWN) {
if (cpu_has_feature(CPU_FTR_ARCH_31) &&
!dar_in_hw_range(regs->dar, info))
return false;
return dawrx_constraints;
}
if (ea_user_range_overlaps(ea, size, info))
return dawrx_constraints;
if (ea_hw_range_overlaps(ea, size, info)) {
if (dawrx_constraints) {
info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
return true;
}
}
return false;
}
static int cache_op_size(void)
{
#ifdef __powerpc64__
return ppc64_caches.l1d.block_size;
#else
return L1_CACHE_BYTES;
#endif
}
void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
int *type, int *size, unsigned long *ea)
{
struct instruction_op op;
if (__get_user_instr_inatomic(*instr, (void __user *)regs->nip))
return;
analyse_instr(&op, regs, *instr);
*type = GETTYPE(op.type);
*ea = op.ea;
#ifdef __powerpc64__
if (!(regs->msr & MSR_64BIT))
*ea &= 0xffffffffUL;
#endif
*size = GETSIZE(op.type);
if (*type == CACHEOP) {
*size = cache_op_size();
*ea &= ~(*size - 1);
} else if (*type == LOAD_VMX || *type == STORE_VMX) {
*ea &= ~(*size - 1);
}
}

View File

@ -41,14 +41,6 @@ static int __init powersave_off(char *arg)
}
__setup("powersave=off", powersave_off);
#ifdef CONFIG_HOTPLUG_CPU
void arch_cpu_idle_dead(void)
{
sched_preempt_enable_no_resched();
cpu_die();
}
#endif
void arch_cpu_idle(void)
{
ppc64_runlatch_off();

View File

@ -104,7 +104,7 @@ static inline notrace unsigned long get_irq_happened(void)
static inline notrace int decrementer_check_overflow(void)
{
u64 now = get_tb_or_rtc();
u64 now = get_tb();
u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
return now >= *next_tb;
@ -113,7 +113,7 @@ static inline notrace int decrementer_check_overflow(void)
#ifdef CONFIG_PPC_BOOK3E
/* This is called whenever we are re-enabling interrupts
* and returns either 0 (nothing to do) or 500/900/280/a00/e80 if
* and returns either 0 (nothing to do) or 500/900/280 if
* there's an EE, DEC or DBELL to generate.
*
* This is called in two contexts: From arch_local_irq_restore()
@ -181,16 +181,6 @@ notrace unsigned int __check_irq_replay(void)
return 0x500;
}
/*
* Check if an EPR external interrupt happened this bit is typically
* set if we need to handle another "edge" interrupt from within the
* MPIC "EPR" handler.
*/
if (happened & PACA_IRQ_EE_EDGE) {
local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
return 0x500;
}
if (happened & PACA_IRQ_DBELL) {
local_paca->irq_happened &= ~PACA_IRQ_DBELL;
return 0x280;
@ -201,6 +191,25 @@ notrace unsigned int __check_irq_replay(void)
return 0;
}
/*
* This is specifically called by assembly code to re-enable interrupts
* if they are currently disabled. This is typically called before
* schedule() or do_signal() when returning to userspace. We do it
* in C to avoid the burden of dealing with lockdep etc...
*
* NOTE: This is called with interrupts hard disabled but not marked
* as such in paca->irq_happened, so we need to resync this.
*/
void notrace restore_interrupts(void)
{
if (irqs_disabled()) {
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
local_irq_enable();
} else
__hard_irq_enable();
}
#endif /* CONFIG_PPC_BOOK3E */
void replay_soft_interrupts(void)
@ -214,7 +223,7 @@ void replay_soft_interrupts(void)
struct pt_regs regs;
ppc_save_regs(&regs);
regs.softe = IRQS_ALL_DISABLED;
regs.softe = IRQS_ENABLED;
again:
if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
@ -270,19 +279,6 @@ void replay_soft_interrupts(void)
hard_irq_disable();
}
/*
* Check if an EPR external interrupt happened this bit is typically
* set if we need to handle another "edge" interrupt from within the
* MPIC "EPR" handler.
*/
if (IS_ENABLED(CONFIG_PPC_BOOK3E) && (happened & PACA_IRQ_EE_EDGE)) {
local_paca->irq_happened &= ~PACA_IRQ_EE_EDGE;
regs.trap = 0x500;
do_IRQ(&regs);
if (!(local_paca->irq_happened & PACA_IRQ_HARD_DIS))
hard_irq_disable();
}
if (IS_ENABLED(CONFIG_PPC_DOORBELL) && (happened & PACA_IRQ_DBELL)) {
local_paca->irq_happened &= ~PACA_IRQ_DBELL;
if (IS_ENABLED(CONFIG_PPC_BOOK3E))
@ -368,6 +364,12 @@ notrace void arch_local_irq_restore(unsigned long mask)
}
}
/*
* Disable preempt here, so that the below preempt_enable will
* perform resched if required (a replayed interrupt may set
* need_resched).
*/
preempt_disable();
irq_soft_mask_set(IRQS_ALL_DISABLED);
trace_hardirqs_off();
@ -377,27 +379,10 @@ notrace void arch_local_irq_restore(unsigned long mask)
trace_hardirqs_on();
irq_soft_mask_set(IRQS_ENABLED);
__hard_irq_enable();
preempt_enable();
}
EXPORT_SYMBOL(arch_local_irq_restore);
/*
* This is specifically called by assembly code to re-enable interrupts
* if they are currently disabled. This is typically called before
* schedule() or do_signal() when returning to userspace. We do it
* in C to avoid the burden of dealing with lockdep etc...
*
* NOTE: This is called with interrupts hard disabled but not marked
* as such in paca->irq_happened, so we need to resync this.
*/
void notrace restore_interrupts(void)
{
if (irqs_disabled()) {
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
local_irq_enable();
} else
__hard_irq_enable();
}
/*
* This is a helper to use when about to go into idle low-power
* when the latter has the side effect of re-enabling interrupts

View File

@ -256,7 +256,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_SPEC7450)
sync
/* Restore MSR (restores EE and DR bits to original state) */
SYNC
mtmsr r7
isync
@ -377,7 +376,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_L3CR)
1: bdnz 1b
/* Restore MSR (restores EE and DR bits to original state) */
4: SYNC
4:
mtmsr r7
isync
blr

View File

@ -215,19 +215,6 @@ _GLOBAL(low_choose_7447a_dfs)
#endif /* CONFIG_CPU_FREQ_PMAC && CONFIG_PPC_BOOK3S_32 */
/*
* complement mask on the msr then "or" some values on.
* _nmask_and_or_msr(nmask, value_to_or)
*/
_GLOBAL(_nmask_and_or_msr)
mfmsr r0 /* Get current msr */
andc r0,r0,r3 /* And off the bits set in r3 (first parm) */
or r0,r0,r4 /* Or on the bits in r4 (second parm) */
SYNC /* Some chip revs have problems here... */
mtmsr r0 /* Update machine state */
isync
blr /* Done */
#ifdef CONFIG_40x
/*
@ -268,41 +255,6 @@ _ASM_NOKPROBE_SYMBOL(real_writeb)
#endif /* CONFIG_40x */
/*
* Flush instruction cache.
* This is a no-op on the 601.
*/
#ifndef CONFIG_PPC_8xx
_GLOBAL(flush_instruction_cache)
#if defined(CONFIG_4xx)
lis r3, KERNELBASE@h
iccci 0,r3
#elif defined(CONFIG_FSL_BOOKE)
#ifdef CONFIG_E200
mfspr r3,SPRN_L1CSR0
ori r3,r3,L1CSR0_CFI|L1CSR0_CLFC
/* msync; isync recommended here */
mtspr SPRN_L1CSR0,r3
isync
blr
#endif
mfspr r3,SPRN_L1CSR1
ori r3,r3,L1CSR1_ICFI|L1CSR1_ICLFR
mtspr SPRN_L1CSR1,r3
#elif defined(CONFIG_PPC_BOOK3S_601)
blr /* for 601, do nothing */
#else
/* 603/604 processor - use invalidate-all bit in HID0 */
mfspr r3,SPRN_HID0
ori r3,r3,HID0_ICFI
mtspr SPRN_HID0,r3
#endif /* CONFIG_4xx */
isync
blr
EXPORT_SYMBOL(flush_instruction_cache)
#endif /* CONFIG_PPC_8xx */
/*
* Copy a whole page. We use the dcbz instruction on the destination
* to reduce memory traffic (it eliminates the unnecessary reads of

View File

@ -365,7 +365,6 @@ _GLOBAL(kexec_smp_wait)
li r4,KEXEC_STATE_REAL_MODE
stb r4,PACAKEXECSTATE(r13)
SYNC
b kexec_wait

View File

@ -124,10 +124,8 @@ unsigned long notrace msr_check_and_set(unsigned long bits)
newmsr = oldmsr | bits;
#ifdef CONFIG_VSX
if (cpu_has_feature(CPU_FTR_VSX) && (bits & MSR_FP))
newmsr |= MSR_VSX;
#endif
if (oldmsr != newmsr)
mtmsr_isync(newmsr);
@ -144,10 +142,8 @@ void notrace __msr_check_and_clear(unsigned long bits)
newmsr = oldmsr & ~bits;
#ifdef CONFIG_VSX
if (cpu_has_feature(CPU_FTR_VSX) && (bits & MSR_FP))
newmsr &= ~MSR_VSX;
#endif
if (oldmsr != newmsr)
mtmsr_isync(newmsr);
@ -162,10 +158,8 @@ static void __giveup_fpu(struct task_struct *tsk)
save_fpu(tsk);
msr = tsk->thread.regs->msr;
msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
#ifdef CONFIG_VSX
if (cpu_has_feature(CPU_FTR_VSX))
msr &= ~MSR_VSX;
#endif
tsk->thread.regs->msr = msr;
}
@ -235,6 +229,8 @@ void enable_kernel_fp(void)
}
}
EXPORT_SYMBOL(enable_kernel_fp);
#else
static inline void __giveup_fpu(struct task_struct *tsk) { }
#endif /* CONFIG_PPC_FPU */
#ifdef CONFIG_ALTIVEC
@ -245,10 +241,8 @@ static void __giveup_altivec(struct task_struct *tsk)
save_altivec(tsk);
msr = tsk->thread.regs->msr;
msr &= ~MSR_VEC;
#ifdef CONFIG_VSX
if (cpu_has_feature(CPU_FTR_VSX))
msr &= ~MSR_VSX;
#endif
tsk->thread.regs->msr = msr;
}
@ -414,21 +408,14 @@ static unsigned long msr_all_available;
static int __init init_msr_all_available(void)
{
#ifdef CONFIG_PPC_FPU
msr_all_available |= MSR_FP;
#endif
#ifdef CONFIG_ALTIVEC
if (IS_ENABLED(CONFIG_PPC_FPU))
msr_all_available |= MSR_FP;
if (cpu_has_feature(CPU_FTR_ALTIVEC))
msr_all_available |= MSR_VEC;
#endif
#ifdef CONFIG_VSX
if (cpu_has_feature(CPU_FTR_VSX))
msr_all_available |= MSR_VSX;
#endif
#ifdef CONFIG_SPE
if (cpu_has_feature(CPU_FTR_SPE))
msr_all_available |= MSR_SPE;
#endif
return 0;
}
@ -452,18 +439,12 @@ void giveup_all(struct task_struct *tsk)
WARN_ON((usermsr & MSR_VSX) && !((usermsr & MSR_FP) && (usermsr & MSR_VEC)));
#ifdef CONFIG_PPC_FPU
if (usermsr & MSR_FP)
__giveup_fpu(tsk);
#endif
#ifdef CONFIG_ALTIVEC
if (usermsr & MSR_VEC)
__giveup_altivec(tsk);
#endif
#ifdef CONFIG_SPE
if (usermsr & MSR_SPE)
__giveup_spe(tsk);
#endif
msr_check_and_clear(msr_all_available);
}
@ -509,19 +490,18 @@ static bool should_restore_altivec(void) { return false; }
static void do_restore_altivec(void) { }
#endif /* CONFIG_ALTIVEC */
#ifdef CONFIG_VSX
static bool should_restore_vsx(void)
{
if (cpu_has_feature(CPU_FTR_VSX))
return true;
return false;
}
#ifdef CONFIG_VSX
static void do_restore_vsx(void)
{
current->thread.used_vsr = 1;
}
#else
static bool should_restore_vsx(void) { return false; }
static void do_restore_vsx(void) { }
#endif /* CONFIG_VSX */
@ -581,7 +561,7 @@ void notrace restore_math(struct pt_regs *regs)
regs->msr |= new_msr | fpexc_mode;
}
}
#endif
#endif /* CONFIG_PPC_BOOK3S_64 */
static void save_all(struct task_struct *tsk)
{
@ -642,6 +622,44 @@ void do_send_trap(struct pt_regs *regs, unsigned long address,
(void __user *)address);
}
#else /* !CONFIG_PPC_ADV_DEBUG_REGS */
static void do_break_handler(struct pt_regs *regs)
{
struct arch_hw_breakpoint null_brk = {0};
struct arch_hw_breakpoint *info;
struct ppc_inst instr = ppc_inst(0);
int type = 0;
int size = 0;
unsigned long ea;
int i;
/*
* If underneath hw supports only one watchpoint, we know it
* caused exception. 8xx also falls into this category.
*/
if (nr_wp_slots() == 1) {
__set_breakpoint(0, &null_brk);
current->thread.hw_brk[0] = null_brk;
current->thread.hw_brk[0].flags |= HW_BRK_FLAG_DISABLED;
return;
}
/* Otherwise findout which DAWR caused exception and disable it. */
wp_get_instr_detail(regs, &instr, &type, &size, &ea);
for (i = 0; i < nr_wp_slots(); i++) {
info = &current->thread.hw_brk[i];
if (!info->address)
continue;
if (wp_check_constraints(regs, instr, ea, type, size, info)) {
__set_breakpoint(i, &null_brk);
current->thread.hw_brk[i] = null_brk;
current->thread.hw_brk[i].flags |= HW_BRK_FLAG_DISABLED;
}
}
}
void do_break (struct pt_regs *regs, unsigned long address,
unsigned long error_code)
{
@ -653,6 +671,16 @@ void do_break (struct pt_regs *regs, unsigned long address,
if (debugger_break_match(regs))
return;
/*
* We reach here only when watchpoint exception is generated by ptrace
* event (or hw is buggy!). Now if CONFIG_HAVE_HW_BREAKPOINT is set,
* watchpoint is already handled by hw_breakpoint_handler() so we don't
* have to do anything. But when CONFIG_HAVE_HW_BREAKPOINT is not set,
* we need to manually handle the watchpoint here.
*/
if (!IS_ENABLED(CONFIG_HAVE_HW_BREAKPOINT))
do_break_handler(regs);
/* Deliver the signal to userspace */
force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
}
@ -783,9 +811,8 @@ static void switch_hw_breakpoint(struct task_struct *new)
static inline int __set_dabr(unsigned long dabr, unsigned long dabrx)
{
mtspr(SPRN_DAC1, dabr);
#ifdef CONFIG_PPC_47x
isync();
#endif
if (IS_ENABLED(CONFIG_PPC_47x))
isync();
return 0;
}
#elif defined(CONFIG_PPC_BOOK3S)
@ -1256,15 +1283,17 @@ struct task_struct *__switch_to(struct task_struct *prev,
restore_math(current->thread.regs);
/*
* The copy-paste buffer can only store into foreign real
* addresses, so unprivileged processes can not see the
* data or use it in any way unless they have foreign real
* mappings. If the new process has the foreign real address
* mappings, we must issue a cp_abort to clear any state and
* prevent snooping, corruption or a covert channel.
* On POWER9 the copy-paste buffer can only paste into
* foreign real addresses, so unprivileged processes can not
* see the data or use it in any way unless they have
* foreign real mappings. If the new process has the foreign
* real address mappings, we must issue a cp_abort to clear
* any state and prevent snooping, corruption or a covert
* channel. ISA v3.1 supports paste into local memory.
*/
if (current->mm &&
atomic_read(&current->mm->context.vas_windows))
(cpu_has_feature(CPU_FTR_ARCH_31) ||
atomic_read(&current->mm->context.vas_windows)))
asm volatile(PPC_CP_ABORT);
}
#endif /* CONFIG_PPC_BOOK3S_64 */
@ -1453,12 +1482,13 @@ void show_regs(struct pt_regs * regs)
trap = TRAP(regs);
if (!trap_is_syscall(regs) && cpu_has_feature(CPU_FTR_CFAR))
pr_cont("CFAR: "REG" ", regs->orig_gpr3);
if (trap == 0x200 || trap == 0x300 || trap == 0x600)
#if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->dsisr);
#else
pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, regs->dsisr);
#endif
if (trap == 0x200 || trap == 0x300 || trap == 0x600) {
if (IS_ENABLED(CONFIG_4xx) || IS_ENABLED(CONFIG_BOOKE))
pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, regs->dsisr);
else
pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, regs->dsisr);
}
#ifdef CONFIG_PPC64
pr_cont("IRQMASK: %lx ", regs->softe);
#endif
@ -1475,14 +1505,14 @@ void show_regs(struct pt_regs * regs)
break;
}
pr_cont("\n");
#ifdef CONFIG_KALLSYMS
/*
* Lookup NIP late so we have the best change of getting the
* above info out without failing
*/
printk("NIP ["REG"] %pS\n", regs->nip, (void *)regs->nip);
printk("LR ["REG"] %pS\n", regs->link, (void *)regs->link);
#endif
if (IS_ENABLED(CONFIG_KALLSYMS)) {
printk("NIP ["REG"] %pS\n", regs->nip, (void *)regs->nip);
printk("LR ["REG"] %pS\n", regs->link, (void *)regs->link);
}
show_stack(current, (unsigned long *) regs->gpr[1], KERN_DEFAULT);
if (!user_mode(regs))
show_instructions(regs);
@ -1731,10 +1761,8 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
#ifdef CONFIG_PPC64
unsigned long load_addr = regs->gpr[2]; /* saved by ELF_PLAT_INIT */
#ifdef CONFIG_PPC_BOOK3S_64
if (!radix_enabled())
if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !radix_enabled())
preload_new_slb_context(start, sp);
#endif
#endif
/*
@ -1866,7 +1894,6 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int val)
* fpexc_mode. fpexc_mode is also used for setting FP exception
* mode (asyn, precise, disabled) for 'Classic' FP. */
if (val & PR_FP_EXC_SW_ENABLE) {
#ifdef CONFIG_SPE
if (cpu_has_feature(CPU_FTR_SPE)) {
/*
* When the sticky exception bits are set
@ -1880,16 +1907,15 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int val)
* anyway to restore the prctl settings from
* the saved environment.
*/
#ifdef CONFIG_SPE
tsk->thread.spefscr_last = mfspr(SPRN_SPEFSCR);
tsk->thread.fpexc_mode = val &
(PR_FP_EXC_SW_ENABLE | PR_FP_ALL_EXCEPT);
#endif
return 0;
} else {
return -EINVAL;
}
#else
return -EINVAL;
#endif
}
/* on a CONFIG_SPE this does not hurt us. The bits that
@ -1908,10 +1934,9 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int val)
int get_fpexc_mode(struct task_struct *tsk, unsigned long adr)
{
unsigned int val;
unsigned int val = 0;
if (tsk->thread.fpexc_mode & PR_FP_EXC_SW_ENABLE)
#ifdef CONFIG_SPE
if (tsk->thread.fpexc_mode & PR_FP_EXC_SW_ENABLE) {
if (cpu_has_feature(CPU_FTR_SPE)) {
/*
* When the sticky exception bits are set
@ -1925,15 +1950,15 @@ int get_fpexc_mode(struct task_struct *tsk, unsigned long adr)
* anyway to restore the prctl settings from
* the saved environment.
*/
#ifdef CONFIG_SPE
tsk->thread.spefscr_last = mfspr(SPRN_SPEFSCR);
val = tsk->thread.fpexc_mode;
#endif
} else
return -EINVAL;
#else
return -EINVAL;
#endif
else
} else {
val = __unpack_fe01(tsk->thread.fpexc_mode);
}
return put_user(val, (unsigned int __user *) adr);
}
@ -2102,10 +2127,8 @@ void show_stack(struct task_struct *tsk, unsigned long *stack,
unsigned long sp, ip, lr, newsp;
int count = 0;
int firstframe = 1;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
unsigned long ret_addr;
int ftrace_idx = 0;
#endif
if (tsk == NULL)
tsk = current;
@ -2133,12 +2156,10 @@ void show_stack(struct task_struct *tsk, unsigned long *stack,
if (!firstframe || ip != lr) {
printk("%s["REG"] ["REG"] %pS",
loglvl, sp, ip, (void *)ip);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
ret_addr = ftrace_graph_ret_addr(current,
&ftrace_idx, ip, stack);
if (ret_addr != ip)
pr_cont(" (%pS)", (void *)ret_addr);
#endif
if (firstframe)
pr_cont(" (unreliable)");
pr_cont("\n");

View File

@ -776,6 +776,11 @@ void __init early_init_devtree(void *params)
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);
#if defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_PPC_4K_PAGES)
if (!early_radix_enabled())
memblock_cap_memory_range(0, 1UL << (H_MAX_PHYSMEM_BITS));
#endif
memblock_allow_resize();
memblock_dump_all();

View File

@ -2422,10 +2422,19 @@ static void __init prom_check_displays(void)
u32 width, height, pitch, addr;
prom_printf("Setting btext !\n");
prom_getprop(node, "width", &width, 4);
prom_getprop(node, "height", &height, 4);
prom_getprop(node, "linebytes", &pitch, 4);
prom_getprop(node, "address", &addr, 4);
if (prom_getprop(node, "width", &width, 4) == PROM_ERROR)
return;
if (prom_getprop(node, "height", &height, 4) == PROM_ERROR)
return;
if (prom_getprop(node, "linebytes", &pitch, 4) == PROM_ERROR)
return;
if (prom_getprop(node, "address", &addr, 4) == PROM_ERROR)
return;
prom_printf("W=%d H=%d LB=%d addr=0x%x\n",
width, height, pitch, addr);
btext_setup_display(width, height, 8, pitch, addr);

View File

@ -57,6 +57,8 @@ void ppc_gethwdinfo(struct ppc_debug_info *dbginfo)
} else {
dbginfo->features = 0;
}
if (cpu_has_feature(CPU_FTR_ARCH_31))
dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_ARCH_31;
}
int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
@ -217,8 +219,9 @@ long ppc_set_hwdebug(struct task_struct *child, struct ppc_hw_breakpoint *bp_inf
return -EIO;
brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
brk.type = HW_BRK_TYPE_TRANSLATE;
brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
brk.len = DABR_MAX_LEN;
brk.hw_len = DABR_MAX_LEN;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
brk.type |= HW_BRK_TYPE_READ;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_WRITE)
@ -286,11 +289,13 @@ long ppc_del_hwdebug(struct task_struct *child, long data)
}
return ret;
#else /* CONFIG_HAVE_HW_BREAKPOINT */
if (child->thread.hw_brk[data - 1].address == 0)
if (!(child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED) &&
child->thread.hw_brk[data - 1].address == 0)
return -ENOENT;
child->thread.hw_brk[data - 1].address = 0;
child->thread.hw_brk[data - 1].type = 0;
child->thread.hw_brk[data - 1].flags = 0;
#endif /* CONFIG_HAVE_HW_BREAKPOINT */
return 0;

View File

@ -992,6 +992,147 @@ struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log,
return NULL;
}
#ifdef CONFIG_PPC_RTAS_FILTER
/*
* The sys_rtas syscall, as originally designed, allows root to pass
* arbitrary physical addresses to RTAS calls. A number of RTAS calls
* can be abused to write to arbitrary memory and do other things that
* are potentially harmful to system integrity, and thus should only
* be used inside the kernel and not exposed to userspace.
*
* All known legitimate users of the sys_rtas syscall will only ever
* pass addresses that fall within the RMO buffer, and use a known
* subset of RTAS calls.
*
* Accordingly, we filter RTAS requests to check that the call is
* permitted, and that provided pointers fall within the RMO buffer.
* The rtas_filters list contains an entry for each permitted call,
* with the indexes of the parameters which are expected to contain
* addresses and sizes of buffers allocated inside the RMO buffer.
*/
struct rtas_filter {
const char *name;
int token;
/* Indexes into the args buffer, -1 if not used */
int buf_idx1;
int size_idx1;
int buf_idx2;
int size_idx2;
int fixed_size;
};
static struct rtas_filter rtas_filters[] __ro_after_init = {
{ "ibm,activate-firmware", -1, -1, -1, -1, -1 },
{ "ibm,configure-connector", -1, 0, -1, 1, -1, 4096 }, /* Special cased */
{ "display-character", -1, -1, -1, -1, -1 },
{ "ibm,display-message", -1, 0, -1, -1, -1 },
{ "ibm,errinjct", -1, 2, -1, -1, -1, 1024 },
{ "ibm,close-errinjct", -1, -1, -1, -1, -1 },
{ "ibm,open-errinct", -1, -1, -1, -1, -1 },
{ "ibm,get-config-addr-info2", -1, -1, -1, -1, -1 },
{ "ibm,get-dynamic-sensor-state", -1, 1, -1, -1, -1 },
{ "ibm,get-indices", -1, 2, 3, -1, -1 },
{ "get-power-level", -1, -1, -1, -1, -1 },
{ "get-sensor-state", -1, -1, -1, -1, -1 },
{ "ibm,get-system-parameter", -1, 1, 2, -1, -1 },
{ "get-time-of-day", -1, -1, -1, -1, -1 },
{ "ibm,get-vpd", -1, 0, -1, 1, 2 },
{ "ibm,lpar-perftools", -1, 2, 3, -1, -1 },
{ "ibm,platform-dump", -1, 4, 5, -1, -1 },
{ "ibm,read-slot-reset-state", -1, -1, -1, -1, -1 },
{ "ibm,scan-log-dump", -1, 0, 1, -1, -1 },
{ "ibm,set-dynamic-indicator", -1, 2, -1, -1, -1 },
{ "ibm,set-eeh-option", -1, -1, -1, -1, -1 },
{ "set-indicator", -1, -1, -1, -1, -1 },
{ "set-power-level", -1, -1, -1, -1, -1 },
{ "set-time-for-power-on", -1, -1, -1, -1, -1 },
{ "ibm,set-system-parameter", -1, 1, -1, -1, -1 },
{ "set-time-of-day", -1, -1, -1, -1, -1 },
{ "ibm,suspend-me", -1, -1, -1, -1, -1 },
{ "ibm,update-nodes", -1, 0, -1, -1, -1, 4096 },
{ "ibm,update-properties", -1, 0, -1, -1, -1, 4096 },
{ "ibm,physical-attestation", -1, 0, 1, -1, -1 },
};
static bool in_rmo_buf(u32 base, u32 end)
{
return base >= rtas_rmo_buf &&
base < (rtas_rmo_buf + RTAS_RMOBUF_MAX) &&
base <= end &&
end >= rtas_rmo_buf &&
end < (rtas_rmo_buf + RTAS_RMOBUF_MAX);
}
static bool block_rtas_call(int token, int nargs,
struct rtas_args *args)
{
int i;
for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) {
struct rtas_filter *f = &rtas_filters[i];
u32 base, size, end;
if (token != f->token)
continue;
if (f->buf_idx1 != -1) {
base = be32_to_cpu(args->args[f->buf_idx1]);
if (f->size_idx1 != -1)
size = be32_to_cpu(args->args[f->size_idx1]);
else if (f->fixed_size)
size = f->fixed_size;
else
size = 1;
end = base + size - 1;
if (!in_rmo_buf(base, end))
goto err;
}
if (f->buf_idx2 != -1) {
base = be32_to_cpu(args->args[f->buf_idx2]);
if (f->size_idx2 != -1)
size = be32_to_cpu(args->args[f->size_idx2]);
else if (f->fixed_size)
size = f->fixed_size;
else
size = 1;
end = base + size - 1;
/*
* Special case for ibm,configure-connector where the
* address can be 0
*/
if (!strcmp(f->name, "ibm,configure-connector") &&
base == 0)
return false;
if (!in_rmo_buf(base, end))
goto err;
}
return false;
}
err:
pr_err_ratelimited("sys_rtas: RTAS call blocked - exploit attempt?\n");
pr_err_ratelimited("sys_rtas: token=0x%x, nargs=%d (called by %s)\n",
token, nargs, current->comm);
return true;
}
#else
static bool block_rtas_call(int token, int nargs,
struct rtas_args *args)
{
return false;
}
#endif /* CONFIG_PPC_RTAS_FILTER */
/* We assume to be passed big endian arguments */
SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
{
@ -1029,6 +1170,9 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
args.rets = &args.args[nargs];
memset(args.rets, 0, nret * sizeof(rtas_arg_t));
if (block_rtas_call(token, nargs, &args))
return -EINVAL;
/* Need to handle ibm,suspend_me call specially */
if (token == ibm_suspend_me_token) {
@ -1090,6 +1234,9 @@ void __init rtas_initialize(void)
unsigned long rtas_region = RTAS_INSTANTIATE_MAX;
u32 base, size, entry;
int no_base, no_size, no_entry;
#ifdef CONFIG_PPC_RTAS_FILTER
int i;
#endif
/* Get RTAS dev node and fill up our "rtas" structure with infos
* about it.
@ -1129,6 +1276,12 @@ void __init rtas_initialize(void)
#ifdef CONFIG_RTAS_ERROR_LOGGING
rtas_last_error_token = rtas_token("rtas-last-error");
#endif
#ifdef CONFIG_PPC_RTAS_FILTER
for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) {
rtas_filters[i].token = rtas_token(rtas_filters[i].name);
}
#endif
}
int __init early_init_dt_scan_rtas(unsigned long node,

View File

@ -430,30 +430,44 @@ device_initcall(stf_barrier_debugfs_init);
static void update_branch_cache_flush(void)
{
u32 *site;
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
site = &patch__call_kvm_flush_link_stack;
// This controls the branch from guest_exit_cont to kvm_flush_link_stack
if (link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) {
patch_instruction_site(&patch__call_kvm_flush_link_stack,
ppc_inst(PPC_INST_NOP));
patch_instruction_site(site, ppc_inst(PPC_INST_NOP));
} else {
// Could use HW flush, but that could also flush count cache
patch_branch_site(&patch__call_kvm_flush_link_stack,
(u64)&kvm_flush_link_stack, BRANCH_SET_LINK);
patch_branch_site(site, (u64)&kvm_flush_link_stack, BRANCH_SET_LINK);
}
#endif
// Patch out the bcctr first, then nop the rest
site = &patch__call_flush_branch_caches3;
patch_instruction_site(site, ppc_inst(PPC_INST_NOP));
site = &patch__call_flush_branch_caches2;
patch_instruction_site(site, ppc_inst(PPC_INST_NOP));
site = &patch__call_flush_branch_caches1;
patch_instruction_site(site, ppc_inst(PPC_INST_NOP));
// This controls the branch from _switch to flush_branch_caches
if (count_cache_flush_type == BRANCH_CACHE_FLUSH_NONE &&
link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) {
patch_instruction_site(&patch__call_flush_branch_caches,
ppc_inst(PPC_INST_NOP));
// Nothing to be done
} else if (count_cache_flush_type == BRANCH_CACHE_FLUSH_HW &&
link_stack_flush_type == BRANCH_CACHE_FLUSH_HW) {
patch_instruction_site(&patch__call_flush_branch_caches,
ppc_inst(PPC_INST_BCCTR_FLUSH));
// Patch in the bcctr last
site = &patch__call_flush_branch_caches1;
patch_instruction_site(site, ppc_inst(0x39207fff)); // li r9,0x7fff
site = &patch__call_flush_branch_caches2;
patch_instruction_site(site, ppc_inst(0x7d2903a6)); // mtctr r9
site = &patch__call_flush_branch_caches3;
patch_instruction_site(site, ppc_inst(PPC_INST_BCCTR_FLUSH));
} else {
patch_branch_site(&patch__call_flush_branch_caches,
(u64)&flush_branch_caches, BRANCH_SET_LINK);
patch_branch_site(site, (u64)&flush_branch_caches, BRANCH_SET_LINK);
// If we just need to flush the link stack, early return
if (count_cache_flush_type == BRANCH_CACHE_FLUSH_NONE) {

View File

@ -223,6 +223,6 @@ __init void initialize_cache_info(void)
dcache_bsize = cur_cpu_spec->dcache_bsize;
icache_bsize = cur_cpu_spec->icache_bsize;
ucache_bsize = 0;
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601) || IS_ENABLED(CONFIG_E200))
if (IS_ENABLED(CONFIG_E200))
ucache_bsize = icache_bsize = dcache_bsize;
}

View File

@ -66,6 +66,7 @@
#include <asm/feature-fixups.h>
#include <asm/kup.h>
#include <asm/early_ioremap.h>
#include <asm/pgalloc.h>
#include "setup.h"
@ -756,17 +757,46 @@ void __init emergency_stack_init(void)
}
#ifdef CONFIG_SMP
#define PCPU_DYN_SIZE ()
static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size, size_t align)
/**
* pcpu_alloc_bootmem - NUMA friendly alloc_bootmem wrapper for percpu
* @cpu: cpu to allocate for
* @size: size allocation in bytes
* @align: alignment
*
* Allocate @size bytes aligned at @align for cpu @cpu. This wrapper
* does the right thing for NUMA regardless of the current
* configuration.
*
* RETURNS:
* Pointer to the allocated area on success, NULL on failure.
*/
static void * __init pcpu_alloc_bootmem(unsigned int cpu, size_t size,
size_t align)
{
return memblock_alloc_try_nid(size, align, __pa(MAX_DMA_ADDRESS),
MEMBLOCK_ALLOC_ACCESSIBLE,
early_cpu_to_node(cpu));
const unsigned long goal = __pa(MAX_DMA_ADDRESS);
#ifdef CONFIG_NEED_MULTIPLE_NODES
int node = early_cpu_to_node(cpu);
void *ptr;
if (!node_online(node) || !NODE_DATA(node)) {
ptr = memblock_alloc_from(size, align, goal);
pr_info("cpu %d has no node %d or node-local memory\n",
cpu, node);
pr_debug("per cpu data for cpu%d %lu bytes at %016lx\n",
cpu, size, __pa(ptr));
} else {
ptr = memblock_alloc_try_nid(size, align, goal,
MEMBLOCK_ALLOC_ACCESSIBLE, node);
pr_debug("per cpu data for cpu%d %lu bytes on node%d at "
"%016lx\n", cpu, size, node, __pa(ptr));
}
return ptr;
#else
return memblock_alloc_from(size, align, goal);
#endif
}
static void __init pcpu_fc_free(void *ptr, size_t size)
static void __init pcpu_free_bootmem(void *ptr, size_t size)
{
memblock_free(__pa(ptr), size);
}
@ -782,13 +812,58 @@ static int pcpu_cpu_distance(unsigned int from, unsigned int to)
unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
EXPORT_SYMBOL(__per_cpu_offset);
static void __init pcpu_populate_pte(unsigned long addr)
{
pgd_t *pgd = pgd_offset_k(addr);
p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
p4d = p4d_offset(pgd, addr);
if (p4d_none(*p4d)) {
pud_t *new;
new = memblock_alloc(PUD_TABLE_SIZE, PUD_TABLE_SIZE);
if (!new)
goto err_alloc;
p4d_populate(&init_mm, p4d, new);
}
pud = pud_offset(p4d, addr);
if (pud_none(*pud)) {
pmd_t *new;
new = memblock_alloc(PMD_TABLE_SIZE, PMD_TABLE_SIZE);
if (!new)
goto err_alloc;
pud_populate(&init_mm, pud, new);
}
pmd = pmd_offset(pud, addr);
if (!pmd_present(*pmd)) {
pte_t *new;
new = memblock_alloc(PTE_TABLE_SIZE, PTE_TABLE_SIZE);
if (!new)
goto err_alloc;
pmd_populate_kernel(&init_mm, pmd, new);
}
return;
err_alloc:
panic("%s: Failed to allocate %lu bytes align=%lx from=%lx\n",
__func__, PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
}
void __init setup_per_cpu_areas(void)
{
const size_t dyn_size = PERCPU_MODULE_RESERVE + PERCPU_DYNAMIC_RESERVE;
size_t atom_size;
unsigned long delta;
unsigned int cpu;
int rc;
int rc = -EINVAL;
/*
* Linear mapping is one of 4K, 1M and 16M. For 4K, no need
@ -800,8 +875,18 @@ void __init setup_per_cpu_areas(void)
else
atom_size = 1 << 20;
rc = pcpu_embed_first_chunk(0, dyn_size, atom_size, pcpu_cpu_distance,
pcpu_fc_alloc, pcpu_fc_free);
if (pcpu_chosen_fc != PCPU_FC_PAGE) {
rc = pcpu_embed_first_chunk(0, dyn_size, atom_size, pcpu_cpu_distance,
pcpu_alloc_bootmem, pcpu_free_bootmem);
if (rc)
pr_warn("PERCPU: %s allocator failed (%d), "
"falling back to page size\n",
pcpu_fc_names[pcpu_chosen_fc], rc);
}
if (rc < 0)
rc = pcpu_page_first_chunk(0, pcpu_alloc_bootmem, pcpu_free_bootmem,
pcpu_populate_pte);
if (rc < 0)
panic("cannot initialize percpu area (err=%d)", rc);

View File

@ -75,17 +75,28 @@ static DEFINE_PER_CPU(int, cpu_state) = { 0 };
struct task_struct *secondary_current;
bool has_big_cores;
bool coregroup_enabled;
DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_l2_cache_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_core_map);
DEFINE_PER_CPU(cpumask_var_t, cpu_coregroup_map);
EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);
EXPORT_SYMBOL_GPL(has_big_cores);
enum {
#ifdef CONFIG_SCHED_SMT
smt_idx,
#endif
cache_idx,
mc_idx,
die_idx,
};
#define MAX_THREAD_LIST_SIZE 8
#define THREAD_GROUP_SHARE_L1 1
struct thread_groups {
@ -659,6 +670,28 @@ static void set_cpus_unrelated(int i, int j,
}
#endif
/*
* Extends set_cpus_related. Instead of setting one CPU at a time in
* dstmask, set srcmask at oneshot. dstmask should be super set of srcmask.
*/
static void or_cpumasks_related(int i, int j, struct cpumask *(*srcmask)(int),
struct cpumask *(*dstmask)(int))
{
struct cpumask *mask;
int k;
mask = srcmask(j);
for_each_cpu(k, srcmask(i))
cpumask_or(dstmask(k), dstmask(k), mask);
if (i == j)
return;
mask = srcmask(i);
for_each_cpu(k, srcmask(j))
cpumask_or(dstmask(k), dstmask(k), mask);
}
/*
* parse_thread_groups: Parses the "ibm,thread-groups" device tree
* property for the CPU device node @dn and stores
@ -789,10 +822,6 @@ static int init_cpu_l1_cache_map(int cpu)
if (err)
goto out;
zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
GFP_KERNEL,
cpu_to_node(cpu));
cpu_group_start = get_cpu_thread_group_start(cpu, &tg);
if (unlikely(cpu_group_start == -1)) {
@ -801,6 +830,9 @@ static int init_cpu_l1_cache_map(int cpu)
goto out;
}
zalloc_cpumask_var_node(&per_cpu(cpu_l1_cache_map, cpu),
GFP_KERNEL, cpu_to_node(cpu));
for (i = first_thread; i < first_thread + threads_per_core; i++) {
int i_group_start = get_cpu_thread_group_start(i, &tg);
@ -819,6 +851,74 @@ static int init_cpu_l1_cache_map(int cpu)
return err;
}
static bool shared_caches;
#ifdef CONFIG_SCHED_SMT
/* cpumask of CPUs with asymmetric SMT dependency */
static int powerpc_smt_flags(void)
{
int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
flags |= SD_ASYM_PACKING;
}
return flags;
}
#endif
/*
* P9 has a slightly odd architecture where pairs of cores share an L2 cache.
* This topology makes it *much* cheaper to migrate tasks between adjacent cores
* since the migrated task remains cache hot. We want to take advantage of this
* at the scheduler level so an extra topology level is required.
*/
static int powerpc_shared_cache_flags(void)
{
return SD_SHARE_PKG_RESOURCES;
}
/*
* We can't just pass cpu_l2_cache_mask() directly because
* returns a non-const pointer and the compiler barfs on that.
*/
static const struct cpumask *shared_cache_mask(int cpu)
{
return per_cpu(cpu_l2_cache_map, cpu);
}
#ifdef CONFIG_SCHED_SMT
static const struct cpumask *smallcore_smt_mask(int cpu)
{
return cpu_smallcore_mask(cpu);
}
#endif
static struct cpumask *cpu_coregroup_mask(int cpu)
{
return per_cpu(cpu_coregroup_map, cpu);
}
static bool has_coregroup_support(void)
{
return coregroup_enabled;
}
static const struct cpumask *cpu_mc_mask(int cpu)
{
return cpu_coregroup_mask(cpu);
}
static struct sched_domain_topology_level powerpc_topology[] = {
#ifdef CONFIG_SCHED_SMT
{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
#endif
{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
{ cpu_mc_mask, SD_INIT_NAME(MC) },
{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
{ NULL, },
};
static int init_big_cores(void)
{
int cpu;
@ -861,6 +961,11 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
GFP_KERNEL, cpu_to_node(cpu));
zalloc_cpumask_var_node(&per_cpu(cpu_core_map, cpu),
GFP_KERNEL, cpu_to_node(cpu));
if (has_coregroup_support())
zalloc_cpumask_var_node(&per_cpu(cpu_coregroup_map, cpu),
GFP_KERNEL, cpu_to_node(cpu));
#ifdef CONFIG_NEED_MULTIPLE_NODES
/*
* numa_node_id() works after this.
*/
@ -869,12 +974,21 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
set_cpu_numa_mem(cpu,
local_memory_node(numa_cpu_lookup_table[cpu]));
}
#endif
/*
* cpu_core_map is now more updated and exists only since
* its been exported for long. It only will have a snapshot
* of cpu_cpu_mask.
*/
cpumask_copy(per_cpu(cpu_core_map, cpu), cpu_cpu_mask(cpu));
}
/* Init the cpumasks so the boot CPU is related to itself */
cpumask_set_cpu(boot_cpuid, cpu_sibling_mask(boot_cpuid));
cpumask_set_cpu(boot_cpuid, cpu_l2_cache_mask(boot_cpuid));
cpumask_set_cpu(boot_cpuid, cpu_core_mask(boot_cpuid));
if (has_coregroup_support())
cpumask_set_cpu(boot_cpuid, cpu_coregroup_mask(boot_cpuid));
init_big_cores();
if (has_big_cores) {
@ -1126,30 +1240,61 @@ static struct device_node *cpu_to_l2cache(int cpu)
return cache;
}
static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
static bool update_mask_by_l2(int cpu)
{
struct cpumask *(*submask_fn)(int) = cpu_sibling_mask;
struct device_node *l2_cache, *np;
cpumask_var_t mask;
int i;
l2_cache = cpu_to_l2cache(cpu);
if (!l2_cache)
return false;
if (!l2_cache) {
struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
for_each_cpu(i, cpu_online_mask) {
/*
* If no l2cache for this CPU, assume all siblings to share
* cache with this CPU.
*/
if (has_big_cores)
sibling_mask = cpu_smallcore_mask;
for_each_cpu(i, sibling_mask(cpu))
set_cpus_related(cpu, i, cpu_l2_cache_mask);
return false;
}
alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
if (has_big_cores)
submask_fn = cpu_smallcore_mask;
/* Update l2-cache mask with all the CPUs that are part of submask */
or_cpumasks_related(cpu, cpu, submask_fn, cpu_l2_cache_mask);
/* Skip all CPUs already part of current CPU l2-cache mask */
cpumask_andnot(mask, mask, cpu_l2_cache_mask(cpu));
for_each_cpu(i, mask) {
/*
* when updating the marks the current CPU has not been marked
* online, but we need to update the cache masks
*/
np = cpu_to_l2cache(i);
if (!np)
continue;
if (np == l2_cache)
set_cpus_related(cpu, i, mask_fn);
/* Skip all CPUs already part of current CPU l2-cache */
if (np == l2_cache) {
or_cpumasks_related(cpu, i, submask_fn, cpu_l2_cache_mask);
cpumask_andnot(mask, mask, submask_fn(i));
} else {
cpumask_andnot(mask, mask, cpu_l2_cache_mask(i));
}
of_node_put(np);
}
of_node_put(l2_cache);
free_cpumask_var(mask);
return true;
}
@ -1157,59 +1302,75 @@ static bool update_mask_by_l2(int cpu, struct cpumask *(*mask_fn)(int))
#ifdef CONFIG_HOTPLUG_CPU
static void remove_cpu_from_masks(int cpu)
{
struct cpumask *(*mask_fn)(int) = cpu_sibling_mask;
int i;
/* NB: cpu_core_mask is a superset of the others */
for_each_cpu(i, cpu_core_mask(cpu)) {
set_cpus_unrelated(cpu, i, cpu_core_mask);
if (shared_caches)
mask_fn = cpu_l2_cache_mask;
for_each_cpu(i, mask_fn(cpu)) {
set_cpus_unrelated(cpu, i, cpu_l2_cache_mask);
set_cpus_unrelated(cpu, i, cpu_sibling_mask);
if (has_big_cores)
set_cpus_unrelated(cpu, i, cpu_smallcore_mask);
}
if (has_coregroup_support()) {
for_each_cpu(i, cpu_coregroup_mask(cpu))
set_cpus_unrelated(cpu, i, cpu_coregroup_mask);
}
}
#endif
static inline void add_cpu_to_smallcore_masks(int cpu)
{
struct cpumask *this_l1_cache_map = per_cpu(cpu_l1_cache_map, cpu);
int i, first_thread = cpu_first_thread_sibling(cpu);
int i;
if (!has_big_cores)
return;
cpumask_set_cpu(cpu, cpu_smallcore_mask(cpu));
for (i = first_thread; i < first_thread + threads_per_core; i++) {
if (cpu_online(i) && cpumask_test_cpu(i, this_l1_cache_map))
for_each_cpu(i, per_cpu(cpu_l1_cache_map, cpu)) {
if (cpu_online(i))
set_cpus_related(i, cpu, cpu_smallcore_mask);
}
}
int get_physical_package_id(int cpu)
static void update_coregroup_mask(int cpu)
{
int pkg_id = cpu_to_chip_id(cpu);
struct cpumask *(*submask_fn)(int) = cpu_sibling_mask;
cpumask_var_t mask;
int coregroup_id = cpu_to_coregroup_id(cpu);
int i;
/*
* If the platform is PowerNV or Guest on KVM, ibm,chip-id is
* defined. Hence we would return the chip-id as the result of
* get_physical_package_id.
*/
if (pkg_id == -1 && firmware_has_feature(FW_FEATURE_LPAR) &&
IS_ENABLED(CONFIG_PPC_SPLPAR)) {
struct device_node *np = of_get_cpu_node(cpu, NULL);
pkg_id = of_node_to_nid(np);
of_node_put(np);
alloc_cpumask_var_node(&mask, GFP_KERNEL, cpu_to_node(cpu));
cpumask_and(mask, cpu_online_mask, cpu_cpu_mask(cpu));
if (shared_caches)
submask_fn = cpu_l2_cache_mask;
/* Update coregroup mask with all the CPUs that are part of submask */
or_cpumasks_related(cpu, cpu, submask_fn, cpu_coregroup_mask);
/* Skip all CPUs already part of coregroup mask */
cpumask_andnot(mask, mask, cpu_coregroup_mask(cpu));
for_each_cpu(i, mask) {
/* Skip all CPUs not part of this coregroup */
if (coregroup_id == cpu_to_coregroup_id(i)) {
or_cpumasks_related(cpu, i, submask_fn, cpu_coregroup_mask);
cpumask_andnot(mask, mask, submask_fn(i));
} else {
cpumask_andnot(mask, mask, cpu_coregroup_mask(i));
}
}
return pkg_id;
free_cpumask_var(mask);
}
EXPORT_SYMBOL_GPL(get_physical_package_id);
static void add_cpu_to_masks(int cpu)
{
int first_thread = cpu_first_thread_sibling(cpu);
int pkg_id = get_physical_package_id(cpu);
int i;
/*
@ -1223,36 +1384,16 @@ static void add_cpu_to_masks(int cpu)
set_cpus_related(i, cpu, cpu_sibling_mask);
add_cpu_to_smallcore_masks(cpu);
/*
* Copy the thread sibling mask into the cache sibling mask
* and mark any CPUs that share an L2 with this CPU.
*/
for_each_cpu(i, cpu_sibling_mask(cpu))
set_cpus_related(cpu, i, cpu_l2_cache_mask);
update_mask_by_l2(cpu, cpu_l2_cache_mask);
update_mask_by_l2(cpu);
/*
* Copy the cache sibling mask into core sibling mask and mark
* any CPUs on the same chip as this CPU.
*/
for_each_cpu(i, cpu_l2_cache_mask(cpu))
set_cpus_related(cpu, i, cpu_core_mask);
if (pkg_id == -1)
return;
for_each_cpu(i, cpu_online_mask)
if (get_physical_package_id(i) == pkg_id)
set_cpus_related(cpu, i, cpu_core_mask);
if (has_coregroup_support())
update_coregroup_mask(cpu);
}
static bool shared_caches;
/* Activate a secondary processor. */
void start_secondary(void *unused)
{
unsigned int cpu = smp_processor_id();
struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
mmgrab(&init_mm);
current->active_mm = &init_mm;
@ -1278,14 +1419,20 @@ void start_secondary(void *unused)
/* Update topology CPU masks */
add_cpu_to_masks(cpu);
if (has_big_cores)
sibling_mask = cpu_smallcore_mask;
/*
* Check for any shared caches. Note that this must be done on a
* per-core basis because one core in the pair might be disabled.
*/
if (!cpumask_equal(cpu_l2_cache_mask(cpu), sibling_mask(cpu)))
shared_caches = true;
if (!shared_caches) {
struct cpumask *(*sibling_mask)(int) = cpu_sibling_mask;
struct cpumask *mask = cpu_l2_cache_mask(cpu);
if (has_big_cores)
sibling_mask = cpu_smallcore_mask;
if (cpumask_weight(mask) > cpumask_weight(sibling_mask(cpu)))
shared_caches = true;
}
set_numa_node(numa_cpu_lookup_table[cpu]);
set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu]));
@ -1311,64 +1458,45 @@ int setup_profiling_timer(unsigned int multiplier)
return 0;
}
#ifdef CONFIG_SCHED_SMT
/* cpumask of CPUs with asymetric SMT dependancy */
static int powerpc_smt_flags(void)
static void fixup_topology(void)
{
int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
int i;
if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
flags |= SD_ASYM_PACKING;
#ifdef CONFIG_SCHED_SMT
if (has_big_cores) {
pr_info("Big cores detected but using small core scheduling\n");
powerpc_topology[smt_idx].mask = smallcore_smt_mask;
}
return flags;
}
#endif
static struct sched_domain_topology_level powerpc_topology[] = {
#ifdef CONFIG_SCHED_SMT
{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
if (!has_coregroup_support())
powerpc_topology[mc_idx].mask = powerpc_topology[cache_idx].mask;
/*
* Try to consolidate topology levels here instead of
* allowing scheduler to degenerate.
* - Dont consolidate if masks are different.
* - Dont consolidate if sd_flags exists and are different.
*/
for (i = 1; i <= die_idx; i++) {
if (powerpc_topology[i].mask != powerpc_topology[i - 1].mask)
continue;
if (powerpc_topology[i].sd_flags && powerpc_topology[i - 1].sd_flags &&
powerpc_topology[i].sd_flags != powerpc_topology[i - 1].sd_flags)
continue;
if (!powerpc_topology[i - 1].sd_flags)
powerpc_topology[i - 1].sd_flags = powerpc_topology[i].sd_flags;
powerpc_topology[i].mask = powerpc_topology[i + 1].mask;
powerpc_topology[i].sd_flags = powerpc_topology[i + 1].sd_flags;
#ifdef CONFIG_SCHED_DEBUG
powerpc_topology[i].name = powerpc_topology[i + 1].name;
#endif
{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
{ NULL, },
};
/*
* P9 has a slightly odd architecture where pairs of cores share an L2 cache.
* This topology makes it *much* cheaper to migrate tasks between adjacent cores
* since the migrated task remains cache hot. We want to take advantage of this
* at the scheduler level so an extra topology level is required.
*/
static int powerpc_shared_cache_flags(void)
{
return SD_SHARE_PKG_RESOURCES;
}
}
/*
* We can't just pass cpu_l2_cache_mask() directly because
* returns a non-const pointer and the compiler barfs on that.
*/
static const struct cpumask *shared_cache_mask(int cpu)
{
return cpu_l2_cache_mask(cpu);
}
#ifdef CONFIG_SCHED_SMT
static const struct cpumask *smallcore_smt_mask(int cpu)
{
return cpu_smallcore_mask(cpu);
}
#endif
static struct sched_domain_topology_level power9_topology[] = {
#ifdef CONFIG_SCHED_SMT
{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
#endif
{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
{ NULL, },
};
void __init smp_cpus_done(unsigned int max_cpus)
{
/*
@ -1382,24 +1510,8 @@ void __init smp_cpus_done(unsigned int max_cpus)
dump_numa_cpu_topology();
#ifdef CONFIG_SCHED_SMT
if (has_big_cores) {
pr_info("Big cores detected but using small core scheduling\n");
power9_topology[0].mask = smallcore_smt_mask;
powerpc_topology[0].mask = smallcore_smt_mask;
}
#endif
/*
* If any CPU detects that it's sharing a cache with another CPU then
* use the deeper topology that is aware of this sharing.
*/
if (shared_caches) {
pr_info("Using shared cache scheduler topology\n");
set_sched_topology(power9_topology);
} else {
pr_info("Using standard scheduler topology\n");
set_sched_topology(powerpc_topology);
}
fixup_topology();
set_sched_topology(powerpc_topology);
}
#ifdef CONFIG_HOTPLUG_CPU
@ -1429,16 +1541,18 @@ void __cpu_die(unsigned int cpu)
smp_ops->cpu_die(cpu);
}
void cpu_die(void)
void arch_cpu_idle_dead(void)
{
sched_preempt_enable_no_resched();
/*
* Disable on the down path. This will be re-enabled by
* start_secondary() via start_secondary_resume() below
*/
this_cpu_disable_ftrace();
if (ppc_md.cpu_die)
ppc_md.cpu_die();
if (smp_ops->cpu_offline_self)
smp_ops->cpu_offline_self();
/* If we return, we re-enter start_secondary */
start_secondary_resume();

View File

@ -32,29 +32,27 @@
static DEFINE_PER_CPU(struct cpu, cpu_devices);
/*
* SMT snooze delay stuff, 64-bit only for now
*/
#ifdef CONFIG_PPC64
/* Time in microseconds we delay before sleeping in the idle loop */
static DEFINE_PER_CPU(long, smt_snooze_delay) = { 100 };
/*
* Snooze delay has not been hooked up since 3fa8cad82b94 ("powerpc/pseries/cpuidle:
* smt-snooze-delay cleanup.") and has been broken even longer. As was foretold in
* 2014:
*
* "ppc64_util currently utilises it. Once we fix ppc64_util, propose to clean
* up the kernel code."
*
* powerpc-utils stopped using it as of 1.3.8. At some point in the future this
* code should be removed.
*/
static ssize_t store_smt_snooze_delay(struct device *dev,
struct device_attribute *attr,
const char *buf,
size_t count)
{
struct cpu *cpu = container_of(dev, struct cpu, dev);
ssize_t ret;
long snooze;
ret = sscanf(buf, "%ld", &snooze);
if (ret != 1)
return -EINVAL;
per_cpu(smt_snooze_delay, cpu->dev.id) = snooze;
pr_warn_once("%s (%d) stored to unsupported smt_snooze_delay, which has no effect.\n",
current->comm, current->pid);
return count;
}
@ -62,9 +60,9 @@ static ssize_t show_smt_snooze_delay(struct device *dev,
struct device_attribute *attr,
char *buf)
{
struct cpu *cpu = container_of(dev, struct cpu, dev);
return sprintf(buf, "%ld\n", per_cpu(smt_snooze_delay, cpu->dev.id));
pr_warn_once("%s (%d) read from unsupported smt_snooze_delay\n",
current->comm, current->pid);
return sprintf(buf, "100\n");
}
static DEVICE_ATTR(smt_snooze_delay, 0644, show_smt_snooze_delay,
@ -72,16 +70,10 @@ static DEVICE_ATTR(smt_snooze_delay, 0644, show_smt_snooze_delay,
static int __init setup_smt_snooze_delay(char *str)
{
unsigned int cpu;
long snooze;
if (!cpu_has_feature(CPU_FTR_SMT))
return 1;
snooze = simple_strtol(str, NULL, 10);
for_each_possible_cpu(cpu)
per_cpu(smt_snooze_delay, cpu) = snooze;
pr_warn("smt-snooze-delay command line option has no effect\n");
return 1;
}
__setup("smt-snooze-delay=", setup_smt_snooze_delay);
@ -225,14 +217,13 @@ static DEVICE_ATTR(dscr_default, 0600,
static void sysfs_create_dscr_default(void)
{
if (cpu_has_feature(CPU_FTR_DSCR)) {
int err = 0;
int cpu;
dscr_default = spr_default_dscr;
for_each_possible_cpu(cpu)
paca_ptrs[cpu]->dscr_default = dscr_default;
err = device_create_file(cpu_subsys.dev_root, &dev_attr_dscr_default);
device_create_file(cpu_subsys.dev_root, &dev_attr_dscr_default);
}
}
#endif /* CONFIG_PPC64 */
@ -1168,6 +1159,7 @@ static int __init topology_init(void)
for_each_possible_cpu(cpu) {
struct cpu *c = &per_cpu(cpu_devices, cpu);
#ifdef CONFIG_HOTPLUG_CPU
/*
* For now, we just see if the system supports making
* the RTAS calls for CPU hotplug. But, there may be a
@ -1175,8 +1167,9 @@ static int __init topology_init(void)
* CPU. For instance, the boot cpu might never be valid
* for hotplugging.
*/
if (ppc_md.cpu_die)
if (smp_ops->cpu_offline_self)
c->hotpluggable = 1;
#endif
if (cpu_online(cpu) || c->hotpluggable) {
register_cpu(c, cpu);

View File

@ -13,13 +13,14 @@
*/
#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/kernel.h>
#include <linux/param.h>
#include <linux/string.h>
#include <linux/mm.h>
#include <linux/interrupt.h>
#include <linux/init.h>
#include <linux/delay.h>
#include <linux/workqueue.h>
#include <asm/io.h>
#include <asm/reg.h>
@ -39,9 +40,7 @@ static struct tau_temp
unsigned char grew;
} tau[NR_CPUS];
struct timer_list tau_timer;
#undef DEBUG
static bool tau_int_enable;
/* TODO: put these in a /proc interface, with some sanity checks, and maybe
* dynamic adjustment to minimize # of interrupts */
@ -50,72 +49,49 @@ struct timer_list tau_timer;
#define step_size 2 /* step size when temp goes out of range */
#define window_expand 1 /* expand the window by this much */
/* configurable values for shrinking the window */
#define shrink_timer 2*HZ /* period between shrinking the window */
#define shrink_timer 2000 /* period between shrinking the window */
#define min_window 2 /* minimum window size, degrees C */
static void set_thresholds(unsigned long cpu)
{
#ifdef CONFIG_TAU_INT
/*
* setup THRM1,
* threshold, valid bit, enable interrupts, interrupt when below threshold
*/
mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | THRM1_TIE | THRM1_TID);
u32 maybe_tie = tau_int_enable ? THRM1_TIE : 0;
/* setup THRM2,
* threshold, valid bit, enable interrupts, interrupt when above threshold
*/
mtspr (SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V | THRM1_TIE);
#else
/* same thing but don't enable interrupts */
mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | THRM1_TID);
mtspr(SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V);
#endif
/* setup THRM1, threshold, valid bit, interrupt when below threshold */
mtspr(SPRN_THRM1, THRM1_THRES(tau[cpu].low) | THRM1_V | maybe_tie | THRM1_TID);
/* setup THRM2, threshold, valid bit, interrupt when above threshold */
mtspr(SPRN_THRM2, THRM1_THRES(tau[cpu].high) | THRM1_V | maybe_tie);
}
static void TAUupdate(int cpu)
{
unsigned thrm;
#ifdef DEBUG
printk("TAUupdate ");
#endif
u32 thrm;
u32 bits = THRM1_TIV | THRM1_TIN | THRM1_V;
/* if both thresholds are crossed, the step_sizes cancel out
* and the window winds up getting expanded twice. */
if((thrm = mfspr(SPRN_THRM1)) & THRM1_TIV){ /* is valid? */
if(thrm & THRM1_TIN){ /* crossed low threshold */
if (tau[cpu].low >= step_size){
tau[cpu].low -= step_size;
tau[cpu].high -= (step_size - window_expand);
}
tau[cpu].grew = 1;
#ifdef DEBUG
printk("low threshold crossed ");
#endif
thrm = mfspr(SPRN_THRM1);
if ((thrm & bits) == bits) {
mtspr(SPRN_THRM1, 0);
if (tau[cpu].low >= step_size) {
tau[cpu].low -= step_size;
tau[cpu].high -= (step_size - window_expand);
}
tau[cpu].grew = 1;
pr_debug("%s: low threshold crossed\n", __func__);
}
if((thrm = mfspr(SPRN_THRM2)) & THRM1_TIV){ /* is valid? */
if(thrm & THRM1_TIN){ /* crossed high threshold */
if (tau[cpu].high <= 127-step_size){
tau[cpu].low += (step_size - window_expand);
tau[cpu].high += step_size;
}
tau[cpu].grew = 1;
#ifdef DEBUG
printk("high threshold crossed ");
#endif
thrm = mfspr(SPRN_THRM2);
if ((thrm & bits) == bits) {
mtspr(SPRN_THRM2, 0);
if (tau[cpu].high <= 127 - step_size) {
tau[cpu].low += (step_size - window_expand);
tau[cpu].high += step_size;
}
tau[cpu].grew = 1;
pr_debug("%s: high threshold crossed\n", __func__);
}
#ifdef DEBUG
printk("grew = %d\n", tau[cpu].grew);
#endif
#ifndef CONFIG_TAU_INT /* tau_timeout will do this if not using interrupts */
set_thresholds(cpu);
#endif
}
#ifdef CONFIG_TAU_INT
@ -140,17 +116,16 @@ void TAUException(struct pt_regs * regs)
static void tau_timeout(void * info)
{
int cpu;
unsigned long flags;
int size;
int shrink;
/* disabling interrupts *should* be okay */
local_irq_save(flags);
cpu = smp_processor_id();
#ifndef CONFIG_TAU_INT
TAUupdate(cpu);
#endif
if (!tau_int_enable)
TAUupdate(cpu);
/* Stop thermal sensor comparisons and interrupts */
mtspr(SPRN_THRM3, 0);
size = tau[cpu].high - tau[cpu].low;
if (size > min_window && ! tau[cpu].grew) {
@ -173,32 +148,26 @@ static void tau_timeout(void * info)
set_thresholds(cpu);
/*
* Do the enable every time, since otherwise a bunch of (relatively)
* complex sleep code needs to be added. One mtspr every time
* tau_timeout is called is probably not a big deal.
*
* Enable thermal sensor and set up sample interval timer
* need 20 us to do the compare.. until a nice 'cpu_speed' function
* call is implemented, just assume a 500 mhz clock. It doesn't really
* matter if we take too long for a compare since it's all interrupt
* driven anyway.
*
* use a extra long time.. (60 us @ 500 mhz)
/* Restart thermal sensor comparisons and interrupts.
* The "PowerPC 740 and PowerPC 750 Microprocessor Datasheet"
* recommends that "the maximum value be set in THRM3 under all
* conditions."
*/
mtspr(SPRN_THRM3, THRM3_SITV(500*60) | THRM3_E);
local_irq_restore(flags);
mtspr(SPRN_THRM3, THRM3_SITV(0x1fff) | THRM3_E);
}
static void tau_timeout_smp(struct timer_list *unused)
static struct workqueue_struct *tau_workq;
static void tau_work_func(struct work_struct *work)
{
/* schedule ourselves to be run again */
mod_timer(&tau_timer, jiffies + shrink_timer) ;
msleep(shrink_timer);
on_each_cpu(tau_timeout, NULL, 0);
/* schedule ourselves to be run again */
queue_work(tau_workq, work);
}
DECLARE_WORK(tau_work, tau_work_func);
/*
* setup the TAU
*
@ -231,21 +200,19 @@ static int __init TAU_init(void)
return 1;
}
tau_int_enable = IS_ENABLED(CONFIG_TAU_INT) &&
!strcmp(cur_cpu_spec->platform, "ppc750");
/* first, set up the window shrinking timer */
timer_setup(&tau_timer, tau_timeout_smp, 0);
tau_timer.expires = jiffies + shrink_timer;
add_timer(&tau_timer);
tau_workq = alloc_workqueue("tau", WQ_UNBOUND, 1, 0);
if (!tau_workq)
return -ENOMEM;
on_each_cpu(TAU_init_smp, NULL, 0);
printk("Thermal assist unit ");
#ifdef CONFIG_TAU_INT
printk("using interrupts, ");
#else
printk("using timers, ");
#endif
printk("shrink_timer: %d jiffies\n", shrink_timer);
queue_work(tau_workq, &tau_work);
pr_info("Thermal assist unit using %s, shrink_timer: %d ms\n",
tau_int_enable ? "interrupts" : "workqueue", shrink_timer);
tau_initialized = 1;
return 0;

View File

@ -75,15 +75,6 @@
#include <linux/clockchips.h>
#include <linux/timekeeper_internal.h>
static u64 rtc_read(struct clocksource *);
static struct clocksource clocksource_rtc = {
.name = "rtc",
.rating = 400,
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
.mask = CLOCKSOURCE_MASK(64),
.read = rtc_read,
};
static u64 timebase_read(struct clocksource *);
static struct clocksource clocksource_timebase = {
.name = "timebase",
@ -447,19 +438,9 @@ void vtime_flush(struct task_struct *tsk)
void __delay(unsigned long loops)
{
unsigned long start;
int diff;
spin_begin();
if (__USE_RTC()) {
start = get_rtcl();
do {
/* the RTCL register wraps at 1000000000 */
diff = get_rtcl() - start;
if (diff < 0)
diff += 1000000000;
spin_cpu_relax();
} while (diff < loops);
} else if (tb_invalid) {
if (tb_invalid) {
/*
* TB is in error state and isn't ticking anymore.
* HMI handler was unable to recover from TB error.
@ -467,8 +448,8 @@ void __delay(unsigned long loops)
*/
spin_cpu_relax();
} else {
start = get_tbl();
while (get_tbl() - start < loops)
start = mftb();
while (mftb() - start < loops)
spin_cpu_relax();
}
spin_end();
@ -614,7 +595,7 @@ void timer_interrupt(struct pt_regs *regs)
irq_work_run();
}
now = get_tb_or_rtc();
now = get_tb();
if (now >= *next_tb) {
*next_tb = ~(u64)0;
if (evt->event_handler)
@ -696,8 +677,6 @@ EXPORT_SYMBOL_GPL(tb_to_ns);
*/
notrace unsigned long long sched_clock(void)
{
if (__USE_RTC())
return get_rtc();
return mulhdu(get_tb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
}
@ -847,11 +826,6 @@ void read_persistent_clock64(struct timespec64 *ts)
}
/* clocksource code */
static notrace u64 rtc_read(struct clocksource *cs)
{
return (u64)get_rtc();
}
static notrace u64 timebase_read(struct clocksource *cs)
{
return (u64)get_tb();
@ -948,12 +922,7 @@ void update_vsyscall_tz(void)
static void __init clocksource_init(void)
{
struct clocksource *clock;
if (__USE_RTC())
clock = &clocksource_rtc;
else
clock = &clocksource_timebase;
struct clocksource *clock = &clocksource_timebase;
if (clocksource_register_hz(clock, tb_ticks_per_sec)) {
printk(KERN_ERR "clocksource: %s is already registered\n",
@ -968,7 +937,7 @@ static void __init clocksource_init(void)
static int decrementer_set_next_event(unsigned long evt,
struct clock_event_device *dev)
{
__this_cpu_write(decrementers_next_tb, get_tb_or_rtc() + evt);
__this_cpu_write(decrementers_next_tb, get_tb() + evt);
set_dec(evt);
/* We may have raced with new irq work */
@ -1071,17 +1040,12 @@ void __init time_init(void)
u64 scale;
unsigned shift;
if (__USE_RTC()) {
/* 601 processor: dec counts down by 128 every 128ns */
ppc_tb_freq = 1000000000;
} else {
/* Normal PowerPC with timebase register */
ppc_md.calibrate_decr();
printk(KERN_DEBUG "time_init: decrementer frequency = %lu.%.6lu MHz\n",
ppc_tb_freq / 1000000, ppc_tb_freq % 1000000);
printk(KERN_DEBUG "time_init: processor frequency = %lu.%.6lu MHz\n",
ppc_proc_freq / 1000000, ppc_proc_freq % 1000000);
}
/* Normal PowerPC with timebase register */
ppc_md.calibrate_decr();
printk(KERN_DEBUG "time_init: decrementer frequency = %lu.%.6lu MHz\n",
ppc_tb_freq / 1000000, ppc_tb_freq % 1000000);
printk(KERN_DEBUG "time_init: processor frequency = %lu.%.6lu MHz\n",
ppc_proc_freq / 1000000, ppc_proc_freq % 1000000);
tb_ticks_per_jiffy = ppc_tb_freq / HZ;
tb_ticks_per_sec = ppc_tb_freq;
@ -1107,7 +1071,7 @@ void __init time_init(void)
tb_to_ns_scale = scale;
tb_to_ns_shift = shift;
/* Save the current timebase to pretty up CONFIG_PRINTK_TIME */
boot_tb = get_tb_or_rtc();
boot_tb = get_tb();
/* If platform provided a timezone (pmac), we correct the time */
if (timezone_offset) {

View File

@ -122,6 +122,13 @@ _GLOBAL(tm_reclaim)
std r3, STK_PARAM(R3)(r1)
SAVE_NVGPRS(r1)
/*
* Save kernel live AMR since it will be clobbered by treclaim
* but can be used elsewhere later in kernel space.
*/
mfspr r3, SPRN_AMR
std r3, TM_FRAME_L1(r1)
/* We need to setup MSR for VSX register save instructions. */
mfmsr r14
mr r15, r14
@ -245,7 +252,7 @@ _GLOBAL(tm_reclaim)
* but is used in signal return to 'wind back' to the abort handler.
*/
/* ******************** CR,LR,CCR,MSR ********** */
/* ***************** CTR, LR, CR, XER ********** */
mfctr r3
mflr r4
mfcr r5
@ -256,7 +263,6 @@ _GLOBAL(tm_reclaim)
std r5, _CCR(r7)
std r6, _XER(r7)
/* ******************** TAR, DSCR ********** */
mfspr r3, SPRN_TAR
mfspr r4, SPRN_DSCR
@ -264,6 +270,10 @@ _GLOBAL(tm_reclaim)
std r3, THREAD_TM_TAR(r12)
std r4, THREAD_TM_DSCR(r12)
/* ******************** AMR **************** */
mfspr r3, SPRN_AMR
std r3, THREAD_TM_AMR(r12)
/*
* MSR and flags: We don't change CRs, and we don't need to alter MSR.
*/
@ -308,7 +318,9 @@ _GLOBAL(tm_reclaim)
std r3, THREAD_TM_TFHAR(r12)
std r4, THREAD_TM_TFIAR(r12)
/* AMR is checkpointed too, but is unsupported by Linux. */
/* Restore kernel live AMR */
ld r8, TM_FRAME_L1(r1)
mtspr SPRN_AMR, r8
/* Restore original MSR/IRQ state & clear TM mode */
ld r14, TM_FRAME_L0(r1) /* Orig MSR */
@ -355,6 +367,13 @@ _GLOBAL(__tm_recheckpoint)
*/
SAVE_NVGPRS(r1)
/*
* Save kernel live AMR since it will be clobbered for trechkpt
* but can be used elsewhere later in kernel space.
*/
mfspr r8, SPRN_AMR
std r8, TM_FRAME_L0(r1)
/* Load complete register state from ts_ckpt* registers */
addi r7, r3, PT_CKPT_REGS /* Thread's ckpt_regs */
@ -404,7 +423,7 @@ _GLOBAL(__tm_recheckpoint)
restore_gprs:
/* ******************** CR,LR,CCR,MSR ********** */
/* ****************** CTR, LR, XER ************* */
ld r4, _CTR(r7)
ld r5, _LINK(r7)
ld r8, _XER(r7)
@ -417,6 +436,10 @@ restore_gprs:
ld r4, THREAD_TM_TAR(r3)
mtspr SPRN_TAR, r4
/* ******************** AMR ******************** */
ld r4, THREAD_TM_AMR(r3)
mtspr SPRN_AMR, r4
/* Load up the PPR and DSCR in GPRs only at this stage */
ld r5, THREAD_TM_DSCR(r3)
ld r6, THREAD_TM_PPR(r3)
@ -509,6 +532,10 @@ restore_gprs:
li r4, MSR_RI
mtmsrd r4, 1
/* Restore kernel live AMR */
ld r8, TM_FRAME_L0(r1)
mtspr SPRN_AMR, r8
REST_NVGPRS(r1)
addi r1, r1, TM_FRAME_SIZE

View File

@ -529,9 +529,6 @@ void system_reset_exception(struct pt_regs *regs)
* Check if the NIP corresponds to the address of a sync
* instruction for which there is an entry in the exception
* table.
* Note that the 601 only takes a machine check on TEA
* (transfer error ack) signal assertion, and does not
* set any of the top 16 bits of SRR1.
* -- paulus.
*/
static inline int check_io_access(struct pt_regs *regs)
@ -796,7 +793,6 @@ int machine_check_generic(struct pt_regs *regs)
case 0x80000:
pr_cont("Machine check signal\n");
break;
case 0: /* for 601 */
case 0x40000:
case 0x140000: /* 7450 MSS error and TEA */
pr_cont("Transfer error ack signal\n");

View File

@ -47,7 +47,6 @@ V_FUNCTION_END(__kernel_get_syscall_map)
*
* returns the timebase frequency in HZ
*/
#ifndef CONFIG_PPC_BOOK3S_601
V_FUNCTION_BEGIN(__kernel_get_tbfreq)
.cfi_startproc
mflr r12
@ -60,4 +59,3 @@ V_FUNCTION_BEGIN(__kernel_get_tbfreq)
blr
.cfi_endproc
V_FUNCTION_END(__kernel_get_tbfreq)
#endif

View File

@ -144,13 +144,11 @@ VERSION
__kernel_datapage_offset;
__kernel_get_syscall_map;
#ifndef CONFIG_PPC_BOOK3S_601
__kernel_gettimeofday;
__kernel_clock_gettime;
__kernel_clock_getres;
__kernel_time;
__kernel_get_tbfreq;
#endif
__kernel_sync_dicache;
__kernel_sync_dicache_p5;
__kernel_sigtramp32;

View File

@ -3530,6 +3530,13 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
*/
asm volatile("eieio; tlbsync; ptesync");
/*
* cp_abort is required if the processor supports local copy-paste
* to clear the copy buffer that was under control of the guest.
*/
if (cpu_has_feature(CPU_FTR_ARCH_31))
asm volatile(PPC_CP_ABORT);
mtspr(SPRN_LPID, vcpu->kvm->arch.host_lpid); /* restore host LPID */
isync();

View File

@ -1830,6 +1830,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_P9_RADIX_PREFETCH_BUG)
2:
#endif /* CONFIG_PPC_RADIX_MMU */
/*
* cp_abort is required if the processor supports local copy-paste
* to clear the copy buffer that was under control of the guest.
*/
BEGIN_FTR_SECTION
PPC_CP_ABORT
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
/*
* POWER7/POWER8 guest -> host partition switch code.
* We don't have to lock against tlbies but we do

View File

@ -21,21 +21,18 @@
static int __patch_instruction(struct ppc_inst *exec_addr, struct ppc_inst instr,
struct ppc_inst *patch_addr)
{
int err = 0;
if (!ppc_inst_prefixed(instr)) {
__put_user_asm(ppc_inst_val(instr), patch_addr, err, "stw");
} else {
__put_user_asm(ppc_inst_as_u64(instr), patch_addr, err, "std");
}
if (err)
return err;
if (!ppc_inst_prefixed(instr))
__put_user_asm_goto(ppc_inst_val(instr), patch_addr, failed, "stw");
else
__put_user_asm_goto(ppc_inst_as_u64(instr), patch_addr, failed, "std");
asm ("dcbst 0, %0; sync; icbi 0,%1; sync; isync" :: "r" (patch_addr),
"r" (exec_addr));
return 0;
failed:
return -EFAULT;
}
int raw_patch_instruction(struct ppc_inst *addr, struct ppc_inst instr)

View File

@ -219,10 +219,13 @@ static nokprobe_inline unsigned long mlsd_8lsd_ea(unsigned int instr,
ea += regs->gpr[ra];
else if (!prefix_r && !ra)
; /* Leave ea as is */
else if (prefix_r && !ra)
else if (prefix_r)
ea += regs->nip;
else if (prefix_r && ra)
; /* Invalid form. Should already be checked for by caller! */
/*
* (prefix_r && ra) is an invalid form. Should already be
* checked for by caller!
*/
return ea;
}

View File

@ -15,6 +15,7 @@
*/
#include <linux/pgtable.h>
#include <linux/init.h>
#include <asm/reg.h>
#include <asm/page.h>
#include <asm/cputable.h>
@ -199,11 +200,9 @@ _GLOBAL(add_hash_page)
* covered by a BAT). -- paulus
*/
mfmsr r9
SYNC
rlwinm r0,r9,0,17,15 /* clear bit 16 (MSR_EE) */
rlwinm r0,r0,0,28,26 /* clear MSR_DR */
mtmsr r0
SYNC_601
isync
#ifdef CONFIG_SMP
@ -262,7 +261,6 @@ _GLOBAL(add_hash_page)
/* reenable interrupts and DR */
mtmsr r9
SYNC_601
isync
lwz r0,4(r1)
@ -287,9 +285,9 @@ _ASM_NOKPROBE_SYMBOL(add_hash_page)
*
* For speed, 4 of the instructions get patched once the size and
* physical address of the hash table are known. These definitions
* of Hash_base and Hash_bits below are just an example.
* of Hash_base and Hash_bits below are for the early hash table.
*/
Hash_base = 0xc0180000
Hash_base = early_hash
Hash_bits = 12 /* e.g. 256kB hash table */
Hash_msk = (((1 << Hash_bits) - 1) * 64)
@ -310,6 +308,7 @@ Hash_msk = (((1 << Hash_bits) - 1) * 64)
#define HASH_LEFT 31-(LG_PTEG_SIZE+Hash_bits-1)
#define HASH_RIGHT 31-LG_PTEG_SIZE
__REF
_GLOBAL(create_hpte)
/* Convert linux-style PTE (r5) to low word of PPC-style PTE (r8) */
rlwinm r8,r5,32-9,30,30 /* _PAGE_RW -> PP msb */
@ -476,6 +475,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
sync /* make sure pte updates get to memory */
blr
.previous
_ASM_NOKPROBE_SYMBOL(create_hpte)
.section .bss
@ -496,6 +496,7 @@ htab_hash_searches:
*
* We assume that there is a hash table in use (Hash != 0).
*/
__REF
_GLOBAL(flush_hash_pages)
/*
* We disable interrupts here, even on UP, because we want
@ -506,11 +507,9 @@ _GLOBAL(flush_hash_pages)
* covered by a BAT). -- paulus
*/
mfmsr r10
SYNC
rlwinm r0,r10,0,17,15 /* clear bit 16 (MSR_EE) */
rlwinm r0,r0,0,28,26 /* clear MSR_DR */
mtmsr r0
SYNC_601
isync
/* First find a PTE in the range that has _PAGE_HASHPTE set */
@ -629,9 +628,9 @@ _GLOBAL(flush_hash_pages)
#endif
19: mtmsr r10
SYNC_601
isync
blr
.previous
EXPORT_SYMBOL(flush_hash_pages)
_ASM_NOKPROBE_SYMBOL(flush_hash_pages)
@ -643,11 +642,9 @@ _GLOBAL(_tlbie)
lwz r8,TASK_CPU(r2)
oris r8,r8,11
mfmsr r10
SYNC
rlwinm r0,r10,0,17,15 /* clear bit 16 (MSR_EE) */
rlwinm r0,r0,0,28,26 /* clear DR */
mtmsr r0
SYNC_601
isync
lis r9,mmu_hash_lock@h
ori r9,r9,mmu_hash_lock@l
@ -664,7 +661,6 @@ _GLOBAL(_tlbie)
li r0,0
stw r0,0(r9) /* clear mmu_hash_lock */
mtmsr r10
SYNC_601
isync
#else /* CONFIG_SMP */
tlbie r3
@ -681,11 +677,9 @@ _GLOBAL(_tlbia)
lwz r8,TASK_CPU(r2)
oris r8,r8,10
mfmsr r10
SYNC
rlwinm r0,r10,0,17,15 /* clear bit 16 (MSR_EE) */
rlwinm r0,r0,0,28,26 /* clear DR */
mtmsr r0
SYNC_601
isync
lis r9,mmu_hash_lock@h
ori r9,r9,mmu_hash_lock@l
@ -709,7 +703,6 @@ _GLOBAL(_tlbia)
li r0,0
stw r0,0(r9) /* clear mmu_hash_lock */
mtmsr r10
SYNC_601
isync
#endif /* CONFIG_SMP */
blr

View File

@ -31,6 +31,8 @@
#include <mm/mmu_decl.h>
u8 __initdata early_hash[SZ_256K] __aligned(SZ_256K) = {0};
struct hash_pte *Hash;
static unsigned long Hash_size, Hash_mask;
unsigned long _SDR1;
@ -73,23 +75,13 @@ unsigned long p_block_mapped(phys_addr_t pa)
static int find_free_bat(void)
{
int b;
int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601)) {
for (b = 0; b < 4; b++) {
struct ppc_bat *bat = BATS[b];
for (b = 0; b < n; b++) {
struct ppc_bat *bat = BATS[b];
if (!(bat[0].batl & 0x40))
return b;
}
} else {
int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
for (b = 0; b < n; b++) {
struct ppc_bat *bat = BATS[b];
if (!(bat[1].batu & 3))
return b;
}
if (!(bat[1].batu & 3))
return b;
}
return -1;
}
@ -97,7 +89,7 @@ static int find_free_bat(void)
/*
* This function calculates the size of the larger block usable to map the
* beginning of an area based on the start address and size of that area:
* - max block size is 8M on 601 and 256 on other 6xx.
* - max block size is 256 on 6xx.
* - base address must be aligned to the block size. So the maximum block size
* is identified by the lowest bit set to 1 in the base address (for instance
* if base is 0x16000000, max size is 0x02000000).
@ -106,7 +98,7 @@ static int find_free_bat(void)
*/
static unsigned int block_size(unsigned long base, unsigned long top)
{
unsigned int max_size = IS_ENABLED(CONFIG_PPC_BOOK3S_601) ? SZ_8M : SZ_256M;
unsigned int max_size = SZ_256M;
unsigned int base_shift = (ffs(base) - 1) & 31;
unsigned int block_shift = (fls(top - base) - 1) & 31;
@ -117,7 +109,6 @@ static unsigned int block_size(unsigned long base, unsigned long top)
* Set up one of the IBAT (block address translation) register pairs.
* The parameters are not checked; in particular size must be a power
* of 2 between 128k and 256M.
* Only for 603+ ...
*/
static void setibat(int index, unsigned long virt, phys_addr_t phys,
unsigned int size, pgprot_t prot)
@ -214,9 +205,6 @@ void mmu_mark_initmem_nx(void)
unsigned long border = (unsigned long)__init_begin - PAGE_OFFSET;
unsigned long size;
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601))
return;
for (i = 0; i < nb - 1 && base < top && top - base > (128 << 10);) {
size = block_size(base, top);
setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
@ -253,9 +241,6 @@ void mmu_mark_rodata_ro(void)
int nb = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
int i;
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601))
return;
for (i = 0; i < nb; i++) {
struct ppc_bat *bat = BATS[i];
@ -294,35 +279,22 @@ void __init setbat(int index, unsigned long virt, phys_addr_t phys,
flags &= ~_PAGE_COHERENT;
bl = (size >> 17) - 1;
if (!IS_ENABLED(CONFIG_PPC_BOOK3S_601)) {
/* 603, 604, etc. */
/* Do DBAT first */
wimgxpp = flags & (_PAGE_WRITETHRU | _PAGE_NO_CACHE
| _PAGE_COHERENT | _PAGE_GUARDED);
wimgxpp |= (flags & _PAGE_RW)? BPP_RW: BPP_RX;
bat[1].batu = virt | (bl << 2) | 2; /* Vs=1, Vp=0 */
bat[1].batl = BAT_PHYS_ADDR(phys) | wimgxpp;
if (flags & _PAGE_USER)
bat[1].batu |= 1; /* Vp = 1 */
if (flags & _PAGE_GUARDED) {
/* G bit must be zero in IBATs */
flags &= ~_PAGE_EXEC;
}
if (flags & _PAGE_EXEC)
bat[0] = bat[1];
else
bat[0].batu = bat[0].batl = 0;
} else {
/* 601 cpu */
if (bl > BL_8M)
bl = BL_8M;
wimgxpp = flags & (_PAGE_WRITETHRU | _PAGE_NO_CACHE
| _PAGE_COHERENT);
wimgxpp |= (flags & _PAGE_RW)?
((flags & _PAGE_USER)? PP_RWRW: PP_RWXX): PP_RXRX;
bat->batu = virt | wimgxpp | 4; /* Ks=0, Ku=1 */
bat->batl = phys | bl | 0x40; /* V=1 */
/* Do DBAT first */
wimgxpp = flags & (_PAGE_WRITETHRU | _PAGE_NO_CACHE
| _PAGE_COHERENT | _PAGE_GUARDED);
wimgxpp |= (flags & _PAGE_RW)? BPP_RW: BPP_RX;
bat[1].batu = virt | (bl << 2) | 2; /* Vs=1, Vp=0 */
bat[1].batl = BAT_PHYS_ADDR(phys) | wimgxpp;
if (flags & _PAGE_USER)
bat[1].batu |= 1; /* Vp = 1 */
if (flags & _PAGE_GUARDED) {
/* G bit must be zero in IBATs */
flags &= ~_PAGE_EXEC;
}
if (flags & _PAGE_EXEC)
bat[0] = bat[1];
else
bat[0].batu = bat[0].batl = 0;
bat_addrs[index].start = virt;
bat_addrs[index].limit = virt + ((bl + 1) << 17) - 1;
@ -425,15 +397,6 @@ void __init MMU_init_hw(void)
hash_mb2 = hash_mb = 32 - LG_HPTEG_SIZE - lg_n_hpteg;
if (lg_n_hpteg > 16)
hash_mb2 = 16 - LG_HPTEG_SIZE;
/*
* When KASAN is selected, there is already an early temporary hash
* table and the switch to the final hash table is done later.
*/
if (IS_ENABLED(CONFIG_KASAN))
return;
MMU_init_hw_patch();
}
void __init MMU_init_hw_patch(void)
@ -441,6 +404,9 @@ void __init MMU_init_hw_patch(void)
unsigned int hmask = Hash_mask >> (16 - LG_HPTEG_SIZE);
unsigned int hash = (unsigned int)Hash - PAGE_OFFSET;
if (!mmu_has_feature(MMU_FTR_HPTE_TABLE))
return;
if (ppc_md.progress)
ppc_md.progress("hash:patch", 0x345);
if (ppc_md.progress)
@ -474,11 +440,7 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
*/
BUG_ON(first_memblock_base != 0);
/* 601 can only access 16MB at the moment */
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601))
memblock_set_current_limit(min_t(u64, first_memblock_size, 0x01000000));
else /* Anything else has 256M mapped */
memblock_set_current_limit(min_t(u64, first_memblock_size, 0x10000000));
memblock_set_current_limit(min_t(u64, first_memblock_size, SZ_256M));
}
void __init print_system_hash_info(void)

Some files were not shown because too many files have changed in this diff Show More