linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-17 12:56:55 +07:00

Author	SHA1	Message	Date
Jeremy Linton	e156ab71a9	arm64: topology: Avoid checking numa mask for scheduler MC selection The numa mask subset check can often lead to system hang or crash during CPU hotplug and system suspend operation if NUMA is disabled. This is mostly observed on HMP systems where the CPU compute capacities are different and end up in different scheduler domains. Since cpumask_of_node is returned instead of core_sibling, the scheduler is confused with incorrect cpumasks (e.g. one CPU in two different sched domains at the same time) on CPU hotplug. Let's disable the NUMA siblings checks for the time being, as NUMA in socket machines have LLC's that will assure that the scheduler topology isn't "borken". The NUMA check exists to assure that if a LLC within a socket crosses NUMA nodes/chiplets the scheduler domains remain consistent. This code will likely have to be re-enabled in the near future once the NUMA mask story is sorted. At the moment it's not necessary because the NUMA in socket machines LLC's are contained within the NUMA domains. Further, as a defensive mechanism during hot-plug, let's assure that the LLC siblings are also masked. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-06-07 17:42:11 +01:00
Sudeep Holla	2520e627db	ACPI / PPTT: fix build when CONFIG_ACPI_PPTT is not enabled Though CONFIG_ACPI_PPTT is selected by platforms and not user visible, it may be useful to support the build with CONFIG_ACPI_PPTT disabled. This patch adds the missing dummy/boilerplate implementation to fix the build. Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-06-05 18:06:24 +01:00
Arnd Bergmann	94a5d8790e	arm64: cpu_errata: include required headers Without including psci.h and arm-smccc.h, we now get a build failure in some configurations: arch/arm64/kernel/cpu_errata.c: In function 'arm64_update_smccc_conduit': arch/arm64/kernel/cpu_errata.c:278:10: error: 'psci_ops' undeclared (first use in this function); did you mean 'sysfs_ops'? arch/arm64/kernel/cpu_errata.c: In function 'arm64_set_ssbd_mitigation': arch/arm64/kernel/cpu_errata.c:311:3: error: implicit declaration of function 'arm_smccc_1_1_hvc' [-Werror=implicit-function-declaration] arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_2, state, NULL); Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-06-05 16:51:31 +01:00
Catalin Marinas	38500be10e	arm64: KVM: Move VCPU_WORKAROUND_2_FLAG macros to the top of the file This is to avoid potential merging conflicts between commit `55e3748e89` ("arm64: KVM: Add ARCH_WORKAROUND_2 support for guests") and the KVM tree. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-06-02 10:42:54 +01:00
Dave Martin	94b07c1f8c	arm64: signal: Report signal frame size to userspace via auxv Stateful CPU architecture extensions may require the signal frame to grow to a size that exceeds the arch's MINSIGSTKSZ #define. However, changing this #define is an ABI break. To allow userspace the option of determining the signal frame size in a more forwards-compatible way, this patch adds a new auxv entry tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame size that the process can observe during its lifetime. If AT_MINSIGSTKSZ is absent from the aux vector, the caller can assume that the MINSIGSTKSZ #define is sufficient. This allows for a consistent interface with older kernels that do not provide AT_MINSIGSTKSZ. The idea is that libc could expose this via sysconf() or some similar mechanism. There is deliberately no AT_SIGSTKSZ. The kernel knows nothing about userspace's own stack overheads and should not pretend to know. For arm64: The primary motivation for this interface is the Scalable Vector Extension, which can require at least 4KB or so of extra space in the signal frame for the largest hardware implementations. To determine the correct value, a "Christmas tree" mode (via the add_all argument) is added to setup_sigframe_layout(), to simulate addition of all possible records to the signal frame at maximum possible size. If this procedure goes wrong somehow, resulting in a stupidly large frame layout and hence failure of sigframe_alloc() to allocate a record to the frame, then this is indicative of a kernel bug. In this case, we WARN() and no attempt is made to populate AT_MINSIGSTKSZ for userspace. For arm64 SVE: The SVE context block in the signal frame needs to be considered too when computing the maximum possible signal frame size. 
Because the size of this block depends on the vector length, this patch computes the size based not on the thread's current vector length but instead on the maximum possible vector length: this determines the maximum size of SVE context block that can be observed in any signal frame for the lifetime of the process. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-06-01 15:53:10 +01:00
Dave Martin	87c021a814	arm64/sve: Thin out initialisation sanity-checks for sve_max_vl Now that the kernel SVE support is reasonably mature, it is excessive to default sve_max_vl to the invalid value -1 and then sprinkle WARN_ON()s around the place to make sure it has been initialised before use. The cpufeatures code already runs pretty early, and will ensure sve_max_vl gets initialised. This patch initialises sve_max_vl to something sane that will be supported by every SVE implementation, and removes most of the sanity checks. The checks in find_supported_vector_length() are retained for now. If anything goes horribly wrong, we are likely to trip a check here sooner or later. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-06-01 15:53:07 +01:00
Catalin Marinas	cb877710e5	Merge branch 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux - perf/arm-cci: allow building as module - perf/arm-ccn: demote dev_warn() to dev_dbg() in event_init() - miscellaneous perf/arm cleanups * 'for-next/perf' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux: ARM: mcpm, perf/arm-cci: export mcpm_is_available drivers/bus: arm-cci: fix build warnings drivers/perf: Remove ARM_SPE_PMU explicit PERF_EVENTS dependency drivers/perf: arm-ccn: don't log to dmesg in event_init perf/arm-cci: Allow building as a module perf/arm-cci: Remove pointless PMU disabling perf/arm-cc*: Fix MODULE_LICENSE() tags arm_pmu: simplify arm_pmu::handle_irq perf/arm-cci: Remove unnecessary period adjustment perf: simplify getting .drvdata	2018-05-31 18:09:38 +01:00
Marc Zyngier	5d81f7dc9b	arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID Now that all our infrastructure is in place, let's expose the availability of ARCH_WORKAROUND_2 to guests. We take this opportunity to tidy up a couple of SMCCC constants. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 18:00:59 +01:00
Marc Zyngier	b4f18c063a	arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests In order to forward the guest's ARCH_WORKAROUND_2 calls to EL3, add a small(-ish) sequence to handle it at EL2. Special care must be taken to track the state of the guest itself by updating the workaround flags. We also rely on patching to enable calls into the firmware. Note that since we need to execute branches, this always executes after the Spectre-v2 mitigation has been applied. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 18:00:57 +01:00
Marc Zyngier	55e3748e89	arm64: KVM: Add ARCH_WORKAROUND_2 support for guests In order to offer ARCH_WORKAROUND_2 support to guests, we need a bit of infrastructure. Let's add a flag indicating whether or not the guest uses SSBD mitigation. Depending on the state of this flag, allow KVM to disable ARCH_WORKAROUND_2 before entering the guest, and enable it when exiting it. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 18:00:55 +01:00
Marc Zyngier	85478bab40	arm64: KVM: Add HYP per-cpu accessors As we're going to require to access per-cpu variables at EL2, let's craft the minimum set of accessors required to implement reading a per-cpu variable, relying on tpidr_el2 to contain the per-cpu offset. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 18:00:53 +01:00
Marc Zyngier	9cdc0108ba	arm64: ssbd: Add prctl interface for per-thread mitigation If running on a system that performs dynamic SSBD mitigation, allow userspace to request the mitigation for itself. This is implemented as a prctl call, allowing the mitigation to be enabled or disabled at will for this particular thread. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 18:00:52 +01:00
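A userspace view of the per-thread prctl interface might look like the sketch below. It assumes the generic speculation-control prctl constants from `<linux/prctl.h>` (fallback definitions are given for older headers); the two wrapper function names are hypothetical. Note that `PR_SPEC_DISABLE` disables the speculation feature, which means the mitigation is turned on for the calling thread.

```c
#include <assert.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values match <linux/prctl.h>. */
#ifndef PR_GET_SPECULATION_CTRL
#define PR_GET_SPECULATION_CTRL 52
#define PR_SET_SPECULATION_CTRL 53
#endif
#ifndef PR_SPEC_STORE_BYPASS
#define PR_SPEC_STORE_BYPASS 0
#endif
#ifndef PR_SPEC_DISABLE
#define PR_SPEC_DISABLE (1UL << 2)
#endif

/* Hypothetical wrapper: returns a bitmask describing the current state,
 * or -1 (with errno set) if the kernel or CPU does not support it. */
int query_store_bypass_state(void)
{
    return prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0);
}

/* Hypothetical wrapper: ask for the mitigation on this thread.
 * Returns 0 on success, -1 if the request is rejected. */
int request_store_bypass_mitigation(void)
{
    return prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
                 PR_SPEC_DISABLE, 0, 0);
}
```

Both calls can fail on systems without dynamic mitigation, so callers should treat -1 as "not available" rather than as a fatal error.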
Marc Zyngier	9dd9614f54	arm64: ssbd: Introduce thread flag to control userspace mitigation In order to allow userspace to be mitigated on demand, let's introduce a new thread flag that prevents the mitigation from being turned off when exiting to userspace, and doesn't turn it on on entry into the kernel (with the assumption that the mitigation is always enabled in the kernel itself). This will be used by a prctl interface introduced in a later patch. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:35:32 +01:00
Marc Zyngier	647d0519b5	arm64: ssbd: Restore mitigation status on CPU resume On a system where firmware can dynamically change the state of the mitigation, the CPU will always come up with the mitigation enabled, including when coming back from suspend. If the user has requested "no mitigation" via a command line option, let's enforce it by calling into the firmware again to disable it. Similarly, for a resume from hibernate, the mitigation could have been disabled by the boot kernel. Let's ensure that it is set back on in that case. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:35:19 +01:00
Marc Zyngier	986372c436	arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation In order to avoid checking arm64_ssbd_callback_required on each kernel entry/exit even if no mitigation is required, let's add yet another alternative that by default jumps over the mitigation, and that gets nop'ed out if we're doing dynamic mitigation. Think of it as a poor man's static key... Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:35:06 +01:00
Marc Zyngier	c32e1736ca	arm64: ssbd: Add global mitigation state accessor We're about to need the mitigation state in various parts of the kernel in order to do the right thing for userspace and guests. Let's expose an accessor that will let other subsystems know about the state. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:34:57 +01:00
Marc Zyngier	a43ae4dfe5	arm64: Add 'ssbd' command-line option On a system where the firmware implements ARCH_WORKAROUND_2, it may be useful to either permanently enable or disable the workaround for cases where the user decides that they'd rather not get a trap overhead, and keep the mitigation permanently on or off instead of switching it on exception entry/exit. In any case, default to the mitigation being enabled. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:34:49 +01:00
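The three-way choice the 'ssbd' option offers can be sketched as a small parser. The option strings `force-on`, `force-off` and `kernel` are the ones listed in the kernel's parameter documentation; the enum and function names below are hypothetical and do not mirror the kernel's own identifiers. Unknown or missing values fall back to the default, in which the mitigation stays enabled for the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the three policies. */
enum ssbd_mode {
    SSBD_KERNEL,     /* default: switched on exception entry/exit */
    SSBD_FORCE_ON,   /* mitigation permanently enabled */
    SSBD_FORCE_OFF   /* mitigation permanently disabled */
};

/* Map the value of "ssbd=<value>" to a policy; anything unrecognised
 * (including NULL) yields the default. */
enum ssbd_mode parse_ssbd_option(const char *arg)
{
    if (arg != NULL) {
        if (strcmp(arg, "force-on") == 0)
            return SSBD_FORCE_ON;
        if (strcmp(arg, "force-off") == 0)
            return SSBD_FORCE_OFF;
    }
    return SSBD_KERNEL;
}
```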
Marc Zyngier	a725e3dda1	arm64: Add ARCH_WORKAROUND_2 probing As for Spectre variant-2, we rely on SMCCC 1.1 to provide the discovery mechanism for detecting the SSBD mitigation. A new capability is also allocated for that purpose, and a config option. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:34:38 +01:00
Marc Zyngier	5cf9ce6e5e	arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2 In a heterogeneous system, we can end up with both affected and unaffected CPUs. Let's check their status before calling into the firmware. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:34:27 +01:00
Marc Zyngier	8e2906245f	arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1 In order for the kernel to protect itself, let's call the SSBD mitigation implemented by the higher exception level (either hypervisor or firmware) on each transition between userspace and kernel. We must take the PSCI conduit into account in order to target the right exception level, hence the introduction of a runtime patching callback. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:34:01 +01:00
Marc Zyngier	eff0e9e107	arm/arm64: smccc: Add SMCCC-specific return codes We've so far used the PSCI return codes for SMCCC because they were extremely similar. But with the new ARM DEN 0070A specification, "NOT_REQUIRED" (-2) is clashing with PSCI's "PSCI_RET_INVALID_PARAMS". Let's bite the bullet and add SMCCC specific return codes. Users can be repainted as and when required. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-31 17:30:44 +01:00
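The clash that motivates separate return codes can be shown with a few constants. The numeric values below follow the SMCCC and PSCI definitions as used in the kernel headers; the two decoding helpers are hypothetical and exist only to show that the same raw value needs a different reading depending on which interface produced it.

```c
#include <assert.h>
#include <string.h>

/* SMCCC-specific return codes. */
#define SMCCC_RET_SUCCESS        0
#define SMCCC_RET_NOT_SUPPORTED -1
#define SMCCC_RET_NOT_REQUIRED  -2

/* PSCI reuses -2 with a different meaning. */
#define PSCI_RET_INVALID_PARAMS -2

/* Hypothetical decoder for a result from an SMCCC arch call. */
const char *smccc_strerror(long ret)
{
    switch (ret) {
    case SMCCC_RET_SUCCESS:       return "success";
    case SMCCC_RET_NOT_SUPPORTED: return "not supported";
    case SMCCC_RET_NOT_REQUIRED:  return "not required";
    default:                      return "unknown";
    }
}

/* Hypothetical decoder for the same raw value coming from PSCI. */
const char *psci_strerror(long ret)
{
    return ret == PSCI_RET_INVALID_PARAMS ? "invalid parameters"
                                          : "other";
}
```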
Arnd Bergmann	73acc0315c	ARM: mcpm, perf/arm-cci: export mcpm_is_available Now that the ARM CCI PMU driver can be built as a loadable module, we get a link failure when MCPM is enabled: ERROR: "mcpm_is_available" [drivers/perf/arm-cci.ko] undefined! The simplest fix is to export that helper function. Fixes: `8b0c93c20e` ("perf/arm-cci: Allow building as a module") Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-29 16:53:16 +01:00
Arnd Bergmann	984e9cf1b9	drivers/bus: arm-cci: fix build warnings When the arm-cci driver is enabled, but both CONFIG_ARM_CCI5xx_PMU and CONFIG_ARM_CCI400_PMU are not, we get a warning about how parts of the driver are never used: drivers/perf/arm-cci.c:1454:29: error: 'cci_pmu_models' defined but not used [-Werror=unused-variable] drivers/perf/arm-cci.c:693:16: error: 'cci_pmu_event_show' defined but not used [-Werror=unused-function] drivers/perf/arm-cci.c:685:16: error: 'cci_pmu_format_show' defined but not used [-Werror=unused-function] Marking all three functions as __maybe_unused avoids the warnings in randconfig builds. I'm doing this lacking any ideas for a better fix. Fixes: `3de6be7a3d` ("drivers/bus: Split Arm CCI driver") Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-29 16:38:16 +01:00
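The effect of marking a definition as possibly unused can be reproduced outside the kernel with the underlying compiler attribute. In this sketch `MAYBE_UNUSED`, `HYPOTHETICAL_PMU_ENABLED` and both functions are made-up names; the kernel's `__maybe_unused` macro expands to the same GCC/Clang attribute.

```c
#include <assert.h>

/* Stand-in for the kernel's __maybe_unused. */
#define MAYBE_UNUSED __attribute__((unused))

/* Only referenced when the (hypothetical) config macro is defined.
 * Without the attribute, -Werror=unused-function would reject the
 * build in configurations where the macro is not set. */
static MAYBE_UNUSED int pmu_event_show(int event)
{
    return event * 2;
}

int visible_events(void)
{
#ifdef HYPOTHETICAL_PMU_ENABLED
    return pmu_event_show(21);
#else
    return 0;
#endif
}
```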
Mark Rutland	c870f14ea1	arm64: Unify kernel fault reporting In do_page_fault(), we handle some kernel faults early, and simply die() with a message. For faults handled later, we dump the faulting address, decode the ESR, walk the page tables, and perform a number of steps to ensure that this data is reported. Let's unify the handling of fatal kernel faults with a new die_kernel_fault() helper, handling all of these details. This is largely the same as the existing logic in __do_kernel_fault(), except that addresses are consistently padded to 16 hex characters, as would be expected for a 64-bit address. The messages currently logged in do_page_fault are adjusted to fit into the die_kernel_fault() message template. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-23 11:46:42 +01:00
Mark Rutland	969e61ba87	arm64: make is_permission_fault() name clearer The naming of is_permission_fault() makes it sound like it should return true for permission faults from EL0, but by design, it only does so for faults from EL1. Let's make this clear by dropping el1 in the name, as we do for is_el1_instruction_abort(). Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-23 11:46:07 +01:00
Will Deacon	7bd99b4034	arm64: Kconfig: Enable LSE atomics by default Now that we're seeing CPUs shipping with LSE atomics, default them to 'on' in Kconfig. CPUs without the instructions will continue to use LDXR/STXR-based sequences, but they will be placed out-of-line by the compiler. Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-23 11:33:45 +01:00
John Garry	b89205bd50	drivers/perf: Remove ARM_SPE_PMU explicit PERF_EVENTS dependency Since commit `bddb9b68d3` ("drivers/perf: commonise PERF_EVENTS dependency"), all perf drivers depend on PERF_EVENTS config under a common menu. Config ARM_SPE_PMU still declares explicitly a dependency on PERF_EVENTS, which is unneeded, so remove it. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-22 17:11:12 +01:00
Mark Rutland	1898eb61fb	drivers/perf: arm-ccn: don't log to dmesg in event_init The ARM CCN PMU driver uses dev_warn() to complain about parameters in the user-provided perf_event_attr. This means that under normal operation (e.g. a single invocation of the perf tool), a number of warning messages may be logged to dmesg. Tools may issue multiple syscalls to probe for feature support, and multiple applications (from multiple users) can attempt to open events simultaneously, so this is not very helpful, even if a user happens to have access to dmesg. Worse, this can push important information out of the dmesg ring buffer, and can significantly slow down syscall fuzzers, vastly increasing the time it takes to find critical bugs. Demote the dev_warn() instances to dev_dbg(), as is the case for all other PMU drivers under drivers/perf/. Users who wish to debug PMU event initialisation can enable dynamic debug to receive these messages. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-21 18:21:32 +01:00
Robin Murphy	8b0c93c20e	perf/arm-cci: Allow building as a module Fill in the few extra bits and annotations needed to make the driver work properly as a module, and jiggle the Kconfig to expose the driver-level ARM_CCI_PMU option. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-21 18:12:54 +01:00
Robin Murphy	28c01dc9d8	perf/arm-cci: Remove pointless PMU disabling The CCI PMU driver bears some legacy remnants of the arm_pmu framework from when it was split in `c6f85cb430` ("bus: cci: move away from arm_pmu framework"). In particular this perf_pmu_{dis,en}able() dance around pmu->add which was fixed for arm_pmu in `a9e469d1c8` ("drivers/perf: arm_pmu: remove pointless PMU disabling"). For the exact same reasons (i.e. perf core already does this around the call anyway), give cci_pmu_add() the exact same change, which also prevents having to export those core functions to build it as a module. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-21 18:12:53 +01:00
Robin Murphy	75dc344145	perf/arm-cc*: Fix MODULE_LICENSE() tags The CCI/CCN drivers are licensed under GPLv2, but the MODULE_LICENSE() tags are using the bare "GPL" string implying GPLv2 or later. Fix them to match their actual file license. Acked-by: Pawel Moll <pawel.moll@arm.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-21 18:12:53 +01:00
Mark Rutland	0788f1e973	arm_pmu: simplify arm_pmu::handle_irq The arm_pmu::handle_irq() callback has the same prototype as a generic IRQ handler, taking the IRQ number and a void pointer argument which it must convert to an arm_pmu pointer. This means that all arm_pmu::handle_irq() take an IRQ number they never use, and all must explicitly cast the void pointer to an arm_pmu pointer. Instead, let's change arm_pmu::handle_irq to take an arm_pmu pointer, allowing these casts to be removed. The redundant IRQ number parameter is also removed. Suggested-by: Hoeun Ryu <hoeun.ryu@lge.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-21 18:07:05 +01:00
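The shape of this callback change can be sketched with mock types. Everything here (`struct fake_pmu`, `count_overflow()`, `dispatch_irq()`) is hypothetical and only illustrates the idea: the generic IRQ entry point converts the void pointer once, and the per-PMU callback receives a typed pointer with no IRQ number.

```c
#include <assert.h>
#include <stddef.h>

/* Mock PMU: the callback takes a typed pointer instead of
 * (int irq, void *dev). */
struct fake_pmu {
    int overflow_count;
    int (*handle_irq)(struct fake_pmu *pmu);
};

/* Example callback: no cast and no unused IRQ number. */
int count_overflow(struct fake_pmu *pmu)
{
    pmu->overflow_count++;
    return 1; /* handled */
}

/* Generic-style entry point: the single place where the void pointer
 * is converted and the IRQ number is dropped. */
int dispatch_irq(int irq, void *dev)
{
    struct fake_pmu *pmu = dev;

    (void)irq;
    assert(pmu != NULL && pmu->handle_irq != NULL);
    return pmu->handle_irq(pmu);
}
```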
Robin Murphy	5c591304e7	perf/arm-cci: Remove unnecessary period adjustment Since sampling events are rejected up-front by cci_pmu_event_init(), it doesn't make much sense to go fiddling with the sampling period later. This would seem to be just another leftover artefact of the arm_pmu framework, and as such can go. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-21 18:06:11 +01:00
Wolfram Sang	d0f2e42329	perf: simplify getting .drvdata We should get drvdata from struct device directly. Going via platform_device is an unneeded step back and forth. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2018-05-21 18:02:35 +01:00
Dave Martin	159fd7b8d3	arm64/sve: Write ZCR_EL1 on context switch only if changed Writes to ZCR_EL1 are self-synchronising, and so may be expensive in typical implementations. This patch adopts the approach used for costly system register writes elsewhere in the kernel: the system register write is suppressed if it would not change the stored value. Since the common case will be that of switching between tasks that use the same vector length as one another, prediction hit rates on the conditional branch should be reasonably good, with lower expected amortised cost than the unconditional execution of a heavyweight self-synchronising instruction. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 18:19:53 +01:00
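The write-suppression approach can be illustrated with a simulated register. This is a sketch under stated assumptions: `struct fake_sysreg` and both functions are hypothetical stand-ins, with a write counter taking the place of the cost of a self-synchronising system register write.

```c
#include <assert.h>

/* Simulated system register; "writes" counts the expensive operations. */
struct fake_sysreg {
    unsigned long value;
    unsigned long writes;
};

/* Stand-in for the costly self-synchronising write. */
static void sysreg_write(struct fake_sysreg *reg, unsigned long value)
{
    reg->value = value;
    reg->writes++;
}

/* Only write when the stored value would actually change; the compare
 * and branch are assumed to be much cheaper than the write itself. */
void sysreg_write_if_changed(struct fake_sysreg *reg, unsigned long value)
{
    if (reg->value != value)
        sysreg_write(reg, value);
}
```

Switching between two tasks with the same vector length then costs a read and a predictable branch rather than a register write.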
Jeremy Linton	37c3ec2d81	arm64: topology: divorce MC scheduling domain from core_siblings Now that we have an accurate view of the physical topology we need to represent it correctly to the scheduler. Generally MC should equal the LLC in the system, but there are a number of special cases that need to be dealt with. In the case of NUMA in socket, we need to assure that the sched domain we build for the MC layer isn't larger than the DIE above it. Similarly for LLC's that might exist in cross socket interconnect or directory hardware we need to assure that MC is shrunk to the socket or NUMA node. This patch builds a sibling mask for the LLC, and then picks the smallest of LLC, socket siblings, or NUMA node siblings, which gives us the behavior described above. This is ever so slightly different than the similar alternative where we look for a cache layer less than or equal to the socket/NUMA siblings. The logic to pick the MC layer affects all arm64 machines, but only changes the behavior for DT/MPIDR systems if the NUMA domain is smaller than the core siblings (generally set to the cluster). Potentially this fixes a possible bug in DT systems, but really it only affects ACPI systems where the core siblings is correctly set to the socket siblings. Thus all currently available ACPI systems should have MC equal to LLC, including the NUMA in socket machines where the LLC is partitioned between the NUMA nodes. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
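The "pick the smallest of LLC, socket siblings, or NUMA node siblings" rule can be sketched with plain 64-bit masks, one bit per CPU. This is an illustration rather than the kernel's code: the function names are hypothetical, `uint64_t` stands in for `struct cpumask`, and `llc_valid` models the case where no LLC information is available.

```c
#include <assert.h>
#include <stdint.h>

/* True when every CPU in a is also in b. */
static int mask_is_subset(uint64_t a, uint64_t b)
{
    return (a & ~b) == 0;
}

/* Start from the NUMA node mask, then narrow to the socket siblings and
 * to the LLC siblings whenever they fit inside the current choice. */
uint64_t pick_mc_mask(uint64_t numa_node, uint64_t core_siblings,
                      uint64_t llc_siblings, int llc_valid)
{
    uint64_t mask = numa_node;

    if (mask_is_subset(core_siblings, mask))
        mask = core_siblings;
    if (llc_valid && mask_is_subset(llc_siblings, mask))
        mask = llc_siblings;
    return mask;
}
```

With a NUMA-in-socket layout (node `0x0F`, socket `0xFF`) the MC level never grows beyond the node, and an LLC that spans the whole socket is shrunk to the node as well.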
Jeremy Linton	bce1a65172	ACPI: Add PPTT to injectable table list Add ACPI_SIG_PPTT to the table so initrd's can override the system topology. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Geoffrey Blake <geoffrey.blake@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
Jeremy Linton	2f0a5d107e	arm64: topology: enable ACPI/PPTT based CPU topology Propagate the topology information from the PPTT tree to the cpu_topology array. We can get the thread id and core_id by assuming certain levels of the PPTT tree correspond to those concepts. The package_id is flagged in the tree and can be found by calling find_acpi_cpu_topology_package() which terminates its search when it finds an ACPI node flagged as the physical package. If the tree doesn't contain enough levels to represent all of the requested levels then the root node will be returned for all subsequent levels. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
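The tree walk described above can be sketched with a mock hierarchy. `struct topo_node` and both functions are hypothetical simplifications of the PPTT structures: each node only knows its parent and whether it is flagged as the physical package. The second helper shows the stated fallback, where asking for more levels than the tree has yields the root node.

```c
#include <assert.h>
#include <stddef.h>

/* Mock PPTT processor node. */
struct topo_node {
    const struct topo_node *parent;  /* NULL for the root */
    int is_physical_package;
    int id;
};

/* Walk up until a node flagged as the physical package is found; if no
 * node carries the flag, the root is returned. */
const struct topo_node *find_package(const struct topo_node *cpu)
{
    const struct topo_node *node = cpu;

    assert(node != NULL);
    while (node->parent != NULL && !node->is_physical_package)
        node = node->parent;
    return node;
}

/* Walk up a fixed number of levels, stopping at the root when the tree
 * is shallower than requested. */
const struct topo_node *ancestor_at_level(const struct topo_node *cpu,
                                          unsigned int level)
{
    const struct topo_node *node = cpu;

    assert(node != NULL);
    while (level-- > 0 && node->parent != NULL)
        node = node->parent;
    return node;
}
```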
Jeremy Linton	868abc0768	arm64: topology: rename cluster_id The cluster concept isn't architecturally defined for arm64. Let's match the name of the arm64 topology field to the kernel macro that uses it. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
Jeremy Linton	8571890e15	arm64: Add support for ACPI based firmware tables The /sys cache entries should support ACPI/PPTT generated cache topology information. For arm64, if ACPI is enabled, determine the max number of cache levels and populate them using the PPTT table if one is available. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
Jeremy Linton	582b468bdc	drivers: base cacheinfo: Add support for ACPI based firmware tables Call ACPI cache parsing routines from base cacheinfo code if ACPI is enabled. Also stub out cache_setup_acpi and acpi_find_last_cache_level so that individual architectures can enable ACPI topology parsing. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
Jeremy Linton	0ce8223223	ACPI: Enable PPTT support on ARM64 Now that we have a PPTT parser, in preparation for its use on arm64, let's build it. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
Jeremy Linton	2bd00bcd73	ACPI/PPTT: Add Processor Properties Topology Table parsing ACPI 6.2 adds a new table, which describes how processing units are related to each other in tree like fashion. Caches are also sprinkled throughout the tree and describe the properties of the caches in relation to other caches and processing units. Add the code to parse the cache hierarchy and report the total number of levels of cache for a given core using acpi_find_last_cache_level() as well as fill out the individual cores cache information with cache_setup_acpi() once the cpu_cacheinfo structure has been populated by the arch specific code. An additional patch later in the set adds the ability to report peers in the topology using find_acpi_cpu_topology() to report a unique ID for each processing unit at a given level in the tree. These unique id's can then be used to match related processing units which exist as threads, within a given package, etc. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
Jeremy Linton	30d87bfacb	arm64/acpi: Create arch specific cpu to acpi id helper It's helpful to be able to look up the acpi_processor_id associated with a logical cpu. Provide an arm64 helper to do this. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
Jeremy Linton	9b97387c5c	cacheinfo: rename of_node to fw_token Rename and change the type of of_node to indicate it is a generic pointer which is generally only used for comparison purposes. In a later patch we will put an ACPI/PPTT token pointer in fw_token so that the code which builds the shared cpu masks can be reused. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:28:09 +01:00
Jeremy Linton	2ff075c7df	drivers: base: cacheinfo: setup DT cache properties early The original intent in cacheinfo was that an architecture specific populate_cache_leaves() would probe the hardware and then cache_shared_cpu_map_setup() and cache_override_properties() would provide firmware help to extend/expand upon what was probed. Arm64 was really the only architecture that was working this way, and with the removal of most of the hardware probing logic it became clear that it was possible to simplify the logic a bit. This patch combines the walk of the DT nodes with the code updating the cache size/line_size and nr_sets. cache_override_properties() (which was DT specific) is then removed. The result is that cacheinfo.of_node is no longer used as a temporary place to hold DT references for future calls that update cache properties. That change helps to clarify its one remaining use (matching cacheinfo nodes that represent shared caches) which will be used by the ACPI/PPTT code in the following patches. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:27:49 +01:00
Jeremy Linton	d529a18a61	drivers: base: cacheinfo: move cache_setup_of_node() In preparation for the next patch, and to aid in review of that patch, let's move cache_setup_of_node further down in the module without any changes. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Vijaya Kumar K <vkilari@codeaurora.org> Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-17 17:06:49 +01:00
Will Deacon	1cfc63b5ae	arm64: cmpwait: Clear event register before arming exclusive monitor When waiting for a cacheline to change state in cmpwait, we may immediately wake-up the first time around the outer loop if the event register was already set (for example, because of the event stream). Avoid these spurious wakeups by explicitly clearing the event register before loading the cacheline and setting the exclusive monitor. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-16 12:21:19 +01:00
Robin Murphy	e75bef2a4f	arm64: Select ARCH_HAS_FAST_MULTIPLIER It is probably safe to assume that all Armv8-A implementations have a multiplier whose efficiency is comparable to or better than a sequence of three or so register-dependent arithmetic instructions. Select ARCH_HAS_FAST_MULTIPLIER to get ever-so-slightly nicer codegen in the few dusty old corners which care. In a contrived benchmark calling hweight64() in a loop, this does indeed turn out to be a small win overall, with no measurable impact on Cortex-A57 but about 5% performance improvement on Cortex-A53. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-16 11:50:52 +01:00
Vincenzo Frascino	92faa7bea3	arm64: Remove duplicate include "make includecheck" detected a few duplicated includes in arch/arm64. This patch removes the double inclusions. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2018-05-15 18:18:00 +01:00

753112 Commits