- Update of the ACPICA code in the kernel to upstream revision 20160831 with
the following major changes:
* New mechanism for GPE masking.
* Fixes for issues related to the LoadTable operator and table loading.
* Fixes for issues related to so-called module-level code (MLC), that is
AML that doesn't belong to any methods.
* Change of the return value of the _OSI method to reflect the Windows
behavior.
* GAS (Generic Address Structure) support fix related to 32-bit FADT
addresses.
* Elimination of unnecessary FADT version 2 support.
* ACPI tools fixes and cleanups.
From Bob Moore, Lv Zheng, and Jung-uk Kim.
- ACPI sysfs interface updates to fix GPE handling (on top of the new GPE
masking mechanism in ACPICA) and issues related to table loading (Lv Zheng).
- New watchdog driver based on the ACPI WDAT (ACPI Watchdog Action Table),
needed on some platforms to replace the iTCO watchdog that doesn't work there
and related updates of the intel_pmc_ipc, i2c/i801 and MFD/lcp_ich drivers
(Mika Westerberg).
- Driver core fix to prevent it from leaking secondary fwnode objects during
device removal (Lukas Wunner).
- New definitions of built-in properties for UART in ACPI-based x86 SoC drivers
and a 8250_dw driver quirk for the APM X-Gene SoC (Heikki Krogerus).
- New device ID for the Vulcan SPI controller and constification of local
strucures in the AMD SoC (APD) ACPI driver (Kamlakant Patel, Julia Lawall).
- Fix for a bug causing the allocation of PCI resorces to fail if
ACPI-enumerated child platform devices are registered below the PCI
devices in question (Mika Westerberg).
- Change of the default polarity for PCI legacy IRQs to high on systems
booting wth ACPI on platforms with a GIC interrupt controller model
fixing the discrepancy between the specification and HW behavior (Lorenzo
Pieralisi).
- Fixes for the handling of system suspend/resume in the ACPI EC driver and
update of that driver to make it cope with the cases when the EC device
defined in the ECDT has to be used throughout the entire system life cycle
(Lv Zheng).
- Update of the ACPI CPPC library to allow it to batch requests sent over the
PCC channel (to reduce overhead), to support the fixed functional hardware
(FFH) CPPC registers access type, to notify the mailbox framework about TX
completions when the interrupt flag is set for the PCC mailbox, and to
support HW-Reduced Communication Subspace type 2 (Ashwin Chaugule, Prashanth
Prakash, Srinivas Pandruvada, Hoan Tran).
- ACPI button driver fix and documentation update related to the handling of
laptop lids (Lv Zheng).
- ACPI battery driver initialization fix (Carlos Garnacho).
- ACPI GPIO enumeration documentation update (Mika Westerberg).
- Assorted updates of the core ACPI bus type code (Lukas Wunner, Lv Zheng).
- Assorted cleanups of the ACPI table parsing code and the x86-specific ACPI
code (Al Stone).
- Fixes for assorted ACPI-related issues found in linux-next (Wei Yongjun).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJX8Y5+AAoJEILEb/54YlRx73oP/RiAi86NKjOj+GfYceVe37jn
6lSqoMugjgTQHRYvYiQCjJ/BR0GzQZqUkz9TAu1Op14+rhTH3OhSfPizzJWCpVfA
G9l9ZRQNnsKNs14bbYmWtmWduh46dFLVFJqo+M/0H3ZMFZu6Adcb+1SBtXHUoQ6L
z69ngFxTu3yRvqS4cmm5h7SOx5W2uZZl8zViJW8jgyGhUBStG87gzR6wsYBldGCk
XFxcaGWBXRccWGAQLSwfs0psQccEooCqbpsDqaUdrK/mI0rsQr88f25ZxEE7Zw7H
bv3py1cgJBZRq36L7eBGQXjIE7YQey6qG2lug2zsUJWe+vzy2vHjHVJHuBXKKgv3
txOA6QZx63UgEyN3zFT7K5ek6uOnkKdeE+s+Laj+K/x4V2R6gbtgO011EVcXy+bI
NvqsO76tfPHpwrn5s1VVc5lcEBEPHKHb+WulHrqhSSU4ivk0gtJDeSI+c8xta6YT
XwSry5tozDLkG1uEZqkyY1XTlOUAHO8E6YcrlOv2z1+mG7L8OH/vCp1apzgexsZA
1683AH5cwKc3KaP+4QdKGdxY2BDxb7OTVh3cGy4kAYb6tqQ/vj7vlRiJvtaMBtFw
xJn3buuagwJzKtgebpA565opvyFAfUX/RNFlTP63aXAefSAgq6KLq70vKFxkIZto
H1LpUbmiEbuBml8CBGb1
=xDOQ
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"First off, the ACPICA code in the kernel is updated to upstream
revision 20160831 that brings in a few bug fixes and cleanups. In
particular, it is possible to mask GPEs now (and the sysfs interface
for GPE control is fixed on top of that), problems related to the
table loading mechanism are fixed and all code related to FADT version
2 (which has never been part of the ACPI specification) is dropped.
On the new features front, there is a new watchdog driver based on the
ACPI WDAT (ACPI Watchdog Action Table), needed on some platforms to
replace the iTCO watchdog that doesn't work there, and some UART
devices get new definitions of built-in properties (to be accessed via
the generic device properties API).
Also, included is a fix for an ACPI-related PCI resorces allocation
issue and a few problems in the EC driver and in the button and
battery drivers are fixed.
In addition to that, the ACPI CPPC library is updated to make batching
of requests sent over the PCC channel possible (which reduces the PCC
usage overhead substantially in some cases) and to support functional
fixed hardware (FFH) type of CPPC registers access (which will allow
CPPC to be used on x86 too in the future).
As usual, there are some assorted fixes and cleanups too.
Specifics:
- Update of the ACPICA code in the kernel to upstream revision
20160831 with the following major changes:
* New mechanism for GPE masking.
* Fixes for issues related to the LoadTable operator and table
loading.
* Fixes for issues related to so-called module-level code (MLC),
that is AML that doesn't belong to any methods.
* Change of the return value of the _OSI method to reflect the
Windows behavior.
* GAS (Generic Address Structure) support fix related to 32-bit
FADT addresses.
* Elimination of unnecessary FADT version 2 support.
* ACPI tools fixes and cleanups.
From Bob Moore, Lv Zheng, and Jung-uk Kim.
- ACPI sysfs interface updates to fix GPE handling (on top of the new
GPE masking mechanism in ACPICA) and issues related to table
loading (Lv Zheng).
- New watchdog driver based on the ACPI WDAT (ACPI Watchdog Action
Table), needed on some platforms to replace the iTCO watchdog that
doesn't work there and related updates of the intel_pmc_ipc,
i2c/i801 and MFD/lcp_ich drivers (Mika Westerberg).
- Driver core fix to prevent it from leaking secondary fwnode objects
during device removal (Lukas Wunner).
- New definitions of built-in properties for UART in ACPI-based x86
SoC drivers and a 8250_dw driver quirk for the APM X-Gene SoC
(Heikki Krogerus).
- New device ID for the Vulcan SPI controller and constification of
local strucures in the AMD SoC (APD) ACPI driver (Kamlakant Patel,
Julia Lawall).
- Fix for a bug causing the allocation of PCI resorces to fail if
ACPI-enumerated child platform devices are registered below the PCI
devices in question (Mika Westerberg).
- Change of the default polarity for PCI legacy IRQs to high on
systems booting wth ACPI on platforms with a GIC interrupt
controller model fixing the discrepancy between the specification
and HW behavior (Lorenzo Pieralisi).
- Fixes for the handling of system suspend/resume in the ACPI EC
driver and update of that driver to make it cope with the cases
when the EC device defined in the ECDT has to be used throughout
the entire system life cycle (Lv Zheng).
- Update of the ACPI CPPC library to allow it to batch requests sent
over the PCC channel (to reduce overhead), to support the fixed
functional hardware (FFH) CPPC registers access type, to notify the
mailbox framework about TX completions when the interrupt flag is
set for the PCC mailbox, and to support HW-Reduced Communication
Subspace type 2 (Ashwin Chaugule, Prashanth Prakash, Srinivas
Pandruvada, Hoan Tran).
- ACPI button driver fix and documentation update related to the
handling of laptop lids (Lv Zheng).
- ACPI battery driver initialization fix (Carlos Garnacho).
- ACPI GPIO enumeration documentation update (Mika Westerberg).
- Assorted updates of the core ACPI bus type code (Lukas Wunner, Lv
Zheng).
- Assorted cleanups of the ACPI table parsing code and the
x86-specific ACPI code (Al Stone).
- Fixes for assorted ACPI-related issues found in linux-next (Wei
Yongjun)"
* tag 'acpi-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (98 commits)
ACPI / documentation: Use recommended name in GPIO property names
watchdog: wdat_wdt: Fix warning for using 0 as NULL
watchdog: wdat_wdt: fix return value check in wdat_wdt_probe()
platform/x86: intel_pmc_ipc: Do not create iTCO watchdog when WDAT table exists
i2c: i801: Do not create iTCO watchdog when WDAT table exists
mfd: lpc_ich: Do not create iTCO watchdog when WDAT table exists
ACPI / bus: Adjust ACPI subsystem initialization for new table loading mode
ACPICA: Parser: Fix a regression in LoadTable support
ACPICA: Tables: Fix "UNLOAD" code path lock issues
ACPI / watchdog: Add support for WDAT hardware watchdog
ACPI / platform: Pay attention to parent device's resources
PCI: Add pci_find_resource()
ACPI / CPPC: Support PCC with interrupt flag
ACPI / sysfs: Update sysfs signature handling code
ACPI / sysfs: Fix an issue for LoadTable opcode
ACPICA: Tables: Fix a regression in acpi_tb_find_table()
ACPI / tables: Remove duplicated include from tables.c
ACPI / APD: constify local structures
x86: ACPI: make variable names clearer in acpi_parse_madt_lapic_entries()
x86: ACPI: remove extraneous white space after semicolon
...
- Add a mechanism for passing hints from the scheduler to cpufreq governors
via their utilization update callbacks and use it to introduce "IOwait
boosting" into the schedutil governor and intel_pstate that will make them
boost performance if the enqueued task was previously waiting on I/O
(Rafael Wysocki).
- Fix a schedutil governor problem that causes it to overestimate utilization
if SMT is in use (Steve Muckle).
- Update defconfigs trying to use the schedutil governor as a module which is
not possible any more (Javier Martinez Canillas).
- Update the intel_pstate's pstate_sample tracepoint to take "IOwait boosting"
into account (Srinivas Pandruvada).
- Fix a problem in the cpufreq core causing it to mishandle the initialization
of CPUs registered after the cpufreq driver (Viresh Kumar, Rafael Wysocki).
- Make the cpufreq-dt driver support per-policy governor tunables, clean it
up and update its Kconfig description (Viresh Kumar).
- Add support for more ARM platforms to the cpufreq-dt driver (Chanwoo Choi,
Dave Gerlach, Geert Uytterhoeven).
- Make the cpufreq CPPC driver report frequencies in KHz to avoid user space
compatiblility issues (Al Stone, Hoan Tran).
- Clean up a few cpufreq drivers (st, kirkwood, SCPI) a bit (Colin Ian King,
Markus Elfring).
- Constify some local structures in the intel_pstate driver (Julia Lawall).
- Add a Documentation/cpu-freq/ entry to MAINTAINERS (Jean Delvare).
- Add support for PM domain removal to the generic power domains (genpd)
framework, add new DT helper functions to it and make it always enable
debugfs support if available (Jon Hunter, Tomeu Vizoso).
- Clean up the generic power domains (genpd) framework and make it avoid
measuring power-on and power-off latencies during system-wide PM transitions
(Ulf Hansson).
- Add support for the RockChip DFI controller and the rk3399 DMC to the
devfreq framework (Lin Huang, Axel Lin, Arnd Bergmann).
- Add COMPILE_TEST to the devfreq framework (Krzysztof Kozlowski, Stephen
Rothwell).
- Fix a minor issue in the exynos-ppmu devfreq driver and fix up devfreq
Kconfig indentation style (Wei Yongjun, Jisheng Zhang).
- Fix the system suspend interface to make suspend-to-idle work if platform
suspend operations have not been registered (Sudeep Holla).
- Make it possible to use hibernation with PAGE_POISONING_ZERO enabled
(Anisse Astier).
- Increas the default timeout of the system suspend/resume watchdog and make it
depend on EXPERT (Chen Yu).
- Make the operating performance points (OPP) framework avoid using OPPs that
aren't supported by the platform and fix a build warning in it (Dave Gerlach,
Arnd Bergmann).
- Fix the ARM cpuidle driver's return value (Christophe Jaillet).
- Make the SmartReflex AVS (Adaptive Voltage Scaling) driver use more common
logging style (Joe Perches).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJX8Y32AAoJEILEb/54YlRx8e0P/27zu8Lb6Aks1S2Zx9GEW0qr
DvrO4kklCHqi3DgHlyFOYetf9cxMrUluojVJofnoSDvgAayWyg7VAd4gtOrMGCXG
pJVJM73itcOUK+DsAVvoWJY3hk15nX77n2aiXPN2GqaMqennlQusdfzTmjCasqpm
M84j+JwFYlJcfyMCcF5kGWqS7QBjzxhA0CjytUX1i3pL3NqRALZUEpaHwBD1W+4r
tcF/jYTy3RsghCbuC6HoPxEF9NMOFGxeAXogmu6NvGu8gy0GqtywRSRrs5wA1a0z
ZDAJ8krrFbzuFPMdjNIE8wtTeziofS5i9piQx3JlIMH3HpNGN86BRXVfzuHzJj11
6ZMUI/FJy+fYukIXOEeVLtsLHUnMcMux8Jq1UF6N0InahaR9nbsjmGOmXh72+Scx
7VJ+29l0oVwX6wkw/DjPP3rb1Swd1i3yY0/3uRoJ174mYTjhRGbrbDkIjPiDeuM5
2Cx7QunscOjFmaNtPyr8niQ+7YhMEpn8VIbGNaX5ABz0fGftfi8nDHqliSNa391Z
nK6YoKD0O6R0JHE6GavvJTcuMS9qE+HHHOwymWKxEdE9KYk0JUqen3gj1sSTaAZT
BIPBsn6XlorqNy3dnqtWTHV7Nf0al9huolWvrL90s6g4Bh2BzTzDVydSgNWTMDUi
G64nP0q1sJTqdoe30uvk
=NYkv
-----END PGP SIGNATURE-----
Merge tag 'pm-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"Traditionally, cpufreq is the area with the greatest number of
changes, but there are fewer of them than last time. There also is
some activity in the generic power domains and the devfreq frameworks,
a couple of system suspend and hibernation fixes and some assorted
changes in other places.
One new feature is the cpufreq change to allow the scheduler to pass
hints to the governors' utilization update callbacks and some code
rework based on that. Another one is the support for domain removal in
the generic power domains framework. Also it is now possible to use
hibernation with PAGE_POISONING_ZERO enabled and devfreq supports the
RockChip DFI controller and the rk3399 DMC.
The rest of the changes is mostly fixes and cleanups in a number of
places.
Specifics:
- Add a mechanism for passing hints from the scheduler to cpufreq
governors via their utilization update callbacks and use it to
introduce "IOwait boosting" into the schedutil governor and
intel_pstate that will make them boost performance if the enqueued
task was previously waiting on I/O (Rafael Wysocki).
- Fix a schedutil governor problem that causes it to overestimate
utilization if SMT is in use (Steve Muckle).
- Update defconfigs trying to use the schedutil governor as a module
which is not possible any more (Javier Martinez Canillas).
- Update the intel_pstate's pstate_sample tracepoint to take "IOwait
boosting" into account (Srinivas Pandruvada).
- Fix a problem in the cpufreq core causing it to mishandle the
initialization of CPUs registered after the cpufreq driver (Viresh
Kumar, Rafael Wysocki).
- Make the cpufreq-dt driver support per-policy governor tunables,
clean it up and update its Kconfig description (Viresh Kumar).
- Add support for more ARM platforms to the cpufreq-dt driver
(Chanwoo Choi, Dave Gerlach, Geert Uytterhoeven).
- Make the cpufreq CPPC driver report frequencies in KHz to avoid
user space compatiblility issues (Al Stone, Hoan Tran).
- Clean up a few cpufreq drivers (st, kirkwood, SCPI) a bit (Colin
Ian King, Markus Elfring).
- Constify some local structures in the intel_pstate driver (Julia
Lawall).
- Add a Documentation/cpu-freq/ entry to MAINTAINERS (Jean Delvare).
- Add support for PM domain removal to the generic power domains
(genpd) framework, add new DT helper functions to it and make it
always enable debugfs support if available (Jon Hunter, Tomeu
Vizoso).
- Clean up the generic power domains (genpd) framework and make it
avoid measuring power-on and power-off latencies during system-wide
PM transitions (Ulf Hansson).
- Add support for the RockChip DFI controller and the rk3399 DMC to
the devfreq framework (Lin Huang, Axel Lin, Arnd Bergmann).
- Add COMPILE_TEST to the devfreq framework (Krzysztof Kozlowski,
Stephen Rothwell).
- Fix a minor issue in the exynos-ppmu devfreq driver and fix up
devfreq Kconfig indentation style (Wei Yongjun, Jisheng Zhang).
- Fix the system suspend interface to make suspend-to-idle work if
platform suspend operations have not been registered (Sudeep
Holla).
- Make it possible to use hibernation with PAGE_POISONING_ZERO
enabled (Anisse Astier).
- Increas the default timeout of the system suspend/resume watchdog
and make it depend on EXPERT (Chen Yu).
- Make the operating performance points (OPP) framework avoid using
OPPs that aren't supported by the platform and fix a build warning
in it (Dave Gerlach, Arnd Bergmann).
- Fix the ARM cpuidle driver's return value (Christophe Jaillet).
- Make the SmartReflex AVS (Adaptive Voltage Scaling) driver use more
common logging style (Joe Perches)"
* tag 'pm-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (58 commits)
PM / OPP: Don't support OPP if it provides supported-hw but platform does not
cpufreq: st: add missing \n to end of dev_err message
cpufreq: kirkwood: add missing \n to end of dev_err messages
PM / Domains: Rename pm_genpd_sync_poweron|poweroff()
PM / Domains: Don't measure latency of ->power_on|off() during system PM
PM / Domains: Remove redundant system PM callbacks
PM / Domains: Simplify detaching a device from its genpd
PM / devfreq: rk3399_dmc: Remove explictly regulator_put call in .remove
PM / devfreq: rockchip: add PM_DEVFREQ_EVENT dependency
PM / OPP: avoid maybe-uninitialized warning
PM / Domains: Allow holes in genpd_data.domains array
cpufreq: CPPC: Avoid overflow when calculating desired_perf
cpufreq: ti: Use generic platdev driver
cpufreq: intel_pstate: Add io_boost trace
partial revert of "PM / devfreq: Add COMPILE_TEST for build coverage"
cpufreq: intel_pstate: Use IOWAIT flag in Atom algorithm
cpufreq: schedutil: Add iowait boosting
cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition
PM / Domains: Add support for removing nested PM domains by provider
PM / Domains: Add support for removing PM domains
...
- Support for execute-only page permissions
- Support for hibernate and DEBUG_PAGEALLOC
- Support for heterogeneous systems with mismatches cache line sizes
- Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug)
- arm64 PMU perf updates, including cpumasks for heterogeneous systems
- Set UTS_MACHINE for building rpm packages
- Yet another head.S tidy-up
- Some cleanups and refactoring, particularly in the NUMA code
- Lots of random, non-critical fixes across the board
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJX7k31AAoJELescNyEwWM0XX0H/iOaWCfKlWOhvBsStGUCsLrK
XryTzQT2KjdnLKf3jwP+1ateCuBR5ROurYxoDCX5/7mD63c5KiI338Vbv61a1lE1
AAwjt1stmQVUg/j+kqnuQwB/0DYg+2C8se3D3q5Iyn7zc19cDZJEGcBHNrvLMufc
XgHrgHgl/rzBDDlHJXleknDFge/MfhU5/Q1vJMRRb4JYrpAtmIokzCO75CYMRcCT
ND2QbmppKtsyuFPGUTVbAFzJlP6dGKb3eruYta7/ct5d0pJQxav3u98D2yWGfjdM
YaYq1EmX5Pol7rWumqLtk0+mA9yCFcKLLc+PrJu20Vx0UkvOq8G8Xt70sHNvZU8=
=gdPM
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"It's a bit all over the place this time with no "killer feature" to
speak of. Support for mismatched cache line sizes should help people
seeing whacky JIT failures on some SoCs, and the big.LITTLE perf
updates have been a long time coming, but a lot of the changes here
are cleanups.
We stray outside arch/arm64 in a few areas: the arch/arm/ arch_timer
workaround is acked by Russell, the DT/OF bits are acked by Rob, the
arch_timer clocksource changes acked by Marc, CPU hotplug by tglx and
jump_label by Peter (all CC'd).
Summary:
- Support for execute-only page permissions
- Support for hibernate and DEBUG_PAGEALLOC
- Support for heterogeneous systems with mismatches cache line sizes
- Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug)
- arm64 PMU perf updates, including cpumasks for heterogeneous systems
- Set UTS_MACHINE for building rpm packages
- Yet another head.S tidy-up
- Some cleanups and refactoring, particularly in the NUMA code
- Lots of random, non-critical fixes across the board"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (100 commits)
arm64: tlbflush.h: add __tlbi() macro
arm64: Kconfig: remove SMP dependence for NUMA
arm64: Kconfig: select OF/ACPI_NUMA under NUMA config
arm64: fix dump_backtrace/unwind_frame with NULL tsk
arm/arm64: arch_timer: Use archdata to indicate vdso suitability
arm64: arch_timer: Work around QorIQ Erratum A-008585
arm64: arch_timer: Add device tree binding for A-008585 erratum
arm64: Correctly bounds check virt_addr_valid
arm64: migrate exception table users off module.h and onto extable.h
arm64: pmu: Hoist pmu platform device name
arm64: pmu: Probe default hw/cache counters
arm64: pmu: add fallback probe table
MAINTAINERS: Update ARM PMU PROFILING AND DEBUGGING entry
arm64: Improve kprobes test for atomic sequence
arm64/kvm: use alternative auto-nop
arm64: use alternative auto-nop
arm64: alternative: add auto-nop infrastructure
arm64: lse: convert lse alternatives NOP padding to use __nops
arm64: barriers: introduce nops and __nops macros for NOP sequences
arm64: sysreg: replace open-coded mrs_s/msr_s with {read,write}_sysreg_s
...
These files were only including module.h for exception table
related functions. We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.
One uses "print_modules" so that prevents us removing module.h in
that case, however.
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
The change adds a new device node with description of generic SRAM
on-chip memory found on NXP LPC32xx SoC series and connected to AHB
matrix slave port 3.
Note that NXP LPC3220 SoC has 128KiB of SRAM memory, the other
LPC3230, LPC3240 and LPC3250 SoCs all have 256KiB SRAM space,
in the shared DTSI file this change specifies 128KiB SRAM size.
Also it's worth to mention that the SRAM area contains of 64KiB banks,
2 banks on LPC3220 and 4 banks on the other SoCs from the series, and
all SRAM banks but the first one have independent power controls,
the description of this feature will be added with the introduction of
power domains for the SoC series.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Cc: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull ARM fixes from Russell King:
"Three relatively small fixes for ARM:
- Roger noticed that dma_max_pfn() was calculating the upper limit
wrongly, by adding the PFN offset of memory twice.
- A fix from Robin to correct parsing of MPIDR values when the
address size is larger than one BE32 unit.
- A fix from Srinivas to ensure that we do not rely on the boot
loader (or previous Linux kernel) setting the translation table
base register a certain way in the decompressor, which can lead to
crashes"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
ARM: 8617/1: dma: fix dma_max_pfn()
ARM: 8616/1: dt: Respect property size when parsing CPUs
If the bootloader uses the long descriptor format and jumps to
kernel decompressor code, TTBCR may not be in a right state.
Before enabling the MMU, it is required to clear the TTBCR.PD0
field to use TTBR0 for translation table walks.
The commit dbece45894 ("ARM: 7501/1: decompressor:
reset ttbcr for VMSA ARMv7 cores") does the reset of TTBCR.N, but
doesn't consider all the bits for the size of TTBCR.N.
Clear TTBCR.PD0 field and reset all the three bits of TTBCR.N to
indicate the use of TTBR0 and the correct base address width.
Fixes: dbece45894 ("ARM: 7501/1: decompressor: reset ttbcr for VMSA ARMv7 cores")
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull x86 fixes from Thomas Gleixner:
"The last regression fixes for 4.8 final:
- Two patches addressing the fallout of the CR4 optimizations which
caused CR4-less machines to fail.
- Fix the VDSO build on big endian machines
- Take care of FPU initialization if no CPUID is available otherwise
task struct size ends up being zero
- Fix up context tracking in case load_gs_index fails"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry/64: Fix context tracking state warning when load_gs_index fails
x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID
x86/vdso: Fix building on big endian host
x86/boot: Fix another __read_cr4() case on 486
x86/init: Fix cr4_init_shadow() on CR4-less machines
The driver does not handle endianness properly when loading the input
data.
Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* pm-cpufreq: (24 commits)
cpufreq: st: add missing \n to end of dev_err message
cpufreq: kirkwood: add missing \n to end of dev_err messages
cpufreq: CPPC: Avoid overflow when calculating desired_perf
cpufreq: ti: Use generic platdev driver
cpufreq: intel_pstate: Add io_boost trace
cpufreq: intel_pstate: Use IOWAIT flag in Atom algorithm
cpufreq: schedutil: Add iowait boosting
cpufreq / sched: SCHED_CPUFREQ_IOWAIT flag to indicate iowait condition
cpufreq: CPPC: Force reporting values in KHz to fix user space interface
cpufreq: create link to policy only for registered CPUs
intel_pstate: constify local structures
cpufreq: dt: Support governor tunables per policy
cpufreq: dt: Update kconfig description
cpufreq: dt: Remove unused code
MAINTAINERS: Add Documentation/cpu-freq/
cpufreq: dt: Add support for r8a7792
cpufreq / sched: ignore SMT when determining max cpu capacity
cpufreq: Drop unnecessary check from cpufreq_policy_alloc()
ARM: multi_v7_defconfig: Don't attempt to enable schedutil governor as module
ARM: exynos_defconfig: Don't attempt to enable schedutil governor as module
...
When discovering the number of VPEs per core, smp_num_siblings will be
incorrect for kernels built without support for the MIPS MultiThreading
(MT) ASE running on systems which implement said ASE. This leads to
accesses to VPEs in secondary cores being performed incorrectly since
mips_cm_vp_id calculates the wrong ID to write to the local "other"
registers. Fix this by examining the number of VPEs in the core as
reported by the CM.
This patch presumes that the number of VPEs will be the same in each
core of the system. As this path only applies to systems with CM version
2.5 or lower, and this property is true of all such known systems, this
is likely to be fine but is described in a comment for good measure.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14338/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* acpi-x86:
x86: ACPI: make variable names clearer in acpi_parse_madt_lapic_entries()
x86: ACPI: remove extraneous white space after semicolon
* acpi-cppc:
ACPI / CPPC: Support PCC with interrupt flag
ACPI / CPPC: Add prefix cppc to cpudata structure name
ACPI / CPPC: Add support for functional fixed hardware address
ACPI / CPPC: Don't return on CPPC probe failure
ACPI / CPPC: Allow build with ACPI_CPU_FREQ_PSS config
ACPI / CPPC: check for error bit in PCC status field
ACPI / CPPC: move all PCC related information into pcc_data
ACPI / CPPC: add sysfs support to compute delivered performance
ACPI / CPPC: set a non-zero value for transition_latency
ACPI / CPPC: support for batching CPPC requests
ACPI / CPPC: acquire pcc_lock only while accessing PCC subspace
ACPI / CPPC: restructure read/writes for efficient sys mapped reg ops
mailbox: pcc: Support HW-Reduced Communication Subspace type 2
* acpi-soc:
ACPI / APD: constify local structures
ACPI / APD: Add device HID for Vulcan SPI controller
Commit d9676fa152 ("ARCv2: Enable LOCKDEP"), changed
local_save_flags() to not return raw STATUS32 but encoded in the form
such that it could be fed directly to CLRI/SETI instructions.
However the STATUS32.E[] was not captured correctly as it corresponds to
bits [4:1] in the register and not [3:0]
Fixes: d9676fa152 ("ARCv2: Enable LOCKDEP")
Cc: Evgeny Voevodin <evgeny.voevodin@intel.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Seem like values assigned as absolute number and not and
shift value, i.e. should be 0 for one node (2^0) and 1 for
couple of nodes (2^1)
Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In the end of "arc_init_IRQ" STATUS32.IE flag is going to be affected by
"flag" instruction but "flag" never touches IE flag on ARCv2. So "kflag"
instruction must be used instead of "flag".
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Cc: stable@vger.kernel.org #4.2+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
We used to keep the .exit.* sections as linker would fail in final link
due to references from .debug_frame which itself could not be discardrd
due to the forced "write,alloc" attributes for it.
| LD init/built-in.o
| `.exit.text' referenced in section `.debug_frame' of arch/arc/built-in.o: defined in discarded section `.exit.text' of arch/arc/built-in.o
| Makefile:949: recipe for target 'vmlinux' failed
With .debug_frame now retired, this hack is no longer needed.
kernel binary is now a little bit smaller as well.
closes STAR 9000549913
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This uses a new set of annoations viz. ENTRY_CFI/END_CFI to enabel cfi
ops generation.
Note that we didn't change the normal ENTRY/EXIT as we don't actually
want unwind info in the trap/exception/interrutp handlers which use
these, as unwinder then gets confused (it keeps recursing vs. stopping).
Semantically these are leaf routines and unwinding should stop when it
hits those routines.
Before
------
28.52% 1.19% 9929 hackbench libuClibc-1.0.17.so [.] __write_nocancel
|
---__write_nocancel
|--8.95%--EV_Trap
| --8.25%--sys_write
| |--3.93%--sock_write_iter
...
|--2.62%--memset <==== [LEAF entry as no unwind info]
^^^^^^
After
-----
29.46% 1.24% 13622 hackbench libuClibc-1.0.17.so [.] __write_nocancel
|
---__write_nocancel
|--9.31%--EV_Trap
| --8.62%--sys_write
| |--4.17%--sock_write_iter
...
|--6.19%--sys_write
| --6.19%--sock_write_iter
| unix_stream_sendmsg
| |--1.62%--sock_alloc_send_pskb
| |--0.89%--sock_def_readable
| |--0.88%--_raw_spin_unlock_irqrestore
| |--0.69%--memset
| | ^^^^^^ <==== [now in proper callframe]
| |
| --0.52%--skb_copy_datagram_from_iter
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
1. detect whether binutils supports the cfi pseudo ops
2. define conditional macros to generate the ops
3. define new ENTRY_CFI/END_CFI to annotate hand asm code.
- Needed because we don't want to emit dwarf info in general ENTRY/END
used by lowest level trap/exception/interrutp handlers as unwinder
gets confused trying to unwind out of them. We want unwinder to
instead stop when it hits onfo those routines
- These provide minimal start/end cfi ops assuming routine doesn't
touch stack memory/regs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This essentially removes ENTRY() assembler annotation for this symbol
since it didn't have a pairing END()
This in ahead of introducing cfi pseudo ops in ENTRY/END which expects
paired cfi_startproc/cfi_endproc
| ../arch/arc/kernel/entry.S: Assembler messages:
| ../arch/arc/kernel/entry.S:270: Error: previous CFI entry not closed (missing .cfi_endproc)
| ../scripts/Makefile.build:326: recipe for target 'arch/arc/kernel/entry-arcv2.o' failed
| make[4]: *** [arch/arc/kernel/entry-arcv2.o] Error 1
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In .debug_frame based unwinding regime, we used to force -gdwarf-2 since
kernel unwinder only claimed to handle dwarf 2. This changed since commit
6d0d506012 ("ARC: dw2 unwind: Don't bail for CIE.version != 1")
which added some support beyond dwarf 2, atleast to handle CIE != 1
The ill-effect of -gdwarf-2 is that it forces generation of .debug_*
sections, which bloats loadable modules .ko files. For the curious, this
doesn't affect vmlinx binary since linker script discards .debug_* but
same discard is not yet implemented for modules.
So it seems we can drop the -gdwarf-2 toggle, which should not be needed
anyways given that we now use .eh_frame based unwinding.
I've verified using GNU 2016.09-engo10 that the actual unwind info is
not different with or w/o this toggle - but the debug_* sections are
gone for good.
before
-----
arc-linux-readelf -S q_proc.ko-unwinding-1-eh_frame-switch | grep debug
[15] .debug_info PROGBITS 00000000 000300 00d08d 00 0 0 1
[16] .rela.debug_info RELA 00000000 0162a0 008844 0c I 29 15 4
[17] .debug_abbrev PROGBITS 00000000 00d38d 0005f8 00 0 0 1
[18] .debug_loc PROGBITS 00000000 00d985 000070 00 0 0 1
[19] .rela.debug_loc RELA 00000000 01eae4 0000c0 0c I 29 18 4
[20] .debug_aranges PROGBITS 00000000 00d9f5 000040 00 0 0 1
[21] .rela.debug_arang RELA 00000000 01eba4 000030 0c I 29 20 4
[22] .debug_ranges PROGBITS 00000000 00da35 000018 00 0 0 1
[23] .rela.debug_range RELA 00000000 01ebd4 000030 0c I 29 22 4
[24] .debug_line PROGBITS 00000000 00da4d 000b5b 00 0 0 1
[25] .rela.debug_line RELA 00000000 01ec04 0000cc 0c I 29 24 4
[26] .debug_str PROGBITS 00000000 00e5a8 007831 01 MS 0 0 1
after
----
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
So finally after almost 8 years of dealing with .debug_frame, we are
finally switching to .eh_frame. The reason being stripped kernel
binaries had non-functional unwinder as .debug_frame was gone.
Also, in general .eh_frame seems more common way of doing unwinding.
This also folds a revert of f52e126cc7 ("ARC: unwind: ensure that
.debug_frame is generated (vs. .eh_frame)") to ensure that we start
getting .eh_frame
Reported-by: Daniel Mentz <danielmentz@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
We used to live with PERF_COUNT_HW_CACHE_REFERENCES and
PERF_COUNT_HW_CACHE_REFERENCES not specified on ARC.
Those events are actually aliases to 2 cache events that we do support
and so this change sets "cache-reference" and "cache-misses" events
in the same way as "L1-dcache-loads" and L1-dcache-load-misses.
And while at it adding debug info for cache events as well as doing a
subtle fix in HW events debug info - config value is much better
represented by hex so we may see not only event index but as well other
control bits set (if they exist).
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Build brekeage since last changes to generic atomic operations.
Added couple of missing macros which are now mandatory
Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARCv2 ISA provides 64-bit exclusive load/stores so use them to implement
the 64-bit atomics and elide the spinlock based generic 64-bit atomics
boot tested with atomic64 self-test (and GOD bless the person who wrote
them, I realized my inline assmebly is sloppy as hell)
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
HS release 3.0 provides for even more flexibility in specifying the
volatile address space for mapping peripherals.
With HS 2.1 @start was made flexible / programmable - with HS 3.0 even
@end can be setup (vs. fixed to 0xFFFF_FFFF before).
So add code to reflect that and while at it remove an unused struct
defintion
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The cool thing is that same kernel image can run on
- nsim OSCI simulation platform
- SDPlite FPGA setups
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
As it was discussed quite some time ago (see
https://lkml.org/lkml/2015/11/5/862) it's a good practice to add
"model" property in .dts. Moreover as per ePAPR "model" property is
required and should look like "manufacturer,model" so we do here.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Jonas Gorski <jonas.gorski@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Christian Ruppert <christian.ruppert@alitech.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This warning:
WARNING: CPU: 0 PID: 3331 at arch/x86/entry/common.c:45 enter_from_user_mode+0x32/0x50
CPU: 0 PID: 3331 Comm: ldt_gdt_64 Not tainted 4.8.0-rc7+ #13
Call Trace:
dump_stack+0x99/0xd0
__warn+0xd1/0xf0
warn_slowpath_null+0x1d/0x20
enter_from_user_mode+0x32/0x50
error_entry+0x6d/0xc0
? general_protection+0x12/0x30
? native_load_gs_index+0xd/0x20
? do_set_thread_area+0x19c/0x1f0
SyS_set_thread_area+0x24/0x30
do_int80_syscall_32+0x7c/0x220
entry_INT80_compat+0x38/0x50
... can be reproduced by running the GS testcase of the ldt_gdt test unit in
the x86 selftests.
do_int80_syscall_32() will call enter_form_user_mode() to convert context
tracking state from user state to kernel state. The load_gs_index() call
can fail with user gsbase, gsbase will be fixed up and proceed if this
happen.
However, enter_from_user_mode() will be called again in the fixed up path
though it is context tracking kernel state currently.
This patch fixes it by just fixing up gsbase and telling lockdep that IRQs
are off once load_gs_index() failed with user gsbase.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1475197266-3440-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Otherwise arch_task_struct_size == 0 and we die. While we're at it,
set X86_FEATURE_ALWAYS, too.
Reported-by: David Saggiorato <david@saggiorato.net>
Tested-by: David Saggiorato <david@saggiorato.net>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: aaeb5c01c5b ("x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86")
Link: http://lkml.kernel.org/r/8de723afbf0811071185039f9088733188b606c9.1475103911.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ever since commit 254d1a3f02 ("xen/pv-on-hvm kexec: shutdown watches
from old kernel") using the INTx interrupt from Xen PCI platform
device for event channel notification would just lockup the guest
during bootup. postcore_initcall now calls xs_reset_watches which
will eventually try to read a value from XenStore and will get stuck
on read_reply at XenBus forever since the platform driver is not
probed yet and its INTx interrupt handler is not registered yet. That
means that the guest can not be notified at this moment of any pending
event channels and none of the per-event handlers will ever be invoked
(including the XenStore one) and the reply will never be picked up by
the kernel.
The exact stack where things get stuck during xenbus_init:
-xenbus_init
-xs_init
-xs_reset_watches
-xenbus_scanf
-xenbus_read
-xs_single
-xs_single
-xs_talkv
Vector callbacks have always been the favourite event notification
mechanism since their introduction in commit 38e20b07ef ("x86/xen:
event channels delivery on HVM.") and the vector callback feature has
always been advertised for quite some time by Xen that's why INTx was
broken for several years now without impacting anyone.
Luckily this also means that event channel notification through INTx
is basically dead-code which can be safely removed without impacting
anybody since it has been effectively disabled for more than 4 years
with nobody complaining about it (at least as far as I'm aware of).
This commit removes event channel notification through Xen PCI
platform device.
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Julien Grall <julien.grall@citrix.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
We use __read_cr4() vs __read_cr4_safe() inconsistently. On
CR4-less CPUs, all CR4 bits are effectively clear, so we can make
the code simpler and more robust by making __read_cr4() always fix
up faults on 32-bit kernels.
This may fix some bugs on old 486-like CPUs, but I don't have any
easy way to test that.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: david@saggiorato.net
Link: http://lkml.kernel.org/r/ea647033d357d9ce2ad2bbde5a631045f5052fb6.1475178370.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We need to call GET_LE to read hdr->e_type.
Fixes: 57f90c3dfc ("x86/vdso: Error out if the vDSO isn't a valid DSO")
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linux-next@vger.kernel.org
Link: http://lkml.kernel.org/r/20160929193442.GA16617@gate.crashing.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The condition for reading CR4 was wrong: there are some CPUs with
CPUID but not CR4. Rather than trying to make the condition exact,
use __read_cr4_safe().
Fixes: 18bc7bd523 ("x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly")
Reported-by: david@saggiorato.net
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Link: http://lkml.kernel.org/r/8c453a61c4f44ab6ff43c29780ba04835234d2e5.1475178369.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Switch to new CPU hotplug infrastructure.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The message is missing a \n, add it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Rename the ia64 only set_curr_task() function to free up the name.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
cmpxchg contained definitions for unused (x)add_* operations, dating back
to the original ticket spinlock implementation. Nowadays these are
unused so remove them.
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1474913478-17757-1-git-send-email-n.borisov.lkml@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Current code can call set_cpu_sibling_map() and invoke sched_set_topology()
more than once (e.g. on CPU hot plug). When this happens after
sched_init_smp() has been called, we lose the NUMA topology extension to
sched_domain_topology in sched_init_numa(). This results in incorrect
topology when the sched domain is rebuilt.
This patch fixes the bug and issues warning if we call sched_set_topology()
after sched_init_smp().
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@suse.de
Cc: jolsa@redhat.com
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/1474485552-141429-2-git-send-email-srinivas.pandruvada@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This file was only including module.h for exception table related
functions. We've now separated that content out into its own file
"extable.h" so now move over to that and avoid all the extra header
content in module.h that we don't really need to compile this.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
MMU initialization option is currently ignored on MMUv2 cores, but it is
used in Kconfig to select kernel load and start addresses. This choice
is not available for MMUv2 cores as they have hardwired TLB entries.
Disable MMU initialization option for known MMUv2 cores so that they get
correct kernel load/start address by default.
This fixes the default allmodconfig build.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
The paging_init() function contains code which detects that highmem is
in use but unsupported due to dcache aliasing. However this code was
ineffective because it was being run before the caches are probed,
meaning that cpu_has_dc_aliases would always evaluate to false (unless a
platform overrides it to a compile-time constant) and the detection of
the unsupported case is never triggered. The kernel would then go on to
attempt to use highmem & either hit coherency issues or trigger the
BUG_ON in flush_kernel_dcache_page().
Fix this by running paging_init() later than cpu_cache_init(), such that
the cpu_has_dc_aliases macro will evaluate correctly & the unsupported
highmem case will be detected successfully.
This then leads to a formerly hidden issue in that
mem_init_free_highmem() will attempt to free all highmem pages, even
though we're avoiding use of them & don't have valid page structs for
them. This leads to an invalid pointer dereference & a TLB exception.
Avoid this by skipping the loop in mem_init_free_highmem() if
cpu_has_dc_aliases evaluates true.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Rabin Vincent <rabinv@axis.com>
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Jaedon Shin <jaedon.shin@gmail.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Cc: Jonas Gorski <jogo@openwrt.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14184/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Malta boards used with CPU emulators feature a switch to disable use of
an IOCU. Software has to check this switch & ignore any present IOCU if
the switch is closed. The read used to do this was unsafe for 64 bit
kernels, as it simply casted the address 0xbf403000 to a pointer &
dereferenced it. Whilst in a 32 bit kernel this would access kseg1, in a
64 bit kernel this attempts to access xuseg & results in an address
error exception.
Fix by accessing a correctly formed ckseg1 address generated using the
CKSEG1ADDR macro.
Whilst modifying this code, define the name of the register and the bit
we care about within it, which indicates whether PCI DMA is routed to
the IOCU or straight to DRAM. The code previously checked that bit 0 was
also set, but the least significant 7 bits of the CONFIG_GEN0 register
contain the value of the MReqInfo signal provided to the IOCU OCP bus,
so singling out bit 0 makes little sense & that part of the check is
dropped.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: b6d92b4a6b ("MIPS: Add option to disable software I/O coherency.")
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14187/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When the kernel is built for microMIPS, branches targets need to be
known to be microMIPS code in order to result in bit 0 of the PC being
set. The branch target in the BUILD_ROLLBACK_PROLOGUE macro was simply
the end of the macro, which may be pointing at padding rather than at
code. This results in recent enough GNU linkers complaining like so:
mips-img-linux-gnu-ld: arch/mips/built-in.o: .text+0x3e3c: Unsupported branch between ISA modes.
mips-img-linux-gnu-ld: final link failed: Bad value
Makefile:936: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 1
Fix this by changing the branch target to be the start of the
appropriate handler, skipping over any padding.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14019/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
On current P-series cores from Imagination the FTLB can be enabled or
disabled via a bit in the Config6 register, and an execution hazard is
created by changing the value of bit. The ftlb_disable function already
cleared that hazard but that does no good for other callers. Clear the
hazard in the set_ftlb_enable function that creates it, and only for the
cores where it applies.
This has the effect of reverting c982c6d6c4 ("MIPS: cpu-probe: Remove
cp0 hazard barrier when enabling the FTLB") which was incorrect.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: c982c6d6c4 ("MIPS: cpu-probe: Remove cp0 hazard barrier when enabling the FTLB")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14023/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
On some cores (proAptiv, P5600) we make use of the sizes of the TLBs
to determine the desired FTLB:VTLB write ratio. However set_ftlb_enable
& thus calculate_ftlb_probability is called before decode_config4. This
results in us calculating a probability based on zero sizes, and we end
up setting FTLBP=3 for a 3:1 FTLB:VTLB write ratio in all cases. This
will make abysmal use of the available FTLB resources in the affected
cores.
Fix this by configuring the FTLB probability after having decoded
config4. However we do need to have enabled the FTLB before that point
such that fields in config4 actually reflect that an FTLB is present. So
set_ftlb_enable is now called twice, with flags indicating that it
should configure the write probability only the second time.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: cf0a8aa022 ("MIPS: cpu-probe: Set the FTLB probability bit on supported cores")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14022/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The FTLBP field in Config7 for the I6400 is intended as chicken bits for
debugging rather than as a field that software actually makes use of.
For best performance, FTLBP should be left at its default value of 0
with all TLB writes hitting the FTLB by default.
Additionally, since set_ftlb_enable is called from decode_configs before
decode_config4 which determines the size of the TLBs, this was
previously always setting FTLBP=3 for a 3:1 FTLB:VTLB write ratio which
makes abysmal use of the available FTLB resources.
This effectively reverts b0c4e1b79d8a ("MIPS: Set up FTLB probability
for I6400").
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: b0c4e1b79d8a ("MIPS: Set up FTLB probability for I6400")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14021/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When expanding the la or dla pseudo-instruction in a delay slot the GNU
assembler will complain should the pseudo-instruction expand to multiple
actual instructions, since only the first of them will be in the delay
slot leading to the pseudo-instruction being only partially executed if
the branch is taken. Use of PTR_LA in the dec int-handler.S leads to
such warnings:
arch/mips/dec/int-handler.S: Assembler messages:
arch/mips/dec/int-handler.S:149: Warning: macro instruction expanded into multiple instructions in a branch delay slot
arch/mips/dec/int-handler.S:198: Warning: macro instruction expanded into multiple instructions in a branch delay slot
Avoid this by open coding the PTR_LA macros.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We clear the OF_POPULATED flag for the GPIO controller node on Octeon
processors. Otherwise, none of the devices hanging on the GPIO lines
are probed. The 'gpio-leds' driver on OCTEON failed to probe in addition
to other devices on Cavium 71xx and 78xx development boards.
Fixes: 15cc2ed6dc ("of/irq: Mark initialised interrupt controllers as populated")
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: David Daney <david.daney@cavium.com>
Cc: Rob Herring <robh@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14091/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
arch_uprobe_pre_xol needs to emulate a branch if a branch instruction
has been replaced with a breakpoint, but in fact an uninitialised local
variable was passed to the emulator routine instead of the original
instruction
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Fixes: 40e084a506 ('MIPS: Add uprobes support.')
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14300/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Generic kernel code implements a weak version of set_orig_insn that
moves cached 'insn' from arch_uprobe to the original code location when
the trap is removed.
MIPS variant used arch_uprobe->orig_inst which was never initialised
properly, so this code only inserted a nop instead of the original
instruction. With that change orig_inst can also be safely removed.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Fixes: 40e084a506 ('MIPS: Add uprobes support.')
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14299/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
arch_uretprobe_hijack_return_addr should replace the return address for
a call with a trampoline address.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Fixes: 40e084a506 ('MIPS: Add uprobes support.')
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14298/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 0d2808f338 ("MIPS: smp-cps: Add support for CPU hotplug of
MIPSr6 processors") added a call to mips_cm_lock_other in order to lock
the CPC in CPUs containing a version 3 or higher Coherence Manager,
which use the general CM core other register, where previous CMs had a
dedicated core other register for the CPC.
A kernel BUG() is triggered, however, if mips_cm_lock_other is called
with a VP other than 0 on a CPU with CM < 3, a condition introduced by
0d2808f338.
Avoid the BUG() by always locking VP0 when locking the CPC, since the
required register, cpc_stat_conf, is shared by all vps in a core.
Fixes: 0d2808f338 ("MIPS: smp-cps: Add support for CPU hotplug...)
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14297/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Since commit 6ce0d20016 ("ARM: dma: Use dma_pfn_offset for dma address translation"),
dma_to_pfn() already returns the PFN with the physical memory start offset
so we don't need to add it again.
This fixes USB mass storage lock-up problem on systems that can't do DMA
over the entire physical memory range (e.g.) Keystone 2 systems with 4GB RAM
can only do DMA over the first 2GB. [K2E-EVM].
What happens there is that without this patch SCSI layer sets a wrong
bounce buffer limit in scsi_calculate_bounce_limit() for the USB mass
storage device. dma_max_pfn() evaluates to 0x8fffff and bounce_limit
is set to 0x8fffff000 whereas maximum DMA'ble physical memory on Keystone 2
is 0x87fffffff. This results in non DMA'ble pages being given to the
USB controller and hence the lock-up.
NOTE: in the above case, USB-SCSI-device's dma_pfn_offset was showing as 0.
This should have really been 0x780000 as on K2e, LOWMEM_START is 0x80000000
and HIGHMEM_START is 0x800000000. DMA zone is 2GB so dma_max_pfn should be
0x87ffff. The incorrect dma_pfn_offset for the USB storage device is because
USB devices are not correctly inheriting the dma_pfn_offset from the
USB host controller. This will be fixed by a separate patch.
Fixes: 6ce0d20016 ("ARM: dma: Use dma_pfn_offset for dma address translation")
Cc: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Olof Johansson <olof@lixom.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Whilst MPIDR values themselves are less than 32 bits, it is still
perfectly valid for a DT to have #address-cells > 1 in the CPUs node,
resulting in the "reg" property having leading zero cell(s). In that
situation, the big-endian nature of the data conspires with the current
behaviour of only reading the first cell to cause the kernel to think
all CPUs have ID 0, and become resoundingly unhappy as a consequence.
Take the full property length into account when parsing CPUs so as to
be correct under any circumstances.
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
PPC KVM updates for 4.9.
- Fix for the bug that Thomas Huth found which caused guests to falsely
report soft lockups,
- other minor fixes from Thomas Huth and Dan Carpenter,
- and a small optimization from Balbir Singh.
- Various cleanups and removal of redundant code
- Two important fixes for not using an in-kernel irqchip
- A bit of optimizations
- Handle SError exceptions and present them to guests if appropriate
- Proxying of GICV access at EL2 if guest mappings are unsafe
- GICv3 on AArch32 on ARMv8
- Preparations for GICv3 save/restore, including ABI docs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJX6rKQAAoJEEtpOizt6ddy8i4H/0bfB1EVukggoL/FfGeds/dg
p2FG0oOsggcSBwK7VXUUvVllO7ioUssRCqqkn1e0/bCLtQrN4ex4PqJ3618EHFz/
pLP72hf8Zl33rP3OVtPaDcxzjjKKdf+xGbBIv3AE7x7O5rFZg4lWHeWjy4yuhFv2
Jm+8ul7JCxCMse08Xc90riou4i/jWjyoLadHbAoeX3tR+dVcZyOUZSlgAPI1bS/P
rOQi/zkl3bT2R3kh28QuEFTrJ9BVTnmw25BRW8DNr6+CWmR9bpM6y7AGzOwrZ3FZ
F1MbsPpN3ogcjvPg2QTYuOoqrwz8NLLHw5pR5YNj84VppjSpSsAhKU7Ug5Uhsr0=
=1z/L
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into next
KVM/ARM Changes for v4.9
- Various cleanups and removal of redundant code
- Two important fixes for not using an in-kernel irqchip
- A bit of optimizations
- Handle SError exceptions and present them to guests if appropriate
- Proxying of GICV access at EL2 if guest mappings are unsafe
- GICv3 on AArch32 on ARMv8
- Preparations for GICv3 save/restore, including ABI docs
There exists a slightly dubious optimisation in the implementation of
the MIPS KVM EntryHi emulation which skips TLB invalidation if the
EntryHi points to an address in the guest KSeg0 region, intended to
catch guest TLB invalidations where the ASID is almost immediately
restored to the previous value.
Now that we perform lazy host ASID regeneration for guest user mode when
the guest ASID changes we should be able to drop the optimisation
without a significant impact (only the extra TLB refills for the small
amount of code while the TLB is being invalidated).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Invalidate host TLB mappings when the guest ASID is changed by
regenerating ASIDs, rather than flushing the entire host TLB except
entries in the guest KSeg0 range.
For the guest kernel mode ASID we regenerate on the spot when the guest
ASID is changed, as that will always take place while the guest is in
kernel mode.
However when the guest invalidates TLB entries the ASID will often by
changed temporarily as part of writing EntryHi without the guest
returning to user mode in between. We therefore regenerate the user mode
ASID lazily before entering the guest in user mode, if and only if the
guest ASID has actually changed since the last guest user mode entry.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
The host ASIDs for guest kernel and user mode are regenerated together
if the ASID for guest kernel mode is out of date. That is fine as the
ASID for guest kernel mode is always generated first, however it doesn't
allow the ASIDs to be regenerated or invalidated individually instead of
linearly flushing the entire host TLB.
Therefore separate the regeneration code so that the ASIDs are checked
and regenerated separately.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
When a guest TLB entry is replaced by TLBWI or TLBWR, we only invalidate
TLB entries on the local CPU. This doesn't work correctly on an SMP host
when the guest is migrated to a different physical CPU, as it could pick
up stale TLB mappings from the last time the vCPU ran on that physical
CPU.
Therefore invalidate both user and kernel host ASIDs on other CPUs,
which will cause new ASIDs to be generated when it next runs on those
CPUs.
We're careful only to do this if the TLB entry was already valid, and
only for the kernel ASID where the virtual address it mapped is outside
of the guest user address range.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.10.x-
As EBS does not mean anything reasonable in the context it is used, it
seems like a misspelling for EBX.
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
__kernel_get_syscall_map() and __kernel_clock_getres() use cmpli to
check if the passed in pointer is non zero. cmpli maps to a 32 bit
compare on binutils, so we ignore the top 32 bits.
A simple test case can be created by passing in a bogus pointer with
the bottom 32 bits clear. Using a clk_id that is handled by the VDSO,
then one that is handled by the kernel shows the problem:
printf("%d\n", clock_getres(CLOCK_REALTIME, (void *)0x100000000));
printf("%d\n", clock_getres(CLOCK_BOOTTIME, (void *)0x100000000));
And we get:
0
-1
The bigger issue is if we pass a valid pointer with the bottom 32 bits
clear, in this case we will return success but won't write any data
to the pointer.
I stumbled across this issue because the LLVM integrated assembler
doesn't accept cmpli with 3 arguments. Fix this by converting them to
cmpldi.
Fixes: a7f290dad3 ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel")
Cc: stable@vger.kernel.org # v2.6.15+
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When PCI Device pass-through is enabled via VFIO, KVM-PPC will
pin pages using get_user_pages_fast(). One of the downsides of
the pinning is that the page could be in CMA region. The CMA
region is used for other allocations like the hash page table.
Ideally we want the pinned pages to be from non CMA region.
This patch (currently only for KVM PPC with VFIO) forcefully
migrates the pages out (huge pages are omitted for the moment).
There are more efficient ways of doing this, but that might
be elaborate and might impact a larger audience beyond just
the kvm ppc implementation.
The magic is in new_iommu_non_cma_page() which allocates the
new page from a non CMA region.
I've tested the patches lightly at my end. The full solution
requires migration of THP pages in the CMA region. That work
will be done incrementally on top of this.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Merged via powerpc tree as that's where the changes are]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This supports PCI surprise hotplug. The design is highlighted as
below:
* The PCI slot's surprise hotplug capability is exposed through
device node property "ibm,slot-surprise-pluggable", meaning
PCI surprise hotplug will be disabled if skiboot doesn't support
it yet.
* The interrupt because of presence or link state change is raised
on surprise hotplug event. One event is allocated and queued to
the PCI slot for workqueue to pick it up and process in serialized
fashion. The code flow for surprise hotplug is same to that for
managed hotplug except: the affected PEs are put into frozen state
to avoid unexpected EEH error reporting in surprise hot remove path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This unfreezes PE when it's initialized because the PE might be put
into frozen state in the last hot remove path. It's not harmful to
do so if the PE is already in unfrozen state.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This exports eeh_pe_state_mark(). It will be used to mark the surprise
hot removed PE as isolated to avoid unexpected EEH error reporting in
surprise remove path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This exports @confirm_error_lock so that eeh_serialize_{lock, unlock}()
can be used to freeze the affected PE in PCI surprise hot remove path.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Function eeh_pe_set_option() is used to apply the requested options
(enable, disable, unfreeze) in EEH virtualization path. The semantics
of this function isn't complete until freezing is supported.
This allows to freeze the indicated PE. The new semantics is going to
be used in PCI surprise hot remove path, to freeze removed PCI devices
(PE) to avoid unexpected EEH error reporting.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When issuing PHB reset, OPAL API opal_pci_poll() is called to drive
the state machine in OPAL forward. However, we needn't always call
the function under some circumstances like reset deassert.
This avoids calling opal_pci_poll() when OPAL_SUCCESS is returned
from opal_pci_reset(). Except the overhead introduced by additional
one unnecessary OPAL call, I didn't run into real issue because of
this.
Reported-by: Pridhiviraj Paidipeddi <ppaiddipe@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Need to provide a dummy smp_fill_in_cpu_possible_map.
Fixes: 9b2f753ec2 ("sparc64: Fix cpu_possible_mask if nr_cpus is set")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following DTC warning with W=1:
"Node /memory has a reg or ranges property, but no unit name"
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
This patch fixes the following DTC warning with W=1:
"Node /memory has a reg or ranges property, but no unit name"
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
This patch fixes the following DTC warning with W=1:
"Node /memory has a reg or ranges property, but no unit name"
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
This patch fixes the following DTC warning with W=1:
"Node /soc has a reg or ranges property, but no unit name"
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
This patch fixes the following DTC warning with W=1:
"Node /soc has a reg or ranges property, but no unit name"
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
This patch fixes the following DTC warning with W=1:
"Node /soc has a reg or ranges property, but no unit name"
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
As noted in [1], "there are a number of problems with skeleton.dtsi,
and it would be prefereable to remove it entirely." This patch is to
remove skeleton.dtsi inclusion from berlin2.
[1] http://www.spinics.net/lists/arm-kernel/msg528080.html
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
As noted in [1], "there are a number of problems with skeleton.dtsi,
and it would be prefereable to remove it entirely." This patch is to
remove skeleton.dtsi inclusion from berlin2cd.
[1] http://www.spinics.net/lists/arm-kernel/msg528080.html
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
As noted in [1], "there are a number of problems with skeleton.dtsi,
and it would be prefereable to remove it entirely." This patch is to
remove skeleton.dtsi inclusion from berlin2q.
[1] http://www.spinics.net/lists/arm-kernel/msg528080.html
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
This patch adds the L2 cache topology for berlin4ct which has 1MB L2
cache.
[Sebastian: rename cache node from "l2-cache" to "cache"]
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
After commit f29a72c24a ("watchdog: dw_wdt: Convert to use watchdog
infrastructure"), the dw_wdt driver can support multiple variants, so
unconditionally enable all dw_wdt nodes now.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Commit ac82d12772 ("arm64: perf: add Cortex-A53 support") adds the
cortex A53 PMU support, thus instead of using the generic armv8-pmuv3
compatibility use the more specific Cortex A53 compatibility.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
After commit f29a72c24a ("watchdog: dw_wdt: Convert to use watchdog
infrastructure"), the dw_wdt driver can support multiple variants, so
unconditionally enable all dw_wdt nodes now.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
After commit f29a72c24a ("watchdog: dw_wdt: Convert to use watchdog
infrastructure"), the dw_wdt driver can support multiple variants, so
unconditionally enable all dw_wdt nodes now.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
The HTC GPIO driver is a pure GPIO driver and I just can not
see what it is doing inside MFD. Let's just move it to GPIO
and take this opportunity to move the platform data to
<linux/platform_data/gpio-htc-egpio.h>
Cc: arm@kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
SBBR mentions SPCR as a mandatory ACPI table. So enable it for ARM64
Earlycon should be set up as early as possible. ACPI boot tables are
mapped in arch/arm64/kernel/acpi.c:acpi_boot_table_init() that
is called from setup_arch() and that's where we parse SPCR.
So it has to be opted-in per-arch.
When ACPI_SPCR_TABLE is defined initialization of DT earlycon is
deferred until the DT/ACPI decision is done. Initialize DT earlycon
if ACPI is disabled.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Aleksey Makarov <aleksey.makarov@linaro.org>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, irq stack bootmem is allocated for all possible cpus
before nr_cpus value changes the list of possible cpus. As a result,
there is unnecessary wastage of bootmemory.
Move the irq stack bootmem allocation so that it happens after
possible cpu list is modified based on nr_cpus value.
Signed-off-by: Atish Patra <atish.patra@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If kernel boot parameter nr_cpus is set, it should define the number
of CPUs that can ever be available in the system i.e.
cpu_possible_mask. setup_nr_cpu_ids() overrides the nr_cpu_ids based
on the cpu_possible_mask during kernel initialization. If
cpu_possible_mask is not set based on the nr_cpus value, earlier part
of the kernel would be initialized using nr_cpus value leading to a
kernel crash.
Set cpu_possible_mask based on nr_cpus value. Thus setup_nr_cpu_ids()
becomes redundant and does not corrupt nr_cpu_ids value.
Signed-off-by: Atish Patra <atish.patra@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit af1b1a9b36 ("sparc64 mm: Fix base TSB sizing when hugetlb
pages are used") addressed the difference between hugetlb and THP
pages when computing TSB sizes. The following additional issues
were also discovered while working with the code.
In order to save memory, THP makes use of a huge zero page. This huge
zero page does not count against a task's RSS, but it does consume TSB
entries. This is similar to hugetlb pages. Therefore, count huge
zero page entries in hugetlb_pte_count.
Accounting of THP pages is done in the routine set_pmd_at().
Unfortunately, this does not catch the case where a THP page is split.
To handle this case, decrement the count in pmdp_invalidate().
pmdp_invalidate is only called when splitting a THP. However, 'sanity
checks' are added in case it is ever called for other purposes.
A more general issue exists with HPAGE_SIZE accounting.
hugetlb_pte_count tracks the number of HPAGE_SIZE (8M) pages. This
value is used to size the TSB for HPAGE_SIZE pages. However,
each HPAGE_SIZE page consists of two REAL_HPAGE_SIZE (4M) pages.
The TSB contains an entry for each REAL_HPAGE_SIZE page. Therefore,
the number of REAL_HPAGE_SIZE pages should be used to size the huge
page TSB. A new compile time constant REAL_HPAGE_PER_HPAGE is used
to multiply hugetlb_pte_count before sizing the TSB.
Changes from V1
- Fixed build issue if hugetlb or THP not configured
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To fix:
WARNING: vmlinux.o(.text.unlikely+0x580): Section mismatch in
reference from the function find_numa_latencies_for_group() to the
function .init.text:find_mlgroup()
The function find_numa_latencies_for_group() references the
function __init find_mlgroup(). This is often because
find_numa_latencies_for_group lacks a __init annotation or the
annotation of find_mlgroup is wrong.
It turns out find_numa_latencies_for_group is only called from:
static int __init numa_parse_mdesc(void)
and hence we can tag find_numa_latencies_for_group with __init.
In doing so we see that find_best_numa_node_for_mlgroup is only
called from within __init and hence can also be marked with __init.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Nitin Gupta <nitin.m.gupta@oracle.com>
Cc: Chris Hyser <chris.hyser@oracle.com>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: sparclinux@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As with dsb() and isb(), add a __tlbi() helper so that we can avoid
distracting asm boilerplate every time we want a TLBI. As some TLBI
operations take an argument while others do not, some pre-processor is
used to handle these two cases with different assembly blocks.
The existing tlbflush.h code is moved over to use the helper.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
[ rename helper to __tlbi, update comment and commit log ]
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm64 forces CONFIG_SMP=y with commit 4b3dc9679c, no need to
add SMP dependence for NUMA.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds an option to use XZ compression for the kernel image.
Currently this is only enabled for 64-bit Book3S targets, which is
roughly equivalent to the platforms that use the kernel's zImage
wrapper, and that have been tested.
The bulk of the 32-bit platforms and 64-bit BookE use uboot images,
which relies on uboot implementing XZ. In future we can enable XZ
support for those targets once someone has tested it.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This modifies the wrapper script so that the -Z option takes an argument
to specify the compression type. It can either be 'gz', 'xz' or 'none'.
The legazy --no-gzip and -z options are still supported and will set the
compression to none and gzip respectively, but they are not documented.
Only XZ -6 is used for compression rather than XZ -9. Using compression
levels higher than 6 requires the decompressor to build a large (64MB)
dictionary when decompressing and some environments cannot satisfy such
large allocations (e.g. POWER 6 LPAR partition firmware).
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This code is no longer used and can be removed.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently the powerpc boot wrapper has its own wrapper around zlib to
handle decompressing gzipped kernels. The kernel decompressor library
functions now provide a generic interface that can be used in the
pre-boot environment. This allows boot wrappers to easily support
different compression algorithms. This patch converts the wrapper to use
this new API, but does not add support for using new algorithms.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Most architectures allow the compression algorithm used to produced the
vmlinuz image to be selected as a kernel config option. In preperation
for supporting algorithms other than gzip in the powerpc boot wrapper
the makefile needs to be modified to use these config options.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The powerpc boot wrapper is potentially compiled with a separate
toolchain and/or toolchain flags than the rest of the kernel. The usual
case is a 64-bit big endian kernel builds a 32-bit big endian wrapper.
The main problem with this is that the wrapper does not have access to
the kernel headers (without a lot of gross hacks). To get around this
the required headers are copied into the build directory via several sed
scripts which rewrite problematic includes. This patch moves these
fixups out of the makefile into a separate .sed script file to clean up
makefile slightly.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Reword first paragraph of change log a little]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJX6H4uAAoJEHm+PkMAQRiG5sMH/3yzrMiUCSokdS+cvY+jgKAG
JS58JmRvBPz2mRaU3MRPBGRDeCz/Nc9LggL2ZcgM+E1ZYirlYyQfIED3lkqk5R07
kIN1wmb+kQhXyU4IY3fEX7joqyKC6zOy4DUChPkBQU0/0+VUmdVmcJvsuPlnMZtf
g95m0BdYTui+eDezASRqOEp3Lb5ONL4c3ao4yBP0LHF033ctj3VJQiyi5uERPZJ0
5e6Mo7Wxn78t9WqJLQAiEH46kTwT2plNlxf3XXqTenfIdbWhqE873HPGeSMa3VQV
VywXTpCpSPQsA8BYg66qIbebdKOhs9MOviHVfqDtwQlvwhjlBDya0gNHfI5fSy4=
=Y/L5
-----END PGP SIGNATURE-----
Merge tag 'v4.8-rc8' into drm-next
Linux 4.8-rc8
There was a lot of fallout in the imx/amdgpu/i915 drivers, so backmerge
it now to avoid troubles.
* tag 'v4.8-rc8': (1442 commits)
Linux 4.8-rc8
fault_in_multipages_readable() throws set-but-unused error
mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
radix tree: fix sibling entry handling in radix_tree_descend()
radix tree test suite: Test radix_tree_replace_slot() for multiorder entries
fix memory leaks in tracing_buffers_splice_read()
tracing: Move mutex to protect against resetting of seq data
MIPS: Fix delay slot emulation count in debugfs
MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
mm: delete unnecessary and unsafe init_tlb_ubc()
huge tmpfs: fix Committed_AS leak
shmem: fix tmpfs to handle the huge= option properly
blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
MIPS: Fix pre-r6 emulation FPU initialisation
arm64: kgdb: handle read-only text / modules
arm64: Call numa_store_cpu_info() earlier.
locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
nvme-rdma: only clear queue flags after successful connect
i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
perf/core: Limit matching exclusive events to one PMU
...
drivers/platform/x86/dell-smo8800.c is touched due to the following obscenity:
drivers/platform/x86/dell-smo8800.c ->
linux/interrupt.h ->
linux/hardirq.h ->
asm/hardirq.h ->
linux/irq.h ->
asm/hw_irq.h ->
asm/sections.h ->
asm/uaccess.h
is the only chain of includes pulling asm/uaccess.h there.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
It was introduced in "arch, x86: pmem api for ensuring durability
of persistent memory updates" in July 2015, along with the code
that really used that stuff. Three weeks later all that code
got moved to asm/pmem.h by "pmem, x86: move x86 PMEM API to new
pmem.h header"; include, however, was left behind.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The only reason it used to be needed had been a (bogus) use of
set_fs() in start_thread() and that got removed in 2011...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
it has no business in uaccess.h, everything else has it in pgtable.h and
the only user (mm/mmap.c) unconditionally pulls asm/pgtable.h via linux/mm.h.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.
CURRENT_TIME is also not y2038 safe.
This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_time() will be
extended to do range checks. Hence, it is necessary for all
file system timestamps to use current_time(). Also,
current_time() will be transitioned along with vfs to be
y2038 safe.
Note that whenever a single call to current_time() is used
to change timestamps in different inodes, it is because they
share the same time granularity.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Felipe Balbi <balbi@kernel.org>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The tx-threshold parameter sets the TX FIFO low water threshold
trigger for the Altera 16550-FIFO32 soft IP.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Exynos MIPI driver does not work anymore (it is board file only) so
it is removed. Remove also config options.
Cc: Inki Dae <inki.dae@samsung.com>
Cc: Donghwa Lee <dh09.lee@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Enable throttle function for SOC_THERM.
Set "hot" trips for cpu and gpu thermal zones, which
can trigger the SOC_THERM hardware throttle.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Set general "critical" trip temperatures for cpu, gpu, mem and pllx
thermal zones on Tegra210, these trips can trigger shut down or reset.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Adds soctherm node for Tegra210, and add cpu,
gpu, mem, pllx as thermal-zones. Set critical
trip temperatures for them.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Enable throttle function for SOC_THERM.
Set "hot" trips for cpu and gpu thermal zones, which
can trigger the SOC_THERM hardware throttle.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Set general "critical" trip temperatures for cpu, gpu, mem and pllx
thermal zones on Tegra132, these trips can trigger shut down or reset.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
The Tegra132 has the specific settings for soctherm,
so change to use campatible "nvidia,tegra132-soctherm" for it.
And adds cpu, gpu, mem and pllx thermal zones.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Enable throttle function for SOC_THERM.
Set "hot" trips for cpu and gpu thermal zones, which
can trigger the SOC_THERM hardware throttle.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Set general "critical" trip temperatures for cpu, gpu, mem and pllx
thermal zones for all Tegra124 platform, these trips can trigger
shut down or reset.
Tegra124 Jetson TK1 was already set "critical" trips before, so it
can overwrite the general values.
Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
The MMCR2 register is available twice, one time with number 785
(privileged access), and one time with number 769 (unprivileged,
but it can be disabled completely). In former times, the Linux
kernel was using the unprivileged register 769 only, but since
commit 8dd75ccb57 ("powerpc: Use privileged SPR number
for MMCR2"), it uses the privileged register 785 instead.
The KVM-PR code then of course also switched to use the SPR 785,
but this is causing older guest kernels to crash, since these
kernels still access 769 instead. So to support older kernels
with KVM-PR again, we have to support register 769 in KVM-PR, too.
Fixes: 8dd75ccb57
Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
On POWER8E and POWER8NVL, KVM-PR does not announce support for
64kB page sizes and 1TB segments yet. Looks like this has just
been forgotton so far, since there is no reason why this should
be different to the normal POWER8 CPUs.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Remove duplicate setting of the the "B" field when doing a tlbie(l).
In compute_tlbie_rb(), the "B" field is set again just before
returning the rb value to be used for tlbie(l).
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
We use logical negate where bitwise negate was intended. It means that
we never return -EINVAL here.
Fixes: ce11e48b7f ('KVM: PPC: E500: Add userspace debug stub support')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This takes out the code that arranges to run two (or more) virtual
cores on a single subcore when possible, that is, when both vcores
are from the same VM, the VM is configured with one CPU thread per
virtual core, and all the per-subcore registers have the same value
in each vcore. Since the VTB (virtual timebase) is a per-subcore
register, and will almost always differ between vcores, this code
is disabled on POWER8 machines, meaning that it is only usable on
POWER7 machines (which don't have VTB). Given the tiny number of
POWER7 machines which have firmware that allows them to run HV KVM,
the benefit of simplifying the code outweighs the loss of this
feature on POWER7 machines.
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
POWER8 has one virtual timebase (VTB) register per subcore, not one
per CPU thread. The HV KVM code currently treats VTB as a per-thread
register, which can lead to spurious soft lockup messages from guests
which use the VTB as the time source for the soft lockup detector.
(CPUs before POWER8 did not have the VTB register.)
For HV KVM, this fixes the problem by making only the primary thread
in each virtual core save and restore the VTB value. With this,
the VTB state becomes part of the kvmppc_vcore structure. This
also means that "piggybacking" of multiple virtual cores onto one
subcore is not possible on POWER8, because then the virtual cores
would share a single VTB register.
PR KVM emulates a VTB register, which is per-vcpu because PR KVM
has no notion of CPU threads or SMT. For PR KVM we move the VTB
state into the kvmppc_vcpu_book3s struct.
Cc: stable@vger.kernel.org # v3.14+
Reported-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Fix up the silent merge conflict between commit c291b01515 in x86/urgent
and commit f7c28833c2 in x86/apic which both remove num_processors++
from the original location and then add it at two different locations. As a
result num_processors is incremented twice which can cut the number of
available cpus in half.
Remove the one which is added by commit c291b01515.
In hindsight I should have merged x86/urgent into x86/apic _before_ adding
the nodeid bits, but in hindsight we are always smarter.
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Fixes: 1e1b37273c ("Merge branch 'x86/urgent' into x86/apic")
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1609261350090.5483@nanos
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Bring in the upstream modifications so we can fixup the silent merge
conflict which is introduced by this merge.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use the new sun7i-a20-mmc compatible for the mmc controllers on sun7i
and newer.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
This patch updates the s3c24xx dma driver to be able to pass a
dma_slave_map array via the platform data. This is needed to
be able to use the new, simpler dmaengine API [1].
I used the virtual DMA channels as a parameter for the dma_filter
function. By doing that, I could reuse the existing filter function in
drivers/dma/s3c24xx-dma.c.
I have tested this on my mini2440 board with the audio driver.
According to my observations, dma_request_slave_channel in the
function dmaengine_pcm_new in the file
sound/soc/soc-generic-dmaengine-pcm.c now returns a valid DMA channel
whereas before no DMA channel was returned at that point.
Entries for DMACH_XD0, DMACH_XD1 and DMACH_TIMER are missing because I
don't realy know which driver to use for these.
[1]
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-December/393635.html
Signed-off-by: Sam Van Den Berge <sam.van.den.berge@telenet.be>
Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add methods to map/unmap device resources addresses for dma_map_ops that
are IOMMU aware. This is needed to map a device MMIO register from a
physical address.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Move OF_NUMA select under NUMA config, and select ACPI_NUMA
when ACPI enabled.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In some places, dump_backtrace() is called with a NULL tsk parameter,
e.g. in bug_handler() in arch/arm64, or indirectly via show_stack() in
core code. The expectation is that this is treated as if current were
passed instead of NULL. Similar is true of unwind_frame().
Commit a80a0eb70c ("arm64: make irq_stack_ptr more robust") didn't
take this into account. In dump_backtrace() it compares tsk against
current *before* we check if tsk is NULL, and in unwind_frame() we never
set tsk if it is NULL.
Due to this, we won't initialise irq_stack_ptr in either function. In
dump_backtrace() this results in calling dump_mem() for memory
immediately above the IRQ stack range, rather than for the relevant
range on the task stack. In unwind_frame we'll reject unwinding frames
on the IRQ stack.
In either case this results in incomplete or misleading backtrace
information, but is not otherwise problematic. The initial percpu areas
(including the IRQ stacks) are allocated in the linear map, and dump_mem
uses __get_user(), so we shouldn't access anything with side-effects,
and will handle holes safely.
This patch fixes the issue by having both functions handle the NULL tsk
case before doing anything else with tsk.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: a80a0eb70c ("arm64: make irq_stack_ptr more robust")
Acked-by: James Morse <james.morse@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Simplify exit_mce_inject() by using debugfs_remove_recursive() and do
away with the noodling over the dentry elements.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160926083152.30848-3-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change predecrement compare to post decrement compare to avoid an
unsigned integer wrap-around comparisomn when decrementing in the while
loop.
For example, if the debugfs_create_file() fails when 'i' is zero, the
current situation will predecrement 'i' in the while loop, wrapping 'i' to
the maximum signed integer and cause multiple out of bounds reads on
dfs_fls[i].d as the loop interates to zero.
Also, as Borislav Petkov suggested, return -ENODEV rather than -ENOMEM
on the error condition.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: http://lkml.kernel.org/r/20160926083152.30848-2-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In many of clk_disable() implementations, it is a no-op for a NULL
pointer input, but this is one of the exceptions.
Making it treewide consistent will allow clock consumers to call
clk_disable() without NULL pointer check.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
The old style use of printk(KERN_INFO) is depracated. Convert use of it
in setup_no.c to the modern pr_info().
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
During the arch setup phase of kernel boot we print out in the boot banner
that we are uClinux configured. The printk currently contains a bunch of
useless newlines and carriage returns - producing wastefull empty lines.
Remove these.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
The uboot command line support needs to be used by both MMU and no-MMU
setups, but currently we only have the code in the no-MMU code paths.
Move the uboot command line processing code into its own file. Add
appropriate calls to it from both the MMU and no-MMU arch setup code.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
If we boot up and find no hardware FPU we panic and die.
Change this behavior to be that if we boot up and we _expect_ a hardware
FPU to be present then panic. Don't panic if we don't actually expect to
have any hardware FPU.
This lets us compile a kernel without FPU if we really choose too.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Most of the m68k code that supports a hardware FPU is surrounded by
CONFIG_FPU. Be consistent and surround the hardware FPU instruction
setup in setup_mm.c with CONFIG_FPU as well as the check for
CONFIG_M68KFPU_EMU_ONLY.
The existing classic m68k architectures all define CONFIG_FPU, so they
see no change from this. But on ColdFire where we do not support the
emulated FP code we can now compile without CONFIG_FPU being set as well.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Our local m68k architecture dump_fpu() is conditionally compiled in on
CONFIG_FPU. That is OK for all existing MMU enabled CPU types, but won't
handle the case for some ColdFire SoC CPU parts that we want to support
that have no FPU hardware.
dump_fpu() is expected to be present by the ELF loader, so we must always
have it available and exported.
Remove the conditional and reorganize the dump_fpu hard FPU code path
to let the compiler remove code when not needed.
This change based on changes and discussion from Yannick Gicquel
<yannick.gicquel@open.eurogiciel.org>.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
The ACR registers of the ColdFire define at a macro level what regions
of the addresses space should have caching or other attribute types applied.
Currently for the MMU enabled setups we map the interal IO peripheral addres
space as uncachable based on the define for the MBAR address (CONFIG_MBAR).
Not all ColdFire SoC use a programmable MBAR register address. Some parts
have fixed addressing for their internal peripheral registers.
Generalize the way we get the internal peripheral base address so all types
can be accomodated in the ACR definitions. Each ColdFire SoC type now sets
its IO memory base and size definitions (which may be based on MBAR) which
are then used in the ACR definitions.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
The early ColdFire bootmem_alloc() code is currently only included in
the board support for the Coldire 54xx platforms. It will be used on all
ColdFire MMU enabled platforms as others are supported. So move the
mcf54xx_bootmem_alloc() function to be generally available to all MMU
enabled ColdFire parts (and use a more generic name for it).
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Not all ColdFire SoC parts that have an MMU also have an FPU - so set
an FPU type (via m68k_fputype) appropriate for the configured platform.
With this set correctly /proc/cpuinfo will report FPU "none" on devices
that don't have one. And kernel code paths that initialize FPU hardware
will now only execute if an FPU is actually present.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Create a new machine type for platforms based around the ColdFire 5441x
SoC family. Set that machine type on startup when building for this
platform type.
Currently the ColdFire head.S hard codes a M54xx machine type at startup -
since that is the only platform type currently supported with MMU enabled.
The m5441x has an MMU and this change forms part of the support required
to run it with the MMU enabled.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Move the selection of CONFIG_FPU to each CPU type configuration.
Currently for m68k we have a global set of CONFIG_FPU based on if CONFIG_MMU
is enabled or not. There is at least one CPU family we support (m5441x)
that has an MMU but has no FPU hardware. So we need to be able to have
CONFIG_MMU set and CONFIG_FPU not set.
Whether we build for a CPU with MMU enabled or not doesn't change the
fact that it has FPU hardware support. Our current non-MMU builds have
never had CONIG_FPU enabled - and in fact the kernel will not compile
with that set and CONFIG_MMU not set at the moment. It is easy enough
to fix this - but it would involve a structure change to sigcontext.h,
and that is a user space exported header (so ABI change).
This change makes no configuration visible changes, and all configs
end up with the same configuration settings as before.
This change based on changes and discussion from Yannick Gicquel
<yannick.gicquel@open.eurogiciel.org>.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
The pin write code that supports the UART signals is not using he correct
word write IO access method. It correctly reads the correct 16 bit
registrer, it should also write the new value back with a 16 bit write.
Fix it to use writew().
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Most ColdFire support code has switched to using IO memory access
methods (readb/writeb/etc) when reading and writing internal peripheral
device registers. The WildFire board specific halt code was missed.
As it is now the WildFire code is broken, since all register definitions
were changed to be register addresses only some time ago.
Fix the WildFire board code to use the appropriate IO access functions.
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
The early setup code for the ColdFire 53xx platform accesses variables
before the RAM and other system initialization steps may have taken place.
Currently it has 2 global variables that will end up in the bss section
that are accessed during this early setup. There is a special static RAM
stack setup at this time, but not necessarily the RAM where kernel data
sections will end up.
Even on system setups where RAM is setup by a boot loader the access
to the early setup variables is before the BSS section has been initialized.
This can potentially corrupt a ram loaded root filesystem that sits in that
memory area before it has been moved.
These 2 variables are not used at all after being set, and can just be
removed.
Reported-by: Christian Gieseler <christiangieseler@yahoo.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
- powernv/pci: Fix m64 checks for SR-IOV and window alignment from Russell Currey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJX50CvAAoJEFHr6jzI4aWAqJAP/0/0D8YGwOuIoYD2GmfoasKR
TFbbuhX3xnfdiRG6w/sFBI3oh7icCw7hC+Qj1lNu9D3L/UkxOTBny+W07KvWzX44
Yu74nEHgq3mVrRAU4McztbKIUBK2zagGwwCcGZXZl/uQI1ylvmmpcH3xClQzF+oA
xKk8eB1OW2Ay6+y+FkSuyBHHSfww6QCk7ERPqaStCW9Uy+dDBjIwStLQuOpAhN/o
Z9K+JwpPJ8qgw1Pe9pvrD5MjcM0hR+tUZm6LklZCCk89feqlwcrz9cpOrmTdGuF+
n1iacpDaFf6IOlhI+6ImrT15llTgSk/nu9GNIRFDwOjVCuGy5aDQBtWuRFiVNggp
vkZWFSl594Jn5H9/s6MpMXygSl36NMKgM/ZKvUsEAe6mF0Kb9pZRB7b/aV+ajkCQ
rkQCe0KKSF6+D3wu3SmMe0NTc3/GkgxZN0lTnqUaB5PSRqwvVwurXugnAKr7arhj
JSu9/QSeOxNI5ytDF1Nf9/RN0DT+L1w0vun083DupyJkG1hrjzm9kI0lACQTr/QX
TxAWXGjiTsUOeM4pfNzqaJE4fNUc0TIc41jgWMx9qXzbKjhijgKEPtmyDMz93GVY
hFXyRAMsWUOsQGP5tiLFYG0PkNsmCDIwca+yg47EicBQGTpEsGLYUBRvIILYNBKI
ULl0yMLZWekl1rzthDdB
=35xQ
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull one more powerpc fix from Michael Ellerman:
"powernv/pci: Fix m64 checks for SR-IOV and window alignment from
Russell Currey"
* tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment
The config option HAVE_UNSTABLE_SCHED_CLOCK is set automatically when compiling
for SMP. There is no need to clear the stable-clock flag via
clear_sched_clock_stable() when starting secondary CPUs, and even worse,
clearing it triggers wrong self-detected CPU stall warnings on 64bit Mako
machines.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.7+
Commit 432c6bacbd ("MIPS: Use per-mm page to execute branch delay slot
instructions") accidentally removed use of the MIPS_FPU_EMU_INC_STATS
macro from do_dsemulret, leading to the ds_emul file in debugfs always
returning zero even though we perform delay slot emulations.
Fix this by re-adding the use of the MIPS_FPU_EMU_INC_STATS macro.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 432c6bacbd ("MIPS: Use per-mm page to execute branch delay slot instructions")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14301/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch fixes the possibility of a deadlock when bringing up
secondary CPUs.
The deadlock occurs because the set_cpu_online() is called before
synchronise_count_slave(). This can cause a deadlock if the boot CPU,
having scheduled another thread, attempts to send an IPI to the
secondary CPU, which it sees has been marked online. The secondary is
blocked in synchronise_count_slave() waiting for the boot CPU to enter
synchronise_count_master(), but the boot cpu is blocked in
smp_call_function_many() waiting for the secondary to respond to it's
IPI request.
Fix this by marking the CPU online in cpu_callin_map and synchronising
counters before declaring the CPU online and calculating the maps for
IPIs.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Reported-by: Justin Chen <justinpopo6@gmail.com>
Tested-by: Justin Chen <justinpopo6@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: stable@vger.kernel.org # v4.1+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14302/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When faulting on some trap, the kernel currently reports in dmesg:
do_page_fault() command='perl' type=6 address=0xbe400403 in libcrypt-2.24.so[f9086000+9000]
vm_start = 0x00922000, vm_end = 0x00aed000
With this change the trap type additionally gets reported as human readable
string which makes it simpler to recognize the type of problem:
do_page_fault() command='perl' type=6 address=0xbe400403 in libcrypt-2.24.so[f9086000+9000]
trap #6: Instruction TLB miss fault, vm_start = 0x00922000, vm_end = 0x00aed000
Signed-off-by: Helge Deller <deller@gmx.de>
Pull perf fixes from Thomas Gleixner:
"Three fixlets for perf:
- add a missing NULL pointer check in the intel BTS driver
- make BTS an exclusive PMU because BTS can only handle one event at
a time
- ensure that exclusive events are limited to one PMU so that several
exclusive events can be scheduled on different PMU instances"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Limit matching exclusive events to one PMU
perf/x86/intel/bts: Make it an exclusive PMU
perf/x86/intel/bts: Make sure debug store is valid
Pull locking fixes from Thomas Gleixner:
"Two smallish fixes:
- use the proper asm constraint in the Super-H atomic_fetch_ops
- a trivial typo fix in the Kconfig help text"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
locking/atomic, arch/sh: Fix ATOMIC_FETCH_OP()
Pull EFI fixes from Thomas Gleixner:
"Two fixes for EFI/PAT:
- a 32bit overflow bug in the PAT code which was unearthed by the
large EFI mappings
- prevent a boot hang on large systems when EFI mixed mode is enabled
but not used"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Only map RAM into EFI page tables if in mixed-mode
x86/mm/pat: Prevent hang during boot when mapping pages
In case of error, the function platform_device_register_simple()
returns ERR_PTR() and never returns NULL. The NULL test in the
return value check must therefor be replaced with IS_ERR().
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Vadim Pasternak <vadimp@mellanox.com>
Cc: platform-driver-x86@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus reported the following objtool warning:
kernel/signal.o: warning: objtool: .altinstr_replacement+0x54: call without frame pointer save/setup
The warning is valid. It's caused by the fact that gcc placed the call
instruction in alternative_call_2()'s inline asm before the frame
pointer setup, which breaks frame pointer convention and can result in a
bad stack trace.
Force a stack frame to be created before the call instruction by listing
the stack pointer as an output operand in the inline asm statement.
Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160923214939.j5o7c67nhepzmh3t@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Fix kgdb breakpoint insertion in read-only text sections (when
CONFIG_DEBUG_RODATA or CONFIG_DEBUG_SET_MODULE_RONX are enabled)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJX5SUUAAoJEGvWsS0AyF7xJhsQAIBvpLUNntihhcKTLbBkpQ96
ni/cludUBgQ8fil3leiR9zmfDDI1meohgETct2Z7qMHaTCx5Ej72GG5cdQl71quy
a2Ks/k7BlwsMLMzHMQLreZhF/1z0XFzOwjbXIMAvxTLUNHCZRvtuZ3mES5+JTtYc
u/obI3JhuKJjkV74CFSfSd6NoRP+Xgmzqmj/afN2TKsZwczqJw33BeNABWxi41SR
o3OcqgJppmQbCyH7KeKsYrG3zghpzraDj9e/qwWwVDiw00bWar+rE2sxmQM5pHuT
RsCmRXIVy/nbVduKl9t4UamkvMsS+dL048bRD2WIOu8hUkZKgpT/c6KcBgspYGXm
ZFBSJZITamU3r9txNMn/WNVp/S2i3KaojrxzrTPG/6wencT5ZfCOfBqWB44I6Xf4
PG8mfmHzEJwpHUMwz5mSnkigNWe5y3CJhRNUO3GQp+4ULAVo3JbD1oyF57WC2PJh
30MOPfl+jxv+uMT83T1zDsIZ0JdufvuaHJ7d9AP1wVcK5PVGW6icuO5Krd3fk7CQ
VJ1yflwya9TVEqRUQDDHwnKlCHohILJomx9myV578/lmEOs8FgA5Dbx+/grgS6/P
+9YKB7hv8uK26uGO/bn41YCqhgCoJX2YqEi4qECRLwDjNtNyYD3XBLT25jAbiopi
Gfew0xtNcFagew1BHYzD
=NCBp
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"A couple of last-minute arm64 fixes for 4.8:
- Fix secondary CPU to NUMA node assignment
- Fix kgdb breakpoint insertion in read-only text sections (when
CONFIG_DEBUG_RODATA or CONFIG_DEBUG_SET_MODULE_RONX are enabled)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kgdb: handle read-only text / modules
arm64: Call numa_store_cpu_info() earlier.
In the mipsr2_decoder() function, used to emulate pre-MIPSr6
instructions that were removed in MIPSr6, the init_fpu() function is
called if a removed pre-MIPSr6 floating point instruction is the first
floating point instruction used by the task. However, init_fpu()
performs varous actions that rely upon not being migrated. For example
in the most basic case it sets the coprocessor 0 Status.CU1 bit to
enable the FPU & then loads FP register context into the FPU registers.
If the task were to migrate during this time, it may end up attempting
to load FP register context on a different CPU where it hasn't set the
CU1 bit, leading to errors such as:
do_cpu invoked from kernel context![#2]:
CPU: 2 PID: 7338 Comm: fp-prctl Tainted: G D 4.7.0-00424-g49b0c82 #2
task: 838e4000 ti: 88d38000 task.ti: 88d38000
$ 0 : 00000000 00000001 ffffffff 88d3fef8
$ 4 : 838e4000 88d38004 00000000 00000001
$ 8 : 3400fc01 801f8020 808e9100 24000000
$12 : dbffffff 807b69d8 807b0000 00000000
$16 : 00000000 80786150 00400fc4 809c0398
$20 : 809c0338 0040273c 88d3ff28 808e9d30
$24 : 808e9d30 00400fb4
$28 : 88d38000 88d3fe88 00000000 8011a2ac
Hi : 0040273c
Lo : 88d3ff28
epc : 80114178 _restore_fp+0x10/0xa0
ra : 8011a2ac mipsr2_decoder+0xd5c/0x1660
Status: 1400fc03 KERNEL EXL IE
Cause : 1080002c (ExcCode 0b)
PrId : 0001a920 (MIPS I6400)
Modules linked in:
Process fp-prctl (pid: 7338, threadinfo=88d38000, task=838e4000, tls=766527d0)
Stack : 00000000 00000000 00000000 88d3fe98 00000000 00000000 809c0398 809c0338
808e9100 00000000 88d3ff28 00400fc4 00400fc4 0040273c 7fb69e18 004a0000
004a0000 004a0000 7664add0 8010de18 00000000 00000000 88d3fef8 88d3ff28
808e9100 00000000 766527d0 8010e534 000c0000 85755000 8181d580 00000000
00000000 00000000 004a0000 00000000 766527d0 7fb69e18 004a0000 80105c20
...
Call Trace:
[<80114178>] _restore_fp+0x10/0xa0
[<8011a2ac>] mipsr2_decoder+0xd5c/0x1660
[<8010de18>] do_ri+0x90/0x6b8
[<80105c20>] ret_from_exception+0x0/0x10
Fix this by disabling preemption around the call to init_fpu(), ensuring
that it starts & completes on one CPU.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: b0a668fb20 ("MIPS: kernel: mips-r2-to-r6-emul: Add R2 emulator for MIPS R6")
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.0+
Patchwork: https://patchwork.linux-mips.org/patch/14305/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Instead of comparing the name to a magic string, use archdata to
explicitly communicate whether the arch timer is suitable for
direct vdso access.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Scott Wood <oss@buserror.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Erratum A-008585 says that the ARM generic timer counter "has the
potential to contain an erroneous value for a small number of core
clock cycles every time the timer value changes". Accesses to TVAL
(both read and write) are also affected due to the implicit counter
read. Accesses to CVAL are not affected.
The workaround is to reread TVAL and count registers until successive
reads return the same value. Writes to TVAL are replaced with an
equivalent write to CVAL.
The workaround is to reread TVAL and count registers until successive reads
return the same value, and when writing TVAL to retry until counter
reads before and after the write return the same value.
The workaround is enabled if the fsl,erratum-a008585 property is found in
the timer node in the device tree. This can be overridden with the
clocksource.arm_arch_timer.fsl-a008585 boot parameter, which allows KVM
users to enable the workaround until a mechanism is implemented to
automatically communicate this information.
This erratum can be found on LS1043A and LS2080A.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Scott Wood <oss@buserror.net>
[will: renamed read macro to reflect that it's not usually unstable]
Signed-off-by: Will Deacon <will.deacon@arm.com>
A new accessor for gic_write_bpr1 is added to arch_gicv3.h in 4.9,
whilst the CP15 accessors are redifined in a separate branch.
This leads to a horrible clash, where the new accessor ends up with
a crap "asm volatile" definition.
Work around this by carrying our own definition of gic_write_bpr1,
creating a small conflict which will be obvious to resolve.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
the tilcdc quirks:
- Fix typo with recent beagleboard-x15 changes for mmc2_pins_default
- Add am335x blue-and-red-wiring quirk as specified in the binding in
Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt. Also
fix up the whitespace formatting for am335x-evmsk.
- Fix for recent igepv5 power button for GPIO_ACTIVE_LOW. Also fix
up the whitespace formatting for the button
-----BEGIN PGP SIGNATURE-----
iQIuBAABCAAYBQJX4w32ERx0b255QGF0b21pZGUuY29tAAoJEBvUPslcq6VzGggP
+waNl/O3wMsqw4VlTKxOKWAT3KtS3pWiCs0U0YO7KXV4txKqwinVr2uLn4HRb9Jv
W254ngX7m0II6d4ws04+amcWuCmCaebwgm7zZwb4vf20TQ9MIns4oymPWXQRLhYK
NY4ADMz+7nQ5uevhGtxOXqQqOK1aE5l+51zJ93ytMvqaq806ho8W9XaNo9kEUBI6
0521PFDJuK+6rwIbb8VwXLE6cUajKz/j/25Nq5lC7RUpN+tHyfo+WSGesTW4+/Mi
/27c9LS0tsrYlk1MRYPZ5c2ly1NBwTBCtdg6lJxgvWapH5+8YNdxgfgT8Sym+h8W
aN3ZpiYkpdWhTptU4TW3o9sETxWZWAKn0TS6YklAzlXylZ6PTTRuLNzYV92P5IUq
AtGIyfInkhk9tiioQo6b9ftwjxQYvTTd0GUpVmvCwTCjxQN9yZ5ut8ZaCaIAXiq4
FzH5j36QmhLd9xaDiu5Gi+RqPhlIJXQVyOm/4fdCY8RWnQq2k6yh5+TLpErXIR0q
xM4fY/DH9CtqTR8T6zqvwOnakSOvy9cDlVfCrP1nB2L2IT82oso82bkajnSMjWNB
lDCE707Iwj13hkvAOfPoT03YAZqq3T6zc2VARRSXKyt6D83NT5q7I+xOxp3kL4U5
dXiDX+qKcjkrb6/lqKyCxq7cK/gYh60dkHjomRE7E6Ij
=14Zw
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.9/dt-pt3-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/dt
Pull "few minor fixes for omap dts files for v4.9 merge window"
Few fixes for omap dts files for v4.9 merge window. Let's also add
the tilcdc quirks:
- Fix typo with recent beagleboard-x15 changes for mmc2_pins_default
- Add am335x blue-and-red-wiring quirk as specified in the binding in
Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt. Also
fix up the whitespace formatting for am335x-evmsk.
- Fix for recent igepv5 power button for GPIO_ACTIVE_LOW. Also fix
up the whitespace formatting for the button
* tag 'omap-for-v4.9/dt-pt3-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap5-igep0050.dts: Use tabs for indentation
ARM: dts: Fix igepv5 power button GPIO direction
ARM: dts: am335x-evmsk: Add blue-and-red-wiring -property to lcdc node
ARM: dts: am335x-evmsk: Whitespace cleanup of lcdc related nodes
ARM: dts: am335x-evm: Add blue-and-red-wiring -property to lcdc node
ARM: dts: am57xx-beagle-x15-common: Fix wrong pinctrl selection for mmc2
Return value of class_create should be considered in module init function.
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
According to commit ef158bdf83
("mtd: Remove unused symbol CONFIG_MTDRAM_ABS_POS")
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
We unlocked a few lines earlier so we can delete these unlocks.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
Add set_dev_domain_options() to set PCI domain-specific options as devices
are added. The first usage is to request exclusive userspace control of
PCIe hotplug indicators in VMD domains.
Devices in a VMD domain use PCIe hotplug Attention and Power Indicators in
a non-standard way; tell pciehp to ignore the indicators so userspace can
control them via the sysfs "attention" file.
To determine whether a bus is within a VMD domain, add a bool to the
pci_sysdata structure that the VMD driver sets during initialization.
[bhelgaas: changelog]
Requested-by: Kapil Karkra <kapil.karkra@intel.com>
Tested-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This file was only including module.h for exception table related
functions. We've now separated that content out into its own file
"extable.h" so now move over to that and avoid all the extra header
content in module.h that we don't really need to compile this file.
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: linux-cris-kernel@axis.com
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
The Kconfig CONFIG_ETRAX_AXISFLASHMAP_MTD0WHOLE
does not exist for crisv10, so remove it.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
commit d47529b2e9
"gpio: don't include module.h in shared driver header"
removed <linux/module.h> from the <linux/gpio/driver.h> header.
It seems arch/arm/mach-omap2/board-rx51-peripherals.c
is using __initdata_or_module from <linux/module.h> through
<linux/gpio.h> to <linux/gpio/driver.h>, so break this dependency
so that we get a clean compile.
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Tony Lindgren <tony@atomide.com>
Fixes: d47529b2e9 ("gpio: don't include module.h in shared driver header")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Handle read-only cases when CONFIG_DEBUG_RODATA (4.0) or
CONFIG_DEBUG_SET_MODULE_RONX (3.18) are enabled by using
aarch64_insn_write() instead of probe_kernel_write() as introduced by
commit 2f896d5866 ("arm64: use fixmap for text patching") in 4.0.
Fixes: 11d91a770f ("arm64: Add CONFIG_DEBUG_SET_MODULE_RONX support")
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The wq_numa_init() function makes a private CPU to node map by calling
cpu_to_node() early in the boot process, before the non-boot CPUs are
brought online. Since the default implementation of cpu_to_node()
returns zero for CPUs that have never been brought online, the
workqueue system's view is that *all* CPUs are on node zero.
When the unbound workqueue for a non-zero node is created, the
tsk_cpus_allowed() for the worker threads is the empty set because
there are, in the view of the workqueue system, no CPUs on non-zero
nodes. The code in try_to_wake_up() using this empty cpumask ends up
using the cpumask empty set value of NR_CPUS as an index into the
per-CPU area pointer array, and gets garbage as it is one past the end
of the array. This results in:
[ 0.881970] Unable to handle kernel paging request at virtual address fffffb1008b926a4
[ 1.970095] pgd = fffffc00094b0000
[ 1.973530] [fffffb1008b926a4] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[ 1.982610] Internal error: Oops: 96000004 [#1] SMP
[ 1.987541] Modules linked in:
[ 1.990631] CPU: 48 PID: 295 Comm: cpuhp/48 Tainted: G W 4.8.0-rc6-preempt-vol+ #9
[ 1.999435] Hardware name: Cavium ThunderX CN88XX board (DT)
[ 2.005159] task: fffffe0fe89cc300 task.stack: fffffe0fe8b8c000
[ 2.011158] PC is at try_to_wake_up+0x194/0x34c
[ 2.015737] LR is at try_to_wake_up+0x150/0x34c
[ 2.020318] pc : [<fffffc00080e7468>] lr : [<fffffc00080e7424>] pstate: 600000c5
[ 2.027803] sp : fffffe0fe8b8fb10
[ 2.031149] x29: fffffe0fe8b8fb10 x28: 0000000000000000
[ 2.036522] x27: fffffc0008c63bc8 x26: 0000000000001000
[ 2.041896] x25: fffffc0008c63c80 x24: fffffc0008bfb200
[ 2.047270] x23: 00000000000000c0 x22: 0000000000000004
[ 2.052642] x21: fffffe0fe89d25bc x20: 0000000000001000
[ 2.058014] x19: fffffe0fe89d1d00 x18: 0000000000000000
[ 2.063386] x17: 0000000000000000 x16: 0000000000000000
[ 2.068760] x15: 0000000000000018 x14: 0000000000000000
[ 2.074133] x13: 0000000000000000 x12: 0000000000000000
[ 2.079505] x11: 0000000000000000 x10: 0000000000000000
[ 2.084879] x9 : 0000000000000000 x8 : 0000000000000000
[ 2.090251] x7 : 0000000000000040 x6 : 0000000000000000
[ 2.095621] x5 : ffffffffffffffff x4 : 0000000000000000
[ 2.100991] x3 : 0000000000000000 x2 : 0000000000000000
[ 2.106364] x1 : fffffc0008be4c24 x0 : ffffff0ffffada80
[ 2.111737]
[ 2.113236] Process cpuhp/48 (pid: 295, stack limit = 0xfffffe0fe8b8c020)
[ 2.120102] Stack: (0xfffffe0fe8b8fb10 to 0xfffffe0fe8b90000)
[ 2.125914] fb00: fffffe0fe8b8fb80 fffffc00080e7648
.
.
.
[ 2.442859] Call trace:
[ 2.445327] Exception stack(0xfffffe0fe8b8f940 to 0xfffffe0fe8b8fa70)
[ 2.451843] f940: fffffe0fe89d1d00 0000040000000000 fffffe0fe8b8fb10 fffffc00080e7468
[ 2.459767] f960: fffffe0fe8b8f980 fffffc00080e4958 ffffff0ff91ab200 fffffc00080e4b64
[ 2.467690] f980: fffffe0fe8b8f9d0 fffffc00080e515c fffffe0fe8b8fa80 0000000000000000
[ 2.475614] f9a0: fffffe0fe8b8f9d0 fffffc00080e58e4 fffffe0fe8b8fa80 0000000000000000
[ 2.483540] f9c0: fffffe0fe8d10000 0000000000000040 fffffe0fe8b8fa50 fffffc00080e5ac4
[ 2.491465] f9e0: ffffff0ffffada80 fffffc0008be4c24 0000000000000000 0000000000000000
[ 2.499387] fa00: 0000000000000000 ffffffffffffffff 0000000000000000 0000000000000040
[ 2.507309] fa20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.515233] fa40: 0000000000000000 0000000000000000 0000000000000000 0000000000000018
[ 2.523156] fa60: 0000000000000000 0000000000000000
[ 2.528089] [<fffffc00080e7468>] try_to_wake_up+0x194/0x34c
[ 2.533723] [<fffffc00080e7648>] wake_up_process+0x28/0x34
[ 2.539275] [<fffffc00080d3764>] create_worker+0x110/0x19c
[ 2.544824] [<fffffc00080d69dc>] alloc_unbound_pwq+0x3cc/0x4b0
[ 2.550724] [<fffffc00080d6bcc>] wq_update_unbound_numa+0x10c/0x1e4
[ 2.557066] [<fffffc00080d7d78>] workqueue_online_cpu+0x220/0x28c
[ 2.563234] [<fffffc00080bd288>] cpuhp_invoke_callback+0x6c/0x168
[ 2.569398] [<fffffc00080bdf74>] cpuhp_up_callbacks+0x44/0xe4
[ 2.575210] [<fffffc00080be194>] cpuhp_thread_fun+0x13c/0x148
[ 2.581027] [<fffffc00080dfbac>] smpboot_thread_fn+0x19c/0x1a8
[ 2.586929] [<fffffc00080dbd64>] kthread+0xdc/0xf0
[ 2.591776] [<fffffc0008083380>] ret_from_fork+0x10/0x50
[ 2.597147] Code: b00057e1 91304021 91005021 b8626822 (b8606821)
[ 2.603464] ---[ end trace 58c0cd36b88802bc ]---
[ 2.608138] Kernel panic - not syncing: Fatal exception
Fix by moving call to numa_store_cpu_info() for all CPUs into
smp_prepare_cpus(), which happens before wq_numa_init(). Since
smp_store_cpu_info() now contains only a single function call,
simplify by removing the function and out-lining its contents.
Suggested-by: Robert Richter <rric@kernel.org>
Fixes: 1a2db30034 ("arm64, numa: Add NUMA support for arm64 platforms.")
Cc: <stable@vger.kernel.org> # 4.7.x-
Signed-off-by: David Daney <david.daney@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Tested-by: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Run kvm-unit-tests/eventinj.flat in L1:
Sending NMI to self
After NMI to self
FAIL: NMI
This test scenario is to test whether VMM can handle NMI IDT-vectoring info correctly.
At the beginning, L2 writes LAPIC to send a self NMI, the EPT page tables on both L1
and L0 are empty so:
- The L2 accesses memory can generate EPT violation which can be intercepted by L0.
The EPT violation vmexit occurred during delivery of this NMI, and the NMI info is
recorded in vmcs02's IDT-vectoring info.
- L0 walks L1's EPT12 and L0 sees the mapping is invalid, it injects the EPT violation into L1.
The vmcs02's IDT-vectoring info is reflected to vmcs12's IDT-vectoring info since
it is a nested vmexit.
- L1 receives the EPT violation, then fixes its EPT12.
- L1 executes VMRESUME to resume L2 which generates vmexit and causes L1 exits to L0.
- L0 emulates VMRESUME which is called from L1, then return to L2.
L0 merges the requirement of vmcs12's IDT-vectoring info and injects it to L2 through
vmcs02.
- The L2 re-executes the fault instruction and cause EPT violation again.
- Since the L1's EPT12 is valid, L0 can fix its EPT02
- L0 resume L2
The EPT violation vmexit occurred during delivery of this NMI again, and the NMI info
is recorded in vmcs02's IDT-vectoring info. L0 should inject the NMI through vmentry
event injection since it is caused by EPT02's EPT violation.
However, vmx_inject_nmi() refuses to inject NMI from IDT-vectoring info if vCPU is in
guest mode, this patch fix it by permitting to inject NMI from IDT-vectoring if it is
the L0's responsibility to inject NMI from IDT-vectoring info to L2.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Bandan Das <bsd@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
I observed that kvmvapic(to optimize flexpriority=N or AMD) is used
to boost TPR access when testing kvm-unit-test/eventinj.flat tpr case
on my haswell desktop (w/ flexpriority, w/o APICv). Commit (8d14695f95
x86, apicv: add virtual x2apic support) disable virtual x2apic mode
completely if w/o APICv, and the author also told me that windows guest
can't enter into x2apic mode when he developed the APICv feature several
years ago. However, it is not truth currently, Interrupt Remapping and
vIOMMU is added to qemu and the developers from Intel test windows 8 can
work in x2apic mode w/ Interrupt Remapping enabled recently.
This patch enables TPR shadow for virtual x2apic mode to boost
windows guest in x2apic mode even if w/o APICv.
Can pass the kvm-unit-test.
Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
Suggested-by: Wincy Van <fanwenyi0529@gmail.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wincy Van <fanwenyi0529@gmail.com>
Cc: Yang Zhang <yang.zhang.wz@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
WARNING: CPU: 1 PID: 4230 at kernel/sched/core.c:7564 __might_sleep+0x7e/0x80
do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff8d0de7f9>] prepare_to_swait+0x39/0xa0
CPU: 1 PID: 4230 Comm: qemu-system-x86 Not tainted 4.8.0-rc5+ #47
Call Trace:
dump_stack+0x99/0xd0
__warn+0xd1/0xf0
warn_slowpath_fmt+0x4f/0x60
? prepare_to_swait+0x39/0xa0
? prepare_to_swait+0x39/0xa0
__might_sleep+0x7e/0x80
__gfn_to_pfn_memslot+0x156/0x480 [kvm]
gfn_to_pfn+0x2a/0x30 [kvm]
gfn_to_page+0xe/0x20 [kvm]
kvm_vcpu_reload_apic_access_page+0x32/0xa0 [kvm]
nested_vmx_vmexit+0x765/0xca0 [kvm_intel]
? _raw_spin_unlock_irqrestore+0x36/0x80
vmx_check_nested_events+0x49/0x1f0 [kvm_intel]
kvm_arch_vcpu_runnable+0x2d/0xe0 [kvm]
kvm_vcpu_check_block+0x12/0x60 [kvm]
kvm_vcpu_block+0x94/0x4c0 [kvm]
kvm_arch_vcpu_ioctl_run+0x619/0x1aa0 [kvm]
? kvm_arch_vcpu_ioctl_run+0xdf1/0x1aa0 [kvm]
kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
===============================
[ INFO: suspicious RCU usage. ]
4.8.0-rc5+ #47 Not tainted
-------------------------------
./include/linux/kvm_host.h:535 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-x86/4230:
#0: (&vcpu->mutex){+.+.+.}, at: [<ffffffffc062975c>] vcpu_load+0x1c/0x60 [kvm]
stack backtrace:
CPU: 1 PID: 4230 Comm: qemu-system-x86 Not tainted 4.8.0-rc5+ #47
Call Trace:
dump_stack+0x99/0xd0
lockdep_rcu_suspicious+0xe7/0x120
gfn_to_memslot+0x12a/0x140 [kvm]
gfn_to_pfn+0x12/0x30 [kvm]
gfn_to_page+0xe/0x20 [kvm]
kvm_vcpu_reload_apic_access_page+0x32/0xa0 [kvm]
nested_vmx_vmexit+0x765/0xca0 [kvm_intel]
? _raw_spin_unlock_irqrestore+0x36/0x80
vmx_check_nested_events+0x49/0x1f0 [kvm_intel]
kvm_arch_vcpu_runnable+0x2d/0xe0 [kvm]
kvm_vcpu_check_block+0x12/0x60 [kvm]
kvm_vcpu_block+0x94/0x4c0 [kvm]
kvm_arch_vcpu_ioctl_run+0x619/0x1aa0 [kvm]
? kvm_arch_vcpu_ioctl_run+0xdf1/0x1aa0 [kvm]
kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
? __fget+0xfd/0x210
? __lock_is_held+0x54/0x70
do_vfs_ioctl+0x96/0x6a0
? __fget+0x11c/0x210
? __fget+0x5/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x81/0x220
entry_SYSCALL64_slow_path+0x25/0x25
These can be triggered by running kvm-unit-test: ./x86-run x86/vmx.flat
The nested preemption timer is based on hrtimer which is started on L2
entry, stopped on L2 exit and evaluated via the new check_nested_events
hook. The current logic adds vCPU to a simple waitqueue (TASK_INTERRUPTIBLE)
if need to yield pCPU and w/o holding srcu read lock when accesses memslots,
both can be in nested preemption timer evaluation path which results in
the warning above.
This patch fix it by leveraging request bit to async reload APIC access
page before vmentry in order to avoid to reload directly during the nested
preemption timer evaluation, it is safe since the vmcs01 is loaded and
current is nested vmexit.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
kvm_guest.config is useful for KVM guests on other arches, and nothing
in it appears to be x86 specific, so just move the whole file. Kbuild
will find it in either location.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: kvmarm@lists.cs.columbia.edu
Cc: kvm@vger.kernel.org
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The same logic appears twice and should probably be pulled out into a
function.
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rui Teng <rui.teng@linux.vnet.ibm.com>
[mpe: Rename to tm_flush_hash_page() and move comment into the function]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
CLR_TOP32() is defined as blank. Last useful instance of CLR_TOP32()
was removed by commit 40ef8cbc6d ("powerpc: Get 64-bit configs to
compile with ARCH=powerpc") in 2005.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On some CPUs like the 8xx, _PAGE_RW hence _PAGE_WRITE is defined
as 0 and _PAGE_RO has to be set when a page is not writable
_PAGE_RO is defined by default in pte-common.h, however BOOK3S/64
doesn't include that file so _PAGE_RO has to be defined explicitly
in book3s/64/pgtable.h
Fixes: a7b9f671f2 ("powerpc32: adds handling of _PAGE_RO")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In eeh_handle_special_event(), eeh_pe_bus_get() is called before calling
eeh_report_failure() on every device under a PE. If a PE was missing a
bus for some reason, the error would occur before reporting failure, even
though eeh_report_failure() doesn't require a bus.
Fix this by moving the bus retrieval and error check after the
eeh_report_failure() calls.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When the PE used in pnv_eeh_reset() is that of a VF,
pnv_eeh_reset_vf_pe() is used. Unlike the other reset functions called
in pnv_eeh_reset(), the VF reset doesn't require a bus, and if a bus was
missing the function would error out before resetting the VF PE.
To avoid this, reorder the VF reset function to occur before finding and
checking the bus.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
eeh_pe_bus_get() can return NULL if a PCI bus isn't found for a given PE.
Some callers don't check this, and can cause a null pointer dereference
under certain circumstances.
Fix this by checking NULL everywhere eeh_pe_bus_get() is called.
Fixes: 8a6b1bc70d ("powerpc/eeh: EEH core to handle special event")
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When we originally added the ability to split the exception vectors from
the kernel (commit 1f6a93e4c3 ("powerpc: Make it possible to move the
interrupt handlers away from the kernel" 2008-09-15)), the LOAD_HANDLER() macro
used an addi instruction to compute the offset of the common handler
from the kernel base address.
Using addi meant the handler had to be within 32K of the kernel base
address, due to the addi instruction taking a signed immediate value.
That necessitated creating a trampoline for the system call handler,
because system_call_common (in entry64.S) is not linked within 32K of
the kernel base address.
Later in commit 61e2390ede ("powerpc: Make load_hander handle upto 64k
offset" 2012-11-15) we changed LOAD_HANDLER to take a 64K offset, by
changing it to use ori.
Although system_call_common is not in head_64.S or exceptions-64s.S, it
is included in head-y, which causes it to be linked early in the kernel
text, so in practice it ends up below 64K. Additionally if it can't be
placed below 64K the linker will fail to build with a "relocation
truncated to fit" error.
So remove the trampoline.
Newer toolchains are able to work out that the ori in LOAD_HANDLER only
takes a 16 bit offset, and so they generate a 16 bit relocation. Older
toolchains (binutils 2.22 at least) are not so smart, so we have to add
the @l annotation to tell the assembler to generate a 16 bit relocation.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The 0xf80 hv_facility_unavailable trampoline branches to the 0xf60
handler. This works because they both do the same thing, but it should
be fixed.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On EEH events the kernel will print a dump of relevant registers.
If EEH is unavailable (i.e. CONFIG_EEH is disabled, a new platform
doesn't have EEH support, etc) this information isn't readily available.
Add a new debugfs handler to trigger a PHB register dump, so that this
information can be made available on demand.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The only difference is now the TCE table check which doesn't need
to be ifdef'ed out, it will basically do nothing on BookE (it is
only useful for ancient IBM machines).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently we turn the MMU off after copying the image, and we make
sure there is no overlap between the hash table and the target pages
in that case.
That doesn't work for Radix however. In that case, the page tables
are scattered and we can't really enforce that the target of the
image isn't overlapping one of them.
So instead, let's turn the MMU off before copying the image in radix
mode. Thankfully, in radix mode, even under a hypervisor, we know we
don't have the same kind of RMA limitations that hash mode has.
While at it, also turn the MMU off early when using hash in non-LPAR
mode, that way we can get rid of the collision check completely.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Just using the hash ops won't work anymore since radix will have
NULL in there. Instead create an mmu_cleanup_all() function which
will do the right thing based on the MMU mode.
For Radix, for now I clear UPRT and the PTCR, effectively switching
back to Radix with no partition table setup.
Currently set it to NULL on BookE thought it might be a good idea
to wipe the TLB there (Scott ?)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
With Radix, it can be NULL even on !BOOKE these days so replace
the ifdef with a NULL check which is cleaner anyway.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
fixes the warning:
lib/iomap.c: In function ‘ioread8_rep’:
./arch/cris/include/asm/io.h:139:31: warning: statement with no effect [-Wunused-value]
#define insb(port,addr,count) (cris_iops ? cris_iops->read_io(port,addr,1,count) : 0)
^
lib/iomap.c:56:3: note: in definition of macro ‘IO_COND’
is_pio; \
^
lib/iomap.c:197:16: note: in expansion of macro ‘insb’
IO_COND(addr, insb(port,dst,count), mmio_insb(addr, dst, count));
^
cris_iops was previously set to NULL (no matter if CONFIG_PCI was set or
not), but was removed in commit ab28e96fd1 ("CRIS v32: remove old GPIO
and LEDs code"). Before commit ab28e96fd1 ("CRIS v32: remove old GPIO
and LEDs code"), cris_iops could have been set from an external module,
since it was exported, but as commit c24bf9b4cc ("CRIS: fix I/O
macros") noted, the macros using cris_iops have been broken since first
included, so they could never have worked.
Because of this, instead of readding cris_iops, remove all special
handling of cris_iops. By doing so, we can rely on the default
implementation of almost all functions previously defined in our arch
specific io.h.
Signed-off-by: Niklas Cassel <nks@flawful.org>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
I/O port access. Normally there is no I/O space on CRIS but when
Cardbus/PCI is enabled the request is passed through the bridge.
lib/pci_iomap.c: In function ‘pci_iomap_range’:
lib/pci_iomap.c:43:3: error: implicit declaration of function ‘ioport_map’ [-Werror=implicit-function-declaration]
return __pci_ioport_map(dev, start, len);
^
Signed-off-by: Niklas Cassel <nks@flawful.org>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
It is not possible to netboot a dev88 using etraxfs_defconfig,
since etraxfs_defconfig does not set CONFIG_ETRAX_MEM_GRP*_CONFIG
or CONFIG_ETRAX_SDRAM_GRP*_CONFIG, and the default values does not work.
This new defconfig has correct memory configuration values,
points out the correct DTB to build in (CONFIG_BUILTIN_DTB="dev88"),
enables the serial driver (CONFIG_SERIAL_ETRAXFS) and the
GPIO driver (CONFIG_GPIO_ETRAXFS), and enables LEDS.
Signed-off-by: Niklas Cassel <nks@flawful.org>
Signed-off-by: Jesper Nilsson <jespern@axis.com>
The code previously depended on list_head being defined
as the first item in struct intmem_allocation.
arch/cris/arch-v32/mm/intmem.c: In function ‘crisv32_intmem_free’:
arch/cris/arch-v32/mm/intmem.c:116:14: warning: comparison of distinct pointer types lacks a cast
if ((prev != &intmem_allocations) &&
^
arch/cris/arch-v32/mm/intmem.c:123:14: warning: comparison of distinct pointer types lacks a cast
if ((next != &intmem_allocations) &&
^
Signed-off-by: Niklas Cassel <nks@flawful.org>
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Cannot add __init macro to crisv32_intmem_init,
since the function is being called by other functions.
Creating a wrapper instead.
arch/cris/arch-v32/mm/intmem.c: At top level:
arch/cris/arch-v32/mm/intmem.c:148:17: warning: initialization from incompatible pointer type
device_initcall(crisv32_intmem_init);
^
include/linux/init.h:184:58: note: in definition of macro ‘__define_initcall’
__attribute__((__section__(".initcall" #id ".init"))) = fn; \
arch/cris/arch-v32/mm/intmem.c:148:1: note: in expansion of macro ‘device_initcall’
device_initcall(crisv32_intmem_init);
^
Signed-off-by: Niklas Cassel <nks@flawful.org>
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Just like intel_pt, intel_bts can only handle one event at a time,
which is the reason we introduced PERF_PMU_CAP_EXCLUSIVE in the first
place. However, at the moment one can have as many intel_bts events
within the same context at the same time as one pleases. Only one of
them, however, will get scheduled and receive the actual trace data.
Fix this by making intel_bts an "exclusive" PMU.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160920154811.3255-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We cannot use the "z" constraint twice, since its a single register
(r0). Change the one not used by movli.l/movco.l to "r".
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lazy unmap (defer tlb flush after unmap until dma address reuse) can
greatly reduce the number of RPCIT instructions in the best case. In
reality we are often far away from the best case scenario because our
implementation suffers from the following problem:
To create dma addresses we maintain an iommu bitmap and a pointer into
that bitmap to mark the start of the next search. That pointer moves from
the start to the end of that bitmap and we issue a global tlb flush
once that pointer wraps around. To prevent address reuse before we issue
the tlb flush we even have to move the next pointer during unmaps - when
clearing a bit > next. This could lead to a situation where we only use
the rear part of that bitmap and issue more tlb flushes than expected.
To fix this we no longer clear bits during unmap but maintain a 2nd
bitmap which we use to mark addresses that can't be reused until we issue
the global tlb flush after wrap around.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Split dma_update_trans into __dma_update_trans which handles updating
the dma translation tables and __dma_purge_tlb which takes care of
purging associated entries in the dma translation lookaside buffer.
The map_sg API makes use of this split approach by calling
__dma_update_trans once per physically contiguous address range but
__dma_purge_tlb only once per dma contiguous address range.
This results in less invocations of the expensive RPCIT instruction
when using map_sg.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Our map_sg implementation mapped sg entries independently of each other.
For ease of use and possible performance improvements this patch changes
the implementation to try to map as many (likely physically non-contiguous)
sglist entries as possible into a contiguous DMA segment.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Simplify the code we use to calculate dma addresses by putting
everything related in a dma_alloc_address function. Also provide
a dma_free_address counterpart.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We calculate dma addresses using an iommu bitmap. Since commit
69eea95c ("s390/pci_dma: fix DMA table corruption with > 4 TB main memory")
we've made sure that addresses created using that bitmap are below
the maximum reported by firmware. Thus the additional check for
that address to be within range can be removed.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
By now both VHE and non-VHE initialisation sequences query supported
VMID size. Lets keep only single instance of this code under
init_common_resources().
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
This patch allows to build and use vgic-v3 in 32-bit mode.
Unfortunately, it can not be split in several steps without extra
stubs to keep patches independent and bisectable. For instance,
virt/kvm/arm/vgic/vgic-v3.c uses function from vgic-v3-sr.c, handling
access to GICv3 cpu interface from the guest requires vgic_v3.vgic_sre
to be already defined.
It is how support has been done:
* handle SGI requests from the guest
* report configured SRE on access to GICv3 cpu interface from the guest
* required vgic-v3 macros are provided via uapi.h
* static keys are used to select GIC backend
* to make vgic-v3 build KVM_ARM_VGIC_V3 guard is removed along with
the static inlines
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
vgic-v3 save/restore routines are written in such way that they map
arm64 system register naming nicely, but it does not fit to arm
world. To keep virt/kvm/arm/hyp/vgic-v3-sr.c untouched we create a
mapping with a function for each register mapping the 32-bit to the
64-bit accessors.
Please, note that 64-bit wide ICH_LR is split in two 32-bit halves
(ICH_LR and ICH_LRC) accessed independently.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Headers linux/irqchip/arm-gic.v3.h and arch/arm/include/asm/kvm_hyp.h
are included in virt/kvm/arm/hyp/vgic-v3-sr.c and both define macros
called __ACCESS_CP15 and __ACCESS_CP15_64 which obviously creates a
conflict. These macros were introduced independently for GIC and KVM
and, in fact, do the same thing.
As an option we could add prefixes to KVM and GIC version of macros so
they won't clash, but it'd introduce code duplication. Alternatively,
we could keep macro in, say, GIC header and include it in KVM one (or
vice versa), but such dependency would not look nicer.
So we follow arm64 way (it handles this via sysreg.h) and move only
single set of macros to asm/cp15.h
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
vgic-v3 driver uses architecture specific MPIDR_LEVEL_SHIFT macro to
encode the affinity in a form compatible with ICC_SGI* registers.
Unfortunately, that macro is missing on ARM, so let's add it.
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
By now ITS code guarded with KVM_ARM_VGIC_V3 config option which was
introduced to hide everything specific to vgic-v3 from 32-bit world.
We are going to support vgic-v3 in 32-bit world and KVM_ARM_VGIC_V3
will gone, but we don't have support for ITS there yet and we need to
continue keeping ITS away.
Introduce the new config option to prevent ITS code being build in
32-bit mode when support for vgic-v3 is done.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
So we can reuse the code under arch/arm
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Since we are going to share vgic-v3 save/restore code with ARM keep
arch specific accessors separately.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Currently GIC backend is selected via alternative framework and this
is fine. We are going to introduce vgic-v3 to 32-bit world and there
we don't have patching framework in hand, so we can either check
support for GICv3 every time we need to choose which backend to use or
try to optimise it by using static keys. The later looks quite
promising because we can share logic involved in selecting GIC backend
between architectures if both uses static keys.
This patch moves arm64 from alternative to static keys framework for
selecting GIC backend. For that we embed static key into vgic_global
and enable the key during vgic initialisation based on what has
already been exposed by the host GIC driver.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
In commit:
ec776ef6bb ("x86/mm: Add support for the non-standard protected e820 type")
Christoph references the original patch I wrote implementing pmem support.
The intent of the 'max_pfn' changes in that commit were to enable persistent
memory ranges to be covered by the struct page memmap by default.
However, that approach was abandoned when Christoph ported the patches [1], and
that functionality has since been replaced by devm_memremap_pages().
In the meantime, this max_pfn manipulation is confusing kdump [2] that
assumes that everything covered by the max_pfn is "System RAM". This
results in kdump hanging or crashing.
[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-March/000348.html
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1351098
So fix it.
Reported-by: Zhang Yi <yizhan@redhat.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Zhang Yi <yizhan@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@vger.kernel.org> # v4.1 and later kernels
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-nvdimm@lists.01.org
Fixes: ec776ef6bb ("x86/mm: Add support for the non-standard protected e820 type")
Link: http://lkml.kernel.org/r/147448744538.34910.11287693517367139607.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>