Dynamic boosting of HWP performance on IO wake showed significant
improvement to IO workloads. This series was intended for Skylake Xeon
platforms only and feature was enabled by default based on CPU model
number.
But some Xeon platforms reused the Skylake desktop CPU model number. This
caused some undesirable side effects to some graphics workloads. Since
they are heavily IO bound, the increase in CPU performance decreased the
power available for GPU to do its computing and hence decrease in graphics
benchmark performance.
For example on a Skylake desktop, GpuTest benchmark showed average FPS
reduction from 529 to 506.
This change makes sure that HWP boost feature is only enabled for Skylake
server platforms by using ACPI FADT preferred PM Profile. If some desktop
users wants to get benefit of boost, they can still enable boost from
intel_pstate sysfs attribute "hwp_dynamic_boost".
Fixes: 41ab43c9c8 (cpufreq: intel_pstate: enable boost for Skylake Xeon)
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Make sure of_device_id tables are NULL terminated.
Found by coccinelle spatch "misc/of_table.cocci"
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Ilia Lin <ilia.lin@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744 (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann@suse.com>
Reviewed-by: Andreas Herrmann <aherrmann@suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann@suse.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We should return if get_cpu_device() fails or it leads to a NULL
dereference. Also dev_pm_opp_of_get_opp_desc_node() returns NULL on
error, it never returns error pointers.
Fixes: 46e2856b8e (cpufreq: Add Kryo CPU scaling driver)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When scaling max/min settings are changed, internally they are converted
to a ratio using the max turbo 1 core turbo frequency. This works fine
when 1 core max is same irrespective of the core. But under Turbo 3.0,
this will not be the case. For example:
Core 0: max turbo pstate: 43 (4.3GHz)
Core 1: max turbo pstate: 45 (4.5GHz)
In this case 1 core turbo ratio will be maximum of all, so it will be
45 (4.5GHz). Suppose scaling max is set to 4GHz (ratio 40) for all cores
,then on core one it will be
= max_state * policy->max / max_freq;
= 43 * (4000000/4500000) = 38 (3.8GHz)
= 38
which is 200MHz less than the desired.
On core2, it will be correctly set to ratio 40 (4GHz). Same holds true
for scaling min frequency limit. So this requires usage of correct turbo
max frequency for core one, which in this case is 4.3GHz. So we need to
adjust per CPU cpu->pstate.turbo_freq using the maximum HWP ratio of that
core.
This change uses the HWP capability of a core to adjust max turbo
frequency. But since Broadwell HWP doesn't use ratios in the HWP
capabilities, we have to use legacy max 1 core turbo ratio. This is not
a problem as the HWP capabilities don't differ among cores in Broadwell.
We need to check for non Broadwell CPU model for applying this change,
though.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add device remove and module exit code to make the driver
functioning as a loadable module.
Fixes: ac28927659 (cpufreq: kryo: allow building as a loadable module)
Signed-off-by: Ilia Lin <ilia.lin@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In event of error returned by the nvmem_cell_read() non-pointer value
may be dereferenced. Fix this with error handling.
Additionally free the allocated speedbin buffer, as per the API.
Fixes: 9ce36edd1a52 (cpufreq: Add Kryo CPU scaling driver)
Signed-off-by: Ilia Lin <ilia.lin@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Revert a recent PM core change that attempted to fix an issue
related to device links, but introduced a regression (Rafael
Wysocki).
- Fix build when the recently added cpufreq driver for Kryo
processors is selected by making it possible to build that
driver as a module (Arnd Bergmann).
- Fix the long idle detection mechanism in the out-of-band
(ondemand and conservative) cpufreq governors (Chen Yu).
- Add support for devices in multiple power domains to the
generic power domains (genpd) framework (Ulf Hansson).
- Add support for iowait boosting on systems with hardware-managed
P-states (HWP) enabled to the intel_pstate driver and make it use
that feature on systems with Skylake Xeon processors as it is
reported to improve performance significantly on those systems
(Srinivas Pandruvada).
- Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq
drivers (Colin Ian King, Suman Anna, Sébastien Szymanski).
- Change the behavior of the wakeup_count device attribute in
sysfs to expose the number of events when the device might have
aborted system suspend in progress (Ravi Chandra Sadineni).
- Fix two minor issues in the cpupower utility (Abhishek Goel,
Colin Ian King).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJbIOf+AAoJEILEb/54YlRxk5EQAIyLpvR0zdp2gMaMl3rbWqtM
W6XpJbLzL4be9zHKDj4bycO6nbevPOr5oXgm3DQUaUvkLo86cUl2NJlNAv789UZR
NQ8L51WiY4hG4WDrBQntEBw7TDBUDuo6TEa2/0WJQQhj6WQP821oehmF4G+N9A9h
z9YhwbWNgivulyNy09nAcVgJ39cxUVWb9EmTXthp0KnyJzn8de+V3MxlEwJTAmHc
jma9PEil9Key2rS8LRr+djvwa6tYKydOCjkA+o6m7Fo1IVaaVydDgciG4tjnsHNV
wtEfbOZnisnkYrNEbViqQhhnsvSLkTtfAku58Ove5Kz2GPSPjyIoRrK7FUfDetr+
ZQLWq6TPzR9u2m3kQfhHB6C463bGxd4s2BntPH2RLHbs82FENEtGkHdxQOv5B1tW
Gvl9gF9ZDov6gL3jftNdhIz4rQVGaXQlY5/q+alV1I3jhyg7zddht4oh+nNt41XR
ysszEg9K62w/QAuqZeUsHaR7pPoZZDQzr3TRkKX0uvl88jq4HUPj+aKqNYxq0IrZ
uYd92gqvD7HH1UKRPqjvZ65Uj5WTbn7picAYJhTlQR4b73X0j66xDSZp/IZVpbEc
ierDftBxdwklnfxrpy19yJKgIDB89zLP0IX+3BacEC+BWguI//MOb5X0EEpcf/WK
eyG13J1wTF1qLzKDdur9
=VROk
-----END PGP SIGNATURE-----
Merge tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These revert a recent PM core change that introduced a regression, fix
the build when the recently added Kryo cpufreq driver is selected, add
support for devices attached to multiple power domains to the generic
power domains (genpd) framework, add support for iowait boosting on
systens with hardware-managed P-states (HWP) enabled to the
intel_pstate driver, modify the behavior of the wakeup_count device
attribute in sysfs, fix a few issues and clean up some ugliness,
mostly in cpufreq (core and drivers) and in the cpupower utility.
Specifics:
- Revert a recent PM core change that attempted to fix an issue
related to device links, but introduced a regression (Rafael
Wysocki)
- Fix build when the recently added cpufreq driver for Kryo
processors is selected by making it possible to build that driver
as a module (Arnd Bergmann)
- Fix the long idle detection mechanism in the out-of-band (ondemand
and conservative) cpufreq governors (Chen Yu)
- Add support for devices in multiple power domains to the generic
power domains (genpd) framework (Ulf Hansson)
- Add support for iowait boosting on systems with hardware-managed
P-states (HWP) enabled to the intel_pstate driver and make it use
that feature on systems with Skylake Xeon processors as it is
reported to improve performance significantly on those systems
(Srinivas Pandruvada)
- Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq
drivers (Colin Ian King, Suman Anna, Sébastien Szymanski)
- Change the behavior of the wakeup_count device attribute in sysfs
to expose the number of events when the device might have aborted
system suspend in progress (Ravi Chandra Sadineni)
- Fix two minor issues in the cpupower utility (Abhishek Goel, Colin
Ian King)"
* tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "PM / runtime: Fixup reference counting of device link suppliers at probe"
cpufreq: imx6q: check speed grades for i.MX6ULL
cpufreq: governors: Fix long idle detection logic in load calculation
cpufreq: intel_pstate: enable boost for Skylake Xeon
PM / wakeup: Export wakeup_count instead of event_count via sysfs
PM / Domains: Add dev_pm_domain_attach_by_id() to manage multi PM domains
PM / Domains: Add support for multi PM domains per device to genpd
PM / Domains: Split genpd_dev_pm_attach()
PM / Domains: Don't attach devices in genpd with multi PM domains
PM / Domains: dt: Allow power-domain property to be a list of specifiers
cpufreq: intel_pstate: New sysfs entry to control HWP boost
cpufreq: intel_pstate: HWP boost performance on IO wakeup
cpufreq: intel_pstate: Add HWP boost utility and sched util hooks
cpufreq: ti-cpufreq: Use devres managed API in probe()
cpufreq: ti-cpufreq: Fix an incorrect error return value
cpufreq: ACPI: make function acpi_cpufreq_fast_switch() static
cpufreq: kryo: allow building as a loadable module
cpupower : Fix header name to read idle state name
cpupower: fix spelling mistake: "logilename" -> "logfilename"
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed (Kees)
-----BEGIN PGP SIGNATURE-----
Comment: Kees Cook <kees@outflux.net>
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlsgVtMWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhsJEACLYe2EbwLFJz7emOT1KUGK5R1b
oVxJog0893WyMqgk9XBlA2lvTBRBYzR3tzsadfYo87L3VOBzazUv0YZaweJb65sF
bAvxW3nY06brhKKwTRed1PrMa1iG9R63WISnNAuZAq7+79mN6YgW4G6YSAEF9lW7
oPJoPw93YxcI8JcG+dA8BC9w7pJFKooZH4gvLUSUNl5XKr8Ru5YnWcV8F+8M4vZI
EJtXFmdlmxAledUPxTSCIojO8m/tNOjYTreBJt9K1DXKY6UcgAdhk75TRLEsp38P
fPvMigYQpBDnYz2pi9ourTgvZLkffK1OBZ46PPt8BgUZVf70D6CBg10vK47KO6N2
zreloxkMTrz5XohyjfNjYFRkyyuwV2sSVrRJqF4dpyJ4NJQRjvyywxIP4Myifwlb
ONipCM1EjvQjaEUbdcqKgvlooMdhcyxfshqJWjHzXB6BL22uPzq5jHXXugz8/ol8
tOSM2FuJ2sBLQso+szhisxtMd11PihzIZK9BfxEG3du+/hlI+2XgN7hnmlXuA2k3
BUW6BSDhab41HNd6pp50bDJnL0uKPWyFC6hqSNZw+GOIb46jfFcQqnCB3VZGCwj3
LH53Be1XlUrttc/NrtkvVhm4bdxtfsp4F7nsPFNDuHvYNkalAVoC3An0BzOibtkh
AtfvEeaPHaOyD8/h2Q==
=zUUp
-----END PGP SIGNATURE-----
Merge tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more overflow updates from Kees Cook:
"The rest of the overflow changes for v4.18-rc1.
This includes the explicit overflow fixes from Silvio, further
struct_size() conversions from Matthew, and a bug fix from Dan.
But the bulk of it is the treewide conversions to use either the
2-factor argument allocators (e.g. kmalloc(a * b, ...) into
kmalloc_array(a, b, ...) or the array_size() macros (e.g. vmalloc(a *
b) into vmalloc(array_size(a, b)).
Coccinelle was fighting me on several fronts, so I've done a bunch of
manual whitespace updates in the patches as well.
Summary:
- Error path bug fix for overflow tests (Dan)
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed
(Kees)"
* tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
treewide: Use array_size in f2fs_kvzalloc()
treewide: Use array_size() in f2fs_kzalloc()
treewide: Use array_size() in f2fs_kmalloc()
treewide: Use array_size() in sock_kmalloc()
treewide: Use array_size() in kvzalloc_node()
treewide: Use array_size() in vzalloc_node()
treewide: Use array_size() in vzalloc()
treewide: Use array_size() in vmalloc()
treewide: devm_kzalloc() -> devm_kcalloc()
treewide: devm_kmalloc() -> devm_kmalloc_array()
treewide: kvzalloc() -> kvcalloc()
treewide: kvmalloc() -> kvmalloc_array()
treewide: kzalloc_node() -> kcalloc_node()
treewide: kzalloc() -> kcalloc()
treewide: kmalloc() -> kmalloc_array()
mm: Introduce kvcalloc()
video: uvesafb: Fix integer overflow in allocation
UBIFS: Fix potential integer overflow in allocation
leds: Use struct_size() in allocation
Convert intel uncore to struct_size
...
This branch contains platform-related driver updates for ARM and ARM64.
Highlights:
- ARM SCMI (System Control & Management Interface) driver cleanups
- Hisilicon support for LPC bus w/ ACPI
- Reset driver updates for several platforms: Uniphier,
- Rockchip power domain bindings and hardware descriptions for several SoCs.
- Tegra memory controller reset improvements
-----BEGIN PGP SIGNATURE-----
iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAlsfB94PHG9sb2ZAbGl4
b20ubmV0AAoJEIwa5zzehBx3k2IP/i9T71QoanZ3k6o/d+YUqmTuUiA+EJWFANry
8KSjBKmYDON/GLgRCiNZR8P0NZ3d1LgFk5gZDdhMrOtoGtd8k8q0KyqLxjKAWHt6
opSrGucmE1gy9FvJdUkK+y148vM+Ea4SXRVOZxbLV5qm3inPwnopJjgKAfnhIn4X
QmkSca90CyEc3kPdBdfMeAKL+7SRb4mbFHAXXVE7QiWvjrEjUkvtNVTazf5Nroc4
PbI97zSFrmSFO4ZK0jZHCd4R2xhsJwzDQ/UKHC9C9/IdFMLfnJ7dxIf97QYn41Kl
H46FneMZZZ1FibN+Mj5hC/tByE8FrMtWh636z031s6kkamSqLiBAZFlGpHABxQJs
3tN1vBP40R7hzm76yQAC4Uopr5xOtmLr6KBMBBRr+Axf9jHMS4m/WP1chwZFpFjI
Awxc0VCjBUm+haHvK85J4eHrzbWPjG+8aV5Ar5DHVo8et3MzCdX0ycoDeUT787qc
qzEcCjGPbXHBR1aXUX8stRW5x8zoGH/4IUYMo5IGadiFuXSna6ERG9IHq3fAU5Fp
ZzNNKedtodn9NoMr3NJJk1ndyrUr0lpXwlVqFeksRTa+INk2FHKd0cQfxwV33kS9
wHXw+v323uxa3Tz2TXKS7PavY5yr6fZ0dLC2+xEDqHq6bsLxo1DnBEnaola+Jg+u
9hKEuSff
=xs+f
-----END PGP SIGNATURE-----
Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Olof Johansson:
"This contains platform-related driver updates for ARM and ARM64.
Highlights:
- ARM SCMI (System Control & Management Interface) driver cleanups
- Hisilicon support for LPC bus w/ ACPI
- Reset driver updates for several platforms: Uniphier,
- Rockchip power domain bindings and hardware descriptions for
several SoCs.
- Tegra memory controller reset improvements"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (59 commits)
ARM: tegra: fix compile-testing PCI host driver
soc: rockchip: power-domain: add power domain support for px30
dt-bindings: power: add binding for px30 power domains
dt-bindings: power: add PX30 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3228
dt-bindings: power: add binding for rk3228 power domains
dt-bindings: power: add RK3228 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3128
dt-bindings: power: add binding for rk3128 power domains
dt-bindings: power: add RK3128 SoCs header for power-domain
soc: rockchip: power-domain: add power domain support for rk3036
dt-bindings: power: add binding for rk3036 power domains
dt-bindings: power: add RK3036 SoCs header for power-domain
dt-bindings: memory: tegra: Remove Tegra114 SATA and AFI reset definitions
memory: tegra: Remove Tegra114 SATA and AFI reset definitions
memory: tegra: Register SMMU after MC driver became ready
soc: mediatek: remove unneeded semicolon
soc: mediatek: add a fixed wait for SRAM stable
soc: mediatek: introduce a CAPS flag for scp_domain_data
soc: mediatek: reuse regmap_read_poll_timeout helpers
...
Check the max speed supported from the fuses for i.MX6ULL and update the
operating points table accordingly.
Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to current code implementation, detecting the long
idle period is done by checking if the interval between two
adjacent utilization update handlers is long enough. Although
this mechanism can detect if the idle period is long enough
(no utilization hooks invoked during idle period), it might
not cover a corner case: if the task has occupied the CPU
for too long which causes no context switches during that
period, then no utilization handler will be launched until this
high prio task is scheduled out. As a result, the idle_periods
field might be calculated incorrectly because it regards the
100% load as 0% and makes the conservative governor who uses
this field confusing.
Change the detection to compare the idle_time with sampling_rate
directly.
Reported-by: Artem S. Tashkinov <t.artem@mailcity.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Enable HWP boost on Skylake server and workstations.
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A new attribute is added to intel_pstate sysfs to enable/disable
HWP dynamic performance boost.
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This change uses SCHED_CPUFREQ_IOWAIT flag to boost HWP performance.
Since SCHED_CPUFREQ_IOWAIT flag is set frequently, we don't start
boosting steps unless we see two consecutive flags in two ticks. This
avoids boosting due to IO because of regular system activities.
To avoid synchronization issues, the actual processing of the flag is
done on the local CPU callback.
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Added two utility functions to HWP boost up gradually and boost down to
the default cached HWP request values.
Boost up:
Boost up updates HWP request minimum value in steps. This minimum value
can reach upto at HWP request maximum values depends on how frequently,
this boost up function is called. At max, boost up will take three steps
to reach the maximum, depending on the current HWP request levels and HWP
capabilities. For example, if the current settings are:
If P0 (Turbo max) = P1 (Guaranteed max) = min
No boost at all.
If P0 (Turbo max) > P1 (Guaranteed max) = min
Should result in one level boost only for P0.
If P0 (Turbo max) = P1 (Guaranteed max) > min
Should result in two level boost:
(min + p1)/2 and P1.
If P0 (Turbo max) > P1 (Guaranteed max) > min
Should result in three level boost:
(min + p1)/2, P1 and P0.
We don't set any level between P0 and P1 as there is no guarantee that
they will be honored.
Boost down:
After the system is idle for hold time of 3ms, the HWP request is reset
to the default value from HWP init or user modified one via sysfs.
Caching of HWP Request and Capabilities
Store the HWP request value last set using MSR_HWP_REQUEST and read
MSR_HWP_CAPABILITIES. This avoid reading of MSRs in the boost utility
functions.
These boost utility functions calculated limits are based on the latest
HWP request value, which can be modified by setpolicy() callback. So if
user space modifies the minimum perf value, that will be accounted for
every time the boost up is called. There will be case when there can be
contention with the user modified minimum perf, in that case user value
will gain precedence. For example just before HWP_REQUEST MSR is updated
from setpolicy() callback, the boost up function is called via scheduler
tick callback. Here the cached MSR value is already the latest and limits
are updated based on the latest user limits, but on return the MSR write
callback called from setpolicy() callback will update the HWP_REQUEST
value. This will be used till next time the boost up function is called.
In addition add a variable to control HWP dynamic boosting. When HWP
dynamic boost is active then set the HWP specific update util hook. The
contents in the utility hooks will be filled in the subsequent patches.
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The ti_cpufreq_probe() function uses regular kzalloc to allocate
the ti_cpufreq_data structure and kfree for freeing this memory
on failures. Simplify this code by using the devres managed
API.
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 05829d9431 (cpufreq: ti-cpufreq: kfree opp_data when
failure) has fixed a memory leak in the failure path, however
the patch returned a positive value on get_cpu_device() failure
instead of the previous negative value. Fix this incorrect error
return value properly.
Fixes: 05829d9431 (cpufreq: ti-cpufreq: kfree opp_data when failure)
Cc: 4.14+ <stable@vger.kernel.org> # v4.14+
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The acpi_cpufreq_fast_switch() function is local to the source and
does not need to be in global scope, so make it static.
Cleans up sparse warning:
drivers/cpufreq/acpi-cpufreq.c:468:14: warning: symbol
'acpi_cpufreq_fast_switch' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Building the kryo cpufreq driver while QCOM_SMEM is a loadable module
results in a link error:
drivers/cpufreq/qcom-cpufreq-kryo.o: In function `qcom_cpufreq_kryo_probe':
qcom-cpufreq-kryo.c:(.text+0xbc): undefined reference to `qcom_smem_get'
The problem is that Kconfig ignores interprets the dependency as met
when the dependent symbol is a 'bool' one. By making it 'tristate',
it will be forced to be a module here, which builds successfully.
Fixes: 46e2856b8e (cpufreq: Add Kryo CPU scaling driver)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These update the ACPICA code in the kernel to the 20180508 upstream
revision and make it support the RT patch, add CPPC v3 support to the
ACPI CPPC library, add a WDAT-based watchdog quirk to prevent clashes
with the RTC, add quirks to the ACPI AC and battery drivers, and
update the ACPI SoC drivers.
Specifics:
- Update the ACPICA code in the kernel to the 20180508 upstream
revision including:
* iASL -tc option enhancement (Bob Moore).
* Debugger improvements (Bob Moore).
* Support for tables larger than 1 MB in acpidump/acpixtract
(Bob Moore).
* Minor fixes and cleanups (Colin Ian King, Toomas Soome).
- Make the ACPICA code in the kernel support the RT patch (Sebastian
Andrzej Siewior, Steven Rostedt).
- Add a kmemleak annotation to the ACPICA code (Larry Finger).
- Add CPPC v3 support to the ACPI CPPC library and fix two issues
related to CPPC (Prashanth Prakash, Al Stone).
- Add an ACPI WDAT-based watchdog quirk to prefer iTCO_wdt on
systems where WDAT clashes with the RTC SRAM (Mika Westerberg).
- Add some quirks to the ACPI AC and battery drivers (Carlo Caione,
Hans de Goede).
- Update the ACPI SoC drivers for Intel (LPSS) and AMD (APD)
platforms (Akshu Agrawal, Hans de Goede).
- Fix up some assorted minor issues (Al Stone, Laszlo Toth,
Mathieu Malaterre).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJbFR2IAAoJEILEb/54YlRxsEcP/0vxyGvTXUnHH13h1ayhbcEL
qKA//1Zvw55mFb86nlnIVWUfpVx5pN4sOd2MZL2bDmNP8+ZX1QFbwtkpP8MtnRyX
0pmPb2QpmHz/ZHutVg3rGxFOV4Sk0IPevQW7d4eu/Gc6K/UkfREMj10PWu9EoNWm
6UCKuONxeoUAtE/OTfvU33ghpYgBH506Kg6/EwLLRX0tA10NBa11L9ijUEkVfa0s
RDn5Z9ndjGGed4XtlSNkfabEJpc6uX9JvbAw77DKZgyqEZgQyNa4CZ2C1ZtH66lZ
HCS62eJ6ePGUvzW/KTn825+MOGni5YisGlfonHuF9xHpPNSp7Doxmyny3lMYHL7S
l0XdI2G7UIOFO5eaRM/zYQPQyF05lna28MVvTWJWBA3bcUz4rav9fzPKYp5l+To0
xK1Ol84sowlkOKoi/RIcOx5nsh05SufGqNu9RuXjRZ4iK0bJPCD3YNt+tH95b94M
0YOuoDkIylieVET4xgHB5vBOj8EgMOLtSaGSpEbh816i+BdZMx7YnS4xSmQ+JFjd
rVCJMnXcLZ82j3pkvPpR0W3VLVbEJPjLftENjSVJ+vA9lR567byBFrvzTEmsEQoi
KlNWh5qDmoR96dPheLdhD1HtsL070xLhyqYOUDvtqTiVH8uNxoWfQzcSfvXPvLqc
XGCcG/oPP9FFpNZfLzAB
=FAIF
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to the 20180508 upstream
revision and make it support the RT patch, add CPPC v3 support to the
ACPI CPPC library, add a WDAT-based watchdog quirk to prevent clashes
with the RTC, add quirks to the ACPI AC and battery drivers, and
update the ACPI SoC drivers.
Specifics:
- Update the ACPICA code in the kernel to the 20180508 upstream
revision including:
* iASL -tc option enhancement (Bob Moore).
* Debugger improvements (Bob Moore).
* Support for tables larger than 1 MB in acpidump/acpixtract (Bob
Moore).
* Minor fixes and cleanups (Colin Ian King, Toomas Soome).
- Make the ACPICA code in the kernel support the RT patch (Sebastian
Andrzej Siewior, Steven Rostedt).
- Add a kmemleak annotation to the ACPICA code (Larry Finger).
- Add CPPC v3 support to the ACPI CPPC library and fix two issues
related to CPPC (Prashanth Prakash, Al Stone).
- Add an ACPI WDAT-based watchdog quirk to prefer iTCO_wdt on systems
where WDAT clashes with the RTC SRAM (Mika Westerberg).
- Add some quirks to the ACPI AC and battery drivers (Carlo Caione,
Hans de Goede).
- Update the ACPI SoC drivers for Intel (LPSS) and AMD (APD)
platforms (Akshu Agrawal, Hans de Goede).
- Fix up some assorted minor issues (Al Stone, Laszlo Toth, Mathieu
Malaterre)"
* tag 'acpi-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
ACPICA: Mark acpi_ut_create_internal_object_dbg() memory allocations as non-leaks
ACPI / watchdog: Prefer iTCO_wdt always when WDAT table uses RTC SRAM
mailbox: PCC: erroneous error message when parsing ACPI PCCT
ACPICA: Update version to 20180508
ACPICA: acpidump/acpixtract: Support for tables larger than 1MB
ACPI: APD: Add AMD misc clock handler support
clk: x86: Add ST oscout platform clock
ACPICA: Update version to 20180427
ACPICA: Debugger: Removed direct support for EC address space in "Test Objects"
ACPICA: Debugger: Add Package support for "test objects" command
ACPICA: Improve error messages for the namespace root node
ACPICA: Fix potential infinite loop in acpi_rs_dump_byte_list
ACPICA: vsnprintf: this statement may fall through
ACPICA: Tables: Fix spelling mistake in comment
ACPICA: iASL: Enhance the -tc option (create AML hex file in C)
ACPI: Add missing prototype_for arch_post_acpi_subsys_init()
ACPI / tables: improve comments regarding acpi_parse_entries_array()
ACPICA: Convert acpi_gbl_hardware lock back to an acpi_raw_spinlock
ACPICA: provide abstraction for raw_spinlock_t
ACPI / CPPC: Fix invalid PCC channel status errors
...
* acpi-cppc:
mailbox: PCC: erroneous error message when parsing ACPI PCCT
ACPI / CPPC: Fix invalid PCC channel status errors
ACPI / CPPC: Document CPPC sysfs interface
cpufreq / CPPC: Support for CPPC v3
ACPI / CPPC: Check for valid PCC subspace only if PCC is used
ACPI / CPPC: Add support for CPPC v3
* acpi-misc:
ACPI: Add missing prototype_for arch_post_acpi_subsys_init()
ACPI: add missing newline to printk
* acpi-battery:
ACPI / battery: Add quirk to avoid checking for PMIC with native driver
ACPI / battery: Ignore AC state in handle_discharging on systems where it is broken
ACPI / battery: Add handling for devices which wrongly report discharging state
ACPI / battery: Remove initializer for unused ident dmi_system_id
ACPI / AC: Remove initializer for unused ident dmi_system_id
* acpi-ac:
ACPI / AC: Add quirk to avoid checking for PMIC with native driver
In Certain QCOM SoCs like apq8096 and msm8996 that have KRYO processors,
the CPU frequency subset and voltage value of each OPP varies
based on the silicon variant in use. Qualcomm Process Voltage Scaling Tables
defines the voltage and frequency value based on the msm-id in SMEM
and speedbin blown in the efuse combination.
The qcom-cpufreq-kryo driver reads the msm-id and efuse value from the SoC
to provide the OPP framework with required information.
This is used to determine the voltage and frequency value for each OPP of
operating-points-v2 table when it is parsed by the OPP framework.
Signed-off-by: Ilia Lin <ilialin@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Tested-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use the static SRCU initializer for `cpufreq_transition_notifier_list'.
This avoids the init_cpufreq_transition_notifier_list() initcall. Its
only purpose is to initialize the SRCU notifier once during boot and set
another variable which is used as an indicator whether the init was
perfromed before cpufreq_register_notifier() was used.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the policy limits are updated via cpufreq_update_policy() and
subsequently via sysfs, the limits stored in user_policy may be
set incorrectly.
For example, if both min and max are set via sysfs to the maximum
available frequency, user_policy.min and user_policy.max will also
be the maximum. If a policy notifier triggered by
cpufreq_update_policy() lowers both the min and the max at this
point, that change is not reflected by the user_policy limits, so
if the max is updated again via sysfs to the same lower value,
then user_policy.max will be lower than user_policy.min which
shouldn't happen. In particular, if one of the policy CPUs is
then taken offline and back online, cpufreq_set_policy() will
fail for it due to a failing limits check.
To prevent that from happening, initialize the min and max fields
of the new_policy object to the ones stored in user_policy that
were previously set via sysfs.
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject & changelog ]
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This routine checks if the CPU running this code belongs to the policy
of the target CPU or if not, can it do remote DVFS for it remotely. But
the current name of it implies as if it is only about doing remote
updates.
Rename it to make it more relevant.
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently tegra20-cpufreq kernel module isn't getting autoloaded because
there is no device associated with the module, this is one of two patches
that resolves the module autoloading. This patch adds a module alias that
will associate the tegra20-cpufreq kernel module with the platform device,
other patch will instantiate the actual platform device. And now it makes
sense to wrap cpufreq driver into a platform driver for consistency.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Nothing prevents Tegra20 CPUFreq module to be unloaded, hence allow it to
be built as a non-builtin kernel module.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Don't even try to request the clocks during of module initialization on
non-Tegra20 machines (this is the case for a multi-platform kernel) for
consistency.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove checking of the CPU number for consistency as it won't ever fail
unless there is a severe bug in the cpufreq core.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The EMC driver has been gone 4 years ago, since the commit a7cbe92cef
("ARM: tegra: remove tegra EMC scaling driver"). Remove the EMC clock
usage as it does nothing. We may consider re-implementing the EMC scaling
later, probably using PM Memory Bandwidth QoS API.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove unused/unneeded headers and sort them in the alphabet order.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove unneeded blank line and replace whitespaces with a tab in the code
for consistency.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Change module description to be in line with the other Tegra drivers, just
for consistency.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This contains all of the trivial review comments that were not
addressed as the series was already queued up for v4.17 and were not
critical to go as fixes.
They generally just improve code readability, fix kernel-docs, remove
unused/unnecessary code, follow standard function naming and simplifies
certain exit paths.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJa+a0LAAoJEABBurwxfuKYfb8P/Rorx2vzPuHBDUPAurGkMKJe
9+VQIN3zAvKTF6QIsXVMtLpRJ9LOj1b5i4LpRM92pujliN5w1NcE00TV+BDyXCVh
SY/H8ECbPj8szg3UKVCivFDRzcdDP4TqY0rlxKr45W2nN5J36GFZqZzOoC/FDKCK
vvmUKAW/lkI4U5N16YskddzfZhz1OByqVxblXDq6HlBRaqjG5JZzIL3TAynCwlD8
3X/anQ+XHmM4EAr7WZ5JxlmOCcUw4FfXU18oDwzhF3wHJlLGWwM7e9ORsmpCPQCe
x3W6SwDMW67Ol9Oj89GSUldHmR7jSrXZg4TWY+LK/gvnqFgysHD+TZkhhETI3ori
YRXu9AA6XEr/f/1NRDHUAmlu9vRIqBrK4mFqadcfWixFCDuEwlN8sehxj1Hv/6oj
AP9utzT4pHpsZdEhwArXiSn8RkEal/3Q4WEaBXnVzQQUgXFLIRJ2cyuh/BCy68v/
0Z6W++dmtxfpYmYGfwjLCaAN/OqdnhDUoRhVxXt8fNroqDM30qHIfCObHIC0L4lK
QITqC6Ewbga2eASARuGuM7WksYh54S7GoT91sxif7XGSaJTeey0QE3NnsXxouwLB
wlMuH5JDEHtOBY/E+8c1Jq5NNsp/mfdP0DtUTyMLfhQe8lx7DOryx6Si0zCEdvjX
td9yfnZzZ/hUrBCiHKmO
=OzDu
-----END PGP SIGNATURE-----
Merge tag 'scmi-updates-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into next/drivers
SCMI cleanups for v4.18
This contains all of the trivial review comments that were not
addressed as the series was already queued up for v4.17 and were not
critical to go as fixes.
They generally just improve code readability, fix kernel-docs, remove
unused/unnecessary code, follow standard function naming and simplifies
certain exit paths.
* tag 'scmi-updates-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: simplify exit path by returning on error
firmware: arm_scmi: improve exit paths and code readability
firmware: arm_scmi: remove unnecessary bitmap_zero
firmware: arm_scmi: drop unused `con_priv` structure member
firmware: arm_scmi: rename scmi_xfer_{init,get,put}
firmware: arm_scmi: rename get_transition_latency and add_opps_to_device
firmware: arm_scmi: fix kernel-docs documentation
firmware: arm_scmi: improve code readability using bitfield accessor macros
Signed-off-by: Olof Johansson <olof@lixom.net>
This reverts commit 034def597b.
This is no longer needed since the following commit and to the best
of my knowledge is not relied on by any upstream DTS:
edeec420de (cpufreq: dt-platdev: Automatically create cpufreq
device with OPP v2)
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit bea2ebca6b.
This is no longer needed since the following commit and to the best
of my knowledge is not relied on by any upstream DTS:
edeec420de (cpufreq: dt-platdev: Automatically create cpufreq
device with OPP v2)
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Allow use of the trace_pstate_sample trace function
when the intel_pstate driver is in passive mode.
Since the core_busy and scaled_busy fields are not
used, and it might be desirable to know which path
through the driver was used, either intel_cpufreq_target
or intel_cpufreq_fast_switch, re-task the core_busy
field as a flag indicator.
The user can then use the intel_pstate_tracer.py utility
to summarize and plot the trace.
Note: The core_busy feild still goes by that name
in include/trace/events/power.h and within the
intel_pstate_tracer.py script and csv file headers,
but it is graphed as "performance", and called
core_avg_perf now in the intel_pstate driver.
Sometimes, in passive mode, the driver is not called for
many tens or even hundreds of seconds. The user
needs to understand, and not be confused by, this limitation.
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Armada-37xx driver registers a cpufreq-dt driver. Not having
CONFIG_CPUFREQ_DT selected leads to a silent abort during the probe.
Prevent that situation by having the former depending on the latter.
Fixes: 92ce45fb87 (cpufreq: Add DVFS support for Armada 37xx)
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpufreq_notify_transition() calls __cpufreq_notify_transition() for each
CPU of a policy. There is a lot of code in __cpufreq_notify_transition()
though which isn't required to be executed for each CPU, like checking
about disabled cpufreq or irqs, adjusting jiffies, updating cpufreq
stats and some debug print messages.
This commit merges __cpufreq_notify_transition() into
cpufreq_notify_transition() and modifies cpufreq_notify_transition() to
execute minimum amount of code for each CPU.
Also fix the kerneldoc for cpufreq_notify_transition() while at it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Trivial fix to spelling mistake in s3c_freq_dbg debug message text.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>