===============================
PM Quality Of Service Interface
===============================

This interface lets drivers, subsystems and user space applications register
performance expectations on one of the parameters.  It can be used from both
kernel mode and user mode.

Two different PM QoS frameworks are available:

1. PM QoS classes for cpu_dma_latency, network_latency, network_throughput and
   memory_bandwidth.
2. The per-device PM QoS framework, which provides the API to manage the
   per-device latency constraints and PM QoS flags.

Each parameter has a defined unit:

* latency: usec
* timeout: usec
* throughput: kbs (kilobit / sec)
* memory bandwidth: mbs (megabit / sec)


1. PM QoS framework
===================

The infrastructure exposes multiple misc device nodes, one per implemented
parameter.  The set of implemented parameters is defined by pm_qos_power_init()
and pm_qos.h.  The set is fixed in the kernel source because making the
available parameters runtime configurable or changeable from a driver was seen
as too easy to abuse.

For each parameter, a list of performance requests is maintained along with
an aggregated target value.  The aggregated target value is updated whenever
a request is added to or removed from the list, or an existing request changes
its value.  Typically the aggregated target value is simply the max or min of
the request values held in the parameter list elements.

Note: the aggregated target value is implemented as an atomic variable so that
reading the aggregated value does not require any locking mechanism.

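The aggregation rule described above can be modelled in plain user-space C.
This is only a sketch of the idea, not the kernel implementation: the names
``agg_type``, ``aggregate()`` and ``set_target()`` are hypothetical, and the
default value returned for an empty list is supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical aggregation policies, used by this model only. */
enum agg_type { AGG_MIN, AGG_MAX };

/*
 * Recompute the aggregated target from scratch: the smallest request for
 * AGG_MIN, the largest for AGG_MAX.  With no requests at all, the
 * caller-supplied default is returned.
 */
static int32_t aggregate(const int32_t *reqs, size_t n, enum agg_type type,
                         int32_t default_value)
{
    int32_t target;
    size_t i;

    assert(reqs != NULL || n == 0);
    if (n == 0)
        return default_value;

    target = reqs[0];
    for (i = 1; i < n; i++) {
        if (type == AGG_MIN ? reqs[i] < target : reqs[i] > target)
            target = reqs[i];
    }
    return target;
}

/*
 * Store a recomputed target.  Returns 1 when the value changed (the point
 * at which notifiers would be called), 0 when it stayed the same.
 */
static int set_target(int32_t *target, int32_t new_target)
{
    if (*target == new_target)
        return 0;
    *target = new_target;
    return 1;
}
```

A latency-style parameter would use ``AGG_MIN`` (the strictest request wins),
while a throughput-style parameter would use ``AGG_MAX``.
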
From kernel mode the use of this interface is simple:

void pm_qos_add_request(handle, param_class, target_value):
  Will insert an element into the list for that identified PM QoS class with the
  target value.  Upon change to this list the new target is recomputed and any
  registered notifiers are called only if the target value is now different.
  Clients of pm_qos need to keep the handle for future use in other
  pm_qos API functions.

void pm_qos_update_request(handle, new_target_value):
  Will update the list element pointed to by the handle with the new target value
  and recompute the new aggregated target, calling the notification tree if the
  target is changed.

void pm_qos_remove_request(handle):
  Will remove the element.  After removal it will update the aggregate target and
  call the notification tree if the target was changed as a result of removing
  the request.

int pm_qos_request(param_class):
  Returns the aggregated value for a given PM QoS class.

int pm_qos_request_active(handle):
  Returns whether the request is still active, i.e. it has not been removed
  from a PM QoS class constraints list.

int pm_qos_add_notifier(param_class, notifier):
  Adds a notification callback function to the PM QoS class.  The callback is
  called when the aggregated value for the PM QoS class is changed.

int pm_qos_remove_notifier(param_class, notifier):
  Removes the notification callback function for the PM QoS class.

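As an illustration of the calls listed above, a kernel module might register,
tighten and drop a cpu_dma_latency request roughly as follows.  This is a
sketch that assumes the API exactly as documented here (it differs in other
kernel versions); the ``example_*`` names and the latency values are made up
for the example.

```c
#include <linux/module.h>
#include <linux/pm_qos.h>

/* The handle must stay valid for as long as the request is registered. */
static struct pm_qos_request example_qos_req;

static int __init example_init(void)
{
    /* Ask for a CPU DMA latency of at most 50 usec (illustrative value). */
    pm_qos_add_request(&example_qos_req, PM_QOS_CPU_DMA_LATENCY, 50);

    /* Tighten the constraint later through the same handle. */
    pm_qos_update_request(&example_qos_req, 20);

    /* Read back the value aggregated over all requests in the class. */
    pr_info("aggregated cpu_dma_latency: %d usec\n",
            pm_qos_request(PM_QOS_CPU_DMA_LATENCY));
    return 0;
}

static void __exit example_exit(void)
{
    /* Only remove the request if it is still on the constraints list. */
    if (pm_qos_request_active(&example_qos_req))
        pm_qos_remove_request(&example_qos_req);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```

The handle is a static object here because the framework keeps a reference to
it until pm_qos_remove_request() is called.
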
From user mode:

Only processes can register a pm_qos request.  To provide for automatic
cleanup of a process, the interface requires the process to register its
parameter requests in the following way:

To register the default pm_qos target for the specific parameter, the process
must open one of /dev/[cpu_dma_latency, network_latency, network_throughput].

As long as the device node is held open, that process has a registered
request on the parameter.

To change the requested target value, the process needs to write an s32 value to
the open device node.  Alternatively, the user mode program can write a hex
string for the value using the 10-character format, e.g. "0x12345678".  This
translates to a pm_qos_update_request() call.

To remove the user mode request for a target value, simply close the device
node.

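The user mode steps above can be sketched in C as follows.  The helper names
``qos_request_open()`` and ``qos_format_hex()`` are hypothetical; the sketch
assumes the caller passes the path of one of the device nodes named above and
has permission to open it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open a PM QoS device node and write a binary s32 target to it.
 * Returns the open file descriptor, or -1 on failure.  The request stays
 * registered until the caller close()s the descriptor.
 */
static int qos_request_open(const char *node, int32_t target)
{
    int fd;

    assert(node != NULL);
    fd = open(node, O_RDWR);
    if (fd < 0)
        return -1;
    if (write(fd, &target, sizeof(target)) != (ssize_t)sizeof(target)) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Format a target in the 10-character hex string form, e.g. "0x12345678".
 * Returns the number of characters produced, as snprintf() does.
 */
static int qos_format_hex(char *buf, size_t len, int32_t target)
{
    return snprintf(buf, len, "0x%08x", (unsigned int)target);
}
```

A caller would keep the descriptor returned by, for example,
``qos_request_open("/dev/cpu_dma_latency", 50)`` open for as long as the
constraint matters and close() it afterwards to drop the request.
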
2. PM QoS per-device latency and flags framework
================================================

For each device, there are three lists of PM QoS requests.  Two of them are
maintained along with the aggregated targets of resume latency and active
state latency tolerance (in microseconds), and the third one is for PM QoS
flags.  Values are updated in response to changes of the request list.

The target values of resume latency and active state latency tolerance are
simply the minimum of the request values held in the parameter list elements.
The PM QoS flags aggregate value is the bitwise OR of all list elements'
values.  One device PM QoS flag is currently defined: PM_QOS_FLAG_NO_POWER_OFF.

Note: The aggregated target values are implemented in such a way that reading
the aggregated value does not require any locking mechanism.

From kernel mode the use of this interface is the following:

int dev_pm_qos_add_request(device, handle, type, value):
  Will insert an element into the list for that identified device with the
  target value.  Upon change to this list the new target is recomputed and any
  registered notifiers are called only if the target value is now different.
  Clients of dev_pm_qos need to save the handle for future use in other
  dev_pm_qos API functions.

int dev_pm_qos_update_request(handle, new_value):
  Will update the list element pointed to by the handle with the new target
  value and recompute the new aggregated target, calling the notification
  trees if the target is changed.

int dev_pm_qos_remove_request(handle):
  Will remove the element.  After removal it will update the aggregate target
  and call the notification trees if the target was changed as a result of
  removing the request.

s32 dev_pm_qos_read_value(device, type):
  Returns the aggregated value for a given device's constraints list.

enum pm_qos_flags_status dev_pm_qos_flags(device, mask):
  Check PM QoS flags of the given device against the given mask of flags.
  The meaning of the return values is as follows:

    PM_QOS_FLAGS_ALL:
      All flags from the mask are set
    PM_QOS_FLAGS_SOME:
      Some flags from the mask are set
    PM_QOS_FLAGS_NONE:
      No flags from the mask are set
    PM_QOS_FLAGS_UNDEFINED:
      The device's PM QoS structure has not been initialized
      or the list of requests is empty.

int dev_pm_qos_add_ancestor_request(dev, handle, type, value):
  Add a PM QoS request for the first direct ancestor of the given device whose
  power.ignore_children flag is unset (for DEV_PM_QOS_RESUME_LATENCY requests)
  or whose power.set_latency_tolerance callback pointer is not NULL (for
  DEV_PM_QOS_LATENCY_TOLERANCE requests).

int dev_pm_qos_expose_latency_limit(device, value):
  Add a request to the device's PM QoS list of resume latency constraints and
  create a sysfs attribute pm_qos_resume_latency_us under the device's power
  directory allowing user space to manipulate that request.

void dev_pm_qos_hide_latency_limit(device):
  Drop the request added by dev_pm_qos_expose_latency_limit() from the device's
  PM QoS list of resume latency constraints and remove the sysfs attribute
  pm_qos_resume_latency_us from the device's power directory.

int dev_pm_qos_expose_flags(device, value):
  Add a request to the device's PM QoS list of flags and create the sysfs
  attribute pm_qos_no_power_off under the device's power directory allowing
  user space to change the value of the PM_QOS_FLAG_NO_POWER_OFF flag.

void dev_pm_qos_hide_flags(device):
  Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS
  list of flags and remove the sysfs attribute pm_qos_no_power_off from the
  device's power directory.

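The flag aggregation and the return values of dev_pm_qos_flags() described
above can be modelled in user-space C.  This is a sketch of the documented
behaviour only: the ``flags_status`` enum and both functions are local
stand-ins, and the bit values used when calling them are arbitrary, since
only one real flag is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the status names used in the text. */
enum flags_status {
    FLAGS_UNDEFINED = -1,
    FLAGS_NONE,
    FLAGS_SOME,
    FLAGS_ALL,
};

/* Aggregate flag requests by OR-ing all of them together. */
static uint32_t flags_aggregate(const uint32_t *reqs, size_t n)
{
    uint32_t effective = 0;
    size_t i;

    for (i = 0; i < n; i++)
        effective |= reqs[i];
    return effective;
}

/* Compare the aggregate against a mask of flags. */
static enum flags_status flags_check(const uint32_t *reqs, size_t n,
                                     uint32_t mask)
{
    uint32_t val;

    assert(reqs != NULL || n == 0);
    if (n == 0)
        return FLAGS_UNDEFINED;     /* the list of requests is empty */

    val = flags_aggregate(reqs, n) & mask;
    if (val == 0)
        return FLAGS_NONE;
    return val == mask ? FLAGS_ALL : FLAGS_SOME;
}
```

With requests ``0x1`` and ``0x4``, the aggregate is ``0x5``, so a mask of
``0x3`` matches only partially and yields the "some" status.
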
Notification mechanisms:

The per-device PM QoS framework has a per-device notification tree.

int dev_pm_qos_add_notifier(device, notifier, type):
  Adds a notification callback function for the device for a particular request
  type.

  The callback is called when the aggregated value of the device constraints list
  is changed.

int dev_pm_qos_remove_notifier(device, notifier, type):
  Removes the notification callback function for the device.

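Putting the per-device request and notifier calls together, driver code might
look roughly like the sketch below.  It assumes the signatures exactly as
listed in this document; the ``example_*`` names and the 100 usec value are
hypothetical.

```c
#include <linux/device.h>
#include <linux/notifier.h>
#include <linux/pm_qos.h>

/* The handle must outlive the request it represents. */
static struct dev_pm_qos_request example_req;

/* Called with the new aggregated value whenever it changes. */
static int example_qos_notify(struct notifier_block *nb, unsigned long value,
                              void *data)
{
    pr_info("resume latency constraint is now %lu usec\n", value);
    return NOTIFY_OK;
}

static struct notifier_block example_nb = {
    .notifier_call = example_qos_notify,
};

static int example_setup(struct device *dev)
{
    int ret;

    ret = dev_pm_qos_add_notifier(dev, &example_nb,
                                  DEV_PM_QOS_RESUME_LATENCY);
    if (ret)
        return ret;

    /* Negative return values are errors; 0 and 1 both mean success. */
    ret = dev_pm_qos_add_request(dev, &example_req,
                                 DEV_PM_QOS_RESUME_LATENCY, 100);
    if (ret < 0) {
        dev_pm_qos_remove_notifier(dev, &example_nb,
                                   DEV_PM_QOS_RESUME_LATENCY);
        return ret;
    }
    return 0;
}

static void example_teardown(struct device *dev)
{
    dev_pm_qos_remove_request(&example_req);
    dev_pm_qos_remove_notifier(dev, &example_nb, DEV_PM_QOS_RESUME_LATENCY);
}
```

Registering the notifier before adding the request means the callback also
sees the change caused by the driver's own request.
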
Active state latency tolerance
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This device PM QoS type is used to support systems in which hardware may switch
to energy-saving operation modes on the fly.  In those systems, if the operation
mode chosen by the hardware attempts to save energy in an overly aggressive way,
it may cause excess latencies to be visible to software, causing it to miss
certain protocol requirements or target frame or sample rates etc.

If there is a latency tolerance control mechanism for a given device available
to software, the .set_latency_tolerance callback in that device's dev_pm_info
structure should be populated.  The routine pointed to by it should implement
whatever is necessary to transfer the effective requirement value to the
hardware.

Whenever the effective latency tolerance changes for the device, its
.set_latency_tolerance() callback will be executed and the effective value will
be passed to it.  If that value is negative, which means that the list of
latency tolerance requirements for the device is empty, the callback is expected
to switch the underlying hardware latency tolerance control mechanism to an
autonomous mode if available.  If that value is PM_QOS_LATENCY_ANY, in turn, and
the hardware supports a special "no requirement" setting, the callback is
expected to use it.  That allows software to prevent the hardware from
automatically updating the device's latency tolerance in response to its power
state changes (e.g. during transitions from D3cold to D0), which generally may
be done in the autonomous latency tolerance control mode.

If .set_latency_tolerance() is present for the device, the sysfs attribute
pm_qos_latency_tolerance_us will be present in the device's power directory.
Then, user space can use that attribute to specify its latency tolerance
requirement for the device, if any.  Writing "any" to it means "no requirement,
but do not let the hardware control latency tolerance" and writing "auto" to it
allows the hardware to be switched to the autonomous mode if there are no other
requirements from the kernel side in the device's list.

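From user space, setting the attribute amounts to writing a short string to a
file.  The helper below is a minimal sketch with a hypothetical name; the
caller supplies the full path of the device's pm_qos_latency_tolerance_us
attribute, which depends on where the device sits in sysfs.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a latency tolerance setting ("any", "auto" or a decimal number of
 * microseconds) to the given attribute file.
 * Returns 0 on success, -1 on failure.
 */
static int write_latency_tolerance(const char *attr_path, const char *setting)
{
    FILE *f;
    int ok;

    assert(attr_path != NULL && setting != NULL);
    f = fopen(attr_path, "w");
    if (!f)
        return -1;
    ok = fputs(setting, f) >= 0;
    /* fclose() flushes the buffer, so a rejected write shows up here. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Passing ``"auto"`` corresponds to handing control back to the hardware, while
``"any"`` keeps the hardware from adjusting the tolerance by itself.
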
Kernel code can use the functions described above along with the
DEV_PM_QOS_LATENCY_TOLERANCE device PM QoS type to add, remove and update
latency tolerance requirements for devices.