linux_dsm_epyc7002/tools/power/cpupower/man/cpupower-set.1
Peter Zijlstra 8e7fbcbc22 sched: Remove stale power aware scheduling remnants and dysfunctional knobs
It's been broken forever (i.e. it's not scheduling in a power
aware fashion), as reported by Suresh and others sending
patches, and nobody cares enough to fix it properly ...
so remove it to make space free for something better.

There's various problems with the code as it stands today, first
and foremost the user interface which is bound to topology
levels and has multiple values per level. This results in a
state explosion which the administrator or distro needs to
master and almost nobody does.

Furthermore large configuration state spaces aren't good, it
means the thing doesn't just work right because it's either
under so many impossibe to meet constraints, or even if
there's an achievable state workloads have to be aware of
it precisely and can never meet it for dynamic workloads.

So pushing this kind of decision to user-space was a bad idea
even with a single knob - it's exponentially worse with knobs
on every node of the topology.

There is a proposal to replace the user interface with a single
3 state knob:

 sched_balance_policy := { performance, power, auto }

where 'auto' would be the preferred default which looks at things
like Battery/AC mode and possible cpufreq state or whatever the hw
exposes to show us power use expectations - but there's been no
progress on it in the past many months.

Aside from that, the actual implementation of the various knobs
is known to be broken. There have been sporadic attempts at
fixing things but these always stop short of reaching a mergable
state.

Therefore this wholesale removal with the hopes of spurring
people who care to come forward once again and work on a
coherent replacement.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1326104915.2442.53.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-17 13:48:56 +02:00

95 lines
3.1 KiB
Groff

.TH CPUPOWER\-SET "1" "22/02/2011" "" "cpupower Manual"
.SH NAME
cpupower\-set \- Set processor power related kernel or hardware configurations
.SH SYNOPSIS
.ft B
.B cpupower set [ \-b VAL ] [ \-s VAL ] [ \-m VAL ]
.SH DESCRIPTION
\fBcpupower set \fP sets kernel configurations or directly accesses hardware
registers affecting processor power saving policies.
Some options are platform wide, some affect single cores. By default values
are applied on all cores. How to modify single core configurations is
described in the cpupower(1) manpage in the \-\-cpu option section. Whether an
option affects the whole system or can be applied to individual cores is
described in the Options sections.
Use \fBcpupower info \fP to read out current settings and whether they are
supported on the system at all.
.SH Options
.PP
\-\-perf-bias, \-b
.RS 4
Sets a register on supported Intel processore which allows software to convey
its policy for the relative importance of performance versus energy savings to
the processor.
The range of valid numbers is 0-15, where 0 is maximum
performance and 15 is maximum energy efficiency.
The processor uses this information in model-specific ways
when it must select trade-offs between performance and
energy efficiency.
This policy hint does not supersede Processor Performance states
(P-states) or CPU Idle power states (C-states), but allows
software to have influence where it would otherwise be unable
to express a preference.
For example, this setting may tell the hardware how
aggressively or conservatively to control frequency
in the "turbo range" above the explicitly OS-controlled
P-state frequency range. It may also tell the hardware
how aggressively it should enter the OS requested C-states.
This option can be applied to individual cores only via the \-\-cpu option,
cpupower(1).
Setting the performance bias value on one CPU can modify the setting on
related CPUs as well (for example all CPUs on one socket), because of
hardware restrictions.
Use \fBcpupower -c all info -b\fP to verify.
This options needs the msr kernel driver (CONFIG_X86_MSR) loaded.
.RE
.PP
\-\-sched\-mc, \-m [ VAL ]
.RE
\-\-sched\-smt, \-s [ VAL ]
.RS 4
\-\-sched\-mc utilizes cores in one processor package/socket first before
processes are scheduled to other processor packages/sockets.
\-\-sched\-smt utilizes thread siblings of one processor core first before
processes are scheduled to other cores.
The impact on power consumption and performance (positiv or negativ) heavily
depends on processor support for deep sleep states, frequency scaling and
frequency boost modes and their dependencies between other thread siblings
and processor cores.
Taken over from kernel documentation:
Adjust the kernel's multi-core scheduler support.
Possible values are:
.RS 2
0 - No power saving load balance (default value)
1 - Fill one thread/core/package first for long running threads
2 - Also bias task wakeups to semi-idle cpu package for power
savings
.RE
.SH "SEE ALSO"
cpupower-info(1), cpupower-monitor(1), powertop(1)
.PP
.SH AUTHORS
.nf
\-\-perf\-bias parts written by Len Brown <len.brown@intel.com>
Thomas Renninger <trenn@suse.de>