Commit Graph

415 Commits

Author SHA1 Message Date
Len Brown
fd3933ca7b tools/power turbostat: fix MSR_IA32_MISC_ENABLE MWAIT printout
MSR_IA32_MISC_ENABLE[18] is the MWAIT ENABLE bit, not DISABLE bit...

so

MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST No-MWAIT PREFETCH TURBO)

should print as:

MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT PREFETCH TURBO)

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:06 -04:00
Artem Bityutskiy
47936f944e tools/power turbostat: fix printing on input
The recent patch that implements table printing on a keypress introduced a
regression - turbostat prints the table almost continuously if it is run from a
daemon program.

The problem is also easy to reproduce like this:

echo | turbostat

The reason is that we cannot assume that stdin is always a TTY. It can be many
things.

This patch adds fixes the problem by limiting the new keypress functionality to
TTYs only. If stdin is not a TTY, we just sleep for the full interval time.

While on it, clean-up 'do_sleep()' to return no value, as callers do not expect
that anyway.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:05 -04:00
Len Brown
b9ad8ee0da tools/power turbostat: end current interval upon newline input
In turbostat interval mode, a newline typed on standard input
will now conclude the current interval.  Data will immediately
be collected and printed for that interval, and the next interval
will be started.

This is similar to the recently added SIGUSR1 feature.
But that is for use by programs, while this is for interactive use.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:05 -04:00
Len Brown
072119606a tools/power turbostat: on SIGUSR1: sample, print and continue
Interval-mode turbostat now catches and discards SIGUSR1.

Thus, SIGUSR1 can be used to tell turbostat to cut short
the current measurement interval.  Turbostat will then start
the next measurement interval using the regular interval length.

This can be used to give turbostat variable intervals.
Invoke turbostat with --interval LARGE_NUMBER_SEC
and have a program that has permission to send it a SIGUSR1
always before LARGE_NUMBER_SEC expires.

It may also be useful to use "--enable Time_Of_Day_Seconds"
to observe the actual interval length.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:04 -04:00
Len Brown
8aa2ed0b28 tools/power turbostat: on SIGINT: sample, print and exit
When running in interval-mode, catch interrupts
and print a final data record before exiting.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:04 -04:00
Len Brown
3f44a5c62b tools/power turbostat: add --enable Time_Of_Day_Seconds
Add a Time_Of_Day_Seconds column showing when measurement
for each row was completed.  Units are [sec.subsec] since Epoch,
as reported by gettimeofday(2).

While useful to correlate turbostat output with other tools,
this built-in column is disabled, by default.

Add the "--enable" option to enable such disabled-by-default
built-in columns:

"--enable Time_Of_Day_Seconds"
"--enable usec"

"--enable all", will enable all disabled-by-defauilt built-in counters.

When "--debug" is used, all disabled-by-default columns are enabled,
unless explicitly skipped using "--hide"

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:04 -04:00
Artem Bityutskiy
2085e12441 tools/power turbostat: fix Skylake Xeon package C-state display
Turbostat neglects to display all package C-states for some Skylake Xeon BIOS configurations.

This is due to a typo in the table decoding MSR_PKG_CST_CONFIG_CONTROL (0x000000e2)

Here we fix that typo, according to Intel SDM, vol 4, Table 2-41 -
"MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H".

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:03 -04:00
Doug Smythies
35459105de tools/power/x86/intel_pstate_tracer: Add optional setting of trace buffer memory allocation
Allow the user to override the default trace buffer memory allocation
by adding a command line option to override the default.

The patch also:

Adds a SIGINT (i.e. CTRL C exit) handler,
so that things can be cleaned up before exit.

Moves the postion of some other cleanup from after to
before the potential "No valid data to plot" exit.

Replaces all quit() calls with sys.exit, because
quit() is not supposed to be used in scripts.

Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-17 10:19:34 +02:00
Doug Smythies
fbe313884d tools/power/x86/intel_pstate_tracer: Free the trace buffer memory
The trace buffer memory should be, mostly, freed after
the buffer has been output.

This patch is required before a future patch that will allow
the user to override the default, and specify the trace buffer
memory allocation as a command line option.

Signed-off-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-01-10 01:12:04 +01:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Len Brown
c97cc7dbce Revert "tools/power turbostat: stop migrating, unless '-m'"
This reverts commit c91fc8519d.

That change caused a C6 and PC6 residency regression on large idle systems.

Users also complained about new output indicating jitter:

turbostat: cpu6 jitter 3794 9142

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: 4.13+ <stable@vger.kernel.org> # v4.13+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-10-18 03:17:45 +02:00
Rafael J. Wysocki
5422583bfa Merge back PM tools material for v4.13. 2017-06-27 01:42:51 +02:00
Len Brown
f7d44a8f3f tools/power turbostat: update version number
Signed-off-by: Len Brown <len.brown@intel.com>
2017-06-24 20:03:42 -07:00
Len Brown
f26b151977 tools/power turbostat: decode MSR_IA32_MISC_ENABLE only on Intel
otherwise, turbostat bails on on AMD Opteron boxes:

turbostat: cpu26: msr offset 0x1a0 read failed: Input/output error

Reported-by: Kamil Kolakowski <kkolakow@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2017-06-24 20:03:41 -07:00
Len Brown
c91fc8519d tools/power turbostat: stop migrating, unless '-m'
Turbostat has the capability to set its own affinity to
each CPU so that its MSR accesses are on the local CPU.

However, using the in-kernel cross-call in  the msr driver
tends to be less invasive, so do that -- by-default.
'-m' remains to get the old behaviour.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-06-24 20:03:19 -07:00
Len Brown
f4fdf2b474 tools/power turbostat: if --debug, print sampling overhead
The --debug option now pre-pends each row with
the number  of micro-seconds [usec] to collect
the finishing snapshot for that row.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-06-23 20:52:23 -07:00
Len Brown
a99d87306f tools/power turbostat: hide SKL counters, when not requested
Skylake has some new counters, and they were erroneously
exempt  from --show and --hide

eg.

turbostat  --quiet --show CPU
CPU	Totl%C0	Any%C0	GFX%C0	CPUGFX%
-	116.73	90.56	85.69	79.00
0	117.78	91.38	86.47	79.71
2
1
3

is now

CPU
-
0
2
1
3

Signed-off-by: Len Brown <len.brown@intel.com>
2017-06-23 20:52:23 -07:00
Rafael J. Wysocki
a32f80b30d Merge branch 'utilities' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull power management utilities updates from Len Brown.

* 'utilities' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  intel_pstate: use updated msr-index.h HWP.EPP values
  tools/power x86_energy_perf_policy: support HWP.EPP
  x86: msr-index.h: fix shifts to ULL results in HWP macros.
  x86: msr-index.h: define HWP.EPP values
  x86: msr-index.h: define EPB mid-points
2017-05-16 03:15:27 +02:00
Len Brown
4beec1d751 tools/power x86_energy_perf_policy: support HWP.EPP
x86_energy_perf_policy(8) was created as an example
of how the user, or upper-level OS, can manage
MSR_IA32_ENERGY_PERF_BIAS (EPB).

Hardware consults EPB when it makes internal decisions
balancing energy-saving vs performance.
For example, should HW quickly or slowly
transition into and out of power-saving idles states?
Should HW quickly or slowly ramp frequency up or down
in response to demand in the turbo-frequency range?

Depending on the processor, EPB may have package, core,
or CPU thread scope.  As such, the only general policy
is to write the same value to EPB on every CPU in the system.

Recent platforms add support for Hardware Performance States (HWP).
HWP effectively extends hardware frequency control from
the opportunistic turbo-frequency range to control the entire
range of available processor frequencies.

Just as turbo-mode used EPB, HWP can use EPB to help decicde
how quickly to ramp frequency and voltage up and down
in response to changing demand.  Indeed, BDX and BDX-DE,
the first processors to support HWP, use EPB for this purpose.

Starting in SKL, HWP no longer looks to EPB for influence.
Instead, it looks in a new MSR specifically for this purpose:
IA32_HWP_REQUEST.Energy_Performance_Preference (HWP.EPP).
HWP.EPP is like EPB, except that it is specific to HWP-mode
frequency selection.  Also, HWP.EPP is defined to have
per CPU-thread scope.

Starting in SKX, IA32_HWP_REQUEST is augmented by
IA32_HWP_REQUEST_PKG -- which has the same function, but is
defined to have package-wide scope.  A new bit in IA32_HWP_REQUEST
determines if it over-rides the IA32_HWP_REQUEST_PKG or not.

Note that HWP-mode can be enabled in several ways.
The "in-band" method is for HWP to be exposed in CPUID,
and for the Linux intel_pstate driver to recognized that,
and thus enable HWP.  In this case, starting in Linux 4.10, intel_pstate
exports cpufreq sysfs attribute "energy_performance_preference"
which can be used to manage HWP.EPP.  This interface can be
used to set HWP.EPP to these values:

0 performance
128 balance_performance (default)
192 balance_power
255 power

Here, x86_energy_performance_policy is updated to use
idential strings and values as intel_pstate.

But HWP-mode may also be enabled by firmware before the OS boots,
and the OS may not be aware of HWP.  In this case, intel_pstate
is not available to provide sysfs attributes, and x86_energy_perf_policy
or a similar utility is invaluable for managing HWP.EPP, for
this utility works the same, no matter if cpufreq is enabled or not.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-05-11 21:27:45 -04:00
Doug Smythies
010a522cf2 tools/power/x86/intel_pstate_tracer: Adjust directory ownership
The intel_pstate_tracer.py script only needs to be run as root
when it is also used to actually acquire the trace data that
it will post process. Otherwise it is generally preferable
that it be run as a regular user.
If run the first time as root the results directory will be
incorrect for any subsequent run as a regular user. For any run
as root the specific testname subdirectory will not allow any
subsequent file saves by a regular user. Typically, and for example,
the regular user might be attempting to save a .csv file converted to
a spreadsheet with added calculations or graphs.

Set the directories and files owner and groups IDs to be the regular
user, if required.

Signed-off-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 02:47:28 +02:00
Rafael J. Wysocki
ad0d9c3bca Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat utility fixes for v4.11 from Len Brown.

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: update version number
  tools/power turbostat: fix impossibly large CPU%c1 value
  tools/power turbostat: turbostat.8 add missing column definitions
  tools/power turbostat: update HWP dump to decimal from hex
  tools/power turbostat: enable package THERM_INTERRUPT dump
  tools/power turbostat: show missing Core and GFX power on SKL and KBL
  tools/power turbostat: bugfix: GFXMHz column not changing
2017-04-13 14:50:11 +02:00
Len Brown
5f9bf02a58 tools/power turbostat: update version number
Signed-off-by: Len Brown <len.brown@intel.com>
2017-04-12 20:03:50 -04:00
Len Brown
95149369c1 tools/power turbostat: fix impossibly large CPU%c1 value
Most CPUs do not have a hardware c1 counter,
and so turbostat derives c1 residency:

c1 = TSC - MPERF - other_core_cstate_counters

As it is not possible to atomically read these coutners,
measurement jitter can case this calcuation to "go negative"
when very close to 0.  Turbostat detect that case and
simply prints c1 = 0.00%

But that check neglected to account for systems where the TSC
crystal clock domain and the MPERF BCLK domain are differ by
a small amount.  That allowed very small negative c1 numbers
to escape this check and be printed as huge positve numbers.

This code begs for a bit of cleanup, but this patch
is the minimal change to fix the issue.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-04-12 20:03:50 -04:00
Doug Smythies
ab23d1146a tools/power turbostat: turbostat.8 add missing column definitions
Add GFX%rc6 and GFXMHz to the column descriptions section
of the turbostat man page.

Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Len Brown <len.brown@intel.com>
2017-04-12 20:03:49 -04:00
Len Brown
6dbd25a245 tools/power turbostat: update HWP dump to decimal from hex
Syntax only.

The HWP CAPABILTIES and REQUEST ratios are more easily
viewed in decimal -- just multiply by 100 and you get MHz...

new:
cpu0: MSR_HWP_CAPABILITIES: 0x010c1b23 (high 35 guar 27 eff 12 low 1)
cpu0: MSR_HWP_REQUEST: 0x80002301 (min 1 max 35 des 0 epp 0x80 window 0x0 pkg 0x0)

old:
cpu0: MSR_HWP_CAPABILITIES: 0x010c1b23 (high 0x23 guar 0x1b eff 0xc low 0x1)
cpu0: MSR_HWP_REQUEST: 0x80002301 (min 0x1 max 0x23 des 0x0 epp 0x80 window 0x0 pkg 0x0)

Signed-off-by: Len Brown <len.brown@intel.com>
2017-04-12 20:03:35 -04:00
Len Brown
f4896fa502 tools/power turbostat: enable package THERM_INTERRUPT dump
cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x00641400 (100 C)
cpu0: MSR_IA32_PACKAGE_THERM_STATUS: 0x884b0800 (25 C)
cpu0: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x00000003 (100 C, 100 C)

Enable the same per-core output, but hide it behind --debug
because it is too verbose on big systems.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-04-12 20:03:34 -04:00
Len Brown
818249216d tools/power turbostat: show missing Core and GFX power on SKL and KBL
While the current SDM is silent on the matter, the Core and GFX
RAPL power meters on SKL and KBL appear to work -- so show them.

Reported-by: Yaroslav Isakov <yaroslav.isakov@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2017-04-12 20:03:19 -04:00
Len Brown
22048c5485 tools/power turbostat: bugfix: GFXMHz column not changing
turbostat displays a GFXMHz column, which comes from reading
/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz

But GFXMHz was not changing, even when a manual
cat /sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz
showed a new value.

It turns out that a rewind() on the open file is not sufficient,
fflush() (or a close/open) is needed to read fresh values.

Reported-by: Yaroslav Isakov <yaroslav.isakov@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-04 15:42:48 -05:00
Rafael J. Wysocki
6bff9c609f Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull changes related to turbostat for v4.11 from Len Brown.

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (44 commits)
  tools/power turbostat: version 17.02.24
  tools/power turbostat: bugfix: --add u32 was printed as u64
  tools/power turbostat: show error on exec
  tools/power turbostat: dump p-state software config
  tools/power turbostat: show package number, even without --debug
  tools/power turbostat: support "--hide C1" etc.
  tools/power turbostat: move --Package and --processor into the --cpu option
  tools/power turbostat: turbostat.8 update
  tools/power turbostat: update --list feature
  tools/power turbostat: use wide columns to display large numbers
  tools/power turbostat: Add --list option to show available header names
  tools/power turbostat: fix zero IRQ count shown in one-shot command mode
  tools/power turbostat: add --cpu parameter
  tools/power turbostat: print sysfs C-state stats
  tools/power turbostat: extend --add option to accept /sys path
  tools/power turbostat: skip unused counters on BDX
  tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits
  tools/power turbostat: skip unused counters on SKX
  tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7
  tools/power turbostat: initial Gemini Lake SOC support
  ...
2017-03-01 23:34:38 +01:00
Len Brown
e3942ed8c6 tools/power turbostat: version 17.02.24
The turbostat before this last set of changes is obsolete.
This new version can do a lot more, but it also has
some different defaults, that might catch some off-guard.
So it seems a good time to give a new version number.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:26 -05:00
Len Brown
5f3aea5777 tools/power turbostat: bugfix: --add u32 was printed as u64
When the "u32" keyword is used with --add, it means that
the output should be truncated to 32-bits.  This was not
happening and all 64-bits were printed.

Also, when no column name was used for an added MSR,
The default column name was in deximal, eg. MSR16.
Users report that they tend to use hex MSR numbers,
so print them in hex.  To always fit into the columns,
use the syntax M0x10.  Note that the user can always
supply any column header that they want.

eg --add msr0x10,MY_TSC

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:26 -05:00
Len Brown
0815a3d09b tools/power turbostat: show error on exec
When turbostat is run in one-shot command mode,
the parent takes the 'before' counter snapshot,
fork/exec/wait for the child to exit,
takes the 'after' counter snapshot,
and prints the results.

however, if the child fails to exec the command,
it immediately returns, without indicating that
anythign was wrong.

Add an error message showing that exec failed:

sudo turbostat sleeeep 4
...
turbostat: exec sleeeep: No such file or directory
...

Note that the parent will still print out the statistics,
because it can't tell the difference between the failed
exec and a command that is purposefully returning
the same status.  Unfortunately, this may obscure the
error message.  However, if the --out parameter is used,
the error message is evident on stderr.

Reported-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:25 -05:00
Len Brown
7293fccdff tools/power turbostat: dump p-state software config
cpu1: cpufreq driver: acpi-cpufreq
cpu1: cpufreq governor: ondemand
cpufreq boost: 1

or

cpu0: cpufreq driver: intel_pstate
cpu0: cpufreq governor: powersave
cpufreq intel_pstate no_turbo: 0

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:25 -05:00
Len Brown
7da6e3e212 tools/power turbostat: show package number, even without --debug
On multi-package systems, the "Package" column was being displayed
only if --debug was used.  Show it always.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:25 -05:00
Len Brown
dd778a5e6b tools/power turbostat: support "--hide C1" etc.
Originally, the only way to hide the sysfs C-state statistics columns
was with "--hide sysfs".  This was because we process "--hide" before
we probe for those columns.

hack --hide to remember deferred hide requests, and apply
them when sysfs is probed.

"--hide sysfs" is still available as short-hand to refer to
the entire group of counters.

The down-side of this change is that we no longer error check for
bogus --hide column names.  But the user will quickly figure that
out if a column they mean to hide is still there...

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:24 -05:00
Len Brown
4e4e1e7c6e tools/power turbostat: move --Package and --processor into the --cpu option
--Package is now "--cpu package",
which will display just the 1st CPU in each package

--processor is not "--cpu core"
which will display just the 1st CPU in each core

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:24 -05:00
Len Brown
da67e2b9fd tools/power turbostat: turbostat.8 update
update examples to show recently updated features.
In particular
--add
--show
--hide
--cpu
--list

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:23 -05:00
Len Brown
6168c2e0fb tools/power turbostat: update --list feature
Make it possible to take the entire un-edited output
from `turbostat --list` and feed it to "turbostat --show"
or "turbostat --hide".

To do this, the leading comma was removed
(no mater what columns are active)
and also they dynamic C-state "C1, C2, C3" etc are replaced
by the string "sysfs", which refers to them as a group.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:23 -05:00
Len Brown
0de6c0df4e tools/power turbostat: use wide columns to display large numbers
When a counter overlfows 7 columns, it shifts the remaining
columns to the right, so they no longer line up under
their column header.

Update turbostat to dectect when it is handling large
numbers, and switch to wider columns where, necessary.

Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:23 -05:00
Len Brown
c8ade3616a tools/power turbostat: Add --list option to show available header names
It is handy to know the list of column header names,
so that they can be used with --add and --skip

The new --list option shows them:

sudo ./turbostat --list --hide sysfs
,Core,CPU,Avg_MHz,Busy%,Bzy_MHz,TSC_MHz,IRQ,SMI,CPU%c1,CPU%c3,CPU%c6,CPU%c7,CoreTmp,PkgTmp,GFX%rc6,GFXMHz,PkgWatt,CorWatt,GFXWatt

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:22 -05:00
Len Brown
218f0e8d5c tools/power turbostat: fix zero IRQ count shown in one-shot command mode
The IRQ column has been working for periodic mode,
but not in one-shot command mode, it shows only 0.

until now.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:22 -05:00
Len Brown
1ef7d21afe tools/power turbostat: add --cpu parameter
With the --cpu parameter, turbostat prints only lines
for the specified set of CPUs:

sudo ./turbostat --quiet --show Core,CPU --cpu 0,1,3..5,6-7
	Core	CPU
	-	-
	0	0
	0	4
	1	1
	1	5
	2	6
	3	3
	3	7

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:22 -05:00
Len Brown
41618e63f2 tools/power turbostat: print sysfs C-state stats
When turbostat shows % of time in a CPU idle power state,
it has always been showing information from underlying
hardware residency counters.

While this reflects what the hardware is doing, and is thus
useful for understanding the hardware,
it doesn't directly tell us what Linux requested --
which is useful for tuning Linux itself.

Here we add columns to turbostat to show the
Linux cpuidle sub-system statistics:
/sys/devices/system/cpu/cpu*/cpuidle/state*/*

The first group of columns are the "usage", which is the
number of times software requested that C-state in the
measurement interval. eg C1 below.

The second group of columns are the "time", which is the percentage
of the measurement interval time that software has requested
the specified C-state. eg C1% below.

These software counters can be compared to the underlying
hardware residency counters (eg CPU%c1	CPU%c3	CPU%c6	CPU%c7)
to compare what sofware requested to what the hardware delivered.

These sysfs attributes are discovered when turbostat starts,
rather than being "built in".  So the --show and --hide
parameters do not know about these dynamic column names.
However "--show sysfs" and "--hide sysfs" act on the
entire group of columns:

turbostat --show sysfs
...
cpu4: POLL: CPUIDLE CORE POLL IDLE
cpu4: C1: MWAIT 0x00
cpu4: C1E: MWAIT 0x01
cpu4: C3: MWAIT 0x10
cpu4: C6: MWAIT 0x20
cpu4: C7s: MWAIT 0x32
...
C1 	C1E	C3 	C6 	C7s	C1% 	C1E%	C3%	C6% 	C7s%
3	6	5	1	188	0.00	0.02	0.00	0.00	99.93
0	6	5	0	58	0.00	0.16	0.02	0.00	99.70
0	0	0	0	9	0.00	0.00	0.00	0.00	99.96
0	0	0	1	24	0.00	0.00	0.00	0.02	99.93
0	0	0	0	9	0.00	0.00	0.00	0.00	99.97
0	0	0	0	32	0.00	0.00	0.00	0.00	99.96
0	0	0	0	7	0.00	0.00	0.00	0.00	99.98
2	0	0	0	36	0.00	0.00	0.00	0.00	99.97
1	0	0	0	13	0.00	0.00	0.00	0.00	99.98

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:21 -05:00
Len Brown
495c7654cc tools/power turbostat: extend --add option to accept /sys path
Previously, the --add option could specify only an MSR.

Here is is extended so an arbitrary /sys attribute,
as specified by an absolute file path name.

sudo ./turbostat --add /sys/devices/system/cpu/cpu0/cpuidle/state5/usage

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:21 -05:00
Len Brown
ade0ebacdf tools/power turbostat: skip unused counters on BDX
Skip these two counters on BDX, as they are always zero:
cc7, pc7

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:21 -05:00
Len Brown
31e07522be tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits
Newer processors do not hard-code the the number of cpus in each bin
to {1, 2, 3, 4, 5, 6, 7, 8}  Rather, they can specify any number
of CPUS in each of the 8 bins:

eg.

...
37 * 100.0 = 3600.0 MHz max turbo 4 active cores
38 * 100.0 = 3700.0 MHz max turbo 3 active cores
39 * 100.0 = 3800.0 MHz max turbo 2 active cores
39 * 100.0 = 3900.0 MHz max turbo 1 active cores

could now look something like this:

...
37 * 100.0 = 3600.0 MHz max turbo 16 active cores
38 * 100.0 = 3700.0 MHz max turbo 8 active cores
39 * 100.0 = 3800.0 MHz max turbo 4 active cores
39 * 100.0 = 3900.0 MHz max turbo 2 active cores

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:20 -05:00
Len Brown
34c7619762 tools/power turbostat: skip unused counters on SKX
Skip these four counters on SKX, as they are always zero:
cc3, pc3
cc7, pc7

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:20 -05:00
Len Brown
7170a37437 tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7
The CC1 column in tubostat can be computed by subtracting
the core c-state residency countes from the total Cx residency.

CC1 = (Idle_time_as_measured by MPERF) - (all core C-states with
residency counters)

However, as the underlying counter reads are not atomic,
error can be noticed in this calculations, especially
when the numbers are small.

Denverton has a hardware CC1 residency counter
to improve the accuracy of the cc1 statistic -- use it.

At the same time, Denverton has no concept of CC3, PC3, CC7, PC7,
so skip collecting and printing those columns.

Finally, a note of clarification.
Turbostat prints the standard PC2 residency counter,
but on Denverton hardware, that actually means PC1E.
Turbostat prints the standard PC6 residency counter,
but on Denverton hardware, that actually means PC2.

At this point, we document that differnce in this commit message,
rather than adding a quirk to the software.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:20 -05:00
Len Brown
ac01ac1371 tools/power turbostat: initial Gemini Lake SOC support
Gemini Lake is similar to Apollo Lake (Broxton/Goldmont)

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:19 -05:00
Len Brown
0f47c08d8c tools/power turbostat: bug fixes to --add, --show/--hide features
Fix a bug with --add, where the title of the column
is un-initialized if not specified by the user.

The initial implementation of --show and --hide
neglected to handle the pc8/pc9/pc10 counters.

Fix a bug where "--show Core" only worked with --debug

Reported-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:13 -05:00
Len Brown
008d396eb2 tools/power turbostat: use tsc_tweak everwhere it is needed
The CPU ticks at a rate in the "bus clock" domain.
eg. 100 MHz * bus_ratio.

On newer processors, the TSC has been moved out of this BCLK
domain and into a separate crystal-clock domain.

While the TSC ticks "close to" the base frequency, those that look
closely at the numbers will notice small errors in calculations that
mix units of TSC clocks and bus clocks.

"tsc_tweak" was introduced to address the most visible
mixing -- the %Busy and the the Busy_MHz calculations.
(A simplification as since removed TSC from the BusyMHz calculation)

Here we apply the tsc_tweak to everyplace where BCLK
and TSC units are mixed.  The results is that
on a system which is 100% idle, the sum of the C-states
are now much more likely to be closer to 100%.

Reported-by: Travis Downs <travis.downs@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:13 -05:00
Len Brown
96e4715857 tools/power turbostat: print system config, unless --quiet
Some users want turbostat to tell them everything, by default.
Some users want turbostat to be quiet, by default.

I find that I'm in the 1st camp, and so I've never liked
needing to type the --debug parameter to decode the system
configuration.

So here we change the default and print the system configuration,
by default.  (The --debug option is now un-documented, though
it does still exist for debugging turbostat internals)

When you do not want to see the system configuration
header, use the new "--quiet" option.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:12 -05:00
Len Brown
fee86541d2 tools/power turbostat: show all columns, independent of --debug
Some time ago, turbostat overflowed 80 columns.

So on the assumption that a "casual" user would always
want topology and frequency columns, we hid the rest
of the columns and the system configuration decoding
behind the --debug option.

Not everybody liked that change -- including me.
I use --debug 99% of the time...

Well, now we have "-o file" to put turbostat output into a file,
so unless you are watching real-time in a small window,
column count is less frequently a factor.

And more recently, we got the "--hide columnA,columnB" option
to specify columns to skip.

So now we "un-hide" the rest of the columns from behind --debug,
and show them all, by default.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:12 -05:00
Len Brown
33148d671c tools/power turbostat: decode MSR_MISC_FEATURE_CONTROL
useful for observing if the BIOS disabled prefetch
Not architectural, but docuemented as present on NHM, SNB
and is present on others.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:11 -05:00
Len Brown
b3a34e9382 tools/power turbostat: decode CPUID(6).TURBO
show the CPUID feature for turbo to clarify the case
when it may not be shown in MISC_ENABLE

CPUID(6): APERF, TURBO, DTS, PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, EPB
cpu4: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT TURBO)

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:04 -05:00
Len Brown
0f7887c49b tools/power turbostat: dump Atom P-states correctly
Turbostat dumps MSR_TURBO_RATIO_LIMIT on Core Architecture.
But Atom Architecture uses MSR_ATOM_CORE_RATIOS and
MSR_ATOM_CORE_TURBO_RATIOS.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:04 -05:00
Len Brown
e651262477 tools/power turbostat: further decode MSR_IA32_MISC_ENABLE
Decode MISC_ENABLE.NO_TURBO,
also use the #defines in msr-index.h for decoding this register

cpu0: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT TURBO)

Although it is not architectural, decode also
MSR_IA32_MISC_ENABLE.prefetch-disable (bit-9).
documented to be present on: Core, P4, Intel-Xeon
reserved on: Atom, Silvermont, Nehalem, SNB, PHI ec.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:03 -05:00
Len Brown
710f273ba9 tools/power turbostat: add precision to --debug frequency output
Add a digit of precision to the --debug output for frequency range.
This is useful when BCLK is not an integer.

old:
6 * 83 = 500 MHz max efficiency frequency
26 * 83 = 2166 MHz base frequency

new:
6 * 83.3 = 499.8 MHz max efficiency frequency
26 * 83.3 = 2165.8 MHz base frequency

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:02 -05:00
Len Brown
0539ba118f tools/power turbostat: Baytrail c-state support
The Baytrail SOC, with its Silvermont core, has some unique properties:

1. a hardware CC1 residency counter
2. a module-c6 residency counter
3. a package-c6 counter at traditional package-c7 counter address.

The SOC does not support c3, pc3, c7 or pc7 counters.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:14:02 -05:00
Len Brown
1df2e55abc tools/power turbostat: use new name for MSR_PKG_CST_CONFIG_CONTROL
Previously called MSR_NHM_SNB_PKG_CST_CFG_CTL

Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01 00:13:17 -05:00
Len Brown
f264288847 tools/power turbostat: update MSR_PKG_CST_CONFIG_CONTROL decoding
AMT value 0 is unlimited, not PC0

Signed-off-by: Len Brown <len.brown@intel.com>
2017-02-25 16:52:32 -05:00
Len Brown
8f6196c192 tools/power turbostat: Baytrail: remove debug line in quiet mode
Without --debug, a debug line was printed on Baytrail:

SLM BCLK: 83.3 Mhz

Signed-off-by: Len Brown <len.brown@intel.com>
2017-02-25 16:52:31 -05:00
Len Brown
71616c8e93 tools/power turbostat: decode Baytrail CC6 and MC6 demotion configuration
with --debug, see:

cpu0: MSR_CC6_DEMOTION_POLICY_CONFIG: 0x00000000 (DISable-CC6-Demotion)
cpu0: MSR_MC6_DEMOTION_POLICY_CONFIG: 0x00000000 (DISable-MC6-Demotion)

Note that the hardware default is to enable demotion,
and Linux started clearing these registers in 3.17.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-02-25 16:52:30 -05:00
Len Brown
cf4cbe5314 tools/power turbostat: BYT does not have MSR_MISC_PWR_MGMT
and so --debug fails with:

turbostat: msr 1 offset 0x1aa read failed: Input/output error

It seems that baytrail, and airmont do not have this MSR.
It is included in subsequent Goldmont Atom.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-02-25 16:52:29 -05:00
Len Brown
812db3f77b tools/power turbostat: Add --show and --hide parameters
Add the "--show" and "--hide" cmdline parameters.

By default, turbostat shows all columns.

turbostat --hide counter_list
will continue showing all columns, except for those listed.

turbostat --show counter_list
will show _only_ the listed columns

These features work for built-in counters, and have no effect
on columns added with the --add parameter.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-02-25 16:52:28 -05:00
Len Brown
678a3bd1b3 tools/power turbostat: fix bugs in --add option
When --add was used more than once, overflowed buffers
caused some counters to be stored on top of others,
corrupting the results.  Simplify the code by simply
reserving space for up to 16 added counters per each
cpu, core, package.

Per-cpu added counters were being printed only per-core.

Signed-off-by: Len Brown <len.brown@intel.com>
2017-02-25 16:52:28 -05:00
Doug Smythies
48385dd740 tools/power/x86: Debug utility for intel_pstate driver
This utility can be used to debug and tune the performance of the
intel_pstate driver.

This utility can be used in two ways:

 - If there is Linux trace file with pstate_sample events enabled, then
   this utility can parse the trace file and generate performance plots.

 - If user has not specified a trace file as input via command line
   parameters, then this utility enables and collects trace data for a
   user-specified interval and generates performance plots.

Signed-off-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-16 00:54:57 +01:00
Len Brown
6886fee4d7 tools/power turbostat: remove obsolete -M, -m, -C, -c options
The new --add option has replaced the -M, -m, -C, -c options
Eg.

-M 0x10 is now --add msr0x10,raw
-m 0x10 is now --add msr0x10,raw,u32
-C 0x10 is now --add msr0x10,delta
-c 0x10 is now --add msr0x10,delta,u32

The --add option can be repeated to add any number of counters,
while the previous options were limited to adding one of each type.

In addition, the --add option can accept a column label,
and can also display a counter as a percentage of elapsed cycles.

Eg. --add msr0x3fe,core,percent,MY_CC3

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-24 15:38:09 -05:00
Len Brown
388e9c8134 tools/power turbostat: Make extensible via the --add parameter
Create the "--add" parameter.  This can be used to teach an existing
turbostat binary about any number of any type of counter.

turbostat(8) details the syntax for --add.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-24 15:16:10 -05:00
Len Brown
7268d407ad tools/power turbostat: Denverton uses a 25 MHz crystal, not 19.2 MHz
This changes only the TSC frequency decoding line seen with --debug

old: TSC: 1382 MHz (19200000 Hz * 216 / 3 / 1000000)
new: TSC: 1800 MHz (25000000 Hz * 216 / 3 / 1000000)

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 23:18:26 -05:00
Len Brown
5cc6323c79 tools/power turbostat: line up headers when -M is used
The -M option adds an 18-column item, and the header
needs to be wide enough to keep the header aligned
with the columns.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 21:14:38 -05:00
Len Brown
d8ebb44226 tools/power turbostat: fix SKX PKG_CSTATE_LIMIT decoding
SKX has fewer package C-states than previous generations,
and so the decoding of PKG_CSTATE_LIMIT has changed.

This changes the line ending with pkg-cstate-limit=XXX: pcYYY

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 20:27:46 -05:00
Len Brown
005c82d64d tools/power turbostat: Support Knights Mill (KNM)
Original-author: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:35:38 -05:00
Srinivas Pandruvada
ddadb8adea tools/power turbostat: Display HWP OOB status
Display if the HWP is enabled in OOB (Out of band) mode.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:20 -05:00
Xiaolong Wang
5bbac26eae tools/power turbostat: fix Denverton BCLK
Add Denverton to the group of SandyBridge and later processors,
to let the bclk be recognized as 100MHz rather than 133MHz,
then avoid the wrong value of the frequencies based on it,
including Bzy_MHz, max efficiency freuency, base frequency,
and turbo mode frequencies.

Signed-off-by: Xiaolong Wang <xiaolong.wang@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:19 -05:00
Len Brown
869ce69e1e tools/power turbostat: use intel-family.h model strings
All except for model 1F, a Nehalem, which is currently incorrectly
indentified as a Westmere in that new header.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:19 -05:00
Jacob Pan
0f64490978 tools/power/turbostat: Add Denverton RAPL support
The Denverton CPU RAPL supports package, core, and DRAM domains.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:18 -05:00
Jacob Pan
2c48c990ea tools/power/turbostat: Add Denverton support
Denverton is an Atom based micro server which shares the same
Goldmont architecture as Broxton. The available C-states on
Denverton is a subset of Broxton with only C1, C1e, and C6.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:18 -05:00
Jacob Pan
9148494c59 tools/power/turbostat: split core MSR support into status + limit
Some CPUs may not have PP0/Core domain power limit MSRs. We
should still allow its domain energy status to be used. This
patch splits PP0/Core RAPL into two separate flags for power
limit and energy status such that energy status can continue
to be reported without power limit.

Without this patch, turbostat will not be able to use the
remaining RAPL features if some PL MSRs are not present.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:17 -05:00
Colin Ian King
0a91e55152 tools/power turbostat: fix error case overflow read of slm_freq_table[]
When i >= SLM_BCLK_FREQS, the frequency read from the slm_freq_table
is off the end of the array because msr is set to 3 rather than the
actual array index i.  Set i to 3 rather than msr to fix this.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:17 -05:00
Mika Westerberg
01a67adfc5 tools/power turbostat: Allocate correct amount of fd and irq entries
The tool uses topo.max_cpu_num to determine number of entries needed for
fd_percpu[] and irqs_per_cpu[]. For example on a system with 4 CPUs
topo.max_cpu_num is 3 so we get too small array for holding per-CPU items.

Fix this to use right number of entries, which is topo.max_cpu_num + 1.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:16 -05:00
Len Brown
3d109de23c tools/power turbostat: switch to tab delimited output
Switch to tab-delimited output from fixed-width columns
to make it simpler to import into spreadsheets.

As the fixed width columnns were 8-spaces wide,
the output on the screen should not change.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:16 -05:00
Len Brown
ba3dec99fc tools/power turbostat: Gracefully handle ACPI S3
turbostat gives valid results across suspend to idle, aka freeze,
whether invoked in  interval mode, or in command mode.
Indeed, this can be used to measure suspend to idle:

turbostat echo freeze > /sys/power/state

But this does not work across suspend to ACPI S3, because the
processor counters, including the TSC, are reset on resume.
Further, when turbostat detects a problem, it does't forgive
the hardware, and interval mode will print *'s from there on out.

Instead, upon detecting counters going backwards, simply
reset and start over.

Interval mode across ACPI S3: (observe TSC going backwards)

root@sharkbay:/home/lenb/turbostat-src# ./turbostat -M 0x10
     CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz           MSR 0x010
       -       1    0.06     858    2294  0x0000000000000000
       0       0    0.06     847    2294  0x0000002a254b98ac
       1       1    0.06     878    2294  0x0000002a254efa3a
       2       1    0.07     843    2294  0x0000002a2551df65
       3       0    0.05     863    2294  0x0000002a2553fea2
turbostat: re-initialized with num_cpus 4
     CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz           MSR 0x010
       -       2    0.20     849    2294  0x0000000000000000
       0       2    0.26     856    2294  0x0000000449abb60d
       1       2    0.20     844    2294  0x0000000449b087ec
       2       2    0.21     850    2294  0x0000000449b35d5d
       3       1    0.12     839    2294  0x0000000449b5fd5a
^C

Command mode across ACPI S3:
root@sharkbay:/home/lenb/turbostat-src# ./turbostat -M 0x10 sleep 10
./turbostat: Counter reset detected
14.196299 sec

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:15 -05:00
Len Brown
e975db5d52 tools/power turbostat: tidy up output on Joule counter overflow
The RAPL Joules counter is limited in capacity.
Turbostat estimates how soon it can roll-over
based on the max TDP of the processor --
which tells us the maximum increment rate.

eg.
RAPL: 2759 sec. Joule Counter Range, at 95 Watts

So if a sample duration is longer than 2759 seconds on this system,
'**' replace the decimal place in the display to indicate
that the results may be suspect.

But the display had an extra ' ' in this case, throwing off the columns.

Also, the -J "Joules" option appended an extra "time" column
to the display.  While this may be useful, it printed the interval time,
which may not be the accurate time per processor.  Remove this column,
which appeared only when using '-J',
as we plan to add accurate per-cpu interval times in a future commit.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01 01:33:15 -05:00
Srinivas Pandruvada
ebf5926a00 tools/power turbostat: Replace MSR_NHM_TURBO_RATIO_LIMIT
Replace MSR_NHM_TURBO_RATIO_LIMIT with MSR_TURBO_RATIO_LIMIT.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-07 15:31:59 +02:00
Andy Shevchenko
ac485cb4b3 tools/turbostat: allow user to alter DESTDIR and PREFIX
When run
	make -C tools DESTDIR=/my/nice/dir turbostat_install
get a message
	install: cannot create regular file '/usr/bin/turbostat': Permission denied

Allow user to alter DESTDIR and PREFIX variables.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-28 00:37:04 +02:00
Rafael J. Wysocki
73659be769 Merge branches 'pm-core', 'powercap' and 'pm-tools'
* pm-core:
  PM / wakeirq: fix wakeirq setting after wakup re-configuration from sysfs
  PM / runtime: Document steps for device removal

* powercap:
  powercap: intel_rapl: Add missing Haswell model

* pm-tools:
  tools/power turbostat: work around RC6 counter wrap
  tools/power turbostat: initial KBL support
  tools/power turbostat: initial SKX support
  tools/power turbostat: decode BXT TSC frequency via CPUID
  tools/power turbostat: initial BXT support
  tools/power turbostat: print IRTL MSRs
  tools/power turbostat: SGX state should print only if --debug
2016-04-08 21:46:56 +02:00
Len Brown
9185e988e9 tools/power turbostat: work around RC6 counter wrap
Sometimes the rc6 sysfs counter spontaneously resets,
causing turbostat prints a very large number
as it tries to calcuate % = 100 * (old - new) / interval

When we see (old > new), print ***.**% instead
of a bogus huge number.

Note that this detection is not fool-proof, as the counter
could reset several times and still result in new > old.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07 22:18:40 +02:00
Len Brown
cdc57272ea tools/power turbostat: initial KBL support
KBL is similar to SKL

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07 22:18:38 +02:00
Len Brown
ec53e594c6 tools/power turbostat: initial SKX support
SKX has a lot in common with HSX

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07 22:18:36 +02:00
Len Brown
e8efbc80db tools/power turbostat: decode BXT TSC frequency via CPUID
Hard-code BXT ART to 19200MHz, so turbostat --debug
can fully enumerate TSC:

CPUID(0x15): eax_crystal: 3 ebx_tsc: 186 ecx_crystal_hz: 0
TSC: 1190 MHz (19200000 Hz * 186 / 3 / 1000000)

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07 22:18:35 +02:00
Len Brown
e4085d543e tools/power turbostat: initial BXT support
Broxton has a lot in common with SKL

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07 22:18:33 +02:00
Len Brown
5a63426e2a tools/power turbostat: print IRTL MSRs
Some processors use the Interrupt Response Time Limit (IRTL) MSR value
to describe the maximum IRQ response time latency for deep
package C-states.  (Though others have the register, but do not use it)
Lets print it out to give insight into the cases where it is used.

IRTL begain in SNB, with PC3/PC6/PC7, and HSW added PC8/PC9/PC10.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07 22:18:32 +02:00
Len Brown
8ae7225591 tools/power turbostat: SGX state should print only if --debug
The CPUID.SGX bit was printed, even if --debug was used

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07 22:18:30 +02:00
Linus Torvalds
277edbabf6 Power management and ACPI material for v4.6-rc1, part 1
- Redesign of cpufreq governors and the intel_pstate driver to
    make them use callbacks invoked by the scheduler to trigger CPU
    frequency evaluation instead of using per-CPU deferrable timers
    for that purpose (Rafael Wysocki).
 
  - Reorganization and cleanup of cpufreq governor code to make it
    more straightforward and fix some concurrency problems in it
    (Rafael Wysocki, Viresh Kumar).
 
  - Cleanup and improvements of locking in the cpufreq core (Viresh
    Kumar).
 
  - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
    Kumar, Eric Biggers).
 
  - intel_pstate driver updates including fixes, optimizations and a
    modification to make it enable enable hardware-coordinated P-state
    selection (HWP) by default if supported by the processor (Philippe
    Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
    Franciosi).
 
  - Operating Performance Points (OPP) framework updates to improve
    its handling of voltage regulators and device clocks and updates
    of the cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).
 
  - Updates of the powernv cpufreq driver to fix initialization
    and cleanup problems in it and correct its worker thread handling
    with respect to CPU offline, new powernv_throttle tracepoint
    (Shilpasri Bhat).
 
  - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).
 
  - ACPICA updates including one fix for a regression introduced
    by previos changes in the ACPICA code (Bob Moore, Lv Zheng,
    David Box, Colin Ian King).
 
  - Support for installing ACPI tables from initrd (Lv Zheng).
 
  - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
    Chaugule).
 
  - Support for _HID(ACPI0010) devices (ACPI processor containers)
    and ACPI processor driver cleanups (Sudeep Holla).
 
  - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
    Aleksey Makarov).
 
  - Modification of the ACPI PCI IRQ management code to make it treat
    255 in the Interrupt Line register as "not connected" on x86 (as
    per the specification) and avoid attempts to use that value as
    a valid interrupt vector (Chen Fan).
 
  - ACPI APEI fixes related to resource leaks (Josh Hunt).
 
  - Removal of modularity from a few ACPI drivers (BGRT, GHES,
    intel_pmic_crc) that cannot be built as modules in practice (Paul
    Gortmaker).
 
  - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
    as a valid resource type (Harb Abdulhamid).
 
  - New device ID (future AMD I2C controller) in the ACPI driver for
    AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).
 
  - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).
 
  - cpuidle menu governor optimization to avoid a square root
    computation in it (Rasmus Villemoes).
 
  - Fix for potential use-after-free in the generic device properties
    framework (Heikki Krogerus).
 
  - Updates of the generic power domains (genpd) framework including
    support for multiple power states of a domain, fixes and debugfs
    output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
    Geert Uytterhoeven).
 
  - Intel RAPL power capping driver updates to reduce IPI overhead in
    it (Jacob Pan).
 
  - System suspend/hibernation code cleanups (Eric Biggers, Saurabh
    Sengar).
 
  - Year 2038 fix for the process freezer (Abhilash Jindal).
 
  - turbostat utility updates including new features (decoding of more
    registers and CPUID fields, sub-second intervals support, GFX MHz
    and RC6 printout, --out command line option), fixes (syscall jitter
    detection and workaround, reductioin of the number of syscalls made,
    fixes related to Xeon x200 processors, compiler warning fixes) and
    cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJW50NXAAoJEILEb/54YlRxvr8QAIktC9+ft0y5AmU46hDcBWcK
 QutyWJL9X9BS6DWBJZA2qclDYFmhMfi5Fza1se0gQ9TnLB/KrBwHWLsiYoTsb1k+
 nPKf214aPk+qAhkVuyB4leNWML9Qz9n9jwku/EYxWWpgtbSRf3+0ioIKZeWWc/8V
 JvuaOu4O+g/tkmL7QTrnGWBwhIIssAAV85QPsHkx+g68MrCj4UMMzm7z9G21SPXX
 bmP8yIHsczX/XnRsY0W2NSno7Vdk6ImHpDJ26IAZg28WRNPWICHgGYHvB0TTWMvb
 tts+yqfF7/7QLRjT/M8k9CzDBDE/DnVqoZ0fNJ+aYr7hNKF32mtAN+jH9ZB9dl/P
 fEFapJkPxnWyzAoVoB9Dz0rkcZkYMlbxlLWzUGpaPq0JflUUTzLk0ApSjmMn4HRO
 UddwCDdyHTaYThp3gn6GbOb0pIP0SdOVbI1M2QV2x/4PLcT2Ft8Np1+1RFWOeinZ
 Bdl9AE890big0808mqbBzw/buETwr9FjHtCdDPXpP0vJpkBLu3nIYRNb0LCt39es
 mWMp6dFhGgvGj3D3ahTuV3GI8hdpDkh9SObexa11RCjkTKrXcwEmFxHxLeFXwKYq
 alG278bo6cSChRMziS1lis+W/3tsJRN4TXUSv1PPzJHrFgptQVFRStU9ngBKP+pN
 WB+itPc4Fw0YHOrAFsrx
 =cfty
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI updates from Rafael Wysocki:
 "This time the majority of changes go into cpufreq and they are
  significant.

  First off, the way CPU frequency updates are triggered is different
  now.  Instead of having to set up and manage a deferrable timer for
  each CPU in the system to evaluate and possibly change its frequency
  periodically, cpufreq governors set up callbacks to be invoked by the
  scheduler on a regular basis (basically on utilization updates).  The
  "old" governors, "ondemand" and "conservative", still do all of their
  work in process context (although that is triggered by the scheduler
  now), but intel_pstate does it all in the callback invoked by the
  scheduler with no need for any additional asynchronous processing.

  Of course, this eliminates the overhead related to the management of
  all those timers, but also it allows the cpufreq governor code to be
  simplified quite a bit.  On top of that, the common code and data
  structures used by the "ondemand" and "conservative" governors are
  cleaned up and made more straightforward and some long-standing and
  quite annoying problems are addressed.  In particular, the handling of
  governor sysfs attributes is modified and the related locking becomes
  more fine grained which allows some concurrency problems to be avoided
  (particularly deadlocks with the core cpufreq code).

  In principle, the new mechanism for triggering frequency updates
  allows utilization information to be passed from the scheduler to
  cpufreq.  Although the current code doesn't make use of it, in the
  works is a new cpufreq governor that will make decisions based on the
  scheduler's utilization data.  That should allow the scheduler and
  cpufreq to work more closely together in the long run.

  In addition to the core and governor changes, cpufreq drivers are
  updated too.  Fixes and optimizations go into intel_pstate, the
  cpufreq-dt driver is updated on top of some modification in the
  Operating Performance Points (OPP) framework and there are fixes and
  other updates in the powernv cpufreq driver.

  Apart from the cpufreq updates there is some new ACPICA material,
  including a fix for a problem introduced by previous ACPICA updates,
  and some less significant changes in the ACPI code, like CPPC code
  optimizations, ACPI processor driver cleanups and support for loading
  ACPI tables from initrd.

  Also updated are the generic power domains framework, the Intel RAPL
  power capping driver and the turbostat utility and we have a bunch of
  traditional assorted fixes and cleanups.

  Specifics:

   - Redesign of cpufreq governors and the intel_pstate driver to make
     them use callbacks invoked by the scheduler to trigger CPU
     frequency evaluation instead of using per-CPU deferrable timers for
     that purpose (Rafael Wysocki).

   - Reorganization and cleanup of cpufreq governor code to make it more
     straightforward and fix some concurrency problems in it (Rafael
     Wysocki, Viresh Kumar).

   - Cleanup and improvements of locking in the cpufreq core (Viresh
     Kumar).

   - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
     Kumar, Eric Biggers).

   - intel_pstate driver updates including fixes, optimizations and a
     modification to make it enable enable hardware-coordinated P-state
     selection (HWP) by default if supported by the processor (Philippe
     Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
     Franciosi).

   - Operating Performance Points (OPP) framework updates to improve its
     handling of voltage regulators and device clocks and updates of the
     cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).

   - Updates of the powernv cpufreq driver to fix initialization and
     cleanup problems in it and correct its worker thread handling with
     respect to CPU offline, new powernv_throttle tracepoint (Shilpasri
     Bhat).

   - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).

   - ACPICA updates including one fix for a regression introduced by
     previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box,
     Colin Ian King).

   - Support for installing ACPI tables from initrd (Lv Zheng).

   - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
     Chaugule).

   - Support for _HID(ACPI0010) devices (ACPI processor containers) and
     ACPI processor driver cleanups (Sudeep Holla).

   - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
     Aleksey Makarov).

   - Modification of the ACPI PCI IRQ management code to make it treat
     255 in the Interrupt Line register as "not connected" on x86 (as
     per the specification) and avoid attempts to use that value as a
     valid interrupt vector (Chen Fan).

   - ACPI APEI fixes related to resource leaks (Josh Hunt).

   - Removal of modularity from a few ACPI drivers (BGRT, GHES,
     intel_pmic_crc) that cannot be built as modules in practice (Paul
     Gortmaker).

   - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
     as a valid resource type (Harb Abdulhamid).

   - New device ID (future AMD I2C controller) in the ACPI driver for
     AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).

   - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).

   - cpuidle menu governor optimization to avoid a square root
     computation in it (Rasmus Villemoes).

   - Fix for potential use-after-free in the generic device properties
     framework (Heikki Krogerus).

   - Updates of the generic power domains (genpd) framework including
     support for multiple power states of a domain, fixes and debugfs
     output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
     Geert Uytterhoeven).

   - Intel RAPL power capping driver updates to reduce IPI overhead in
     it (Jacob Pan).

   - System suspend/hibernation code cleanups (Eric Biggers, Saurabh
     Sengar).

   - Year 2038 fix for the process freezer (Abhilash Jindal).

   - turbostat utility updates including new features (decoding of more
     registers and CPUID fields, sub-second intervals support, GFX MHz
     and RC6 printout, --out command line option), fixes (syscall jitter
     detection and workaround, reductioin of the number of syscalls
     made, fixes related to Xeon x200 processors, compiler warning
     fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)"

* tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits)
  tools/power turbostat: bugfix: TDP MSRs print bits fixing
  tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
  tools/power turbostat: call __cpuid() instead of __get_cpuid()
  tools/power turbostat: indicate SMX and SGX support
  tools/power turbostat: detect and work around syscall jitter
  tools/power turbostat: show GFX%rc6
  tools/power turbostat: show GFXMHz
  tools/power turbostat: show IRQs per CPU
  tools/power turbostat: make fewer systems calls
  tools/power turbostat: fix compiler warnings
  tools/power turbostat: add --out option for saving output in a file
  tools/power turbostat: re-name "%Busy" field to "Busy%"
  tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
  tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
  tools/power turbostat: allow sub-sec intervals
  ACPI / APEI: ERST: Fixed leaked resources in erst_init
  ACPI / APEI: Fix leaked resources
  intel_pstate: Do not skip samples partially
  intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
  intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
  ...
2016-03-16 14:10:53 -07:00
Chen Yu
685b535b2c tools/power turbostat: bugfix: TDP MSRs print bits fixing
MSR_CONFIG_TDP_NOMINAL:
should print all 8 bits of base_ratio (bit 0:7) 0xFF

MSR_CONFIG_TDP_LEVEL_1:
should print all 15 bits of PKG_MIN_PWR_LVL1 (bit 48:62) 0x7FFF
should print all 15 bits of PKG_MAX_PWR_LVL1 (bit 32:46) 0x7FFF
should print all 8 bits of LVL1_RATIO (bit 16:23) 0xFF
should print all 15 bits of PKG_TDP_LVL1 (bit 0:14) 0x7FFF

And the same modification to MSR_CONFIG_TDP_LEVEL_2.

MSR_TURBO_ACTIVATION_RATIO:
should print all 8 bits of MAX_NON_TURBO_RATIO (bit 0:7) 0xFF

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 04:22:57 -04:00
Len Brown
6c34f160df tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x1e008008 (...pkg-cstate-limit=0: unlimited)
should print as
MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x1e008008 (...pkg-cstate-limit=8: unlimited)

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 04:22:47 -04:00
Len Brown
5aea2f7f64 tools/power turbostat: call __cpuid() instead of __get_cpuid()
turbostat already checks whether calling each cpuid leavf is legal,
and it doesn't look at the function return value,
so call the simpler gcc intrinsic __cpuid() instead of __get_cpuid().

syntax only, no functional change

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:42 -04:00
Len Brown
aa8d8cc79a tools/power turbostat: indicate SMX and SGX support
SGX presence is related to a SKL power workaround,
so lets show when that is enabled.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:42 -04:00
Len Brown
0102b06747 tools/power turbostat: detect and work around syscall jitter
The accuracy of Bzy_Mhz and Busy% depend on reading
the TSC, APERF, and MPERF close together in time.

When there is a very short measurement interval,
or a large system is profoundly idle, the changes
in APERF and MPERF may be very small.
They can be small enough that an expensive interrupt
between reading APERF and MPERF can cause the APERF/MPERF
ratio to become inaccurate, resulting in invalid
calculation and display of Bzy_MHz.

A dummy APERF read of APERF makes this problem
much more rare.  Apparently this 1st systemn call
after exiting a long stretch of idle is when we
typically see expensive timer interrupts that cause
large jitter.

For the cases that dummy APERF read fails to prevent,
we compare the latency of the APERF and MPERF reads.
If they differ by more than 2x, we re-issue them.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:41 -04:00
Len Brown
fdf676e51f tools/power turbostat: show GFX%rc6
The column "GFX%c6" show the percentage of time the GPU
is in the "render C6" state, rc6.  Deep package C-states on several
systems depend on the GPU being in RC6.

This information comes from the counter
/sys/class/drm/card0/power/rc6_residency_ms,
as read before and after the measurement interval.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:41 -04:00
Len Brown
27d47356b6 tools/power turbostat: show GFXMHz
Under the column "GFXMHz", show a snapshot of this attribute:
/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz

This is an instantaneous snapshot of what sysfs presents
at the end of the measurement interval.  turbostat does
not average or otherwise perform any math on this value.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:40 -04:00
Len Brown
562a2d377b tools/power turbostat: show IRQs per CPU
The new IRQ column shows how many interrupts have occurred on each CPU
during the measurement inteval.  This information comes from
the difference between /proc/interrupts shapshots made before
and after the measurement interval.

The first row, the system summary, shows the sum of the IRQS
for all CPUs during that interval.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:40 -04:00
Len Brown
36229897ba tools/power turbostat: make fewer systems calls
skip the open(2)/close(2) on each msr read
by keeping the /dev/cpu/*/msr files open.

The remaining read(2) is generally far fewer cycles
than the removed open(2) system call.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:40 -04:00
Len Brown
58cc30a4e6 tools/power turbostat: fix compiler warnings
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:39 -04:00
Len Brown
b7d8c1483b tools/power turbostat: add --out option for saving output in a file
By default...

Turbostat --debug gconfiguration info goes to stderr.

In FORK mode, turbostat statistics go to stderr.

In PERIODIC mode, turbostat statistics go to stdout.

These defaults do not change, but an option "--out file"
will send all output above only to the specified file.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:39 -04:00
Len Brown
75d2e44e60 tools/power turbostat: re-name "%Busy" field to "Busy%"
some tools processing turbostat output
have difficulty with items that begin with %...

Reported-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:38 -04:00
Hubert Chrzaniuk
cbf97abaf3 tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
Following changes have been made:
- changed MSR_NHM_TURBO_RATIO_LIMIT to MSR_TURBO_RATIO_LIMIT in debug print
  for consistency with Developer Manual
- updated definition of bitfields in MSR_TURBO_RATIO_LIMIT and appropriate
  parsing code
- added x200 to list of architectures that do not support Nahlem compatible
  definition of MSR_TURBO_RATIO_LIMIT register (x200 has the register but
  bits definition is custom)
- fixed typo in code that parses MSR_TURBO_RATIO_LIMIT
  (logical instead of bitwise operator)
- changed MSR_TURBO_RATIO_LIMIT parsing algorithm so the print out had the
  same order as implementations for other platforms

Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:38 -04:00
Chrzaniuk, Hubert
121b48bb77 tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
x200 does not enable any way to programmatically obtain bus clock
speed. Bclk for the architecture has a fixed value of 100 MHz.
At the same time x200 cannot be included in has_snb_msrs since
it does not support C7 idle state.

prior to this patch, MHz values reported on this chip
were erroneously calculated using bclk of 133MHz,
causing MHz values to be reported 33% higher than actual.

Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:37 -04:00
Len Brown
2a0609c02e tools/power turbostat: allow sub-sec intervals
turbostat -i interval_sec

will sample and display statistics every interval_sec.
interval_sec used to be a whole number of seconds,
but now we accept a decimal, as small as 0.001 sec (1 ms).

Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-13 03:55:32 -04:00
Colin Ian King
1b69317d2d tools/power turbostat: fix various build warnings
When building with gcc 6 we're getting various build warnings that just
require some trivial function declaration and call fixes:

  turbostat.c: In function ‘dump_cstate_pstate_config_info’:
  turbostat.c:1973:1: warning: type of ‘family’ defaults to ‘int’
   dump_cstate_pstate_config_info(family, model)
  turbostat.c:1973:1: warning: type of ‘model’ defaults to ‘int’
  turbostat.c: In function ‘get_tdp’:
  turbostat.c:2145:8: warning: type of ‘model’ defaults to ‘int’
   double get_tdp(model)
  turbostat.c: In function ‘perf_limit_reasons_probe’:
  turbostat.c:2259:6: warning: type of ‘family’ defaults to ‘int’
   void perf_limit_reasons_probe(family, model)
  turbostat.c:2259:6: warning: type of ‘model’ defaults to ‘int’

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-wbicer8n0s9qe6ql8h9x478e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-03 11:10:39 -03:00
Len Brown
f0057310b4 tools/power turbostat: Decode MSR_MISC_PWR_MGMT
This MSR is helpful to show if P-state HW coordination
is enabled or disabled.

Signed-off-by: Len Brown <len.brown@intel.com>
2016-02-17 01:43:05 -05:00
Len Brown
7f5c258e1c tools/power turbostat: decode HWP registers
# turbostat --debug
...
CPUID(6): ... HWP, HWPnotify, HWPwindow, HWPepp, HWPpkg ...
...
cpu0: MSR_PM_ENABLE: 0x00000001 (HWP)
cpu0: MSR_HWP_CAPABILITIES: 0x01050916 (high 0x16 guar 0x9 eff 0x5 low 0x1)
cpu0: MSR_HWP_REQUEST: 0x80001604 (min 0x4 max 0x16 des 0x0 epp 0x80 window 0x0 pkg 0x0)
cpu0: MSR_HWP_INTERRUPT: 0x00000001 (EN_Guaranteed_Perf_Change, Dis_Excursion_Min)
cpu0: MSR_HWP_STATUS: 0x00000000 (No-Guaranteed_Perf_Change, No-Excursion_Min)

Signed-off-by: Len Brown <len.brown@intel.com>
2016-02-17 01:42:34 -05:00
Len Brown
61a87ba789 tools/power turbostat: CPUID(0x16) leaf shows base, max, and bus frequency
This CPUID leaf is available on Skylake:

CPUID(0x16): base_mhz: 1500 max_mhz: 2200 bus_mhz: 100

Signed-off-by: Len Brown <len.brown@intel.com>
2016-02-17 01:42:20 -05:00
Len Brown
69807a638f tools/power turbostat: decode more CPUID fields
for debugging, dump a few more fields:

CPUID(1): SSE3 MONITOR EIST TM2 TSC MSR ACPI-TM TM

cpu0: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MONITOR)

Signed-off-by: Len Brown <len.brown@intel.com>
2016-02-17 01:41:53 -05:00
Len Brown
ec0adc539b tools/power turbostat: use new name for MSR_PLATFORM_INFO
MSR_PLATFORM_INFO is the new name for MSR_NHM_PLATFORM_INFO

no functional change

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-13 23:27:13 +01:00
Len Brown
759d2a932b tools/power turbostat: bugfix: print MAX_NON_TURBO_RATIO
MSR_TURBO_ACTIVATION_RATIO: 0x00000016 (MAX_NON_TURBO_RATIO=6 lock=0)
should print all 7 bits of MAX_NON_TURBO_RATIO (in decimal):
MSR_TURBO_ACTIVATION_RATIO: 0x00000016 (MAX_NON_TURBO_RATIO=22 lock=0)

Signed-off-by: Len Brown <len.brown@intel.com>
2015-10-22 02:42:12 -04:00
Len Brown
21ed5574d1 tools/power turbostat: simplify Bzy_MHz calculation
Bzy_MHz = TSC_delta*tsc_tweak/APERF_delta/MPERF_delta/measurement_interval

becomes

    Bzy_MHz = base_mhz/APERF_delta/MPERF_delta

on systems which support MSR_NHM_PLATFORM_INFO.

base_mhz is calculated directly from the base_ratio
reported in MSR_NHM_PLATFORM_INFO * bclk,
and bclk is discovered via MSR or cpuid.

This reduces the dependency of Bzy_MHz calculation on the TSC.
Previously, there were 4 TSC readings required in each caculation,
the raw TSC delta combined with the measurement_interval.
This also removes the "tsc_tweak" correction factor used when
TSC runs on a different base clock from the CPU's bclk.

After this change, tsc_tweak is used only for %Busy.

The end-result should be a Bzy_MHz result slightly less prone to jitter.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-10-19 22:50:01 -04:00
Len Brown
af71b980c0 tools/power turbosat: update version number
Signed-off-by: Len Brown <len.brown@intel.com>
2015-09-26 09:49:55 -04:00
Len Brown
a2b7b74945 tools/power turbostat: SKL: Adjust for TSC difference from base frequency
On a Skylake with 1500MHz base frequency,
the TSC runs at 1512MHz.

This is because the TSC is no longer in the n*100 MHz BCLK domain,
but is now in the m*24MHz crystal clock domain. (24 MHz * 63 = 1512 MHz)

This adds error to several calculations in turbostat,
unless the TSC sample sizes are adjusted for this difference.

Note that calculations in the time domain are immune
from this issue, as the timing sub-system has already
calibrated the TSC against a known wall clock.

AVG_MHz = APERF_delta/measurement_interval

	need no adjustment.  APERF_delta is in the BCLK domain,
	and measurement_interval is in the time domain.

TSC_MHz  =  TSC_delta/measurement_interval

	needs no adjustment -- as we really do want to report
	the actual measured TSC delta here, and measurement_interval
	is in the accurate time domain.

%Busy = MPERF_delta/TSC_delta

	needs adjustment to use TSC_BCLK_DOMAIN_delta.
	TSC_BCLK_DOMAIN_delta = TSC_delta * base_hz / tsc_hz

Bzy_MHz = TSC_delta/APERF_delta/MPERF_delta/measurement_interval

	need adjustment as above.

No other metrics in turbostat need to be adjusted.

Before:

     CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
       -     550   24.84    2216    1512
       0    2191   98.73    2219    1514
       2       0    0.01    2130    1512
       1       9    0.43    2016    1512
       3       2    0.08    2016    1512

After:

     CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
       -     550   25.05    2198    1512
       0    2190   99.62    2199    1512
       2       0    0.01    2152    1512
       1       9    0.46    2000    1512
       3       2    0.10    2000    1512

Note that in this example, the "Before" Bzy_MHz
was reported as exceeding the 2200 max turbo rate.
Also, even a pinned spin loop would not be reported
as over 99% busy.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-09-26 00:50:54 -04:00
Hubert Chrzaniuk
b2b34dfe4d tools/power turbostat: KNL workaround for %Busy and Avg_MHz
KNL increments APERF and MPERF every 1024 clocks.
This is compliant with the architecture specification,
which requires that only the ratio of APERF/MPERF need be valid.

However, turbostat takes advantage of the fact that these
two MSRs increment every un-halted clock
at the actual and base frequency:

AVG_MHz = APERF_delta/measurement_interval

%Busy = MPERF_delta/TSC_delta

This quirk is needed for these calculations to also work on KNL,
which would otherwise show a value 1024x smaller than expected.

Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-09-26 00:50:54 -04:00
Len Brown
756357b8e4 tools/power turbostat: IVB Xeon: fix --debug regression
Staring in Linux-4.3-rc1,
commit 6fb3143b56 ("tools/power turbostat: dump CONFIG_TDP")
touches MSR 0x648, which is not supported on IVB-Xeon.
This results in "turbostat --debug" exiting on those systems:

turbostat: /dev/cpu/2/msr offset 0x648 read failed: Input/output error

Remove IVB-Xeon from the list of machines supporting with that MSR.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-09-26 00:50:48 -04:00
Rafael J. Wysocki
82bb70c599 Merge branch 'turbostat' of https://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux into pm-tools
Pull turbostat changes for v4.3 from Len Brown.

* 'turbostat' of https://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: fix typo on DRAM column in Joules-mode
  tools/power turbostat: fix parameter passing for forked command
  tools/power turbostat: dump CONFIG_TDP
  tools/power turbostat: cpu0 is no longer hard-coded, so  update output
  tools/power turbostat: update turbostat(8)
2015-08-24 23:10:02 +02:00
Len Brown
bd6906ed3d tools/power turbostat: fix typo on DRAM column in Joules-mode
< RAM_W
> RAM_J

Reported-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-07-24 10:35:23 -04:00
Len Brown
a01e72fbc4 tools/power turbostat: fix parameter passing for forked command
turbostat supports forked command when sampling cpu state. However,
the forked command is not allowed to be executed with options, otherwise
turbostat might regard these options as invalid turbostat options.

For example:

./turbostat stress -c 4 -t 10
./turbostat: unrecognized option '-t'

Reported-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-07-15 21:49:41 -04:00
Len Brown
6fb3143b56 tools/power turbostat: dump CONFIG_TDP
Config TDP is a feature that allows parts to be configured
for different thermal limits after they have left the factory.

This can have an effect on the operation of the part,
particularly in determiniing...

Max Non-turbo Ratio
Turbo Activation Ratio

Signed-off-by: Len Brown <len.brown@intel.com>
2015-06-17 16:23:45 -04:00
Len Brown
bfae205226 tools/power turbostat: cpu0 is no longer hard-coded, so update output
The --debug option reads a number of per-package MSRs.
Previously we explicitly read them on cpu0, but recently
turbostat changed to read them on the current "base_cpu".

Update the print-out to reflect base_cpu, rather than
the hard-coded cpu0.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-06-17 12:27:21 -04:00
Borislav Petkov
b72e7464e4 x86/uapi: Do not export <asm/msr-index.h> as part of the user API headers
This header containing all MSRs and respective bit definitions
got exported to userspace in conjunction with the big UAPI
shuffle.

But, it doesn't belong in the UAPI headers because userspace can
do its own MSR defines and exporting them from the kernel blocks
us from doing cleanups/renames in that header. Which is
ridiculous - it is not kernel's job to export such a header and
keep MSRs list and their names stable.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433436928-31903-19-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07 15:36:04 +02:00
Len Brown
75fd7ffa7f tools/power turbostat: update turbostat(8)
Remove reference to the original Nehalem Turbo white paper,
since it has moved, and these mechanisms have now long since
been documented in the Software Developer's Manual.

Reported-by: Jeremie Lagraviere <jeremie@simula.no>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-06-03 07:37:24 -04:00
Len Brown
a68c7c3ff0 tools/power turbostat: update version number to 4.7
Signed-off-by: Len Brown <len.brown@intel.com>
2015-05-27 18:04:01 -04:00
Prarit Bhargava
7ce7d5de6d tools/power turbostat: allow running without cpu0
Linux-3.7 added CONFIG_BOOTPARAM_HOTPLUG_CPU0,
allowing systems to offline cpu0.

But when cpu0 is offline, turbostat will not run:

 # turbostat ls
turbostat: no /dev/cpu/0/msr

This patch replaces the hard-coded use of cpu0 in turbostat
with the current cpu, allowing it to run without a cpu0.

Fewer cross-calls may also be needed due to use of current cpu,
though this hard-coding was used only for the --debug preamble.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-05-27 18:04:01 -04:00
Len Brown
e9be7dd628 tools/power turbostat: correctly decode of ENERGY_PERFORMANCE_BIAS
When EPB is 0xF, turbosat was incorrectly describing it as "custom"
instead of calling it "powersave":

< cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x0000000f (custom)
> cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x0000000f (powersave)

Signed-off-by: Len Brown <len.brown@intel.com>
2015-05-27 18:04:00 -04:00
Dasaratharaman Chandramouli
fb5d432722 tools/power turbostat: enable turbostat to support Knights Landing (KNL)
Changes mainly to account for minor differences in Knights Landing(KNL):
1. KNL supports C1 and C6 core states.
2. KNL supports PC2, PC3 and PC6 package states.
3. KNL has a different encoding of the TURBO_RATIO_LIMIT MSR

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-05-27 18:03:57 -04:00
Dasaratharaman Chandramouli
e275b3885d tools/power turbostat: correctly display more than 2 threads/core
Without this update, turbostat displays only 2 threads per core.
Some processors, such as Xeon Phi, have more.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-05-27 17:26:42 -04:00
Len Brown
e9257f5fa4 tools/power turbostat: correct dumped pkg-cstate-limit value
HSW expanded MSR_PKG_CST_CONFIG_CONTROL.Package-C-State-Limit,
from bits[2:0] used by previous implementations, to [3:0].
The value 1000b is unlimited, and is used by BDW and SKL too.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18 14:20:52 -04:00
Len Brown
8a5bdf41d2 tools/power turbostat: calculate TSC frequency from CPUID(0x15) on SKL
turbostat --debug
...
CPUID(0x15): eax_crystal: 2 ebx_tsc: 100 ecx_crystal_hz: 0
TSC: 1200 MHz (24000000 Hz * 100 / 2 / 1000000)

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18 14:20:52 -04:00
Andrey Semin
40ee8e3b9d tools/power turbostat: correct DRAM RAPL units on recent Xeon processors
While not yet documented in the Software Developer's Manual,
the data-sheet for modern Xeon states that DRAM RAPL ENERGY units
are fixed at 15.3 uJ, rather than being discovered via MSR.

Before this patch, DRAM energy on these products is over-stated by turbostat
because the RAPL units are 4x larger.

ref: "Xeon E5-2600 v3/E5-1600 v3 Datasheet Volume 2"
http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Signed-off-by: Andrey Semin <andrey.semin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18 14:20:52 -04:00
Len Brown
0b2bb6925e tools/power turbostat: Initial Skylake support
Skylake adds some additional residency counters.

Skylake supports a different mix of RAPL registers
from any previous product.

In most other ways, Skylake is like Broadwell.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18 14:20:51 -04:00
Thomas D
f82263c698 tools/power turbostat: Use $(CURDIR) instead of $(PWD) and add support for O= option in Makefile
Since commit ee0778a301
("tools/power: turbostat: make Makefile a bit more capable")
turbostat's Makefile is using

  [...]
  BUILD_OUTPUT    := $(PWD)
  [...]

which obviously causes trouble when building "turbostat" with

  make -C /usr/src/linux/tools/power/x86/turbostat ARCH=x86 turbostat

because GNU make does not update nor guarantee that $PWD is set.

This patch changes the Makefile to use $CURDIR instead, which GNU make
guarantees to set and update (i.e. when using "make -C ...") and also
adds support for the O= option (see "make help" in your root of your
kernel source tree for more details).

Link: https://bugs.gentoo.org/show_bug.cgi?id=533918
Fixes: ee0778a301 ("tools/power: turbostat: make Makefile a bit more capable")
Signed-off-by: Thomas D. <whissi@whissi.de>
Cc: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18 14:20:51 -04:00
Len Brown
a21d38c846 tools/power turbostat: modprobe msr, if needed
Some distros (Ubuntu) ship the msr driver as a module.
If turbosat is run as root on those systems, and discovers
that there is no /dev/cpu/cpu0/msr, it will now "modprobe msr"
for the user.

If not root, the modprobe attempt will fail, and turbostat will exit as before:

turbostat: no /dev/cpu/0/msr, Try "# modprobe msr" : No such file or directory

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18 14:20:51 -04:00
Len Brown
fcd17211bd tools/power turbostat: dump MSR_TURBO_RATIO_LIMIT2
and up to 18 cores of turbo ratio limit
when using the turbostat --debug option.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18 14:20:50 -04:00
Len Brown
12bb43c615 tools/power turbostat: use new MSR_TURBO_RATIO_LIMIT names
s/MSR_NHM_TURBO_RATIO_LIMIT/MSR_TURBO_RATIO_LIMIT/
s/MSR_IVT_TURBO_RATIO_LIMIT/MSR_TURBO_RATIO_LIMIT1/

syntax only -- use the documented strings describing these registers.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-18 14:20:50 -04:00
Len Brown
8f61f3598d tools/power turbostat: label base frequency
syntax only.

The cool kids are now using the phrase "base frequency",
where in the past we used "max non-turbo frequency" or "TSC frequency".

This distinction becomes important when a processor has a TSC
that runs at a different speed than the "base frequency".

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-13 15:52:54 -04:00
Len Brown
e33cbe852d tools/power turbostat: update PERF_LIMIT_REASONS decoding
cosmetic only.

order the decoding of MSR_PERF_LIMIT_REASONS bits
from MSB to LSB -- which you notice when more than 1 bit is set
and you are, say, comparing the output to the documentation...

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-13 15:52:54 -04:00
Len Brown
1cc21f7b6b tools/power turbostat: simplify default output
Casual turbostat users generally just want to know MHz.
So by default, just print enough information to make sense of MHz.

All the other configuration data and columns for C-states and temperature etc,
are printed with the --debug option.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-04-13 15:52:54 -04:00
Len Brown
48a0631c89 tools/power turbostat: support additional Broadwell model
Signed-off-by: Len Brown <len.brown@intel.com>
2015-02-10 15:59:53 -05:00
Len Brown
d8af6f5f0f tools/power turbostat: update parameters, documentation
Long format options added, though the short ones should still work.
eg. the new "--Counter 0x10" is the same as the old "-C 0x10"

Note this Incompatibility:
Old:
-v displayed verbose debug output

New:
-v and --version simpaly display version

Additional parameters:
-d and --debug display verbose debug output
-h and --help display a help message

Updated turbosat.8 man page accordingly.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-02-10 01:56:38 -05:00
Len Brown
ee7e38e3d8 tools/power turbostat: Skip printing disabled package C-states
Replaced previously open-coded Package C-state Limit decoding
with table-driven decoding.  In doing so, updated to match January 2015
"Intel(R) 64 and IA-23 Architectures Software Developer's Manual".

In the past, turbostat would print package C-state residency columns
for all package states supported by the model's architecture, even though
a particular SKU may not support them, or they may be disabled by the BIOS.
Now turbostat will skip printing colunns if MSRs indicate that they are not enabled.
eg. many SKUs don't support PC7, and so that column will no longer be printed.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-02-09 23:39:45 -05:00
Len Brown
a729617c58 tools/power turbostat: relax dependency on APERF_MSR
While turbostat is significantly less useful on systems
with no APERF_MSR, it seems more friendly
to run on such systems and report what we can,
rather than refusing to run.

Update man page to reflect recent changes.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-02-09 18:28:18 -05:00
Len Brown
d789944753 tools/power turbostat: relax dependency on invariant TSC
Turbostat can be useful on systems that do not support invariant TSC,
so allow it to run on those systgems.

All arithmetic in turbostat using the TSC value is per-processsor,
so it does not depend on the TSC values being in sync acrosss processors.

Turbostat uses gettimeofday() for the measurement interval
rather than using the TSC directly, so that key metric
is also immune from variable TSC.

Turbostat prints a TSC sanity check column:

TSC_MHz = TSC_delta/interval

If this column is constant and is close to the processor
base frequency, then the TSC is behaving properly.

The other key turbostat columns are calculated this way:

Avg_Mhz = APERF_delta/interval

%Busy = MPERF_delta/TSC_delta

Bzy_MHz = TSC_delta/APERF_delta/MPERF_delta/interval

Tested on Core2 and Core2-Xeon, and so this patch includes
a few other changes to remove the assumption that target
systems are Nehalem and newer.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-02-09 18:28:08 -05:00
Len Brown
3a9a941d0b tools/power turbostat: decode MSR_*_PERF_LIMIT_REASONS
The Processor generation code-named Haswell
added MSR_{CORE | GFX | RING}_PERF_LIMIT_REASONS
to explain when and how the processor limits frequency.

turbostat -v
will now decode these bits.

Each MSR has an "Active" set of bits which describe
current conditions, and a "Logged" set of bits,
which describe what has happened since last cleared.

Turbostat currently doesn't clear the log bits.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-02-09 16:44:24 -05:00
Len Brown
98481e79b6 tools/power turbostat: relax dependency on root permission
For turbostat to run as non-root, it needs to permissions:

1. read access to /dev/cpu/*/msr
	via standard user/group/world file permissions

2. CAP_SYS_RAWIO
	eg.  # setcap cap_sys_rawio=ep turbostat

Yes, running as root still works.

Signed-off-by: Len Brown <len.brown@intel.com>
2015-02-09 16:41:16 -05:00
Len Brown
e7c95ff32d tools/power turbostat: tweak whitespace in output format
turbostat -S
output was off by 1 space before this patch.

Signed-off-by: Len Brown <len.brown@intel.com>
2014-08-15 17:34:44 -04:00
Jean Delvare
3482124a6a tools / power: turbostat: Drop temperature checks
The Intel 64 and IA-32 Architectures Software Developer's Manual says
that TjMax is stored in bits 23:16 of MSR_TEMPERATURE TARGET (0x1a2).
That's 8 bits, not 7, so it must be masked with 0xFF rather than 0x7F.

The manual has no mention of which values should be considered valid,
which kind of implies that they all are. Arbitrarily discarding values
outside a specific range is wrong. The upper range check had to be
fixed recently (commit 144b44b1) and the lower range check is just as
wrong. See bug #75071:

https://bugzilla.kernel.org/show_bug.cgi?id=75071

There are many Xeon processor series with TjMax of 70, 71 or 80
degrees Celsius, way below the arbitrary 85 degrees Celsius limit.
There may be other (past or future) models with even lower limits.

So drop this arbitrary check. The only value that would be clearly
invalid is 0. Everything else should be accepted.

After these changes, turbostat is aligned with what the coretemp
driver does.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Len Brown <len.brown@intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-05-07 00:14:46 +02:00
Len Brown
4e8e863fed tools/power turbostat: Run on Broadwell
Signed-off-by: Len Brown <len.brown@intel.com>
2014-03-05 22:20:02 -05:00
Len Brown
fc04cc67ea tools/power turbostat: simplify output, add Avg_MHz
Use 8 columns for each number ouput.
We don't fit into 80 columns on most machines,
so keep the format simple.

Print frequency in MHz instead of GHz.
We've got 8 columns now, so use them to
show low frequency in a more natural unit.

Many users didn't understand what %c0 meant,
so re-name it to be %Busy.

Add Avg_MHz column, which is the frequency that many
users expect to see -- the total number of cycles executed
over the measurement interval.

People found the previous GHz to be confusing, since
it was the speed only over the non-idle interval.
That measurement has been re-named Bzy_MHz.

Suggested-by: Dirk J. Brandewie
Signed-off-by: Len Brown <len.brown@intel.com>
2014-03-05 22:19:55 -05:00
Andy Shevchenko
3b4d5c7fec tools/power turbostat: introduce -s to dump counters
The new option allows just run turbostat and get dump of counter values. It's
useful when we have something more than one program to test.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-02-01 15:24:28 -05:00
Andy Shevchenko
f591c38b91 tools/power turbostat: remove unused command line option
The -s is not used, let's remove it, and update quick help accordingly.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-02-01 15:22:31 -05:00
Dirk Brandewie
5c56be9a25 turbostat: Add option to report joules consumed per sample
Add "-J" option to report energy consumed in joules per sample.  This option
also adds the sample time to the reported values.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:32 -05:00
Len Brown
e6f9bb3cc6 turbostat: run on HSX
Haswell Xeon has slightly different RAPL support than client HSW,
which prevented the previous version of turbostat from running on HSX.

Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:22 -05:00
Josh Triplett
7ade7f48b1 turbostat: Add a .gitignore to ignore the compiled turbostat binary
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:10 -05:00
Josh Triplett
b2c95d90a7 turbostat: Clean up error handling; disambiguate error messages; use err and errx
Most of turbostat's error handling consists of printing an error (often
including an errno) and exiting.  Since perror doesn't support a format
string, those error messages are often ambiguous, such as just showing a
file path, which doesn't uniquely identify which call failed.

turbostat already uses _GNU_SOURCE, so switch to the err and errx
functions from err.h, which take a format string.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:10 -05:00
Josh Triplett
57a42a34d1 turbostat: Factor out common function to open file and exit on failure
Several different functions in turbostat contain the same pattern of
opening a file and exiting on failure.  Factor out a common fopen_or_die
function for that.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:09 -05:00
Josh Triplett
95aebc44e7 turbostat: Add a helper to parse a single int out of a file
Many different chunks of code in turbostat open a file, parse a single
int out of it, and close it.  Factor that out into a common function.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:09 -05:00
Josh Triplett
7482341976 turbostat: Check return value of fscanf
Some systems declare fscanf with the warn_unused_result attribute.  On
such systems, turbostat generates the following warnings:

turbostat.c: In function 'get_core_id':
turbostat.c:1203:8: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result [-Wunused-result]
turbostat.c: In function 'get_physical_package_id':
turbostat.c:1186:8: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result [-Wunused-result]
turbostat.c: In function 'cpu_is_first_core_in_package':
turbostat.c:1169:8: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result [-Wunused-result]
turbostat.c: In function 'cpu_is_first_sibling_in_core':
turbostat.c:1148:8: warning: ignoring return value of 'fscanf', declared with attribute warn_unused_result [-Wunused-result]

Fix these by checking the return value of those four calls to fscanf.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:09 -05:00
Josh Triplett
2b92865e64 turbostat: Use GCC's CPUID functions to support PIC
turbostat uses inline assembly to call cpuid.  On 32-bit x86, on systems
that have certain security features enabled by default that make -fPIC
the default, this causes a build error:

turbostat.c: In function ‘check_cpuid’:
turbostat.c:1906:2: error: PIC register clobbered by ‘ebx’ in ‘asm’
  asm("cpuid" : "=a" (fms), "=c" (ecx), "=d" (edx) : "a" (1) : "ebx");
  ^

GCC provides a header cpuid.h, containing a __get_cpuid function that
works with both PIC and non-PIC.  (On PIC, it saves and restores ebx
around the cpuid instruction.)  Use that instead.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: stable@vger.kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:08 -05:00
Josh Triplett
2e9c6bc7fb turbostat: Don't attempt to printf an off_t with %zx
turbostat uses the format %zx to print an off_t.  However, %zx wants a
size_t, not an off_t.  On 32-bit targets, those refer to different
types, potentially even with different sizes.  Use %llx and a cast
instead, since printf does not have a length modifier for off_t.

Without this patch, when compiling for a 32-bit target:

turbostat.c: In function 'get_msr':
turbostat.c:231:3: warning: format '%zx' expects argument of type 'size_t', but argument 4 has type 'off_t' [-Wformat]

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:08 -05:00
Josh Triplett
b731f3119d turbostat: Don't put unprocessed uapi headers in the include path
turbostat's Makefile puts arch/x86/include/uapi/ in the include path, so
that it can include <asm/msr.h> from it.  It isn't in general safe to
include even uapi headers directly from the kernel tree without
processing them through scripts/headers_install.sh, but asm/msr.h
happens to work.

However, that include path can break with some versions of system
headers, by overriding some system headers with the unprocessed versions
directly from the kernel source.  For instance:

In file included from /build/x86-generic/usr/include/bits/sigcontext.h:28:0,
                 from /build/x86-generic/usr/include/signal.h:339,
                 from /build/x86-generic/usr/include/sys/wait.h:31,
                 from turbostat.c:27:
../../../../arch/x86/include/uapi/asm/sigcontext.h:4:28: fatal error: linux/compiler.h: No such file or directory

This occurs because the system bits/sigcontext.h on that build system
includes <asm/sigcontext.h>, and asm/sigcontext.h in the kernel source
includes <linux/compiler.h>, which scripts/headers_install.sh would have
filtered out.

Since turbostat really only wants a single header, just include that one
header rather than putting an entire directory of kernel headers on the
include path.

In the process, switch from msr.h to msr-index.h, since turbostat just
wants the MSR numbers.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: stable@vger.kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2014-01-18 22:34:07 -05:00
Len Brown
144b44b135 tools / power turbostat: Support Silvermont
Support the next generation Intel Atom processor
mirco-architecture, formerly called Silvermont.

The server version, formerly called "Avoton",
is named the "Intel(R) Atom(TM) Processor C2000 Product Family".

The client version, formerly called "Bay Trail",
is named the "Intel Atom Processor Z3000 Series",
as well as various "Intel Pentium Processor"
and "Intel Celeron Processor" brands, depending
on form-factor.

Silvermont has a set of MSRs not far off from NHM,
but the RAPL register set is a sub-set of those previously supported.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-12 23:16:02 +01:00
Josh Triplett
b844db3187 turbostat: Increase output buffer size to accommodate C8-C10
On platforms with C8-C10 support, the additional C-states cause
turbostat to overrun its output buffer of 128 bytes per CPU.  Increase
this to 256 bytes per CPU.

[ As a bugfix, this should go into 3.10; however, since the C8-C10
  support didn't go in until after 3.9, this need not go into any stable
  kernel. ]

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-13 09:55:56 -07:00
Kristen Carlson Accardi
ca58710f3a tools/power turbostat: display C8, C9, C10 residency
Display residency in the new C-states, C8, C9, C10.

C8, C9, C10 are present on some:
"Fourth Generation Intel(R) Core(TM) Processors",
which are based on Intel(R) microarchitecture code name Haswell.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2013-04-17 19:23:26 -04:00
Len Brown
149c2319c6 tools/power turbostat: additional Haswell CPU-id
There is an additional HSW CPU-id, 0x46,
which has C-states exactly like CPU-id 0x45.

Signed-off-by: Len Brown <len.brown@intel.com>
2013-03-15 11:05:26 -04:00
Len Brown
1ed51011af tools/power turbostat: display SMI count by default
The SMI counter is popular -- so display it by default
rather than requiring an option.  What the heck,
we've blown the 80 column budget on many systems already...

Note that the value displayed is the delta
during the measurement interval.
The absolute value of the counter can still be seen with
the generic 32-bit MSR option, ie.  -m 0x34

Signed-off-by: Len Brown <len.brown@intel.com>
2013-02-13 18:22:12 -05:00
Len Brown
6792041834 tools/power turbostat: decode MSR_IA32_POWER_CTL
When verbose is enabled, print the C1E-Enable
bit in MSR_IA32_POWER_CTL.

also delete some redundant tests on the verbose variable.

Signed-off-by: Len Brown <len.brown@intel.com>
2013-02-08 19:26:16 -05:00
Len Brown
70b43400bc tools/power turbostat: support Haswell
This patch enables turbostat to run properly on the
next-generation Intel(R) Microarchitecture, code named "Haswell" (HSW).

HSW supports the BCLK and counters found in SNB.

Signed-off-by: Len Brown <len.brown@intel.com>
2013-02-08 19:25:57 -05:00
Linus Torvalds
6842d98de7 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull powertool update from Len Brown:
 "This updates the tree w/ the latest version of turbostat, which
  reports temperature and - on SNB and later - Watts."

Fix up semantic merge conflict as per Len.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools: Allow tools to be installed in a user specified location
  tools/power: turbostat: make Makefile a bit more capable
  tools/power x86_energy_perf_policy: close /proc/stat in for_every_cpu()
  tools/power turbostat: v3.0: monitor Watts and Temperature
  tools/power turbostat: fix output buffering issue
  tools/power turbostat: prevent infinite loop on migration error path
  x86 power: define RAPL MSRs
  tools/power/x86/turbostat: share kernel MSR #defines
2012-12-18 12:34:29 -08:00
Josh Boyer
55f1f545f7 tools: Allow tools to be installed in a user specified location
When building x86_energy_perf_policy or turbostat within the confines of
a packaging system such as RPM, we need to be able to have it install to
the buildroot and not the root filesystem of the build machine.  This
adds a DESTDIR variable that when set will act as a prefix for the
install location of these tools.

Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-11-30 01:09:45 -05:00
Mark Asselstine
ee0778a301 tools/power: turbostat: make Makefile a bit more capable
The turbostat Makefile is pretty simple, its output is placed in the
same directory as the source, the install rule has no concept of a
prefix or sysroot, and you can set CC to use a specific compiler but
not use the more familiar CROSS_COMPILE. By making a few minor changes
these limitations are removed while leaving the default behavior
matching what it used to be.

Example build with these changes:
make CROSS_COMPILE=i686-wrs-linux-gnu- DESTDIR=/tmp install

or from the tools directory
make CROSS_COMPILE=i686-wrs-linux-gnu- DESTDIR=/tmp turbostat_install

Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-11-30 01:09:45 -05:00
Colin Ian King
84764a415c tools/power x86_energy_perf_policy: close /proc/stat in for_every_cpu()
Instead of returning out of for_every_cpu() we should break out of the loop=
 which will then tidy up correctly by closing the file /proc/stat.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-11-30 01:09:44 -05:00
Len Brown
889facbee3 tools/power turbostat: v3.0: monitor Watts and Temperature
Show power in Watts and temperature in Celsius
when hardware support is present.

Intel's Sandy Bridge and Ivy Bridge processor generations support RAPL
(Run-Time-Average-Power-Limiting).  Per the Intel SDM
(Intel® 64 and IA-32 Architectures Software Developer Manual)
RAPL provides hardware energy counters and power control MSRs
(Model Specific Registers).  RAPL MSRs are designed primarily
as a method to implement power capping.  However, they are useful
for monitoring system power whether or not power capping is used.

In addition, Turbostat now shows temperature from DTS
(Digital Thermal Sensor) and PTM (Package Thermal Monitor) hardware,
if present.

As before, turbostat reads MSRs, and never writes MSRs.

New columns are present in turbostat output:

The Pkg_W column shows Watts for each package (socket) in the system.
On multi-socket systems, the system summary on the 1st row shows the sum
for all sockets together.

The Cor_W column shows Watts due to processors cores.
Note that Core_W is included in Pkg_W.

The optional GFX_W column shows Watts due to the graphics "un-core".
Note that GFX_W is included in Pkg_W.

The optional RAM_W column on server processors shows Watts due to DRAM DIMMS.
As DRAM DIMMs are outside the processor package, RAM_W is not included in Pkg_W.

The optional PKG_% and RAM_% columns on server processors shows the % of time
in the measurement interval that RAPL power limiting is in effect on the
package and on DRAM.

Note that the RAPL energy counters have some limitations.

First, hardware updates the counters about once every milli-second.
This is fine for typical turbostat measurement intervals > 1 sec.
However, when turbostat is used to measure events that approach
1ms, the counters are less useful.

Second, the 32-bit energy counters are subject to wrapping.
For example, a counter incrementing 15 micro-Joule units
on a 130 Watt TDP server processor could (in theory)
roll over in about 9 minutes.  Turbostat detects and handles
up to 1 counter overflow per measurement interval.
But when the measurement interval exceeds the guaranteed
counter range, we can't detect if more than 1 overflow occured.
So in this case turbostat indicates that the results are
in question by replacing the fractional part of the Watts
in the output with "**":

Pkg_W  Cor_W GFX_W
  3**    0**   0**

Third, the RAPL counters are energy (Joule) counters -- they sum up
weighted events in the package to estimate energy consumed.  They are
not analong power (Watt) meters.  In practice, they tend to under-count
because they don't cover every possible use of energy in the package.
The accuracy of the RAPL counters will vary between product generations,
and between SKU's in the same product generation, and with temperature.

turbostat's -v (verbose) option now displays more power and thermal configuration
information -- as shown on the turbostat.8 manual page.
For example, it now displays the Package and DRAM Thermal Design Power (TDP):

cpu0: MSR_PKG_POWER_INFO: 0x2f064001980410 (130 W TDP, RAPL 51 - 200 W, 0.045898 sec.)
cpu0: MSR_DRAM_POWER_INFO,: 0x28025800780118 (35 W TDP, RAPL 15 - 75 W, 0.039062 sec.)
cpu8: MSR_PKG_POWER_INFO: 0x2f064001980410 (130 W TDP, RAPL 51 - 200 W, 0.045898 sec.)
cpu8: MSR_DRAM_POWER_INFO,: 0x28025800780118 (35 W TDP, RAPL 15 - 75 W, 0.039062 sec.)

Signed-off-by: Len Brown <len.brown@intel.com>
2012-11-30 01:09:44 -05:00
Len Brown
ddac0d6872 tools/power turbostat: fix output buffering issue
In periodic mode, turbostat writes to stdout,
but users were un-able to re-direct stdout, eg.

turbostat > outputfile

would result in an empty outputfile.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-11-30 01:09:43 -05:00
Len Brown
e52966c084 tools/power turbostat: prevent infinite loop on migration error path
Turbostat assumed if it can't migrate to a CPU, then the CPU
must have gone off-line and turbostat should re-initialize
with the new topology.

But if turbostat can not migrate because it is restricted by
a cpuset, then it will fail to migrate even after re-initialization,
resulting in an infinite loop.

Spit out a warning when we can't migrate
and endure only 2 re-initialize cycles in a row
before giving up and exiting.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-11-27 00:03:06 -05:00
Len Brown
9c63a650bb tools/power/x86/turbostat: share kernel MSR #defines
Now that turbostat is built in the kernel tree,
it can share MSR #defines with the kernel.

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
2012-11-23 21:40:04 -05:00
Len Brown
d91bb17c2a tools/power turbostat: graceful fail on garbage input
When invald MSR's are specified on the command line,
turbostat should simply print an error and exit.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-11-01 00:22:00 -04:00
Len Brown
39300ffb9b tools/power turbostat: Repair Segmentation fault when using -i option
Fix regression caused by commit 8e180f3cb6
(tools/power turbostat: add [-d MSR#][-D MSR#] options to print counter
deltas)

Signed-off-by: Len Brown <len.brown@intel.com>
2012-11-01 00:21:43 -04:00
Len Brown
f9240813e6 tools/power/turbostat: add option to count SMIs, re-name some options
Counting SMIs is popular, so add a dedicated "-s" option to do it,
and juggle some of the other option letters.

-S is now system summary (was -s)
-c is 32 bit counter (was -d)
-C is 64-bit counter (was -D)
-p is 1st thread in core (was -c)
-P is 1st thread in package (was -p)

bump the minor version number

Signed-off-by: Len Brown <len.brown@intel.com>
2012-10-06 15:26:31 -04:00
Len Brown
8e180f3cb6 tools/power turbostat: add [-d MSR#][-D MSR#] options to print counter deltas
# turbostat -d 0x34
is useful for printing the number of SMI's within an interval
on Nehalem and newer processors.

where
 # turbostat -m 0x34
will simply print out the total SMI count since reset.

Suggested-by: Andi Kleen
Signed-off-by: Len Brown <len.brown@intel.com>
2012-09-27 22:04:56 -04:00
Len Brown
2f32edf12c tools/power turbostat: add [-m MSR#] option
-m MSR# prints the specified MSR in 32-bit format
-M MSR# prints the specified MSR in 64-bit format

Signed-off-by: Len Brown <len.brown@intel.com>
2012-09-26 18:17:21 -04:00
Len Brown
130ff304f6 tools/power turbostat: make -M output pretty
The -M option dumps the specified 64-bit MSR with every sample.

Previously it was output at the end of each line.
However, with the v2 style of printing, the lines are now staggered,
making MSR output hard to read.

So move the MSR output column to the left where things are aligned.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-09-26 18:17:21 -04:00
Len Brown
6574a5d505 tools/power turbostat: print more turbo-limit information
The "turbo-limit" is the maximum opportunistic processor
speed, assuming no electrical or thermal constraints.
For a given processor, the turbo-limit varies, depending
on the number of active cores.  Generally, there is more
opportunity when fewer cores are active.

Under the "-v" verbose option, turbostat would
print the turbo-limits for the four cases
of 1 to 4 cores active.

Expand that capability to cover the cases of turbo
opportunities with up to 16 cores active.

Note that not all hardware platforms supply this information,
and that sometimes a valid limit may be specified for
a core which is not actually present.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-09-26 18:15:48 -04:00
Len Brown
d7db690165 tools/power turbostat: delete unused line
MSR_TSC is no longer needed because
we now use RDTSC directly.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-09-26 18:11:48 -04:00
Len Brown
1300651b40 tools/power turbostat: run on IVB Xeon
This fix is required to run on IVB Xeon,
which previously had an incorrect cpuid model number listed.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-09-26 18:11:31 -04:00
Len Brown
c3ae331d1c tools/power: turbostat: fix large c1% issue
Under some conditions, c1% was displayed as very large number,
much higher than 100%.

c1% is not measured, it is derived as "that, which is left over"
from other counters.  However, the other counters are not collected
atomically, and so it is possible for c1% to be calaculagted as
a small negative number -- displayed as very large positive.

There was a check for mperf vs tsc for this already,
but it needed to also include the other counters
that are used to calculate c1.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-07-19 22:26:33 -04:00
Len Brown
c98d5d9444 tools/power: turbostat v2 - re-write for efficiency
Measuring large profoundly-idle configurations
requires turbostat to be more lightweight.
Otherwise, the operation of turbostat itself
can interfere with the measurements.

This re-write makes turbostat topology aware.
Hardware is accessed in "topology order".
Redundant hardware accesses are deleted.
Redundant output is deleted.
Also, output is buffered and
local RDTSC use replaces remote MSR access for TSC.

From a feature point of view, the output
looks different since redundant figures are absent.
Also, there are now -c and -p options -- to restrict
output to the 1st thread in each core, and the 1st
thread in each package, respectively.  This is helpful
to reduce output on big systems, where more detail
than the "-s" system summary is desired.
Finally, periodic mode output is now on stdout, not stderr.

Turbostat v2 is also slightly more robust in
handling run-time CPU online/offline events,
as it now checks the actual map of on-line cpus rather
than just the total number of on-line cpus.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-07-19 22:26:14 -04:00
Len Brown
650a37f32d tools/power turbostat: fix IVB support
Initial IVB support went into turbostat in Linux-3.1:
553575f1ae
(tools turbostat: recognize and run properly on IVB)

However, when running on IVB, turbostat would fail
to report the new couters added with SNB, c7, pc2 and pc7.
So in scenarios where these counters are non-zero on IVB,
turbostat would report erroneous residencey results.

In particular c7 time would be added to c1 time,
since c1 time is calculated as "that which is left over".

Also, turbostat reports MHz capabilities when passed
the "-v" option, and it would incorrectly report 133MHz
bclk instead of 100MHz bclk for IVB, which would inflate
GHz reported with that option.

This patch is a backport of a fix already included in turbostat v2.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-03 23:47:49 -04:00
Len Brown
d15cf7c129 tools/power turbostat: fix un-intended affinity of forked program
Linux 3.4 included a modification to turbostat to
lower cross-call overhead by using scheduler affinity:

15aaa34654
(tools turbostat: reduce measurement overhead due to IPIs)

In the use-case where turbostat forks a child program,
that change had the un-intended side-effect of binding
the child to the last cpu in the system.

This change removed the binding before forking the child.

This is a back-port of a fix already included in turbostat v2.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-03 23:24:00 -04:00
Len Brown
15aaa34654 tools turbostat: harden against cpu online/offline
Sometimes users have turbostat running in interval mode
when they take processors offline/online.

Previously, turbostat would survive, but not gracefully.

Tighten up the error checking so turbostat notices
changesn sooner, and print just 1 line on change:

turbostat: re-initialized with num_cpus %d

Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-29 22:27:19 -04:00
Len Brown
88c3281f7b tools turbostat: reduce measurement overhead due to IPIs
turbostat uses /dev/cpu/*/msr interface to read MSRs.
For modern systems, it reads 10 MSR/CPU.  This can
be observed as 10 "Function Call Interrupts"
per CPU per sample added to /proc/interrupts.

This overhead is measurable on large idle systems,
and as Yoquan Song pointed out, it can even trick
cpuidle into thinking the system is busy.

Here turbostat re-schedules itself in-turn to each
CPU so that its MSR reads will always be local.
This replaces the 10 "Function Call Interrupts"
with a single "Rescheduling interrupt" per sample
per CPU.

On an idle 32-CPU system, this shifts some residency from
the shallow c1 state to the deeper c7 state:

 # ./turbostat.old -s
   %c0  GHz  TSC    %c1    %c3    %c6    %c7   %pc2   %pc3   %pc6   %pc7
  0.27 1.29 2.29   0.95   0.02   0.00  98.77  20.23   0.00  77.41   0.00
  0.25 1.24 2.29   0.98   0.02   0.00  98.75  20.34   0.03  77.74   0.00
  0.27 1.22 2.29   0.54   0.00   0.00  99.18  20.64   0.00  77.70   0.00
  0.26 1.22 2.29   1.22   0.00   0.00  98.52  20.22   0.00  77.74   0.00
  0.26 1.38 2.29   0.78   0.02   0.00  98.95  20.51   0.05  77.56   0.00
^C
 i# ./turbostat.new -s
   %c0  GHz  TSC    %c1    %c3    %c6    %c7   %pc2   %pc3   %pc6   %pc7
  0.27 1.20 2.29   0.24   0.01   0.00  99.49  20.58   0.00  78.20   0.00
  0.27 1.22 2.29   0.25   0.00   0.00  99.48  20.79   0.00  77.85   0.00
  0.27 1.20 2.29   0.25   0.02   0.00  99.46  20.71   0.03  77.89   0.00
  0.28 1.26 2.29   0.25   0.01   0.00  99.46  20.89   0.02  77.67   0.00
  0.27 1.20 2.29   0.24   0.01   0.00  99.48  20.65   0.00  78.04   0.00

cc: Youquan Song <youquan.song@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-29 22:04:58 -04:00
Len Brown
e23da0370f tools turbostat: add summary option
turbostat -s
cuts down on the amount of output, per user request.

also treak some output whitespace and the man page.

Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-29 13:22:06 -04:00
Linus Torvalds
507a03c1cb Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
This includes initial support for the recently published ACPI 5.0 spec.
In particular, support for the "hardware-reduced" bit that eliminates
the dependency on legacy hardware.

APEI has patches resulting from testing on real hardware.

Plus other random fixes.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits)
  acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
  intel_idle: Split up and provide per CPU initialization func
  ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
  ACPI processor: Remove unneeded cpuidle_unregister_driver call
  intel idle: Make idle driver more robust
  intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle
  ACPI: kernel-parameters.txt : Add intel_idle.max_cstate
  intel_idle: remove redundant local_irq_disable() call
  ACPI processor: Fix error path, also remove sysdev link
  ACPI: processor: fix acpi_get_cpuid for UP processor
  intel_idle: fix API misuse
  ACPI APEI: Convert atomicio routines
  ACPI: Export interfaces for ioremapping/iounmapping ACPI registers
  ACPI: Fix possible alignment issues with GAS 'address' references
  ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64)
  ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64)
  ACPI: Store SRAT table revision
  ACPI, APEI, Resolve false conflict between ACPI NVS and APEI
  ACPI, Record ACPI NVS regions
  ACPI, APEI, EINJ, Refine the fix of resource conflict
  ...
2012-01-18 15:51:48 -08:00
Len Brown
79ba0db69c Merge branches 'einj', 'intel_idle', 'misc', 'srat' and 'turbostat-ivb' into release 2012-01-18 01:15:54 -05:00
Arun Thomas
9b6cf1a012 tools/power turbostat: update fields in manpage
Field names were shortened: "pkg" is now "pk", "core" is now "cr"

Signed-off-by: Arun Thomas <arun.thomas@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-12-15 16:34:20 +01:00
Len Brown
553575f1ae tools turbostat: recognize and run properly on IVB
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-18 03:32:01 -05:00
Len Brown
efb90582c5 Merge branches 'acpi', 'idle', 'mrst-pmu' and 'pm-tools' into next 2011-11-06 22:14:50 -05:00
Len Brown
d30c4b7a87 tools/power turbostat: fit output into 80 columns on snb-ep
Reduce columns for package number to 1.
If you can afford more than 9 packages,
you can also afford a terminal with more than 80 columns:-)

Also shave a column also off the package C-states

Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-02 18:33:31 -04:00
Len Brown
e4c0d0e22c tools/power x86_energy_perf_policy: fix print of uninitialized string
Looks like I was going to stick the brand string
in the verbose ouput, but didn't get around to it.

Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-15 23:39:00 -04:00
Len Brown
aeae1e92da tools/power turbostat: less verbose debugging
dump only the counters which are active

Signed-off-by: Len Brown <len.brown@intel.com>
2011-07-03 21:41:33 -04:00
Jiri Kosina
07f9479a40 Merge branch 'master' into for-next
Fast-forwarded to current state of Linus' tree as there are patches to be
applied for files that didn't exist on the old branch.
2011-04-26 10:22:59 +02:00
Justin P. Mattock
6eab04a876 treewide: remove extra semicolons
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-04-10 17:01:05 +02:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Len Brown
a829eb4d7e tools: turbostat: style updates
Follow kernel coding style traditions more closely.
Delete typedef, re-name "per cpu counters" to
simply be counters etc.

This patch changes no functionality.

Suggested-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-02-10 23:58:13 -05:00
Thomas Renninger
8209e054b6 tools: turbostat: fix bitwise and operand
bug could cause false positive on indicating
presence of invarient TSC or APERF support.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-02-10 23:58:11 -05:00
Len Brown
eca0bdd326 Merge branches 'turbostat' and 'x86_energy_perf_policy' into tools 2011-01-11 23:06:28 -05:00
Len Brown
d5532ee7b4 tools: create power/x86/x86_energy_perf_policy
MSR_IA32_ENERGY_PERF_BIAS first became available on Westmere Xeon.
It is implemented in all Sandy Bridge processors -- mobile, desktop and server.
It is expected to become increasingly important in subsequent generations.

x86_energy_perf_policy is a user-space utility to set the
hardware energy vs performance policy hint in the processor.
Most systems would benefit from "x86_energy_perf_policy normal"
at system startup, as the hardware default is maximum performance
at the expense of energy efficiency.

See x86_energy_perf_policy.8 man page for more information.

Background:

Linux-2.6.36 added "epb" to /proc/cpuinfo to indicate
if an x86 processor supports MSR_IA32_ENERGY_PERF_BIAS,
without actually modifying the MSR.

In March, 2010, Venkatesh Pallipadi proposed a small driver
that programmed MSR_IA32_ENERGY_PERF_BIAS, based on
the cpufreq governor in use.  It also offered
a boot-time cmdline option to override.
http://lkml.org/lkml/2010/3/4/457
But hiding the hardware policy behind the
governor choice was deemed "kinda icky".

In June, 2010, I proposed a generic user/kernel API to
generalize the power/performance policy trade-off.
"RFC: /sys/power/policy_preference"
http://lkml.org/lkml/2010/6/16/399
That is my preference for implementing this capability,
but I received no support on the list.

So in September, 2010, I sent x86_energy_perf_policy.c to LKML,
a user-space utility that scribbles directly to the MSR.
http://lkml.org/lkml/2010/9/28/246

Here is that same utility, after responding to some review feedback,
to live in tools/power/, where it is easily found.

Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-11 23:02:21 -05:00
Len Brown
103a8fea9b tools: create power/x86/turbostat
turbostat is a Linux tool to observe proper operation
of Intel(R) Turbo Boost Technology.

turbostat displays the actual processor frequency
on x86 processors that include APERF and MPERF MSRs.

Note that turbostat is of limited utility on Linux
kernels 2.6.29 and older, as acpi_cpufreq cleared
APERF/MPERF up through that release.

On Intel Core i3/i5/i7 (Nehalem) and newer processors,
turbostat also displays residency in idle power saving states,
which are necessary for diagnosing any cpuidle issues
that may have an effect on turbo-mode.

See the turbostat.8 man page for example usage.

Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-11 22:46:02 -05:00