mirror of
https://github.com/AuxXxilium/linux_dsm_epyc7002.git
synced 2025-01-27 18:32:48 +07:00
b95fcd2c1c
Add NO_NMI_WATCHDOG metric constraint to Page_Walks_Utilization for Sky Lake and Cascade Lake. Committer testing: On a Lenovo T480S, Intel(R) Core(TM) i7-8650U Kaby Lake, that looking at x86's mapfile.csv file is a: $ grep -w skylake tools/perf/pmu-events/arch/x86/mapfile.csv GenuineIntel-6-[4589]E,v24,skylake,core $ So uses the constraint added in this patch in this file: tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json Before: # perf stat -a -M Page_Walks_Utilization sleep 2 Performance counter stats for 'system wide': <not counted> itlb_misses.walk_pending (0.00%) <not counted> dtlb_load_misses.walk_pending (0.00%) <not counted> dtlb_store_misses.walk_pending (0.00%) <not counted> ept.walk_pending (0.00%) <not counted> cycles (0.00%) 2.001750514 seconds time elapsed Some events weren't counted. Try disabling the NMI watchdog: echo 0 > /proc/sys/kernel/nmi_watchdog perf stat ... echo 1 > /proc/sys/kernel/nmi_watchdog The events in group usually have to be from the same PMU. Try reorganizing the group. # After: # perf stat -a -M Page_Walks_Utilization sleep 2 Splitting metric group Page_Walks_Utilization into standalone metrics. Try disabling the NMI watchdog to comply NO_NMI_WATCHDOG metric constraint: echo 0 > /proc/sys/kernel/nmi_watchdog perf stat ... echo 1 > /proc/sys/kernel/nmi_watchdog , Performance counter stats for 'system wide': 36,883,102 itlb_misses.walk_pending # 0.1 Page_Walks_Utilization (79.99%) 123,104,146 dtlb_load_misses.walk_pending (80.02%) 13,720,795 dtlb_store_misses.walk_pending (79.99%) 0 ept.walk_pending (79.99%) 1,519,948,400 cycles (80.01%) 2.002170780 seconds time elapsed # Before and after, if we disable the nmi_watchdog we get: # echo 0 > /proc/sys/kernel/nmi_watchdog # perf stat -a -M Page_Walks_Utilization sleep 2 Performance counter stats for 'system wide': 33,721,658 itlb_misses.walk_pending # 0.1 Page_Walks_Utilization 84,070,996 dtlb_load_misses.walk_pending 9,816,071 dtlb_store_misses.walk_pending 0 ept.walk_pending 704,920,899 cycles 2.002331670 seconds time elapsed # More information about the metric expressions: # perf stat -v -a -M Page_Walks_Utilization sleep 2 Using CPUID GenuineIntel-6-8E-A metric expr ( itlb_misses.walk_pending + dtlb_load_misses.walk_pending + dtlb_store_misses.walk_pending + ept.walk_pending ) / ( 2 * cycles ) for Page_Walks_Utilization found event itlb_misses.walk_pending found event dtlb_load_misses.walk_pending found event dtlb_store_misses.walk_pending found event ept.walk_pending found event cycles adding {itlb_misses.walk_pending,dtlb_load_misses.walk_pending,dtlb_store_misses.walk_pending,ept.walk_pending,cycles}:W -> cpu/umask=0x10,(null)=0x186a3,event=0x85/ -> cpu/umask=0x10,(null)=0x1e8483,event=0x8/ -> cpu/umask=0x10,(null)=0x1e8483,event=0x49/ -> cpu/umask=0x10,(null)=0x1e8483,event=0x4f/ itlb_misses.walk_pending: 8085772 16010162799 16010162799 dtlb_load_misses.walk_pending: 28134579 16010162799 16010162799 dtlb_store_misses.walk_pending: 7276535 16010162799 16010162799 ept.walk_pending: 2 16010162799 16010162799 cycles: 315140605 16010162799 16010162799 Performance counter stats for 'system wide': 8,085,772 itlb_misses.walk_pending # 0.1 Page_Walks_Utilization 28,134,579 dtlb_load_misses.walk_pending 7,276,535 dtlb_store_misses.walk_pending 2 ept.walk_pending 315,140,605 cycles 2.002333181 seconds time elapsed # Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lore.kernel.org/lkml/1582581564-184429-6-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
---|---|---|
.. | ||
arch | ||
Build | ||
jevents.c | ||
jevents.h | ||
jsmn.c | ||
jsmn.h | ||
json.c | ||
json.h | ||
pmu-events.h | ||
README |
The contents of this directory allow users to specify PMU events in their CPUs by their symbolic names rather than raw event codes (see example below). The main program in this directory, is the 'jevents', which is built and executed _BEFORE_ the perf binary itself is built. The 'jevents' program tries to locate and process JSON files in the directory tree tools/perf/pmu-events/arch/foo. - Regular files with '.json' extension in the name are assumed to be JSON files, each of which describes a set of PMU events. - The CSV file that maps a specific CPU to its set of PMU events is to be named 'mapfile.csv' (see below for mapfile format). - Directories are traversed, but all other files are ignored. - To reduce JSON event duplication per architecture, platform JSONs may use "ArchStdEvent" keyword to dereference an "Architecture standard events", defined in architecture standard JSONs. Architecture standard JSONs must be located in the architecture root folder. Matching is based on the "EventName" field. The PMU events supported by a CPU model are expected to grouped into topics such as Pipelining, Cache, Memory, Floating-point etc. All events for a topic should be placed in a separate JSON file - where the file name identifies the topic. Eg: "Floating-point.json". All the topic JSON files for a CPU model/family should be in a separate sub directory. Thus for the Silvermont X86 CPU: $ ls tools/perf/pmu-events/arch/x86/silvermont cache.json memory.json virtual-memory.json frontend.json pipeline.json The JSONs folder for a CPU model/family may be placed in the root arch folder, or may be placed in a vendor sub-folder under the arch folder for instances where the arch and vendor are not the same. Using the JSON files and the mapfile, 'jevents' generates the C source file, 'pmu-events.c', which encodes the two sets of tables: - Set of 'PMU events tables' for all known CPUs in the architecture, (one table like the following, per JSON file; table name 'pme_power8' is derived from JSON file name, 'power8.json'). struct pmu_event pme_power8[] = { ... { .name = "pm_1plus_ppc_cmpl", .event = "event=0x100f2", .desc = "1 or more ppc insts finished,", }, ... } - A 'mapping table' that maps each CPU of the architecture, to its 'PMU events table' struct pmu_events_map pmu_events_map[] = { { .cpuid = "004b0000", .version = "1", .type = "core", .table = pme_power8 }, ... }; After the 'pmu-events.c' is generated, it is compiled and the resulting 'pmu-events.o' is added to 'libperf.a' which is then used to build perf. NOTES: 1. Several CPUs can support same set of events and hence use a common JSON file. Hence several entries in the pmu_events_map[] could map to a single 'PMU events table'. 2. The 'pmu-events.h' has an extern declaration for the mapping table and the generated 'pmu-events.c' defines this table. 3. _All_ known CPU tables for architecture are included in the perf binary. At run time, perf determines the actual CPU it is running on, finds the matching events table and builds aliases for those events. This allows users to specify events by their name: $ perf stat -e pm_1plus_ppc_cmpl sleep 1 where 'pm_1plus_ppc_cmpl' is a Power8 PMU event. However some errors in processing may cause the alias build to fail. Mapfile format =============== The mapfile enables multiple CPU models to share a single set of PMU events. It is required even if such mapping is 1:1. The mapfile.csv format is expected to be: Header line CPUID,Version,Dir/path/name,Type where: Comma: is the required field delimiter (i.e other fields cannot have commas within them). Comments: Lines in which the first character is either '\n' or '#' are ignored. Header line The header line is the first line in the file, which is always _IGNORED_. It can be empty. CPUID: CPUID is an arch-specific char string, that can be used to identify CPU (and associate it with a set of PMU events it supports). Multiple CPUIDS can point to the same File/path/name.json. Example: CPUID == 'GenuineIntel-6-2E' (on x86). CPUID == '004b0100' (PVR value in Powerpc) Version: is the Version of the mapfile. Dir/path/name: is the pathname to the directory containing the CPU's JSON files, relative to the directory containing the mapfile.csv Type: indicates whether the events are "core" or "uncore" events. Eg: $ grep silvermont tools/perf/pmu-events/arch/x86/mapfile.csv GenuineIntel-6-37,v13,silvermont,core GenuineIntel-6-4D,v13,silvermont,core GenuineIntel-6-4C,v13,silvermont,core i.e the three CPU models use the JSON files (i.e PMU events) listed in the directory 'tools/perf/pmu-events/arch/x86/silvermont'.