linux_dsm_epyc7002/arch/x86/kernel/cpu
Matt Fleming 2c534c0da0 perf/x86/intel/cqm: Return cached counter value from IRQ context
Peter reported the following potential crash which I was able to
reproduce with his test program,

[  148.765788] ------------[ cut here ]------------
[  148.765796] WARNING: CPU: 34 PID: 2840 at kernel/smp.c:417 smp_call_function_many+0xb6/0x260()
[  148.765797] Modules linked in:
[  148.765800] CPU: 34 PID: 2840 Comm: perf Not tainted 4.2.0-rc1+ #4
[  148.765803]  ffffffff81cdc398 ffff88085f105950 ffffffff818bdfd5 0000000000000007
[  148.765805]  0000000000000000 ffff88085f105990 ffffffff810e413a 0000000000000000
[  148.765807]  ffffffff82301080 0000000000000022 ffffffff8107f640 ffffffff8107f640
[  148.765809] Call Trace:
[  148.765810]  <NMI>  [<ffffffff818bdfd5>] dump_stack+0x45/0x57
[  148.765818]  [<ffffffff810e413a>] warn_slowpath_common+0x8a/0xc0
[  148.765822]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
[  148.765824]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
[  148.765825]  [<ffffffff810e422a>] warn_slowpath_null+0x1a/0x20
[  148.765827]  [<ffffffff811613f6>] smp_call_function_many+0xb6/0x260
[  148.765829]  [<ffffffff8107f640>] ? intel_cqm_stable+0x60/0x60
[  148.765831]  [<ffffffff81161748>] on_each_cpu_mask+0x28/0x60
[  148.765832]  [<ffffffff8107f6ef>] intel_cqm_event_count+0x7f/0xe0
[  148.765836]  [<ffffffff811cdd35>] perf_output_read+0x2a5/0x400
[  148.765839]  [<ffffffff811d2e5a>] perf_output_sample+0x31a/0x590
[  148.765840]  [<ffffffff811d333d>] ? perf_prepare_sample+0x26d/0x380
[  148.765841]  [<ffffffff811d3497>] perf_event_output+0x47/0x60
[  148.765843]  [<ffffffff811d36c5>] __perf_event_overflow+0x215/0x240
[  148.765844]  [<ffffffff811d4124>] perf_event_overflow+0x14/0x20
[  148.765847]  [<ffffffff8107e7f4>] intel_pmu_handle_irq+0x1d4/0x440
[  148.765849]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
[  148.765853]  [<ffffffff81219bad>] ? vunmap_page_range+0x19d/0x2f0
[  148.765854]  [<ffffffff81219d11>] ? unmap_kernel_range_noflush+0x11/0x20
[  148.765859]  [<ffffffff814ce6fe>] ? ghes_copy_tofrom_phys+0x11e/0x2a0
[  148.765863]  [<ffffffff8109e5db>] ? native_apic_msr_write+0x2b/0x30
[  148.765865]  [<ffffffff8109e44d>] ? x2apic_send_IPI_self+0x1d/0x20
[  148.765869]  [<ffffffff81065135>] ? arch_irq_work_raise+0x35/0x40
[  148.765872]  [<ffffffff811c8d86>] ? irq_work_queue+0x66/0x80
[  148.765875]  [<ffffffff81075306>] perf_event_nmi_handler+0x26/0x40
[  148.765877]  [<ffffffff81063ed9>] nmi_handle+0x79/0x100
[  148.765879]  [<ffffffff81064422>] default_do_nmi+0x42/0x100
[  148.765880]  [<ffffffff81064563>] do_nmi+0x83/0xb0
[  148.765884]  [<ffffffff818c7c0f>] end_repeat_nmi+0x1e/0x2e
[  148.765886]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
[  148.765888]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
[  148.765890]  [<ffffffff811d07a6>] ? __perf_event_task_sched_in+0x36/0xa0
[  148.765891]  <<EOE>>  [<ffffffff8110ab66>] finish_task_switch+0x156/0x210
[  148.765898]  [<ffffffff818c1671>] __schedule+0x341/0x920
[  148.765899]  [<ffffffff818c1c87>] schedule+0x37/0x80
[  148.765903]  [<ffffffff810ae1af>] ? do_page_fault+0x2f/0x80
[  148.765905]  [<ffffffff818c1f4a>] schedule_user+0x1a/0x50
[  148.765907]  [<ffffffff818c666c>] retint_careful+0x14/0x32
[  148.765908] ---[ end trace e33ff2be78e14901 ]---

The CQM task events are not safe to be called from within interrupt
context because they require performing an IPI to read the counter value
on all sockets. And performing IPIs from within IRQ context is a
"no-no".

Make do with the last read counter value currently event in
event->count when we're invoked in this context.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Will Auld <will.auld@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-26 10:22:29 +02:00
..
mcheck Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 17:59:09 -07:00
microcode x86/microcode: Correct CPU family related variable types 2015-06-07 15:38:15 +02:00
mtrr x86/mm/pat: Wrap pat_enabled into a function API 2015-05-27 14:41:01 +02:00
.gitignore
amd.c Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 17:59:09 -07:00
bugs_64.c
bugs.c x86/fpu: Move various internal function prototypes to fpu/internal.h 2015-05-19 15:47:48 +02:00
centaur.c x86: Remove CONFIG_X86_OOSTORE 2014-03-11 10:16:18 -07:00
common.c x86/fpu: Fix FPU related boot regression when CPUID masking BIOS feature is enabled 2015-06-30 07:22:10 +02:00
cpu.h x86/cpu: Track legacy CPU model data only on 32-bit kernels 2013-10-26 13:34:39 +02:00
cyrix.c x86: Delete non-required instances of include <linux/init.h> 2014-01-06 21:25:18 -08:00
hypervisor.c hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests 2015-05-05 18:27:43 +01:00
intel_cacheinfo.c x86: Kill CONFIG_X86_HT 2015-06-07 15:33:44 +02:00
intel_pt.h perf/x86/intel/pt: Add Intel PT PMU driver 2015-04-02 17:14:20 +02:00
intel.c x86/cpu/intel: Fix trivial typo in intel_tlb_table[] 2015-02-22 08:55:58 +01:00
Makefile perf/x86/intel/bts: Add BTS PMU driver 2015-04-02 17:14:21 +02:00
match.c x86: align x86 arch with generic CPU modalias handling 2014-02-18 12:45:38 -08:00
mkcapflags.sh x86/build: Fix mkcapflags.sh bash-ism 2015-02-19 02:21:00 +01:00
mshyperv.c x86: Use entering[_ack]_irq() instead of open coding it 2015-05-15 16:03:18 +02:00
perf_event_amd_ibs.c perf/x86/amd/ibs: Convert force_ibs_eilvt_setup() to void 2015-02-18 17:01:46 +01:00
perf_event_amd_iommu.c cpumask: factor out show_cpumap into separate helper function 2014-11-07 11:45:00 -08:00
perf_event_amd_iommu.h
perf_event_amd_uncore.c cpumask: factor out show_cpumap into separate helper function 2014-11-07 11:45:00 -08:00
perf_event_amd.c perf/x86: Add 'index' param to get_event_constraint() callback 2015-04-02 17:33:10 +02:00
perf_event_intel_bts.c Replace module_init with appropriate alternate initcall in non modules. 2015-07-02 10:36:29 -07:00
perf_event_intel_cqm.c perf/x86/intel/cqm: Return cached counter value from IRQ context 2015-07-26 10:22:29 +02:00
perf_event_intel_ds.c perf/x86/intel/pebs: Add PEBSv3 decoding 2015-06-07 16:09:16 +02:00
perf_event_intel_lbr.c perf/x86/intel: Drain the PEBS buffer during context switches 2015-06-07 16:08:54 +02:00
perf_event_intel_pt.c Replace module_init with appropriate alternate initcall in non modules. 2015-07-02 10:36:29 -07:00
perf_event_intel_rapl.c Merge branch 'linus' into timers/core 2015-05-19 16:12:32 +02:00
perf_event_intel_uncore_nhmex.c perf/x86/uncore: Fix coccinelle warnings 2014-08-13 07:51:09 +02:00
perf_event_intel_uncore_snb.c perf/x86/intel/uncore: Add Broadwell-U uncore IMC PMU support 2015-05-11 11:57:47 +02:00
perf_event_intel_uncore_snbep.c perf/x86/intel/uncore: Fix CBOX bit wide and UBOX reg on Haswell-EP 2015-06-07 15:46:50 +02:00
perf_event_intel_uncore.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 18:57:44 -07:00
perf_event_intel_uncore.h Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 15:19:21 -07:00
perf_event_intel.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 15:52:04 -07:00
perf_event_knc.c x86: Replace __get_cpu_var uses 2014-08-26 13:45:49 -04:00
perf_event_p4.c x86: Replace __get_cpu_var uses 2014-08-26 13:45:49 -04:00
perf_event_p6.c perf/x86/intel/p6: Add userspace RDPMC quirk for PPro 2014-02-09 13:08:24 +01:00
perf_event.c perf/x86: Fix 'active_events' imbalance 2015-06-30 13:08:46 +02:00
perf_event.h Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 15:45:41 -07:00
perfctr-watchdog.c
powerflags.c
proc.c x86: Replace cpu_**_mask() with topology_**_cpumask() 2015-05-27 15:22:17 +02:00
rdrand.c x86, rdrand: When nordrand is specified, disable RDSEED as well 2014-05-11 20:25:20 -07:00
scattered.c x86: Add Intel Processor Trace (INTEL_PT) cpu feature detection 2015-04-02 17:14:18 +02:00
topology.c
transmeta.c x86: Delete non-required instances of include <linux/init.h> 2014-01-06 21:25:18 -08:00
umc.c x86: Delete non-required instances of include <linux/init.h> 2014-01-06 21:25:18 -08:00
vmware.c