linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-15 10:46:49 +07:00

Author	SHA1	Message	Date
Alexey Brodkin	e0d5321fac	arc: perf: Enable generic "cache-references" and "cache-misses" events We used to live with PERF_COUNT_HW_CACHE_REFERENCES and PERF_COUNT_HW_CACHE_MISSES not specified on ARC. Those events are actually aliases to 2 cache events that we do support and so this change sets "cache-references" and "cache-misses" events in the same way as "L1-dcache-loads" and "L1-dcache-load-misses". And while at it adding debug info for cache events as well as doing a subtle fix in HW events debug info - config value is much better represented by hex so we may see not only event index but as well other control bits set (if they exist). Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2016-09-30 14:48:18 -07:00
Andrea Gelmini	2547476a5e	Fix typos Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2016-05-30 10:07:32 +05:30
Arnaldo Carvalho de Melo	cfbcf46845	perf core: Pass max stack as a perf_callchain_entry context This makes perf_callchain_{user,kernel}() receive the max stack as context for the perf_callchain_entry, instead of accessing the global sysctl_perf_event_max_stack. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-05-16 23:11:50 -03:00
Vineet Gupta	c6317bc7c5	ARCv2: perf: Ensure perf intr gets enabled on all cores This was the second perf intr issue. Perf sampling on multicore requires intr to be enabled on all cores. ARC perf probe code used helper arc_request_percpu_irq() which calls - request_percpu_irq() on core0 - enable_percpu_irq() on all cores (including core0) genirq requires that request be made ahead of enable call. However if perf probe happened on non core0 (observed on a 3.18 kernel), enable would get called ahead of request, failing obviously and rendering perf intr disabled on all such cores [ 11.120000] 1 ARC perf : 8 counters (48 bits), 113 conditions, [overflow IRQ support] [ 11.130000] 1 -----> enable_percpu_irq() IRQ 20 failed [ 11.140000] 3 -----> enable_percpu_irq() IRQ 20 failed [ 11.140000] 2 -----> enable_percpu_irq() IRQ 20 failed [ 11.140000] 0 =====> request_percpu_irq() IRQ 20 [ 11.140000] 0 -----> enable_percpu_irq() IRQ 20 Fix this fragility, by calling request_percpu_irq() on whatever core calls probe (there is no requirement on which core calls this anyways) and then calling enable on each core. Interestingly this started as investigation of STAR 9000838902: "sporadically IRQs enabled on perf prob" which was about occasional boot spew as request_percpu_irq got called non-locally (from an IPI), and re-enabled interrupts in following path proc_mkdir -> spin_unlock_irq() which the irq work code didn't like. | ARC perf : 8 counters (48 bits), 113 conditions, [overflow IRQ support] | | BUG: failure at ../kernel/irq_work.c:135/irq_work_run_list()! | CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.10-01127-g285efb8e66d1 #2 | | Stack Trace: | arc_unwind_core.constprop.1+0x94/0x104 | dump_stack+0x62/0x98 | irq_work_run_list+0xb0/0xb4 | irq_work_run+0x22/0x3c | do_IPI+0x74/0x9c | handle_irq_event_percpu+0x34/0x164 | handle_percpu_irq+0x58/0x78 | generic_handle_irq+0x1e/0x2c | arch_do_IRQ+0x3c/0x60 | ret_from_exception+0x0/0x8 Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-snps-arc@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: Alexey Brodkin <abrodkin@synopsys.com> Cc: <stable@vger.kernel.org> #4.2+ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-12-12 16:03:59 +05:30
Vineet Gupta	9b28829d6d	ARCv2: perf: Finally introduce HS perf unit With all features in place, the ARC HS pct block can now be effectively allowed to be probed/used Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:59:07 +05:30
Alexey Brodkin	e525c37f84	ARCv2: perf: SMP support * split off pmu info into singleton and per-cpu bits * setup PMU on all cores Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:58:42 +05:30
Alexey Brodkin	e6b1d126bb	ARCv2: perf: implement exclusion of event counting in user or kernel mode Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:58:14 +05:30
Alexey Brodkin	36481cf7fb	ARCv2: perf: Support sampling events using overflow interrupts In times of ARC 700 performance counters didn't have support of interrupts and so for ARC we only had support of non-sampling events. Put simply only "perf stat" was functional. Now with ARC HS we have support of interrupts in performance counters which this change introduces support of. ARC performance counters act in the following way in regard of interrupts generation. [1] A counter counts starting from value set in PCT_COUNT register pair [2] Once counter reaches value set in PCT_INT_CNT interrupt is raised Basic setup looks like this: [1] PCT_COUNT = 0; [2] PCT_INT_CNT = __limit_value__; [3] Enable interrupts for that counter and let it run [4] Let counter reach its limit [5] Handle interrupt when it happens Note that PCT HW block is built into CPU core and so its interrupt line (which is basically OR of all counters IRQs) is wired directly to top-level IRQC. That means to de-assert PCT interrupt it's required to reset IRQs from all counters that have reached their limit values. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:57:43 +05:30
Alexey Brodkin	1fe8bfa5ff	ARCv2: perf: implement "event_set_period" This generalization prepares for support of overflow interrupts. Hardware event counters on ARC work this way: Each counter counts from programmed start value (set in ARC_REG_PCT_COUNT) to a limit value (set in ARC_REG_PCT_INT_CNT) and once limit value is reached this timer generates an interrupt. Even though this hardware implementation allows for more flexibility, in Linux kernel we decided to mimic behavior of other architectures this way: [1] Set limit value as half of counter's max value (to allow counter to run after reaching its limit, see below for more explanation): ---------->8----------- arc_pmu->max_period = (1ULL << counter_size) / 2 - 1ULL; ---------->8----------- [2] Set start value as "arc_pmu->max_period - sample_period" and then count up to the limit Our event counters don't stop on reaching max value (the one we set in ARC_REG_PCT_INT_CNT) but continue to count until kernel explicitly stops each of them. And setting a limit as half of counter capacity is done to allow capturing of additional events in between moment when interrupt was triggered until we're actually processing PMU interrupts. That way we're trying to be more precise. For example if we count CPU cycles we keep track of cycles while running through generic IRQ handling code: [1] We set counter period as say 100_000 events of type "crun" [2] Counter reaches that limit and raises its interrupt [3] Once we get in PMU IRQ handler we read current counter value from ARC_REG_PCT_SNAP and see there something like 105_000. If counters stop on reaching a limit value then we would miss additional 5000 cycles. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:57:29 +05:30
Vineet Gupta	fb7c572551	ARC: perf: cap the number of counters to hardware max of 32 The number of counters in PCT can never be more than 32 (while countable conditions could be 100+) for both ARCompact and ARCv2. And while at it, update copyright dates. Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-27 14:57:03 +05:30
Vineet Gupta	090749502f	ARC: add/fix some comments in code - no functional change Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-08-20 19:05:49 +05:30
Tobias Klauser	082ae1e157	ARC: perf: Remove unnecessary local variable Directly return the result of perf_pmu_register() in arc_pmu_device_probe() instead of assigning and returning variable ret. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-06-19 18:09:28 +05:30
Max Filippov	7002f77541	arc: fix use of uninitialized arc_pmu static arc_pmu in the arch/arc/kernel/perf_event.c is not initialized as it's shadowed by a local variable of the same name in the arc_pmu_device_probe. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Fixes: `03c94fcf95` "ARC: perf: make @arc_pmu static global" CC: <stable@vger.kernel.org> # 4.1 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-06-19 18:09:28 +05:30
Vineet Gupta	d8f6ad85cb	ARC: perf: don't add code for impossible case Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-04-20 18:27:55 +05:30
Vineet Gupta	30fdd373f2	ARC: perf: Rename DT binding to not confuse with power mgmt Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-04-20 18:27:36 +05:30
Vineet Gupta	22f6b89912	ARC: perf: add user space attribution in callchains The actual user space unwinding is more involved, so simply capture the user space PC Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-04-20 18:27:35 +05:30
Vineet Gupta	389e3160b9	ARC: perf: Add kernel callchain support Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-04-20 18:27:35 +05:30
Vineet Gupta	bde80c237e	ARC: perf: Add some comments/debug stuff Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-04-20 18:27:30 +05:30
Vineet Gupta	03c94fcf95	ARC: perf: make @arc_pmu static global Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2015-04-20 17:21:17 +05:30
Vineet Gupta	5637208253	ARC: boot: cpu feature print enhancements Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2014-10-13 14:46:22 +05:30
Vince Weaver	2cc9e588b0	arc, perf: Use common PMU interrupt disabled code Transition to using the new generic PERF_PMU_CAP_NO_INTERRUPT method for failing a sampling event when no PMU interrupt is available. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Rob Herring <robh+dt@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1406150159280.16738@vincent-weaver-1.umelst.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-06-18 18:43:44 +02:00
Vineet Gupta	da990a4f2d	ARC: [perf] Fix a few thinkos Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2013-11-28 15:49:59 +05:30
Mischa Jonker	230c4aadcc	ARC: perf: ARC 700 PMU doesn't support sampling events The ARC 700 does not have an interrupt associated with it, and as such it cannot trigger when a counter overflows. As the counters are 48 bit, it will usually take at least 100 days before a counter overflows, so for mere counting of events, there is no problem. Sampling is not supported though. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2013-11-15 10:52:28 +05:30
Mischa Jonker	0dd450fe13	ARC: Add perf support for ARC700 cores This adds basic perf support for ARC700 cores. Most PERF_COUNT_HW* events are supported now. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2013-11-12 09:45:38 +05:30

24 Commits
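
The period arithmetic described in commit 1fe8bfa5ff ("event_set_period") can be sketched in plain C. This is only an illustration of the numbers in that commit message, not the kernel code itself: the helper names pct_max_period, pct_start_value and pct_events_counted are hypothetical, and the sketch assumes sample_period never exceeds max_period.

```c
#include <stdint.h>

/* Limit value that would go into ARC_REG_PCT_INT_CNT: half of the counter
 * range, following the formula quoted in the commit message.
 * Assumes counter_size < 64 (the logs above report 48-bit counters). */
static uint64_t pct_max_period(unsigned int counter_size)
{
	return (1ULL << counter_size) / 2 - 1ULL;
}

/* Start value that would go into ARC_REG_PCT_COUNT, so that the counter
 * reaches the limit after sample_period events.
 * Hypothetical helper; assumes sample_period <= max_period. */
static uint64_t pct_start_value(uint64_t max_period, uint64_t sample_period)
{
	return max_period - sample_period;
}

/* Events counted since the start value, given a snapshot read in the
 * interrupt handler. Because the counter keeps running past the limit,
 * this can exceed the requested sample period. */
static uint64_t pct_events_counted(uint64_t snapshot, uint64_t start)
{
	return snapshot - start;
}
```

With a 48-bit counter and a sample period of 100000, a snapshot taken 105000 events after the start yields 105000 from pct_events_counted, so the 5000 events that occurred between the interrupt and the handler are still accounted for.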