The CCI ports programming interface is available to the kernel
only when booted in secure mode (or when firmware enables
non-secure access to override CCI ports control). In both cases,
firmware reports the CCI ports availability through the device
tree CCI ports nodes, which must be parsed and their status checked
by the kernel probing path.
This check is currently missing and may cause the kernel to
erroneously believe it is free to take control of CCI ports
where in practice CCI ports control is forbidden.
Add the missing CCI port availability check to the CCI driver
in order to guarantee sane CCI usage.
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Pull CPU hotplug updates from Thomas Gleixner:
"Yet another batch of cpu hotplug core updates and conversions:
- Provide core infrastructure for multi instance drivers so the
drivers do not have to keep custom lists.
- Convert custom lists to the new infrastructure. The block-mq custom
list conversion comes through the block tree and makes the diffstat
tip over to more lines removed than added.
- Handle unbalanced hotplug enable/disable calls more gracefully.
- Remove the obsolete CPU_STARTING/DYING notifier support.
- Convert another batch of notifier users.
The relayfs changes which conflicted with the conversion have been
shipped to me by Andrew.
The remaining lot is targeted for 4.10 so that we finally can remove
the rest of the notifiers"
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
cpufreq: Fix up conversion to hotplug state machine
blk/mq: Reserve hotplug states for block multiqueue
x86/apic/uv: Convert to hotplug state machine
s390/mm/pfault: Convert to hotplug state machine
mips/loongson/smp: Convert to hotplug state machine
mips/octeon/smp: Convert to hotplug state machine
fault-injection/cpu: Convert to hotplug state machine
padata: Convert to hotplug state machine
cpufreq: Convert to hotplug state machine
ACPI/processor: Convert to hotplug state machine
virtio scsi: Convert to hotplug state machine
oprofile/timer: Convert to hotplug state machine
block/softirq: Convert to hotplug state machine
lib/irq_poll: Convert to hotplug state machine
x86/microcode: Convert to hotplug state machine
sh/SH-X3 SMP: Convert to hotplug state machine
ia64/mca: Convert to hotplug state machine
ARM/OMAP/wakeupgen: Convert to hotplug state machine
ARM/shmobile: Convert to hotplug state machine
arm64/FP/SIMD: Convert to hotplug state machine
...
For one of the CCI events exposed under sysfs, "snoop" was typo'd as
"snopp". Correct this such that users see the expected event name when
enumerating events via sysfs.
Cc: arm@kernel.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: rt@linutronix.de
Cc: Olof Johansson <olof@lixom.net>
Link: http://lkml.kernel.org/r/1471024183-12666-5-git-send-email-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Suzuki K. Poulose <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153334.679142601@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
cci_pmu_sync_counters and pmu_event_set_period are internal functions
to the CCI PMU driver, so make them static to avoid polluting the kernel
namespace.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add ARM CoreLink CCI-550 cache coherent interconnect PMU
driver support. The CCI-550 PMU shares all the attributes of CCI-500
PMU, except for an additional master interface (MI-6 - 0xe).
CCI-550 requires the same work around as for CCI-500 to
write to the PMU counter.
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
CCI-550 PMU shares most of the CCI-500 PMU attributes including the
event format, PMU event codes. The only difference is an additional
master interface (MI6 - 0xe). Hence we share the driver code for both,
except for a model specific event validate method.
This patch renames the common CCI500 symbols to CCI5xx, including the
Kconfig symbol.
No functional changes to the PMU driver.
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The CCI PMU driver sets the event counter to the half of the maximum
value(2^31) it can count before we start the counters via
pmu_event_set_period(). This is done to give us the best chance to
handle the overflow interrupt, taking care of extreme interrupt latencies.
However, CCI-500 comes with advanced power saving schemes, which
disables the clock to the event counters unless the counters are enabled to
count (PMCR.CEN). This prevents the driver from writing the period to the
counters before starting them. Also, there is no way we can reset the
individual event counter to 0 (PMCR.RST resets all the counters, losing
their current readings). However the value of the counter is preserved and
could be read back, when the counters are not enabled.
So we cannot reliably use the counters and compute the number of events
generated during the sampling period since we don't have the value of the
counter at start.
This patch works around this issue by changing writes to the counter
with the following steps.
1) Disable all the counters (remembering any counters which were enabled)
2) Enable the PMU, now that all the counters are disabled.
For each counter to be programmed, repeat steps 3-7
3) Save the current event and program the target counter to count an
invalid event, which by spec is guaranteed to not-generate any events.
4) Enable the target counter.
5) Write to the target counter.
6) Disable the target counter
7) Restore the event back on the target counter.
8) Disable the PMU
9) Restore the status of the all the counters
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add a hook for writing to CCI PMU counters. This callback
can be used for CCI models which requires some extra work
to program the PMU counter values. To accommodate group writes
and single counter writes, the call back accepts a bitmask
of the counter indices which need to be programmed with the
given value.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On CCI-500 writing to a counter requires turning the PMU on. So,
synchronising the counter state should not be performed for such special cases,
while turning the PMU on. This patch adds a helper, __cci_pmu_enable_nosync(),
without flushing the counter states.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Adds helper routines to disable the counter controls for
all the counters on the CCI PMU and restore it back, by
preserving the original state in caller provided mask.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add helper routines to check if the counter is enabled or not.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
pmu_write_counter() is now only called from pmu_write_counters(),
which does so for each set index in the given mask, bounded by
cci_pmu->num_cntrs. So, there is no need for an extra check to
make sure the given counter is valid inside pmu_write_counter.
This patch gets rid of that.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
CCI PMU driver always reprograms the counters to a safe value (half of the
counter max, = 2^31) before starting the profiling to account for extreme
interrupt latencies. Also, the cost of writing to a PMU counter could be
very costly on some PMUs(e.g, CCI-500). In order to ammortise the cost of
programming the counters, this patch delays the counter writes to pmu::pmu_enable().
We use the PER_HES_ARCH flag to keep track of the counters which need to
be programmed. Before turning on the PMU, we go through the counters that
were marked for write, and perform the operation in a batch.
To unify all the counter writes to pmu_enable(), this patch also makes sure that
we disable-and-enable the PMU in the irq handler to program any counters that
overflowed.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch refactors the CCI PMU driver code a little bit to
make it easier share the code for enabling/disabling the CCI
PMU. This will be used by the hooks to work around the special cases
where writing to a counter is not always that easy(e.g, CCI-500)
No functional changes.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add a helper to group the writes to PMU counter, this will be
used to delay setting the event period to pmu::pmu_enable()
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
cpumask_any_but returns value >= nr_cpu_ids if there are no more CPUs.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2038576
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There's no need to dynamically initialise attribute pointers when we can
get the compiler to do it for us. We also don't need a dev_ext_attribute
for the cpumask, as the drvdata for a PMU device is a pointer to struct
pmu.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Each CCI model have different event/source codes and formats. This
patch exports this information via the sysfs, which includes the
aliases for the events. The aliases are listed by 'perf list', helping
the users to specify the name of the event instead of the binary
config values.
Each event alias must accompany the 'source' code except for the
following cases :
1) CCI-400 - cycles event, doesn't relate to an interface.
2) CCI-500 - Global events to the CCI. (Fixed source code = 0xf)
Each CCI model provides two sets of attributes(format and event),
which are dynamically populated before registering the PMU, to
allow for the appropriate information.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
CCI-500 provides 8 event counters which can count any of the
supported events independently. The PMU event id is a 9-bit
value made of two parts.
bits [8:5] - Source port
0x0-0x6 Slave Ports
0x8-0xD Master Ports
0xf Global Events to CCI
0x7,0xe Reserved
bits [0:4] - Event code (specific to each type of port)
The generic CCI-500 controlling interface remains the same with CCI-400.
However there are some differences in the PMU event counters.
- No cycle counter
- Upto 8 counters(4 in CCI-400)
- Each counter area is 64K(4K in CCI400)
- The counter0 starts at offset 0x10000 from the base of CCI
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Rename CCI400 specific defintions from CCI_xxx to CCI400_xxx.
Introduce generic ARM_CCI_PMU to cover common code for handling
the CCI PMU.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Given that each CCI has different set of interfaces and
its associated events, it is good to abstract the validation of the
event codes to make it easier to add support for a new CCI model.
This patch also abstracts the mapping of a given event to a counter,
as there are some special counters for certain specific events.
We assume that the fixed hardware counters are always at the beginning,
so that we can use cci_model->fixed_hw_events as an upper bound to given
idx to check if we need to program the counter for an event.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Adds the PMU model specific counters to the PMU model
abstraction to make it easier to add a new PMU.
The patch cleans up the naming convention used all over
the code.
e.g, CCI_PMU_MAX_HW_EVENTS => maximum number of events that
can be counted at any time, which is in fact the maximum
number of counters available.
Change all such namings to use 'counters' instead of events.
This patch also abstracts the following:
1) Size of a PMU event counter area.
2) Maximum number of programmable counters supported by the PMU model
3) Number of counters which counts fixed events (e.g, cycle
counter on CCI-400).
Also changes some of the static allocation of the data
structures to dynamic, to accommodate the number of events
supported by a PMU.
Gets rid ofthe CCI_PMU_* defines for the model. All such
data should be accessed via the model abstraction.
Limits the number of counters to the maximum supported
by the 'model'.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch gets rid of the global struct cci_pmu variable and makes
the code use the cci_pmu explicitly. Makes code a bit more robust
and reader friendly.
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Currently in validate_group(), there is a static initializer
for fake_pmu.used_mask which is based on CPU_BITS_NONE but
the used_mask array size is based on CCI_PMU_MAX_HW_EVENTS.
CCI_PMU_MAX_HW_EVENTS is not based on NR_CPUS, so CPU_BITS_NONE
is not correct and will cause a build failure if NR_CPUS
is set high enough to make CPU_BITS_NONE larger than used_mask.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
We mask the event with the CCI_PMU_EVENT_MASK, before passing
the config to pmu_validate_hw_event(), which causes extra bits
to be ignored and qualifies an invalid event code as valid.
e.g,
$ perf stat -a -C 0 -e CCI_400/config=0x1ff,name=cycles/ sleep 1
Performance counter stats for 'system wide':
506951142 cycles
1.013879626 seconds time elapsed
where, cycles has an event coding of 0xff. This patch also removes
the unnecessary 'event' mask in pmu_write_register, since the config_base
is set by the pmu code after the event is validated.
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch separates the PMU driver code from the low level
CCI driver code and enables the PMU driver for ARM64.
Introduces config options for both.
ARM_CCI400_PORT_CTRL - controls the low level driver code for
CCI400 ports.
ARM_CCI400_PMU - controls the PMU driver code
ARM_CCI400_COMMON - Common defintions for CCI400
This patch also changes:
ARM_CCI - common code for probing the CCI devices. This can be
used for adding support for newer CCI versions(e.g, CCI-500).
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Abhilash Kesavan <a.kesavan@samsung.com>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Avoid secure transactions while probing the CCI PMU. The
existing code makes use of the Peripheral ID2 (PID2) register
to determine the revision of the CCI400, which requires a
secure transaction. This puts a limitation on the usage of the
driver on systems running non-secure Linux(e.g, ARM64).
Updated the device-tree binding for cci pmu node to add the explicit
revision number for the compatible field.
The supported strings are :
arm,cci-400-pmu,r0
arm,cci-400-pmu,r1
arm,cci-400-pmu - DEPRECATED. See NOTE below
NOTE: If the revision is not mentioned, we need to probe the cci revision,
which could be fatal on a platform running non-secure. We need a reliable way
to know if we can poke the CCI registers at runtime on ARM32. We depend on
'mcpm_is_available()' when it is available. mcpm_is_available() returns true
only when there is a registered driver for mcpm. Otherwise, we assume that we
don't have secure access, and skips probing the revision number(ARM64 case).
The MCPM should figure out if it is safe to access the CCI. Unfortunately
there isn't a reliable way to indicate the same via dtb. This patch doesn't
address/change the current situation. It only deals with the CCI-PMU, leaving
the assumptions about the secure access as it has been, prior to this patch.
Cc: devicetree@vger.kernel.org
Cc: Punit Agrawal <punit.agrawal@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
CCI400 has different event specifications for PMU, for revsion
0 and revision 1. As of now, we check the revision every single
time before using the parameters for the PMU. This patch abstracts
the details of the pmu models in a struct (cci_pmu_model) and
stores the information in cci_pmu at initialisation time, avoiding
multiple probe operations.
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
No functional changes, only code re-arrangements for easier split of the
PMU code vs low level driver code. Extracts the port handling code
to cci_probe_ports().
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The perf core implicitly rejects events spanning multiple HW PMUs, as in
these cases the event->ctx will differ. However this validation is
performed after pmu::event_init() is called in perf_init_event(), and
thus pmu::event_init() may be called with a group leader from a
different HW PMU.
The CCI PMU driver does not take this fact into account, and assumes
that the any other hardware event belongs to the CCI. There are two
issues with it :
1) It is wrong and we should reject such groups.
2) Validation allocates an temporary idx for this non-cci event, which leads
to wrong calculation of the counter availability, and eventually lesser
number of events in the group.
This patch updates the CCI PMU driver to first test for and reject
events from other PMUs, which is the right thing to do.
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Peter Ziljstra (Intel) <peterz@infradead.org>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
* Line termination only requires one extra space at the end of the
buffer. Use PAGE_SIZE - 1 instead of PAGE_SIZE - 2 when formatting.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The arm-cci driver completes the probe sequence even if the cci node is
marked as disabled. Add a check in the driver to honour the cci status
in the device tree.
Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
The ARM CPU PMUs and the ARM CCI PMU are using the same framework
despite being substantially different in programming model, which makes
it difficult to handle either particularly well.
This patch migrates the ARM CCI PMU driver away from the arm_pmu
framework, matching the style of the CCN PMU driver and other 'uncore'
PMU drivers. This will enable refactoring of the arm_pmu framework to
better support CPU PMUs. Event context migration on hotplug is not yet
added due to a race on event->ctx in the core perf code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
[will: fix whitespace issues]
Signed-off-by: Will Deacon <will.deacon@arm.com>
In commit ae91d60ba8, a bug was fixed that
involved converting !x & y to !(x & y). The code below shows the same
pattern, and thus should perhaps be fixed in the same way.
The Coccinelle semantic patch that makes this change is as follows:
// <smpl>
@@ expression E1,E2; @@
(
!E1 & !E2
|
- !E1 & E2
+ !(E1 & E2)
)
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
The event numbering changed between revision r0 and r1 of the CCI
PMU. Expose this to userspace to allow tooling to handle the
differences in event numbers.
Suggested-by: Drew Richardson <Drew.Richardson@arm.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The driver queries the CCI IP revision to distinguish between r0 and r1
scheme for event numbers and currently supports upto version r1p2. To
minimise code churn every time there's a new version of the IP, assume
that event numbering doesn't change for revisions > r1p0 (which is
the case).
The driver will still need an update for future revisions that change
the event numbers.
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch fixes a bug/typo in the CCI driver kcalloc usage
that inadvertently swapped the parameters order in the
kcalloc call and went unnoticed.
Reported-by: Xia Feng <xiafeng@allwinnertech.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull ARM updates from Russell King:
"Included in this series are:
1. BE8 (modern big endian) changes for ARM from Ben Dooks
2. big.Little support from Nicolas Pitre and Dave Martin
3. support for LPAE systems with all system memory above 4GB
4. Perf updates from Will Deacon
5. Additional prefetching and other performance improvements from Will.
6. Neon-optimised AES implementation fro Ard.
7. A number of smaller fixes scattered around the place.
There is a rather horrid merge conflict in tools/perf - I was never
notified of the conflict because it originally occurred between Will's
tree and other stuff. Consequently I have a resolution which Will
forwarded me, which I'll forward on immediately after sending this
mail.
The other notable thing is I'm expecting some build breakage in the
crypto stuff on ARM only with Ard's AES patches. These were merged
into a stable git branch which others had already pulled, so there's
little I can do about this. The problem is caused because these
patches have a dependency on some code in the crypto git tree - I
tried requesting a branch I can pull to resolve these, and all I got
each time from the crypto people was "we'll revert our patches then"
which would only make things worse since I still don't have the
dependent patches. I've no idea what's going on there or how to
resolve that, and since I can't split these patches from the rest of
this pull request, I'm rather stuck with pushing this as-is or
reverting Ard's patches.
Since it should "come out in the wash" I've left them in - the only
build problems they seem to cause at the moment are with randconfigs,
and since it's a new feature anyway. However, if by -rc1 the
dependencies aren't in, I think it'd be best to revert Ard's patches"
I resolved the perf conflict roughly as per the patch sent by Russell,
but there may be some differences. Any errors are likely mine. Let's
see how the crypto issues work out..
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (110 commits)
ARM: 7868/1: arm/arm64: remove atomic_clear_mask() in "include/asm/atomic.h"
ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval' in atomic_cmpxchg().
ARM: 7866/1: include: asm: use 'long long' instead of 'u64' within atomic.h
ARM: 7871/1: amba: Extend number of IRQS
ARM: 7887/1: Don't smp_cross_call() on UP devices in arch_irq_work_raise()
ARM: 7872/1: Support arch_irq_work_raise() via self IPIs
ARM: 7880/1: Clear the IT state independent of the Thumb-2 mode
ARM: 7878/1: nommu: Implement dummy early_paging_init()
ARM: 7876/1: clear Thumb-2 IT state on exception handling
ARM: 7874/2: bL_switcher: Remove cpu_hotplug_driver_{lock,unlock}()
ARM: footbridge: fix build warnings for netwinder
ARM: 7873/1: vfp: clear vfp_current_hw_state for dying cpu
ARM: fix misplaced arch_virt_to_idmap()
ARM: 7848/1: mcpm: Implement cpu_kill() to synchronise on powerdown
ARM: 7847/1: mcpm: Factor out logical-to-physical CPU translation
ARM: 7869/1: remove unused XSCALE_PMU Kconfig param
ARM: 7864/1: Handle 64-bit memory in case of 32-bit phys_addr_t
ARM: 7863/1: Let arm_add_memory() always use 64-bit arguments
ARM: 7862/1: pcpu: replace __get_cpu_var_uses
ARM: 7861/1: cacheflush: consolidate single-CPU ARMv7 cache disabling code
...
cci_enable_port_for_self written in asm and it works with h/w
registers that are in little endian format. When run in big
endian mode it needs byteswaped constants before/after it
writes/reads to/from such registers
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
This patch fix the error handle of function cci_pmu_probe():
- using IS_ERR() instead of NULL test for the return value of
devm_ioremap_resource() since it nerver return NULL.
- remove kfree() for devm_kzalloc allocated memory
- remove dev_warn() since devm_ioremap_resource() has error message
already.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Extend the existing CCI driver to support the PMU by registering a perf
backend for it.
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Dave Martin <dave.martin@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
[will: removed broken __init annotations]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Since the CPU device nodes can be retrieved using arch_of_get_cpu_node,
we can use it to avoid parsing the cpus node searching the cpu nodes and
mapping to logical index.
This patch removes parsing DT for cpu nodes by using of_get_cpu_node.
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
When we build a kernel with support for both ARMv6 and ARMv7,
gas is trying to be helpful by pointing out that the arm-cci
driver would not work on ARMv6:
/tmp/ccu1LDeU.s: Assembler messages:
/tmp/ccu1LDeU.s:450: Error: selected processor does not support ARM mode `wfi '
/tmp/ccu1LDeU.s:451: Error: selected processor does not support ARM mode `wfe '
make[4]: *** [drivers/bus/arm-cci.o] Error 1
We know that the driver will only be used on ARMv7, hence we
can annotate the inline assembly listing to allow those instructions.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nicolas Pitre <nico@linaro.org>
Cc: Dave Martin <dave.martin@linaro.org>
This provides cci_enable_port_for_self(). This is the counterpart to
cci_disable_port_by_cpu(self).
This is meant to be called from the MCPM machine specific power_up_setup
callback code when the appropriate affinity level needs to be initialized.
The code therefore has to be position independent as the MMU is still off
and it cannot rely on any stack space.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Dave Martin <dave.martin@linaro.org>
On ARM multi-cluster systems coherency between cores running on
different clusters is managed by the cache-coherent interconnect (CCI).
It allows broadcasting of TLB invalidates and memory barriers and it
guarantees cache coherency at system level through snooping of slave
interfaces connected to it.
This patch enables the basic infrastructure required in Linux to handle and
programme the CCI component.
Non-local variables used by the CCI management functions called by power
down function calls after disabling the cache must be flushed out to main
memory in advance, otherwise incoherency of those values may occur if they
are sitting in the cache of some other CPU when power down functions
execute. Driver code ensures that relevant data structures are flushed
from inner and outer caches after the driver probe is completed.
CCI slave port resources are linked to set of CPUs through bus masters
phandle properties that link the interface resources to masters node in
the device tree.
Documentation describing the CCI DT bindings is provided with the patch.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>