We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/debug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
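For illustration, a minimal sketch of what such a placeholder header can
look like (the include guard name is an assumption):

    #ifndef _LINUX_SCHED_DEBUG_H
    #define _LINUX_SCHED_DEBUG_H

    #include <linux/sched.h>

    #endif /* _LINUX_SCHED_DEBUG_H */

Since the header only redirects to <linux/sched.h>, any file switched over
to it keeps compiling unchanged, which is what makes the split bisectable.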
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Update the .c files that depend on these APIs.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix up affected files that include this signal functionality via sched.h.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
After commit 9908859aca (cpuidle/menu: add per CPU PM QoS resume
latency consideration) the cpuidle menu governor calls
dev_pm_qos_read_value() on CPU devices to read the current resume
latency QoS constraint values for them. That function takes a spinlock
to prevent the device's power.qos pointer from becoming NULL during
the access, which is a problem for the RT patchset, where spinlocks
are converted into mutexes and the idle loop stops working.
However, it is not even necessary for the menu governor to take
that spinlock, because the power.qos pointer accessed under it
cannot be modified during the access anyway.
For this reason, introduce a "raw" routine for accessing device
QoS resume latency constraints without locking and use it in the
menu governor.
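A sketch of what such a raw accessor can look like, assuming the
power.qos layout used elsewhere in the PM QoS code:

    s32 dev_pm_qos_raw_read_value(struct device *dev)
    {
            return IS_ERR_OR_NULL(dev->power.qos) ?
                    0 : pm_qos_read_value(&dev->power.qos->resume_latency);
    }

No lock is taken because, as described above, the pointer cannot change
while the idle loop is reading it.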
Fixes: 9908859aca (cpuidle/menu: add per CPU PM QoS resume latency consideration)
Acked-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
They were never used in the kernel, so get rid of them.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reading an array at a given index before checking that the index is valid
results in an illegal memory access.
The bug was detected using KASAN framework.
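The general shape of the problem and fix, as an illustrative sketch
(the struct and identifier names are made up, not the actual driver code):

    struct entry { unsigned long freq; };

    /* The bounds check must come first so that short-circuit
     * evaluation prevents the out-of-bounds read of table[index]. */
    static bool entry_matches(const struct entry *table, size_t size,
                              size_t index, unsigned long freq)
    {
            return index < size && table[index].freq == freq;
    }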
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Once a subdomain is powered off, genpd queues a power off work for each of
the subdomain's corresponding masters, thus postponing the masters from
being powered off until a later point.
When genpd used intermediate power off states, which were removed in
commit ba2bbfbf63 ("PM / Domains: Remove intermediate states from the
power off sequence"), this behaviour made sense, but now it simply doesn't.
Genpd can easily try to power off the masters in the same context as the
subdomain, of course by acquiring/releasing the lock. Let's convert to
this behaviour, as it avoids unnecessary work from being queued.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The parameter name is_async for genpd_power_off() gives a poor description
of its purpose. To clarify, let's rename it to one_dev_on and update its
documentation in the function header.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subsequent changes to genpd_power_on() will make it invoke
genpd_power_off(). To enable those changes while avoiding a forward
declaration of genpd_power_off(), let's move its implementation above
genpd_power_on(). This way, the subsequent changes should become easier
to review.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pm-opp: (24 commits)
PM / OPP: Expose _of_get_opp_desc_node as dev_pm_opp API
PM / OPP: Make _find_opp_table_unlocked() static
PM / OPP: Update Documentation to remove RCU specific bits
PM / OPP: Simplify dev_pm_opp_get_max_volt_latency()
PM / OPP: Simplify _opp_set_availability()
PM / OPP: Move away from RCU locking
PM / OPP: Take kref from _find_opp_table()
PM / OPP: Update OPP users to put reference
PM / OPP: Add 'struct kref' to struct dev_pm_opp
PM / OPP: Use dev_pm_opp_get_opp_table() instead of _add_opp_table()
PM / OPP: Take reference of the OPP table while adding/removing OPPs
PM / OPP: Return opp_table from dev_pm_opp_set_*() routines
PM / OPP: Add 'struct kref' to OPP table
PM / OPP: Add per OPP table mutex
PM / OPP: Split out part of _add_opp_table() and _remove_opp_table()
PM / OPP: Don't expose srcu_head to register notifiers
PM / OPP: Rename dev_pm_opp_get_suspend_opp() and return OPP rate
PM / OPP: Don't allocate OPP table from _opp_allocate()
PM / OPP: Rename and split _dev_pm_opp_remove_table()
PM / OPP: Add light weight _opp_free() routine
...
There are two reasons for reporting wakeup event when dedicated wakeup
IRQ is triggered:
- wakeup events accounting, so proper statistical data will be
displayed in sysfs and debugfs;
- there is a small window while the system is entering suspend during
which the dedicated wakeup IRQ can be lost:
dpm_suspend_noirq()
|- device_wakeup_arm_wake_irqs()
|- dev_pm_arm_wake_irq(X)
|- IRQ is enabled and marked as wakeup source
[1]...
|- suspend_device_irqs()
|- suspend_device_irq(X)
|- irqd_set(X, IRQD_WAKEUP_ARMED);
|- wakeup IRQ armed
The wakeup IRQ can be lost if it's triggered at point [1]
and not armed yet.
Hence, fix the above cases by adding a simple pm_wakeup_event() call in
handle_threaded_wake_irq().
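A sketch of the resulting handler (hedged; details may differ from the
actual patch):

    static irqreturn_t handle_threaded_wake_irq(int irq, void *_wirq)
    {
            struct wake_irq *wirq = _wirq;
            int res;

            /* Maybe abort suspend? */
            if (irqd_is_wakeup_set(irq_get_irq_data(irq))) {
                    pm_wakeup_event(wirq->dev, 0);
                    return IRQ_HANDLED;
            }

            /* We don't want RPM_ASYNC or RPM_NOWAIT here */
            res = pm_runtime_resume(wirq->dev);
            if (res < 0)
                    dev_warn(wirq->dev, "wake IRQ resume failed: %d\n", res);

            return IRQ_HANDLED;
    }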
Fixes: 4990d4fe32 (PM / Wakeirq: Add automated device wake IRQ handling)
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
[ tony@atomide.com: added missing return to avoid warnings ]
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Dedicated wakeirq is a one time event to wake-up the system from
low-power state and then call pm_runtime_resume() on the device wired
with the dedicated wakeirq.
Sometimes dedicated wakeirqs can get deferred if they trigger after we
call disable_irq_nosync() in dev_pm_disable_wake_irq(). This can happen
if pm_runtime_get() is called around the same time a wakeirq fires.
If an interrupt fires after disable_irq_nosync(), by default it will get
tagged with IRQS_PENDING and will run later on when the interrupt is
enabled again.
Deferred wakeirqs usually just produce pointless wake-up events. But they
can also cause suspend to fail if the deferred wakeirq fires during
dpm_suspend_noirq() for example. So we really don't want to see the
deferred wakeirqs triggering after the device has resumed.
Let's fix the issue by setting IRQ_DISABLE_UNLAZY flag for the dedicated
wakeirqs. The other option would be to implement irq_disable() in the
dedicated wakeirq controller, but that's not a generic solution.
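Setting the flag is a one-liner at wakeirq setup time; a sketch, using the
generic IRQ API:

    /* Mask the wakeirq at the hardware level right away in
     * disable_irq_nosync() instead of lazily via IRQS_PENDING. */
    irq_set_status_flags(irq, IRQ_DISABLE_UNLAZY);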
For reference, below is what happens with an IRQ_TYPE_EDGE_BOTH type
wakeirq:
- resume by dedicated IRQ (EDGE_FALLING)
- suspend_enter()
....
- arch_suspend_enable_irqs()
|- dedicated IRQ armed and fired
|- irq_pm_check_wakeup()
|- disarm, disable IRQ and mark as IRQS_PENDING
....
- dpm_resume_noirq()
|- resume_device_irqs()
|- __enable_irq()
|- check_irq_resend()
|- handle_threaded_wake_irq()
|- dedicated IRQ processed
|- device_wakeup_disarm_wake_irqs()
|- disable_irq_wake()
....
!-> dedicated IRQ (EDGE_RISING)
-| handle_edge_irq()
|- IRQ disabled: mask_ack_irq and mark as IRQS_PENDING
....
- subsequent suspend
....
|- dpm_suspend_noirq()
|- device_wakeup_arm_wake_irqs()
|- __enable_irq()
|- check_irq_resend()
(a) |- handle_threaded_wake_irq()
|- pm_wakeup_event() --> abort suspend
....
|- suspend_device_irqs()
|- suspend_device_irq()
|- dedicated IRQ armed
....
(b) |- resend_irqs
|- irq_pm_check_wakeup()
|- IRQ armed -> abort suspend
Because of the pending IRQ, system suspend can be aborted either at
point (a) (IRQ not armed) or at point (b) (IRQ armed).
Fixes: 4990d4fe32 (PM / Wakeirq: Add automated device wake IRQ handling)
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
[ tony@atomide.com: added a comment, updated the description ]
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We currently rely on runtime PM to enable dedicated wakeirq for suspend.
This assumption fails in the following two cases:
1. If the consumer driver does not have runtime PM implemented, the
dedicated wakeirq never gets enabled for suspend
2. If the consumer driver has runtime PM implemented, but does not idle
in suspend
Let's fix the issue by always enabling the dedicated wakeirq during
suspend.
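A sketch of the idea (not necessarily the exact patch; the
WAKE_IRQ_DEDICATED_ALLOCATED status bit is assumed from the earlier
dedicated wakeirq fix noted below):

    void dev_pm_arm_wake_irq(struct wake_irq *wirq)
    {
            if (!wirq)
                    return;

            if (device_may_wakeup(wirq->dev)) {
                    /* Enable the dedicated wakeirq for suspend even if
                     * runtime PM never got around to enabling it. */
                    if (wirq->status & WAKE_IRQ_DEDICATED_ALLOCATED &&
                        !pm_runtime_status_suspended(wirq->dev))
                            enable_irq(wirq->irq);

                    enable_irq_wake(wirq->irq);
            }
    }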
Depends-on: bed570307e (PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend)
Fixes: 4990d4fe32 (PM / Wakeirq: Add automated device wake IRQ handling)
Reported-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
[ tony@atomide.com: updated based on bed570307e, added description ]
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rename _of_get_opp_desc_node to dev_pm_opp_of_get_opp_desc_node and add it
to include/linux/pm_opp.h to allow other drivers, such as platform OPP
and cpufreq drivers, to make use of it.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As the PM core may invoke the *noirq() callbacks asynchronously, the
current lock-less approach in genpd doesn't work. The consequence is that
we may find concurrent operations racing to power on/off the PM domain.
As of now, no immediate errors have been reported, but it's probably only a
matter of time. Therefore, let's fix the problem now, before this becomes a
real issue, by deploying the locking scheme in the relevant functions.
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The earlier comment stated that the dev_warn_once() message was going to be
printed once per device. Let's fix that, as dev_warn_once() prints only
once, regardless of the device.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fixes the following sparse warning:
drivers/base/power/opp/core.c:49:18: warning:
symbol '_find_opp_table_unlocked' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The might_sleep_if() assertions in __pm_runtime_idle(),
__pm_runtime_suspend() and __pm_runtime_resume() may generate
false-positive warnings in some situations. For example, that
happens if a nested pm_runtime_get_sync()/pm_runtime_put() pair
is executed with disabled interrupts within an outer
pm_runtime_get_sync()/pm_runtime_put() section for the same device.
[Generally, pm_runtime_get_sync() may sleep, so it should not be
called with disabled interrupts, but in this particular case the
previous pm_runtime_get_sync() guarantees that the device will not
be suspended, so the inner pm_runtime_get_sync() will return
immediately after incrementing the device's usage counter.]
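In other words, a pattern like the following is correct but used to trip
the assertion ('dev' and 'lock' are placeholders):

    pm_runtime_get_sync(dev);       /* outer: device becomes RPM_ACTIVE */

    spin_lock_irq(&lock);           /* interrupts now disabled */
    pm_runtime_get_sync(dev);       /* inner: only bumps the usage count */
    pm_runtime_put(dev);
    spin_unlock_irq(&lock);

    pm_runtime_put(dev);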
That started to happen in the i915 driver in 4.10-rc, leading to
the following splat:
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1032
in_atomic(): 1, irqs_disabled(): 0, pid: 1500, name: Xorg
1 lock held by Xorg/1500:
#0: (&dev->struct_mutex){+.+.+.}, at:
[<ffffffffa0680c13>] i915_mutex_lock_interruptible+0x43/0x140 [i915]
CPU: 0 PID: 1500 Comm: Xorg Not tainted
Call Trace:
dump_stack+0x85/0xc2
___might_sleep+0x196/0x260
__might_sleep+0x53/0xb0
__pm_runtime_resume+0x7a/0x90
intel_runtime_pm_get+0x25/0x90 [i915]
aliasing_gtt_bind_vma+0xaa/0xf0 [i915]
i915_vma_bind+0xaf/0x1e0 [i915]
i915_gem_execbuffer_relocate_entry+0x513/0x6f0 [i915]
i915_gem_execbuffer_relocate_vma.isra.34+0x188/0x250 [i915]
? trace_hardirqs_on+0xd/0x10
? i915_gem_execbuffer_reserve_vma.isra.31+0x152/0x1f0 [i915]
? i915_gem_execbuffer_reserve.isra.32+0x372/0x3a0 [i915]
i915_gem_do_execbuffer.isra.38+0xa70/0x1a40 [i915]
? __might_fault+0x4e/0xb0
i915_gem_execbuffer2+0xc5/0x260 [i915]
? __might_fault+0x4e/0xb0
drm_ioctl+0x206/0x450 [drm]
? i915_gem_execbuffer+0x340/0x340 [i915]
? __fget+0x5/0x200
do_vfs_ioctl+0x91/0x6f0
? __fget+0x111/0x200
? __fget+0x5/0x200
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x23/0xc6
even though the code triggering it is correct.
Unfortunately, the might_sleep_if() assertions in question are
too coarse-grained to cover such cases correctly, so make them
a bit less sensitive in order to avoid the false-positives.
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
dev_pm_opp_get_max_volt_latency() effectively calls _find_opp_table()
twice.
Merge _get_regulator_count() into dev_pm_opp_get_max_volt_latency() to
avoid that.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As we don't use RCU locking anymore, there is no need to replace an
earlier OPP node with a new one. Just update the existing one.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
RCU locking isn't well suited to the OPP core. RCU fits reader-heavy
workloads best, while the OPP core has at most one or two readers at a
time.
On top of that, the way RCU locking was used with the OPP core was getting
very confusing. The individual OPPs are mostly handled correctly, i.e. for
an update a new structure is created and then replaces the older one. But
the OPP tables were updated in place all the time from various parts of
the core. Though they were mostly used from within RCU-locked regions,
they didn't have much to do with RCU and were governed by the mutex
instead.
That, mixed with the 'opp_table_lock', has made the core even more
confusing.
Now that we are already managing the OPPs and the OPP tables with kernel
reference infrastructure, we can get rid of RCU locking completely and
simplify the code a lot.
Remove all RCU references from code and comments.
Acquire opp_table->lock while parsing the list of OPPs though.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Take reference of the OPP table from within _find_opp_table(). Also
update the callers of _find_opp_table() to call
dev_pm_opp_put_opp_table() after they have used the OPP table.
Note that _find_opp_table() increments the reference under the
opp_table_lock.
Now that the OPP table wouldn't get freed until the callers of
_find_opp_table() call dev_pm_opp_put_opp_table(), there is no need to
take the opp_table_lock or rcu_read_lock() around it. Drop them.
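The pattern the callers now follow looks like this (sketch):

    struct opp_table *opp_table;

    opp_table = _find_opp_table(dev);       /* takes a reference internally */
    if (IS_ERR(opp_table))
            return PTR_ERR(opp_table);

    /* ... use opp_table ... */

    dev_pm_opp_put_opp_table(opp_table);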
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch updates dev_pm_opp_find_freq_*() routines to get a reference
to the OPPs returned by them.
Also updates the users of dev_pm_opp_find_freq_*() routines to call
dev_pm_opp_put() after they are done using the OPPs.
As it is guaranteed that the OPPs wouldn't get freed while being used,
the RCU read-side locking present in the users isn't required anymore.
Drop it as well.
This patch also updates all users of devfreq_recommended_opp() which was
returning an OPP received from the OPP core.
Note that some of the OPP core routines have gained
rcu_read_{lock|unlock}() calls, as those still use RCU specific APIs
within them.
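A sketch of an updated caller:

    struct dev_pm_opp *opp;
    unsigned long volt;

    opp = dev_pm_opp_find_freq_ceil(dev, &freq);
    if (IS_ERR(opp))
            return PTR_ERR(opp);

    volt = dev_pm_opp_get_voltage(opp);
    dev_pm_opp_put(opp);    /* drop the reference taken by the find routine */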
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> [Devfreq]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add kref to struct dev_pm_opp for easier accounting of the OPPs.
Note that the OPPs are freed under the opp_table->lock mutex only.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Migrate all users of _add_opp_table() to use dev_pm_opp_get_opp_table()
to guarantee that the OPP table doesn't get freed while being used.
Also update _managed_opp() to get the reference to the OPP table.
Now that the OPP table wouldn't get freed while these routines are
executing after dev_pm_opp_get_opp_table() is called, there is no need
to take opp_table_lock. Drop them as well.
Now that _add_opp_table(), _remove_opp_table() and the unlocked release
routines aren't used anymore, remove them.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Take reference of the OPP table while adding and removing OPPs, that
helps us remove special checks in _remove_opp_table().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Now that we have proper kernel reference infrastructure in place for OPP
tables, use it to guarantee that the OPP table isn't freed while being
used by the callers of dev_pm_opp_set_*() APIs.
Make them all return the pointer to the OPP table after taking its
reference and put the reference back with dev_pm_opp_put_*() APIs.
Now that the OPP table wouldn't get freed while these routines are
executing after dev_pm_opp_get_opp_table() is called, there is no need
to take opp_table_lock. Drop them as well.
Remove the rcu specific comments from these routines as they aren't
relevant anymore.
Note that prototypes of dev_pm_opp_{set|put}_regulators() were already
updated by another patch.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add kref to struct opp_table for easier accounting of the OPP table.
Note that the new routine dev_pm_opp_get_opp_table() takes the reference
from under the opp_table_lock, which guarantees that the OPP table
doesn't get freed unless dev_pm_opp_put_opp_table() is called for the
OPP table.
Two separate release mechanisms are added: locked and unlocked. In the
unlocked version the routines aren't required to take/drop the
opp_table_lock, as the callers have already done that. This is required
to avoid breaking git bisect; otherwise we may get lockdep warnings
between commits. Once all the users of the OPP table are updated, the
unlocked version shall be removed.
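A simplified sketch of the locked path (allocation of a new table on a
miss is omitted here):

    struct opp_table *dev_pm_opp_get_opp_table(struct device *dev)
    {
            struct opp_table *opp_table;

            mutex_lock(&opp_table_lock);

            opp_table = _find_opp_table_unlocked(dev);
            if (!IS_ERR(opp_table))
                    kref_get(&opp_table->kref);

            mutex_unlock(&opp_table_lock);

            return opp_table;
    }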
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add per OPP table lock to protect opp_table->opp_list.
Note that in a few places opp_list is used under rcu_read_lock(), so a
mutex can't be added there for now. This will be fixed by a later
patch.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Split out parts of _add_opp_table() and _remove_opp_table() into
separate routines. This improves readability as well.
Should result in no functional changes.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Let the OPP core provide helpers to register notifiers for any device,
instead of exposing srcu_head outside of the core.
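The new helpers take the usual notifier_block, roughly:

    int dev_pm_opp_register_notifier(struct device *dev,
                                     struct notifier_block *nb);
    int dev_pm_opp_unregister_notifier(struct device *dev,
                                       struct notifier_block *nb);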
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is only one user of dev_pm_opp_get_suspend_opp() and that uses it
to get the OPP rate for the suspend_opp.
Rename dev_pm_opp_get_suspend_opp() as dev_pm_opp_get_suspend_opp_freq()
and return the rate directly from it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is no point in trying to find/allocate the table for every OPP
that is added for a device. It would be far more efficient to allocate
the table only once and pass its pointer to the routines that add the
OPP entry.
Locking is removed from _opp_add_static_v2() and _opp_add_v1() now as
the callers call them with that lock already held.
The call to the _remove_opp_table() routine is also removed from
_opp_free() now, as the opp_table isn't allocated from within
_opp_allocate() anymore. This is handled by the routines which created
the OPP table in the first place.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Later patches will want to remove an OPP table (and its OPPs) using the
opp_table pointer instead of 'dev'.
In order to prepare for that, rename _dev_pm_opp_remove_table() to
_dev_pm_opp_find_and_remove_table() and split out part of it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The OPPs which are never successfully added using _opp_add() don't need
to be freed with the _opp_remove() routine, as a simple kfree() is
enough for them.
Introduce a new lightweight routine, _opp_free(), which does that.
That also lets us remove the 'notify' parameter of _opp_remove(),
which isn't required anymore.
Note that _opp_free() contains a call to _remove_opp_table() as the OPP
table might have been added for this very OPP only. The
_remove_opp_table() routine returns quickly if there are more OPPs in
the table. This will be simplified in later patches though.
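A sketch of the new routine (field names assumed):

    static void _opp_free(struct dev_pm_opp *opp)
    {
            struct opp_table *opp_table = opp->opp_table;

            kfree(opp);
            /* Returns quickly if there are more OPPs in the table. */
            _remove_opp_table(opp_table);
    }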
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The code adding static OPPs for V2 bindings already does so. Make the V1
bindings specific code behave the same.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Make the naming consistent with how other routines are named.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This TODO doesn't make sense anymore as we have all the information in a
single OPP table. Remove it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are two types of duplicate OPPs that get different behavior from
the core:
A) An earlier OPP is marked 'available' and has the same freq/voltages
as the new one.
B) An earlier OPP has the same frequency, but is marked 'unavailable' OR
doesn't have the same voltages as the new one.
The OPP core returns 0 for the first case, but -EEXIST for the second.
While the OPP core returns 0 for the first case, its callers don't free
the newly allocated OPP structure, which isn't used anymore. Fix that by
returning -EBUSY instead of 0, but make the callers return 0 eventually.
As this isn't a critical fix, it's not being marked for stable kernels.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently the mix of genpd_poweron(), genpd_power_on(),
genpd_sync_poweron() and the ->power_on() callback makes it
a bit difficult to follow the path of execution. The same
applies to the functions dealing with power off.
To improve this understanding, let's do the following renaming:
genpd_power_on() -> _genpd_power_on()
genpd_poweron() -> genpd_power_on()
genpd_sync_poweron() -> genpd_sync_power_on()
genpd_power_off() -> _genpd_power_off()
genpd_poweroff() -> genpd_power_off()
genpd_sync_poweroff() -> genpd_sync_power_off()
genpd_poweroff_unused() -> genpd_power_off_unused()
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch fixes the following gcc warning:
drivers/base/power/domain.c: In function ‘genpd_runtime_resume’:
drivers/base/power/domain.c:642:14: warning: ‘time_start’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
The same problem (in another function in this same file) was fixed in
commit d33d5a6c88 (avoid spurious "may be used uninitialized" warning)
Signed-off-by: Augusto Mecking Caringi <augustocaringi@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The timer type simplifications caused a new gcc warning:
drivers/base/power/domain.c: In function ‘genpd_runtime_suspend’:
drivers/base/power/domain.c:562:14: warning: ‘time_start’ may be used uninitialized in this function [-Wmaybe-uninitialized]
elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
despite the actual use of "time_start" not having changed in any way.
It appears that simply changing the type of ktime_t from a union to a
plain scalar type made gcc check the use.
The variable wasn't actually used uninitialized, but gcc apparently
failed to notice that the conditional around the use was exactly the
same as the conditional around the initialization of that variable.
Add an unnecessary initialization just to shut up the compiler.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where the seconds and nanoseconds parts of a time
value need to be converted. For anything where the seconds argument is 0,
this is pointless and can be replaced with a simple assignment.
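For example:

    /* before */
    kt = ktime_set(0, delta_ns);

    /* after: with a zero seconds part this is just the nanoseconds value */
    kt = delta_ns;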
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
ktime is a union because the initial implementation stored the time in
scalar nanoseconds on 64-bit machines and in an endianness-optimized
timespec variant on 32-bit machines. The Y2038 cleanup removed the
timespec variant and switched everything to scalar nanoseconds. The union
remained, but became completely pointless.
Get rid of the union and just keep ktime_t as simple typedef of type s64.
The conversion was done with coccinelle and some manual mopping up.
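The end result is simply:

    typedef s64 ktime_t;    /* scalar nanoseconds */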
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Merge tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here's the new driver core patches for 4.10-rc1.
Big thing here is the nice addition of "functional dependencies" to
the driver core. The idea has been talked about for a very long time,
great job to Rafael for stepping up and implementing it. It's been
tested for longer than the 4.9-rc1 date, we held off on merging it
earlier in order to feel more comfortable about it.
Other than that, it's just a handful of small other patches, some good
cleanups to the mess that is the firmware class code, and we have a
test driver for the deferred probe logic.
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (30 commits)
firmware: Correct handling of fw_state_wait() return value
driver core: Silence device links sphinx warning
firmware: remove warning at documentation generation time
drivers: base: dma-mapping: Fix typo in dmam_alloc_non_coherent comments
driver core: test_async: fix up typo found by 0-day
firmware: move fw_state_is_done() into UHM section
firmware: do not use fw_lock for fw_state protection
firmware: drop bit ops in favor of simple state machine
firmware: refactor loading status
firmware: fix usermode helper fallback loading
driver core: firmware_class: convert to use class_groups
driver core: devcoredump: convert to use class_groups
driver core: class: add class_groups support
kernfs: Declare two local data structures static
driver-core: fix platform_no_drv_owner.cocci warnings
drivers/base/memory.c: Remove unused 'first_page' variable
driver core: add CLASS_ATTR_WO()
drivers: base: cacheinfo: support DT overrides for cache properties
drivers: base: cacheinfo: add pr_fmt logging
drivers: base: cacheinfo: fix boot error message when acpi is enabled
...
Merge tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"Again, cpufreq gets more changes than the other parts this time (one
new driver, one old driver less, a bunch of enhancements of the
existing code, new CPU IDs, fixes, cleanups)
There also are some changes in cpuidle (idle injection rework, a
couple of new CPU IDs, online/offline rework in intel_idle, fixes and
cleanups), in the generic power domains framework (mostly related to
supporting power domains containing CPUs), and in the Operating
Performance Points (OPP) library (mostly related to supporting devices
with multiple voltage regulators)
In addition to that, the system sleep state selection interface is
modified to make it easier for distributions with unchanged user space
to support suspend-to-idle as the default system suspend method, some
issues are fixed in the PM core, the latency tolerance PM QoS
framework is improved a bit, the Intel RAPL power capping driver is
cleaned up and there are some fixes and cleanups in the devfreq
subsystem
Specifics:
- New cpufreq driver for Broadcom STB SoCs and a Device Tree binding
for it (Markus Mayer)
- Support for ARM Integrator/AP and Integrator/CP in the generic DT
cpufreq driver and elimination of the old Integrator cpufreq driver
(Linus Walleij)
- Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier,
and PXA SoCs in the generic DT cpufreq driver (Baoyou Xie,
Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik)
- cpufreq core fix to eliminate races that may lead to using inactive
policy objects and related cleanups (Rafael Wysocki)
- cpufreq schedutil governor update to make it use SCHED_FIFO kernel
threads (instead of regular workqueues) for doing delayed work (to
reduce the response latency in some cases) and related cleanups
(Viresh Kumar)
- New cpufreq sysfs attribute for resetting statistics (Markus Mayer)
- cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis,
Viresh Kumar)
- Support for using generic cpufreq governors in the intel_pstate
driver (Rafael Wysocki)
- Support for per-logical-CPU P-state limits and the EPP/EPB (Energy
Performance Preference/Energy Performance Bias) knobs in the
intel_pstate driver (Srinivas Pandruvada)
- New CPU ID for Knights Mill in intel_pstate (Piotr Luc)
- intel_pstate driver modification to use the P-state selection
algorithm based on CPU load on platforms with the system profile in
the ACPI tables set to "mobile" (Srinivas Pandruvada)
- intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki,
Srinivas Pandruvada)
- cpufreq powernv driver updates including fast switching support
(for the schedutil governor), fixes and cleanups (Akshay Adiga,
Andrew Donnellan, Denis Kirjanov)
- acpi-cpufreq driver rework to switch it over to the new CPU
offline/online state machine (Sebastian Andrzej Siewior)
- Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth
Prakash)
- Idle injection rework (to make it use the regular idle path instead
of a home-grown custom one) and related powerclamp thermal driver
updates (Peter Zijlstra, Jacob Pan, Petr Mladek, Sebastian Andrzej
Siewior)
- New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy
Shevchenko, Piotr Luc)
- intel_idle driver cleanups and switch over to using the new CPU
offline/online state machine (Anna-Maria Gleixner, Sebastian
Andrzej Siewior)
- cpuidle DT driver update to support suspend-to-idle properly
(Sudeep Holla)
- cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian,
Rafael Wysocki)
- Preliminary support for power domains including CPUs in the generic
power domains (genpd) framework and related DT bindings (Lina Iyer)
- Assorted fixes and cleanups in the generic power domains (genpd)
framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven)
- Preliminary support for devices with multiple voltage regulators
and related fixes and cleanups in the Operating Performance Points
(OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd)
- System sleep state selection interface rework to make it easier to
support suspend-to-idle as the default system suspend method
(Rafael Wysocki)
- PM core fixes and cleanups, mostly related to the interactions
between the system suspend and runtime PM frameworks (Ulf Hansson,
Sahitya Tummala, Tony Lindgren)
- Latency tolerance PM QoS framework improvements (Andrew Lutomirski)
- New Knights Mill CPU ID for the Intel RAPL power capping driver
(Piotr Luc)
- Intel RAPL power capping driver fixes, cleanups and switch over to
using the new CPU offline/online state machine (Jacob Pan, Thomas
Gleixner, Sebastian Andrzej Siewior)
- Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc,
rockchip-dfi devfreq drivers and the devfreq core (Axel Lin,
Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh Kumar)
- Fix for false-positive KASAN warnings during resume from ACPI S3
(suspend-to-RAM) on x86 (Josh Poimboeuf)
- Memory map verification during resume from hibernation on x86 to
ensure a consistent address space layout (Chen Yu)
- Wakeup sources debugging enhancement (Xing Wei)
- rockchip-io AVS driver cleanup (Shawn Lin)"
* tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (127 commits)
devfreq: rk3399_dmc: Don't use OPP structures outside of RCU locks
devfreq: rk3399_dmc: Remove dangling rcu_read_unlock()
devfreq: exynos: Don't use OPP structures outside of RCU locks
Documentation: intel_pstate: Document HWP energy/performance hints
cpufreq: intel_pstate: Support for energy performance hints with HWP
cpufreq: intel_pstate: Add locking around HWP requests
PM / sleep: Print active wakeup sources when blocking on wakeup_count reads
PM / core: Fix bug in the error handling of async suspend
PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend
PM / Domains: Fix compatible for domain idle state
PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators()
PM / OPP: Allow platform specific custom set_opp() callbacks
PM / OPP: Separate out _generic_set_opp()
PM / OPP: Add infrastructure to manage multiple regulators
PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage()
PM / OPP: Manage supply's voltage/current in a separate structure
PM / OPP: Don't use OPP structure outside of rcu protected section
PM / OPP: Reword binding supporting multiple regulators per device
PM / OPP: Fix incorrect cpu-supply property in binding
cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state()
...
Pull timer updates from Thomas Gleixner:
"The time/timekeeping/timer folks deliver with this update:
- Fix a reintroduced signed/unsigned issue and clean up the whole
signed/unsigned mess in the timekeeping core so this won't happen
accidentally again.
- Add a new trace clock based on boot time
- Prevent injection of random sleep times when PM tracing abuses the
RTC for storage
- Make posix timers configurable for real tiny systems
- Add tracepoints for the alarm timer subsystem so timer based
suspend wakeups can be instrumented
- The usual pile of fixes and updates to core and drivers"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
timekeeping: Use mul_u64_u32_shr() instead of open coding it
timekeeping: Get rid of pointless typecasts
timekeeping: Make the conversion call chain consistently unsigned
timekeeping: Force unsigned clocksource to nanoseconds conversion
alarmtimer: Add tracepoints for alarm timers
trace: Update documentation for mono, mono_raw and boot clock
trace: Add an option for boot clock as trace clock
timekeeping: Add a fast and NMI safe boot clock
timekeeping/clocksource_cyc2ns: Document intended range limitation
timekeeping: Ignore the bogus sleep time if pm_trace is enabled
selftests/timers: Fix spelling mistake "Asyncrhonous" -> "Asynchronous"
clocksource/drivers/bcm2835_timer: Unmap region obtained by of_iomap
clocksource/drivers/arm_arch_timer: Map frame with of_io_request_and_map()
arm64: dts: rockchip: Arch counter doesn't tick in system suspend
clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend
posix-timers: Make them configurable
posix_cpu_timers: Move the add_device_randomness() call to a proper place
timer: Move sys_alarm from timer.c to itimer.c
ptp_clock: Allow for it to be optional
Kconfig: Regenerate *.c_shipped files after previous changes
...
* pm-core:
PM / core: Fix bug in the error handling of async suspend
PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend
PM / Runtime: Defer resuming of the device in pm_runtime_force_resume()
PM / Runtime: Don't allow to suspend a device with an active child
net: smsc911x: Synchronize the runtime PM status during system suspend
PM / Runtime: Convert pm_runtime_set_suspended() to return an int
PM / Runtime: Clarify comment in rpm_resume() when resuming the parent
PM / Runtime: Remove the exported function pm_children_suspended()
* pm-qos:
PM / QoS: Export dev_pm_qos_update_user_latency_tolerance
PM / QoS: Fix writing 'auto' to pm_qos_latency_tolerance_us
PM / QoS: Improve sysfs pm_qos_latency_tolerance validation
* pm-avs:
PM / AVS: rockchip-io: make the log more consistent
* pm-domains:
PM / Domains: Fix compatible for domain idle state
PM / Domains: Do not print PM domain add error message if EPROBE_DEFER
PM / Domains: Fix a warning message
PM / Domains: check for negative return from of_count_phandle_with_args()
PM / doc: Update device documentation for devices in IRQ-safe PM domains
PM / Domains: Support IRQ safe PM domains
PM / Domains: Abstract genpd locking
dt/bindings / PM/Domains: Update binding for PM domain idle states
PM / Domains: Save the fwnode in genpd_power_state
PM / Domains: Allow domain power states to be read from DT
PM / Domains: Add residency property to genpd states
PM / Domains: Make genpd state allocation dynamic
Conflicts:
arch/arm/mach-imx/gpc.c
If there are any wakeup events being processed, a read operation
on /sys/power/wakeup_count will be blocked, so print the names
of all active wakeup sources to help find out what is preventing
system suspend from triggering.
While at it change pr_info() in pm_print_active_wakeup_sources()
to pr_debug() to avoid excessive log noise.
Signed-off-by: xing wei <xing.wei@intel.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If async_suspend is enabled for parent and child devices, then the
PM framework has to ensure that the parent's async suspend gets called
only after the child's async suspend is done. If the child's async
suspend fails with an error, then the parent's async suspend must not
be invoked. The current code uses async_error to ensure this, but there
is a problem with it in __device_suspend(). This function notifies
the completion of the child's async suspend before updating its error
via the async_error variable. As a result, the parent's async suspend
gets invoked even though its child's suspend has failed. Fix this bug
by updating async_error before notifying the child's completion.
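A sketch of the reordering in __device_suspend():

    if (error)
            async_error = error;            /* record the failure first ... */

    complete_all(&dev->power.completion);   /* ... then wake waiting parents */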
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
[ rjw: Rearranged whitespace ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
I noticed some wakeirq flakiness with consumer drivers not using
autosuspend. For drivers not using autosuspend, the wakeirq may never
get unmasked in rpm_suspend() because of irq desc->depth.
We are configuring dedicated wakeirqs to start with IRQ_NOAUTOEN as we
naturally don't want them running until rpm_suspend() is called.
However, when a consumer driver initially calls pm_runtime_get(), we
now wrongly start with a disable_irq_nosync() call on the dedicated
wakeirq that is disabled to begin with.
This causes desc->depth to toggle between 1 and 2 instead of the usual
0 and 1. This can prevent enable_irq() from unmasking the wakeirq as
that only happens at desc->depth 1.
This does not necessarily show up with drivers using autosuspend as
there is time for disable_irq_nosync() before rpm_suspend() gets called
after the autosuspend timeout.
Let's fix the issue by adding wirq->status that lazily gets set on
the first rpm_suspend(). We also need PM runtime core private functions
for dev_pm_enable_wake_irq_check() and dev_pm_disable_wake_irq_check()
so we can enable the dedicated wakeirq on the first rpm_suspend().
While at it, let's also fix the comments for dev_pm_enable_wake_irq()
and dev_pm_disable_wake_irq(). Those can still be used by the consumer
drivers as needed because the IRQ core manages the interrupt usecount
for us.
Fixes: 4990d4fe32 (PM / Wakeirq: Add automated device wake IRQ handling)
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Re-using the idle state definition provided by arm,idle-state for domain
idle states creates a lot of confusion and limits further evolution of
the domain idle definition. To keep things clear and simple, define
idle states for domains using a new compatible, "domain-idle-state".
Fix existing PM domains code to look for the newly defined compatible.
Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If a platform specific OPP driver has called this routine first and set
the regulators, then the second call from cpufreq-dt driver will hit the
WARN_ON(). Remove the WARN_ON(), but continue to return error in such
cases.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The generic set_opp() handler isn't sufficient for platforms with
complex DVFS. For example, some TI platforms have multiple regulators
for a CPU device. The order in which various supplies need to be
programmed is only known to the platform code and its best to leave it
to it.
This patch implements APIs to register platform specific set_opp()
callback.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Later patches would add support for custom set_opp() callbacks. This
patch separates out the code for _generic_set_opp() handler in order to
prepare for that.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch adds infrastructure to manage multiple regulators and updates
the only user (cpufreq-dt) of dev_pm_opp_{set|put}_regulator().
This is preparatory work for adding full support for devices with
multiple regulators.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pass the entire supply structure instead of all of its fields.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This is a preparatory step for multiple regulator per device support.
Move the voltage/current variables to a new structure.
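Roughly, the new structure groups the per-supply values like this
(a sketch; field names assumed):

    struct dev_pm_opp_supply {
            unsigned long u_volt;           /* target voltage in uV */
            unsigned long u_volt_min;
            unsigned long u_volt_max;
            unsigned long u_amp;            /* maximum current in uA */
    };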
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The OPP structure must not be used outside of the RCU-protected section.
Cache the values to be used in separate variables instead.
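The pattern looks like this (sketch):

    rcu_read_lock();

    opp = dev_pm_opp_find_freq_ceil(dev, &freq);
    if (IS_ERR(opp)) {
            rcu_read_unlock();
            return PTR_ERR(opp);
    }
    volt = dev_pm_opp_get_voltage(opp);     /* cache while 'opp' is valid */

    rcu_read_unlock();

    /* only 'freq' and 'volt' are used from here on, never 'opp' */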
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Joonyoung Shim reported an interesting problem on his ARM octa-core
Odroid-XU3 platform. During system suspend, dev_pm_opp_put_regulator()
was failing for a struct device for which dev_pm_opp_set_regulator() was
called earlier.
This happened because an earlier call to
dev_pm_opp_of_cpumask_remove_table() function (from cpufreq-dt.c file)
removed all the entries from opp_table->dev_list apart from the last CPU
device in the cpumask of CPUs sharing the OPP.
But both dev_pm_opp_set_regulator() and dev_pm_opp_put_regulator()
routines get CPU device for the first CPU in the cpumask. And so the OPP
core failed to find the OPP table for the struct device.
This patch attempts to fix this problem by returning a pointer to the
opp_table from dev_pm_opp_set_regulator() and using that as the
parameter to dev_pm_opp_put_regulator(). This ensures that the
dev_pm_opp_put_regulator() doesn't fail to find the opp table.
Note that similar design problem also exists with other
dev_pm_opp_put_*() APIs, but those aren't used currently by anyone and
so we don't need to update them for now.
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
Reported-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ Viresh: Wrote commit log and tested on exynos 5250 ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
EPROBE_DEFER is not an error, hence printing an error message like
renesas_irqc e61c0000.interrupt-controller: failed to add to PM domain always-on: -517
may confuse the user.
Suppress the error message in case of EPROBE_DEFER to fix this.
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
nvme wants a module parameter that overrides the default latency
tolerance. This makes it easy for nvme to reflect that default in
sysfs.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If it was already 'auto', then writing 'auto' again would
incorrectly fail.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Negative values are special. Don't let users write them directly.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Power management suspend/resume tracing (ab)uses the RTC to store
suspend/resume information persistently. As a consequence the RTC value is
clobbered when timekeeping is resumed and tries to inject the sleep time.
Commit a4f8f6667f ("timekeeping: Cap array access in timekeeping_debug")
plugged an out-of-bounds array access in the timekeeping debug code which
was caused by the clobbered RTC value, but we still use the clobbered RTC
value for sleep time injection into kernel timekeeping, which will result
in random adjustments depending on the stored "hash" value.
To prevent this, keep track of the RTC clobbering and ignore the invalid
RTC timestamp at resume. If the system resumed successfully, clear the
flag which marks the RTC as unusable, warn the user about the RTC clobber
and recommend adjusting the RTC with 'ntpdate' or 'rdate'.
[jstultz: Fixed up pr_warn formatting, and implemented suggestions from Ingo]
[ tglx: Rewrote changelog ]
Originally-from: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Xunlei Pang <xlpang@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1480372524-15181-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When the pm_runtime_force_suspend|resume() helpers were invented, we still
had CONFIG_PM_RUNTIME and CONFIG_PM_SLEEP as separate Kconfig options.
To make sure these helpers worked for all combinations and without
introducing too much complexity, the device was always resumed in
pm_runtime_force_resume().
More precisely, when CONFIG_PM_SLEEP was set and CONFIG_PM_RUNTIME was
unset, we needed to resume the device as the subsystem/driver couldn't
rely on using runtime PM to do it.
As the CONFIG_PM_RUNTIME option was merged into CONFIG_PM a while ago,
this combination, using CONFIG_PM_SLEEP without the earlier
CONFIG_PM_RUNTIME, no longer exists.
For this reason we can now rely on the subsystem/driver to use runtime PM
to resume the device, instead of forcing that to be done in all cases. In
other words, let's defer the runtime resume to a later point when it's
actually needed.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The first argument of WARN() is the condition, followed by the message.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Consider two devices, A and B, where B is a child of A, and B utilizes
asynchronous suspend (it does not matter whether A is sync or async). If
B fails to suspend_noirq() or suspend_late(), or is interrupted by a
wakeup (pm_wakeup_pending()), then it aborts and sets the async_error
variable. However, device A does not (immediately) check the async_error
variable; it may continue to run its own suspend_noirq()/suspend_late()
callback. This is bad.
We can resolve this problem by doing our error and wakeup checking
(particularly, for the async_error flag) after waiting for children to
suspend, instead of before. This also helps align the logic for the noirq and
late suspend cases with the logic in __device_suspend().
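Schematically, the fix moves the checks in the noirq/late suspend path
from before the wait to after it (a simplified sketch, not the literal
diff):

    /* Before (simplified): an error recorded by an async child
     * can be missed, because the check happens before waiting.
     */
    if (async_error)
            goto Complete;
    dpm_wait_for_children(dev, async);

    /* After: wait for the children first, then honor any error
     * or pending wakeup they may have recorded.
     */
    dpm_wait_for_children(dev, async);
    if (async_error)
            goto Complete;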
It's easy to observe this erroneous behavior by, for example, forcing a
device to sleep a bit in its suspend_noirq() (to ensure the parent is
waiting for the child to complete), then return an error, and watch the
parent suspend_noirq() still get called. (Or similarly, fake a wakeup
event at the right (or is it wrong?) time.)
Fixes: de377b3972 (PM / sleep: Asynchronous threads for suspend_late)
Fixes: 28b6fd6e37 (PM / sleep: Asynchronous threads for suspend_noirq)
Reported-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When resuming a device in __pm_runtime_set_status(), the prerequisite is
that its parent must already be active, else an error code is returned and
the device's status remains suspended.
When suspending a device, no similar constraint is validated.
Let's make the behaviour consistent by not allowing a device with an
active child to be suspended, unless the device has been explicitly set
to ignore its children.
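A sketch of the corresponding check on the suspend side (simplified
from the actual code):

    /* Sketch: refuse to mark a device suspended while one of its
     * children is still active, unless it explicitly ignores them.
     */
    if (!dev->power.ignore_children &&
        atomic_read(&dev->power.child_count)) {
            error = -EBUSY;
            goto out;
    }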
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the device has no links to suppliers that should be used for
runtime PM (links with DEVICE_LINK_PM_RUNTIME set), there is no
reason to walk the list of suppliers for that device during
runtime suspend and resume.
Add a simple mechanism to detect that case and possibly avoid the
extra unnecessary overhead.
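One plausible shape for that mechanism is a per-device counter of
PM-runtime links, maintained at link creation and removal, so the hot
path reduces to a cheap test (a sketch; the field name and guard are
simplified):

    /* Sketch: walk the supplier list only if at least one link
     * with DEVICE_LINK_PM_RUNTIME set exists for this device.
     */
    if (dev->power.links_count > 0)
            retval = rpm_get_suppliers(dev);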
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Modify the runtime PM framework to use device links to ensure that
supplier devices will not be suspended if any of their consumer
devices are active.
The idea is to reference count suppliers on the consumer's resume
and drop references to them on its suspend. The information on
whether or not the supplier has been reference counted by the
consumer's (runtime) resume is stored in a new field (rpm_active)
in the link object for each link.
It may be necessary to clean up those references when the
supplier is unbinding and that's why the links whose status is
DEVICE_LINK_SUPPLIER_UNBIND are skipped by the runtime suspend
and resume code.
The above means that if the consumer device is probed in the
runtime-active state, the supplier has to be resumed and reference
counted by device_link_add() so the code works as expected on its
(runtime) suspend. There is a new flag, DEVICE_LINK_RPM_ACTIVE,
to tell device_link_add() about that (in which case the caller
is responsible for making sure that the consumer really will
be runtime-active when runtime PM is enabled for it).
The other new link flag, DEVICE_LINK_PM_RUNTIME, tells the core
whether or not the link should be used for runtime PM at all.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make the device suspend/resume part of the core system
suspend/resume code use device links to ensure that supplier
and consumer devices will be suspended and resumed in the right
order in case of async suspend/resume.
The idea, roughly, is to use dpm_wait() to wait for all consumers
before a supplier device suspend and to wait for all suppliers
before a consumer device resume.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, there is a problem with taking functional dependencies
between devices into account.
What I mean by a "functional dependency" is when the driver of device
B needs device A to be functional and (generally) its driver to be
present in order to work properly. This has certain consequences
for power management (suspend/resume and runtime PM ordering) and
shutdown ordering of these devices. In general, it also implies that
the driver of A needs to be working for B to be probed successfully
and it cannot be unbound from the device before B's driver is.
Support for representing those functional dependencies between
devices is added here to allow the driver core to track them and act
on them in certain cases where applicable.
The argument for doing that in the driver core is that there are
quite a few distinct use cases involving device dependencies, they
are relatively hard to get right in a driver (if one wants to
address all of them properly) and it only gets worse if multiplied
by the number of drivers potentially needing to do it. Moreover, at
least one case (asynchronous system suspend/resume) cannot be handled
in a single driver at all, because it requires the driver of A to
wait for B to suspend (during system suspend) and the driver of B to
wait for A to resume (during system resume).
For this reason, represent dependencies between devices as "links",
with the help of struct device_link objects each containing pointers
to the "linked" devices, a list node for each of them, status
information, flags, and an RCU head for synchronization.
Also add two new list heads, representing the lists of links to the
devices that depend on the given one (consumers) and to the devices
depended on by it (suppliers), and a "driver presence status" field
(needed for figuring out initial states of device links) to struct
device.
The entire data structure consisting of all of the lists of link
objects for all devices is protected by a mutex (for link object
addition/removal and for list walks during device driver probing
and removal) and by SRCU (for list walking in other cases that will
be introduced by subsequent change sets). If CONFIG_SRCU is not
selected, however, an rwsem is used for protecting the entire data
structure.
In addition, each link object has an internal status field whose
value reflects whether or not drivers are bound to the devices
pointed to by the link or probing/removal of their drivers is in
progress etc. That field is only modified under the device links
mutex, but it may be read outside of it in some cases (introduced by
subsequent change sets), so modifications of it are annotated with
WRITE_ONCE().
New links are added by calling device_link_add() which takes three
arguments: pointers to the devices in question and flags. In
particular, if DL_FLAG_STATELESS is set in the flags, the link status
is not to be taken into account for this link and the driver core
will not manage it. In turn, if DL_FLAG_AUTOREMOVE is set in the
flags, the driver core will remove the link automatically when the
consumer device driver unbinds from it.
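For illustration, a consumer driver could create a managed,
auto-removed link during probe along these lines (the device pointers
are placeholders):

    struct device_link *link;

    /* Managed link (no DL_FLAG_STATELESS): the driver core tracks
     * its state; DL_FLAG_AUTOREMOVE drops it automatically when
     * the consumer driver unbinds.
     */
    link = device_link_add(consumer_dev, supplier_dev,
                           DL_FLAG_AUTOREMOVE);
    if (!link)
            dev_err(consumer_dev, "failed to create device link\n");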
One of the actions carried out by device_link_add() is to reorder
the lists used for device shutdown and system suspend/resume to
put the consumer device along with all of its children and all of
its consumers (and so on, recursively) to the ends of those lists
in order to ensure the right ordering between all of the supplier
and consumer devices.
For this reason, it is not possible to create a link between two
devices if the would-be supplier device already depends on the
would-be consumer device as either a direct descendant of it or a
consumer of one of its direct descendants or one of its consumers
and so on.
There are two types of link objects, persistent and non-persistent.
The persistent ones stay around until one of the target devices is
deleted, while the non-persistent ones are removed automatically when
the consumer driver unbinds from its device (ie. they are assumed to
be valid only as long as the consumer device has a driver bound to
it). Persistent links are created by default and non-persistent
links are created when the DL_FLAG_AUTOREMOVE flag is passed
to device_link_add().
Both persistent and non-persistent device links can be deleted
with an explicit call to device_link_del().
Links created without the DL_FLAG_STATELESS flag set are managed
by the driver core using a simple state machine. There are 5 states
each link can be in: DORMANT (unused), AVAILABLE (the supplier driver
is present and functional), CONSUMER_PROBE (the consumer driver is
probing), ACTIVE (both supplier and consumer drivers are present and
functional), and SUPPLIER_UNBIND (the supplier driver is unbinding).
The driver core updates the link state automatically depending on
what happens to the linked devices and for each link state specific
actions are taken in addition to that.
For example, if the supplier driver unbinds from its device, the
driver core will also unbind the drivers of all of its consumers
automatically under the assumption that they cannot function
properly without the supplier. Analogously, the driver core will
only allow the consumer driver to bind to its device if the
supplier driver is present and functional (ie. the link is in
the AVAILABLE state). If that's not the case, it will rely on
the existing deferred probing mechanism to wait for the supplier
driver to become available.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The return value of of_count_phandle_with_args() can be negative, so we
should avoid a kcalloc() of a negative count of genpd_power_state
structs by sanity checking whether count is zero or less.
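The guard amounts to something like this (a sketch of the pattern, not
the literal hunk):

    count = of_count_phandle_with_args(np, "domain-idle-states", NULL);
    if (count <= 0)
            return count;   /* negative error code, or nothing to do */

    states = kcalloc(count, sizeof(*states), GFP_KERNEL);
    if (!states)
            return -ENOMEM;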
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Generic Power Domains currently support turning on/off only in process
context. This prevents the usage of PM domains for domains that could be
powered on/off in a context where IRQs are disabled. Many such domains
exist today and, because of this limitation, do not get powered off when
the IRQ safe devices in that domain are powered off.
However, not all domains can operate in IRQ safe contexts. Genpd
therefore has to support both cases, where the domain may or may not
operate in IRQ safe contexts. Configuring genpd to use an appropriate
lock for that domain would allow domains that have IRQ safe devices to
runtime suspend and resume in atomic context.
To achieve domain specific locking, set the domain's ->flag to
GENPD_FLAG_IRQ_SAFE while defining the domain. This indicates that genpd
should use a spinlock instead of a mutex for locking the domain. Locking
is abstracted through genpd_lock() and genpd_unlock() functions that use
the flag to determine the appropriate lock to be used for that domain.
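A simplified sketch of the flag-based selection (the real code routes
this through lock operation structures; the lock field names follow the
genpd implementation):

    static void genpd_lock(struct generic_pm_domain *genpd)
    {
            if (genpd->flags & GENPD_FLAG_IRQ_SAFE)
                    spin_lock_irqsave(&genpd->slock, genpd->lock_flags);
            else
                    mutex_lock(&genpd->mlock);
    }

    static void genpd_unlock(struct generic_pm_domain *genpd)
    {
            if (genpd->flags & GENPD_FLAG_IRQ_SAFE)
                    spin_unlock_irqrestore(&genpd->slock,
                                           genpd->lock_flags);
            else
                    mutex_unlock(&genpd->mlock);
    }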
Domains that have lower latency to suspend and resume and can operate
with IRQs disabled may now be able to save power, when the component
devices and sub-domains are idle at runtime.
The restriction this imposes on the domain hierarchy is that non-IRQ
safe domains may not have IRQ-safe subdomains, but IRQ safe domains may
have IRQ safe and non-IRQ safe subdomains and devices.
Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Abstract genpd lock/unlock calls, in preparation for domain specific
locks added in the following patches.
Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Save the fwnode for the genpd state in the state node. PM Domain clients
may use the fwnode to read in the platform specific domain state
properties and associate them with the state.
Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch allows domains to define idle states in the DT. SoCs can
define domain idle states in DT using the "domain-idle-states" property
of the domain provider. Add API to read the idle states from DT that can
be set in the genpd object.
This patch is based on the original patch by Marc Titinger.
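Usage could look roughly like this (signature as added by this patch;
error handling trimmed):

    struct genpd_power_state *states;
    int n, ret;

    /* Parse "domain-idle-states" from the provider's DT node and
     * install the resulting states in the genpd before init.
     */
    ret = of_genpd_parse_idle_states(np, &states, &n);
    if (!ret) {
            genpd->states = states;
            genpd->state_count = n;
    }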
Signed-off-by: Marc Titinger <mtitinger+renesas@baylibre.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Allow PM Domain states to be defined dynamically by the drivers. This
removes the limitation on the maximum number of states possible for a
domain.
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Lina Iyer <lina.iyer@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The exported function pm_children_suspended() has only one caller, which is
the runtime PM internal function, rpm_check_suspend_allowed().
Let's clean up this code by removing pm_children_suspended() altogether
and instead do the one-liner check directly in rpm_check_suspend_allowed().
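The open-coded check amounts to (a sketch of the one-liner in context):

    /* The former pm_children_suspended() test, done inline in
     * rpm_check_suspend_allowed().
     */
    if (!dev->power.ignore_children &&
        atomic_read(&dev->power.child_count))
            retval = -EBUSY;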
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These log messages are wrong because _of_get_opp_desc_node() returns
an operating-points-v2 node.
Commit a6eed752f5 ("PM / OPP: passing NULL to PTR_ERR()") fixed
static checker warnings, and reworded the messages at the same time
(but the latter was not mentioned in the git-log).
Restore the correct messages.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since commit f47b72a15a ("PM / OPP: Move CONFIG_OF dependent code
in a separate file"), this function is defined and called only in
drivers/base/power/opp/of.c.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pm-domains:
PM / Domains: Rename pm_genpd_sync_poweron|poweroff()
PM / Domains: Don't measure latency of ->power_on|off() during system PM
PM / Domains: Remove redundant system PM callbacks
PM / Domains: Simplify detaching a device from its genpd
PM / Domains: Allow holes in genpd_data.domains array
PM / Domains: Add support for removing nested PM domains by provider
PM / Domains: Add support for removing PM domains
PM / Domains: Store the provider in the PM domain structure
PM / Domains: Prepare for adding support to remove PM domains
PM / Domains: Verify the PM domain is present when adding a provider
PM / Domains: Don't expose xlate and provider helper functions
PM / Domains: Don't expose generic_pm_domain structure to clients
staging: board: Remove calls to of_genpd_get_from_provider()
ARM: EXYNOS: Remove calls to of_genpd_get_from_provider()
PM / Domains: Add new helper functions for device-tree
PM / Domains: Always enable debugfs support if available
The OPP framework allows each OPP to set an opp-supported-hw property
which provides values that are matched against supported_hw values
provided by the platform to limit support for certain OPPs on specific
hardware. Currently, if the platform does not set supported_hw values,
all OPPs are interpreted as supported, even if they have provided their
own opp-supported-hw values.
If an OPP has provided opp-supported-hw, it indicates that it is only
supported by some specific hardware configuration. These constraints
should be honored, and if no supported_hw has been provided by the
platform, there is no way to determine whether that OPP is actually
supported, so it should be marked as not supported.
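Sketched, the matching helper then treats a missing platform mask as a
failure whenever the OPP node carries its own opp-supported-hw values
(simplified; structure names follow the OPP core):

    /* Sketch: without a platform-provided supported_hw mask, an
     * OPP that specifies opp-supported-hw cannot be validated,
     * so treat it as unsupported.
     */
    if (!opp_table->supported_hw)
            return !of_find_property(np, "opp-supported-hw", NULL);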
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These are internal static functions to genpd. Let's conform to the naming
rules by dropping the "pm_" prefix from them.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Measuring latency does by itself contribute to increased latency, thus
we should avoid it when it isn't needed.
Currently genpd measures latencies in the system PM phase for the
->power_on|off() callbacks, except in the syscore case when it's not
allowed to use ktime_get() as timekeeping may be suspended.
Since there should be plenty of occasions during runtime PM to perform
these measurements, let's rely on that and drop them from system PM. This
will also make it consistent for how measurements are done of the runtime
PM callbacks (as those may be invoked during system PM).
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In cases where the PM domain hasn't assigned a system PM callback, the
PM core falls back to checking for the callback at the driver level
instead. This makes it redundant to assign a pm_generic_* helper
function to a corresponding system PM callback at the PM domain level.
Therefore, let's remove these assignments in pm_genpd_init().
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There's no need to validate the PM domain by using genpd_lookup_dev() when
removing the device via genpd's genpd_dev_pm_detach() function. That's
because this function can't be called unless there is a valid PM domain
for the device.
To simplify the behaviour, let's move code from pm_genpd_remove_device()
into a new internal function, genpd_remove_device(), which is called from
pm_genpd_remove_device() and genpd_dev_pm_detach().
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Lina Iyer <lina.iyer@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When CONFIG_OPTIMIZE_INLINING is set and we are building with -Wmaybe-uninitialized
enabled, we can get a warning for the opp core driver:
drivers/base/power/opp/core.c: In function 'dev_pm_opp_set_rate':
drivers/base/power/opp/core.c:560:8: warning: 'old_u_volt_min' may be used uninitialized in this function [-Wmaybe-uninitialized]
This has only now appeared as a result of commit 797da5598f ("PM / devfreq:
Add COMPILE_TEST for build coverage"), which makes the driver visible in
some configurations that didn't have it before.
The warning is a false positive that I got with gcc-6.1.1, but there is
a simple workaround in removing the local variables that we get warnings
for (all three are affected depending on the configuration). This also
makes the code easier to read.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On platforms such as Rockchip's, the array of domains isn't always
filled without holes, as which domains are present depends on the
particular SoC revision.
By allowing holes to be in the array, such SoCs can still use a single
set of constants to index the array of power domains.
Fixes: 0159ec6707 (PM / Domains: Verify the PM domain is present when adding a provider)
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Further testing with false negatives suppressed by commit 293e2421fe
("rcu: Remove superfluous versions of rcu_read_lock_sched_held()")
identified a few more unprotected uses of RCU from the idle loop.
Because RCU actively ignores idle-loop code (for energy-efficiency
reasons, among other things), using RCU from the idle loop can result
in too-short grace periods, in turn resulting in arbitrary misbehavior.
The affected function is rpm_suspend().
The resulting lockdep-RCU splat is as follows:
------------------------------------------------------------------------
Warning from omap3
===============================
[ INFO: suspicious RCU usage. ]
4.6.0-rc5-next-20160426+ #1112 Not tainted
-------------------------------
include/trace/events/rpm.h:63 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
#0: (&(&dev->power.lock)->rlock){-.-...}, at: [<c052ee24>] __pm_runtime_suspend+0x54/0x84
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
Hardware name: Generic OMAP36xx (Flattened Device Tree)
[<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
[<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
[<c047fec8>] (dump_stack) from [<c052d7b4>] (rpm_suspend+0x604/0x7e4)
[<c052d7b4>] (rpm_suspend) from [<c052ee34>] (__pm_runtime_suspend+0x64/0x84)
[<c052ee34>] (__pm_runtime_suspend) from [<c04bf3bc>] (omap2_gpio_prepare_for_idle+0x5c/0x70)
[<c04bf3bc>] (omap2_gpio_prepare_for_idle) from [<c01255e8>] (omap_sram_idle+0x140/0x244)
[<c01255e8>] (omap_sram_idle) from [<c0126b48>] (omap3_enter_idle_bm+0xfc/0x1ec)
[<c0126b48>] (omap3_enter_idle_bm) from [<c0601db8>] (cpuidle_enter_state+0x80/0x3d4)
[<c0601db8>] (cpuidle_enter_state) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
[<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
[<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
------------------------------------------------------------------------
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If a device supports PM domains that are subdomains of another PM
domain, then the PM domains should be removed in reverse order to
ensure that the subdomains are removed first. Furthermore, if there is
more than one provider, then there needs to be a way to remove the
domains in reverse order for a specific provider.
Add the function of_genpd_remove_last() to remove the last PM domain
added by a given PM domain provider and return the generic_pm_domain
structure for the PM domain that was removed.
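A provider teardown path might then iterate like this (a sketch;
cleanup of each returned domain is provider specific):

    struct generic_pm_domain *genpd;

    /* Remove, last-added first, every PM domain this provider
     * registered; of_genpd_remove_last() returns the removed
     * domain, or an ERR_PTR() when none are left.
     */
    while (!IS_ERR(genpd = of_genpd_remove_last(np)))
            kfree(genpd);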
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>