Commit Graph

96 Commits

Author SHA1 Message Date
Lukasz Majewski
6f19efc0a1 cpufreq: Add boost frequency support in core
This commit adds boost frequency support in cpufreq core (Hardware &
Software). Some SoCs (like Exynos4 - e.g. 4x12) allow setting frequency
above its normal operation limits. Such mode shall be only used for a
short time.

Overclocking (boost) support is essentially provided by platform
dependent cpufreq driver.

This commit unifies support for SW and HW (Intel) overclocking solutions
in the core cpufreq driver. Previously the "boost" sysfs attribute was
defined in the ACPI processor driver code. By default boost is disabled.
One global attribute is available at: /sys/devices/system/cpu/cpufreq/boost.

It only shows up when cpufreq driver supports overclocking.
Under the hood frequencies dedicated for boosting are marked with a
special flag (CPUFREQ_BOOST_FREQ) at driver's frequency table.
It is the user's concern to enable/disable overclocking with a proper call
to sysfs.

The cpufreq_boost_trigger_state() function is defined non static on purpose.
It is used later with thermal subsystem to provide automatic enable/disable
of the BOOST feature.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17 02:00:44 +01:00
Viresh Kumar
652ed95d5f cpufreq: introduce cpufreq_generic_get() routine
CPUFreq drivers that use clock frameworks interface,i.e. clk_get_rate(),
to get CPUs clk rate, have similar sort of code used in most of them.

This patch adds a generic ->get() which will do the same thing for them.
All those drivers are required to now is to set .get to cpufreq_generic_get()
and set their clk pointer in policy->clk during ->init().

Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17 02:00:44 +01:00
Viresh Kumar
fcd7af917a cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly
There are several problems with cpufreq stats in the way it handles
cpufreq_unregister_driver() and suspend/resume..

 - We must not lose data collected so far when suspend/resume happens
   and so stats directories must not be removed/allocated during these
   operations, which is done currently.

 - cpufreq_stat has registered notifiers with both cpufreq and hotplug.
   It adds sysfs stats directory with a cpufreq notifier: CPUFREQ_NOTIFY
   and removes this directory with a notifier from hotplug core.

   In case cpufreq_unregister_driver() is called (on rmmod cpufreq driver),
   stats directories per cpu aren't removed as CPUs are still online. The
   only call cpufreq_stats gets is cpufreq_stats_update_policy_cpu() for
   all CPUs except the last of each policy. And pointer to stat information
   is stored in the entry for last CPU in the per-cpu cpufreq_stats_table.
   But policy structure would be freed inside cpufreq core and so that will
   result in memory leak inside cpufreq stats (as we are never freeing
   memory for stats).

   Now if we again insert the module cpufreq_register_driver() will be
   called and we will again allocate stats data and put it on for first
   CPU of every policy.  In case we only have a single CPU per policy, we
   will return with a error from cpufreq_stats_create_table() due to this
   code:

	if (per_cpu(cpufreq_stats_table, cpu))
		return -EBUSY;

   And so probably cpufreq stats directory would not show up anymore (as
   it was added inside last policies->kobj which doesn't exist anymore).
   I haven't tested it, though. Also the values in stats files wouldn't
   be refreshed as we are using the earlier stats structure.

 - CPUFREQ_NOTIFY is called from cpufreq_set_policy() which is called for
   scenarios where we don't really want cpufreq_stat_notifier_policy() to get
   called. For example whenever we are changing anything related to a policy:
   min/max/current freq, etc. cpufreq_set_policy() is called and so cpufreq
   stats is notified. Where we don't do any useful stuff other than simply
   returning with -EBUSY from cpufreq_stats_create_table(). And so this
   isn't the right notifier that cpufreq stats..

 Due to all above reasons this patch does following changes:
 - Add new notifiers CPUFREQ_CREATE_POLICY and CPUFREQ_REMOVE_POLICY,
   which are only called when policy is created/destroyed. They aren't
   called for suspend/resume paths..
 - Use these notifiers in cpufreq_stat_notifier_policy() to create/destory
   stats sysfs entries. And so cpufreq_unregister_driver() or suspend/resume
   shouldn't be a problem for cpufreq_stats.
 - Return early from cpufreq_stat_cpu_callback() for suspend/resume sequence,
   so that we don't free stats structure.

Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-17 02:00:43 +01:00
Viresh Kumar
d3916691c9 cpufreq: Make sure CPU is running on a freq from freq-table
Sometimes boot loaders set CPU frequency to a value outside of frequency table
present with cpufreq core. In such cases CPU might be unstable if it has to run
on that frequency for long duration of time and so its better to set it to a
frequency which is specified in freq-table. This also makes cpufreq stats
inconsistent as cpufreq-stats would fail to register because current frequency
of CPU isn't found in freq-table.

Because we don't want this change to affect boot process badly, we go for the
next freq which is >= policy->cur ('cur' must be set by now, otherwise we will
end up setting freq to lowest of the table as 'cur' is initialized to zero).

In case current frequency doesn't match any frequency from freq-table, we throw
warnings to user, so that user can get this fixed in their bootloaders or
freq-tables.

Reported-by: Carlos Hernandez <ceh@ti.com>
Reported-and-tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06 14:17:25 +01:00
Viresh Kumar
ae6b427132 cpufreq: Mark ARM drivers with CPUFREQ_NEED_INITIAL_FREQ_CHECK flag
Sometimes boot loaders set CPU frequency to a value outside of frequency table
present with cpufreq core. In such cases CPU might be unstable if it has to run
on that frequency for long duration of time and so its better to set it to a
frequency which is specified in frequency table.

On some systems we can't really say what frequency we're running at the moment
and so for these we shouldn't check if we are running at a frequency present in
frequency table. And so we really can't force this for all the cpufreq drivers.

Hence we are created another flag here: CPUFREQ_NEED_INITIAL_FREQ_CHECK that
will be marked by platforms which want to go for this check at boot time.

Initially this is done for all ARM platforms but others may follow if required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06 14:17:25 +01:00
Viresh Kumar
f7ba3b41e2 cpufreq: Introduce cpufreq_notify_post_transition()
This introduces a new routine cpufreq_notify_post_transition() which
can be used to send POSTCHANGE notification for new freq with or
without both {PRE|POST}CHANGE notifications for last freq. This is
useful at multiple places, especially for sending transition failure
notifications.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-01-06 01:43:44 +01:00
Rafael J. Wysocki
12205a4b79 Revert "cpufreq: suspend governors on system suspend/hibernate"
Commit 5a87182aa2 (cpufreq: suspend governors on system
suspend/hibernate) causes hibernation problems to happen on
Bjørn Mork's and Paul Bolle's systems, so revert it.

Fixes: 5a87182aa2 (cpufreq: suspend governors on system suspend/hibernate)
Reported-by: Bjørn Mork <bjorn@mork.no>
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-08 01:04:17 +01:00
Rafael J. Wysocki
7cdcec991c Merge branches 'pm-cpuidle' and 'pm-cpufreq'
* pm-cpuidle:
  cpuidle: Check for dev before deregistering it.
  intel_idle: Fixed C6 state on Avoton/Rangeley processors

* pm-cpufreq:
  cpufreq: fix garbage kobjects on errors during suspend/resume
  cpufreq: suspend governors on system suspend/hibernate
2013-12-06 02:17:59 +01:00
Viresh Kumar
5a87182aa2 cpufreq: suspend governors on system suspend/hibernate
This patch adds cpufreq suspend/resume calls to dpm_{suspend|resume}_noirq()
for handling suspend/resume of cpufreq governors.

Lan Tianyu (Intel) & Jinhyuk Choi (Broadcom) found anr issue where
tunables configuration for clusters/sockets with non-boot CPUs was
getting lost after suspend/resume, as we were notifying governors
with CPUFREQ_GOV_POLICY_EXIT on removal of the last cpu for that
policy and so deallocating memory for tunables. This is fixed by
this patch as we don't allow any operation on governors after
device suspend and before device resume now.

Reported-and-tested-by: Lan Tianyu <tianyu.lan@intel.com>
Reported-by: Jinhyuk Choi <jinchoi@broadcom.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Changelog, minor cleanups]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-28 14:47:31 +01:00
Linus Torvalds
049ffa8ab3 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
 "This is a combo of -next and some -fixes that came in in the
  intervening time.

  Highlights:

  New drivers:
    ARM Armada driver for Marvell Armada 510 SOCs

  Intel:
    Broadwell initial support under a default off switch,
    Stereo/3D HDMI mode support
    Valleyview improvements
    Displayport improvements
    Haswell fixes
    initial mipi dsi panel support
    CRC support for debugging
    build with CONFIG_FB=n

  Radeon:
    enable DPM on a number of GPUs by default
    secondary GPU powerdown support
    enable HDMI audio by default
    Hawaii support

  Nouveau:
    dynamic pm code infrastructure reworked, does nothing major yet
    GK208 modesetting support
    MSI fixes, on by default again
    PMPEG improvements
    pageflipping fixes

  GMA500:
    minnowboard SDVO support

  VMware:
    misc fixes

  MSM:
    prime, plane and rendernodes support

  Tegra:
    rearchitected to put the drm driver into the drm subsystem.
    HDMI and gr2d support for tegra 114 SoC

  QXL:
    oops fix, and multi-head fixes

  DRM core:
    sysfs lifetime fixes
    client capability ioctl
    further cleanups to device midlayer
    more vblank timestamp fixes"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (789 commits)
  drm/nouveau: do not map evicted vram buffers in nouveau_bo_vma_add
  drm/nvc0-/gr: shift wrapping bug in nvc0_grctx_generate_r406800
  drm/nouveau/pwr: fix missing mutex unlock in a failure path
  drm/nv40/therm: fix slowing down fan when pstate undefined
  drm/nv11-: synchronise flips to vblank, unless async flip requested
  drm/nvc0-: remove nasty fifo swmthd hack for flip completion method
  drm/nv10-: we no longer need to create nvsw object on user channels
  drm/nouveau: always queue flips relative to kernel channel activity
  drm/nouveau: there is no need to reserve/fence the new fb when flipping
  drm/nouveau: when bailing out of a pushbuf ioctl, do not remove previous fence
  drm/nouveau: allow nouveau_fence_ref() to be a noop
  drm/nvc8/mc: msi rearm is via the nvc0 method
  drm/ttm: Fix vma page_prot bit manipulation
  drm/vmwgfx: Fix a couple of compile / sparse warnings and errors
  drm/vmwgfx: Resource evict fixes
  drm/edid: compare actual vrefresh for all modes for quirks
  drm: shmob_drm: Convert to clk_prepare/unprepare
  drm/nouveau: fix 32-bit build
  drm/i915/opregion: fix build error on CONFIG_ACPI=n
  Revert "drm/radeon/audio: don't set speaker allocation on DCE4+"
  ...
2013-11-15 14:19:54 +09:00
Viresh Kumar
7dbf694db6 cpufreq: distinguish drivers that do asynchronous notifications
There are few special cases like exynos5440 which doesn't send POSTCHANGE
notification from their ->target() routine and call some kind of bottom halves
for doing this work, work/tasklet/etc.. From which they finally send POSTCHANGE
notification.

Its better if we distinguish them from other cpufreq drivers in some way so that
core can handle them specially. So this patch introduces another flag:
CPUFREQ_ASYNC_NOTIFICATION, which will be set by such drivers.

This also changes exynos5440-cpufreq.c and powernow-k8 in order to set this
flag.

Acked-by: Amit Daniel Kachhap <amit.daniel@samsung.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-31 00:11:08 +01:00
viresh kumar
ad7722dab7 cpufreq: create per policy rwsem instead of per CPU cpu_policy_rwsem
We have per-CPU cpu_policy_rwsem for cpufreq core, but we never use
all of them. We always use rwsem of policy->cpu and so we can
actually make this rwsem per policy instead.

This patch does this change. With this change other tricky situations
are also avoided now, like which lock to take while we are changing
policy->cpu, etc.

Suggested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-25 23:54:12 +02:00
Viresh Kumar
9c0ebcf78f cpufreq: Implement light weight ->target_index() routine
Currently, the prototype of cpufreq_drivers target routines is:

int target(struct cpufreq_policy *policy, unsigned int target_freq,
		unsigned int relation);

And most of the drivers call cpufreq_frequency_table_target() to get a valid
index of their frequency table which is closest to the target_freq. And they
don't use target_freq and relation after that.

So, it makes sense to just do this work in cpufreq core before calling
cpufreq_frequency_table_target() and simply pass index instead. But this can be
done only with drivers which expose their frequency table with cpufreq core. For
others we need to stick with the old prototype of target() until those drivers
are converted to expose frequency tables.

This patch implements the new light weight prototype for target_index() routine.
It looks like this:

int target_index(struct cpufreq_policy *policy, unsigned int index);

CPUFreq core will call cpufreq_frequency_table_target() before calling this
routine and pass index to it. Because CPUFreq core now requires to call routines
present in freq_table.c CONFIG_CPU_FREQ_TABLE must be enabled all the time.

This also marks target() interface as deprecated. So, that new drivers avoid
using it. And Documentation is updated accordingly.

It also converts existing .target() to newly defined light weight
.target_index() routine for many driver.

Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Russell King <linux@arm.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rjw@rjwysocki.net>
2013-10-25 22:42:24 +02:00
Daniel Vetter
c75b505dda cpufreq: Add dummy cpufreq_cpu_get/put for CONFIG_CPU_FREQ=n
The drm/i915 driver wants to adjust it's own power policies using the
cpu policies as a guideline (we can implicitly boost the cpus through
the gpus on some platforms). To avoid a dreaded select (since a
depends will leave users wondering where where their driver has gone
too) add dummy functions.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: kbuild test robot <fengguang.wu@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: cpufreq@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-18 15:05:29 +02:00
Viresh Kumar
70e9e77833 cpufreq: create cpufreq_generic_init() routine
Many CPUFreq drivers for SMP system (where all cores share same clock lines), do
similar stuff in their ->init() part.

This patch creates a generic routine in cpufreq core which can be used by these
so that we can remove some redundant code.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-16 00:50:33 +02:00
Viresh Kumar
184345129c cpufreq: define generic .attr, .exit() and .verify() routines
Most of the CPUFreq drivers do similar things in .exit() and .verify() routines
and .attr. So its better if we have generic routines for them which can be used
by cpufreq drivers then.

This patch introduces generic .attr, .exit() and .verify() cpufreq drivers.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-16 00:50:23 +02:00
Viresh Kumar
be49e3465f cpufreq: add new routine cpufreq_verify_within_cpu_limits()
Most of the users of cpufreq_verify_within_limits() calls it for
limiting with min/max from policy->cpuinfo. We can make that code
simple by introducing another routine which will do this for them
automatically.

This patch adds another routine cpufreq_verify_within_cpu_limits()
and updates others to use it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-16 00:50:23 +02:00
Viresh Kumar
0b981e7074 cpufreq: use cpufreq_driver->flags to mark CPUFREQ_HAVE_GOVERNOR_PER_POLICY
Use cpufreq_driver->flags to mark CPUFREQ_HAVE_GOVERNOR_PER_POLICY instead
of a separate field within cpufreq_driver. This will save some bytes of
memory.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-16 00:50:23 +02:00
Viresh Kumar
6461f018e7 cpufreq: rewrite cpufreq_driver->flags using shift operator
Currently cpufreq_driver's flags are defined directly using 0x1, 0x2, 0x4, 0x8,
etc.. As the list grows it becomes less readable..

Use bitwise shift operator << to generate these numbers for respective
positions.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-16 00:50:23 +02:00
Viresh Kumar
27047a6036 cpufreq: Add new helper cpufreq_table_validate_and_show()
Almost every cpufreq driver is required to validate its frequency table with:
cpufreq_frequency_table_cpuinfo() and then expose it to cpufreq core with:
cpufreq_frequency_table_get_attr().

This patch creates another helper routine cpufreq_table_validate_and_show() that
will do both these steps in a single call and will return 0 for success, error
otherwise.

This also fixes potential bugs in cpufreq drivers where people have called
cpufreq_frequency_table_get_attr() before calling
cpufreq_frequency_table_cpuinfo(), as the later may fail.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-30 20:18:40 +02:00
Rafael J. Wysocki
798282a871 Revert "cpufreq: make sure frequency transitions are serialized"
Commit 7c30ed5 (cpufreq: make sure frequency transitions are
serialized) attempted to serialize frequency transitions by
adding checks to the CPUFREQ_PRECHANGE and CPUFREQ_POSTCHANGE
notifications.  However, it assumed that the notifications will
always originate from the driver's .target() callback, but they
also can be triggered by cpufreq_out_of_sync() and that leads to
warnings like this on some systems:

 WARNING: CPU: 0 PID: 14543 at drivers/cpufreq/cpufreq.c:317
 __cpufreq_notify_transition+0x238/0x260()
 In middle of another frequency transition

accompanied by a call trace similar to this one:

 [<ffffffff81720daa>] dump_stack+0x46/0x58
 [<ffffffff8106534c>] warn_slowpath_common+0x8c/0xc0
 [<ffffffff815b8560>] ? acpi_cpufreq_target+0x320/0x320
 [<ffffffff81065436>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff815b1ec8>] __cpufreq_notify_transition+0x238/0x260
 [<ffffffff815b33be>] cpufreq_notify_transition+0x3e/0x70
 [<ffffffff815b345d>] cpufreq_out_of_sync+0x6d/0xb0
 [<ffffffff815b370c>] cpufreq_update_policy+0x10c/0x160
 [<ffffffff815b3760>] ? cpufreq_update_policy+0x160/0x160
 [<ffffffff81413813>] cpufreq_set_cur_state+0x8c/0xb5
 [<ffffffff814138df>] processor_set_cur_state+0xa3/0xcf
 [<ffffffff8158e13c>] thermal_cdev_update+0x9c/0xb0
 [<ffffffff8159046a>] step_wise_throttle+0x5a/0x90
 [<ffffffff8158e21f>] handle_thermal_trip+0x4f/0x140
 [<ffffffff8158e377>] thermal_zone_device_update+0x57/0xa0
 [<ffffffff81415b36>] acpi_thermal_check+0x2e/0x30
 [<ffffffff81415ca0>] acpi_thermal_notify+0x40/0xdc
 [<ffffffff813e7dbd>] acpi_device_notify+0x19/0x1b
 [<ffffffff813f8241>] acpi_ev_notify_dispatch+0x41/0x5c
 [<ffffffff813e3fbe>] acpi_os_execute_deferred+0x25/0x32
 [<ffffffff81081060>] process_one_work+0x170/0x4a0
 [<ffffffff81082121>] worker_thread+0x121/0x390
 [<ffffffff81082000>] ? manage_workers.isra.20+0x170/0x170
 [<ffffffff81088fe0>] kthread+0xc0/0xd0
 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff8173582c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81088f20>] ? flush_kthread_worker+0xb0/0xb0

For this reason, revert commit 7c30ed5 along with the fix 266c13d
(cpufreq: Fix serialization of frequency transitions) on top of it
and we will revisit the serialization problem later.

Reported-by: Alessandro Bono <alessandro.bono@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10 02:54:50 +02:00
Srivatsa S. Bhat
56d07db274 cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes
Commit "cpufreq: serialize calls to __cpufreq_governor()" had been a temporary
and partial solution to the race condition between writing to a cpufreq sysfs
file and taking a CPU offline. Now that we have a proper and complete solution
to that problem, remove the temporary fix.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10 02:49:47 +02:00
Viresh Kumar
19c763031a cpufreq: serialize calls to __cpufreq_governor()
We can't take a big lock around __cpufreq_governor() as this causes
recursive locking for some cases. But calls to this routine must be
serialized for every policy. Otherwise we can see some unpredictable
events.

For example, consider following scenario:

__cpufreq_remove_dev()
 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
   policy->governor->governor(policy, CPUFREQ_GOV_STOP);
    cpufreq_governor_dbs()
     case CPUFREQ_GOV_STOP:
      mutex_destroy(&cpu_cdbs->timer_mutex)
      cpu_cdbs->cur_policy = NULL;
  <PREEMPT>
store()
 __cpufreq_set_policy()
  __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
    policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
     case CPUFREQ_GOV_LIMITS:
      mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex)
       if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL

And so store() will eventually result in a crash if cur_policy is
NULL at this point.

Introduce an additional variable which would guarantee serialization
here.

Reported-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-10 02:49:46 +02:00
Viresh Kumar
adc97d6a73 cpufreq: Drop the owner field from struct cpufreq_driver
We don't need to set .owner = THIS_MODULE any more in cpufreq drivers
as this field isn't used any more by the cpufreq core.

This patch removes it and updates all dependent drivers accordingly.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-10 03:24:47 +02:00
Lukasz Majewski
c88a1f8b96 cpufreq: Store cpufreq policies in a list
Policies available in the cpufreq framework are now linked together.
They are accessible via cpufreq_policy_list defined in the cpufreq
core.

[rjw: Fix from Yinghai Lu folded in]
Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-10 03:24:06 +02:00
Viresh Kumar
3a3e9e06d0 cpufreq: Give consistent names to cpufreq_policy objects
They are called policy, cur_policy, new_policy, data, etc.  Just call
them policy wherever possible.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-07 23:34:10 +02:00
Viresh Kumar
74aca95da7 cpufreq: Re-arrange declarations in cpufreq.h
They are pretty much mixed up.  Although generic headers are present,
definitions/declarations are present outside of them too ...

This patch just moves stuff up and down to make it look better and
consistent.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-07 23:34:10 +02:00
Viresh Kumar
5ff0a26803 cpufreq: Clean up header files included in the core
This patch addresses the following issues in the header files in the
cpufreq core:
 - Include headers in ascending order, so that we don't add same
   many times by mistake.
 - <asm/> must be included after <linux/>, so that they override
   whatever they need to.
 - Remove unnecessary includes.
 - Don't include files already included by cpufreq.h or
   cpufreq_governor.h.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-07 23:34:09 +02:00
Stratos Karafotis
cffe4e0e74 cpufreq: Remove unused function __cpufreq_driver_getavg()
The target frequency calculation method in the ondemand governor has
changed and it is now independent of the measured average frequency.
Consequently, the __cpufreq_driver_getavg() function and getavg
member of struct cpufreq_driver are not used any more, so drop them.

[rjw: Changelog]
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-26 01:06:44 +02:00
Viresh Kumar
266c13d767 cpufreq: Fix serialization of frequency transitions
Commit 7c30ed ("cpufreq: make sure frequency transitions are serialized")
interacts poorly with systems that have a single core freqency for all
cores.  On such systems we have a single policy for all cores with
several CPUs.  When we do a frequency transition the governor calls the
pre and post change notifiers which causes cpufreq_notify_transition()
per CPU.  Since the policy is the same for all of them all CPUs after
the first and the warnings added are generated by checking a per-policy
flag the warnings will be triggered for all cores after the first.

Fix this by allowing notifier to be called for n times. Where n is the number of
cpus in policy->cpus.

Reported-and-tested-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-04 13:12:44 +02:00
Lan Tianyu
f4fd379784 acpi-cpufreq: Add new sysfs attribute freqdomain_cpus
Commits fcf8058 (cpufreq: Simplify cpufreq_add_dev()) and aa77a52
(cpufreq: acpi-cpufreq: Don't set policy->related_cpus from .init())
changed the contents of the "related_cpus" sysfs attribute on systems
where acpi-cpufreq is used and user space can't get the list of CPUs
which are in the same hardware coordination CPU domain (provided by
the ACPI AML method _PSD) via "related_cpus" any more.

To make up for that loss add a new sysfs attribute "freqdomian_cpus"
for the acpi-cpufreq driver which exposes the list of CPUs in the
same domain regardless of whether it is coordinated by hardware or
software.

[rjw: Changelog, documentation]
References: https://bugzilla.kernel.org/show_bug.cgi?id=58761
Reported-by: Jean-Philippe Halimi <jean-philippe.halimi@exascale-computing.eu>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-27 21:51:09 +02:00
Viresh Kumar
7c30ed532c cpufreq: make sure frequency transitions are serialized
Whenever we are changing frequency of a cpu, we are calling PRECHANGE and
POSTCHANGE notifiers. They must be serialized. i.e. PRECHANGE or POSTCHANGE
shouldn't be called twice contiguously.

This can happen due to bugs in users of __cpufreq_driver_target() or actual
cpufreq drivers who are sending these notifiers.

This patch adds some protection against this. Now, we keep track of the last
transaction and see if something went wrong.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-27 21:49:55 +02:00
Viresh Kumar
bb176f7d03 cpufreq: Fix minor formatting issues
There were a few noticeable formatting issues in core cpufreq code.
This cleans them up to make code look better.  The changes include:
 - Whitespace cleanup.
 - Rearrangements of code.
 - Multiline comments fixes.
 - Formatting changes to fit 80 columns.

Copyright information in cpufreq.c is also updated to include my name
for 2013.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-21 01:06:34 +02:00
Xiaoguang Chen
95731ebb11 cpufreq: Fix governor start/stop race condition
Cpufreq governors' stop and start operations should be carried out
in sequence.  Otherwise, there will be unexpected behavior, like in
the example below.

Suppose there are 4 CPUs and policy->cpu=CPU0, CPU1/2/3 are linked
to CPU0.  The normal sequence is:

 1) Current governor is userspace.  An application tries to set the
    governor to ondemand.  It will call __cpufreq_set_policy() in
    which it will stop the userspace governor and then start the
    ondemand governor.

 2) Current governor is userspace.  The online of CPU3 runs on CPU0.
    It will call cpufreq_add_policy_cpu() in which it will first
    stop the userspace governor, and then start it again.

If the sequence of the above two cases interleaves, it becomes:

 1) Application stops userspace governor
 2)                                  Hotplug stops userspace governor

which is a problem, because the governor shouldn't be stopped twice
in a row.  What happens next is:

 3) Application starts ondemand governor
 4)                                  Hotplug starts a governor

In step 4, the hotplug is supposed to start the userspace governor,
but now the governor has been changed by the application to ondemand,
so the ondemand governor is started once again, which is incorrect.

The solution is to prevent policy governors from being stopped
multiple times in a row.  A governor should only be stopped once for
one policy.  After it has been stopped, no more governor stop
operations should be executed.

Also add a mutex to serialize governor operations.

[rjw: Changelog.  And you owe me a beverage of my choice.]
Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-21 00:56:04 +02:00
Viresh Kumar
5070158804 cpufreq: rename index as driver_data in cpufreq_frequency_table
The "index" field of struct cpufreq_frequency_table was never an
index and isn't used at all by the cpufreq core.  It only is useful
for cpufreq drivers for their internal purposes.

Many people nowadays blindly set it in ascending order with the
assumption that the core will use it, which is a mistake.

Rename it to "driver_data" as that's what its purpose is. All of its
users are updated accordingly.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-04 14:25:59 +02:00
Viresh Kumar
2361be2366 cpufreq: Don't create empty /sys/devices/system/cpu/cpufreq directory
When we don't have any file in cpu/cpufreq directory we shouldn't
create it. Specially with the introduction of per-policy governor
instance patchset, even governors are moved to
cpu/cpu*/cpufreq/governor-name directory and so this directory is
just not required.

Lets have it only when required.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-27 13:24:02 +02:00
Viresh Kumar
72a4ce340a cpufreq: Move get_cpu_idle_time() to cpufreq.c
Governors other than ondemand and conservative can also use
get_cpu_idle_time() and they aren't required to compile
cpufreq_governor.c. So, move these independent routines to
cpufreq.c instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-27 13:20:56 +02:00
Viresh Kumar
944e9a0316 cpufreq: governors: Move get_governor_parent_kobj() to cpufreq.c
get_governor_parent_kobj() can be used by any governor, generic
cpufreq governors or platform specific ones and so must be present in
cpufreq.c instead of cpufreq_governor.c.

This patch moves it to cpufreq.c. This also adds
EXPORT_SYMBOL_GPL(get_governor_parent_kobj) so that modules can use
this function too.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-27 13:20:56 +02:00
Viresh Kumar
b43a7ffbf3 cpufreq: Notify all policy->cpus in cpufreq_notify_transition()
policy->cpus contains all online cpus that have single shared clock line. And
their frequencies are always updated together.

Many SMP system's cpufreq drivers take care of this in individual drivers but
the best place for this code is in cpufreq core.

This patch modifies cpufreq_notify_transition() to notify frequency change for
all cpus in policy->cpus and hence updates all users of this API.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-02 15:24:00 +02:00
Viresh Kumar
4d5dcc4211 cpufreq: governor: Implement per policy instances of governors
Currently, there can't be multiple instances of single governor_type.
If we have a multi-package system, where we have multiple instances
of struct policy (per package), we can't have multiple instances of
same governor. i.e. We can't have multiple instances of ondemand
governor for multiple packages.

Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
governor-name/. Which again reflects that there can be only one
instance of a governor_type in the system.

This is a bottleneck for multicluster system, where we want different
packages to use same governor type, but with different tunables.

This patch uses the infrastructure provided by earlier patch and
implements init/exit routines for ondemand and conservative
governors.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:34 +02:00
Viresh Kumar
7bd353a995 cpufreq: Add per policy governor-init/exit infrastructure
Currently, there can't be multiple instances of single governor_type.
If we have a multi-package system, where we have multiple instances
of struct policy (per package), we can't have multiple instances of
same governor. i.e. We can't have multiple instances of ondemand
governor for multiple packages.

Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
governor-name/. Which again reflects that there can be only one
instance of a governor_type in the system.

This is a bottleneck for multicluster system, where we want different
packages to use same governor type, but with different tunables.

This patch is inclined towards providing this infrastructure. Because
we are required to allocate governor's resources dynamically now, we
must do it at policy creation and end. And so got
CPUFREQ_GOV_POLICY_INIT/EXIT.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:34 +02:00
Viresh Kumar
62b36cc1c8 cpufreq: Remove unnecessary use of policy->shared_type
policy->shared_type field was added only for SoCs with ACPI support:

commit 3b2d99429e
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Wed Dec 14 15:05:00 2005 -0500

    P-state software coordination for ACPI core

    http://bugzilla.kernel.org/show_bug.cgi?id=5737

Many non-ACPI systems are filling this field by mistake, which makes its usage
confusing. Lets clean it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 01:29:32 +01:00
Viresh Kumar
b394058f06 cpufreq: governors: Reset tunables only for cpufreq_unregister_governor()
Currently, whenever governor->governor() is called for CPUFRREQ_GOV_START event
we reset few tunables of governor. Which isn't correct, as this routine is
called for every cpu hot-[un]plugging event. We should actually be resetting
these only when the governor module is removed and re-installed.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 01:29:31 +01:00
Fabio Baltieri
2624f90c16 cpufreq: governors: implement generic policy_is_shared
Implement a generic helper function policy_is_shared() to replace the
current dbs_sw_coordinated_cpus() at cpufreq level, so that it can be
used by code other than cpufreq governors.

Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:16 +01:00
Viresh Kumar
951fc5f458 cpufreq: Update Documentation for cpus and related_cpus
Documentation related to cpus and related_cpus is confusing and not very clear.
Over that CPUFreq core has seen much changes recently. Lets update documentation
and comments for cpus and related_cpus.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:16 +01:00
Borislav Petkov
ab45bd9bed cpufreq: Sort function prototypes properly
Move function prototypes to a place where they logically fit better.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:15 +01:00
Borislav Petkov
9d95046e5d cpufreq: Add a get_current_driver helper
Add a helper function to return cpufreq_driver->name.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:15 +01:00
Viresh Kumar
b8eed8af94 cpufreq: Simplify __cpufreq_remove_dev()
__cpufreq_remove_dev() is called on multiple occasions: cpufreq_driver
unregister and cpu removals.

Current implementation of this routine is overly complex without much need. If
the cpu to be removed is the policy->cpu, we remove the policy first and add all
other cpus again from policy->cpus and then finally call __cpufreq_remove_dev()
again to remove the cpu to be deleted. Haahhhh..

There exist a simple solution to removal of a cpu:
- Simply use the old policy structure
- update its fields like: policy->cpu, etc.
- notify any users of cpufreq, which depend on changing policy->cpu

Hence this patch, which tries to implement the above theory. It is tested well
by myself on ARM big.LITTLE TC2 SoC, which has 5 cores (2 A15 and 3 A7). Both
A15's share same struct policy and all A7's share same policy structure.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:14 +01:00
Viresh Kumar
4471a34f9a cpufreq: governors: remove redundant code
Initially ondemand governor was written and then using its code conservative
governor is written. It used a lot of code from ondemand governor, but copy of
code was created instead of using the same routines from both governors. Which
increased code redundancy, which is difficult to manage.

This patch is an attempt to move common part of both the governors to
cpufreq_governor.c file to come over above mentioned issues.

This shouldn't change anything from functionality point of view.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:07 +01:00
viresh kumar
2aacdfff9c cpufreq: Move common part from governors to separate file, v2
Multiple cpufreq governers have defined similar get_cpu_idle_time_***()
routines. These routines must be moved to some common place, so that all
governors can use them.

So moving them to cpufreq_governor.c, which seems to be a better place for
keeping these routines.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:06 +01:00