linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-24 22:30:52 +07:00

Author	SHA1	Message	Date
Arnd Bergmann	be0408d74d	cpufreq: dbx500: add a Kconfig symbol Moving the cooling code into the cpufreq driver caused a possible build failure when the cpu_thermal helper code is a loadable module or disabled: drivers/cpufreq/dbx500-cpufreq.o: In function `dbx500_cpufreq_ready': dbx500-cpufreq.c:(.text.dbx500_cpufreq_ready+0x4): undefined reference to `cpufreq_cooling_register' This adds the same dependency that we have in other cpufreq drivers, forcing the driver to be disabled when we can't possibly link it. Fixes: `19678ffb9f` (cpufreq: dbx500: Manage cooling device from cpufreq driver) Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-05-14 13:40:16 +02:00
Linus Torvalds	ac3c4aa248	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS updates from James Hogan: "math-emu: - Add missing clearing of BLTZALL and BGEZALL emulation counters - Fix BC1EQZ and BC1NEZ condition handling - Fix BLEZL and BGTZL identification BPF: - Add JIT support for SKF_AD_HATYPE - Use unsigned access for unsigned SKB fields - Quit clobbering callee saved registers in JIT code - Fix multiple problems in JIT skb access helpers Loongson 3: - Select MIPS_L1_CACHE_SHIFT_6 Octeon: - Remove vestiges of CONFIG_CAVIUM_OCTEON_2ND_KERNEL - Remove unused L2C types and macros. - Remove unused SLI types and macros. - Fix compile error when USB is not enabled. - Octeon: Remove unused PCIERCX types and macros. - Octeon: Clean up platform code. SNI: - Remove recursive include of cpu-feature-overrides.h Sibyte: - Export symbol periph_rev to sb1250-mac network driver. - Fix Kconfig warning. Generic platform: - Enable Root FS on NFS in generic_defconfig SMP-MT: - Use CPU interrupt controller IPI IRQ domain support UASM: - Add support for LHU for uasm. - Remove needless ISA abstraction mm: - Add 48-bit VA space and 4-level page tables for 4K pages. PCI: - Add controllers before the specified head irqchip driver for MIPS CPU: - Replace magic 0x100 with IE_SW0 - Prepare for non-legacy IRQ domains - Introduce IPI IRQ domain support MAINTAINERS: - Update email-id of Rahul Bedarkar NET: - sb1250-mac: Add missing MODULE_LICENSE() CPUFREQ: - Loongson2: drop set_cpus_allowed_ptr() Misc: - Disable Werror when W= is set - Opt into HAVE_COPY_THREAD_TLS - Enable GENERIC_CPU_AUTOPROBE - Use common outgoing-CPU-notification code - Remove dead define of ST_OFF - Remove CONFIG_ARCH_HAS_ILOG2_U{32,64} - Stengthen IPI IRQ domain sanity check - Remove confusing else statement in __do_page_fault() - Don't unnecessarily include kmalloc.h into <asm/cache.h>. - Delete unused definition of SMP_CACHE_SHIFT. - Delete redundant definition of SMP_CACHE_BYTES" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (39 commits) MIPS: Sibyte: Fix Kconfig warning. MIPS: Sibyte: Export symbol periph_rev to sb1250-mac network driver. NET: sb1250-mac: Add missing MODULE_LICENSE() MAINTAINERS: Update email-id of Rahul Bedarkar MIPS: Remove confusing else statement in __do_page_fault() MIPS: Stengthen IPI IRQ domain sanity check MIPS: smp-mt: Use CPU interrupt controller IPI IRQ domain support irqchip: mips-cpu: Introduce IPI IRQ domain support irqchip: mips-cpu: Prepare for non-legacy IRQ domains irqchip: mips-cpu: Replace magic 0x100 with IE_SW0 MIPS: Remove CONFIG_ARCH_HAS_ILOG2_U{32,64} MIPS: generic: Enable Root FS on NFS in generic_defconfig MIPS: mach-rm: Remove recursive include of cpu-feature-overrides.h MIPS: Opt into HAVE_COPY_THREAD_TLS CPUFREQ: Loongson2: drop set_cpus_allowed_ptr() MIPS: uasm: Remove needless ISA abstraction MIPS: Remove dead define of ST_OFF MIPS: Use common outgoing-CPU-notification code MIPS: math-emu: Fix BC1EQZ and BC1NEZ condition handling MIPS: r2-on-r6-emu: Clear BLTZALL and BGEZALL debugfs counters ...	2017-05-12 09:56:30 -07:00
Len Brown	3cedbc5a6d	intel_pstate: use updated msr-index.h HWP.EPP values intel_pstate exports sysfs attributes for setting and observing HWP.EPP. These attributes use strings to describe 4 operating states, and inside the driver, these strings are mapped to numerical register values. The authorative mapping between the strings and numerical HWP.EPP values are now globally defined in msr-index.h, replacing the out-dated mapping that were open-coded into intel_pstate.c new old string --- --- ------ 0 0 performance 128 64 balance_performance 192 128 balance_power 255 192 power Note that the HW and BIOS default value on most system is 128, which intel_pstate will now call "balance_performance" while it used to call it "balance_power". Signed-off-by: Len Brown <len.brown@intel.com>	2017-05-11 21:27:53 -04:00
Linus Torvalds	291b38a756	Annotation of module parameters that specify device settings -----BEGIN PGP SIGNATURE----- iQIVAwUAWPiW6vSw1s6N8H32AQLOrw/+NTqGf7bjq+64YKS6NfR0XDgE+wNJltGO ck7zJW3NHIg76RNu8s0I9xg5aVmwizz3Z5DGROZquaolnezux4tQihZ3AFyxIzLc +Y3WHYagcML7yFfjl/WznCLRD5EW3yPln4lCvQO0nW/xICRYeRI057JaIbi2Dtek BhcXt3c4AjXDLdYJkgtHV3p2R2mt8hcdFdWqqx6s7JaIThZNRGNzxAgtbcB9k5IW HVG9ZEIL73VBYWHrYivzjHYF5rBnNCPt87eOwDQeTOSkhv8te+u9k+bH8vxZw1T0 XUtDrLBndKiuVo2GUfLkkF8LItx3Q9eLCJYy0joaIliyPqTEsPx9KjQ+Af0cxS9s ZPCZ5SYf96stKmDeL5xaMfrAmeyVHJ4lc4JTOqdzbIT8blsOSfYO/03p0ALShSDv /RQLaKGlf8Bjoy8PwKFcXb4sIDufcd/U1Av/EMFXxOfgN/u2JUkGKq6EaIM5B68L fHPje+aR9VNELPmPjwNOWtmN4I79EH3EItQf7zv0KG+UeKhcHLx/EAcSJ3ZRKEkH Lathg7pPOEJGArPiVO79TZzBG01ADn1aiwv65XObMzNZ+54xI/mN/Y1DNF/kL5jU XzvNzEjFt8mwMIZGVNdAt4+pDyMfIZGZSyUkSRKFnaQZMIvQrfQIU9RLBYLX5eOx +/p0VkIwDpg= =lbS7 -----END PGP SIGNATURE----- Merge tag 'hwparam-20170420' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull hw lockdown support from David Howells: "Annotation of module parameters that configure hardware resources including ioports, iomem addresses, irq lines and dma channels. This allows a future patch to prohibit the use of such module parameters to prevent that hardware from being abused to gain access to the running kernel image as part of locking the kernel down under UEFI secure boot conditions. Annotations are made by changing: module_param(n, t, p) module_param_named(n, v, t, p) module_param_array(n, t, m, p) to: module_param_hw(n, t, hwtype, p) module_param_hw_named(n, v, t, hwtype, p) module_param_hw_array(n, t, hwtype, m, p) where the module parameter refers to a hardware setting hwtype specifies the type of the resource being configured. This can be one of: ioport Module parameter configures an I/O port iomem Module parameter configures an I/O mem address ioport_or_iomem Module parameter could be either (runtime set) irq Module parameter configures an I/O port dma Module parameter configures a DMA channel dma_addr Module parameter configures a DMA buffer address other Module parameter configures some other value Note that the hwtype is compile checked, but not currently stored (the lockdown code probably won't require it). It is, however, there for future use. A bonus is that the hwtype can also be used for grepping. The intention is for the kernel to ignore or reject attempts to set annotated module parameters if lockdown is enabled. This applies to options passed on the boot command line, passed to insmod/modprobe or direct twiddling in /sys/module/ parameter files. The module initialisation then needs to handle the parameter not being set, by (1) giving an error, (2) probing for a value or (3) using a reasonable default. What I can't do is just reject a module out of hand because it may take a hardware setting in the module parameters. Some important modules, some ipmi stuff for instance, both probe for hardware and allow hardware to be manually specified; if the driver is aborts with any error, you don't get any ipmi hardware. Further, trying to do this entirely in the module initialisation code doesn't protect against sysfs twiddling. [!] Note that in and of itself, this series of patches should have no effect on the the size of the kernel or code execution - that is left to a patch in the next series to effect. It does mark annotated kernel parameters with a KERNEL_PARAM_FL_HWPARAM flag in an already existing field" * tag 'hwparam-20170420' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (38 commits) Annotate hardware config module parameters in sound/pci/ Annotate hardware config module parameters in sound/oss/ Annotate hardware config module parameters in sound/isa/ Annotate hardware config module parameters in sound/drivers/ Annotate hardware config module parameters in fs/pstore/ Annotate hardware config module parameters in drivers/watchdog/ Annotate hardware config module parameters in drivers/video/ Annotate hardware config module parameters in drivers/tty/ Annotate hardware config module parameters in drivers/staging/vme/ Annotate hardware config module parameters in drivers/staging/speakup/ Annotate hardware config module parameters in drivers/staging/media/ Annotate hardware config module parameters in drivers/scsi/ Annotate hardware config module parameters in drivers/pcmcia/ Annotate hardware config module parameters in drivers/pci/hotplug/ Annotate hardware config module parameters in drivers/parport/ Annotate hardware config module parameters in drivers/net/wireless/ Annotate hardware config module parameters in drivers/net/wan/ Annotate hardware config module parameters in drivers/net/irda/ Annotate hardware config module parameters in drivers/net/hamradio/ Annotate hardware config module parameters in drivers/net/ethernet/ ...	2017-05-10 19:13:03 -07:00
Kees Cook	063246641d	format-security: move static strings to const While examining output from trial builds with -Wformat-security enabled, many strings were found that should be defined as "const", or as a char array instead of char pointer. This makes some static analysis easier, by producing fewer false positives. As these are all trivial changes, it seemed best to put them all in a single patch rather than chopping them up per maintainer. Link: http://lkml.kernel.org/r/20170405214711.GA5711@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Jes Sorensen <jes@trained-monkey.org> [runner.c] Cc: Tony Lindgren <tony@atomide.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: David Airlie <airlied@linux.ie> Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Jiri Slaby <jslaby@suse.com> Cc: Patrice Chotard <patrice.chotard@st.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Mugunthan V N <mugunthanvnm@ti.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Jarod Wilson <jarod@redhat.com> Cc: Florian Westphal <fw@strlen.de> Cc: Antonio Quartulli <a@unstable.cc> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Kejian Yan <yankejian@huawei.com> Cc: Daode Huang <huangdaode@hisilicon.com> Cc: Qianqian Xie <xieqianqian@huawei.com> Cc: Philippe Reynes <tremyfr@gmail.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Christian Gromm <christian.gromm@microchip.com> Cc: Andrey Shvetsov <andrey.shvetsov@k2l.de> Cc: Jason Litzinger <jlitzingerdev@gmail.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-05-08 17:15:14 -07:00
Stephen Boyd	ad61dd303a	scripts/spelling.txt: add regsiter -> register spelling mistake This typo is quite common. Fix it and add it to the spelling file so that checkpatch catches it earlier. Link: http://lkml.kernel.org/r/20170317011131.6881-2-sboyd@codeaurora.org Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-05-08 17:15:13 -07:00
Linus Torvalds	3527d3e951	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - another round of rq-clock handling debugging, robustization and fixes - PELT accounting improvements - CPU hotplug related ->cpus_allowed affinity handling fixes all around the tree - ... plus misc fixes, cleanups and updates" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits) sched/x86: Update reschedule warning text crypto: N2 - Replace racy task affinity logic cpufreq/sparc-us2e: Replace racy task affinity logic cpufreq/sparc-us3: Replace racy task affinity logic cpufreq/sh: Replace racy task affinity logic cpufreq/ia64: Replace racy task affinity logic ACPI/processor: Replace racy task affinity logic ACPI/processor: Fix error handling in __acpi_processor_start() sparc/sysfs: Replace racy task affinity logic powerpc/smp: Replace open coded task affinity logic ia64/sn/hwperf: Replace racy task affinity logic ia64/salinfo: Replace racy task affinity logic workqueue: Provide work_on_cpu_safe() ia64/topology: Remove cpus_allowed manipulation sched/fair: Move the PELT constants into a generated header sched/fair: Increase PELT accuracy for small tasks sched/fair: Fix comments sched/Documentation: Add 'sched-pelt' tool sched/fair: Fix corner case in __accumulate_sum() sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags() ...	2017-05-01 19:12:53 -07:00
Rafael J. Wysocki	2addac72af	Merge schedutil governor updates for v4.12.	2017-04-28 23:13:33 +02:00
Rafael J. Wysocki	2dee4b0e0b	Merge intel_pstate driver updates for v4.12.	2017-04-28 23:13:04 +02:00
David Howells	40059ec670	Annotate hardware config module parameters in drivers/cpufreq/ When the kernel is running in secure boot mode, we lock down the kernel to prevent userspace from modifying the running kernel image. Whilst this includes prohibiting access to things like /dev/mem, it must also prevent access by means of configuring driver modules in such a way as to cause a device to access or modify the kernel image. To this end, annotate module_param* statements that refer to hardware configuration and indicate for future reference what type of parameter they specify. The parameter parser in the core sees this information and can skip such parameters with an error message if the kernel is locked down. The module initialisation then runs as normal, but just sees whatever the default values for those parameters is. Note that we do still need to do the module initialisation because some drivers have viable defaults set in case parameters aren't specified and some drivers support automatic configuration (e.g. PNP or PCI) in addition to manually coded parameters. This patch annotates drivers in drivers/cpufreq/. Suggested-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> cc: linux-pm@vger.kernel.org	2017-04-20 12:02:32 +01:00
Mikko Perttunen	939dc6f51e	cpufreq: Add Tegra186 cpufreq driver Add a new cpufreq driver for Tegra186 (and likely later). The CPUs are organized into two clusters, Denver and A57, with two and four cores respectively. CPU frequency can be adjusted by writing the desired rate divisor and a voltage hint to a special per-core register. The frequency of each core can be set individually; however, this is just a hint as all CPUs in a cluster will run at the maximum rate of non-idle CPUs in the cluster. Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-04-19 23:23:08 +02:00
Christophe Jaillet	eafca85163	cpufreq: imx6q: Fix error handling code According to the previous error handling code, it is likely that 'goto out_free_opp' is expected here in order to avoid a memory leak in error handling path. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-04-19 23:22:01 +02:00
Leonard Crestez	5aa1599ff0	cpufreq: imx6q: Set max suspend_freq to avoid changes during suspend If the cpufreq driver tries to modify voltage/freq during suspend/resume it might need to control an external PMIC via I2C or SPI but those devices might be already suspended. This issue is likely to happen whenever the LDOs have their vin-supply set. To avoid this scenario we just increase cpufreq to the maximum before suspend. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-04-19 23:22:01 +02:00
Irina Tirdea	54cad2fce7	cpufreq: imx6q: Fix handling EPROBE_DEFER from regulator If there are any errors in getting the cpu0 regulators, the driver returns -ENOENT. In case the regulators are not yet available, the devm_regulator_get calls will return -EPROBE_DEFER, so that the driver can be probed later. If we return -ENOENT, the driver will fail its initialization and will not try to probe again (when the regulators become available). Return the actual error received from regulator_get in probe. Print a differentiated message in case we need to probe the device later and in case we actually failed. Also add a message to inform when the driver has been successfully registered. Signed-off-by: Irina Tirdea <irina.tirdea@nxp.com> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-04-19 23:22:00 +02:00
Rafael J. Wysocki	1b72e7fd30	cpufreq: schedutil: Use policy-dependent transition delays Make the schedutil governor take the initial (default) value of the rate_limit_us sysfs attribute from the (new) transition_delay_us policy parameter (to be set by the scaling driver). That will allow scaling drivers to make schedutil use smaller default values of rate_limit_us and reduce the default average time interval between consecutive frequency changes. Make intel_pstate set transition_delay_us to 500. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2017-04-17 18:37:27 +02:00
Rafael J. Wysocki	5ed8a1c19d	Merge branch 'intel_pstate' into pm-cpufreq-sched	2017-04-17 01:13:02 +02:00
Thomas Gleixner	12699ac53a	cpufreq/sparc-us2e: Replace racy task affinity logic The access to the HBIRD_ESTAR_MODE register in the cpu frequency control functions must happen on the target CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by a straight forward smp function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704131020280.2408@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-04-15 12:20:56 +02:00
Thomas Gleixner	9fe24c4e92	cpufreq/sparc-us3: Replace racy task affinity logic The access to the safari config register in the CPU frequency functions must be executed on the target CPU. This is achieved by temporarily setting the affinity of the calling user space thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. CPU hotplug and concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by a straight forward smp function call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/20170412201043.047558840@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-04-15 12:20:55 +02:00
Thomas Gleixner	205dcc1ecb	cpufreq/sh: Replace racy task affinity logic The target() callback must run on the affected cpu. This is achieved by temporarily setting the affinity of the calling thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. concurrent affinity settings for that thread resulting in code executing on the wrong CPU. Replace it by work_on_cpu(). All call pathes which invoke the callbacks are already protected against CPU hotplug. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/20170412201042.958216363@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-04-15 12:20:55 +02:00
Thomas Gleixner	38f05ed04b	cpufreq/ia64: Replace racy task affinity logic The get() and target() callbacks must run on the affected cpu. This is achieved by temporarily setting the affinity of the calling thread to the requested CPU and reset it to the original affinity afterwards. That's racy vs. concurrent affinity settings for that thread resulting in code executing on the wrong CPU and overwriting the new affinity setting. Replace it by work_on_cpu(). All call pathes which invoke the callbacks are already protected against CPU hotplug. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: linux-pm@vger.kernel.org Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Len Brown <lenb@kernel.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1704122231100.2548@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-04-15 12:20:55 +02:00
Rafael J. Wysocki	c97ad0fc4f	Merge back cpufreq core changes for v4.12.	2017-04-15 00:23:36 +02:00
Chen Yu	c4a3fa261b	cpufreq: Bring CPUs up even if cpufreq_online() failed There is a report that after commit `27622b061e` ("cpufreq: Convert to hotplug state machine"), the normal CPU offline/online cycle fails on some platforms. According to the ftrace result, this problem was triggered on platforms using acpi-cpufreq as the default cpufreq driver, and due to the lack of some ACPI freq method (eg. _PCT), cpufreq_online() failed and returned a negative value, so the CPU hotplug state machine rolled back the CPU online process. Actually, from the user's perspective, the failure of cpufreq_online() should not prevent that CPU from being brought up, although cpufreq might not work on that CPU. BTW, during system startup cpufreq_online() is not invoked via CPU online but by the cpufreq device creation process, so the APs can be brought up even though cpufreq_online() fails in that stage. This patch ignores the return value of cpufreq_online/offline() and lets the cpufreq framework deal with the failure. cpufreq_online() itself will do a proper rollback in that case and if _PCT is missing, the ACPI cpufreq driver will print a warning if the corresponding debug options have been enabled. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194581 Fixes: `27622b061e` ("cpufreq: Convert to hotplug state machine") Reported-and-tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.9+ <stable@vger.kernel.org> # 4.9+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-04-13 03:38:44 +02:00
Sebastian Andrzej Siewior	759f534e93	CPUFREQ: Loongson2: drop set_cpus_allowed_ptr() It is pure mystery to me why we need to be on a specific CPU while looking up a value in an array. My best shot at this is that before commit `d4019f0a92` ("cpufreq: move freq change notifications to cpufreq core") it was required to invoke cpufreq_notify_transition() on a special CPU. Since it looks like a waste, remove it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: tglx@linutronix.de Cc: linux-pm@vger.kernel.org Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15888/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2017-04-12 13:52:21 +02:00
Rafael J. Wysocki	69a07f1803	Merge back cpufreq changes for v4.12.	2017-04-06 01:27:21 +02:00
Rafael J. Wysocki	46e1d5e972	Merge branches 'pm-cpufreq-fixes' and 'pm-cpuidle-fixes' * pm-cpufreq-fixes: cpufreq: Fix creation of symbolic links to policy directories * pm-cpuidle-fixes: cpuidle: powernv: Pass correct drv->cpumask for registration	2017-03-31 23:00:53 +02:00
Box, David E	630e57573e	cpufreq: intel_pstate: Add support for Gemini Lake Use same parameters as INTEL_FAM6_ATOM_GOLDMONT to enable Gemini Lake. Signed-off-by: Box, David E <david.e.box@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-29 22:45:32 +02:00
Rafael J. Wysocki	b02aabe8ab	cpufreq: intel_pstate: Eliminate intel_pstate_get_min_max() Some computations in intel_pstate_get_min_max() are not necessary and one of its two callers doesn't even use the full result. First off, the fixed-point value of cpu->max_perf represents a non-negative number between 0 and 1 inclusive and cpu->min_perf cannot be greater than cpu->max_perf. It is not necessary to check those conditions every time the numbers in question are used. Moreover, since intel_pstate_max_within_limits() only needs the upper boundary, it doesn't make sense to compute the lower one in there and returning min and max from intel_pstate_get_min_max() via pointers doesn't look particularly nice. For the above reasons, drop intel_pstate_get_min_max(), add a helper to get the base P-state for min/max computations and carry out them directly in the previous callers of intel_pstate_get_min_max(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:17 +02:00
Rafael J. Wysocki	2bfc4cbb5f	cpufreq: intel_pstate: Do not walk policy->cpus intel_pstate_hwp_set() is the only function walking policy->cpus in intel_pstate. The rest of the code simply assumes one CPU per policy, including the initialization code. Therefore it doesn't make sense for intel_pstate_hwp_set() to walk policy->cpus as it is guaranteed to have only one bit set for policy->cpu. For this reason, rearrange intel_pstate_hwp_set() to take the CPU number as the argument and drop the loop over policy->cpus from it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:16 +02:00
Rafael J. Wysocki	8ca6ce3701	cpufreq: intel_pstate: Introduce pid_in_use() Add a new function pid_in_use() to return the information on whether or not the PID-based P-state selection algorithm is in use. That allows a couple of complicated conditions in the code to be reduced to simple checks against the new function's return value. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:16 +02:00
Rafael J. Wysocki	2f49afc2a6	cpufreq: intel_pstate: Drop struct cpu_defaults The cpu_defaults structure is redundant, because it only contains one member of type struct pstate_funcs which can be used directly instead of struct cpu_defaults. For this reason, drop struct cpu_defaults, use struct pstate_funcs directly instead of it where applicable and rename all of the variables of that type accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:16 +02:00
Rafael J. Wysocki	de4a76cb58	cpufreq: intel_pstate: Move cpu_defaults definitions Move the definitions of the cpu_defaults structures after the definitions of utilization update callback routines to avoid extra declarations of the latter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:16 +02:00
Rafael J. Wysocki	67dd9bf441	cpufreq: intel_pstate: Add update_util callback to pstate_funcs Avoid using extra function pointers during P-state selection by dropping the get_target_pstate member from struct pstate_funcs, adding a new update_util callback to it (to be registered with the CPU scheduler as the utilization update callback in the active mode) and reworking the utilization update callback routines to invoke specific P-state selection functions directly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:16 +02:00
Rafael J. Wysocki	eabd22c657	cpufreq: intel_pstate: Use different utilization update callbacks Notice that some overhead in the utilization update callbacks registered by intel_pstate in the active mode can be avoided if those callbacks are tailored to specific configurations of the driver. For example, the utilization update callback for the HWP enabled case only needs to update the average CPU performance periodically whereas the utilization update callback for the PID-based algorithm does not need to take IO-wait boosting into account and so on. With that in mind, define three utilization update callbacks for three different use cases: HWP enabled, the CPU load "powersave" P-state selection algorithm and the PID-based "powersave" P-state selection algorithm and modify the driver initialization to choose the callback matching its current configuration. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:15 +02:00
Rafael J. Wysocki	0042b2c069	cpufreq: intel_pstate: Modify check in intel_pstate_update_status() One of the checks in intel_pstate_update_status() implicitly relies on the information that there are only two struct cpufreq_driver objects available, but it is better to do it directly against the value it really is about (to make the code easier to follow if nothing else). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:15 +02:00
Rafael J. Wysocki	ee8df89a68	cpufreq: intel_pstate: Drop driver_registered variable The driver_registered variable in intel_pstate is used for checking whether or not the driver has been registered, but intel_pstate_driver can be used for that too (with the rule that the driver is not registered as long as it is NULL). That is a bit more straightforward and the code may be simplified a bit this way, so modify the driver accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:15 +02:00
Rafael J. Wysocki	694cb17347	cpufreq: intel_pstate: Skip unnecessary PID resets on init PID controller parameters only need to be initialized if the get_target_pstate_use_performance() P-state selection routine is going to be used. It is not necessary to initialize them otherwise, so don't do that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:14 +02:00
Rafael J. Wysocki	7aec5b50e9	cpufreq: intel_pstate: Set HWP sampling interval once In the HWP enabled case pid_params.sample_rate_ns only needs to be updated once, because it is global, so do that when setting hwp_active instead of doing it during the initialization of every CPU. Moreover, pid_params.sample_rate_ms is never used if HWP is enabled, so do not update it at all then. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:14 +02:00
Rafael J. Wysocki	ff35f02ea1	cpufreq: intel_pstate: Clean up intel_pstate_busy_pid_reset() intel_pstate_busy_pid_reset() is the only caller of pid_reset(), pid_p_gain_set(), pid_i_gain_set(), and pid_d_gain_set(). Moreover, it passes constants as two parameters of pid_reset() and all of the other routines above essentially contain the same code, so fold all of them into the caller and drop unnecessary computations. Introduce percent_fp() for converting integer values in percent to fixed-point fractions and use it in the above code cleanup. Finally, rename intel_pstate_busy_pid_reset() to intel_pstate_pid_reset() as it also is used for the initialization of PID parameters for every CPU and the meaning of the "busy" part of the name is not particularly clear. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:11 +02:00
Rafael J. Wysocki	4ddd0146c7	cpufreq: intel_pstate: Fold intel_pstate_reset_all_pid() into the caller There is only one caller of intel_pstate_reset_all_pid(), which is pid_param_set() used in the debugfs interface only, and having that code split does not make it particularly convenient to follow. For this reason, move the body of intel_pstate_reset_all_pid() into its caller and drop that function. Also change the loop from for_each_online_cpu() (which is obviously racy with respect to CPU offline/online) to for_each_possible_cpu(), so that all PID parameters are reset for all CPUs regardless of their online/offline status (to prevent, for example, a previously offline CPU from going online with a stale set of PID parameters). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:09 +02:00
Rafael J. Wysocki	5c43905369	cpufreq: intel_pstate: Initialize pid_params statically Notice that both the existing struct cpu_defaults instances in which PID parameters are actually initialized use the same values of those parameters, so it is not really necessary to copy them over to pid_params dynamically. Instead, initialize pid_params statically with those values and drop the unused pid_policy member from struct cpu_defaults along with copy_pid_params() used for initializing it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:08 +02:00
Rafael J. Wysocki	6404367862	cpufreq: intel_pstate: Drop pointless initialization of PID parameters The P-state selection algorithm used by intel_pstate for Atom processors is not based on the PID controller and the initialization of PID parametrs for those processors is pointless and confusing, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:07 +02:00
Rafael J. Wysocki	e14cf8857e	cpufreq: intel_pstate: Eliminate struct perf_limits After recent changes the purpose of struct perf_limits is not particularly clear any more and the code may be made somewhat easier to follow by eliminating it, so go for that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-28 23:12:07 +02:00
Rafael J. Wysocki	2f0ba790df	cpufreq: Fix creation of symbolic links to policy directories The cpufreq core only tries to create symbolic links from CPU directories in sysfs to policy directories in cpufreq_add_dev(), either when a given CPU is registered or when the cpufreq driver is registered, whichever happens first. That is not sufficient, however, because cpufreq_add_dev() may be called for an offline CPU whose policy object has not been created yet and, quite obviously, the symbolic cannot be added in that case. Fix that by making cpufreq_online() attempt to add symbolic links to policy objects for the CPUs in the related_cpus mask of every new policy object created by it. The cpufreq_driver_lock locking around the for_each_cpu() loop in cpufreq_online() is dropped, because it is not necessary and the code is somewhat simpler without it. Moreover, failures to create a symbolic link will not be regarded as hard errors any more and the CPUs without those links will not be taken offline automatically, but that should not be problematic in practice. Reported-and-tested-by: Prashanth Prakash <pprakash@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 4.9+ <stable@vger.kernel.org> # 4.9+	2017-03-27 19:33:09 +02:00
Rafael J. Wysocki	80b120ca1a	cpufreq: intel_pstate: Avoid transient updates of cpuinfo.max_freq Both intel_pstate_verify_policy() and intel_cpufreq_verify_policy() set policy->cpuinfo.max_freq depending on the turbo status, but the updates made by them are discarded by the core, because the policy object passed to them by the core is temporary and cpuinfo.max_freq from that object is not copied to the final policy object in cpufreq_set_policy(). However, cpufreq_set_policy() passes the temporary policy object to the ->setpolicy callback of the driver, so intel_pstate_set_policy() actually sees the policy->cpuinfo.max_freq value updated by intel_pstate_verify_policy() and not the final one. It also updates policy->max sometimes which basically has no effect after it returns, because the core discards that update. To avoid confusion, eliminate policy->cpuinfo.max_freq updates from intel_pstate_verify_policy() and intel_cpufreq_verify_policy() entirely and check the maximum frequency explicitly in intel_pstate_update_perf_limits() instead of relying on the transiently updated policy->cpuinfo.max_freq value. Moreover, move the max->policy adjustment carried out in intel_pstate_set_policy() to a separate function and call that function from the ->verify driver callbacks to ensure that it will actually be effective. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-24 03:04:32 +01:00
Rafael J. Wysocki	c5a2ee7dde	cpufreq: intel_pstate: Active mode P-state limits rework The coordination of P-state limits used by intel_pstate in the active mode (ie. by default) is problematic, because it synchronizes all of the limits (ie. the global ones and the per-policy ones) so as to use one common pair of P-state limits (min and max) across all CPUs in the system. The drawbacks of that are as follows: - If P-states are coordinated in hardware, it is not necessary to coordinate them in software on top of that, so in that case all of the above activity is in vain. - If P-states are not coordinated in hardware, then the processor is actually capable of setting different P-states for different CPUs and coordinating them at the software level simply doesn't allow that capability to be utilized. - The coordination works in such a way that setting a per-policy limit (eg. scaling_max_freq) for one CPU causes the common effective limit to change (and it will affect all of the other CPUs too), but subsequent reads from the corresponding sysfs attributes for the other CPUs will return stale values (which is confusing). - Reads from the global P-state limit attributes, min_perf_pct and max_perf_pct, return the effective common values and not the last values set through these attributes. However, the last values set through these attributes become hard limits that cannot be exceeded by writes to scaling_min_freq and scaling_max_freq, respectively, and they are not exposed, so essentially users have to remember what they are. All of that is painful enough to warrant a change of the management of P-state limits in the active mode. To that end, redesign the active mode P-state limits management in intel_pstate in accordance with the following rules: (1) All CPUs are affected by the global limits (that is, none of them can be requested to run faster than the global max and none of them can be requested to run slower than the global min). (2) Each individual CPU is affected by its own per-policy limits (that is, it cannot be requested to run faster than its own per-policy max and it cannot be requested to run slower than its own per-policy min). (3) The global and per-policy limits can be set independently. Also, the global maximum and minimum P-state limits will be always expressed as percentages of the maximum supported turbo P-state. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-24 03:04:31 +01:00
Rafael J. Wysocki	553953453b	cpufreq: intel_pstate: Use load-based P-state selection more widely Extend the set of systems for which intel_pstate will use the "powersave" P-state selection algorithm based on CPU load in the active mode by systems with ACPI preferred profile set to "tablet", "appliance PC", "desktop", or "workstation" (ie. everything with a specified preferred profile that is not a "server"). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-24 03:04:31 +01:00
Rafael J. Wysocki	eb5139d1a2	cpufreq: intel_pstate: Support HWP processors in all operation modes Currently, some processors supporting HWP are only supported by intel_pstate if HWP is actually going to be used and not supported otherwise which is confusing. Specifically, they are not supported if "intel_pstate=no_hwp" is passed to the kernel in the command line or if the driver is started in the passive mode ("intel_pstate=passive"). There is no real reason for that, because everything about those processor is known anyway and the driver can work with them in all modes, so make that happen, but use the load-based P-state selection algorithm for the active mode "powersave" policy with them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-24 03:04:31 +01:00
Rafael J. Wysocki	f1a91645b7	Merge back intel_pstate updates for 4.12.	2017-03-24 03:04:10 +01:00
Rafael J. Wysocki	6488294e4a	Merge branches 'pm-cpufreq-fixes', 'pm-cpufreq-sched-fixes' and 'intel_pstate-fixes' * pm-cpufreq-fixes: cpufreq: Restore policy min/max limits on CPU online * pm-cpufreq-sched-fixes: cpufreq: schedutil: Fix per-CPU structure initialization in sugov_start() * intel_pstate-fixes: cpufreq: intel_pstate: Fix policy data management in passive mode cpufreq: intel_pstate: One set of global limits in active mode	2017-03-24 00:43:26 +01:00
Viresh Kumar	ff010472fb	cpufreq: Restore policy min/max limits on CPU online On CPU online the cpufreq core restores the previous governor (or the previous "policy" setting for ->setpolicy drivers), but it does not restore the min/max limits at the same time, which is confusing, inconsistent and real pain for users who set the limits and then suspend/resume the system (using full suspend), in which case the limits are reset on all CPUs except for the boot one. Fix this by making cpufreq_online() restore the limits when an inactive policy is brought online. The commit log and patch are inspired from Rafael's earlier work. Reported-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.3+ <stable@vger.kernel.org> # 4.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-22 02:38:27 +01:00
Rafael J. Wysocki	64897b20ee	cpufreq: intel_pstate: Fix policy data management in passive mode The policy->cpuinfo.max_freq and policy->max updates in intel_cpufreq_turbo_update() are excessive as they are done for no good reason and may lead to problems in principle, so they should be dropped. However, after dropping them intel_cpufreq_turbo_update() becomes almost entirely pointless, because the check made by it is made again down the road in intel_pstate_prepare_request(). The only thing in it that still needs to be done is the call to update_turbo_state(), so drop intel_cpufreq_turbo_update() altogether and make its callers invoke update_turbo_state() directly instead of it. In addition to that, fix intel_cpufreq_verify_policy() so that it checks global.no_turbo in addition to global.turbo_disabled when updating policy->cpuinfo.max_freq to make it consistent with intel_pstate_verify_policy(). Fixes: `001c76f05b` (cpufreq: intel_pstate: Generic governors support) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-21 22:19:07 +01:00
Rafael J. Wysocki	7de32556df	cpufreq: intel_pstate: One set of global limits in active mode In the active mode intel_pstate currently uses two sets of global limits, each associated with one of the possible scaling_governor settings in that mode: "powersave" or "performance". The driver switches over from one of those sets to the other depending on the scaling_governor setting for the last CPU whose per-policy cpufreq interface in sysfs was last used to change parameters exposed in there. That obviously leads to no end of issues when the scaling_governor settings differ between CPUs. The most recent issue was introduced by commit `a240c4aa5d` (cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy) that eliminated the reinitialization of "performance" limits in intel_pstate_set_policy() preventing the max limit from being set to anything below 100, among other things. Namely, an undesirable side effect of commit `a240c4aa5d` is that now, after setting scaling_governor to "performance" in the active mode, the per-policy limits for the CPU in question go to the highest level and stay there even when it is switched back to "powersave" later. As it turns out, some distributions set scaling_governor to "performance" temporarily for all CPUs to speed-up system initialization, so that change causes them to misbehave later. To fix that, get rid of the performance/powersave global limits split and use just one set of global limits for everything. From the user's persepctive, after this modification, when scaling_governor is switched from "performance" to "powersave" or the other way around on one CPU, the limits settings (ie. the global max/min_perf_pct and per-policy scaling_max/min_freq for any CPUs) will not change. Still, switching from "performance" to "powersave" or the other way around changes the way in which P-states are selected and in particular "performance" causes the driver to always request the highest P-state it is allowed to ask for for the given CPU. Fixes: `a240c4aa5d` (cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-18 00:57:39 +01:00
Rafael J. Wysocki	8b766e05d8	Merge branches 'pm-cpufreq-fixes' and 'intel_pstate-fixes' * pm-cpufreq-fixes: cpufreq: Fix and clean up show_cpuinfo_cur_freq() * intel_pstate-fixes: cpufreq: intel_pstate: Avoid percentages in limits-related computations cpufreq: intel_pstate: Correct frequency setting in the HWP mode cpufreq: intel_pstate: Update pid_params.sample_rate_ns in pid_param_set()	2017-03-18 00:45:09 +01:00
Viresh Kumar	19678ffb9f	cpufreq: dbx500: Manage cooling device from cpufreq driver The best place to register the CPU cooling device is from the cpufreq driver as we would know if all the resources are already available or not. That's what is done for the cpufreq-dt.c driver as well. The cpu-cooling driver for dbx500 platform was just (un)registering with the thermal framework and that can be handled easily by the cpufreq driver as well and in proper sequence as well. Get rid of the cooling driver and its its users and manage everything from the cpufreq driver instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-16 00:14:31 +01:00
Rafael J. Wysocki	9b4f603e7a	cpufreq: Fix and clean up show_cpuinfo_cur_freq() There is a missing newline in show_cpuinfo_cur_freq(), so add it, but while at it clean that function up somewhat too. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: All applicable <stable@vger.kernel.org>	2017-03-16 00:12:40 +01:00
Rafael J. Wysocki	e4c204ced0	cpufreq: intel_pstate: Avoid percentages in limits-related computations Currently, intel_pstate_update_perf_limits() first converts the policy minimum and maximum limits into percentages of the maximum turbo frequency (rounding up to an integer) and then converts these percentages to fractions (by using fixed-point arithmetic to divide them by 100). That introduces a rounding error unnecessarily, because the fractions can be obtained by carrying out fixed-point divisions directly on the input numbers. Rework the computations in intel_pstate_hwp_set() to use fractions instead of percentages (and drop redundant local variables from there) and modify intel_pstate_update_perf_limits() to compute the fractions directly and percentages out of them. While at it, introduce percent_ext_fp() for converting percentages to fractions (with extended number of fraction bits) and use it in the computations. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-15 16:52:29 +01:00
Srinivas Pandruvada	3f8ed54aee	cpufreq: intel_pstate: Correct frequency setting in the HWP mode In the functions intel_pstate_hwp_set(), min/max range from HWP capability MSR along with max_perf_pct and min_perf_pct, is used to set the HWP request MSR. In some cases this doesn't result in the correct HWP max/min in HWP request. For example: In the following case: HWP capabilities from MSR 0x771 0x70a1220 Here cpufreq min/max frequencies from above MSR dump are 700MHz and 3.2GHz respectively. This will result in hwp_min = 0x07 hwp_max = 0x20 To limit max frequency to 2GHz: perf_limits->max_perf_pct = 63 (2GHz as a percent of 3.2GHz rounded up) With the current calculation: adj_range = max_perf_pct * range / 100; adj_range = 63 * (32 - 7) / 100 adj_range = 15 max = hw_min + adj_range; max = 7 + 15 = 22 This will result in HWP request of 0x160f, which will result in a frequency cap of 2.2GHz not 2GHz. The problem with the above calculation is that hwp_min of 7 is treated as 0% in the range. But max_perf_pct is calculated with respect to minimum as 0 and max as 3.2GHz or hwp_max, so adding hwp_min to it will result in more than the desired. Since the min_perf_pct and max_perf_pct is already a percent of max frequency or hwp_max, this min/max HWP request value can be calculated directly applying these percentage to hwp_max. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-14 03:56:39 +01:00
Rafael J. Wysocki	6e7408acd0	cpufreq: intel_pstate: Update pid_params.sample_rate_ns in pid_param_set() Fix the debugfs interface for PID tuning to actually update pid_params.sample_rate_ns on PID parameters updates, as changing pid_params.sample_rate_ms via debugfs has no effect now. Fixes: `a4675fbc4a` (cpufreq: intel_pstate: Replace timers with utilization update callbacks) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2017-03-13 23:55:12 +01:00
YuanTian Tang	b51d3388e2	cpufreq: qoriq: enhance bus frequency calculation On some platforms, property device-type may be missed in soc node in dts which caused the bus-frequency can not be obtained correctly. This patch enhanced the bus-frequency calculation. When property device-type is missed in dts, bus-frequency will be obtained by looking up clock table to get platform clock and hence get its frequency. Signed-off-by: Tang Yuantian <andy.tang@nxp.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-12 23:10:53 +01:00
Daniel Kurtz	cf9a243825	cpufreq: mediatek: Add support for MT8176 and MT817x The Mediatek MT8173 is just one of several SOCs from the same MT817x family, including the 6-core (4-little/2-big) MT8176. The mt8173-cpufreq driver supports all of these SOCs, however, machines using them may use a different machine compatible. Since this driver checks explicitly for the machine compatible string, add support for the whole family. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-12 23:10:53 +01:00
Daniel Kurtz	08a74cbb1b	cpufreq: mt8173: Mark mt8173_cpufreq_driver_init as __init This function is only called once at boot by device_initcall(), so mark it as __init. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-12 23:10:53 +01:00
Rafael J. Wysocki	5f98ced1c9	cpufreq: intel_pstate: Drop redundant wrapper function intel_pstate_hwp_set_policy() is a wrapper around intel_pstate_hwp_set(), but the only value it adds is to check hwp_active before calling the latter and one of its two callers has already checked hwp_active before that happens, so in that code path the additional check is redundant and using the wrapper is rather pointless. For this reason, drop intel_pstate_hwp_set_policy() and make its callers invoke intel_pstate_hwp_set() directly (after checking hwp_active). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2017-03-12 23:07:58 +01:00
Rafael J. Wysocki	fd8e57d5d3	Merge branch 'pm-cpufreq' * pm-cpufreq: cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy cpufreq: intel_pstate: Fix intel_pstate_verify_policy() cpufreq: intel_pstate: Fix global settings in active mode cpufreq: Add the "cpufreq.off=1" cmdline option cpufreq: intel_pstate: Avoid triggering cpu_frequency tracepoint unnecessarily cpufreq: intel_pstate: Fix intel_cpufreq_verify_policy() cpufreq: intel_pstate: Do not use performance_limits in passive mode	2017-03-09 15:12:27 +01:00
Rafael J. Wysocki	a240c4aa5d	cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy If the current P-state selection algorithm is set to "performance" in intel_pstate_set_policy(), the limits may be initialized from scratch, but only if no_turbo is not set and the maximum frequency allowed for the given CPU (i.e. the policy object representing it) is at least equal to the max frequency supported by the CPU. In all of the other cases, the limits will not be updated. For example, the following can happen: # cat intel_pstate/status active # echo performance > cpufreq/policy0/scaling_governor # cat intel_pstate/min_perf_pct 100 # echo 94 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 100 # cat cpufreq/policy0/scaling_max_freq 3100000 echo 3000000 > cpufreq/policy0/scaling_max_freq # cat intel_pstate/min_perf_pct 94 # echo 95 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 95 That is confusing for two reasons. First, the initial attempt to change min_perf_pct to 94 seems to have no effect, even though setting the global limits should always work. Second, after changing scaling_max_freq for policy0 the global min_perf_pct attribute shows 94, even though it should have not been affected by that operation in principle. Moreover, the final attempt to change min_perf_pct to 95 worked as expected, because scaling_max_freq for the only policy with scaling_governor equal to "performance" was different from the maximum at that time. To make all that confusion go away, modify intel_pstate_set_policy() so that it doesn't reinitialize the limits at all. At the same time, change intel_pstate_set_performance_limits() to set min_sysfs_pct to 100 in the "performance" limits set so that switching the P-state selection algorithm to "performance" causes intel_pstate/min_perf_pct in sysfs to go to 100 (or whatever value min_sysfs_pct in the "performance" limits is set to later). That requires per-CPU limits to be initialized explicitly rather than by copying the global limits to avoid setting min_sysfs_pct in the per-CPU limits to 100. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-06 00:06:05 +01:00
Rafael J. Wysocki	d74b199291	cpufreq: intel_pstate: Fix intel_pstate_verify_policy() The code added to intel_pstate_verify_policy() by commit `1443ebbacf` (cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy) should use perf_limits instead of limits, because otherwise setting global limits via sysfs may affect policies inconsistently. For example, in the sequence of shell commands below, the scaling_min_freq attribute for policy1 and policy2 should be affected in the same way, because scaling_governor is set in the same way for both of them: # cat cpufreq/policy1/scaling_governor powersave # cat cpufreq/policy2/scaling_governor powersave # echo performance > cpufreq/policy0/scaling_governor # echo 94 > intel_pstate/min_perf_pct # cat cpufreq/policy0/scaling_min_freq 2914000 # cat cpufreq/policy1/scaling_min_freq 2914000 # cat cpufreq/policy2/scaling_min_freq 800000 The are affected differently, because intel_pstate_verify_policy() is invoked with limits set to &performance_limits (left behind by policy0) for policy1 and with limits set to &powersave_limits (left behind by policy1) for policy2. Since perf_limits is set to the set of limits matching the policy being updated, using it instead of limits fixes the inconsistency. Fixes: `1443ebbacf` (cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-06 00:06:05 +01:00
Rafael J. Wysocki	cd59b4bed9	cpufreq: intel_pstate: Fix global settings in active mode Commit `111b8b3fe4` (cpufreq: intel_pstate: Always keep all limits settings in sync) changed intel_pstate to invoke cpufreq_update_policy() for every registered CPU on global sysfs attributes updates, but that led to undesirable effects in the active mode if the "performance" P-state selection algorithm is configufred for one CPU and the "powersave" one is chosen for all of the other CPUs. Namely, in that case, the following is possible: # cd /sys/devices/system/cpu/ # cat intel_pstate/max_perf_pct 100 # cat intel_pstate/min_perf_pct 26 # echo performance > cpufreq/policy0/scaling_governor # cat intel_pstate/max_perf_pct 100 # cat intel_pstate/min_perf_pct 100 # echo 94 > intel_pstate/min_perf_pct # cat intel_pstate/min_perf_pct 26 The reason why this happens is because intel_pstate attempts to maintain two sets of global limits in the active mode, one for the "performance" P-state selection algorithm and one for the "powersave" P-state selection algorithm, but the P-state selection algorithms are set per policy, so the global limits cannot reflect all of them at the same time if they are different for different policies. In the particular situation above, the attempt to change min_perf_pct to 94 caused cpufreq_update_policy() to be run for a CPU with the "powersave" P-state selection algorithm and intel_pstate_set_policy() called by it silently switched the global limits to the "powersave" set which finally was reflected by the sysfs interface. To prevent that from happening, modify intel_pstate_update_policies() to always switch back to the set of limits that was used right before it has been invoked. Fixes: `111b8b3fe4` (cpufreq: intel_pstate: Always keep all limits settings in sync) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-06 00:06:04 +01:00
Len Brown	d82f269255	cpufreq: Add the "cpufreq.off=1" cmdline option Add the "cpufreq.off=1" cmdline option. At boot-time, this allows a user to request CONFIG_CPU_FREQ=n behavior from a kernel built with CONFIG_CPU_FREQ=y. This is analogous to the existing "cpuidle.off=1" option and CONFIG_CPU_IDLE=y This capability is valuable when we need to debug end-user issues in the BIOS or in Linux. It is also convenient for enabling comparisons, which may otherwise require a new kernel, or help from BIOS SETUP, which may be buggy or unavailable. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-06 00:05:31 +01:00
Rafael J. Wysocki	6407829901	cpufreq: intel_pstate: Avoid triggering cpu_frequency tracepoint unnecessarily In the passive mode the cpu_frequency trace event is already triggered by the cpufreq core or by scaling governors, so intel_pstate should not trigger it once again for the same P-state updates. In addition to that, the frequency returned by intel_cpufreq_fast_switch() and passed via freqs.new from intel_cpufreq_target() to cpufreq_freq_transition_end() should reflect the P-state actually set, so make that happen. Fixes: `001c76f05b` (cpufreq: intel_pstate: Generic governors support) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-04 01:38:42 +01:00
Rafael J. Wysocki	7f17326fc0	cpufreq: intel_pstate: Fix intel_cpufreq_verify_policy() The intel_pstate_update_perf_limits() called from intel_cpufreq_verify_policy() may cause global P-state limits to change which is generally confusing and unnecessary. In the passive mode the global limits are only applied to the frequency selected by the scaling governor (they are not taken into account by governors when making decisions anyway), so making them follow the per-policy limits serves no purpose and may go against user expectations (as it generally causes the global attributes in sysfs to change even though they have not been written to in some cases). Fix that by dropping the intel_pstate_update_perf_limits() invocation from intel_cpufreq_verify_policy() (which also reduces the code size by a few lines). This change does not affect the per-CPU limits case, because those limits allow any P-state to be set by default in the passive mode and it removes the only piece of code updating them in that mode, so the per-policy settings will be the only ones taken into account in that case as expected. Fixes: `001c76f05b` (cpufreq: intel_pstate: Generic governors support) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-04 01:38:41 +01:00
Rafael J. Wysocki	2bc756e7dd	cpufreq: intel_pstate: Do not use performance_limits in passive mode Using performance_limits in the passive mode doesn't make sense, because in that mode the global limits are applied to the frequency selected by the scaling governor. The maximum and minimum P-state limits in performance_limits are both set to 100 percent which will put all CPUs into the turbo range regardless of what governor is used and what frequencies are selected by it (that is particularly undesirable on CPUs with the generic powersave governor attached). For this reason, make intel_pstate_register_driver() always point limits to powersave_limits in the passive mode. Fixes: `001c76f05b` (cpufreq: intel_pstate: Generic governors support) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-03-04 01:38:41 +01:00
Linus Torvalds	1827adb11a	Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull sched.h split-up from Ingo Molnar: "The point of these changes is to significantly reduce the <linux/sched.h> header footprint, to speed up the kernel build and to have a cleaner header structure. After these changes the new <linux/sched.h>'s typical preprocessed size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K lines), which is around 40% faster to build on typical configs. Not much changed from the last version (-v2) posted three weeks ago: I eliminated quirks, backmerged fixes plus I rebased it to an upstream SHA1 from yesterday that includes most changes queued up in -next plus all sched.h changes that were pending from Andrew. I've re-tested the series both on x86 and on cross-arch defconfigs, and did a bisectability test at a number of random points. I tried to test as many build configurations as possible, but some build breakage is probably still left - but it should be mostly limited to architectures that have no cross-compiler binaries available on kernel.org, and non-default configurations" * 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits) sched/headers: Clean up <linux/sched.h> sched/headers: Remove #ifdefs from <linux/sched.h> sched/headers: Remove the <linux/topology.h> include from <linux/sched.h> sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h> sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h> sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h> sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h> sched/headers: Remove <linux/sched.h> from <linux/sched/init.h> sched/core: Remove unused prefetch_stack() sched/headers: Remove <linux/rculist.h> from <linux/sched.h> sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h> sched/headers: Remove <linux/signal.h> from <linux/sched.h> sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> sched/headers: Remove the runqueue_is_locked() prototype sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h> sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h> sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h> sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h> sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h> ...	2017-03-03 10:16:38 -08:00
Linus Torvalds	c82be9d224	Power management turbostat utility updates for v4.11-rc1 These update turbostat significantly and in particular: - Default output is now verbose, --debug is no longer required to get all counters. As a result, some options have been added to specify exactly what output is wanted. - Added --quiet to skip system configuration output - Added --list, --show and --hide parameters - Added --cpu parameter - Enhanced Baytrail SoC support - Added Gemini Lake SoC support - Added sysfs C-state columns Also the symbol definitions in arch/x86/include/asm/intel-family.h and arch/x86/include/asm/msr-index.h are updated and the intel_idle and intel_pstate drivers are modified to use the updated symbols. Credits to Len Brown for all of these changes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYuLyNAAoJEILEb/54YlRxEvkQAJsggzpgGrlhrO6KHSm4yC9M CqhBVsdeppX1ZTAVPiMk/pcXQYtL5fZ97ELk2So/CjT5Nh3jwDPMA/ux5n3uiob+ O2BTdtxnpNLxPQPQM1mW7Dr/uAIRlJug9gSMxKDbFSU9Oe3aET58PUdUTs7xaT59 nbtLxVSvzrdGk/bX6WO4ic+7F2licJLZPfDGhYidnoika8LxD4M+cIO73gFpgqQi yoKrTZyLimvneFT0eAUUvHIyKjkJIxeMfslW57uBpz8rW5my+3UwsdpRG4AIVeWc wSBlsNqj+TuR4BBiZ2VR2RoHF3qbH/SceI+k864BqyThfyK/g2q/vV/GvLZQCR/R yWcajWD9kvLKvnm1D3XYOIQDBeP4l60j3vVwHytSvmaPYjn5Ms3jq6b+2K6zkXMM 8y3leW/hgw+rGCacdXPrKIlpBykSV7h+TnD2iMxeeDISNkbefWWDe/WB6HncocAg HDtKRvU9ntRq6/MlnTKbCFM5c0oCXWRw4QNjDy3AsjJELgeAIwiqpHWMKO6XltFj qU/rdyW/BTCuAlIjWVbjooAIJZ268geupeug3zvE3uGzrxT4DaVIo8W1wtJ+XQrt By7sOW/gMQ2EcTJQiuFjS/Gz5gOKQ2F8OLCm6T8Prjh6SxrCUAiuIvP0LmxUCa8i KMlx+8c9E2f9j+TTt9AP =oMZe -----END PGP SIGNATURE----- Merge tag 'pm-turbostat-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull turbostat utility updates from Rafael Wysocki: "Power management turbostat utility updates. These update turbostat significantly and in particular: - default output is now verbose, --debug is no longer required to get all counters. As a result, some options have been added to specify exactly what output is wanted. - added --quiet to skip system configuration output - added --list, --show and --hide parameters - added --cpu parameter - enhanced Baytrail SoC support - added Gemini Lake SoC support - added sysfs C-state columns Also the symbol definitions in arch/x86/include/asm/intel-family.h and arch/x86/include/asm/msr-index.h are updated and the intel_idle and intel_pstate drivers are modified to use the updated symbols. Credits to Len Brown for all of these changes" * tag 'pm-turbostat-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (44 commits) tools/power turbostat: version 17.02.24 tools/power turbostat: bugfix: --add u32 was printed as u64 tools/power turbostat: show error on exec tools/power turbostat: dump p-state software config tools/power turbostat: show package number, even without --debug tools/power turbostat: support "--hide C1" etc. tools/power turbostat: move --Package and --processor into the --cpu option tools/power turbostat: turbostat.8 update tools/power turbostat: update --list feature tools/power turbostat: use wide columns to display large numbers tools/power turbostat: Add --list option to show available header names tools/power turbostat: fix zero IRQ count shown in one-shot command mode tools/power turbostat: add --cpu parameter tools/power turbostat: print sysfs C-state stats tools/power turbostat: extend --add option to accept /sys path tools/power turbostat: skip unused counters on BDX tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits tools/power turbostat: skip unused counters on SKX tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7 tools/power turbostat: initial Gemini Lake SOC support ...	2017-03-02 17:41:27 -08:00
Linus Torvalds	080e4168c0	More power management updates for v4.11-rc1 - Fix for a cpuidle menu governor problem that started to take an unnecessary spinlock after one of the recent updates and that did not play well with the RT patch (Rafael Wysocki). - Fix for the new intel_pstate operation mode switching feature added recently that did not reinitialize P-state limits properly when switching operation modes (Rafael Wysocki). - Removal of unused global notifiers from the PM QoS framework (Viresh Kumar). - Generic power domains framework update to make it handle asynchronous invocations of PM callbacks in the "noirq" phases of system suspend/hibernation correctly (Ulf Hansson). - Two hibernation core cleanups (Rafael Wysocki). - intel_idle cleanup related to the sysfs interface (Len Brown). - Off-by-one bug fix in the OPP (Operating Performance Points) framework (Andrzej Hajda). - OPP framework's documentation fix (Viresh Kumar). - cpufreq qoriq driver cleanup (Tang Yuantian). - Fixes for typos in comments in the device runtime PM framework (Christophe Jaillet). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYuLeGAAoJEILEb/54YlRxJvcP/0BRmh8Hn4Itx/NIWNg71X6j U+v8Pn8T3MP33gCcleLYlgre2JUIAUDmhdK99+UOx+/abhjMhQSaF3HhTOwYaPtQ 6njoHVS0NnfqUf+x5kp+EpRxBVNucYVbdRTVd1DsHIeLLz/96DFOzb/R7tko/pKx pFMWvNdotHLLgXOG1UvdRimwTDlFMffxFzD8Se53LPjRXS0S73A5VWfqZOye44Re j3W1AJ0Idgq5uduA6J8x1MWbaxDq1h+j6CSUm05yvqrINzxXwXt0Hv6stCQTo+Gb YMdiBd8MujNyAgcchw3jiDQ8Vp+zmfLPcHrfPe//SSefj26eB8LyVNSYelvbUdOz cNjvyErva37MmaegCL9QC7WbLM+A7VE6bU6YzDCi/rR8jYMJ51Fb9jGiYb/oimry OLlblEekikUsskWv4hGV1JVt5VhmUMlagWtexxn+lMszATcZro0tfXu/vgQWksYs noUnwuWJWxvj2aNMsvbzW3HLlTGSmYl2UxJ7IymQQaTDblwF9Kg61rm3+5coUctd ifceynDVp9Gju25faYgZ+Dq9+o8ktlOGOHRRPdLIRNJ/T+4tUDnlGkdbPb+Tfn03 XUIzYCu74U8/oW8gOk6t0WpmWzvxEXNgdirdEIR6y3loYIC0Jr3v4gyD975Eug74 Hzfrdg7ignAmWV+nf6UY =SeUF -----END PGP SIGNATURE----- Merge tag 'pm-extra-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates deom Rafael Wysocki: "These fix two bugs introduced by recent power management updates (in the cpuidle menu governor and intel_pstate) and a few other issues, clean up things and remove unused code. Specifics: - Fix for a cpuidle menu governor problem that started to take an unnecessary spinlock after one of the recent updates and that did not play well with the RT patch (Rafael Wysocki). - Fix for the new intel_pstate operation mode switching feature added recently that did not reinitialize P-state limits properly when switching operation modes (Rafael Wysocki). - Removal of unused global notifiers from the PM QoS framework (Viresh Kumar). - Generic power domains framework update to make it handle asynchronous invocations of PM callbacks in the "noirq" phases of system suspend/hibernation correctly (Ulf Hansson). - Two hibernation core cleanups (Rafael Wysocki). - intel_idle cleanup related to the sysfs interface (Len Brown). - Off-by-one bug fix in the OPP (Operating Performance Points) framework (Andrzej Hajda). - OPP framework's documentation fix (Viresh Kumar). - cpufreq qoriq driver cleanup (Tang Yuantian). - Fixes for typos in comments in the device runtime PM framework (Christophe Jaillet)" * tag 'pm-extra-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / OPP: Documentation: Fix opp-microvolt in examples intel_idle: stop exposing platform acronyms in sysfs cpufreq: intel_pstate: Fix limits issue with operation mode switching PM / hibernate: Define pr_fmt() and use pr_*() instead of printk() PM / hibernate: Untangle power_down() cpuidle: menu: Avoid taking spinlock for accessing QoS values PM / QoS: Remove global notifiers PM / runtime: Fix some typos cpufreq: qoriq: clean up unused code PM / OPP: fix off-by-one bug in dev_pm_opp_get_max_volt_latency loop PM / Domains: Power off masters immediately in the power off sequence PM / Domains: Rename is_async to one_dev_on for genpd_power_off() PM / Domains: Move genpd_power_off() above genpd_power_on()	2017-03-02 17:33:52 -08:00
Rafael J. Wysocki	9b5e9cb164	Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-sleep' * pm-cpuidle: intel_idle: stop exposing platform acronyms in sysfs cpuidle: menu: Avoid taking spinlock for accessing QoS values * pm-cpufreq: cpufreq: intel_pstate: Fix limits issue with operation mode switching cpufreq: qoriq: clean up unused code * pm-sleep: PM / hibernate: Define pr_fmt() and use pr_*() instead of printk() PM / hibernate: Untangle power_down()	2017-03-03 00:43:11 +01:00
Ingo Molnar	55687da166	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/cpufreq.h> We are going to split <linux/sched/cpufreq.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/cpufreq.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:30 +01:00
Ingo Molnar	0c98d344fe	sched/core: Remove the tsk_cpus_allowed() wrapper So the original intention of tsk_cpus_allowed() was to 'future-proof' the field - but it's pretty ineffectual at that, because half of the code uses ->cpus_allowed directly ... Also, the wrapper makes the code longer than the original expression! So just get rid of it. This also shrinks <linux/sched.h> a bit. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:24 +01:00
Rafael J. Wysocki	6bff9c609f	Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull changes related to turbostat for v4.11 from Len Brown. * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (44 commits) tools/power turbostat: version 17.02.24 tools/power turbostat: bugfix: --add u32 was printed as u64 tools/power turbostat: show error on exec tools/power turbostat: dump p-state software config tools/power turbostat: show package number, even without --debug tools/power turbostat: support "--hide C1" etc. tools/power turbostat: move --Package and --processor into the --cpu option tools/power turbostat: turbostat.8 update tools/power turbostat: update --list feature tools/power turbostat: use wide columns to display large numbers tools/power turbostat: Add --list option to show available header names tools/power turbostat: fix zero IRQ count shown in one-shot command mode tools/power turbostat: add --cpu parameter tools/power turbostat: print sysfs C-state stats tools/power turbostat: extend --add option to accept /sys path tools/power turbostat: skip unused counters on BDX tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits tools/power turbostat: skip unused counters on SKX tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7 tools/power turbostat: initial Gemini Lake SOC support ...	2017-03-01 23:34:38 +01:00
Len Brown	92134bdbc6	intel_pstate: use MSR_ATOM_RATIOS definitions from msr-index.h Originally, these MSRs were locally defined in this driver. Now the definitions are in msr-index.h -- use them. Signed-off-by: Len Brown <len.brown@intel.com>	2017-03-01 00:14:03 -05:00
Rafael J. Wysocki	c3a49c8991	cpufreq: intel_pstate: Fix limits issue with operation mode switching There is a problem with intel_pstate operation mode switching introduced by commit `fb1fe1041c` (cpufreq: intel_pstate: Operation mode control from sysfs), because the global sysfs limits are preserved across operation modes while per-policy limits are reinitialized from scratch on a mode switch and both sets of limits may get out of sync this way. Fix that by always reinitializing the global limits upon the registration of the driver. Fixes: `fb1fe1041c` (cpufreq: intel_pstate: Operation mode control from sysfs) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2017-02-28 13:55:52 +01:00
Linus Torvalds	af8999f672	ARM: SoC non-urgent fixes for merge window We sometimes collect non-critical fixes that come in during the later part of the merge window in a branch for the next release instead, and this is that contents for v4.11. Most of these are OMAP fixes, dealing with OMAP36/37 detection, quirks and setup. There's also some fixes for Davinci and a Kconfig fix for SCPI to only enable on ARM{,64}. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJYrMlHAAoJEIwa5zzehBx3oZ4P/3nRgb4dtwEwXwFmJf8Xd4nu yetQbcwRreHvh8utsU2Pe+8tffV8jLgsW8TxZ43d6deYFii046HhZAXtvTTVgFpE OA0fJpNJ00KYqP1Nx5q/kwZoH3uBz442uMUQ9lyziB3RpimhRsiKyHwnTyuWljyx hPmO1XKcF6pQBXk1uwOzO1lSDUeOn4eAmeLonlG1gQ5qtrkU0WbrTPxpmn/CB546 LH5Nj0qVRzEa7xr8O+2nzeKPVwcXGwsKVKCDbSJmsey2KOEDnEjjxpToAh3WnA4W Tm1av5QdyqsLVqAMkNYezrS8EzBjRKa1ma4xUqsNoIhO1XI7xa/LkonU8a0+ZdSX p48DCvv7IHX5IqdIHHB0s1eICvTsW8Cp/4YUJzuZDFbS9B2t5b3412+n43tVa8l3 HYPeTzL5S3VOrMtpQKkGAFrw5OGm+URy4CYQxpX5DxSRSqvXTj12ajBHRbfdbzCO r2i2rhKL07PF3DAf8L1coHcBQDS7Vc/k+fhKCQy+W1RDxmjYwYKSI9agOyZi1HQ7 X+0HuUyKTthCE2kUrj4rye/87MffWwdjNgnOZiHR1X7YtWgnjp1g9K+mLZHh/y5m Tq/M55cK9h6dOghx121jYFkkvDclEQDemJuDbKY0sEMDrDXtppcI/T+znZ1LTq7i 1eaK4lTyAX7dbQJUQCwe =NhZq -----END PGP SIGNATURE----- Merge tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC non-urgent fixes from Arnd Bergmann: "We sometimes collect non-critical fixes that come in during the later part of the merge window in a branch for the next release instead, and this is that contents for v4.11. Most of these are OMAP fixes, dealing with OMAP36/37 detection, quirks and setup. There's also some fixes for Davinci and a Kconfig fix for SCPI to only enable on ARM{,64}" * tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: firmware: arm_scpi: Add hardware dependencies ARM: OMAP3: Fix SoC detection of OMAP36/37 Family ARM: OMAP5: Add HWMOD_SWSUP_SIDLE_ACT flag for UART ARM: dts: Fix compatible for ti81xx uarts for 8250 ARM: dts: Fix am335x and dm814x scm syscon to probe children ARM: OMAP2+: Fix init for multiple quirks for the same SoC ARM: dts: Fix omap3 off mode pull defines bus: da850-mstpri: fix my e-mail address ARM: davinci: da850: fix da850_set_pll0rate() ARM: davinci: da850: coding style fix	2017-02-23 15:28:04 -08:00
Tang Yuantian	17b4eaf475	cpufreq: qoriq: clean up unused code This snip code is not needed anymore since its user get_hard_smp_processor_id() has been removed. Signed-off-by: Tang Yuantian <yuantian.tang@nxp.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-23 23:01:49 +01:00
Linus Torvalds	02c3de1105	Power management updates for v4.11-rc1 - Operating Performance Points (OPP) framework fixes, cleanups and switch over from RCU-based synchronization to reference counting using krefs (Viresh Kumar, Wei Yongjun, Dave Gerlach). - cpufreq core cleanups and documentation updates (Viresh Kumar, Rafael Wysocki). - New cpufreq driver for Broadcom BMIPS SoCs (Markus Mayer). - New cpufreq-dt sub-driver for TI SoCs requiring special handling, like in the AM335x, AM437x, DRA7x, and AM57x families, along with new DT bindings for it (Dave Gerlach, Paul Gortmaker). - ARM64 SoCs support for the qoriq cpufreq driver (Tang Yuantian). - intel_pstate driver updates including a new sysfs knob to control the driver's operation mode and fixes related to the no_turbo sysfs knob and the hardware-managed P-states feature support (Rafael Wysocki, Srinivas Pandruvada). - New interface to export ultra-turbo frequencies for the powernv cpufreq driver (Shilpasri Bhat). - Assorted fixes for cpufreq drivers (Arnd Bergmann, Dan Carpenter, Wei Yongjun). - devfreq core fixes, mostly related to the sysfs interface exported by it (Chanwoo Choi, Chris Diamand). - Updates of the exynos-bus and exynos-ppmu devfreq drivers (Chanwoo Choi). - Device PM QoS extension to support CPUs and support for per-CPU wakeup (device resume) latency constraints in the cpuidle menu governor (Alex Shi). - Wakeup IRQs framework fixes (Grygorii Strashko). - Generic power domains framework update including a fix to make it handle asynchronous invocations of noirq suspend/resume callbacks correctly (Ulf Hansson, Geert Uytterhoeven). - Assorted fixes and cleanups in the core suspend/hibernate code, PM QoS framework and x86 ACPI idle support code (Corentin Labbe, Geert Uytterhoeven, Geliang Tang, John Keeping, Nick Desaulniers). - Update of the analyze_suspend.py script is updated to version 4.5 offering multiple improvements (Todd Brandt). - New tool for intel_pstate diagnostics using the pstate_sample tracepoint (Doug Smythies). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYq3IjAAoJEILEb/54YlRx/lYP+gNXhfETSzjd4kWSHy3FVEDb gc5rMiE2j0OYgVSXwBI7p4EqMPy56lSWBASvbF2o6v9CIxb880KLFEsMDCVHwn46 6xfEnIRxf1oeRqn7EG9ZPIcTgNsUyvK+gah7zgLXu/0KU7ceXxygvNk47qpeOZ8f dKYgIk/TOSGPC8H2nsg8VBKlK/ZOj5hID4F3MmFw6yDuWVCYuh2EokYXS4Nx0JwY UQGpWtz+FWWs71vhgVl33GbPXWvPqA7OMe0btZ3RCnhnz4tA/mH+jDWiaspCdS3J vKGeZyZptjIMJcufm3X7s7ghYjELheqQusMODDXk4AaWQ5nz8V5/h7NThYfa9J1b M93Tb0rMb2MqUhBpv/M6D3qQroZmhq55QKfQrul3QWSOiQUzTWJcbbpyeBQ7nkrI F1qNqQfuCnBL/r9y7HpW8P2iFg9kCHkwTtXMdp/lzGXdKzSGtAUSkYg5ohnUzQTp 2WCPTEk+5DxLVPjW5rDoZOotr5p1kdcdWBk6r3MEWRokZK6PJo7rJBcnTtXSo2mO lLRba006q+fTlI5wZtjAI0rOiS3JgtT6cRx7uPjZlze9TGjklJhdsCPJbM5gcOT+ YiOxvqD+9if5QRSxiEZNj3bQ43wYhXmpctfIanyxziq09BPIPxvgfRR/BkUzc34R ps4CIvImim5v5xc8Zsbk =57xJ -----END PGP SIGNATURE----- Merge tag 'pm-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The majority of changes go into the Operating Performance Points (OPP) framework and cpufreq this time, followed by devfreq and some scattered updates all over. The OPP changes are mostly related to switching over from RCU-based synchronization, that turned out to be overly complicated and problematic, to reference counting using krefs. In the cpufreq land there are core cleanups, documentation updates, a new driver for Broadcom BMIPS SoCs, a new cpufreq-dt sub-driver for TI SoCs that require special handling, ARM64 SoCs support for the qoriq driver, intel_pstate updates, powernv driver update and assorted fixes. The devfreq changes are mostly fixes related to the sysfs interface and some Exynos drivers updates. Apart from that, the cpuidle menu governor will support per-CPU PM QoS constraints for the wakeup latency now, some bugs in the wakeup IRQs framework are fixed, the generic power domains framework should handle asynchronous invocations of noirq suspend/resume callbacks from now on, the analyze_suspend.py script is updated and there is a new tool for intel_pstate diagnostics. Specifics: - Operating Performance Points (OPP) framework fixes, cleanups and switch over from RCU-based synchronization to reference counting using krefs (Viresh Kumar, Wei Yongjun, Dave Gerlach) - cpufreq core cleanups and documentation updates (Viresh Kumar, Rafael Wysocki) - New cpufreq driver for Broadcom BMIPS SoCs (Markus Mayer) - New cpufreq-dt sub-driver for TI SoCs requiring special handling, like in the AM335x, AM437x, DRA7x, and AM57x families, along with new DT bindings for it (Dave Gerlach, Paul Gortmaker) - ARM64 SoCs support for the qoriq cpufreq driver (Tang Yuantian) - intel_pstate driver updates including a new sysfs knob to control the driver's operation mode and fixes related to the no_turbo sysfs knob and the hardware-managed P-states feature support (Rafael Wysocki, Srinivas Pandruvada) - New interface to export ultra-turbo frequencies for the powernv cpufreq driver (Shilpasri Bhat) - Assorted fixes for cpufreq drivers (Arnd Bergmann, Dan Carpenter, Wei Yongjun) - devfreq core fixes, mostly related to the sysfs interface exported by it (Chanwoo Choi, Chris Diamand) - Updates of the exynos-bus and exynos-ppmu devfreq drivers (Chanwoo Choi) - Device PM QoS extension to support CPUs and support for per-CPU wakeup (device resume) latency constraints in the cpuidle menu governor (Alex Shi) - Wakeup IRQs framework fixes (Grygorii Strashko) - Generic power domains framework update including a fix to make it handle asynchronous invocations of noirq suspend/resume callbacks correctly (Ulf Hansson, Geert Uytterhoeven) - Assorted fixes and cleanups in the core suspend/hibernate code, PM QoS framework and x86 ACPI idle support code (Corentin Labbe, Geert Uytterhoeven, Geliang Tang, John Keeping, Nick Desaulniers) - Update of the analyze_suspend.py script is updated to version 4.5 offering multiple improvements (Todd Brandt) - New tool for intel_pstate diagnostics using the pstate_sample tracepoint (Doug Smythies)" tag 'pm-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (85 commits) MAINTAINERS: cpufreq: add bmips-cpufreq.c PM / QoS: Fix memory leak on resume_latency.notifiers PM / Documentation: Spelling s/wrtie/write/ PM / sleep: Fix test_suspend after sleep state rework cpufreq: CPPC: add ACPI_PROCESSOR dependency cpufreq: make ti-cpufreq explicitly non-modular cpufreq: Do not clear real_cpus mask on policy init tools/power/x86: Debug utility for intel_pstate driver AnalyzeSuspend: fix drag and zoom bug in javascript PM / wakeirq: report a wakeup_event on dedicated wekup irq PM / wakeirq: Fix spurious wake-up events for dedicated wakeirqs PM / wakeirq: Enable dedicated wakeirq for suspend cpufreq: dt: Don't use generic platdev driver for ti-cpufreq platforms cpufreq: ti: Add cpufreq driver to determine available OPPs at runtime Documentation: dt: add bindings for ti-cpufreq PM / OPP: Expose _of_get_opp_desc_node as dev_pm_opp API cpufreq: qoriq: Don't look at clock implementation details cpufreq: qoriq: add ARM64 SoCs support PM / Domains: Provide dummy governors if CONFIG_PM_GENERIC_DOMAINS=n cpufreq: brcmstb-avs-cpufreq: remove unnecessary platform_set_drvdata() ...	2017-02-20 17:41:31 -08:00
Linus Torvalds	828cad8ea0	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this (fairly busy) cycle were: - There was a class of scheduler bugs related to forgetting to update the rq-clock timestamp which can cause weird and hard to debug problems, so there's a new debug facility for this: which uncovered a whole lot of bugs which convinced us that we want to keep the debug facility. (Peter Zijlstra, Matt Fleming) - Various cputime related updates: eliminate cputime and use u64 nanoseconds directly, simplify and improve the arch interfaces, implement delayed accounting more widely, etc. - (Frederic Weisbecker) - Move code around for better structure plus cleanups (Ingo Molnar) - Move IO schedule accounting deeper into the scheduler plus related changes to improve the situation (Tejun Heo) - ... plus a round of sched/rt and sched/deadline fixes, plus other fixes, updats and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (85 commits) sched/core: Remove unlikely() annotation from sched_move_task() sched/autogroup: Rename auto_group.[ch] to autogroup.[ch] sched/topology: Split out scheduler topology code from core.c into topology.c sched/core: Remove unnecessary #include headers sched/rq_clock: Consolidate the ordering of the rq_clock methods delayacct: Include <uapi/linux/taskstats.h> sched/core: Clean up comments sched/rt: Show the 'sched_rr_timeslice' SCHED_RR timeslice tuning knob in milliseconds sched/clock: Add dummy clear_sched_clock_stable() stub function sched/cputime: Remove generic asm headers sched/cputime: Remove unused nsec_to_cputime() s390, sched/cputime: Remove unused cputime definitions powerpc, sched/cputime: Remove unused cputime definitions s390, sched/cputime: Make arch_cpu_idle_time() to return nsecs ia64, sched/cputime: Remove unused cputime definitions ia64: Convert vtime to use nsec units directly ia64, sched/cputime: Move the nsecs based cputime headers to the last arch using it sched/cputime: Remove jiffies based cputime sched/cputime, vtime: Return nsecs instead of cputime_t to account sched/cputime: Complete nsec conversion of tick based accounting ...	2017-02-20 12:52:55 -08:00
Arnd Bergmann	a578884fa0	cpufreq: CPPC: add ACPI_PROCESSOR dependency Without the Kconfig dependency, we can get this warning: warning: ACPI_CPPC_CPUFREQ selects ACPI_CPPC_LIB which has unmet direct dependencies (ACPI && ACPI_PROCESSOR) Fixes: `5477fb3bd1` (ACPI / CPPC: Add a CPUFreq driver for use with CPPC) Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-16 01:00:03 +01:00
Paul Gortmaker	149ab86496	cpufreq: make ti-cpufreq explicitly non-modular The Kconfig currently controlling compilation of this code is: drivers/cpufreq/Kconfig.arm:config ARM_TI_CPUFREQ drivers/cpufreq/Kconfig.arm: bool "Texas Instruments CPUFreq support" ...meaning that it currently is not being built as a module by anyone. Lets remove the couple traces of modular infrastructure use, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-16 00:58:52 +01:00
Rafael J. Wysocki	f451014692	cpufreq: Do not clear real_cpus mask on policy init If new_policy is set in cpufreq_online(), the policy object has just been created and its real_cpus mask has been zeroed on allocation, and the driver's ->init() callback should not touch it. It doesn't need to be cleared again, so don't do that. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2017-02-16 00:57:42 +01:00
Dave Gerlach	051bd84bb4	cpufreq: dt: Don't use generic platdev driver for ti-cpufreq platforms Some TI platforms, specifically those in the am33xx, am43xx, dra7xx, and am57xx families of SoCs can make use of the ti-cpufreq driver to selectively enable OPPs based on the exact configuration in use. The ti-cpufreq is given the responsibility of creating the cpufreq-dt platform device when the driver is in use so drop am33xx and dra7xx from the cpufreq-dt-platdev driver so it is not created twice. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-09 22:59:00 +01:00
Dave Gerlach	e13cf046cd	cpufreq: ti: Add cpufreq driver to determine available OPPs at runtime Some TI SoCs, like those in the AM335x, AM437x, DRA7x, and AM57x families, have different OPPs available for the MPU depending on which specific variant of the SoC is in use. This can be determined through use of the revision and an eFuse register present in the silicon. Introduce a ti-cpufreq driver that can read the aformentioned values and provide them as version matching data to the opp framework. Through this the opp-supported-hw dt binding that is part of the operating-points-v2 table can be used to indicate availability of OPPs for each device. This driver also creates the "cpufreq-dt" platform_device after passing the version matching data to the OPP framework so that the cpufreq-dt handles the actual cpufreq implementation. Even without the necessary data to pass the version matching data the driver will still create this device to maintain backwards compatibility with operating-points v1 tables. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-09 22:57:48 +01:00
Rafael J. Wysocki	40e993aa04	Merge OPP material for v4.11 to satisfy dependencies.	2017-02-09 22:52:35 +01:00
Tang Yuantian	b1e9a64972	cpufreq: qoriq: Don't look at clock implementation details Get the CPU clock's potential parent clocks from the clock interface itself, rather than manually parsing the clocks property to find a phandle, looking at the clock-names property of that, and assuming that those are valid parent clocks for the cpu clock. This is necessary now that the clocks are generated based on the clock driver's knowledge of the chip rather than a fragile device-tree description of the mux options. We can now rely on the clock driver to ensure that the mux only exposes options that are valid. The cpufreq driver was currently being overly conservative in some cases -- for example, the "min_cpufreq = get_bus_freq()" restriction only applies to chips with erratum A-004510, and whether the freq_mask used on p5020 is needed depends on the actual frequencies of the PLLs (FWIW, p5040 has a similar limitation but its .freq_mask was zero) -- and the frequency mask mechanism made assumptions about particular parent clock indices that are no longer valid. Signed-off-by: Scott Wood <scottwood@nxp.com> Signed-off-by: Tang Yuantian <yuantian.tang@nxp.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-09 14:33:02 +01:00
Tang Yuantian	5026ac2314	cpufreq: qoriq: add ARM64 SoCs support Add ARM64 config to Kconfig to enable CPU frequency feature on NXP ARM64 SoCs. Signed-off-by: Tang Yuantian <yuantian.tang@nxp.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-09 14:33:01 +01:00
Wei Yongjun	113f9017e5	cpufreq: brcmstb-avs-cpufreq: remove unnecessary platform_set_drvdata() The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-09 01:22:46 +01:00
Dan Carpenter	a69261e447	cpufreq: s3c2416: double free on driver init error path The "goto err_armclk;" error path already does a clk_put(s3c_freq->hclk); so this is a double free. Fixes: `34ee550752` ([CPUFREQ] Add S3C2416/S3C2450 cpufreq driver) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-09 01:22:45 +01:00
Markus Mayer	cdb56cbfd7	cpufreq: bmips-cpufreq: CPUfreq driver for Broadcom's BMIPS SoCs Add the MIPS CPUfreq driver. This driver currently supports CPUfreq on BMIPS5xxx-based SoCs. Signed-off-by: Markus Mayer <mmayer@broadcom.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-09 01:22:44 +01:00
Rafael J. Wysocki	56c7303e62	Merge back earlier cpufreq changes for v4.11.	2017-02-09 01:18:14 +01:00
Rafael J. Wysocki	cbf304e420	Merge branches 'pm-core-fixes' and 'pm-cpufreq-fixes' * pm-core-fixes: PM / runtime: Avoid false-positive warnings from might_sleep_if() * pm-cpufreq-fixes: cpufreq: intel_pstate: Disable energy efficiency optimization cpufreq: brcmstb-avs-cpufreq: properly retrieve P-state upon suspend cpufreq: brcmstb-avs-cpufreq: extend sysfs entry brcm_avs_pmap	2017-02-06 14:52:10 +01:00
Srinivas Pandruvada	6e978b22ef	cpufreq: intel_pstate: Disable energy efficiency optimization Some Kabylake desktop processors may not reach max turbo when running in HWP mode, even if running under sustained 100% utilization. This occurs when the HWP.EPP (Energy Performance Preference) is set to "balance_power" (0x80) -- the default on most systems. It occurs because the platform BIOS may erroneously enable an energy-efficiency setting -- MSR_IA32_POWER_CTL BIT-EE, which is not recommended to be enabled on this SKU. On the failing systems, this BIOS issue was not discovered when the desktop motherboard was tested with Windows, because the BIOS also neglects to provide the ACPI/CPPC table, that Windows requires to enable HWP, and so Windows runs in legacy P-state mode, where this setting has no effect. Linux' intel_pstate driver does not require ACPI/CPPC to enable HWP, and so it runs in HWP mode, exposing this incorrect BIOS configuration. There are several ways to address this problem. First, Linux can also run in legacy P-state mode on this system. As intel_pstate is how Linux enables HWP, booting with "intel_pstate=disable" will run in acpi-cpufreq/ondemand legacy p-state mode. Or second, the "performance" governor can be used with intel_pstate, which will modify HWP.EPP to 0. Or third, starting in 4.10, the /sys/devices/system/cpu/cpufreq/policy*/energy_performance_preference attribute in can be updated from "balance_power" to "performance". Or fourth, apply this patch, which fixes the erroneous setting of MSR_IA32_POWER_CTL BIT_EE on this model, allowing the default configuration to function as designed. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Len Brown <len.brown@intel.com> Cc: 4.6+ <stable@vger.kernel.org> # 4.6+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-04 00:11:08 +01:00
Srinivas Pandruvada	8fc7554ae5	cpufreq: intel_pstate: Calculate guaranteed performance for HWP When HWP is active, turbo activation ratio is not used to calculate max non turbo ratio. But on these systems the max non turbo ratio is decided by config TDP settings. This change removes usage of MSR_TURBO_ACTIVATION_RATIO for HWP systems, instead directly use TDP ratios, when more than one TDPs are available. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-04 00:05:33 +01:00
Srinivas Pandruvada	4e5d3f713b	cpufreq: intel_pstate: Make HWP limits compatible with legacy Under HWP the performance limits are calculated using max_perf_pct and min_perf_pct using possible performance, not available performance. The available performance can be reduced by no_turbo setting. To make compatible with legacy mode, use max/min performance percentage with respect to available performance. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-04 00:05:32 +01:00
Srinivas Pandruvada	7d9a8a9f4e	cpufreq: intel_pstate: Lower frequency than expected under no_turbo When turbo is not disabled by BIOS, but user disabled from intel P-State sysfs and changes max/min using cpufreq sysfs, the resultant frequency is lower than what user requested. The reason for this, when the perf limits are calculated in set_policy() callback, they are with reference to max cpu frequency (turbo frequency ), but when enforced in the intel_pstate_get_min_max() they are with reference to max available performance as documented in the intel_pstate documentation (in this case max non turbo P-State). This needs similar change as done in intel_cpufreq_verify_policy() for passive mode. Set policy->cpuinfo.max_freq based on the turbo status. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-04 00:05:32 +01:00
Rafael J. Wysocki	fb1fe1041c	cpufreq: intel_pstate: Operation mode control from sysfs Make it possible to change the operation mode of intel_pstate with the help of a new sysfs attribute called "status". There are three possible configurations that can be selected using this attribute: "off" - The driver is not in use at this time. "active" - The driver works as a P-state governor (default). "passive" - The driver works as a regular cpufreq one and collaborates with the generic cpufreq governors (it sets P-states as requested by those governors). [This is the same mode the driver can be started in by passing intel_pstate=passive in the kernel command line.] The current setting is returned by reads from this attribute. Writing one of the above strings to it changes the operation mode as indicated by that string, if possible. If HW-managed P-states (HWP) feature is enabled, it is not possible to change the driver's operation mode and attempts to write to this attribute will fail. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-04 00:05:31 +01:00
Rafael J. Wysocki	0c30b65b3c	cpufreq: intel_pstate: Expose global sysfs attributes upfront Expose the intel_pstate's global sysfs attributes before registering the driver to prepare for the addition of an attribute that also will have to work if the driver is not registered. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-04 00:05:30 +01:00
Viresh Kumar	052f573f5c	cpufreq: Remove CPUFREQ_START notifier event Its not used anymore, remove it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-04 00:05:30 +01:00
Shilpasri G Bhat	b12f7a2b01	cpufreq: powernv: Add boost files to export ultra-turbo frequencies In P8+, Workload Optimized Frequency(WOF) provides the capability to boost the cpu frequency based on the utilization of the other cpus running in the chip. The On-Chip-Controller(OCC) firmware will control the achievability of these frequencies depending on the power headroom available in the chip. Currently the ultra-turbo frequencies provided by this feature are exported along with the turbo and sub-turbo frequencies as scaling_available_frequencies. This patch will export the ultra-turbo frequencies separately as scaling_boost_frequencies in WOF enabled systems. This patch will add the boost sysfs file which can be used to disable/enable ultra-turbo frequencies. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-03 23:59:41 +01:00
Viresh Kumar	801e0f378f	cpufreq: Remove CONFIG_CPU_FREQ_STAT_DETAILS config option This doesn't have any benefit apart from saving a small amount of memory when it is disabled. The ifdef hackery in the code makes it dirty unnecessarily. Clean it up by removing the Kconfig option completely. Few defconfigs are also updated and CONFIG_CPU_FREQ_STAT_DETAILS is replaced with CONFIG_CPU_FREQ_STAT now in them, as users wanted stats to be enabled. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-03 23:59:39 +01:00
Viresh Kumar	f9f41e3ef9	cpufreq: Remove policy create/remove notifiers Those were added by: commit `fcd7af917a` ("cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly") but aren't used anymore since: commit `1aefc75b24` ("cpufreq: stats: Make the stats code non-modular"). Remove them. Also remove the redundant parameter to the respective routines. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-02-03 23:59:38 +01:00
Frederic Weisbecker	7fb1327ee9	sched/cputime: Convert kcpustat to nsecs Kernel CPU stats are stored in cputime_t which is an architecture defined type, and hence a bit opaque and requiring accessors and mutators for any operation. Converting them to nsecs simplifies the code and is one step toward the removal of cputime_t in the core code. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1485832191-26889-4-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-02-01 09:13:47 +01:00
Viresh Kumar	8a31d9d942	PM / OPP: Update OPP users to put reference This patch updates dev_pm_opp_find_freq_() routines to get a reference to the OPPs returned by them. Also updates the users of dev_pm_opp_find_freq_() routines to call dev_pm_opp_put() after they are done using the OPPs. As it is guaranteed the that OPPs wouldn't get freed while being used, the RCU read side locking present with the users isn't required anymore. Drop it as well. This patch also updates all users of devfreq_recommended_opp() which was returning an OPP received from the OPP core. Note that some of the OPP core routines have gained rcu_read_{lock\|unlock}() calls, as those still use RCU specific APIs within them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> [Devfreq] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-01-30 09:22:21 +01:00
Viresh Kumar	fa30184d19	PM / OPP: Return opp_table from dev_pm_opp_set_() routines Now that we have proper kernel reference infrastructure in place for OPP tables, use it to guarantee that the OPP table isn't freed while being used by the callers of dev_pm_opp_set_() APIs. Make them all return the pointer to the OPP table after taking its reference and put the reference back with dev_pm_opp_put_*() APIs. Now that the OPP table wouldn't get freed while these routines are executing after dev_pm_opp_get_opp_table() is called, there is no need to take opp_table_lock. Drop them as well. Remove the rcu specific comments from these routines as they aren't relevant anymore. Note that prototypes of dev_pm_opp_{set\|put}_regulators() were already updated by another patch. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-01-30 09:22:21 +01:00
Viresh Kumar	3aa26a3b2e	PM / OPP: Rename dev_pm_opp_get_suspend_opp() and return OPP rate There is only one user of dev_pm_opp_get_suspend_opp() and that uses it to get the OPP rate for the suspend_opp. Rename dev_pm_opp_get_suspend_opp() as dev_pm_opp_get_suspend_opp_freq() and return the rate directly from it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-01-27 11:49:09 +01:00
Markus Mayer	3c223c19ae	cpufreq: brcmstb-avs-cpufreq: properly retrieve P-state upon suspend The AVS GET_PMAP command does return a P-state along with the P-map information. However, that P-state is the initial P-state when the P-map was first downloaded to AVS. It is not the current P-state. Therefore, we explicitly retrieve the P-state using the GET_PSTATE command. Signed-off-by: Markus Mayer <mmayer@broadcom.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-01-27 11:43:49 +01:00
Markus Mayer	9b02c54bc9	cpufreq: brcmstb-avs-cpufreq: extend sysfs entry brcm_avs_pmap We extend the brcm_avs_pmap sysfs entry (which issues the GET_PMAP command to AVS) to include all fields from struct pmap. This means adding mode (AVS, DVS, DVFS) and state (the P-state) to the output. Signed-off-by: Markus Mayer <mmayer@broadcom.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-01-27 11:43:48 +01:00
Rafael J. Wysocki	ff7e593c9c	Merge branches 'pm-sleep' and 'pm-cpufreq' * pm-sleep: Revert "PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag" * pm-cpufreq: cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy	2017-01-27 00:08:59 +01:00
Srinivas Pandruvada	1443ebbacf	cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy A side effect of keeping intel_pstate sysfs limits in sync with cpufreq is that the now sysfs limits can't enforced under performance policy. For example, if the max_perf_pct is changed from 100 to 80, this will call intel_pstate_set_policy(), which will change the max_perf to 100 again for performance policy. Same issue happens, when no_turbo is set. This change calculates max and min frequency using sysfs performance limits in intel_pstate_verify_policy() and adjusts policy limits by calling cpufreq_verify_within_limits(). Also, it causes the setting of performance limits to be skipped if no_turbo is set. Fixes: `111b8b3fe4` (cpufreq: intel_pstate: Always keep all limits settings in sync) Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-01-20 03:35:27 +01:00
Rafael J. Wysocki	3baad65546	Merge branch 'pm-cpufreq' * pm-cpufreq: cpufreq: dt: Add support for APM X-Gene 2 cpufreq: intel_pstate: Always keep all limits settings in sync cpufreq: intel_pstate: Use locking in intel_cpufreq_verify_policy() cpufreq: intel_pstate: Use locking in intel_pstate_resume() cpufreq: intel_pstate: Do not expose PID parameters in passive mode	2017-01-06 14:34:52 +01:00
Hoan Tran	e11b6293a8	cpufreq: dt: Add support for APM X-Gene 2 Add the compatible string for supporting the generic device tree cpufreq-dt driver on APM's X-Gene 2 SoC. Signed-off-by: Hoan Tran <hotran@apm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-01-05 00:27:51 +01:00
Bartosz Golaszewski	b40881738f	ARM: davinci: da850: fix da850_set_pll0rate() This function is confusing - its second argument is an index to the freq table, not the requested clock rate in Hz, but it's used as the set_rate callback for the pll0 clock. It leads to an oops when the caller doesn't know the internals and passes the rate in Hz as argument instead of the cpufreq index since this argument isn't bounds checked either. Fix it by iterating over the array of supported frequencies and selecting a one that matches or returning -EINVAL for unsupported rates. Also: update the davinci cpufreq driver. It's the only user of this clock and currently it passes the cpufreq table index to clk_set_rate(), which is confusing. Make it pass the requested clock rate in Hz. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [nsekhar@ti.com: commit headline update] Signed-off-by: Sekhar Nori <nsekhar@ti.com>	2017-01-02 15:02:51 +05:30
Rafael J. Wysocki	111b8b3fe4	cpufreq: intel_pstate: Always keep all limits settings in sync Make intel_pstate update per-logical-CPU limits when the global settings are changed to ensure that they are always in sync and users will not see confusing values in per-logical-CPU sysfs attributes. This also fixes the problem that setting the "no_turbo" global attribute to 1 in the "passive" mode (ie. when intel_pstate acts as a regular cpufreq driver) when scaling_governor is set to "performance" has no effect. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-12-31 21:48:44 +01:00
Rafael J. Wysocki	cad3046796	cpufreq: intel_pstate: Use locking in intel_cpufreq_verify_policy() Race conditions are possible if intel_cpufreq_verify_policy() is executed in parallel with global limits updates from sysfs, so the invocation of intel_pstate_update_perf_limits() in it should be carried out under intel_pstate_limits_lock. Make that happen. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-12-31 21:48:43 +01:00
Rafael J. Wysocki	aa439248ab	cpufreq: intel_pstate: Use locking in intel_pstate_resume() Theoretically, intel_pstate_resume() may be executed in parallel with intel_pstate_set_policy(), if the latter is invoked via cpufreq_update_policy() as a result of a notification, so use intel_pstate_limits_lock in there too to avoid race conditions. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-12-31 21:48:42 +01:00
Rafael J. Wysocki	366430b5c2	cpufreq: intel_pstate: Do not expose PID parameters in passive mode If intel_pstate works in the passive mode in which it acts as a regular cpufreq driver and collaborates with generic cpufreq governors, the PID parameters are not used, so do not expose them via debugfs in that case. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-12-27 03:30:11 +01:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Rafael J. Wysocki	7b99f1aeed	Merge branch 'pm-cpufreq' * pm-cpufreq: cpufreq: s3c64xx: remove incorrect __init annotation cpufreq: Remove CPU hotplug callbacks only if they were initialized CPU/hotplug: Clarify description of __cpuhp_setup_state() return value	2016-12-22 14:34:55 +01:00
Arnd Bergmann	adec57c61c	cpufreq: s3c64xx: remove incorrect __init annotation s3c64xx_cpufreq_config_regulator is incorrectly annotated as __init, since the caller is also not init: WARNING: vmlinux.o(.text+0x92fe1c): Section mismatch in reference from the function s3c64xx_cpufreq_driver_init() to the function .init.text:s3c64xx_cpufreq_config_regulator() With modern gcc versions, the function gets inline, so we don't see the warning, this only happens with gcc-4.6 and older. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-12-21 02:54:18 +01:00
Boris Ostrovsky	2a8fa123d9	cpufreq: Remove CPU hotplug callbacks only if they were initialized Since CPU hotplug callbacks are requested for CPUHP_AP_ONLINE_DYN state, successful callback initialization will result in cpuhp_setup_state() returning a positive value. Therefore acpi_cpufreq_online being zero indicates that callbacks have not been installed. This means that acpi_cpufreq_boost_exit() should only remove them if acpi_cpufreq_online is positive. Trying to call cpuhp_remove_state_nocalls(0) will cause a BUG(). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-12-21 02:52:52 +01:00
Linus Torvalds	7b9dc3f75f	Power management material for v4.10-rc1 - New cpufreq driver for Broadcom STB SoCs and a Device Tree binding for it (Markus Mayer). - Support for ARM Integrator/AP and Integrator/CP in the generic DT cpufreq driver and elimination of the old Integrator cpufreq driver (Linus Walleij). - Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier, and PXA SoCs in the the generic DT cpufreq driver (Baoyou Xie, Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik). - cpufreq core fix to eliminate races that may lead to using inactive policy objects and related cleanups (Rafael Wysocki). - cpufreq schedutil governor update to make it use SCHED_FIFO kernel threads (instead of regular workqueues) for doing delayed work (to reduce the response latency in some cases) and related cleanups (Viresh Kumar). - New cpufreq sysfs attribute for resetting statistics (Markus Mayer). - cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis, Viresh Kumar). - Support for using generic cpufreq governors in the intel_pstate driver (Rafael Wysocki). - Support for per-logical-CPU P-state limits and the EPP/EPB (Energy Performance Preference/Energy Performance Bias) knobs in the intel_pstate driver (Srinivas Pandruvada). - New CPU ID for Knights Mill in intel_pstate (Piotr Luc). - intel_pstate driver modification to use the P-state selection algorithm based on CPU load on platforms with the system profile in the ACPI tables set to "mobile" (Srinivas Pandruvada). - intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki, Srinivas Pandruvada). - cpufreq powernv driver updates including fast switching support (for the schedutil governor), fixes and cleanus (Akshay Adiga, Andrew Donnellan, Denis Kirjanov). - acpi-cpufreq driver rework to switch it over to the new CPU offline/online state machine (Sebastian Andrzej Siewior). - Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth Prakash). - Idle injection rework (to make it use the regular idle path instead of a home-grown custom one) and related powerclamp thermal driver updates (Peter Zijlstra, Jacob Pan, Petr Mladek, Sebastian Andrzej Siewior). - New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy Shevchenko, Piotr Luc). - intel_idle driver cleanups and switch over to using the new CPU offline/online state machine (Anna-Maria Gleixner, Sebastian Andrzej Siewior). - cpuidle DT driver update to support suspend-to-idle properly (Sudeep Holla). - cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian, Rafael Wysocki). - Preliminary support for power domains including CPUs in the generic power domains (genpd) framework and related DT bindings (Lina Iyer). - Assorted fixes and cleanups in the generic power domains (genpd) framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven). - Preliminary support for devices with multiple voltage regulators and related fixes and cleanups in the Operating Performance Points (OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd). - System sleep state selection interface rework to make it easier to support suspend-to-idle as the default system suspend method (Rafael Wysocki). - PM core fixes and cleanups, mostly related to the interactions between the system suspend and runtime PM frameworks (Ulf Hansson, Sahitya Tummala, Tony Lindgren). - Latency tolerance PM QoS framework imorovements (Andrew Lutomirski). - New Knights Mill CPU ID for the Intel RAPL power capping driver (Piotr Luc). - Intel RAPL power capping driver fixes, cleanups and switch over to using the new CPU offline/online state machine (Jacob Pan, Thomas Gleixner, Sebastian Andrzej Siewior). - Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc, rockchip-dfi devfreq drivers and the devfreq core (Axel Lin, Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh Kumar). - Fix for false-positive KASAN warnings during resume from ACPI S3 (suspend-to-RAM) on x86 (Josh Poimboeuf). - Memory map verification during resume from hibernation on x86 to ensure a consistent address space layout (Chen Yu). - Wakeup sources debugging enhancement (Xing Wei). - rockchip-io AVS driver cleanup (Shawn Lin). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYTx4+AAoJEILEb/54YlRx9f8P/2SlNHUENW5qh6FtCw00oC2u UqJerQJ2L38UgbgxbE/0VYblma9rFABDWC1eO2xN2XdcdW5UPBKPVvNcOgNe1Clh gjy3RxZXVpmjfzt2kGfsTLEuGnHqwvx51hTUkeA2LwvkOal45xb8ZESmy8opCtiv iG4LwmPHoxdX5Za5nA9ItFKzxyO1EoyNSnBYAVwALDHxmNOfxEcRevfurASt/0M9 brCCZJA0/sZxeL0lBdy8fNQPIBTUfCoTJG/MtmzGrObJ9wMFvEDfXrVEyZiWs/zA AAZ4kQL77enrIKgrLN8e0G6LzTLHoVcvn38Xjf24dKUqhd7ACBhYcnW+jK3+7EAd gjZ8efObQsiuyK/EDLUNw35tt96CHOqfrQCj2tIwRVvk9EekLqAGXdIndTCr2kYW RpefmP5kMljnm/nQFOVLwMEUQMuVkvUE7EgxADy7DoDmepBFC4ICRDWPye70R2kC 0O1Tn2PAQq4Fd1tyI9TYYz0YQQkRoaRb5rfYUSzbRbeCdsphUopp4Vhsiyn6IcnF XnLbg6pRAat82MoS9n4pfO/VCo8vkErKA8tut9G7TDakkrJoEE7l31PdKW0hP3f6 sBo6xXy6WTeivU/o/i8TbM6K4mA37pBaj78ooIkWLgg5fzRaS2+0xSPVy2H9x1m5 LymHcobCK9rSZ1l208Fe =vhxI -----END PGP SIGNATURE----- Merge tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "Again, cpufreq gets more changes than the other parts this time (one new driver, one old driver less, a bunch of enhancements of the existing code, new CPU IDs, fixes, cleanups) There also are some changes in cpuidle (idle injection rework, a couple of new CPU IDs, online/offline rework in intel_idle, fixes and cleanups), in the generic power domains framework (mostly related to supporting power domains containing CPUs), and in the Operating Performance Points (OPP) library (mostly related to supporting devices with multiple voltage regulators) In addition to that, the system sleep state selection interface is modified to make it easier for distributions with unchanged user space to support suspend-to-idle as the default system suspend method, some issues are fixed in the PM core, the latency tolerance PM QoS framework is improved a bit, the Intel RAPL power capping driver is cleaned up and there are some fixes and cleanups in the devfreq subsystem Specifics: - New cpufreq driver for Broadcom STB SoCs and a Device Tree binding for it (Markus Mayer) - Support for ARM Integrator/AP and Integrator/CP in the generic DT cpufreq driver and elimination of the old Integrator cpufreq driver (Linus Walleij) - Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier, and PXA SoCs in the the generic DT cpufreq driver (Baoyou Xie, Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik) - cpufreq core fix to eliminate races that may lead to using inactive policy objects and related cleanups (Rafael Wysocki) - cpufreq schedutil governor update to make it use SCHED_FIFO kernel threads (instead of regular workqueues) for doing delayed work (to reduce the response latency in some cases) and related cleanups (Viresh Kumar) - New cpufreq sysfs attribute for resetting statistics (Markus Mayer) - cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis, Viresh Kumar) - Support for using generic cpufreq governors in the intel_pstate driver (Rafael Wysocki) - Support for per-logical-CPU P-state limits and the EPP/EPB (Energy Performance Preference/Energy Performance Bias) knobs in the intel_pstate driver (Srinivas Pandruvada) - New CPU ID for Knights Mill in intel_pstate (Piotr Luc) - intel_pstate driver modification to use the P-state selection algorithm based on CPU load on platforms with the system profile in the ACPI tables set to "mobile" (Srinivas Pandruvada) - intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki, Srinivas Pandruvada) - cpufreq powernv driver updates including fast switching support (for the schedutil governor), fixes and cleanus (Akshay Adiga, Andrew Donnellan, Denis Kirjanov) - acpi-cpufreq driver rework to switch it over to the new CPU offline/online state machine (Sebastian Andrzej Siewior) - Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth Prakash) - Idle injection rework (to make it use the regular idle path instead of a home-grown custom one) and related powerclamp thermal driver updates (Peter Zijlstra, Jacob Pan, Petr Mladek, Sebastian Andrzej Siewior) - New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy Shevchenko, Piotr Luc) - intel_idle driver cleanups and switch over to using the new CPU offline/online state machine (Anna-Maria Gleixner, Sebastian Andrzej Siewior) - cpuidle DT driver update to support suspend-to-idle properly (Sudeep Holla) - cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian, Rafael Wysocki) - Preliminary support for power domains including CPUs in the generic power domains (genpd) framework and related DT bindings (Lina Iyer) - Assorted fixes and cleanups in the generic power domains (genpd) framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven) - Preliminary support for devices with multiple voltage regulators and related fixes and cleanups in the Operating Performance Points (OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd) - System sleep state selection interface rework to make it easier to support suspend-to-idle as the default system suspend method (Rafael Wysocki) - PM core fixes and cleanups, mostly related to the interactions between the system suspend and runtime PM frameworks (Ulf Hansson, Sahitya Tummala, Tony Lindgren) - Latency tolerance PM QoS framework imorovements (Andrew Lutomirski) - New Knights Mill CPU ID for the Intel RAPL power capping driver (Piotr Luc) - Intel RAPL power capping driver fixes, cleanups and switch over to using the new CPU offline/online state machine (Jacob Pan, Thomas Gleixner, Sebastian Andrzej Siewior) - Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc, rockchip-dfi devfreq drivers and the devfreq core (Axel Lin, Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh Kumar) - Fix for false-positive KASAN warnings during resume from ACPI S3 (suspend-to-RAM) on x86 (Josh Poimboeuf) - Memory map verification during resume from hibernation on x86 to ensure a consistent address space layout (Chen Yu) - Wakeup sources debugging enhancement (Xing Wei) - rockchip-io AVS driver cleanup (Shawn Lin)" * tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (127 commits) devfreq: rk3399_dmc: Don't use OPP structures outside of RCU locks devfreq: rk3399_dmc: Remove dangling rcu_read_unlock() devfreq: exynos: Don't use OPP structures outside of RCU locks Documentation: intel_pstate: Document HWP energy/performance hints cpufreq: intel_pstate: Support for energy performance hints with HWP cpufreq: intel_pstate: Add locking around HWP requests PM / sleep: Print active wakeup sources when blocking on wakeup_count reads PM / core: Fix bug in the error handling of async suspend PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend PM / Domains: Fix compatible for domain idle state PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators() PM / OPP: Allow platform specific custom set_opp() callbacks PM / OPP: Separate out _generic_set_opp() PM / OPP: Add infrastructure to manage multiple regulators PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage() PM / OPP: Manage supply's voltage/current in a separate structure PM / OPP: Don't use OPP structure outside of rcu protected section PM / OPP: Reword binding supporting multiple regulators per device PM / OPP: Fix incorrect cpu-supply property in binding cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state() ..	2016-12-13 10:41:53 -08:00
Rafael J. Wysocki	fecc8c0ebd	Merge branch 'pm-cpufreq' * pm-cpufreq: (51 commits) Documentation: intel_pstate: Document HWP energy/performance hints cpufreq: intel_pstate: Support for energy performance hints with HWP cpufreq: intel_pstate: Add locking around HWP requests cpufreq: ondemand: Set MIN_FREQUENCY_UP_THRESHOLD to 1 cpufreq: intel_pstate: Add Knights Mill CPUID MAINTAINERS: Add bug tracking system location entry for cpufreq cpufreq: dt: Add support for zx296718 cpufreq: acpi-cpufreq: drop rdmsr_on_cpus() usage cpufreq: acpi-cpufreq: Convert to hotplug state machine cpufreq: intel_pstate: fix intel_pstate_exit_perf_limits() prototype cpufreq: intel_pstate: Set EPP/EPB to 0 in performance mode cpufreq: schedutil: Rectify comment in sugov_irq_work() function cpufreq: intel_pstate: increase precision of performance limits cpufreq: intel_pstate: round up min_perf limits cpufreq: Make cpufreq_update_policy() void ACPI / processor: Make acpi_processor_ppc_has_changed() void cpufreq: Avoid using inactive policies cpufreq: intel_pstate: Generic governors support cpufreq: intel_pstate: Request P-states control from SMM if needed cpufreq: dt: Add support for r8a7743 and r8a7745 ...	2016-12-12 20:45:01 +01:00
Rafael J. Wysocki	57def856f3	Merge branch 'pm-opp' * pm-opp: PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators() PM / OPP: Allow platform specific custom set_opp() callbacks PM / OPP: Separate out _generic_set_opp() PM / OPP: Add infrastructure to manage multiple regulators PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage() PM / OPP: Manage supply's voltage/current in a separate structure PM / OPP: Don't use OPP structure outside of rcu protected section PM / OPP: Reword binding supporting multiple regulators per device PM / OPP: Fix incorrect cpu-supply property in binding PM / OPP: Pass opp_table to dev_pm_opp_put_regulator() PM / OPP: fix debug/error messages in dev_pm_opp_of_get_sharing_cpus() PM / OPP: make _of_get_opp_desc_node() a static function	2016-12-12 20:44:01 +01:00
Srinivas Pandruvada	984edbdccc	cpufreq: intel_pstate: Support for energy performance hints with HWP It is possible to provide hints to the HWP algorithms in the processor to be more performance centric to more energy centric. These hints are provided by using HWP energy performance preference (EPP) or energy performance bias (EPB) settings. The scope of these settings is per logical processor, which means that each of the logical processors in the package can be programmed with a different value. This change provides cpufreq sysfs interface to provide hint. For each policy, two additional attributes will be available to check and provide hint. These attributes will only be present when the intel_pstate driver is using HWP mode. These attributes are: - energy_performance_available_preferences - energy_performance_preference To get list of supported hints: $ cat energy_performance_available_preferences default performance balance_performance balance_power power The current preference can be read or changed via cpufreq sysfs attribute "energy_performance_preference". Reading from this attribute will display current effective setting changed via any method. User can write any of the valid preference string to this attribute. User can always restore to power-on default by writing "default". Implementation Since these hints can be provided by direct MSR write or using some tools like x86_energy_perf_policy, the driver internally doesn't maintain any state. The user operation will result in direct read/write of MSR: 0x774 (HWP_REQUEST_MSR). Also driver use read modify write to update other fields in this MSR. Summary of changes: - struct cpudata field epp_saved is renamed to epp_powersave, as this stores the value to restore once policy is switched from performance to powersave to restore original powersave EPP value. - A new struct cpudata field epp_saved is used to store the raw MSR EPP/EPB value when a CPU goes offline or on suspend and restore on online/resume. This ensures that EPP value is restored to correct value irrespective of the means used to set. - EPP/EPB value ranges are fixed for each preference, which can be set for the cpufreq sysfs, so user request is mapped to/from this range. - New attributes are only added when HWP is present. - Since EPP value of 0 is valid the fields are initialized to -EINVAL when not valid. The field epp_default is read only once after powerup to avoid reading on subsequent CPU online operation - New suspend callback to store epp on suspend operation - Don't invalidate old epp_saved field on resume and online as now we can restore last epp value on suspend and this field can still have old EPP value sampled during switch to performance from powersave. - While here optimized setting of cpu_data->epp_powersave = epp in intel_pstate_hwp_set() as this was done in both true and false paths. - epp/epb set function returns error to caller on failure to pass on to user space for display. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-12-08 01:43:05 +01:00
Srinivas Pandruvada	b59fe54053	cpufreq: intel_pstate: Add locking around HWP requests To avoid race conditions from multiple threads, increase the scope of intel_pstate_limits_lock to include HWP requests also. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-12-08 01:43:04 +01:00
Viresh Kumar	dfbe4678d7	PM / OPP: Add infrastructure to manage multiple regulators This patch adds infrastructure to manage multiple regulators and updates the only user (cpufreq-dt) of dev_pm_opp_set{put}_regulator(). This is preparatory work for adding full support for devices with multiple regulators. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Dave Gerlach <d-gerlach@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-12-06 02:27:59 +01:00
Chen Yu	4dd63b49a7	cpufreq: ondemand: Set MIN_FREQUENCY_UP_THRESHOLD to 1 Currently the minimal up_threshold is 11, and user may want to use a smaller minimal up_threshold for performance tuning, so MIN_FREQUENCY_UP_THRESHOLD could be set to 1 because: 1. Current systems wouldn't be affected as they have already a value >= 11. 2. New systems with a default kernel would keep still the default value that is >= 11. Users now have the advantage that they can make their own decisions and customize the 'trip point' to switch to the max frequency. Link: https://bugzilla.kernel.org/show_bug.cgi?id=65501 Signed-off-by: Chen Yu <yu.c.chen@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-12-01 22:43:33 +01:00
Piotr Luc	58bf454272	cpufreq: intel_pstate: Add Knights Mill CPUID Add Knights Mill (KNM) to the list of CPUIDs supported by intel_pstate. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-12-01 15:08:12 +01:00
Baoyou Xie	ab83805667	cpufreq: dt: Add support for zx296718 Add the compatible string for supporting the generic cpufreq driver on the ZTE's zx296718 SoC. Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-30 22:42:47 +01:00
Stephen Boyd	91291d9ad9	PM / OPP: Pass opp_table to dev_pm_opp_put_regulator() Joonyoung Shim reported an interesting problem on his ARM octa-core Odoroid-XU3 platform. During system suspend, dev_pm_opp_put_regulator() was failing for a struct device for which dev_pm_opp_set_regulator() is called earlier. This happened because an earlier call to dev_pm_opp_of_cpumask_remove_table() function (from cpufreq-dt.c file) removed all the entries from opp_table->dev_list apart from the last CPU device in the cpumask of CPUs sharing the OPP. But both dev_pm_opp_set_regulator() and dev_pm_opp_put_regulator() routines get CPU device for the first CPU in the cpumask. And so the OPP core failed to find the OPP table for the struct device. This patch attempts to fix this problem by returning a pointer to the opp_table from dev_pm_opp_set_regulator() and using that as the parameter to dev_pm_opp_put_regulator(). This ensures that the dev_pm_opp_put_regulator() doesn't fail to find the opp table. Note that similar design problem also exists with other dev_pm_opp_put_*() APIs, but those aren't used currently by anyone and so we don't need to update them for now. Cc: 4.4+ <stable@vger.kernel.org> # 4.4+ Reported-by: Joonyoung Shim <jy0922.shim@samsung.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ Viresh: Wrote commit log and tested on exynos 5250 ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-30 22:41:28 +01:00
Tim Chen	de966cf4a4	sched/x86: Change CONFIG_SCHED_ITMT to CONFIG_SCHED_MC_PRIO Rename CONFIG_SCHED_ITMT for Intel Turbo Boost Max Technology 3.0 to CONFIG_SCHED_MC_PRIO. This makes the configuration extensible in future to other architectures that wish to similarly establish CPU core priorities support in the scheduler. The description in Kconfig is updated to reflect this change with added details for better clarity. The configuration is explicitly default-y, to enable the feature on CPUs that have this feature. It has no effect on non-TBM3 CPUs. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@suse.de Cc: jolsa@redhat.com Cc: linux-acpi@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/2b2ee29d93e3f162922d72d0165a1405864fbb23.1480444902.git.tim.c.chen@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-30 08:27:08 +01:00
Sebastian Andrzej Siewior	a3605c46e0	cpufreq: acpi-cpufreq: drop rdmsr_on_cpus() usage The online / pre_down callback is invoked on the target CPU since commit `1cf4f629d9` ("cpu/hotplug: Move online calls to hotplugged cpu") which means for the hotplug callback we can use rmdsrl() instead of rdmsr_on_cpus(). This leaves us with set_boost() as the only user which still needs to read/write the MSR on different CPUs. There is no point in doing that update on all cpus with the read modify write magic via per cpu data. We simply can issue a function call on all online CPUs which also means that we need half that many IPIs. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-28 14:31:06 +01:00
Sebastian Andrzej Siewior	4d66ddf28d	cpufreq: acpi-cpufreq: Convert to hotplug state machine Install the callbacks via the state machine. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-28 14:31:06 +01:00
Arnd Bergmann	7a3ba767f6	cpufreq: intel_pstate: fix intel_pstate_exit_perf_limits() prototype The addition of the generic governor support marked the intel_pstate_exit_perf_limits as inline(), which fixed a warning, but it introduced another warning: drivers/cpufreq/intel_pstate.c: In function ‘intel_pstate_exit_perf_limits’: drivers/cpufreq/intel_pstate.c:483:1: error: no return statement in function returning non-void [-Werror=return-type] This changes it back to a 'void' return type, and changes the corresponding intel_pstate_init_acpi_perf_limits() function to be inline as well for consistency. Fixes: `001c76f05b` (cpufreq: intel_pstate: Generic governors support) Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-28 14:24:21 +01:00
Srinivas Pandruvada	8442885fca	cpufreq: intel_pstate: Set EPP/EPB to 0 in performance mode When user has selected performance policy, then set the EPP (Energy Performance Preference) or EPB (Energy Performance Bias) to maximum performance mode. Also when user switch back to powersave, then restore EPP/EPB to last EPP/EPB value before entering performance mode. If user has not changed EPP/EPB manually then it will be power on default value. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-28 14:23:56 +01:00
Rafael J. Wysocki	17669006ad	cpufreq/intel_pstate: Use CPPC to get max performance Use the acpi cppc_lib interface to get CPPC performance limits and update the per cpu priority for the ITMT scheduler. If the highest performance of CPUs differs the ITMT feature is enabled. Co-developed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: linux-pm@vger.kernel.org Cc: peterz@infradead.org Cc: jolsa@redhat.com Cc: rjw@rjwysocki.net Cc: linux-acpi@vger.kernel.org Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: bp@suse.de Link: http://lkml.kernel.org/r/0998b98943bcdec7d1ddd4ff27358da555ea8e92.1479844244.git.tim.c.chen@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-11-24 20:44:20 +01:00
Srinivas Pandruvada	d5dd33d9de	cpufreq: intel_pstate: increase precision of performance limits Even with round up of limits->min_perf and limits->max_perf, in some cases resultant performance is 100 MHz less than the desired. For example when the maximum frequency is 3.50 GHz, setting scaling_min_frequency to 2.3 GHz always results in 2.2 GHz minimum. Currently the fixed floating point operation uses 8 bit precision for calculating limits->min_perf and limits->max_perf. For some operations in this driver the 14 bit precision is used. Using the 14 bit precision also for calculating limits->min_perf and limits->max_perf, addresses this issue. Introduced fp_ext_toint() equivalent to fp_toint() and int_ext_tofp() equivalent to int_tofp() with 14 bit precision. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-22 02:31:49 +01:00
Srinivas Pandruvada	46992d6b55	cpufreq: intel_pstate: round up min_perf limits In some use cases, user wants to enforce a minimum performance limit on CPUs. But because of simple division the resultant performance is 100 MHz less than the desired in some cases. For example when the maximum frequency is 3.50 GHz, setting scaling_min_frequency to 1.6 GHz always results in 1.5 GHz minimum. With simple round up, the frequency can be set to 1.6 GHz to minimum in this case. This round up is already done to max_policy_pct and max_perf, so do the same for min_policy_pct and min_perf. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-22 02:31:48 +01:00
Rafael J. Wysocki	30248feff5	cpufreq: Make cpufreq_update_policy() void The return value of cpufreq_update_policy() is never used, so make it void. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-11-21 14:35:43 +01:00
Rafael J. Wysocki	182e36af06	cpufreq: Avoid using inactive policies There are two places in the cpufreq core in which low-level driver callbacks may be invoked for an inactive cpufreq policy, which isn't guaranteed to work in general. Both are due to possible races with CPU offline. First, in cpufreq_get(), the policy may become inactive after the check against policy->cpus in cpufreq_cpu_get() and before policy->rwsem is acquired, in which case using it going forward may not be correct. Second, an analogous situation is possible in cpufreq_update_policy(). Avoid using inactive policies by adding policy_is_inactive() checks to the code in the above places. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-11-21 14:35:42 +01:00
Rafael J. Wysocki	001c76f05b	cpufreq: intel_pstate: Generic governors support There may be reasons to use generic cpufreq governors (eg. schedutil) on Intel platforms instead of the intel_pstate driver's internal governor. However, that currently can only be done by disabling intel_pstate altogether and using the acpi-cpufreq driver instead of it, which is subject to limitations. First of all, acpi-cpufreq only works on systems where the _PSS object is present in the ACPI tables for all logical CPUs. Second, on those systems acpi-cpufreq will only use frequencies listed by _PSS which may be suboptimal. In particular, by convention, the whole turbo range is represented in _PSS as a single P-state and the frequency assigned to it is greater by 1 MHz than the greatest non-turbo frequency listed by _PSS. That may confuse governors to use turbo frequencies less frequently which may lead to suboptimal performance. For this reason, make it possible to use the intel_pstate driver with generic cpufreq governors as a "normal" cpufreq driver. That mode is enforced by adding intel_pstate=passive to the kernel command line and cannot be disabled at run time. In that mode, intel_pstate provides a cpufreq driver interface including the ->target() and ->fast_switch() callbacks and is listed in scaling_driver as "intel_cpufreq". Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Doug Smythies <dsmythies@telus.net>	2016-11-21 14:32:32 +01:00
Rafael J. Wysocki	d0ea59e188	cpufreq: intel_pstate: Request P-states control from SMM if needed Currently, intel_pstate is unable to control P-states on my IvyBridge-based Acer Aspire S5, because they are controlled by SMM on that machine by default and it is necessary to request OS control of P-states from it via the SMI Command register exposed in the ACPI FADT. intel_pstate doesn't do that now, but acpi-cpufreq and other cpufreq drivers for x86 platforms do. Address this problem by making intel_pstate use the ACPI-defined mechanism as well. However, intel_pstate is not modular and it doesn't need the module refcount tricks played by acpi_processor_notify_smm(), so export the core of this function to it as acpi_processor_pstate_control() and make it call that. [The changes in processor_perflib.c related to this should not make any functional difference for the acpi_processor_notify_smm() users]. To be safe, only call acpi_processor_notify_smm() from intel_pstate if ACPI _PPC support is enabled in it. Suggested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-11-17 22:47:47 +01:00
Geert Uytterhoeven	f0da898b46	cpufreq: dt: Add support for r8a7743 and r8a7745 Add the compatible strings for supporting the generic cpufreq driver on the Renesas RZ/G1M (r8a7743) and RZ/G1E (r8a7745) SoCs. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-16 23:31:52 +01:00
Denis Kirjanov	8a10c06a20	cpufreq: powernv: Disable preemption while checking CPU throttling state With preemption turned on we can read incorrect throttling state while being switched to CPU on a different chip. BUG: using smp_processor_id() in preemptible [00000000] code: cat/7343 caller is .powernv_cpufreq_throttle_check+0x2c/0x710 CPU: 13 PID: 7343 Comm: cat Not tainted 4.8.0-rc5-dirty #1 Call Trace: [c0000007d25b75b0] [c000000000971378] .dump_stack+0xe4/0x150 (unreliable) [c0000007d25b7640] [c0000000005162e4] .check_preemption_disabled+0x134/0x150 [c0000007d25b76e0] [c0000000007b63ac] .powernv_cpufreq_throttle_check+0x2c/0x710 [c0000007d25b7790] [c0000000007b6d18] .powernv_cpufreq_target_index+0x288/0x360 [c0000007d25b7870] [c0000000007acee4] .__cpufreq_driver_target+0x394/0x8c0 [c0000007d25b7920] [c0000000007b22ac] .cpufreq_set+0x7c/0xd0 [c0000007d25b79b0] [c0000000007adf50] .store_scaling_setspeed+0x80/0xc0 [c0000007d25b7a40] [c0000000007ae270] .store+0xa0/0x100 [c0000007d25b7ae0] [c0000000003566e8] .sysfs_kf_write+0x88/0xb0 [c0000007d25b7b70] [c0000000003553b8] .kernfs_fop_write+0x178/0x260 [c0000007d25b7c10] [c0000000002ac3cc] .__vfs_write+0x3c/0x1c0 [c0000007d25b7cf0] [c0000000002ad584] .vfs_write+0xc4/0x230 [c0000007d25b7d90] [c0000000002aeef8] .SyS_write+0x58/0x100 [c0000007d25b7e30] [c00000000000bfec] system_call+0x38/0xfc Fixes: `09a972d162` (cpufreq: powernv: Report cpu frequency throttling) Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-16 23:29:59 +01:00
Stratos Karafotis	42d951c851	cpufreq: conservative: Fix comment explaining frequency updates The original comment about the frequency increase to maximum is wrong. Both increase and decrease happen at steps. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-16 23:15:56 +01:00
Stratos Karafotis	00bfe05889	cpufreq: conservative: Decrease frequency faster for deferred updates Conservative governor changes the CPU frequency in steps. That means that if a CPU runs at max frequency, it will need several sampling periods to return to min frequency when the workload is finished. If the update function that calculates the load and target frequency is deferred, the governor might need even more time to decrease the frequency. This may have impact to power consumption and after all conservative should decrease the frequency if there is no workload at every sampling rate. To resolve the above issue calculate the number of sampling periods that the update is deferred. Considering that for each sampling period conservative should drop the frequency by a freq_step because the CPU was idle apply the proper subtraction to requested frequency. Below, the kernel trace with and without this patch. First an intensive workload is applied on a specific CPU. Then the workload is removed and the CPU goes to idle. WITHOUT <idle>-0 [007] dN.. 620.329153: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 620.350857: cpu_frequency: state=1700000 cpu_id=7 kworker/7:2-556 [007] .... 620.370856: cpu_frequency: state=1900000 cpu_id=7 kworker/7:2-556 [007] .... 620.390854: cpu_frequency: state=2100000 cpu_id=7 kworker/7:2-556 [007] .... 620.411853: cpu_frequency: state=2200000 cpu_id=7 kworker/7:2-556 [007] .... 620.432854: cpu_frequency: state=2400000 cpu_id=7 kworker/7:2-556 [007] .... 620.453854: cpu_frequency: state=2600000 cpu_id=7 kworker/7:2-556 [007] .... 620.494856: cpu_frequency: state=2900000 cpu_id=7 kworker/7:2-556 [007] .... 620.515856: cpu_frequency: state=3100000 cpu_id=7 kworker/7:2-556 [007] .... 620.536858: cpu_frequency: state=3300000 cpu_id=7 kworker/7:2-556 [007] .... 620.557857: cpu_frequency: state=3401000 cpu_id=7 <idle>-0 [007] d... 669.591363: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 669.591939: cpu_idle: state=4294967295 cpu_id=7 <idle>-0 [007] d... 669.591980: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] dN.. 669.591989: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 670.201224: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 670.221975: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 670.222016: cpu_frequency: state=3300000 cpu_id=7 <idle>-0 [007] d... 670.222026: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 670.234964: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 670.801251: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 671.236046: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 671.236073: cpu_frequency: state=3100000 cpu_id=7 <idle>-0 [007] d... 671.236112: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 671.393437: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 671.401277: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 671.404083: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 671.404111: cpu_frequency: state=2900000 cpu_id=7 <idle>-0 [007] d... 671.404125: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 671.404974: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 671.501180: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 671.995414: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 671.995459: cpu_frequency: state=2800000 cpu_id=7 <idle>-0 [007] d... 671.995469: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 671.996287: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 672.001305: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 672.078374: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 672.078410: cpu_frequency: state=2600000 cpu_id=7 <idle>-0 [007] d... 672.078419: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 672.158020: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 672.158040: cpu_frequency: state=2400000 cpu_id=7 <idle>-0 [007] d... 672.158044: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 672.160038: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 672.234557: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 672.237121: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 672.237174: cpu_frequency: state=2100000 cpu_id=7 <idle>-0 [007] d... 672.237186: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 672.237778: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 672.267902: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 672.269860: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 672.269906: cpu_frequency: state=1900000 cpu_id=7 <idle>-0 [007] d... 672.269914: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 672.271902: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 672.751342: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 672.823056: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-556 [007] .... 672.823095: cpu_frequency: state=1600000 cpu_id=7 WITH <idle>-0 [007] dN.. 4380.928009: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-399 [007] .... 4380.949767: cpu_frequency: state=2000000 cpu_id=7 kworker/7:2-399 [007] .... 4380.969765: cpu_frequency: state=2200000 cpu_id=7 kworker/7:2-399 [007] .... 4381.009766: cpu_frequency: state=2500000 cpu_id=7 kworker/7:2-399 [007] .... 4381.029767: cpu_frequency: state=2600000 cpu_id=7 kworker/7:2-399 [007] .... 4381.049769: cpu_frequency: state=2800000 cpu_id=7 kworker/7:2-399 [007] .... 4381.069769: cpu_frequency: state=3000000 cpu_id=7 kworker/7:2-399 [007] .... 4381.089771: cpu_frequency: state=3100000 cpu_id=7 kworker/7:2-399 [007] .... 4381.109772: cpu_frequency: state=3400000 cpu_id=7 kworker/7:2-399 [007] .... 4381.129773: cpu_frequency: state=3401000 cpu_id=7 <idle>-0 [007] d... 4428.226159: cpu_idle: state=1 cpu_id=7 <idle>-0 [007] d... 4428.226176: cpu_idle: state=4294967295 cpu_id=7 <idle>-0 [007] d... 4428.226181: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 4428.227177: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 4428.551640: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 4428.649239: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-399 [007] .... 4428.649268: cpu_frequency: state=2800000 cpu_id=7 <idle>-0 [007] d... 4428.649278: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 4428.689856: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 4428.799542: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 4428.801683: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-399 [007] .... 4428.801748: cpu_frequency: state=1700000 cpu_id=7 <idle>-0 [007] d... 4428.801761: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 4428.806545: cpu_idle: state=4294967295 cpu_id=7 ... <idle>-0 [007] d... 4429.051880: cpu_idle: state=4 cpu_id=7 <idle>-0 [007] d... 4429.086240: cpu_idle: state=4294967295 cpu_id=7 kworker/7:2-399 [007] .... 4429.086293: cpu_frequency: state=1600000 cpu_id=7 Without the patch the CPU dropped to min frequency after 3.2s With the patch applied the CPU dropped to min frequency after 0.86s Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-16 23:15:56 +01:00
Viresh Kumar	d5f905a93c	cpufreq: conservative: Rename get_freq_target() to get_freq_step() What's returned from this function is the delta by which the frequency must be increased or decreased and not the final frequency that should be selected. Name it properly to match its purpose. Also update the variables used to store that value. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-14 21:34:52 +01:00
Akshay Adiga	c9a81e6864	cpufreq: powernv: Fix uninitialized lpstate_idx in gpstates_timer_handler() lpstate_idx remains uninitialized in the case when elapsed_time is greater than MAX_RAMP_DOWN_TIME. At the end of rampdown the global pstate should be equal to the local pstate. Fixes: `20b15b7663` (cpufreq: powernv: Use PMCR to verify global and localpstate) Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-14 21:32:31 +01:00
Srinivas Pandruvada	7f7a516ee3	cpufreq: intel_pstate: Use CPU load based algorithm for PM_MOBILE Use get_target_pstate_use_cpu_load() to calculate target P-State for devices, with the preferred power management profile in ACPI FADT set to PM_MOBILE. This may help in resolving some thermal issues caused by low sustained cpu bound workloads. The current algorithm tend to over provision in this case as it doesn't look at the CPU busyness. Also included the fix from Arnd Bergmann <arnd@arndb.de> to solve compile issue, when CONFIG_ACPI is not defined. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-14 21:25:23 +01:00
Robert Jarzmik	dcd2ea410d	cpufreq: pxa: use generic platdev driver for device-tree For device-tree based pxa25x and pxa27x platforms, cpufreq-dt driver is doing the job as well as pxa2xx-cpufreq, so add these platforms to the compatibility list. This won't work for legacy non device-tree platforms where pxa2xx-cpufreq is still required. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-11 02:08:42 +01:00
Markus Mayer	ee7930ee27	cpufreq: stats: New sysfs attribute for clearing statistics Allow CPUfreq statistics to be cleared by writing anything to /sys/.../cpufreq/stats/reset. Signed-off-by: Markus Mayer <mmayer@broadcom.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-11 01:51:11 +01:00
Viresh Kumar	26f0dbc9ab	cpufreq: governor: Don't use 'timer' keyword The earlier implementation of governors used background timers and so functions, mutex, etc had 'timer' keyword in their names. But that's not true anymore. Replace 'timer' with 'update', as those functions, variables are based around updates to frequency. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-11 01:48:33 +01:00
Akshay Adiga	20b15b7663	cpufreq: powernv: Use PMCR to verify global and local pstate As fast_switch() may get called with interrupt disable mode, we cannot hold a mutex to update the global_pstate_info. So currently, fast_switch() does not update the global_pstate_info and it will end up with stale data whenever pstate is updated through fast_switch(). As the gpstate_timer can fire after fast_switch() has updated the pstates, the timer handler cannot rely on the cached values of local and global pstate and needs to read it from the PMCR. Only gpstate_timer_handler() is affected by the stale cached pstate data beacause either fast_switch() or target_index() routines will be called for a given govenor, but gpstate_timer can fire after the governor has changed to schedutil. Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-11 01:41:02 +01:00
Akshay Adiga	60c9efb8f7	cpufreq: powernv: Adding fast_switch for schedutil Adding fast_switch which does light weight operation to set the desired pstate. Both global and local pstates are set to the same desired pstate. Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-11 01:41:02 +01:00
Wei Yongjun	e7d040b8a2	cpufreq: brcmstb-avs-cpufreq: make symbol brcm_avs_cpufreq_attr static Fixes the following sparse warning: drivers/cpufreq/brcmstb-avs-cpufreq.c:982:18: warning: symbol 'brcm_avs_cpufreq_attr' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Markus Mayer <mmayer@broadcom.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-11 01:32:53 +01:00
Srinivas Pandruvada	a410c03d66	cpufreq: intel_pstate: protect limits variable The limits variable gets modified from intel_pstate sysfs and also gets modified from cpufreq sysfs. So protect with a mutex to keep data integrity, when they are getting modified from multiple threads. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-01 06:10:54 +01:00
Markus Mayer	33de45c133	cpufreq: brcmstb-avs-cpufreq: add debugfs support In order to aid debugging, we add a debugfs interface to the driver that allows direct interaction with the AVS co-processor. The debugfs interface provides a means for reading all and writing some of the mailbox registers directly from the shell prompt and enables a user to execute the communications protocol between ARM CPU and AVS CPU step-by-step. This interface should be used for debugging purposes only. Signed-off-by: Markus Mayer <mmayer@broadcom.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-01 06:07:38 +01:00
Markus Mayer	de322e0859	cpufreq: brcmstb-avs-cpufreq: AVS CPUfreq driver for Broadcom STB SoCs This driver supports voltage and frequency scaling on Broadcom STB SoCs using AVS firmware with DFS and DVFS support. Actual frequency or voltage scaling is done exclusively by the AVS firmware. The driver merely provides a standard CPUfreq interface to other kernel components and userland, and instructs the AVS firmware to perform frequency or voltage changes on its behalf. Signed-off-by: Markus Mayer <mmayer@broadcom.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-01 06:07:37 +01:00
Masahiro Yamada	1758b3374b	cpufreq: dt: add Socionext UniPhier SoCs support Add compatible strings for Pro5, PXs2, LD6b, LD11, LD20 SoCs to use the generic cpufreq driver. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-01 06:05:42 +01:00
Srinivas Pandruvada	5879f87739	cpufreq: intel_pstate: Reduce impact due to rounding error When policy->max and policy->min are same, in some cases they don't result in the same frequency cap. The max_policy_pct is rounded up but not min_perf_pct. So even when they are same, results in different percentage or maximum and minimum. Since minimum is a conservative value for power, a lower value without rounding is better in most of the cases, unless user wants policy->max = policy->min. This change uses use the same policy percentage when policy->max and policy->min are same. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-01 06:04:06 +01:00
Srinivas Pandruvada	eae48f046f	cpufreq: intel_pstate: Per CPU P-State limits Intel P-State offers two interface to set performance limits: - Intel P-State sysfs /sys/devices/system/cpu/intel_pstate/max_perf_pct /sys/devices/system/cpu/intel_pstate/min_perf_pct - cpufreq /sys/devices/system/cpu/cpu/cpufreq/scaling_max_freq /sys/devices/system/cpu/cpu/cpufreq/scaling_min_freq In the current implementation both of the above methods, change limits to every CPU in the system. Moreover the limits placed using cpufreq policy interface also presented in the Intel P-State sysfs via modified max_perf_pct and min_per_pct during sysfs reads. This allows to check percent of reduced/increased performance, irrespective of method used to limit. There are some new generations of processors, where it is possible to have limits placed on individual CPU cores. Using cpufreq interface it is possible to set limits on each CPU. But the current processing will use last limits placed on all CPUs. So the per core limit feature of CPUs can't be used. This change brings in capability to set P-States limits for each CPU, with some limitations. In this case what should be the read of max_perf_pct and min_perf_pct? It can be most restrictive limits placed on any CPU or max possible performance on any given CPU on which no limits are placed. In either case someone will have issue. So the consensus is, we can't have both sysfs controls present when user wants to use limit per core limits. - By default per-core-control feature is not enabled. So no one will notice any difference. - The way to enable is by kernel command line intel_pstate=per_cpu_perf_limits - When the per-core-controls are enabled there is no display of for both read and write on /sys/devices/system/cpu/intel_pstate/max_perf_pct /sys/devices/system/cpu/intel_pstate/min_perf_pct - User can change limits using /sys/devices/system/cpu/cpu/cpufreq/scaling_max_freq /sys/devices/system/cpu/cpu/cpufreq/scaling_min_freq /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor - User can still observe turbo percent and number of P-States from /sys/devices/system/cpu/intel_pstate/turbo_pct /sys/devices/system/cpu/intel_pstate/num_pstates - User can read write system wide turbo status /sys/devices/system/cpu/no_turbo While changing this BUG_ON is changed to WARN_ON, as they are not fatal errors for the system. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-01 06:04:06 +01:00
Linus Walleij	ae8b8d8f86	cpufreq: retire the Integrator cpufreq driver After switching the core module clocks controlling the Integrator clock frequencies to the common clock framework, defining the operating points in the device tree, and activating the generic DT-based CPUfreq driver, we can retire the old Integrator cpufreq driver. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-01 06:01:18 +01:00
Linus Walleij	650ec6cfe3	cpufreq: enable the DT cpufreq driver on the Integrators This enables the generic DT and OPP-based cpufreq driver on the ARM Integrator/AP and Integrator/CP. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-11-01 06:01:18 +01:00
Rafael J. Wysocki	fe0f59c412	Merge back earlier cpufreq material for v4.10.	2016-10-30 06:12:50 +01:00
Rafael J. Wysocki	8b2ada27dc	Merge branches 'pm-cpufreq-fixes' and 'pm-sleep-fixes' * pm-cpufreq-fixes: cpufreq: intel_pstate: Always set max P-state in performance mode cpufreq: intel_pstate: Set P-state upfront in performance mode * pm-sleep-fixes: PM / suspend: Fix missing KERN_CONT for suspend message	2016-10-29 01:29:17 +02:00
Rafael J. Wysocki	2f1d407ada	cpufreq: intel_pstate: Always set max P-state in performance mode The only times at which intel_pstate checks the policy set for a given CPU is the initialization of that CPU and updates of its policy settings from cpufreq when intel_pstate_set_policy() is invoked. That is insufficient, however, because intel_pstate uses the same P-state selection function for all CPUs regardless of the policy setting for each of them and the P-state limits are shared between them. Thus if the policy is set to "performance" for a particular CPU, it may not behave as expected if the cpufreq settings are changed subsequently for another CPU. That can be easily demonstrated by writing "performance" to scaling_governor for all CPUs and then switching it to "powersave" for one of them in which case all of the CPUs will behave as though their scaling_governor were all "powersave" (even though the policy still appears to be "performance" for the remaining CPUs). Fix this problem by modifying intel_pstate_adjust_busy_pstate() to always set the P-state to the maximum allowed by the current limits for all CPUs whose policy is set to "performance". Note that it still is recommended to always change the policy setting in the same way for all CPUs even with this fix applied to avoid confusion. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-10-24 23:20:25 +02:00
Rafael J. Wysocki	a6c6ead141	cpufreq: intel_pstate: Set P-state upfront in performance mode After commit `a4675fbc4a` (cpufreq: intel_pstate: Replace timers with utilization update callbacks) the cpufreq governor callbacks may not be invoked on NOHZ_FULL CPUs and, in particular, switching to the "performance" policy via sysfs may not have any effect on them. That is a problem, because it usually is desirable to squeeze the last bit of performance out of those CPUs, so work around it by setting the maximum P-state (within the limits) in intel_pstate_set_policy() upfront when the policy is CPUFREQ_POLICY_PERFORMANCE. Fixes: `a4675fbc4a` (cpufreq: intel_pstate: Replace timers with utilization update callbacks) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-10-21 22:18:22 +02:00
Srinivas Pandruvada	185d82456e	cpufreq: intel_pstate: Remove PID debugfs when not used When target state is calculated using get_target_pstate_use_cpu_load(), PID controller is not used, hence it has no effect on performance. So don't present debugfs entries to tune PID controller. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-10-21 22:16:26 +02:00
Rafael J. Wysocki	1d29815ef2	cpufreq: intel_pstate: Drop boost_iowait flag The "IOwait boosting" mechanism is only used by the get_target_pstate_use_cpu_load() governor function and the boost_iowait flag in pid_params is always set when that function is in use (and it is never set otherwise). This means that the boost_iowait flag is in fact redundant and may be dropped. For this reason, replace the boost_iowait flag check in intel_pstate_update_util() with an equivalent check against pstate_funcs.get_target_pstate and drop that flag. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-10-21 22:13:51 +02:00
Prakash, Prashanth	974f86498e	cpufreq / CPPC: Add MODULE_DEVICE_TABLE for cppc_cpufreq driver MODULE_DEVICE_TABLE is added so that CPPC cpufreq module can be automatically loaded when we have a acpi processor device with "ACPI0007" hid. Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-10-21 15:11:23 +02:00
Linus Torvalds	ef98988ba3	More power management updates for v4.9-rc1 - Fix two cpufreq regressions causing undesirable changes in behavior to appear (one in the core and one in the conservative governor) introduced during the 4.8 cycle (Aaro Koskinen, Rafael Wysocki). - Fix the way the intel_pstate driver accesses MSRs related to the hardware-managed P-states (HWP) feature during the initialization which currently is unsafe and may cause the processor to generate a general protection fault (Srinivas Pandruvada). - Rework the intel_pstate's P-state selection algorithm used on Atom processors to avoid known problems with the current one and to make the computation more straightforward, which also happens to improve performance in multiple benchmarks a bit (Rafael Wysocki). - Improve two comments in the intel_pstate driver (Rafael Wysocki). - Fix the desired performance computation in the CPPC cpufreq driver (Hoan Tran). - Fix the devfreq core to avoid printing misleading error messages in some cases (Tobias Jakobi). - Fix the error code path in devfreq_add_device() to use proper locking around list modifications (Axel Lin). - Fix a build failure and remove a couple of redundant updates of variables in the exynos-nocp devfreq driver (Axel Lin). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJYANqMAAoJEILEb/54YlRx+U4P/A1ZJ/93u+ChipehTckNDogR xMCNsUz6Pn9VIdilEnaUcsCaNc93R7e6KjwgSO7Caeriw4syW3YZz2LuGQTihs8b 5vnvVvya9Bw1aXUweeayogMyOYZV1y1G/yzq7/+c02/cgxO8WBPnmGrE17Mhu43q IF1pQJ257e0HgKKspuzy+twRCLwnOqHbvWtQnEi2rzuaGrsK7XZk9yRuaXK4NshQ +M9hrHlw+OdmI+9lLmH8Ap2G68EJ4Q2o69sbAQ6MWgxRU44D0uEqgbT16cIdDs3J c9VCgiqHuhj2bfd9vqNAjr4bGdy4iwcEKyz2nkIl0KEq9tTPtJky8v6WUzV0+rbR xVbGIWg8X5wKe/Ndve2GLDrqhuVJ0hZkRdqpzRgm08VBGpRlmM0Gjqk+uEKqA2n2 IhidwTlzbQFVh437cjqupCKVXPb2POdgNyk4fEK7WVckRR3K7LR+rXoWN1uwW2YJ 9rjQBX0n2UfZ9Ft+gVO6/faWZlqLPmx60lHQSXNHvNY04HfZ5EiRFGEZEX1g0Uep 16nYHpB+qx/GwR7druGQVVY58YEp2g68jbpL2ehr2lLBYVSExy0kiOrS7GpoA0vd ngImjroJ842wQYjfek4Gi8VfGu+tsuMIVdjltOn1sVZ1QvprgF/atZHcN84eV8BU OyEGOQ7H1idEZ14Oa19C =3yoB -----END PGP SIGNATURE----- Merge tag 'pm-extra-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "This includes a couple of fixes for cpufreq regressions introduced in 4.8, a rework of the intel_pstate algorithm used on Atom processors (that took some time to test) plus a fix and a couple of cleanups in that driver, a CPPC cpufreq driver fix, and a some devfreq fixes and cleanups (core and exynos-nocp). Specifics: - Fix two cpufreq regressions causing undesirable changes in behavior to appear (one in the core and one in the conservative governor) introduced during the 4.8 cycle (Aaro Koskinen, Rafael Wysocki). - Fix the way the intel_pstate driver accesses MSRs related to the hardware-managed P-states (HWP) feature during the initialization which currently is unsafe and may cause the processor to generate a general protection fault (Srinivas Pandruvada). - Rework the intel_pstate's P-state selection algorithm used on Atom processors to avoid known problems with the current one and to make the computation more straightforward, which also happens to improve performance in multiple benchmarks a bit (Rafael Wysocki). - Improve two comments in the intel_pstate driver (Rafael Wysocki). - Fix the desired performance computation in the CPPC cpufreq driver (Hoan Tran). - Fix the devfreq core to avoid printing misleading error messages in some cases (Tobias Jakobi). - Fix the error code path in devfreq_add_device() to use proper locking around list modifications (Axel Lin). - Fix a build failure and remove a couple of redundant updates of variables in the exynos-nocp devfreq driver (Axel Lin)" * tag 'pm-extra-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: CPPC: Correct desired_perf calculation cpufreq: conservative: Fix next frequency selection cpufreq: skip invalid entries when searching the frequency cpufreq: intel_pstate: Fix struct pstate_adjust_policy kerneldoc cpufreq: intel_pstate: Proportional algorithm for Atom PM / devfreq: Skip status update on uninitialized previous_freq PM / devfreq: Add proper locking around list_del() PM / devfreq: exynos-nocp: Remove redundant code PM / devfreq: exynos-nocp: Select REGMAP_MMIO cpufreq: intel_pstate: Clarify comment in get_target_pstate_use_performance() cpufreq: intel_pstate: Fix unsafe HWP MSR access	2016-10-14 12:46:13 -07:00
Hoan Tran	c197d75803	cpufreq: CPPC: Correct desired_perf calculation The desired_perf is an abstract performance number. Its value should be in the range of [lowest perf, highest perf] of CPPC. The correct calculation is desired_perf = freq * cppc_highest_perf / cppc_dmi_max_khz And cppc_cpufreq_set_target() returns if desired_perf is exactly the same with the old perf. Signed-off-by: Hoan Tran <hotran@apm.com> Reviewed-by: Prashanth Prakash <pprakash@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-10-13 23:10:41 +02:00
Rafael J. Wysocki	abb6627910	cpufreq: conservative: Fix next frequency selection Commit `d352cf47d9` (cpufreq: conservative: Do not use transition notifications) overlooked the case when the "frequency step" used by the conservative governor is small relative to the distances between the available frequencies and broke the algorithm by using policy->cur instead of the previously requested frequency when computing the next one. As a result, the governor may not be able to go outside of a narrow range between two consecutive available frequencies. Fix the problem by making the governor save the previously requested frequency and select the next one relative that value (unless it is out of range, in which case policy->cur will be used instead). Fixes: `d352cf47d9` (cpufreq: conservative: Do not use transition notifications) Link: https://bugzilla.kernel.org/show_bug.cgi?id=177171 Reported-and-tested-by: Aleksey Rybalkin <aleksey@rybalkin.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.8+ <stable@vger.kernel.org> # 4.8+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-10-13 14:42:06 +02:00
Rafael J. Wysocki	3954517e2f	cpufreq: intel_pstate: Fix struct pstate_adjust_policy kerneldoc It looks like the name of struct pstate_adjust_policy was updated without updating its kerneldoc comment accordingly, so fix that mistake. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-10-12 20:58:14 +02:00
Rafael J. Wysocki	0843e83c1a	cpufreq: intel_pstate: Proportional algorithm for Atom The PID algorithm used by the intel_pstate driver tends to drive performance to the minimum for workloads with utilization below the setpoint, which is undesirable, so replace it with a modified "proportional" algorithm on Atom. The new algorithm will set the new P-state to be 1.25 times the available maximum times the (frequency-invariant) utilization during the previous sampling period except when the target P-state computed this way is lower than the average P-state during the previous sampling period. In the latter case, it will increase the target by 50% of the difference between it and the average P-state to prevent performance from dropping down too fast in some cases. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>	2016-10-12 20:58:13 +02:00
Rafael J. Wysocki	f00593a4bd	cpufreq: intel_pstate: Clarify comment in get_target_pstate_use_performance() Make the comment explaining the meaning of the perf_scaled variable in get_target_pstate_use_performance() more straightforward. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-10-09 18:54:57 +02:00
Srinivas Pandruvada	f9f4872df6	cpufreq: intel_pstate: Fix unsafe HWP MSR access This is a requirement that MSR MSR_PM_ENABLE must be set to 0x01 before reading MSR_HWP_CAPABILITIES on a given CPU. If cpufreq init() is scheduled on a CPU which is not same as policy->cpu or migrates to a different CPU before calling msr read for MSR_HWP_CAPABILITIES, it is possible that MSR_PM_ENABLE was not to set to 0x01 on that CPU. This will cause GP fault. So like other places in this path rdmsrl_on_cpu should be used instead of rdmsrl. Moreover the scope of MSR_HWP_CAPABILITIES is on per thread basis, so it should be read from the same CPU, for which MSR MSR_HWP_REQUEST is getting set. dmesg dump or warning: [ 22.014488] WARNING: CPU: 139 PID: 1 at arch/x86/mm/extable.c:50 ex_handler_rdmsr_unsafe+0x68/0x70 [ 22.014492] unchecked MSR access error: RDMSR from 0x771 [ 22.014493] Modules linked in: [ 22.014507] CPU: 139 PID: 1 Comm: swapper/0 Not tainted 4.7.5+ #1 ... ... [ 22.014516] Call Trace: [ 22.014542] [<ffffffff813d7dd1>] dump_stack+0x63/0x82 [ 22.014558] [<ffffffff8107bc8b>] __warn+0xcb/0xf0 [ 22.014561] [<ffffffff8107bcff>] warn_slowpath_fmt+0x4f/0x60 [ 22.014563] [<ffffffff810676f8>] ex_handler_rdmsr_unsafe+0x68/0x70 [ 22.014564] [<ffffffff810677d9>] fixup_exception+0x39/0x50 [ 22.014604] [<ffffffff8102e400>] do_general_protection+0x80/0x150 [ 22.014610] [<ffffffff817f9ec8>] general_protection+0x28/0x30 [ 22.014635] [<ffffffff81687940>] ? get_target_pstate_use_performance+0xb0/0xb0 [ 22.014642] [<ffffffff810600c7>] ? native_read_msr+0x7/0x40 [ 22.014657] [<ffffffff81688123>] intel_pstate_hwp_set+0x23/0x130 [ 22.014660] [<ffffffff81688406>] intel_pstate_set_policy+0x1b6/0x340 [ 22.014662] [<ffffffff816829bb>] cpufreq_set_policy+0xeb/0x2c0 [ 22.014664] [<ffffffff81682f39>] cpufreq_init_policy+0x79/0xe0 [ 22.014666] [<ffffffff81682cb0>] ? cpufreq_update_policy+0x120/0x120 [ 22.014669] [<ffffffff816833a6>] cpufreq_online+0x406/0x820 [ 22.014671] [<ffffffff8168381f>] cpufreq_add_dev+0x5f/0x90 [ 22.014717] [<ffffffff81530ac8>] subsys_interface_register+0xb8/0x100 [ 22.014719] [<ffffffff816821bc>] cpufreq_register_driver+0x14c/0x210 [ 22.014749] [<ffffffff81fe1d90>] intel_pstate_init+0x39d/0x4d5 [ 22.014751] [<ffffffff81fe13f2>] ? cpufreq_gov_dbs_init+0x12/0x12 Cc: 4.3+ <stable@vger.kernel.org> # 4.3+ Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-10-09 18:54:13 +02:00
Linus Torvalds	82fa407da0	Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM updates from Russell King: - Correct ARMs dma-mapping to use the correct printk format strings. - Avoid defining OBJCOPYFLAGS globally which upsets lkdtm rodata testing. - Cleanups to ARMs asm/memory.h include. - L2 cache cleanups. - Allow flat nommu binaries to be executed on ARM MMU systems. - Kernel hardening - add more read-only after init annotations, including making some kernel vdso variables const. - Ensure AMBA primecell clocks are appropriately defaulted. - ARM breakpoint cleanup. - Various StrongARM 11x0 and companion chip (SA1111) updates to bring this legacy platform to use more modern APIs for (eg) GPIOs and interrupts, which will allow us in the future to reduce some of the board-level driver clutter and elimate function callbacks into board code via platform data. There still appears to be interest in these platforms! - Remove the now redundant secure_flush_area() API. - Module PLT relocation optimisations. Ard says: This series of 4 patches optimizes the ARM PLT generation code that is invoked at module load time, to get rid of the O(n^2) algorithm that results in pathological load times of 10 seconds or more for large modules on certain STB platforms. - ARMv7M cache maintanence support. - L2 cache PMU support * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (35 commits) ARM: sa1111: provide to_sa1111_device() macro ARM: sa1111: add sa1111_get_irq() ARM: sa1111: clean up duplication in IRQ chip implementation ARM: sa1111: implement a gpio_chip for SA1111 GPIOs ARM: sa1111: move irq cleanup to separate function ARM: sa1111: use devm_clk_get() ARM: sa1111: use devm_kzalloc() ARM: sa1111: ensure we only touch RAB bus type devices when removing ARM: 8611/1: l2x0: add PMU support ARM: 8610/1: V7M: Add dsb before jumping in handler mode ARM: 8609/1: V7M: Add support for the Cortex-M7 processor ARM: 8608/1: V7M: Indirect proc_info construction for V7M CPUs ARM: 8607/1: V7M: Wire up caches for V7M processors with cache support. ARM: 8606/1: V7M: introduce cache operations ARM: 8605/1: V7M: fix notrace variant of save_and_disable_irqs ARM: 8604/1: V7M: Add support for reading the CTR with read_cpuid_cachetype() ARM: 8603/1: V7M: Add addresses for mem-mapped V7M cache operations ARM: 8602/1: factor out CSSELR/CCSIDR operations that use cp15 directly ARM: kernel: avoid brute force search on PLT generation ARM: kernel: sort relocation sections before allocating PLTs ...	2016-10-06 07:59:37 -07:00
Russell King	301a36fa70	Merge branches 'misc' and 'sa1111-base' into for-linus	2016-10-06 08:56:43 +01:00
Linus Torvalds	597f03f9d1	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug updates from Thomas Gleixner: "Yet another batch of cpu hotplug core updates and conversions: - Provide core infrastructure for multi instance drivers so the drivers do not have to keep custom lists. - Convert custom lists to the new infrastructure. The block-mq custom list conversion comes through the block tree and makes the diffstat tip over to more lines removed than added. - Handle unbalanced hotplug enable/disable calls more gracefully. - Remove the obsolete CPU_STARTING/DYING notifier support. - Convert another batch of notifier users. The relayfs changes which conflicted with the conversion have been shipped to me by Andrew. The remaining lot is targeted for 4.10 so that we finally can remove the rest of the notifiers" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) cpufreq: Fix up conversion to hotplug state machine blk/mq: Reserve hotplug states for block multiqueue x86/apic/uv: Convert to hotplug state machine s390/mm/pfault: Convert to hotplug state machine mips/loongson/smp: Convert to hotplug state machine mips/octeon/smp: Convert to hotplug state machine fault-injection/cpu: Convert to hotplug state machine padata: Convert to hotplug state machine cpufreq: Convert to hotplug state machine ACPI/processor: Convert to hotplug state machine virtio scsi: Convert to hotplug state machine oprofile/timer: Convert to hotplug state machine block/softirq: Convert to hotplug state machine lib/irq_poll: Convert to hotplug state machine x86/microcode: Convert to hotplug state machine sh/SH-X3 SMP: Convert to hotplug state machine ia64/mca: Convert to hotplug state machine ARM/OMAP/wakeupgen: Convert to hotplug state machine ARM/shmobile: Convert to hotplug state machine arm64/FP/SIMD: Convert to hotplug state machine ...	2016-10-03 19:43:08 -07:00
Linus Torvalds	72d39926f0	ACPI material for v4.9-rc1 - Update of the ACPICA code in the kernel to upstream revision 20160831 with the following major changes: * New mechanism for GPE masking. * Fixes for issues related to the LoadTable operator and table loading. * Fixes for issues related to so-called module-level code (MLC), that is AML that doesn't belong to any methods. * Change of the return value of the _OSI method to reflect the Windows behavior. * GAS (Generic Address Structure) support fix related to 32-bit FADT addresses. * Elimination of unnecessary FADT version 2 support. * ACPI tools fixes and cleanups. From Bob Moore, Lv Zheng, and Jung-uk Kim. - ACPI sysfs interface updates to fix GPE handling (on top of the new GPE masking mechanism in ACPICA) and issues related to table loading (Lv Zheng). - New watchdog driver based on the ACPI WDAT (ACPI Watchdog Action Table), needed on some platforms to replace the iTCO watchdog that doesn't work there and related updates of the intel_pmc_ipc, i2c/i801 and MFD/lcp_ich drivers (Mika Westerberg). - Driver core fix to prevent it from leaking secondary fwnode objects during device removal (Lukas Wunner). - New definitions of built-in properties for UART in ACPI-based x86 SoC drivers and a 8250_dw driver quirk for the APM X-Gene SoC (Heikki Krogerus). - New device ID for the Vulcan SPI controller and constification of local strucures in the AMD SoC (APD) ACPI driver (Kamlakant Patel, Julia Lawall). - Fix for a bug causing the allocation of PCI resorces to fail if ACPI-enumerated child platform devices are registered below the PCI devices in question (Mika Westerberg). - Change of the default polarity for PCI legacy IRQs to high on systems booting wth ACPI on platforms with a GIC interrupt controller model fixing the discrepancy between the specification and HW behavior (Lorenzo Pieralisi). - Fixes for the handling of system suspend/resume in the ACPI EC driver and update of that driver to make it cope with the cases when the EC device defined in the ECDT has to be used throughout the entire system life cycle (Lv Zheng). - Update of the ACPI CPPC library to allow it to batch requests sent over the PCC channel (to reduce overhead), to support the fixed functional hardware (FFH) CPPC registers access type, to notify the mailbox framework about TX completions when the interrupt flag is set for the PCC mailbox, and to support HW-Reduced Communication Subspace type 2 (Ashwin Chaugule, Prashanth Prakash, Srinivas Pandruvada, Hoan Tran). - ACPI button driver fix and documentation update related to the handling of laptop lids (Lv Zheng). - ACPI battery driver initialization fix (Carlos Garnacho). - ACPI GPIO enumeration documentation update (Mika Westerberg). - Assorted updates of the core ACPI bus type code (Lukas Wunner, Lv Zheng). - Assorted cleanups of the ACPI table parsing code and the x86-specific ACPI code (Al Stone). - Fixes for assorted ACPI-related issues found in linux-next (Wei Yongjun). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJX8Y5+AAoJEILEb/54YlRx73oP/RiAi86NKjOj+GfYceVe37jn 6lSqoMugjgTQHRYvYiQCjJ/BR0GzQZqUkz9TAu1Op14+rhTH3OhSfPizzJWCpVfA G9l9ZRQNnsKNs14bbYmWtmWduh46dFLVFJqo+M/0H3ZMFZu6Adcb+1SBtXHUoQ6L z69ngFxTu3yRvqS4cmm5h7SOx5W2uZZl8zViJW8jgyGhUBStG87gzR6wsYBldGCk XFxcaGWBXRccWGAQLSwfs0psQccEooCqbpsDqaUdrK/mI0rsQr88f25ZxEE7Zw7H bv3py1cgJBZRq36L7eBGQXjIE7YQey6qG2lug2zsUJWe+vzy2vHjHVJHuBXKKgv3 txOA6QZx63UgEyN3zFT7K5ek6uOnkKdeE+s+Laj+K/x4V2R6gbtgO011EVcXy+bI NvqsO76tfPHpwrn5s1VVc5lcEBEPHKHb+WulHrqhSSU4ivk0gtJDeSI+c8xta6YT XwSry5tozDLkG1uEZqkyY1XTlOUAHO8E6YcrlOv2z1+mG7L8OH/vCp1apzgexsZA 1683AH5cwKc3KaP+4QdKGdxY2BDxb7OTVh3cGy4kAYb6tqQ/vj7vlRiJvtaMBtFw xJn3buuagwJzKtgebpA565opvyFAfUX/RNFlTP63aXAefSAgq6KLq70vKFxkIZto H1LpUbmiEbuBml8CBGb1 =xDOQ -----END PGP SIGNATURE----- Merge tag 'acpi-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "First off, the ACPICA code in the kernel is updated to upstream revision 20160831 that brings in a few bug fixes and cleanups. In particular, it is possible to mask GPEs now (and the sysfs interface for GPE control is fixed on top of that), problems related to the table loading mechanism are fixed and all code related to FADT version 2 (which has never been part of the ACPI specification) is dropped. On the new features front, there is a new watchdog driver based on the ACPI WDAT (ACPI Watchdog Action Table), needed on some platforms to replace the iTCO watchdog that doesn't work there, and some UART devices get new definitions of built-in properties (to be accessed via the generic device properties API). Also, included is a fix for an ACPI-related PCI resorces allocation issue and a few problems in the EC driver and in the button and battery drivers are fixed. In addition to that, the ACPI CPPC library is updated to make batching of requests sent over the PCC channel possible (which reduces the PCC usage overhead substantially in some cases) and to support functional fixed hardware (FFH) type of CPPC registers access (which will allow CPPC to be used on x86 too in the future). As usual, there are some assorted fixes and cleanups too. Specifics: - Update of the ACPICA code in the kernel to upstream revision 20160831 with the following major changes: * New mechanism for GPE masking. * Fixes for issues related to the LoadTable operator and table loading. * Fixes for issues related to so-called module-level code (MLC), that is AML that doesn't belong to any methods. * Change of the return value of the _OSI method to reflect the Windows behavior. * GAS (Generic Address Structure) support fix related to 32-bit FADT addresses. * Elimination of unnecessary FADT version 2 support. * ACPI tools fixes and cleanups. From Bob Moore, Lv Zheng, and Jung-uk Kim. - ACPI sysfs interface updates to fix GPE handling (on top of the new GPE masking mechanism in ACPICA) and issues related to table loading (Lv Zheng). - New watchdog driver based on the ACPI WDAT (ACPI Watchdog Action Table), needed on some platforms to replace the iTCO watchdog that doesn't work there and related updates of the intel_pmc_ipc, i2c/i801 and MFD/lcp_ich drivers (Mika Westerberg). - Driver core fix to prevent it from leaking secondary fwnode objects during device removal (Lukas Wunner). - New definitions of built-in properties for UART in ACPI-based x86 SoC drivers and a 8250_dw driver quirk for the APM X-Gene SoC (Heikki Krogerus). - New device ID for the Vulcan SPI controller and constification of local strucures in the AMD SoC (APD) ACPI driver (Kamlakant Patel, Julia Lawall). - Fix for a bug causing the allocation of PCI resorces to fail if ACPI-enumerated child platform devices are registered below the PCI devices in question (Mika Westerberg). - Change of the default polarity for PCI legacy IRQs to high on systems booting wth ACPI on platforms with a GIC interrupt controller model fixing the discrepancy between the specification and HW behavior (Lorenzo Pieralisi). - Fixes for the handling of system suspend/resume in the ACPI EC driver and update of that driver to make it cope with the cases when the EC device defined in the ECDT has to be used throughout the entire system life cycle (Lv Zheng). - Update of the ACPI CPPC library to allow it to batch requests sent over the PCC channel (to reduce overhead), to support the fixed functional hardware (FFH) CPPC registers access type, to notify the mailbox framework about TX completions when the interrupt flag is set for the PCC mailbox, and to support HW-Reduced Communication Subspace type 2 (Ashwin Chaugule, Prashanth Prakash, Srinivas Pandruvada, Hoan Tran). - ACPI button driver fix and documentation update related to the handling of laptop lids (Lv Zheng). - ACPI battery driver initialization fix (Carlos Garnacho). - ACPI GPIO enumeration documentation update (Mika Westerberg). - Assorted updates of the core ACPI bus type code (Lukas Wunner, Lv Zheng). - Assorted cleanups of the ACPI table parsing code and the x86-specific ACPI code (Al Stone). - Fixes for assorted ACPI-related issues found in linux-next (Wei Yongjun)" * tag 'acpi-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (98 commits) ACPI / documentation: Use recommended name in GPIO property names watchdog: wdat_wdt: Fix warning for using 0 as NULL watchdog: wdat_wdt: fix return value check in wdat_wdt_probe() platform/x86: intel_pmc_ipc: Do not create iTCO watchdog when WDAT table exists i2c: i801: Do not create iTCO watchdog when WDAT table exists mfd: lpc_ich: Do not create iTCO watchdog when WDAT table exists ACPI / bus: Adjust ACPI subsystem initialization for new table loading mode ACPICA: Parser: Fix a regression in LoadTable support ACPICA: Tables: Fix "UNLOAD" code path lock issues ACPI / watchdog: Add support for WDAT hardware watchdog ACPI / platform: Pay attention to parent device's resources PCI: Add pci_find_resource() ACPI / CPPC: Support PCC with interrupt flag ACPI / sysfs: Update sysfs signature handling code ACPI / sysfs: Fix an issue for LoadTable opcode ACPICA: Tables: Fix a regression in acpi_tb_find_table() ACPI / tables: Remove duplicated include from tables.c ACPI / APD: constify local structures x86: ACPI: make variable names clearer in acpi_parse_madt_lapic_entries() x86: ACPI: remove extraneous white space after semicolon ...	2016-10-03 10:11:58 -07:00
Rafael J. Wysocki	b6e2511782	Merge branch 'pm-cpufreq-sched' into pm-cpufreq	2016-10-02 01:42:33 +02:00
Rafael J. Wysocki	0d573c6a01	Merge branches 'acpi-x86', 'acpi-cppc' and 'acpi-soc' * acpi-x86: x86: ACPI: make variable names clearer in acpi_parse_madt_lapic_entries() x86: ACPI: remove extraneous white space after semicolon * acpi-cppc: ACPI / CPPC: Support PCC with interrupt flag ACPI / CPPC: Add prefix cppc to cpudata structure name ACPI / CPPC: Add support for functional fixed hardware address ACPI / CPPC: Don't return on CPPC probe failure ACPI / CPPC: Allow build with ACPI_CPU_FREQ_PSS config ACPI / CPPC: check for error bit in PCC status field ACPI / CPPC: move all PCC related information into pcc_data ACPI / CPPC: add sysfs support to compute delivered performance ACPI / CPPC: set a non-zero value for transition_latency ACPI / CPPC: support for batching CPPC requests ACPI / CPPC: acquire pcc_lock only while accessing PCC subspace ACPI / CPPC: restructure read/writes for efficient sys mapped reg ops mailbox: pcc: Support HW-Reduced Communication Subspace type 2 * acpi-soc: ACPI / APD: constify local structures ACPI / APD: Add device HID for Vulcan SPI controller	2016-10-02 01:39:09 +02:00
Colin Ian King	9ad0a1b6a2	cpufreq: st: add missing \n to end of dev_err message Trival fix, dev_err message is missing a \n, so add it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-26 15:11:42 +02:00
Colin Ian King	4c232f9469	cpufreq: kirkwood: add missing \n to end of dev_err messages Trival fix, dev_err messages are missing a \n, so add it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-26 15:10:58 +02:00
Sebastian Andrzej Siewior	5372e054a1	cpufreq: Fix up conversion to hotplug state machine The function cpufreq_register_driver() returns zero on success and since commit `27622b061e` ("cpufreq: Convert to hotplug state machine") erroneously a positive number. Due to the "if (x) assume_error" construct all callers assumed an error and as a consequence the cpu freq kworker crashes with a NULL pointer dereference. Reset the return value back to zero in the success case. Fixes: `27622b061e` ("cpufreq: Convert to hotplug state machine") Reported-by: Borislav Petkov <bp@alien8.de> Reported-and-tested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: peterz@infradead.org Cc: rjw@rjwysocki.net Link: http://lkml.kernel.org/r/20160920145628.lp2bmq72ip3oiash@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-20 16:59:21 +02:00
Sebastian Andrzej Siewior	27622b061e	cpufreq: Convert to hotplug state machine Install the callbacks via the state machine. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Viresh Kumar <viresh.kumar@linaro.or Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160906170457.32393-13-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-19 21:44:29 +02:00
Hoan Tran	f89f4147f7	cpufreq: CPPC: Avoid overflow when calculating desired_perf This patch fixes overflow issue when calculating the desired_perf. Fixes: `ad38677df4` (cpufreq: CPPC: Force reporting values in KHz to fix user space interface) Signed-off-by: Hoan Tran <hotran@apm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-16 23:59:19 +02:00
Dave Gerlach	e01072d22d	cpufreq: ti: Use generic platdev driver Now that the cpufreq-dt-platdev is used to create the cpufreq-dt platform device for all OMAP platforms and the platform code that did it before has been removed, add ti,am33xx and ti,dra7xx to the machine list in cpufreq-dt-platdev which had relied on the removed platform code to do this previously. Fixes: `7694ca6e1d` (cpufreq: omap: Use generic platdev driver) Signed-off-by: Dave Gerlach <d-gerlach@ti.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.7+ <stable@vger.kernel.org> # 4.7+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-16 23:57:04 +02:00
Srinivas Pandruvada	3ba7bcaa36	cpufreq: intel_pstate: Add io_boost trace Add io_boost percent to current pstate_sample tracepoint. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-16 23:55:30 +02:00
Rafael J. Wysocki	09c448d3c6	cpufreq: intel_pstate: Use IOWAIT flag in Atom algorithm Modify the P-state selection algorithm for Atom processors to use the new SCHED_CPUFREQ_IOWAIT flag instead of the questionable get_cpu_iowait_time_us() function. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-14 02:28:13 +02:00
Al Stone	ad38677df4	cpufreq: CPPC: Force reporting values in KHz to fix user space interface When CPPC is being used by ACPI on arm64, user space tools such as cpupower report CPU frequency values from sysfs that are incorrect. What the driver was doing was reporting the values given by ACPI tables in whatever scale was used to provide them. However, the ACPI spec defines the CPPC values as unitless abstract numbers. Internal kernel structures such as struct perf_cap, in contrast, expect these values to be in KHz. When these struct values get reported via sysfs, the user space tools also assume they are in KHz, causing them to report incorrect values (for example, reporting a CPU frequency of 1MHz when it should be 1.8GHz). The downside is that this approach has some assumptions: (1) It relies on SMBIOS3 being used, and that the Max Frequency value for a processor is set to a non-zero value. (2) It assumes that all processors run at the same speed, or that the CPPC values have all been scaled to reflect relative speed. This patch retrieves the largest CPU Max Frequency from a type 4 DMI record that it can find. This may not be an issue, however, as a sampling of DMI data on x86 and arm64 indicates there is often only one such record regardless. Since CPPC is relatively new, it is unclear if the ACPI ASL will always be written to reflect any sort of relative performance of processors of differing speeds. (3) It assumes that performance and frequency both scale linearly. For arm64 servers, this may be sufficient, but it does rely on firmware values being set correctly. Hence, other approaches will be considered in the future. This has been tested on three arm64 servers, with and without DMI, with and without CPPC support. Signed-off-by: Al Stone <ahs3@redhat.com> Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-13 02:47:44 +02:00
Viresh Kumar	26619804e7	cpufreq: create link to policy only for registered CPUs If a cpufreq driver is registered very early in the boot stage (e.g. registered from postcore_initcall()), then cpufreq core may generate kernel warnings for it. In this case, the CPUs are brought online, then the cpufreq driver is registered, and then the CPU topology devices are registered. However, by the time cpufreq_add_dev() gets called, the cpu device isn't stored in the per-cpu variable (cpu_sys_devices,) which is read by get_cpu_device(). So the cpufreq core fails to get device for the CPU, for which cpufreq_add_dev() was called in the first place and we will hit a WARN_ON(!cpu_dev). Even if we reuse the 'dev' parameter passed to cpufreq_add_dev() to avoid that warning, there might be other CPUs online that share the policy with the cpu for which cpufreq_add_dev() is called. Eventually get_cpu_device() will return NULL for them as well, and we will hit the same WARN_ON() again. In order to fix these issues, change cpufreq core to create links to the policy for a cpu only when cpufreq_add_dev() is called for that CPU. Reuse the 'real_cpus' mask to track that as well. Note that cpufreq_remove_dev() already handles removal of the links for individual CPUs and cpufreq_add_dev() has aligned with that now. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-13 02:41:15 +02:00
Julia Lawall	42ce8921cc	intel_pstate: constify local structures For structure types defined in the same file or local header files, find top-level static structure declarations that have the following properties: 1. Never reassigned. 2. Address never taken 3. Not passed to a top-level macro call 4. No pointer or array-typed field passed to a function or stored in a variable. Declare structures having all of these properties as const. Done using Coccinelle. Based on a suggestion by Joe Perches <joe@perches.com>. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-13 02:40:24 +02:00
Viresh Kumar	297a66221d	cpufreq: dt: Support governor tunables per policy The cpufreq-dt driver is also used for systems with multiple clock/voltage domains for CPUs, i.e. multiple cpufreq policies in a system. And in such cases the platform users may want to enable "governor tunables per policy". Support that via platform data, as not all users of the driver would want that behavior. Reported-by: Juri Lelli <Juri.Lelli@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-13 02:39:12 +02:00
Viresh Kumar	33cc4fc1b2	cpufreq: dt: Update kconfig description The cpufreq DT driver also supports systems that have multiple clock/voltage domains for CPUs, i.e. multiple policy systems. The description of the Kconfig entry was never updated after the driver was modified to support such systems, fix it. Reported-by: Juri Lelli <Juri.Lelli@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-13 02:39:12 +02:00
Viresh Kumar	e86eee6bc2	cpufreq: dt: Remove unused code This is leftover from an earlier patch which removed the usage of platform data but forgot to remove this line. Remove it now. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-13 02:39:12 +02:00
Geert Uytterhoeven	ffdf8b867b	cpufreq: dt: Add support for r8a7792 Add the compatible string for supporting the generic cpufreq driver on the Renesas R-Car V2H (r8a7792) SoC. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-13 02:34:14 +02:00
Rafael J. Wysocki	d0fbf1d328	Merge back earlier cpufreq material for v4.9.	2016-09-12 14:49:29 +02:00
Srinivas Pandruvada	41dd640389	ACPI / CPPC: Add prefix cppc to cpudata structure name Since struct cpudata is defined in a header file, add prefix cppc_ to make it not a generic name. Otherwise it causes compile issue in locally define structure with the same name. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-09-08 23:02:15 +02:00
Rafael J. Wysocki	3689ad7ed6	cpufreq: Drop unnecessary check from cpufreq_policy_alloc() Since cpufreq_policy_alloc() doesn't use its dev variable for anything useful, drop that variable from there along with the NULL check against it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-09-01 00:29:10 +02:00
Wei Yongjun	bd37e022e3	cpufreq: dt: Add terminate entry for of_device_id tables Make sure of_device_id tables are NULL terminated. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Fixes: `f56aad1d98` (cpufreq: dt: Add generic platform-device creation support) CC: 4.7+ <stable@vger.kernel.org> # 4.7+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-31 02:49:05 +02:00
Prakash, Prashanth	be8b88d7d9	ACPI / CPPC: set a non-zero value for transition_latency Compute the expected transition latency for frequency transitions using the values from the PCCT tables when the desired perf register is in PCC. Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org> Reviewed-by: Alexey Klimov <alexey.klimov@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-31 01:02:33 +02:00
Russell King	83809b90a6	ARM: sa1100: move StrongARM CPU ID checks to cputype.h Move the StrongARM CPU ID checks out of the platform's hardware.h file into asm/cputype.h Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>	2016-08-23 10:25:17 +01:00
Markus Elfring	b0d8a69d08	cpufreq-SCPI: Delete unnecessary assignment for the field "owner" The field "owner" is set by the core. Thus delete an unneeded initialisation. Generated by scripts/coccinelle/api/platform_no_drv_owner.cocci Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-18 03:42:32 +02:00
Rafael J. Wysocki	58919e83c8	cpufreq / sched: Pass flags to cpufreq_update_util() It is useful to know the reason why cpufreq_update_util() has just been called and that can be passed as flags to cpufreq_update_util() and to the ->func() callback in struct update_util_data. However, doing that in addition to passing the util and max arguments they already take would be clumsy, so avoid it. Instead, use the observation that the schedutil governor is part of the scheduler proper, so it can access scheduler data directly. This allows the util and max arguments of cpufreq_update_util() and the ->func() callback in struct update_util_data to be replaced with a flags one, but schedutil has to be modified to follow. Thus make the schedutil governor obtain the CFS utilization information from the scheduler and use the "RT" and "DL" flags instead of the special utilization value of ULONG_MAX to track updates from the RT and DL sched classes. Make it non-modular too to avoid having to export scheduler variables to modules at large. Next, update all of the other users of cpufreq_update_util() and the ->func() callback in struct update_util_data accordingly. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-08-16 22:14:55 +02:00
Chanwoo Choi	c4b4057267	cpufreq: dt: Add exynos5433 compatible to use generic cpufreq driver This patch adds the exynos5433 compatible string for supporting the generic cpufreq driver on Exynos5433. Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-16 13:57:30 +02:00
Rafael J. Wysocki	0aeeb3e73f	Merge branches 'pm-sleep' and 'pm-cpufreq' * pm-sleep: PM / hibernate: Restore processor state before using per-CPU variables x86/power/64: Always create temporary identity mapping correctly * pm-cpufreq: cpufreq: powernv: Fix crash in gpstate_timer_handler()	2016-08-12 22:53:58 +02:00
Akshay Adiga	8e85946777	cpufreq: powernv: Fix crash in gpstate_timer_handler() Commit `09ca4c9b59` (cpufreq: powernv: Replacing pstate_id with frequency table index) changes calc_global_pstate() to use cpufreq_table index instead of pstate_id. But in gpstate_timer_handler(), pstate_id was being passed instead of cpufreq_table index, which caused index_to_pstate() to access out of bound indices, leading to this crash. Adding sanity check for index and pstate, to ensure only valid pstate and index values are returned. Call Trace: [c00000078d66b130] [c00000000011d224] __free_irq+0x234/0x360 (unreliable) [c00000078d66b1c0] [c00000000011d44c] free_irq+0x6c/0xa0 [c00000078d66b1f0] [c00000000006c4f8] opal_event_shutdown+0x88/0xd0 [c00000078d66b230] [c000000000067a4c] opal_shutdown+0x1c/0x90 [c00000078d66b260] [c000000000063a00] pnv_shutdown+0x20/0x40 [c00000078d66b280] [c000000000021538] machine_restart+0x38/0x90 [c0000000078d66b310] [c000000000965ea0] panic+0x284/0x300 [c00000078d66b3a0] [c00000000001f508] die+0x388/0x450 [c00000078d66b430] [c000000000045a50] bad_page_fault+0xd0/0x140 [c00000078d66b4a0] [c000000000008964] handle_page_fault+0x2c/0x30 interrupt: 300 at gpstate_timer_handler+0x150/0x260 LR = gpstate_timer_handler+0x130/0x260 [c00000078d66b7f0] [c000000000132b58] call_timer_fn+0x58/0x1c0 [c00000078d66b880] [c000000000132e20] expire_timers+0x130/0x1d0 [c00000078d66b8f0] [c000000000133068] run_timer_softirq+0x1a8/0x230 [c00000078d66b980] [c0000000000b535c] __do_softirq+0x18c/0x400 [c00000078d66ba70] [c0000000000b5828] irq_exit+0xc8/0x100 [c00000078d66ba90] [c00000000001e214] timer_interrupt+0xa4/0xe0 [c00000078d66bac0] [c0000000000027d0] decrementer_common+0x150/0x180 interrupt: 901 at arch_local_irq_restore+0x74/0x90 0] [c000000000106b34] call_cpuidle+0x44/0x90 [c00000078d66be50] [c00000000010708c] cpu_startup_entry+0x38c/0x460 [c00000078d66bf20] [c00000000003d930] start_secondary+0x330/0x380 [c00000078d66bf90] [c000000000008e6c] start_secondary_prolog+0x10/0x14 Fixes: `09ca4c9b59` (cpufreq: powernv: Replacing pstate_id with frequency table index) Reported-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-06 14:52:26 +02:00
Linus Torvalds	11d8ec408d	More power management updates for v4.8-rc1 - Prevent the low-level assembly hibernate code on x86-64 from referring to __PAGE_OFFSET directly as a symbol which doesn't work when the kernel identity mapping base is randomized, in which case __PAGE_OFFSET is a variable (Rafael Wysocki). - Avoid selecting CPU_FREQ_STAT by default as the statistics are not required for proper cpufreq operation (Borislav Petkov). - Add Skylake-X and Broadwell-X IDs to the intel_pstate's list of processors where out-of-band (OBB) control of P-states is possible and if that is in use, intel_pstate should not attempt to manage P-states (Srinivas Pandruvada). - Drop some unnecessary checks from the wakeup IRQ handling code in the PM core (Markus Elfring). - Reduce the number operating performance point (OPP) lookups in one of the OPP framework's helper functions (Jisheng Zhang). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXpKLNAAoJEILEb/54YlRx6L4P/39GR0kVB7vIyCajdWc4f3gh 0zh5RbKC0YT4F6upebzgQ9uYS9cv4y+df7ShVwKQAea3wDReEmZhM/egOGw0Ls+8 SS1MiJq1LSekyMIWH6cXwZsH69/V0LuWTBbzWYBgHUDbfEMlgwV5ZZMEH14/2bWw d4SLUUiW5P42im+IDAxpdYneKOrbJo3txj6WbOutgtIrHdPko6lF1dDouKvI1QTk zCBOkEB9nELq3rWN/sbPmHzbmbj/yFiiHk5+iqwuKKJZI8PQB9/C6Qmc3wvjtPpc GXLPI+OHqLgBMofGsiKOvm3hPQAIjf/ERsUilHLE3qOi/Mi0qj7U4dFivZrPWaCG j2bD+b36TffmG1r8L7NYbEKU60syeIFSRqAngbyswu6XF+NdVboaifENGcgM3tBC pUC1mFh/4PMKP5zW9mOwE6WSntZkw14CVR+A3fRFOuTBavNvjGdckwLi/aBsZdU3 K4DJUFzdELF7+JdqnQV35yV2tgMbJhxQa/QBykiFBh3AyqliOZ8uBoIxxLinrGmH XWR3kB4ZPRBIStGI9IpG3lNhLLU7mLIdaFhGayicicAwFcLsXN7oKWVzYfUqWA9T 1ptXApcRf6S9J1JnKHkznoQ1/D1pNYLbH+t7BOtWdK8SWBhMS2Lzo2kyKWsCFkJS 16J8j1tcLDhN1dvcBT55 =WvPB -----END PGP SIGNATURE----- Merge tag 'pm-extra-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "A few more fixes and cleanups in the x86-64 low-level hibernation code, PM core, cpufreq (Kconfig and intel_pstate), and the operating points framework. Specifics: - Prevent the low-level assembly hibernate code on x86-64 from referring to __PAGE_OFFSET directly as a symbol which doesn't work when the kernel identity mapping base is randomized, in which case __PAGE_OFFSET is a variable (Rafael Wysocki). - Avoid selecting CPU_FREQ_STAT by default as the statistics are not required for proper cpufreq operation (Borislav Petkov). - Add Skylake-X and Broadwell-X IDs to the intel_pstate's list of processors where out-of-band (OBB) control of P-states is possible and if that is in use, intel_pstate should not attempt to manage P-states (Srinivas Pandruvada). - Drop some unnecessary checks from the wakeup IRQ handling code in the PM core (Markus Elfring). - Reduce the number operating performance point (OPP) lookups in one of the OPP framework's helper functions (Jisheng Zhang)" * tag 'pm-extra-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: x86/power/64: Do not refer to __PAGE_OFFSET from assembly code cpufreq: Do not default-yes CPU_FREQ_STAT cpufreq: intel_pstate: Add more out-of-band IDs PM / OPP: optimize dev_pm_opp_set_rate() performance a bit PM-wakeup: Delete unnecessary checks before three function calls	2016-08-05 23:26:16 -04:00
Rafael J. Wysocki	e2b3b80de5	Merge branches 'pm-sleep', 'pm-cpufreq', 'pm-core' and 'pm-opp' * pm-sleep: x86/power/64: Do not refer to __PAGE_OFFSET from assembly code * pm-cpufreq: cpufreq: Do not default-yes CPU_FREQ_STAT cpufreq: intel_pstate: Add more out-of-band IDs * pm-core: PM-wakeup: Delete unnecessary checks before three function calls * pm-opp: PM / OPP: optimize dev_pm_opp_set_rate() performance a bit	2016-08-05 15:46:55 +02:00
Borislav Petkov	79ad70de53	cpufreq: Do not default-yes CPU_FREQ_STAT CPU frequency transition statistics are not absolutely required for proper cpufreq operation on the system AFAICT so remove the default-yes setting in Kconfig. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-03 01:34:11 +02:00
Linus Torvalds	43a0a98aa8	ARM: SoC driver updates for v4.8 Driver updates for ARM SoCs. A slew of changes this release cycle. The reset driver tree, that we merge through arm-soc for historical reasons, is also sizable this time around. Among the changes: - clps711x: Treewide changes to compatible strings, merged here for simplicity. - Qualcomm: SCM firmware driver cleanups, move to platform driver - ux500: Major cleanups, removal of old mach-specific infrastructure. - Atmel external bus memory driver - Move of brcmstb platform to the rest of bcm - PMC driver updates for tegra, various fixes and improvements - Samsung platform driver updates to support 64-bit Exynos platforms - Reset controller cleanups moving to devm_reset_controller_register() APIs - Reset controller driver for Amlogic Meson - Reset controller driver for Hisilicon hi6220 - ARM SCPI power domain support -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXnm1XAAoJEIwa5zzehBx35lcP/ApuQarIXeZCQZtjlUBV9McW o3o7FhKFHePmEPeoYCvVeK5D8NykTkQv3WpnCknoxPJzxGJF7jbPWQJcVnXfKOXD kTcyIK15WL2HHtSE3lYyLfyUPz8AbJyRt0l0cxgcg6jvo+uzlWooNz1y78rLIYzg UwRssj7OiHv4dsyYRHZIsjnB8gMWw8rYMk154gP2xy6MnNXXzzOVVnOiVqxSZBm+ EgIIcROMOqkkHuFlClMYKluIgrmgz1Ypjf+FuAg7dqXZd+TGRrmGXeI7SkGThfLu nyvY3N18NViNu7xOUkI9zg7+ifyYM8Si9ylalSICSJdIAxZfiwFqFaLJvVWKU1rY rBOBjKckQI0/X9WYusFNFHcijhIFV8/FgGAnVRRMPdvlCss7Zp03C9mR4AEhmKMX rLG49x81hU1C+LftC59ml3iB8dhZrrRkbxNHjLFHVGWNrKMrmJKa8JhXGRAoNM+u LRauiuJZatqvLfISNvpfcoW2EashVoU3f+uC8ymT3QCyME3wZm0t7T4tllxhMfBl sOgJqNkTKDmPLofwm/dASiLML7ZF1WePScrFyOACnj9K4mUD+OaCnowtWoQPu0eI aNmT84oosJ2S9F/iUDPtFHXdzQ+1QPPfSiQ9FXMoauciVq/2F+pqq68yYgqoxFOG vmkmG2YM4Wyq43u0BONR =O8+y -----END PGP SIGNATURE----- Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Olof Johansson: "Driver updates for ARM SoCs. A slew of changes this release cycle. The reset driver tree, that we merge through arm-soc for historical reasons, is also sizable this time around. Among the changes: - clps711x: Treewide changes to compatible strings, merged here for simplicity. - Qualcomm: SCM firmware driver cleanups, move to platform driver - ux500: Major cleanups, removal of old mach-specific infrastructure. - Atmel external bus memory driver - Move of brcmstb platform to the rest of bcm - PMC driver updates for tegra, various fixes and improvements - Samsung platform driver updates to support 64-bit Exynos platforms - Reset controller cleanups moving to devm_reset_controller_register() APIs - Reset controller driver for Amlogic Meson - Reset controller driver for Hisilicon hi6220 - ARM SCPI power domain support" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (100 commits) ARM: ux500: consolidate base platform files ARM: ux500: move soc_id driver to drivers/soc ARM: ux500: call ux500_setup_id later ARM: ux500: consolidate soc_device code in id.c ARM: ux500: remove cpu_is_u* helpers ARM: ux500: use CLK_OF_DECLARE() ARM: ux500: move l2x0 init to .init_irq mfd: db8500 stop passing around platform data ASoC: ab8500-codec: remove platform data based probe ARM: ux500: move ab8500_regulator_plat_data into driver ARM: ux500: remove unused regulator data soc: raspberrypi-power: add CONFIG_OF dependency firmware: scpi: add CONFIG_OF dependency video: clps711x-fb: Changing the compatibility string to match with the smallest supported chip input: clps711x-keypad: Changing the compatibility string to match with the smallest supported chip pwm: clps711x: Changing the compatibility string to match with the smallest supported chip serial: clps711x: Changing the compatibility string to match with the smallest supported chip irqchip: clps711x: Changing the compatibility string to match with the smallest supported chip clocksource: clps711x: Changing the compatibility string to match with the smallest supported chip clk: clps711x: Changing the compatibility string to match with the smallest supported chip ...	2016-08-01 18:36:01 -04:00
Srinivas Pandruvada	65c1262f40	cpufreq: intel_pstate: Add more out-of-band IDs Add Skylake-X and Broadwell-X IDs for out-of-band (OBB) control of P-States. For these processors, if MSR_MISC_PWR_MGMT BIT(8) == 1, then the Intel P-State driver should exit as OS can't control P-States. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw : Subject/changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-28 23:58:17 +02:00
Linus Torvalds	6453dbdda3	Power management material for v4.8-rc1 - Rework the cpufreq governor interface to make it more straightforward and modify the conservative governor to avoid using transition notifications (Rafael Wysocki). - Rework the handling of frequency tables by the cpufreq core to make it more efficient (Viresh Kumar). - Modify the schedutil governor to reduce the number of wakeups it causes to occur in cases when the CPU frequency doesn't need to be changed (Steve Muckle, Viresh Kumar). - Fix some minor issues and clean up code in the cpufreq core and governors (Rafael Wysocki, Viresh Kumar). - Add Intel Broxton support to the intel_pstate driver (Srinivas Pandruvada). - Fix problems related to the config TDP feature and to the validity of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka, Srinivas Pandruvada). - Make intel_pstate update the cpu_frequency tracepoint even if the frequency doesn't change to avoid confusing powertop (Rafael Wysocki). - Clean up the usage of __init/__initdata in intel_pstate, mark some of its internal variables as __read_mostly and drop an unused structure element from it (Jisheng Zhang, Carsten Emde). - Clean up the usage of some duplicate MSR symbols in intel_pstate and turbostat (Srinivas Pandruvada). - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay Adiga, Viresh Kumar, Ben Dooks). - Fix a regression (introduced during the 4.5 cycle) in the pcc-cpufreq driver by reverting the problematic commit (Andreas Herrmann). - Add support for Intel Denverton to intel_idle, clean up Broxton support in it and make it explicitly non-modular (Jacob Pan, Jan Beulich, Paul Gortmaker). - Add support for Denverton and Ivy Bridge server to the Intel RAPL power capping driver and make it more careful about the handing of MSRs that may not be present (Jacob Pan, Xiaolong Wang). - Fix resume from hibernation on x86-64 by making the CPU offline during resume avoid using MONITOR/MWAIT in the "play dead" loop which may lead to an inadvertent "revival" of a "dead" CPU and a page fault leading to a kernel crash from it (Rafael Wysocki). - Make memory management during resume from hibernation more straightforward (Rafael Wysocki). - Add debug features that should help to detect problems related to hibernation and resume from it (Rafael Wysocki, Chen Yu). - Clean up hibernation core somewhat (Rafael Wysocki). - Prevent KASAN from instrumenting the hibernation core which leads to large numbers of false-positives from it (James Morse). - Prevent PM (hibernate and suspend) notifiers from being called during the cleanup phase if they have not been called during the corresponding preparation phase which is possible if one of the other notifiers returns an error at that time (Lianwei Wang). - Improve suspend-related debug printout in the tasks freezer and clean up suspend-related console handling (Roger Lu, Borislav Petkov). - Update the AnalyzeSuspend script in the kernel sources to version 4.2 (Todd Brandt). - Modify the generic power domains framework to make it handle system suspend/resume better (Ulf Hansson). - Make the runtime PM framework avoid resuming devices synchronously when user space changes the runtime PM settings for them and improve its error reporting (Rafael Wysocki, Linus Walleij). - Fix error paths in devfreq drivers (exynos, exynos-ppmu, exynos-bus) and in the core, make some devfreq code explicitly non-modular and change some of it into tristate (Bartlomiej Zolnierkiewicz, Peter Chen, Paul Gortmaker). - Add DT support to the generic PM clocks management code and make it export some more symbols (Jon Hunter, Paul Gortmaker). - Make the PCI PM core code slightly more robust against possible driver errors (Andy Shevchenko). - Make it possible to change DESTDIR and PREFIX in turbostat (Andy Shevchenko). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXl7/dAAoJEILEb/54YlRx+VgQAIQJOWvxKew3Yl02c/sdj9OT 5VNnFrzGzdcAPofvvG9qGq8B0Es1vYehJpwwOB21ri8EvYv0riIiU1yrqslObojQ oaZOkSBpbIoKjGR4CpYA/A+feE+8EqIBdPGd+lx5a6oRdUi7tRVHBG9lyLO3FB/i jan1q8dMpZsmu+Y+rVVHGnCVuIlIEqr2ZnZfCwDAulO2Arp/QFAh4kH08ELATvrl bkPa25vq7/VMP/vCDzrfZKD5mUuKogIRu/J5wx4py1nE+FB35cKKyqBOgklLwAeY UI8vjDhr/myNUs54AZlktOkq47TCYvjvhX9kmOxBjuWqFbRusU012IRek1fYPRIV ZqbkqNX7UEVQwunAEg9AyFwyzEtOht93dQDT5RLEd4QzKuM76gmHpLeTGGMzE+nu FnmF9JGl4DVwqpZl9yU2+hR2Mt3bP8OF8qYmNiGUB3KO4emPslhSd+6y8liA5Bx2 SJf0Gb//vaHCh3/uMnwAonYPqRkZvBLOMwuL1VUjNQfRMnQtDdgHMYB1aT/EglPA 8ww6j4J8rVRLAxvYQ3UEmNA/vBNclKXblRR18+JddEZP9/oX0ATfwnCCUpr839uk xxyQhrm4/AI60+PHWCX4GG80YrKdOGTkF7LXCQZanVWjjuyF17rufegZ2YWLT07v JU1Cmumfdy2jJluT8xsR =uVGz -----END PGP SIGNATURE----- Merge tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "Again, the majority of changes go into the cpufreq subsystem, but there are no big features this time. The cpufreq changes that stand out somewhat are the governor interface rework and improvements related to the handling of frequency tables. Apart from those, there are fixes and new device/CPU IDs in drivers, cleanups and an improvement of the new schedutil governor. Next, there are some changes in the hibernation core, including a fix for a nasty problem related to the MONITOR/MWAIT usage by CPU offline during resume from hibernation, a few core improvements related to memory management during resume, a couple of additional debug features and cleanups. Finally, we have some fixes and cleanups in the devfreq subsystem, generic power domains framework improvements related to system suspend/resume, support for some new chips in intel_idle and in the power capping RAPL driver, a new version of the AnalyzeSuspend utility and some assorted fixes and cleanups. Specifics: - Rework the cpufreq governor interface to make it more straightforward and modify the conservative governor to avoid using transition notifications (Rafael Wysocki). - Rework the handling of frequency tables by the cpufreq core to make it more efficient (Viresh Kumar). - Modify the schedutil governor to reduce the number of wakeups it causes to occur in cases when the CPU frequency doesn't need to be changed (Steve Muckle, Viresh Kumar). - Fix some minor issues and clean up code in the cpufreq core and governors (Rafael Wysocki, Viresh Kumar). - Add Intel Broxton support to the intel_pstate driver (Srinivas Pandruvada). - Fix problems related to the config TDP feature and to the validity of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka, Srinivas Pandruvada). - Make intel_pstate update the cpu_frequency tracepoint even if the frequency doesn't change to avoid confusing powertop (Rafael Wysocki). - Clean up the usage of __init/__initdata in intel_pstate, mark some of its internal variables as __read_mostly and drop an unused structure element from it (Jisheng Zhang, Carsten Emde). - Clean up the usage of some duplicate MSR symbols in intel_pstate and turbostat (Srinivas Pandruvada). - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay Adiga, Viresh Kumar, Ben Dooks). - Fix a regression (introduced during the 4.5 cycle) in the pcc-cpufreq driver by reverting the problematic commit (Andreas Herrmann). - Add support for Intel Denverton to intel_idle, clean up Broxton support in it and make it explicitly non-modular (Jacob Pan, Jan Beulich, Paul Gortmaker). - Add support for Denverton and Ivy Bridge server to the Intel RAPL power capping driver and make it more careful about the handing of MSRs that may not be present (Jacob Pan, Xiaolong Wang). - Fix resume from hibernation on x86-64 by making the CPU offline during resume avoid using MONITOR/MWAIT in the "play dead" loop which may lead to an inadvertent "revival" of a "dead" CPU and a page fault leading to a kernel crash from it (Rafael Wysocki). - Make memory management during resume from hibernation more straightforward (Rafael Wysocki). - Add debug features that should help to detect problems related to hibernation and resume from it (Rafael Wysocki, Chen Yu). - Clean up hibernation core somewhat (Rafael Wysocki). - Prevent KASAN from instrumenting the hibernation core which leads to large numbers of false-positives from it (James Morse). - Prevent PM (hibernate and suspend) notifiers from being called during the cleanup phase if they have not been called during the corresponding preparation phase which is possible if one of the other notifiers returns an error at that time (Lianwei Wang). - Improve suspend-related debug printout in the tasks freezer and clean up suspend-related console handling (Roger Lu, Borislav Petkov). - Update the AnalyzeSuspend script in the kernel sources to version 4.2 (Todd Brandt). - Modify the generic power domains framework to make it handle system suspend/resume better (Ulf Hansson). - Make the runtime PM framework avoid resuming devices synchronously when user space changes the runtime PM settings for them and improve its error reporting (Rafael Wysocki, Linus Walleij). - Fix error paths in devfreq drivers (exynos, exynos-ppmu, exynos-bus) and in the core, make some devfreq code explicitly non-modular and change some of it into tristate (Bartlomiej Zolnierkiewicz, Peter Chen, Paul Gortmaker). - Add DT support to the generic PM clocks management code and make it export some more symbols (Jon Hunter, Paul Gortmaker). - Make the PCI PM core code slightly more robust against possible driver errors (Andy Shevchenko). - Make it possible to change DESTDIR and PREFIX in turbostat (Andy Shevchenko)" * tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits) Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency" PM / hibernate: Introduce test_resume mode for hibernation cpufreq: export cpufreq_driver_resolve_freq() cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index() PCI / PM: check all fields in pci_set_platform_pm() cpufreq: acpi-cpufreq: use cached frequency mapping when possible cpufreq: schedutil: map raw required frequency to driver frequency cpufreq: add cpufreq_driver_resolve_freq() cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT intel_pstate: Update cpu_frequency tracepoint every time cpufreq: intel_pstate: clean remnant struct element PM / tools: scripts: AnalyzeSuspend v4.2 x86 / hibernate: Use hlt_play_dead() when resuming from hibernation cpufreq: powernv: Replacing pstate_id with frequency table index intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate() PM / hibernate: Image data protection during restoration PM / hibernate: Add missing braces in __register_nosave_region() PM / hibernate: Clean up comments in snapshot.c PM / hibernate: Clean up function headers in snapshot.c PM / hibernate: Add missing braces in hibernate_setup() ...	2016-07-26 17:29:07 -07:00
Linus Torvalds	55392c4c06	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "This update provides the following changes: - The rework of the timer wheel which addresses the shortcomings of the current wheel (cascading, slow search for next expiring timer, etc). That's the first major change of the wheel in almost 20 years since Finn implemted it. - A large overhaul of the clocksource drivers init functions to consolidate the Device Tree initialization - Some more Y2038 updates - A capability fix for timerfd - Yet another clock chip driver - The usual pile of updates, comment improvements all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits) tick/nohz: Optimize nohz idle enter clockevents: Make clockevents_subsys static clocksource/drivers/time-armada-370-xp: Fix return value check timers: Implement optimization for same expiry time in mod_timer() timers: Split out index calculation timers: Only wake softirq if necessary timers: Forward the wheel clock whenever possible timers/nohz: Remove pointless tick_nohz_kick_tick() function timers: Optimize collect_expired_timers() for NOHZ timers: Move __run_timers() function timers: Remove set_timer_slack() leftovers timers: Switch to a non-cascading wheel timers: Reduce the CPU index space to 256k timers: Give a few structs and members proper names hlist: Add hlist_is_singular_node() helper signals: Use hrtimer for sigtimedwait() timers: Remove the deprecated mod_timer_pinned() API timers, net/ipv4/inet: Initialize connection request timers as pinned timers, drivers/tty/mips_ejtag: Initialize the poll timer as pinned timers, drivers/tty/metag_da: Initialize the poll timer as pinned ...	2016-07-25 20:43:12 -07:00
Linus Torvalds	8e466955d6	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "The main changes in this cycle were: - Intel-SoC enhancements (Andy Shevchenko) - Intel CPU symbolic model definition rework (Dave Hansen) - ... other misc changes" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) x86/sfi: Enable enumeration of SD devices x86/pci: Use MRFLD abbreviation for Merrifield x86/platform/intel-mid: Make vertical indentation consistent x86/platform/intel-mid: Mark regulators explicitly defined x86/platform/intel-mid: Rename mrfl.c to mrfld.c x86/platform/intel-mid: Enable spidev on Intel Edison boards x86/platform/intel-mid: Extend PWRMU to support Penwell x86/pci, x86/platform/intel_mid_pci: Remove duplicate power off code x86/platform/intel-mid: Add pinctrl for Intel Merrifield x86/platform/intel-mid: Enable GPIO expanders on Edison x86/platform/intel-mid: Add Power Management Unit driver x86/platform/atom/punit: Enable support for Merrifield x86/platform/intel_mid_pci: Rework IRQ0 workaround x86, thermal: Clean up and fix CPU model detection for intel_soc_dts_thermal x86, mmc: Use Intel family name macros for mmc driver x86/intel_telemetry: Use Intel family name macros for telemetry driver x86/acpi/lss: Use Intel family name macros for the acpi_lpss driver x86/cpufreq: Use Intel family name macros for the intel_pstate cpufreq driver x86/platform: Use new Intel model number macros x86/intel_idle: Use Intel family macros for intel_idle ...	2016-07-25 19:15:35 -07:00
Rafael J. Wysocki	bc841e260c	Merge branch 'pm-cpu' * pm-cpu: x86: remove duplicate turbo ratio limit MSRs tools/power turbostat: Replace MSR_NHM_TURBO_RATIO_LIMIT cpufreq: intel_pstate: Replace MSR_NHM_TURBO_RATIO_LIMIT	2016-07-25 13:46:30 +02:00
Andreas Herrmann	da7d3abe1c	Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency" This reverts commit `790d849bf8`. Using a v4.7-rc7 kernel on a HP ProLiant triggered following messages pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz cpufreq: ondemand governor failed, too long transition latency of HW, fallback to performance governor The last line was shown for each CPU in the system. Testing v4.5 (where commit `790d849b` was integrated) triggered similar messages. Same behaviour on a 2nd HP Proliant system. So commit `790d849bf` (cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency) causes the system to use performance governor which, I guess, was not the intention of the patch. Enabling debug output in pcc-cpufreq provides following verbose output: pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz pcc_get_offset: for CPU 0: pcc_cpu_data input_offset: 0x44, pcc_cpu_data output_offset: 0x48 init: policy->max is 2800000, policy->min is 1200000 get: get_freq for CPU 0 get: SUCCESS: (virtual) output_offset for cpu 0 is 0xffffc9000d7c0048, contains a value of: 0xff06. Speed is: 168000 MHz cpufreq: ondemand governor failed, too long transition latency of HW, fallback to performance governor target: CPU 0 should go to target freq: 2800000 (virtual) input_offset is 0xffffc9000d7c0044 target: was SUCCESSFUL for cpu 0 I am asking to revert `790d849bf` to re-enable usage of ondemand governor with pcc-cpufreq. Fixes: `790d849bf` (cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency) CC: <stable@vger.kernel.org> # 4.5+ Signed-off-by: Andreas Herrmann <aherrmann@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-22 23:51:06 +02:00
Steve Muckle	ae2c1ca686	cpufreq: export cpufreq_driver_resolve_freq() Export cpufreq_driver_resolve_freq() since governors may be compiled as modules. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Steve Muckle <smuckle@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-22 13:53:51 +02:00
Viresh Kumar	abe8bd024e	cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index() The handlers provided by cpufreq core are sufficient for resolving the frequency for drivers providing ->target_index(), as the core already has the frequency table and so ->resolve_freq() isn't required for such platforms. This patch disallows drivers with ->target_index() callback to use the ->resolve_freq() callback. Also, it fixes a potential kernel crash for drivers providing ->target() but no ->resolve_freq(). Fixes: `e3c0623608` "cpufreq: add cpufreq_driver_resolve_freq()" Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-21 23:45:17 +02:00
Steve Muckle	5b6667c76d	cpufreq: acpi-cpufreq: use cached frequency mapping when possible A call to cpufreq_driver_resolve_freq will cache the mapping from the desired target frequency to the frequency table index. If there is a mapping for the desired target frequency then use it instead of looking up the mapping again. Signed-off-by: Steve Muckle <smuckle@linaro.org> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-21 22:28:32 +02:00
Steve Muckle	e3c0623608	cpufreq: add cpufreq_driver_resolve_freq() Cpufreq governors may need to know what a particular target frequency maps to in the driver without necessarily wanting to set the frequency. Support this operation via a new cpufreq API, cpufreq_driver_resolve_freq(). This API returns the lowest driver frequency equal or greater than the target frequency (CPUFREQ_RELATION_L), subject to any policy (min/max) or driver limitations. The mapping is also cached in the policy so that a subsequent fast_switch operation can avoid repeating the same lookup. The API will call a new cpufreq driver callback, resolve_freq(), if it has been registered by the driver. Otherwise the frequency is resolved via cpufreq_frequency_table_target(). Rather than require ->target() style drivers to provide a resolve_freq() callback it is left to the caller to ensure that the driver implements this callback if necessary to use cpufreq_driver_resolve_freq(). Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Steve Muckle <smuckle@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-21 14:46:08 +02:00
Srinivas Pandruvada	da7de91c3e	cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT The MSR MSR_HWP_INTERRUPT is valid only when CPUID.06H:EAX[8] = 1, so check for feature before accessing this MSR. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-21 14:29:30 +02:00
Rafael J. Wysocki	bc95a454b6	intel_pstate: Update cpu_frequency tracepoint every time Currently, intel_pstate only updates the cpu_frequency tracepoint if the new P-state to set is different from the current one, but that causes powertop to report 100% idle on an 100% loaded system sometimes. Prevent that from happening by updating the cpu_frequency tracepoint every time intel_pstate_update_pstate() is called. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>-	2016-07-21 14:28:37 +02:00
Carsten Emde	2630abc243	cpufreq: intel_pstate: clean remnant struct element When I was working with the Intel P state driver I came across a remnant struct element that is no longer needed after the function intel_pstate_calc_freq() was retired. Signed-off-by: Carsten Emde <C.Emde@osadl.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-21 14:26:00 +02:00
Akshay Adiga	09ca4c9b59	cpufreq: powernv: Replacing pstate_id with frequency table index Refactoring code to use frequency table index instead of pstate_id. This abstraction will make the code independent of the pstate values. - No functional changes - The highest frequency is at frequency table index 0 and the frequency decreases as the index increases. - Macros pstates_to_idx() and idx_to_pstate() can be used for conversion between pstate_id and index. - powernv_pstate_info now contains frequency table index to min, max and nominal frequency (instead of pstate_ids) - global_pstate_info new stores index values instead pstate ids. - variables renamed as *_idx which now store index instead of pstate Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-12 02:47:10 +02:00
Jan Kiszka	5fc8f707a2	intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate() If MSR_CONFIG_TDP_CONTROL is locked, we currently try to address some MSR 0x80000648 or so. Mask out the relevant level bits 0 and 1. Found while running over the Jailhouse hypervisor which became upset about this strange MSR index. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 4.4+ <stable@vger.kernel.org> # 4.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-11 15:12:30 +02:00
Srinivas Pandruvada	100cf6f277	cpufreq: intel_pstate: Replace MSR_NHM_TURBO_RATIO_LIMIT Replace MSR_NHM_TURBO_RATIO_LIMIT with MSR_TURBO_RATIO_LIMIT. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-07 15:31:58 +02:00
Thomas Gleixner	7bc54b652f	timers, cpufreq/powernv: Initialize the gpstate timer as pinned Pinned timers must carry the pinned attribute in the timer structure itself, so convert the code to the new API. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094341.297014487@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-07 10:25:14 +02:00
Viresh Kumar	825773609c	cpufreq: Reuse new freq-table helpers This patch migrates few users of cpufreq tables to the new helpers that work on sorted freq-tables. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-07 00:14:27 +02:00
Viresh Kumar	da0c6dc00c	cpufreq: Handle sorted frequency tables more efficiently cpufreq drivers aren't required to provide a sorted frequency table today, and even the ones which provide a sorted table aren't handled efficiently by cpufreq core. This patch adds infrastructure to verify if the freq-table provided by the drivers is sorted or not, and use efficient helpers if they are sorted. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-07 00:13:20 +02:00
Rafael J. Wysocki	8d540ea792	cpufreq: Drop redundant check from cpufreq_update_current_freq() Both callers of cpufreq_update_current_freq(), cpufreq_update_policy() and cpufreq_start_governor(), check cpufreq_suspended before calling that function, so drop the redundant cpufreq_suspended check from it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2016-07-04 13:22:35 +02:00
Rafael J. Wysocki	5d1191ab6c	Merge back earlier cpufreq material for v4.8.	2016-07-04 13:21:43 +02:00
Rafael J. Wysocki	742c87bf27	cpufreq: Avoid false-positive WARN_ON()s in cpufreq_update_policy() CPU notifications from the firmware coming in when cpufreq is suspended cause cpufreq_update_current_freq() to return 0 which triggers the WARN_ON() in cpufreq_update_policy() for no reason. Avoid that by checking cpufreq_suspended before calling cpufreq_update_current_freq(). Fixes: `c9d9c929e6` (cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.6+ <stable@vger.kernel.org> # 4.6+	2016-06-28 03:29:29 +02:00
Jisheng Zhang	4a7cb7a96a	intel_pstate: Declare pid_params/pstate_funcs/hwp_active __read_mostly pid_params is written once by copy_pid_params() during initialization, and thereafter is mostly read by hot path intel_pstate_update_util(). The read of pid_params gets more after commit `a4675fbc4a` ("cpufreq: intel_pstate: Replace timers with utilization update callbacks") pstate_funcs is written once by copy_cpu_funcs() during initialization, and thereafter is mostly read by hot path intel_pstate_update_util() hwp_active is written to once during initialization and thereafter is mostly read by hot path intel_pstate_update_util(). The fact that they are mostly read and not written to makes them candidates for __read_mostly declarations. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-28 00:04:04 +02:00
Jisheng Zhang	29327c84ba	intel_pstate: add __init/__initdata marker to some functions/variables These functions/variables are not needed after booting, so mark them as __init or __initdata. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-28 00:04:04 +02:00
Jisheng Zhang	eed436095e	intel_pstate: Fix incorrect placement of __initdata __initdata should be placed between the variable name and equal sign (if there is) for the variable to be placed in the intended section. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-28 00:04:04 +02:00
Masahiro Yamada	ca5eda5d3d	cpufreq: dt: call of_node_put() before error out If of_match_node() fails, this init function bails out without calling of_node_put(). Also change of_node_put(of_root) to of_node_put(np); both of them hold the same pointer, but it seems better to call of_node_put() against the node returned by of_find_node_by_path(). Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-27 23:49:44 +02:00
Rafael J. Wysocki	5ab666e095	intel_pstate: Do not clear utilization update hooks on policy changes intel_pstate_set_policy() is invoked by the cpufreq core during driver initialization, on changes of policy attributes (minimim and maximum frequency, for example) via sysfs and via CPU notifications from the platform firmware. On some platforms the latter may occur relatively often. Commit `bb6ab52f2b` (intel_pstate: Do not set utilization update hook too early) made intel_pstate_set_policy() clear the CPU's utilization update hook before updating the policy attributes for it (and set the hook again after doind that), but that involves invoking synchronize_sched() and adds overhead to the CPU notifications mentioned above and to the sched-RCU handling in general. That extra overhead is arguably not necessary, because updating policy attributes when the CPU's utilization update hook is active should not lead to any adverse effects, so drop the clearing of the hook from intel_pstate_set_policy() and make it check if the hook has been set already when attempting to set it. Fixes: `bb6ab52f2b` (intel_pstate: Do not set utilization update hook too early) Reported-by: Jisheng Zhang <jszhang@marvell.com> Tested-by: Jisheng Zhang <jszhang@marvell.com> Tested-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-27 23:47:15 +02:00
Mike Galbraith	3c67a829bd	cpufreq: pcc-cpufreq: Fix doorbell.access_width Commit `920de6ebfa` (ACPICA: Hardware: Enhance acpi_hw_validate_register() with access_width/bit_offset awareness) apparently exposed a latent bug, doorbell.access_width is initialized to 64, but per Lv Zheng, it should be 4, and indeed, making that change does bring pcc-cpufreq back to life. Fixes: `920de6ebfa` (ACPICA: Hardware: Enhance acpi_hw_validate_register() with access_width/bit_offset awareness) Suggested-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-23 23:09:51 +02:00
Ben Dooks	187364b6fc	cpufreq: s5pv210: use relaxed IO accesors The use of __raw IO accesors is not endian safe and should be used sparingly. The relaxed variants should be as lightweight and also are endian safe. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>	2016-06-22 14:00:21 +02:00
Rafael J. Wysocki	56b7808572	Merge back earlier cpufreq changes for v4.8.	2016-06-20 14:31:41 +02:00
Srinivas Pandruvada	b00345d199	cpufreq: intel_pstate: Adjust _PSS[0] freqeuency if needed The maximum turbo P-State used by the intel_pstate driver may be limited by ACPI _PSS table entry 0. After commit `9522a2ff9c` (cpufreq: intel_pstate: Enforce _PPC limits), the maximum performance on servers will be capped by the _PSS table entry 0 by default. Even though that is formally correct, it may lead to preformance regressions in some cases. Namely, if the _PSS table entry 0 is not the maximum turbo P-State, performance measured after commit `9522a2ff9c` will not match the performance measured before that commit on the same system. For this reason, modify the code to always use the maximum turbo frequency as the one that corresponds to _PSS table entry 0 if turbo is enabled in the BIOS. This way, the performance levels from before commit `9522a2ff9c` will be restored on the affected systems. Fixes: `9522a2ff9c` (cpufreq: intel_pstate: Enforce _PPC limits) Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw : Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-15 01:56:47 +02:00
Ingo Molnar	f6f4bbc997	Merge branch 'x86/cpu' into x86/platform, to avoid conflict Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-06-14 12:25:07 +02:00

... 3 4 5 6 7 ...

2352 Commits