Commit Graph

21188 Commits

Author SHA1 Message Date
Linus Torvalds
d63a978865 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
 "The main changes in this cycle were:

   - More gradual enhancements to atomic ops: new atomic*_read_ctrl()
     ops, synchronize atomic_{read,set}() ordering requirements between
     architectures, add atomic_long_t bitops.  (Peter Zijlstra)

   - Add _{relaxed|acquire|release}() variants for inc/dec atomics and
     use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
     This enables weakly ordered architectures (such as arm64) to make
     use of more locking related optimizations.  (Davidlohr Bueso)

   - Implement atomic[64]_{inc,dec}_relaxed() on ARM.  (Will Deacon)

   - Futex kernel data cache footprint micro-optimization.  (Rasmus
     Villemoes)

   - pvqspinlock runtime overhead micro-optimization.  (Waiman Long)

   - misc smaller fixlets"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
  locking/rwsem: Use acquire/release semantics
  locking/mcs: Use acquire/release semantics
  locking/rtmutex: Use acquire/release semantics
  locking/mutex: Use acquire/release semantics
  locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
  atomic: Implement atomic_read_ctrl()
  atomic, arch: Audit atomic_{read,set}()
  atomic: Add atomic_long_t bitops
  futex: Force hot variables into a single cache line
  locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
  locking/osq: Relax atomic semantics
  locking/qrwlock: Rename ->lock to ->wait_lock
  locking/Documentation/lockstat: Fix typo - lokcing -> locking
  locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h
2015-11-03 16:10:43 -08:00
Linus Torvalds
2814228699 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:
 "The main changes in this cycle were:

   - Improvements to expedited grace periods (Paul E McKenney)

   - Performance improvements to and locktorture tests for percpu-rwsem
     (Oleg Nesterov, Paul E McKenney)

   - Torture-test changes (Paul E McKenney, Davidlohr Bueso)

   - Documentation updates (Paul E McKenney)

   - Miscellaneous fixes (Paul E McKenney, Boqun Feng, Oleg Nesterov,
     Patrick Marlier)"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  fs/writeback, rcu: Don't use list_entry_rcu() for pointer offsetting in bdi_split_work_to_wbs()
  rcu: Better hotplug handling for synchronize_sched_expedited()
  rcu: Enable stall warnings for synchronize_rcu_expedited()
  rcu: Add tasks to expedited stall-warning messages
  rcu: Add online/offline info to expedited stall warning message
  rcu: Consolidate expedited CPU selection
  rcu: Prepare for consolidating expedited CPU selection
  cpu: Remove try_get_online_cpus()
  rcu: Stop excluding CPU hotplug in synchronize_sched_expedited()
  rcu: Stop silencing lockdep false positive for expedited grace periods
  rcu: Switch synchronize_sched_expedited() to IPI
  locktorture: Fix module unwind when bad torture_type specified
  torture: Forgive non-plural arguments
  rcutorture: Fix unused-function warning for torturing_tasks()
  rcutorture: Fix module unwind when bad torture_type specified
  rcu_sync: Cleanup the CONFIG_PROVE_RCU checks
  locking/percpu-rwsem: Clean up the lockdep annotations in percpu_down_read()
  locking/percpu-rwsem: Fix the comments outdated by rcu_sync
  locking/percpu-rwsem: Make use of the rcu_sync infrastructure
  locking/percpu-rwsem: Make percpu_free_rwsem() after kzalloc() safe
  ...
2015-11-03 15:40:38 -08:00
Linus Torvalds
6aa2fdb87c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The irq departement delivers:

   - Rework the irqdomain core infrastructure to accomodate ACPI based
     systems.  This is required to support ARM64 without creating
     artificial device tree nodes.

   - Sanitize the ACPI based ARM GIC initialization by making use of the
     new firmware independent irqdomain core

   - Further improvements to the generic MSI management

   - Generalize the irq migration on CPU hotplug

   - Improvements to the threaded interrupt infrastructure

   - Allow the migration of "chained" low level interrupt handlers

   - Allow optional force masking of interrupts in disable_irq[_nosysnc]

   - Support for two new interrupt chips - Sigh!

   - A larger set of errata fixes for ARM gicv3

   - The usual pile of fixes, updates, improvements and cleanups all
     over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
  Document that IRQ_NONE should be returned when IRQ not actually handled
  PCI/MSI: Allow the MSI domain to be device-specific
  PCI: Add per-device MSI domain hook
  of/irq: Use the msi-map property to provide device-specific MSI domain
  of/irq: Split of_msi_map_rid to reuse msi-map lookup
  irqchip/gic-v3-its: Parse new version of msi-parent property
  PCI/MSI: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
  of/irq: Use of_msi_get_domain instead of open-coded "msi-parent" parsing
  of/irq: Add support code for multi-parent version of "msi-parent"
  irqchip/gic-v3-its: Add handling of PCI requester id.
  PCI/MSI: Add helper function pci_msi_domain_get_msi_rid().
  of/irq: Add new function of_msi_map_rid()
  Docs: dt: Add PCI MSI map bindings
  irqchip/gic-v2m: Add support for multiple MSI frames
  irqchip/gic-v3: Fix translation of LPIs after conversion to irq_fwspec
  irqchip/mxs: Add Alphascale ASM9260 support
  irqchip/mxs: Prepare driver for hardware with different offsets
  irqchip/mxs: Panic if ioremap or domain creation fails
  irqdomain: Documentation updates
  irqdomain/msi: Use fwnode instead of of_node
  ...
2015-11-03 14:40:01 -08:00
Linus Torvalds
7b2a4306f9 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "The timer departement provides:

   - More y2038 work in the area of ntp and pps.

   - Optimization of posix cpu timers

   - New time related selftests

   - Some new clocksource drivers

   - The usual pile of fixes, cleanups and improvements"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  timeconst: Update path in comment
  timers/x86/hpet: Type adjustments
  clocksource/drivers/armada-370-xp: Implement ARM delay timer
  clocksource/drivers/tango_xtal: Add new timer for Tango SoCs
  clocksource/drivers/imx: Allow timer irq affinity change
  clocksource/drivers/exynos_mct: Use container_of() instead of this_cpu_ptr()
  clocksource/drivers/h8300_*: Remove unneeded memset()s
  clocksource/drivers/sh_cmt: Remove unneeded memset() in sh_cmt_setup()
  clocksource/drivers/em_sti: Remove unneeded memset()s
  clocksource/drivers/mediatek: Use GPT as sched clock source
  clockevents/drivers/mtk: Fix spurious interrupt leading to crash
  posix_cpu_timer: Reduce unnecessary sighand lock contention
  posix_cpu_timer: Convert cputimer->running to bool
  posix_cpu_timer: Check thread timers only when there are active thread timers
  posix_cpu_timer: Optimize fastpath_timer_check()
  timers, kselftest: Add 'adjtick' test to validate adjtimex() tick adjustments
  timers: Use __fls in apply_slack()
  clocksource: Remove return statement from void functions
  net: sfc: avoid using timespec
  ntp/pps: use y2038 safe types in pps_event_time
  ...
2015-11-03 14:13:41 -08:00
Linus Torvalds
95fc00a4e1 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull memremap fix from Dan Williams:
 "The new memremap() api introduced in the 4.3 cycle to unify/replace
  ioremap_cache() and ioremap_wt() is mishandling the highmem case.
  This patch has received a build success notification from a
  0day-kbuild-robot run and has received an ack from Ard"

From the commit message:
 "The impact of this bug is low for now since the pmem driver is the
  only user of memremap(), but this is important to fix before more
  conversions to memremap arrive in 4.4"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  memremap: fix highmem support
2015-11-01 14:13:54 -08:00
Ingo Molnar
e4340bbb07 Merge branch 'linus' into core/rcu, to fix up a semantic conflict
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-28 13:17:20 +01:00
Linus Torvalds
9e17f90702 Turns out we should have always been disabling preemption here;
someone finally caught it thanks to Peter Z's additional checks.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWLfFDAAoJENkgDmzRrbjxhFQP/jK2tSn8I2kDsd2sNZelXor+
 uQGKBpws8c2qY/MXru7HwlAX4L6Qlk6hCm0FyI9ZTqPBfR9sxjqFzOzWjcwFYrHa
 EXPPzJcGBR1voK/n/e+J/xHFQgXuIVKiOtmz1PZOp8Hs4zO3SftT4+MuTCpuaC0x
 79P0crJ4lV8FKFc+5pjWXceWRIG5XQuIbmDjNLOycuiNad7mtrEnR4xbMQBxD1Hf
 P1xT4pKckwVcSvRNw8St25ov65N/l/CPlDtYXov4eay4/Jd4wTFRSipe9v70tOkL
 vUGFGwa76ZM9X9y1Q85l2oQEkKMffSl82JEXyDrGPITwlYX0qRGOGfXPL8ihgcFi
 Gd3abBdI68HQidkwPNBJmMdSEX8q3ygzFeFOLYNdmeklhqnKRZHvDvbok1poBnSf
 EJbE6Cv7eKY9e74lu60NAf76XB48oL/BvEfdW1l4p/8GGxc0EYc6XGjjeHLsQffT
 7yMRzEp99nX4JQoSjeDRroLbCCAzzoAcogJnlrnYFywzfVkyWUU3sPpbMFxvRDSW
 Xuk/8Ii15AYwXnbQOpojtIJAi9f6F8CzIbtsO5J9cFju8lFUzUYpwC0Lm/pPoxQT
 PqgDqVvS3wBOaL7iEA1Sy615qUCxu6eIKK6gqnFs1hK9sTg/aSUF+Oz668rQ/a/E
 2XvWiB2C9NErM5IHtQdh
 =8AgP
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module preemption fix from Rusty Russell:
 "Turns out we should have always been disabling preemption here;
  someone finally caught it thanks to Peter Z's additional checks"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: Fix locking in symbol_put_addr()
2015-10-28 07:17:50 +09:00
Dan Williams
182475b7a2 memremap: fix highmem support
Currently memremap checks if the range is "System RAM" and returns the
kernel linear address.  This is broken for highmem platforms where a
range may be "System RAM", but is not part of the kernel linear mapping.
Fallback to ioremap_cache() in these cases, to let the arch code attempt
to handle it.

Note that ARM ioremap will WARN when attempting to remap ram, and in
that case the caller needs to be fixed.  For this reason, existing
ioremap_cache() usages for ARM are already trained to avoid attempts to
remap ram.

The impact of this bug is low for now since the pmem driver is the only
user of memremap(), but this is important to fix before more conversions
to memremap arrive in 4.4.

Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-10-26 16:55:56 -04:00
Jason A. Donenfeld
03f136a207 timeconst: Update path in comment
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: hofrat@osadl.org
Link: http://lkml.kernel.org/r/1436894685-5868-1-git-send-email-Jason@zx2c4.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-26 10:06:06 +09:00
Linus Torvalds
df55793680 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc fixes all around the map: an instrumentation fix, a nohz
  usability fix, a lockdep annotation fix and two task group scheduling
  fixes"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Add missing lockdep_unpin() annotations
  sched/deadline: Fix migration of SCHED_DEADLINE tasks
  nohz: Revert "nohz: Set isolcpus when nohz_full is set"
  sched/fair: Update task group's load_avg after task migration
  sched/fair: Fix overly small weight for interactive group entities
  sched, tracing: Stop/start critical timings around the idle=poll idle loop
2015-10-23 22:31:39 +09:00
Linus Torvalds
9f30931a54 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "9 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  ocfs2/dlm: unlock lockres spinlock before dlm_lockres_put
  fault-inject: fix inverted interval/probability values in printk
  lib/Kconfig.debug: disable -Wframe-larger-than warnings with KASAN=y
  mm: make sendfile(2) killable
  thp: use is_zero_pfn() only after pte_present() check
  mailmap: update Javier Martinez Canillas' email
  MAINTAINERS: add Sergey as zsmalloc reviewer
  mm: cma: fix incorrect type conversion for size during dma allocation
  kmod: don't run async usermode helper as a child of kworker thread
2015-10-23 22:10:51 +09:00
Peter Zijlstra
0aaafaabfc sched/core: Add missing lockdep_unpin() annotations
Luca and Wanpeng reported two missing annotations that led to
false lockdep complaints. Add the missing annotations.

Reported-by: Luca Abeni <luca.abeni@unitn.it>
Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: cbce1a6867 ("sched,lockdep: Employ lock pinning")
Link: http://lkml.kernel.org/r/20151023095008.GY17308@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-23 12:02:10 +02:00
Oleg Nesterov
5211613978 kmod: don't run async usermode helper as a child of kworker thread
call_usermodehelper_exec_sync() does fork() + wait() with "unignored"
SIGCHLD.  What we have missed is that this worker thread can have other
children previously forked by call_usermodehelper_exec_work() without
UMH_WAIT_PROC.  If such a child exits in between it becomes a zombie
because auto-reaping only works if SIGCHLD is ignored, and nobody can
reap it (unless/until this worker thread exits too).

Change the !UMH_WAIT_PROC case to use CLONE_PARENT.

Note: this is only first step.  All PF_KTHREAD tasks, even created by
kernel_thread() should have ->parent == kthreadd by default.

Fixes: bb304a5c6f ("kmod: handle UMH_WAIT_PROC from system unbound workqueue")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-23 17:55:10 +09:00
Steven Rostedt (Red Hat)
1904be1b6b tracing: Do not allow stack_tracer to record stack in NMI
The code in stack tracer should not be executed within an NMI as it grabs
spinlocks and stack tracing an NMI gives the possibility of causing a
deadlock. Although this is safe on x86_64, because it does not perform stack
traces when the task struct stack is not in use (interrupts and NMIs), it
may be an issue for NMIs on i386 and other archs that use the same stack as
the NMI.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-10-20 21:52:23 -04:00
Steven Rostedt (Red Hat)
a2d7629048 tracing: Have stack tracer force RCU to be watching
The stack tracer was triggering the WARN_ON() in module.c:

 static void module_assert_mutex_or_preempt(void)
 {
 #ifdef CONFIG_LOCKDEP
	if (unlikely(!debug_locks))
		return;

	WARN_ON(!rcu_read_lock_sched_held() &&
		!lockdep_is_held(&module_mutex));
 #endif
 }

The reason is that the stack tracer traces all function calls, and some of
those calls happen while exiting or entering user space and idle. Some of
these functions are called after RCU had already stopped watching, as RCU
does not watch userspace or idle CPUs.

If a max stack is hit, then the save_stack_trace() is called, which will
check module addresses and call module_assert_mutex_or_preempt(), and then
trigger the warning. Sad part is, the warning itself will also do a stack
trace and tigger the same warning. That probably should be fixed.

The warning was added by 0be964be0d "module: Sanitize RCU usage and
locking" but this bug has probably been around longer. But it's unlikely to
cause much harm, but the new warning causes the system to lock up.

Cc: stable@vger.kernel.org # 4.2+
Cc: Peter Zijlstra <peterz@infradead.org>
Cc:"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-10-20 11:38:08 -04:00
Thomas Gleixner
b2c280bdd6 Merge branch 'fortglx/4.4/time' of https://git.linaro.org/people/john.stultz/linux into timers/core
Time updates from John Stultz:

     - More 2038 work from Arnd Bergmann around ntp and pps
2015-10-20 12:36:37 +02:00
Ingo Molnar
a1a2ab2ff7 Linux 4.3-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWJCaDAAoJEHm+PkMAQRiGvJQH/jGlxawu+mnNFjIRXmk8Zucq
 cxDVTPnQ4S1DJMyRSQTqe6ccDgDM/TEISJPZ0Gwc2SQLda27zdQiz05upbqHSPSH
 /FUHMKiMRhhIBOzdfEGR4Ry2YZID+DlG5EWCgRKf/GlgY3STZSpjIBPYd3Rl3zsU
 qragNjtbCE6eUpnLwkv40Q2mzzR3/fZxJ0drgE/QiSF8P0VmvVbrcQMF58vnUQgy
 9URp48mLhmKj+IrElSnvXMG0XLrKn1tRYXGXd3w+x1Nlr//SHYsIuQHUcp6SAVQh
 p2HXLL7eoy8rs45MCiOfXG7VgNWF6HFjVMtrNUcNfgBsLCx0qFMVrKIILrGtKh0=
 =ETze
 -----END PGP SIGNATURE-----

Merge tag 'v4.3-rc6' into locking/core, to pick up fixes before applying new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20 10:16:46 +02:00
Luca Abeni
5aa5050787 sched/deadline: Fix migration of SCHED_DEADLINE tasks
Commit:

  9d51426242 ("sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target")

broke select_task_rq_dl() and find_lock_later_rq(), because it introduced
a comparison between the local task's deadline and dl.earliest_dl.curr of
the remote queue.

However, if the remote runqueue does not contain any SCHED_DEADLINE
task its earliest_dl.curr is 0 (always smaller than the deadline of
the local task) and the remote runqueue is not selected for pushing.

As a result, if an application creates multiple SCHED_DEADLINE
threads, they will never be pushed to runqueues that do not already
contain SCHED_DEADLINE tasks.

This patch fixes the issue by checking if dl.dl_nr_running == 0.

Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@linux.intel.com>
Fixes: 9d51426242 ("sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target")
Link: http://lkml.kernel.org/r/1444982781-15608-1-git-send-email-luca.abeni@unitn.it
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20 10:13:36 +02:00
Frederic Weisbecker
0baabb385e nohz: Revert "nohz: Set isolcpus when nohz_full is set"
This reverts:

  8cb9764fc8 ("nohz: Set isolcpus when nohz_full is set")

We assumed that full-nohz users always want scheduler isolation on full
dynticks CPUs, therefore we included full-nohz CPUs on cpu_isolated_map.

This means that tasks run by default on CPUs outside the nohz_full range
unless their affinity is explicity overwritten.

This suits pure isolation workloads but when the machine is needed to
run common workloads, the available sets of CPUs to run common tasks
becomes reduced.

We reach an extreme case when CONFIG_NO_HZ_FULL_ALL is enabled as it
leaves only CPU 0 for non-isolation tasks, which makes people think that
their supercomputer regressed to 90's UP - which is true in a sense.

Some full-nohz users appear to be interested in running normal workloads
either before or after an isolation workload. Full-nohz isn't optimized
toward normal workloads but it's still better than UP performance.

We are reaching a limitation in kernel presets here. Lets revert this
cpu_isolated_map inclusion and let userspace do its own scheduler
isolation using cpusets or explicit affinity settings.

Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Link: http://lkml.kernel.org/r/1444663283-30068-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20 10:13:36 +02:00
Yuyang Du
3e386d56ba sched/fair: Update task group's load_avg after task migration
When cfs_rq has cfs_rq->removed_load_avg set (when a task migrates from
this cfs_rq), we need to update its contribution to the group's load_avg.

This should not increase tg's update too much, because in most cases, the
cfs_rq has already decayed its load_avg.

Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1444699103-20272-2-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20 10:13:35 +02:00
Yuyang Du
fde7d22e01 sched/fair: Fix overly small weight for interactive group entities
Commit:

  9d89c257df ("sched/fair: Rewrite runnable load and utilization average tracking")

led to an overly small weight for interactive group entities. The bad case
can be easily reproduced when a number of CPU hogs compete for the CPUs
at the same time (thanks to Mike). This is largly because the task group's
load average tracking cross CPUs lags behind the real changes.

To fix this we accelerate the group share distribution process by using
the load.weight of the cfs_rq. This may increase the entire group's
share, but we have to do so to protect the (fragile) interactive
tasks, especially from CPU hogs.

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1444699103-20272-1-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-20 10:13:34 +02:00
Ingo Molnar
c13dc31adb Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

  - Miscellaneous fixes. (Paul E. McKenney, Boqun Feng, Oleg Nesterov, Patrick Marlier)

  - Improvements to expedited grace periods. (Paul E. McKenney)

  - Performance improvements to and locktorture tests for percpu-rwsem.
    (Oleg Nesterov, Paul E. McKenney)

  - Torture-test changes. (Paul E. McKenney, Davidlohr Bueso)

  - Documentation updates. (Paul E. McKenney)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-19 10:09:54 +02:00
Linus Torvalds
81429a6dbc Merge branches 'irq-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq/timer fixes from Thomas Gleixner:
 "irq: a fix for the new hierarchical MSI interrupt handling which
  unbreaks PCI=n configurations.

  timers: a fix for the new hrtimer clock offset update mechanism to
  ensure that the boot time offset is respected"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/msi: Do not use pci_msi_[un]mask_irq as default methods

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Increment clock_was_set_seq in timekeeping_init()
2015-10-17 08:47:27 -07:00
Thomas Gleixner
56fd16caba timekeeping: Increment clock_was_set_seq in timekeeping_init()
timekeeping_init() can set the wall time offset, so we need to
increment the clock_was_set_seq counter. That way hrtimers will pick
up the early offset immediately. Otherwise on a machine which does not
set wall time later in the boot process the hrtimer offset is stale at
0 and wall time timers are going to expire with a delay of 45 years.

Fixes: 868a3e915f "hrtimer: Make offset update smarter"
Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Stefan Liebler <stli@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
2015-10-16 15:50:22 +02:00
Marc Zyngier
0701c53e46 genirq/msi: Do not use pci_msi_[un]mask_irq as default methods
When we create a generic MSI domain, that MSI_FLAG_USE_DEF_CHIP_OPS
is set, and that any of .mask or .unmask are NULL in the irq_chip
structure, we set them to pci_msi_[un]mask_irq.

This is a bad idea for at least two reasons:
- PCI_MSI might not be selected, kernel fails to build (yes, this is
  legitimate, at least on arm64!)
- This may not be a PCI/MSI domain at all (platform MSI, for example)

Either way, this looks wrong. Move the overriding of mask/unmask to
the PCI counterpart, and panic is any of these two methods is not
set in the core code (they really should be present).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/1444760085-27857-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-16 12:40:43 +02:00
Linus Torvalds
995e2fe9a4 Merge branch 'for-4.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixlet from Tejun Heo:
 "Single patch to make delayed work always be queued on the local CPU"

This is not actually something we should guarantee, but it's something
we by accident have historically done, and at least one call site has
grown to depend on it.

I'm going to fix that known broken callsite, but in the meantime this
makes the accidental behavior be explicit, just in case there are other
cases that might depend on it.

* 'for-4.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: make sure delayed work run in local cpu
2015-10-15 12:58:37 -07:00
Jason Low
c8d75aa47d posix_cpu_timer: Reduce unnecessary sighand lock contention
It was found while running a database workload on large systems that
significant time was spent trying to acquire the sighand lock.

The issue was that whenever an itimer expired, many threads ended up
simultaneously trying to send the signal. Most of the time, nothing
happened after acquiring the sighand lock because another thread
had just already sent the signal and updated the "next expire" time.
The fastpath_timer_check() didn't help much since the "next expire"
time was updated after the threads exit fastpath_timer_check().

This patch addresses this by having the thread_group_cputimer structure
maintain a boolean to signify when a thread in the group is already
checking for process wide timers, and adds extra logic in the fastpath
to check the boolean.

Signed-off-by: Jason Low <jason.low2@hp.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: George Spelvin <linux@horizon.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: hideaki.kimura@hpe.com
Cc: terry.rudd@hpe.com
Cc: scott.norton@hpe.com
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1444849677-29330-5-git-send-email-jason.low2@hp.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-15 11:23:41 +02:00
Jason Low
d5c373eb56 posix_cpu_timer: Convert cputimer->running to bool
In the next patch in this series, a new field 'checking_timer' will
be added to 'struct thread_group_cputimer'. Both this and the
existing 'running' integer field are just used as boolean values. To
save space in the structure, we can make both of these fields booleans.

This is a preparatory patch to convert the existing running integer
field to a boolean.

Suggested-by: George Spelvin <linux@horizon.com>
Signed-off-by: Jason Low <jason.low2@hp.com>
Reviewed: George Spelvin <linux@horizon.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: hideaki.kimura@hpe.com
Cc: terry.rudd@hpe.com
Cc: scott.norton@hpe.com
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1444849677-29330-4-git-send-email-jason.low2@hp.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-15 11:23:41 +02:00
Jason Low
934715a191 posix_cpu_timer: Check thread timers only when there are active thread timers
The fastpath_timer_check() contains logic to check for if any timers
are set by checking if !task_cputime_zero(). Similarly, we can do this
before calling check_thread_timers(). In the case where there
are only process-wide timers, this will skip all of the computations for
per-thread timers when there are no per-thread timers.

As suggested by George, we can put the task_cputime_zero() check in
check_thread_timers(), since that is more of an optization to the
function. Similarly, we move the existing check of cputimer->running
to check_process_timers().

Signed-off-by: Jason Low <jason.low2@hp.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: George Spelvin <linux@horizon.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: hideaki.kimura@hpe.com
Cc: terry.rudd@hpe.com
Cc: scott.norton@hpe.com
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1444849677-29330-3-git-send-email-jason.low2@hp.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-15 11:23:41 +02:00
Jason Low
7c177d994e posix_cpu_timer: Optimize fastpath_timer_check()
In fastpath_timer_check(), the task_cputime() function is always
called to compute the utime and stime values. However, this is not
necessary if there are no per-thread timers to check for. This patch
modifies the code such that we compute the task_cputime values only
when there are per-thread timers set.

Signed-off-by: Jason Low <jason.low2@hp.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: George Spelvin <linux@horizon.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: hideaki.kimura@hpe.com
Cc: terry.rudd@hpe.com
Cc: scott.norton@hpe.com
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1444849677-29330-2-git-send-email-jason.low2@hp.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-15 11:23:41 +02:00
Marc Zyngier
be5436c83a irqdomain/msi: Use fwnode instead of of_node
As we continue to push of_node towards the outskirts of irq domains,
let's start tackling the case of msi_create_irq_domain and its little
friends.

This has limited impact in both PCI/MSI, platform MSI, and a few
drivers.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-17-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:25 +02:00
Marc Zyngier
2a5e9a072d irqdomain: Introduce irq_domain_create_hierarchy
As we're about to start converting the various MSI layers to
use fwnode_handle instead of device_node, add irq_domain_create_hierarchy
as a directly equivalent of irq_domain_add_hierarchy (which still
exists as a compatibility interface).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-16-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:25 +02:00
Marc Zyngier
b145dcc45a irqdomain: Add a fwnode_handle allocator
In order to be able to reference an irqdomain from ACPI, we need
to be able to create an identifier, which is usually a struct
device_node.

This device node does't really fit the ACPI infrastructure, so
we cunningly allocate a new structure containing a fwnode_handle,
and return that.

This structure doesn't really point to a device (interrupt
controllers are not "real" devices in Linux), but as we cannot
really deny that they exist, we create them with a new fwnode_type
(FWNODE_IRQCHIP).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-9-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:24 +02:00
Marc Zyngier
1bf4ddc46c irqdomain: Introduce irq_domain_create_{linear, tree}
Just like we have irq_domain_add_{linear,tree} to create a irq domain
identified by an of_node, introduce irq_domain_create_{linear,tree}
that do the same thing, except that they take a struct fwnode_handle.

Existing functions get rewritten in terms of the new ones so that
everything keeps working as before (and __irq_domain_add is now
fwnode_handle based as well).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-8-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:24 +02:00
Marc Zyngier
c0131f09de irqdomain: Introduce irq_create_fwspec_mapping
Just like we have irq_create_of_mapping, irq_create_fwspec_mapping
creates a IRQ domain mapping for an interrupt described in a
struct irq_fwspec.

irq_create_of_mapping gets rewritten in terms of the new function,
and the hack we introduced before gets removed (now that no stacked
irqchip uses of_phandle_args anymore).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-7-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:24 +02:00
Marc Zyngier
11e4438ee3 irqdomain: Introduce a firmware-specific IRQ specifier structure
So far the closest thing to a generic IRQ specifier structure is
of_phandle_args, which happens to be pretty OF specific (the of_node
pointer in there is quite annoying).

Let's introduce 'struct irq_fwspec' that can be used in place of
of_phandle_args for OF, but also for other firmware implementations
(that'd be ACPI). This is used together with a new 'translate' method
that is the pendent of 'xlate'.

We convert irq_create_of_mapping to use this new structure (with a
small hack that will be removed later).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-5-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:23 +02:00
Marc Zyngier
130b8c6c8d irqdomain: Allow irq domain lookup by fwnode
So far, our irq domains are still looked up by device node.
Let's change this and allow a domain to be looked up using
a fwnode_handle pointer.

The existing interfaces are preserved with a couple of helpers.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-4-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:23 +02:00
Marc Zyngier
f110711a60 irqdomain: Convert irqdomain-%3Eof_node to fwnode
Now that we have everyone accessing the of_node field via the
irq_domain_get_of_node accessor, it is pretty easy to swap it
for a pointer to a fwnode_handle.

This translates into a few limited changes in __irq_domain_add,
and an updated irq_domain_get_of_node.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-3-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:23 +02:00
Marc Zyngier
5d4c9bc776 irqdomain: Use irq_domain_get_of_node() instead of direct field access
The struct irq_domain contains a "struct device_node *" field
(of_node) that is almost the only link between the irqdomain
and the device tree infrastructure.

In order to prepare for the removal of that field, convert all
users to use irq_domain_get_of_node() instead.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:23 +02:00
Thomas Gleixner
e50226b4b8 Merge branch 'linus' into irq/core
Bring in upstream updates for patches which depend on them
2015-10-13 19:00:14 +02:00
Ingo Molnar
b9f27c0f4f Linux 4.3-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWGqXxAAoJEHm+PkMAQRiGP80H+wX+0PTYpCNaH09LpvlRVZvM
 vG3RnqIYQ8cyOrJJEoVqLznm4vRmw3GrbHxhzDEVrX1beXNVRJdqyZUOttQuDuyS
 A/NWGaZCcu45VcL08NVMCqYv9D7HwDxe5WOhXigo2QX4nlbmJsBoU24ibV35nGYT
 2xdhEwHIH3X+qlzp8Mya6mheYHO+eZ6C+jSy7dYjQoXto0Acz6SoGC6/lhsV4biw
 ENRTJY7y+wzG6ND2PrjF6QV0SwCDAU/f7KcYe01+wm74/uCLYgQuUOPRUeu6ydfN
 Li6CNwN8NzcimLTF4zmQWBte8SkQDVM9LeC8Eyoz2aUYzq7hf6fTfihmQSZnQtQ=
 =vDtg
 -----END PGP SIGNATURE-----

Merge tag 'v4.3-rc5' into timers/core, to pick up fixes before applying new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 09:51:18 +02:00
Daniel Bristot de Oliveira
9babcd7929 sched, tracing: Stop/start critical timings around the idle=poll idle loop
When using idle=poll, the preemptoff tracer is always showing
the idle task as the culprit for long latencies. That happens
because critical timings are not stopped before idle loop. This
patch stops critical timings before entering the idle loop,
starting it again after the idle loop.

This problem does not affect the irqsoff tracer because
interruptions are enabled before entering the idle loop.

Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Reviewed-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/10fc3705874aef11dbe152a068b591a7be1899b4.1444314899.git.bristot@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 09:45:25 +02:00
Rasmus Villemoes
9fc4468d54 timers: Use __fls in apply_slack()
In apply_slack(), find_last_bit() is applied to a bitmask consisting
of precisely BITS_PER_LONG bits. Since mask is non-zero, we might as
well eliminate the function call and use __fls() directly. On x86_64,
this shaves 23 bytes of the only caller, mod_timer().

This also gets rid of Coverity CID 1192106, but that is a false
positive: Coverity is not aware that mask != 0 implies that
find_last_bit will not return BITS_PER_LONG.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/1443771931-6284-1-git-send-email-linux@rasmusvillemoes.dk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-11 22:13:46 +02:00
Guillaume Gomez
cfed432d7f clocksource: Remove return statement from void functions
Signed-off-by: Guillaume Gomez <guillaume1.gomez@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/CAAOQCfSDgmqSWDBsetau%2ByF8x0%2BDagCF_pfFw0p5xH_BKkKEog@mail.gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-11 22:13:46 +02:00
Linus Torvalds
9a78f9c3c6 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
 "Fix a long standing state race in finish_task_switch()"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Fix TASK_DEAD race in finish_task_switch()
2015-10-11 10:24:32 -07:00
Thomas Gleixner
e9849777d0 genirq: Add flag to force mask in disable_irq[_nosync]()
If an irq chip does not implement the irq_disable callback, then we
use a lazy approach for disabling the interrupt. That means that the
interrupt is marked disabled, but the interrupt line is not
immediately masked in the interrupt chip. It only becomes masked if
the interrupt is raised while it's marked disabled. We use this to avoid
possibly expensive mask/unmask operations for common case operations.

Unfortunately there are devices which do not allow the interrupt to be
disabled easily at the device level. They are forced to use
disable_irq_nosync(). This can result in taking each interrupt twice.

Instead of enforcing the non lazy mode on all interrupts of a irq
chip, provide a settings flag, which can be set by the driver for that
particular interrupt line.

Reported-and-tested-by: Duc Dang <dhdang@apm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1510092348370.6097@nanos
2015-10-11 11:33:42 +02:00
Feng Wu
fcf1ae2f7a genirq: Make irq_set_vcpu_affinity available for CONFIG_SMP=n
irq_set_vcpu_affinity() is needed when CONFIG_SMP=n, so move the
definition out of "#ifdef CONFIG_SMP"

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
Cc: jiang.liu@linux.intel.com
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1443860438-144926-1-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-09 22:47:27 +02:00
Mika Westerberg
e509bd7da1 genirq: Allow migration of chained interrupts by installing default action
When a CPU is offlined all interrupts that have an action are migrated to
other still online CPUs. However, if the interrupt has chained handler
installed this is not done. Chained handlers are used by GPIO drivers which
support interrupts, for instance.

When the affinity is not corrected properly we end up in situation where
most interrupts are not arriving to the online CPUs anymore. For example on
Intel Braswell system which has SD-card card detection signal connected to
a GPIO the IO-APIC routing entries look like below after CPU1 is offlined:

  pin30, enabled , level, low , V(52), IRR(0), S(0), logical , D(03), M(1)
  pin31, enabled , level, low , V(42), IRR(0), S(0), logical , D(03), M(1)
  pin32, enabled , level, low , V(62), IRR(0), S(0), logical , D(03), M(1)
  pin5b, enabled , level, low , V(72), IRR(0), S(0), logical , D(03), M(1)

The problem here is that the destination mask still contains both CPUs even
if CPU1 is already offline. This means that the IO-APIC still routes
interrupts to the other CPU as well.

We solve the problem by providing a default action for chained interrupts.
This action allows the migration code to correct affinity (as it finds
desc->action != NULL).

Also make the default action handler to emit a warning if for some reason a
chained handler ends up calling it.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1444039935-30475-1-git-send-email-mika.westerberg@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-09 22:47:27 +02:00
Arnd Bergmann
e3096c9c7c genirq: Fix handle_bad_irq kerneldoc comment
A recent cleanup removed the 'irq' parameter from many functions, but
left the documentation for this in place for at least one function.

This removes it.

Fixes: bd0b9ac405 ("genirq: Remove irq argument from irq flow handlers")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: kbuild-all@01.org
Cc: Austin Schuh <austin@peloton-tech.com>
Cc: Santosh Shilimkar <ssantosh@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/5400000.cD19rmgWjV@wuerfel
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-09 17:17:30 +02:00
Arnd Bergmann
9d67dc5da5 genirq: Export handle_bad_irq
A cleanup of the omap gpio driver introduced a use of the
handle_bad_irq() function in a device driver that can be
a loadable module.

This broke the ARM allmodconfig build:

ERROR: "handle_bad_irq" [drivers/gpio/gpio-omap.ko] undefined!

This patch exports the handle_bad_irq symbol in order to
allow the use in modules.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Santosh Shilimkar <ssantosh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Austin Schuh <austin@peloton-tech.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/5847725.4IBopItaOr@wuerfel
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-09 17:17:30 +02:00