Pull scheduler updates from Ingo Molnar:
"The biggest change in this cycle is the rewrite of the main SMP load
balancing metric: the CPU load/utilization. The main goal was to make
the metric more precise and more representative - see the changelog of
this commit for the gory details:
9d89c257df ("sched/fair: Rewrite runnable load and utilization average tracking")
It is done in a way that significantly reduces complexity of the code:
5 files changed, 249 insertions(+), 494 deletions(-)
and the performance testing results are encouraging. Nevertheless we
need to keep an eye on potential regressions, since this potentially
affects every SMP workload in existence.
This work comes from Yuyang Du.
Other changes:
- SCHED_DL updates. (Andrea Parri)
- Simplify architecture callbacks by removing finish_arch_switch().
(Peter Zijlstra et al)
- cputime accounting: guarantee stime + utime == rtime. (Peter
Zijlstra)
- optimize idle CPU wakeups some more - inspired by Facebook server
loads. (Mike Galbraith)
- stop_machine fixes and updates. (Oleg Nesterov)
- Introduce the 'trace_sched_waking' tracepoint. (Peter Zijlstra)
- sched/numa tweaks. (Srikar Dronamraju)
- misc fixes and small cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
sched/deadline: Fix comment in enqueue_task_dl()
sched/deadline: Fix comment in push_dl_tasks()
sched: Change the sched_class::set_cpus_allowed() calling context
sched: Make sched_class::set_cpus_allowed() unconditional
sched: Fix a race between __kthread_bind() and sched_setaffinity()
sched: Ensure a task has a non-normalized vruntime when returning back to CFS
sched/numa: Fix NUMA_DIRECT topology identification
tile: Reorganize _switch_to()
sched, sparc32: Update scheduler comments in copy_thread()
sched: Remove finish_arch_switch()
sched, tile: Remove finish_arch_switch
sched, sh: Fold finish_arch_switch() into switch_to()
sched, score: Remove finish_arch_switch()
sched, avr32: Remove finish_arch_switch()
sched, MIPS: Get rid of finish_arch_switch()
sched, arm: Remove finish_arch_switch()
sched/fair: Clean up load average references
sched/fair: Provide runnable_load_avg back to cfs_rq
sched/fair: Remove task and group entity load when they are dead
sched/fair: Init cfs_rq's sched_entity load average
...
Here are the new patches for the driver core / sysfs for 4.3-rc1.
Very small number of changes here, all the details are in the shortlog,
nothing major happening at all this kernel release, which is nice to
see.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'driver-core-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here are the new patches for the driver core / sysfs for 4.3-rc1.
Very small number of changes here, all the details are in the
shortlog, nothing major happening at all this kernel release, which is
nice to see"
* tag 'driver-core-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
bus: subsys: update return type of ->remove_dev() to void
driver core: correct device's shutdown order
driver core: fix docbook for device_private.device
selftests: firmware: skip timeout checks for kernels without user mode helper
kernel, cpu: Remove bogus __ref annotations
cpu: Remove bogus __ref annotation of cpu_subsys_online()
firmware: fix wrong memory deallocation in fw_add_devm_name()
sysfs.txt: update show method notes about sprintf/snprintf/scnprintf usage
devres: fix devres_get()
Move the simulator bits into finish_arch_post_lock_switch() and
properly call __switch_to() from _switch_to().
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438783412-10990-1-git-send-email-cmetcalf@ezchip.com
[ Made it a delta to: fe363adb92 ("sched, tile: Remove finish_arch_switch"). ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This function can leak kernel stack data when the user siginfo_t has a
positive si_code value. The top 16 bits of si_code describe which fields
in the siginfo_t union are active, but they are treated inconsistently
between copy_siginfo_from_user32, copy_siginfo_to_user32 and
copy_siginfo_to_user.
copy_siginfo_from_user32 is called from rt_sigqueueinfo and
rt_tgsigqueueinfo in which the user has full control over the top 16 bits
of si_code.
This fixes the following information leaks:
x86: 8 bytes leaked when sending a signal from a 32-bit process to
itself. This leak grows to 16 bytes if the process uses x32.
(si_code = __SI_CHLD)
x86: 100 bytes leaked when sending a signal from a 32-bit process to
a 64-bit process. (si_code = -1)
sparc: 4 bytes leaked when sending a signal from a 32-bit process to a
64-bit process. (si_code = any)
parisc and s390 have similar bugs, but they are not vulnerable because
rt_[tg]sigqueueinfo have checks that prevent sending a positive si_code
to a different process. These bugs are also fixed for consistency.
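A minimal user-space sketch of the split described above, assuming the top
16 bits of si_code act as a selector for the active union fields and the low
16 bits carry the code proper. The SK_* constants and helper names are
hypothetical and exist only for illustration; they are not copied from the
kernel headers.

```c
#include <stdint.h>

/* Hypothetical constants, for illustration only. */
#define SK_SI_MASK 0xffff0000u	/* top 16 bits: which union fields are active */
#define SK_SI_CHLD (4u << 16)	/* example selector value */

/* Selector part of si_code (top 16 bits). */
static uint32_t sk_si_layout(int32_t si_code)
{
	return (uint32_t)si_code & SK_SI_MASK;
}

/* Code proper (low 16 bits). */
static uint32_t sk_si_low(int32_t si_code)
{
	return (uint32_t)si_code & ~SK_SI_MASK;
}
```

A caller-controlled value such as SK_SI_CHLD | 1 picks the layout by itself,
which is why the copy helpers need to agree on how they interpret those bits.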
Signed-off-by: Amanieu d'Antras <amanieu@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Its return value is not used by the subsys core, and nothing meaningful
could be done with it even if we wanted to use it: the subsys device is
getting removed anyway.
Update the prototype of ->remove_dev() so that it returns void, and fix
all usage sites as well.
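A rough sketch of the resulting callback shape, written as a stand-alone
mock-up rather than the real driver-core structures. Every sk_* name is
hypothetical; only the idea that the add callback can fail while the remove
callback returns void comes from the text above.

```c
#include <stddef.h>

struct sk_device {
	int id;
	int attached;
};

struct sk_subsys_interface {
	int  (*add_dev)(struct sk_device *dev);		/* may fail, so returns int */
	void (*remove_dev)(struct sk_device *dev);	/* nothing to report: void */
};

static int sk_add(struct sk_device *dev)
{
	dev->attached = 1;
	return 0;
}

static void sk_remove(struct sk_device *dev)
{
	dev->attached = 0;
}

/* The core detaches every device regardless of what remove_dev() does. */
static void sk_unregister(const struct sk_subsys_interface *sif,
			  struct sk_device *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		sif->remove_dev(&devs[i]);
}
```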
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the simulator bits into switch_to() and use
finish_arch_post_lock_switch() for the homecache migration bits.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We were previously using free_bootmem() and just getting lucky
that nothing too bad happened.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: stable@vger.kernel.org
Pull more vfs updates from Al Viro:
"Assorted VFS fixes and related cleanups (IMO the most interesting in
that part are f_path-related things and Eric's descriptor-related
stuff). UFS regression fixes (it got broken last cycle). 9P fixes.
fs-cache series, DAX patches, Jan's file_remove_suid() work"
[ I'd say this is much more than "fixes and related cleanups". The
file_table locking rule change by Eric Dumazet is a rather big and
fundamental update even if the patch isn't huge. - Linus ]
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
9p: cope with bogus responses from server in p9_client_{read,write}
p9_client_write(): avoid double p9_free_req()
9p: forgetting to cancel request on interrupted zero-copy RPC
dax: bdev_direct_access() may sleep
block: Add support for DAX reads/writes to block devices
dax: Use copy_from_iter_nocache
dax: Add block size note to documentation
fs/file.c: __fget() and dup2() atomicity rules
fs/file.c: don't acquire files->file_lock in fd_install()
fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
vfs: avoid creation of inode number 0 in get_next_ino
namei: make set_root_rcu() return void
make simple_positive() public
ufs: use dir_pages instead of ufs_dir_pages()
pagemap.h: move dir_pages() over there
remove the pointless include of lglock.h
fs: cleanup slight list_entry abuse
xfs: Correctly lock inode when removing suid and file capabilities
fs: Call security_ops->inode_killpriv on truncate
fs: Provide function telling whether file_remove_privs() will do anything
...
Merge tag 'module-misc-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull init.h/module.h fragility fixes from Paul Gortmaker:
"Fixup various init.h misuses that are fragile wrt code moving to
module.h
What started as a removal of no longer required include <linux/init.h>
due to the earlier __cpuinit and __devinit removal led to the
observation that some module specific support was living in init.h
itself, thus preventing the full removal from introducing compile
regressions.
This series includes a few final fixups needed prior to the relocation
of the modular init code from <init.h> to <module.h>. These are
things that weren't easily categorized into any of the other previous
series categories already requested for pull.
That said, each fixup branch (including this one) is independent and
there are no ordering constraints. Only the final code relocation
(which is NOT in this pull) requires that all my cleanup branches be
merged first"
* tag 'module-misc-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
tile: add init.h to usb.c to avoid compile failure
arm: fix implicit #include <linux/init.h> in entry asm.
x86: replace __init_or_module with __init in non-modular vsmp_64.c
Pull arch/tile updates from Chris Metcalf:
"These are a grab bag of changes to improve debugging and respond to a
variety of issues raised on LKML over the last couple of months"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: avoid a "label not used" warning in do_page_fault()
tile: vdso: use raw_read_seqcount_begin() in vdso
tile: force CONFIG_TILEGX if ARCH != tilepro
tile: improve stack backtrace
tile: fix "odd fault" warning for stack backtraces
tile: set up initial stack top to honor STACK_TOP_DELTA
tile: support delivering NMIs for multicore backtrace
drivers/tty/hvc/hvc_tile.c: properly return -EAGAIN
tile: add <asm/word-at-a-time.h> and enable support functions
tile: use READ_ONCE() in arch_spin_is_locked()
tile: modify arch_spin_unlock_wait() semantics
Pending header cleanups will reveal that this file uses the
init.h content implicitly, with the following failure:
arch/tile/kernel/usb.c:69:1: warning: data definition has no type or storage class [enabled by default]
arch/tile/kernel/usb.c:69:1: error: type defaults to 'int' in declaration of 'arch_initcall'
arch/tile/kernel/usb.c:69:1: warning: parameter names (without types) in function declaration [enabled by default]
arch/tile/kernel/usb.c:62:19: warning: 'tilegx_usb_init' defined but not used
Explicitly add init.h to get arch_initcall and avoid this.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Chris Metcalf <cmetcalf@ezchip.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Previously we were using read_seqcount_begin(), which works fine until
lockdep is enabled in the kernel, at which point lockdep locking shows
up in the vdso and userspace will take a GPV accessing a kernel-only
SPR when calling gettimeofday() etc.
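For illustration, a user-space analogy of the lockless read loop that a vdso
time reader relies on. This is a sketch built on C11 atomics, not the
kernel's seqcount implementation, and all sk_* names are hypothetical. The
point is that the read side consists of plain loads and a retry, so it only
works from userspace as long as nothing else (such as lock debugging) is
hooked into it.

```c
#include <stdatomic.h>
#include <stdint.h>

struct sk_seq_time {
	atomic_uint seq;	/* odd while a writer is mid-update */
	_Atomic uint64_t sec;	/* payload guarded by seq */
};

/* Writer side: bump to odd, update, bump back to even. */
static void sk_write(struct sk_seq_time *t, uint64_t sec)
{
	atomic_fetch_add_explicit(&t->seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&t->sec, sec, memory_order_relaxed);
	atomic_fetch_add_explicit(&t->seq, 1, memory_order_release);
}

/* Reader side: loads only, no lock and no debug hooks. */
static uint64_t sk_read(struct sk_seq_time *t)
{
	unsigned start;
	uint64_t sec;

	do {
		/* Wait for an even count, i.e. no writer in progress. */
		while ((start = atomic_load_explicit(&t->seq,
						     memory_order_acquire)) & 1u)
			;
		sec = atomic_load_explicit(&t->sec, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		/* Retry if a writer ran while we were reading. */
	} while (atomic_load_explicit(&t->seq, memory_order_relaxed) != start);

	return sec;
}
```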
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
This commit fixes a number of issues with the tile backtrace code.
- Don't try to identify userspace shared object or executable paths
if we are doing a backtrace from an interrupt; it's not legal,
and also unlikely to be interesting. Likewise, don't try to do
it for other address spaces, since d_path() assumes it is being
called in "current" context.
- Move "in_backtrace" from thread_struct to thread_info.
This way we can access it even if our stack thread_info has been
clobbered, which makes backtracing more robust.
- Avoid using "current" directly when testing for is_sigreturn().
Since "current" may be corrupt, we're better off using kbt->task
explicitly to look up the vdso_base for the current task.
Conveniently, this simplifies the internal APIs (we only need
one is_sigreturn() function now).
- Avoid bogus "Odd fault" warning when pc/sp/ex1 are all zero,
as is true for kernel threads above the last frame.
- Hook into Tejun Heo's dump_stack() framework in lib/dump_stack.c.
- Write last entry in save_stack_trace() as ULONG_MAX, not zero,
since ftrace (at least) relies on finding that marker.
- Implement save_stack_trace_regs() and save_stack_trace_user(),
and set CONFIG_USER_STACKTRACE_SUPPORT.
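The ULONG_MAX end marker mentioned in the list above can be sketched roughly
as follows. This is a stand-alone illustration with hypothetical sk_* names
and a simplified trace buffer, not the tile save_stack_trace() code.

```c
#include <limits.h>
#include <stddef.h>

struct sk_trace {
	unsigned long *entries;
	size_t nr;	/* entries used so far */
	size_t max;	/* capacity of entries[] */
};

/* Append one entry if there is room. */
static void sk_trace_push(struct sk_trace *t, unsigned long v)
{
	if (t->nr < t->max)
		t->entries[t->nr++] = v;
}

/* Close the trace with the ULONG_MAX marker instead of zero. */
static void sk_trace_end(struct sk_trace *t)
{
	sk_trace_push(t, ULONG_MAX);
}

/* Consumer side: count entries up to the marker (or the used length). */
static size_t sk_trace_depth(const struct sk_trace *t)
{
	size_t n = 0;

	while (n < t->nr && t->entries[n] != ULONG_MAX)
		n++;
	return n;
}
```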
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
We were setting ex1 in new kernel threads to KERNEL_PL.
But since we just do a simple context-switch, not an iret,
any value set here is ignored anyway, and its presence causes
stack backtraces to end with a warning about an "odd fault".
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
For some reason this was never changed to match the rest of the
code where we always initialize the kernel sp 64 bytes below
the top of the page. This is generally harmless, but it does
mean that if you do a dump_stack() early on in kernel boot you
see a bogus warning about stack overrun.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
A new hypervisor service was added some time ago (MDE 4.2.1 or
later, or MDE 4.3 or later) that allows cores to request NMIs
to be delivered to other cores. Use this facility to deliver
a request that causes a backtrace to be generated on each core,
and hook it into the magic SysRq functionality.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
The code accidentally used cpu_isset() previously in one place
(though properly node_isset() elsewhere).
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
functions, prompted by their mis-use in staging.
With these functions removed, all cpu functions should only iterate to
nr_cpu_ids, so we finally only allocate that many bits when cpumasks
are allocated offstack.
Thanks,
Rusty.
Merge tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell:
"This is the final removal (after several years!) of the obsolete
cpus_* functions, prompted by their mis-use in staging.
With these functions removed, all cpu functions should only iterate to
nr_cpu_ids, so we finally only allocate that many bits when cpumasks
are allocated offstack"
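As a rough illustration of the saving, the helper below computes how many
bytes a bitmap needs when it is sized to a given number of bits, rounded up
to whole longs. The helper name and the example figures (a compile-time
maximum of 4096 CPUs versus 8 possible CPUs) are assumptions made for this
sketch only.

```c
#include <limits.h>
#include <stddef.h>

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap of nbits, rounded up to whole longs. */
static size_t sk_mask_bytes(size_t nbits)
{
	size_t longs = (nbits + SK_BITS_PER_LONG - 1) / SK_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}
```

With those example figures, sk_mask_bytes(4096) is 512 bytes while
sk_mask_bytes(8) is a single long.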
* tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
cpumask: remove __first_cpu / __next_cpu
cpumask: resurrect CPU_MASK_CPU0
linux/cpumask.h: add typechecking to cpumask_test_cpu
cpumask: only allocate nr_cpumask_bits.
Fix weird uses of num_online_cpus().
cpumask: remove deprecated functions.
mips: fix obsolete cpumask_of_cpu usage.
x86: fix more deprecated cpu function usage.
ia64: remove deprecated cpus_ usage.
powerpc: fix deprecated CPU_MASK_CPU0 usage.
CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region.
staging/lustre/o2iblnd: Don't use cpus_weight
staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_
staging/lustre/ptlrpc: Do not use deprecated cpus_* functions
blackfin: fix up obsolete cpu function usage.
parisc: fix up obsolete cpu function usage.
tile: fix up obsolete cpu function usage.
arm64: fix up obsolete cpu function usage.
mips: fix up obsolete cpu function usage.
x86: fix up obsolete cpu function usage.
...
Pull arch/tile updates from Chris Metcalf:
"These are mostly nohz_full changes, plus a smattering of minor fixes
(notably a couple for ftrace)"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: nohz: warn if nohz_full uses hypervisor shared cores
tile: ftrace: fix function_graph tracer issues
tile: map data region shadow of kernel as R/W
tile: support CONTEXT_TRACKING and thus NOHZ_FULL
tile: support arch_irq_work_raise
arch: tile: fix null pointer dereference on pt_regs pointer
tile/elf: reorganize notify_exec()
tile: use si_int instead of si_ptr for compat_siginfo
The "hypervisor shared" cores are ones that the Tilera hypervisor
uses to receive interrupts to manage hypervisor-owned devices.
It's a bad idea to try to use those cores with nohz_full, since
they will get interrupted unpredictably -- and invisibly to Linux
tracing tools, since the interrupts are delivered at a higher
privilege level to the Tilera hypervisor.
Generate a clear warning at boot up that this doesn't end well
for the nohz_full cores in question.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
- Add support for ARCH_SUPPORTS_FTRACE_OPS
- Replace the instruction in ftrace_call with the bundle {move r10, lr;
jal ftrace_stub}, so that the lr contains the right value after returning
from ftrace_stub. An alternative fix might be to leave the instruction
in ftrace_call alone when it is being updated with ftrace_stub.
Signed-off-by: Tony Lu <zlu@ezchip.com>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Add the TIF_NOHZ flag appropriately.
Add call to user_exit() on entry to do_work_pending() and on entry
to syscalls via do_syscall_trace_enter(), and also the top of
do_syscall_trace_exit() just because it's done in x86.
Add call to user_enter() at the bottom of do_work_pending() once we
have no more work to do before returning to userspace.
Wrap all the trap code in exception_enter() / exception_exit().
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Tile includes a hypervisor hook to deliver messages to arbitrary
tiles, so we can use that to raise an interrupt as soon as
possible on our own core. Unfortunately the Tilera hypervisor
disabled that support on principle in previous releases, but
it will be available in MDE 4.3.4 and later.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cppcheck reports the following issue:
[arch/tile/kernel/stack.c:116]: (error) Possible null
pointer dereference: p
In this case, when reporting an odd fault, p is set to NULL
and immediately afterwards p is dereferenced iff
!kbt->profile is false. Rather than doing this check, just
return NULL instead of falling through to the potential
null pointer dereference (since the originally intended
outcome would be to return NULL anyhow) for this odd fault
case.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> [tweaked lightly]
To be compatible with the generic get_compat_sigevent(), the
copy_siginfo_to_user32() and thus copy_siginfo_from_user32()
have to use si_int instead of si_ptr. Using si_ptr means that
for the case of ILP32 compat code running in big-endian mode,
we would end up copying the high 32 bits of the pointer value
into si_int instead of the desired low 32 bits.
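A small worked example of the effect, assuming the int member overlays the
first four bytes of a 64-bit pointer slot. The big-endian layout is modeled
by hand so the sketch behaves the same on any host; all sk_* names are
hypothetical.

```c
#include <stdint.h>

/* Lay out a 64-bit value in big-endian byte order. */
static void sk_store_be64(unsigned char out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Read the 32-bit big-endian word that starts at p. */
static uint32_t sk_load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* What an int overlaying the start of a 64-bit pointer slot would hold. */
static uint32_t sk_int_view_of_ptr(uint64_t ptr_bits)
{
	unsigned char slot[8];

	sk_store_be64(slot, ptr_bits);
	return sk_load_be32(slot);
}
```

For a 32-bit payload such as 0x12345678 stored through the pointer member,
the int view reads zero, i.e. the high half rather than the payload.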
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Pull exec domain removal from Richard Weinberger:
"This series removes execution domain support from Linux.
The idea behind exec domains was to support different ABIs. The
feature was never complete nor stable. Let's rip it out and make the
kernel signal handling code less complicated"
* 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits)
arm64: Removed unused variable
sparc: Fix execution domain removal
Remove rest of exec domains.
arch: Remove exec_domain from remaining archs
arc: Remove signal translation and exec_domain
xtensa: Remove signal translation and exec_domain
xtensa: Autogenerate offsets in struct thread_info
x86: Remove signal translation and exec_domain
unicore32: Remove signal translation and exec_domain
um: Remove signal translation and exec_domain
tile: Remove signal translation and exec_domain
sparc: Remove signal translation and exec_domain
sh: Remove signal translation and exec_domain
s390: Remove signal translation and exec_domain
mn10300: Remove signal translation and exec_domain
microblaze: Remove signal translation and exec_domain
m68k: Remove signal translation and exec_domain
m32r: Remove signal translation and exec_domain
m32r: Autogenerate offsets in struct thread_info
frv: Remove signal translation and exec_domain
...
As execution domain support is gone we can remove
signal translation from the signal code and remove
exec_domain from thread_info.
Signed-off-by: Richard Weinberger <richard@nod.at>
In preparation for adding another tkr field, rename this one to
tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base,
since the structure is not specific to CLOCK_MONOTONIC and the mono
name got added to the tk_read_base instance.
Lots of trivial churn.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Previously, pci_scan_root_bus() created a root PCI bus, enumerated the
devices on it, and called pci_bus_add_devices(), which made the devices
available for drivers to claim them.
Most callers assigned resources to devices after pci_scan_root_bus()
returned, which may have been after drivers had claimed the devices. This is
incorrect; the PCI core should not change device resources while a driver
is managing the device.
Remove pci_bus_add_devices() from pci_scan_root_bus() and do it after any
resource assignment in the callers.
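The intended caller-side ordering can be sketched with stand-in functions.
The sk_* stubs below only record the order of the steps; they are
hypothetical placeholders for the PCI core calls named in this changelog and
take no real arguments.

```c
#include <string.h>

static char sk_log[64];

/* Record one step in a comma-separated log. */
static void sk_step(const char *name)
{
	if (sk_log[0])
		strcat(sk_log, ",");
	strcat(sk_log, name);
}

/* Stand-ins for the PCI core calls named in the text. */
static void sk_pci_scan_root_bus(void)        { sk_step("scan"); }
static void sk_pci_bus_assign_resources(void) { sk_step("assign"); }
static void sk_pci_bus_add_devices(void)      { sk_step("add"); }

/* Caller-side order after this change. */
static void sk_arch_pci_init(void)
{
	sk_pci_scan_root_bus();		/* enumerate only; drivers cannot bind yet */
	sk_pci_bus_assign_resources();	/* no driver owns the devices at this point */
	sk_pci_bus_add_devices();	/* now make the devices available to drivers */
}
```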
Note that ARM's pci_common_init_dev() already called pci_bus_add_devices()
after pci_scan_root_bus(), so we only need to remove the first call:
pci_common_init_dev
pcibios_init_hw
pci_scan_root_bus
pci_bus_add_devices # first call
pci_bus_assign_resources
pci_bus_add_devices # second call
[bhelgaas: changelog, drop "root_bus" var in alpha common_init_pci(),
return failure earlier in mn10300, add "return" in x86 pcibios_scan_root(),
return early if xtensa platform_pcibios_fixup() fails]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
CC: Matt Turner <mattst88@gmail.com>
CC: David Howells <dhowells@redhat.com>
CC: Tony Luck <tony.luck@intel.com>
CC: Michal Simek <monstr@monstr.eu>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Chris Metcalf <cmetcalf@ezchip.com>
CC: Chris Zankel <chris@zankel.net>
CC: Max Filippov <jcmvbkbc@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.
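In kernel code this would look something like
pr_info("cpus: %*pbl\n", cpumask_pr_args(mask)). Since that cannot run
outside the kernel, the following is a user-space emulation of the list
style of output for a single-word bitmap. The function name and signature
are hypothetical; it is a sketch of the output format, not the kernel's
formatter.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format the set bits among the low nbits of mask as a range list such as
 * "0-3,8". nbits must not exceed the number of bits in unsigned long.
 * Returns the number of characters written (excluding the terminator).
 */
static int sk_format_bit_list(char *buf, size_t size,
			      unsigned long mask, unsigned nbits)
{
	size_t len = 0;
	unsigned i = 0;

	if (size)
		buf[0] = '\0';
	while (i < nbits) {
		unsigned start;
		int n;

		if (!((mask >> i) & 1UL)) {
			i++;
			continue;
		}
		/* Extend the run while the next bit is also set. */
		start = i;
		while (i + 1 < nbits && ((mask >> (i + 1)) & 1UL))
			i++;
		if (start == i)
			n = snprintf(buf + len, size - len, "%s%u",
				     len ? "," : "", start);
		else
			n = snprintf(buf + len, size - len, "%s%u-%u",
				     len ? "," : "", start, i);
		if (n < 0 || (size_t)n >= size - len)
			break;	/* output truncated: stop here */
		len += (size_t)n;
		i++;
	}
	return (int)len;
}
```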
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If an attacker can cause a controlled kernel stack overflow, overwriting
the restart block is a very juicy exploit target. This is because the
restart_block is held in the same memory allocation as the kernel stack.
Moving the restart block to struct task_struct prevents this exploit by
making the restart_block harder to locate.
Note that there are other fields in thread_info that are also easy
targets, at least on some architectures.
It's also a decent simplification, since the restart code is more or less
identical on all architectures.
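A simplified layout sketch of the before and after, assuming an 8 KiB stack
allocation that begins with thread_info. All sk_* types and the size are
hypothetical stand-ins, not the real kernel structures.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_THREAD_SIZE 8192	/* hypothetical stack allocation size */

struct sk_restart_block {
	long (*fn)(void *);
};

/* Old layout: the restart block sits inside the stack allocation. */
struct sk_thread_info_old {
	unsigned long flags;
	struct sk_restart_block restart_block;
};

union sk_thread_union {
	struct sk_thread_info_old info;
	unsigned char stack[SK_THREAD_SIZE];
};

/* New layout: the restart block lives in a separately allocated task. */
struct sk_task {
	union sk_thread_union *stack;
	struct sk_restart_block restart_block;
};

/* Does p point into the stack allocation of task t? */
static int sk_in_stack_alloc(const struct sk_task *t, const void *p)
{
	uintptr_t base = (uintptr_t)t->stack->stack;
	uintptr_t q = (uintptr_t)p;

	return q >= base && q - base < SK_THREAD_SIZE;
}
```

In the old layout the restart block is at a fixed offset inside the same
allocation as the stack; in the new one it is not part of that allocation.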
[james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Richard Weinberger <richard@nod.at>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace a magic number with a PCI #define symbol.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Chris Metcalf <cmetcalf@ezchip.com>
Nothing needs the module pointer any more, and the next patch will
call it from RCU, where the module itself might no longer exist.
Removing the arg is the safest approach.
This just codifies the use of the module_alloc/module_free pattern
which ftrace and bpf use.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: x86@kernel.org
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linux-cris-kernel@axis.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: nios2-dev@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: netdev@vger.kernel.org
Archs have been abusing module_free() to clean up their arch-specific
allocations. Since module_free() is also (ab)used by BPF and trace code,
let's keep it to simple allocations, and provide a hook called before
that.
This means that avr32, ia64, parisc and s390 no longer need to implement
their own module_free() at all. avr32 doesn't need module_finalize()
either.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Pull arch/tile updates from Chris Metcalf:
"Note that one of the changes converts my old cmetcalf@tilera.com email
in MAINTAINERS to the cmetcalf@ezchip.com email that you see on this
email"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: update MAINTAINERS email to EZchip
tile: avoid undefined behavior with regs[TREG_TP] etc
arch: tile: kernel: kgdb.c: Use memcpy() instead of pointer copy one by one
tile: Use the more common pr_warn instead of pr_warning
arch: tile: gxio: Export symbols for module using in 'mpipe.c'
arch: tile: kernel: signal.c: Use __copy_from/to_user() instead of __get/put_user()
Use the more common pr_warn.
Coalesce formats, realign arguments.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Coalesce the formats and align arguments.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eliminate the unlikely possibility of message interleaving for
early_printk/early_vprintk use.
early_vprintk can be done via the %pV extension so remove this
unnecessary function and change early_printk to have the equivalent
vprintk code.
All uses of early_printk already end with a newline so also remove the
unnecessary newline from the early_printk function.
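A user-space analogy of the idea, assuming the goal is to format the whole
message once and hand it to the console in a single call so that pieces from
different callers cannot interleave. The kernel does this through the %pV
extension; here a local buffer and vsnprintf() stand in for it, and all
sk_* names are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

static char sk_last[256];	/* last string handed to the "console" */
static int sk_emits;		/* number of console calls so far */

/* Stand-in for the low-level console write. */
static void sk_console_emit(const char *s)
{
	snprintf(sk_last, sizeof(sk_last), "%s", s);
	sk_emits++;
}

/* Format the complete message first, then emit it in one call. */
static void sk_early_printk(const char *fmt, ...)
{
	char buf[256];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	sk_console_emit(buf);
}
```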
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The PCI/MSI irq chip callbacks mask/unmask_msi_irq have been renamed
to pci_msi_mask/unmask_irq to mark them PCI specific. Rename all usage
sites. The conversion helper functions are kept around to avoid
conflicts in next and will be removed after merging into mainline.
Coccinelle assisted conversion. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: x86@kernel.org
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Mohit Kumar <mohit.kumar@st.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI
specific.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Not only is memcpy() faster than copying word by word through a pointer,
it also makes the code clearer and simpler, and avoids a compiler warning
(the original implementation copies registers past the end of the member
array).
The related warning (with allmodconfig under tile):
CC arch/tile/kernel/kgdb.o
arch/tile/kernel/kgdb.c: In function 'sleeping_thread_to_gdb_regs':
arch/tile/kernel/kgdb.c:140:31: warning: iteration 53u invokes undefined behavior [-Waggressive-loop-optimizations]
*(ptr++) = thread_regs->regs[reg];
^
arch/tile/kernel/kgdb.c:139:2: note: containing loop
for (reg = 0; reg <= TREG_LAST_GPR; reg++)
^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
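The idea behind the memcpy() change above can be sketched in plain C. This
is an illustration under stated assumptions, not the tile code: struct
fake_pt_regs, its four-entry regs[] array and regs_to_buffer() are
hypothetical stand-ins. Indexing regs[] beyond its last element to reach
the members that follow it is undefined behavior, whereas copying the
structure as one block is well defined.

```c
#include <string.h>

/* Hypothetical register layout; the sizes are illustrative only. */
#define FAKE_NUM_GPRS 4

struct fake_pt_regs {
	unsigned long regs[FAKE_NUM_GPRS];	/* general-purpose registers */
	unsigned long tp;			/* members that follow the array */
	unsigned long sp;
	unsigned long lr;
};

#define FAKE_NUM_WORDS (sizeof(struct fake_pt_regs) / sizeof(unsigned long))

/*
 * Copy every register word into dst in one block. dst must have room
 * for FAKE_NUM_WORDS words. No element of regs[] is indexed out of
 * bounds, so the compiler has nothing to warn about.
 */
void regs_to_buffer(unsigned long *dst, const struct fake_pt_regs *src)
{
	memcpy(dst, src, sizeof(*src));
}
```

Because every member is an unsigned long, the structure has no padding and
the destination buffer ends up holding the registers in declaration order.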
And other message logging neatening.
Other miscellanea:
o coalesce formats
o realign arguments
o standardize a couple of macros
o use __func__ instead of embedding the function name
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
setup/restore_sigcontext() need to copy all related registers between
user and kernel space, so use a block copy instead of copying each
register individually. This makes the code simpler and clearer, and
avoids a compiler warning.
The related warning (with allmodconfig under tile):
CC arch/tile/kernel/signal.o
In file included from include/linux/poll.h:11:0,
from include/linux/ring_buffer.h:7,
from include/linux/ftrace_event.h:5,
from include/trace/syscall.h:6,
from include/linux/syscalls.h:81,
from arch/tile/kernel/signal.c:30:
arch/tile/kernel/signal.c: In function 'setup_sigcontext':
arch/tile/kernel/signal.c:116:31: warning: iteration 53u invokes undefined behavior [-Waggressive-loop-optimizations]
err |= __put_user(regs->regs[i], &sc->gregs[i]);
^
./arch/tile/include/asm/uaccess.h:236:26: note: in definition of macro '__put_user_asm'
: "r" (ptr), "r" (x), "i" (-EFAULT))
^
./arch/tile/include/asm/uaccess.h:297:10: note: in expansion of macro '__put_user_8'
case 8: __put_user_8(x, ptr, __ret); break; \
^
arch/tile/kernel/signal.c:116:10: note: in expansion of macro '__put_user'
err |= __put_user(regs->regs[i], &sc->gregs[i]);
^
arch/tile/kernel/signal.c:115:2: note: containing loop
for (i = 0; i < sizeof(struct pt_regs)/sizeof(long); ++i)
^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Pull percpu consistent-ops changes from Tejun Heo:
"Way back, before the current percpu allocator was implemented, static
and dynamic percpu memory areas were allocated and handled separately
and had their own accessors. The distinction has been gone for many
years now; however, the now duplicate two sets of accessors remained
with the pointer based ones - this_cpu_*() - evolving various other
operations over time. During the process, we also accumulated other
inconsistent operations.
This pull request contains Christoph's patches to clean up the
duplicate accessor situation. __get_cpu_var() uses are replaced with
this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().
Unfortunately, the former sometimes is tricky thanks to C being a bit
messy with the distinction between lvalues and pointers, which led to
a rather ugly solution for cpumask_var_t involving the introduction of
this_cpu_cpumask_var_ptr().
This converts most of the uses but not all. Christoph will follow up
with the remaining conversions in this merge window and hopefully
remove the obsolete accessors"
* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
irqchip: Properly fetch the per cpu offset
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
Revert "powerpc: Replace __get_cpu_var uses"
percpu: Remove __this_cpu_ptr
clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
sparc: Replace __get_cpu_var uses
avr32: Replace __get_cpu_var with __this_cpu_write
blackfin: Replace __get_cpu_var uses
tile: Use this_cpu_ptr() for hardware counters
tile: Replace __get_cpu_var uses
powerpc: Replace __get_cpu_var uses
alpha: Replace __get_cpu_var
ia64: Replace __get_cpu_var uses
s390: cio driver &__get_cpu_var replacements
s390: Replace __get_cpu_var uses
mips: Replace __get_cpu_var uses
MIPS: Replace __get_cpu_var uses in FPU emulator.
arm: Replace __this_cpu_ptr with raw_cpu_ptr
...
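The lvalue versus pointer distinction mentioned in the pull request above
can be modelled in userspace. The sketch below is only a model: every
model_* name is hypothetical, and a plain array indexed by a fake CPU
number stands in for real per-CPU storage. It shows why a conversion is
not always mechanical: the old-style accessor expands to an lvalue, while
the pointer-based one takes an address and yields a pointer.

```c
#define MODEL_NR_CPUS 4

/* Hypothetical: the CPU that the "current" code runs on in this model. */
static int model_cpu;

/* One slot per CPU stands in for a per-CPU variable. */
static long model_counter[MODEL_NR_CPUS];

/* Old style: the accessor expands to an lvalue. */
#define model_get_cpu_var(var)	((var)[model_cpu])

/* New style: the accessor takes an address and yields a pointer. */
#define model_this_cpu_ptr(ptr)	(&(*(ptr))[model_cpu])

void set_model_cpu(int cpu)
{
	model_cpu = cpu;
}

/* Old-style use: operate on the lvalue directly. */
void bump_old_style(void)
{
	model_get_cpu_var(model_counter)++;
}

/* New-style use: fetch a pointer first, then dereference it. */
void bump_new_style(void)
{
	long *p = model_this_cpu_ptr(&model_counter);

	(*p)++;
}

long read_counter(int cpu)
{
	return model_counter[cpu];
}
```

In this model both styles update the same slot; only the shape of the
expression at the call site differs.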
This change adds support for clock_gettime with CLOCK_REALTIME
and CLOCK_MONOTONIC using vDSO. It also updates the vdso
struct nomenclature used for the clocks to match the x86 code
to keep it easier to update going forward.
We also support the *_COARSE clockid_t, for apps that want speed
but aren't concerned about fine-grained timestamps; this saves
about 20 cycles per call (see http://lwn.net/Articles/342018/).
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: John Stultz <john.stultz@linaro.org>
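From userspace, the clocks named above are reached through the ordinary
clock_gettime() call; whether a particular call is served by the vDSO or
by a system call is decided by the C library and kernel, not by the caller.
A minimal sketch, assuming a Linux system whose libc exposes the *_COARSE
clock ids; the helper name read_clock_ns() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/*
 * Read the given clock and return its value in nanoseconds,
 * or -1 if clock_gettime() fails.
 */
int64_t read_clock_ns(clockid_t id)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return -1;
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

A caller that wants speed over resolution would pass CLOCK_REALTIME_COARSE
or CLOCK_MONOTONIC_COARSE instead of CLOCK_REALTIME or CLOCK_MONOTONIC.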