Commit Graph

1841 Commits

Author SHA1 Message Date
Richard Weinberger
0834f9cc9f um: Export pm_power_off
...modules are using this symbol.
Export it like all other archs to.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-05 22:17:52 +01:00
Richard Weinberger
322740efbb Revert "um: Fix get_signal() usage"
Commit db2f24dc24
was plain wrong. I did not realize the we are
allowed to loop here.
In fact we have to loop and must not return to userspace
before all SIGSEGVs have been delivered.
Other archs do this directly in their entry code, UML
does it here.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-05 22:16:40 +01:00
Nicolai Stange
012a4163be um: asm/page.h: remove the pte_high member from struct pte_t
Commit 16da306849 ("um: kill pfn_t") introduced a compile warning for
defconfig (SUBARCH=i386):

  arch/um/kernel/skas/mmu.c:38:206:
      warning: right shift count >= width of type [-Wshift-count-overflow]

Aforementioned patch changes the definition of the phys_to_pfn() macro
from

  ((pfn_t) ((p) >> PAGE_SHIFT))

to

  ((p) >> PAGE_SHIFT)

This effectively changes the phys_to_pfn() expansion's type from
unsigned long long to unsigned long.

Through the callchain init_stub_pte() => mk_pte(), the expansion of
phys_to_pfn() is (indirectly) fed into the 'phys' argument of the
pte_set_val(pte, phys, prot) macro, eventually leading to

  (pte).pte_high = (phys) >> 32;

This results in the warning from above.

Since UML only deals with 32 bit addresses, the upper 32 bits from
'phys' used to be always zero anyway.  Also, all page protection flags
defined by UML don't use any bits beyond bit 9.  Since the contents of a
PTE are defined within architecture scope only, the ->pte_high member
can be safely removed.

Remove the ->pte_high member from struct pte_t.
Rename ->pte_low to ->pte.
Adapt the pte helper macros in arch/um/include/asm/page.h.

Noteworthy is the pte_copy() macro where a smp_wmb() gets dropped.  This
write barrier doesn't seem to be paired with any read barrier though and
thus, was useless anyway.

Fixes: 16da306849 ("um: kill pfn_t")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-05 18:10:40 -08:00
Dan Williams
16da306849 um: kill pfn_t
The core has developed a need for a "pfn_t" type [1].  Convert the usage
of pfn_t by usermode-linux to an unsigned long, and update pfn_to_phys()
to drop its expectation of a typed pfn.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Linus Torvalds
33caf82acf Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "All kinds of stuff.  That probably should've been 5 or 6 separate
  branches, but by the time I'd realized how large and mixed that bag
  had become it had been too close to -final to play with rebasing.

  Some fs/namei.c cleanups there, memdup_user_nul() introduction and
  switching open-coded instances, burying long-dead code, whack-a-mole
  of various kinds, several new helpers for ->llseek(), assorted
  cleanups and fixes from various people, etc.

  One piece probably deserves special mention - Neil's
  lookup_one_len_unlocked().  Similar to lookup_one_len(), but gets
  called without ->i_mutex and tries to avoid ever taking it.  That, of
  course, means that it's not useful for any directory modifications,
  but things like getting inode attributes in nfds readdirplus are fine
  with that.  I really should've asked for moratorium on lookup-related
  changes this cycle, but since I hadn't done that early enough...  I
  *am* asking for that for the coming cycle, though - I'm going to try
  and get conversion of i_mutex to rwsem with ->lookup() done under lock
  taken shared.

  There will be a patch closer to the end of the window, along the lines
  of the one Linus had posted last May - mechanical conversion of
  ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
  inode_is_locked()/inode_lock_nested().  To quote Linus back then:

    -----
    |    This is an automated patch using
    |
    |        sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
    |        sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
    |        sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[     ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
    |        sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
    |        sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
    |
    |    with a very few manual fixups
    -----

  I'm going to send that once the ->i_mutex-affecting stuff in -next
  gets mostly merged (or when Linus says he's about to stop taking
  merges)"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  nfsd: don't hold i_mutex over userspace upcalls
  fs:affs:Replace time_t with time64_t
  fs/9p: use fscache mutex rather than spinlock
  proc: add a reschedule point in proc_readfd_common()
  logfs: constify logfs_block_ops structures
  fcntl: allow to set O_DIRECT flag on pipe
  fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
  fs: xattr: Use kvfree()
  [s390] page_to_phys() always returns a multiple of PAGE_SIZE
  nbd: use ->compat_ioctl()
  fs: use block_device name vsprintf helper
  lib/vsprintf: add %*pg format specifier
  fs: use gendisk->disk_name where possible
  poll: plug an unused argument to do_poll
  amdkfd: don't open-code memdup_user()
  cdrom: don't open-code memdup_user()
  rsxx: don't open-code memdup_user()
  mtip32xx: don't open-code memdup_user()
  [um] mconsole: don't open-code memdup_user_nul()
  [um] hostaudio: don't open-code memdup_user()
  ...
2016-01-12 17:11:47 -08:00
Mickaël Salaün
3e46b25376 um: Use race-free temporary file creation
Open the memory mapped file with the O_TMPFILE flag when available.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Acked-by: Tristan Schmelcher <tschmelcher@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:50 +01:00
Mickaël Salaün
571d2f0c34 um: Do not set unsecure permission for temporary file
Remove the insecure 0777 mode for temporary file to prohibit other users
to change the executable mapped code.

An attacker could gain access to the mapped file descriptor from the
temporary file (before it is unlinked) in a read-only mode but it should
not be accessible in write mode to avoid arbitrary code execution.

To not change the hostfs behavior, the temporary file creation
permission now depends on the current umask(2) and the implementation of
mkstemp(3).

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Acked-by: Tristan Schmelcher <tschmelcher@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:50 +01:00
Mickaël Salaün
c50b4659e4 um: Add seccomp support
This brings SECCOMP_MODE_STRICT and SECCOMP_MODE_FILTER support through
prctl(2) and seccomp(2) to User-mode Linux for i386 and x86_64
subarchitectures.

secure_computing() is called first in handle_syscall() so that the
syscall emulation will be aborted quickly if matching a seccomp rule.

This is inspired from Meredydd Luff's patch
(https://gerrit.chromium.org/gerrit/21425).

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: David Drysdale <drysdale@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Kees Cook <keescook@chromium.org>
2016-01-10 21:49:49 +01:00
Mickaël Salaün
d8f8b84456 um: Add full asm/syscall.h support
Add subarchitecture-independent implementation of asm-generic/syscall.h
allowing access to user system call parameters and results:
* syscall_get_nr()
* syscall_rollback()
* syscall_get_error()
* syscall_get_return_value()
* syscall_set_return_value()
* syscall_get_arguments()
* syscall_set_arguments()
* syscall_get_arch() provided by arch/x86/um/asm/syscall.h

This provides the necessary syscall helpers needed by
HAVE_ARCH_SECCOMP_FILTER plus syscall_get_error().

This is inspired from Meredydd Luff's patch
(https://gerrit.chromium.org/gerrit/21425).

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: David Drysdale <drysdale@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Kees Cook <keescook@chromium.org>
2016-01-10 21:49:49 +01:00
Mickaël Salaün
e04c989eb7 um: Fix ptrace GETREGS/SETREGS bugs
This fix two related bugs:
* PTRACE_GETREGS doesn't get the right orig_ax (syscall) value
* PTRACE_SETREGS can't set the orig_ax value (erased by initial value)

Get rid of the now useless and error-prone get_syscall().

Fix inconsistent behavior in the ptrace implementation for i386 when
updating orig_eax automatically update the syscall number as well. This
is now updated in handle_syscall().

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Thomas Meyer <thomas@m3y3r.de>
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Anton Ivanov <aivanov@brocade.com>
Cc: Meredydd Luff <meredydd@senatehouse.org>
Cc: David Drysdale <drysdale@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Kees Cook <keescook@chromium.org>
2016-01-10 21:49:48 +01:00
Anton Ivanov
8c6157b6b3 um: Update UBD to use pread/pwrite family of functions
This decreases the number of syscalls per read/write by half.

Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:48 +01:00
Anton Ivanov
470a166e8c um: Do not change hard IRQ flags in soft IRQ processing
Software IRQ processing in generic architectures assumes that the
exit out of hard IRQ may have re-enabled interrupts (some
architectures may have an implicit EOI). It presumes them enabled
and toggles the flags once more just in case unless this is turned
off in the architecture specific hardirq.h by setting
__ARCH_IRQ_EXIT_IRQS_DISABLED

This patch adds this to UML where due to the way IRQs are handled
it is an optimization (it works fine without it too).

Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:48 +01:00
Anton Ivanov
d5e3f5cbe5 um: Prevent IRQ handler reentrancy
The existing IRQ handler design in UML does not prevent reentrancy

This is mitigated by fd-enable/fd-disable semantics for the IO
portion of the UML subsystem. The timer, however, can and is
re-entered resulting in very deep stack usage and occasional
stack exhaustion.

This patch prevents this by checking if there is a timer
interrupt in-flight before processing any pending timer interrupts.

Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:47 +01:00
Vegard Nossum
0754fb298f uml: flush stdout before forking
I was seeing some really weird behaviour where piping UML's output
somewhere would cause output to get duplicated:

  $ ./vmlinux | head -n 40
  Checking that ptrace can change system call numbers...Core dump limits :
          soft - 0
          hard - NONE
  OK
  Checking syscall emulation patch for ptrace...Core dump limits :
          soft - 0
          hard - NONE
  OK
  Checking advanced syscall emulation patch for ptrace...Core dump limits :
          soft - 0
          hard - NONE
  OK
  Core dump limits :
          soft - 0
          hard - NONE

This is because these tests do a fork() which duplicates the non-empty
stdout buffer, then glibc flushes the duplicated buffer as each child
exits.

A simple workaround is to flush before forking.

Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 21:49:47 +01:00
Al Viro
793b796ebf [um] mconsole: don't open-code memdup_user_nul()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-04 10:29:45 -05:00
Al Viro
1ceb36285c [um] hostaudio: don't open-code memdup_user()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-04 10:29:40 -05:00
Geyslan G. Bem
887a985309 um: fix returns without va_end
When using va_list ensure that va_start will be followed by va_end.

Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-08 22:26:00 +01:00
Lorenzo Colitti
fb1770aa78 arch: um: fix error when linking vmlinux.
On gcc Ubuntu 4.8.4-2ubuntu1~14.04, linking vmlinux fails with:

arch/um/os-Linux/built-in.o: In function `os_timer_create':
/android/kernel/android/arch/um/os-Linux/time.c:51: undefined reference to `timer_create'
arch/um/os-Linux/built-in.o: In function `os_timer_set_interval':
/android/kernel/android/arch/um/os-Linux/time.c:84: undefined reference to `timer_settime'
arch/um/os-Linux/built-in.o: In function `os_timer_remain':
/android/kernel/android/arch/um/os-Linux/time.c:109: undefined reference to `timer_gettime'
arch/um/os-Linux/built-in.o: In function `os_timer_one_shot':
/android/kernel/android/arch/um/os-Linux/time.c:132: undefined reference to `timer_settime'
arch/um/os-Linux/built-in.o: In function `os_timer_disable':
/android/kernel/android/arch/um/os-Linux/time.c:145: undefined reference to `timer_settime'

This is because -lrt appears in the generated link commandline
after arch/um/os-Linux/built-in.o. Fix this by removing -lrt from
arch/um/Makefile and adding it to the UM-specific section of
scripts/link-vmlinux.sh.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-08 22:25:13 +01:00
Richard Weinberger
db2f24dc24 um: Fix get_signal() usage
If get_signal() returns us a signal to post
we must not call it again, otherwise the already
posted signal will be overridden.
Before commit a610d6e672 this was the case as we stopped
the while after a successful handle_signal().

Cc: <stable@vger.kernel.org> # 3.10-
Fixes: a610d6e672 ("pull clearing RESTORE_SIGMASK into block_sigmask()")
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-08 22:23:30 +01:00
Anton Ivanov
2eb5f31bc4 um: Switch clocksource to hrtimers
UML is using an obsolete itimer call for
all timers and "polls" for kernel space timer firing
in its userspace portion resulting in a long list
of bugs and incorrect behaviour(s). It also uses
ITIMER_VIRTUAL for its timer which results in the
timer being dependent on it running and the cpu
load.

This patch fixes this by moving to posix high resolution
timers firing off CLOCK_MONOTONIC and relaying the timer
correctly to the UML userspace.

Fixes:
 - crashes when hosts suspends/resumes
 - broken userspace timers - effecive ~40Hz instead
   of what they should be. Note - this modifies skas behavior
   by no longer setting an itimer per clone(). Timer events
   are relayed instead.
 - kernel network packet scheduling disciplines
 - tcp behaviour especially under load
 - various timer related corner cases

Finally, overall responsiveness of userspace is better.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Anton Ivanov <aivanov@brocade.com>
[rw: massaged commit message]
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 22:54:49 +01:00
Saurabh Sengar
e17c6d77b2 um: net: replace GFP_KERNEL with GFP_ATOMIC when spinlock is held
since GFP_KERNEL with GFP_ATOMIC while spinlock is held,
as code while holding a spinlock should be atomic.
GFP_KERNEL may sleep and can cause deadlock,
where as GFP_ATOMIC may fail but certainly avoids deadlockdex f70dd54..d898f6c 100644

Signed-off-by: Saurabh Sengar <saurabh.truth@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 22:51:00 +01:00
Richard Weinberger
70c8205f40 um: Report host OOM more nicely
If UML runs on the host side out of memory, report this
condition more nicely.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 22:49:12 +01:00
Richard Weinberger
f10e6d652b um: Get rid of open coded NR_SYSCALLS
We can use __NR_syscall_max.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 22:49:10 +01:00
Richard Weinberger
1d80f0cda1 um: Store syscall number after syscall_trace_enter()
To support changing syscall numbers we have to store
it after syscall_trace_enter().

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 22:49:09 +01:00
Richard Weinberger
44011b897a um: Define PTRACE_OLDSETOPTIONS
...such that processes within UML can do a ptrace(PTRACE_OLDSETOPTIONS, ...)

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 22:49:09 +01:00
Richard Weinberger
56b88a3bf9 um: Fix kernel mode fault condition
We have to exclude memory locations <= PAGE_SIZE from
the condition and let the kernel mode fault path catch it.
Otherwise a kernel NULL pointer exception will be reported
as a kernel user space access.

Fixes: d2313084e2 (um: Catch unprotected user memory access)
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-19 22:53:37 +02:00
Richard Weinberger
6b1873371c um: Fix waitpid() usage in helper code
If UML is executing a helper program it is using
waitpid() with the __WCLONE flag to wait for the program
as the helper is executed from a clone()'ed thread.
While using __WCLONE is perfectly fine for clone()'ed
childs it won't detect terminated childs if the helper
has issued an execve().

We have to use __WALL to wait for both clone()'ed and
regular childs to detect the termination before and
after an execve().

Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-19 22:53:37 +02:00
Richard Weinberger
0b5aedfe0e um: Fix out-of-tree build
Commit 30b11ee9a (um: Remove copy&paste code from init.h)
uncovered an issue wrt. out-of-tree builds.
For out-of-tree builds, we must not rely on relative paths.
Before 30b11ee9a it worked by chance as no host code included
generated header files.

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-19 22:53:36 +02:00
Linus Torvalds
30c44659f4 Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull strscpy string copy function implementation from Chris Metcalf.

Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.

The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.

strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result.  To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.

strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string.  Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated.  It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.

strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG.  It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.

So why did I waffle about this for so long?

Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.

And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.

So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches.  Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: use global strscpy() rather than private copy
  string: provide strscpy()
  Make asm/word-at-a-time.h available on all architectures
2015-10-04 16:31:13 +01:00
Linus Torvalds
5e359bf221 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "Rather large, but nothing exiting:

   - new range check for settimeofday() to prevent that boot time
     becomes negative.
   - fix for file time rounding
   - a few simplifications of the hrtimer code
   - fix for the proc/timerlist code so the output of clock realtime
     timers is accurate
   - more y2038 work
   - tree wide conversion of clockevent drivers to the new callbacks"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (88 commits)
  hrtimer: Handle failure of tick_init_highres() gracefully
  hrtimer: Unconfuse switch_hrtimer_base() a bit
  hrtimer: Simplify get_target_base() by returning current base
  hrtimer: Drop return code of hrtimer_switch_to_hres()
  time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64()
  time: Introduce current_kernel_time64()
  time: Introduce struct itimerspec64
  time: Add the common weak version of update_persistent_clock()
  time: Always make sure wall_to_monotonic isn't positive
  time: Fix nanosecond file time rounding in timespec_trunc()
  timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers
  cris/time: Migrate to new 'set-state' interface
  kernel: broadcast-hrtimer: Migrate to new 'set-state' interface
  xtensa/time: Migrate to new 'set-state' interface
  unicore/time: Migrate to new 'set-state' interface
  um/time: Migrate to new 'set-state' interface
  sparc/time: Migrate to new 'set-state' interface
  sh/localtimer: Migrate to new 'set-state' interface
  score/time: Migrate to new 'set-state' interface
  s390/time: Migrate to new 'set-state' interface
  ...
2015-09-01 14:04:50 -07:00
Viresh Kumar
71b5280b79 um/time: Migrate to new 'set-state' interface
Migrate um driver to the new 'set-state' interface provided by
clockevents core, the earlier 'set-mode' interface is marked obsolete
now.

This also enables us to implement callbacks for new states of clockevent
devices, for example: ONESHOT_STOPPED.

Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: user-mode-linux-user@lists.sourceforge.net
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-08-10 11:41:06 +02:00
Ingo Molnar
5b929bd11d Merge branch 'x86/urgent' into x86/asm, before applying dependent patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-31 10:23:35 +02:00
Laurent Dufour
f2abeef9fd mm: clean up per architecture MM hook header files
Commit 2ae416b142 ("mm: new mm hook framework") introduced an empty
header file (mm-arch-hooks.h) for every architecture, even those which
doesn't need to define mm hooks.

As suggested by Geert Uytterhoeven, this could be cleaned through the use
of a generic header file included via each per architecture
asm/include/Kbuild file.

The PowerPC architecture is not impacted here since this architecture has
to defined the arch_remap MM hook.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-17 16:39:53 -07:00
Chris Metcalf
a6e2f029ae Make asm/word-at-a-time.h available on all architectures
Added the x86 implementation of word-at-a-time to the
generic version, which previously only supported big-endian.

Omitted the x86-specific load_unaligned_zeropad(), which in
any case is also not present for the existing BE-only
implementation of a word-at-a-time, and is only used under
CONFIG_DCACHE_WORD_ACCESS.

Added as a "generic-y" to the Kbuilds of all architectures
that didn't previously have it.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-07-08 16:41:55 -04:00
Ingo Molnar
ccaee5f851 um: Fix do_signal() prototype
Once x86 exports its do_signal(), the prototypes will clash.

Fix the clash and also improve the code a bit: remove the
unnecessary kern_do_signal() indirection. This allows
interrupt_end() to share the 'regs' parameter calculation.

Also remove the unused return code to match x86.

Minimally build and boot tested.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/67c57eac09a589bac3c6c5ff22f9623ec55a184a.1435952415.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-07 10:58:54 +02:00
Linus Torvalds
02201e3f1b Minor merge needed, due to function move.
Main excitement here is Peter Zijlstra's lockless rbtree optimization to
 speed module address lookup.  He found some abusers of the module lock
 doing that too.
 
 A little bit of parameter work here too; including Dan Streetman's breaking
 up the big param mutex so writing a parameter can load another module (yeah,
 really).  Unfortunately that broke the usual suspects, !CONFIG_MODULES and
 !CONFIG_SYSFS, so those fixes were appended too.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVkgKHAAoJENkgDmzRrbjxQpwQAJVmBN6jF3SnwbQXv9vRixjH
 58V33sb1G1RW+kXxQ3/e8jLX/4VaN479CufruXQp+IJWXsN/CH0lbC3k8m7u50d7
 b1Zeqd/Yrh79rkc11b0X1698uGCSMlzz+V54Z0QOTEEX+nSu2ZZvccFS4UaHkn3z
 rqDo00lb7rxQz8U25qro2OZrG6D3ub2q20TkWUB8EO4AOHkPn8KWP2r429Axrr0K
 wlDWDTTt8/IsvPbuPf3T15RAhq1avkMXWn9nDXDjyWbpLfTn8NFnWmtesgY7Jl4t
 GjbXC5WYekX3w2ZDB9KaT/DAMQ1a7RbMXNSz4RX4VbzDl+yYeSLmIh2G9fZb1PbB
 PsIxrOgy4BquOWsJPm+zeFPSC3q9Cfu219L4AmxSjiZxC3dlosg5rIB892Mjoyv4
 qxmg6oiqtc4Jxv+Gl9lRFVOqyHZrTC5IJ+xgfv1EyP6kKMUKLlDZtxZAuQxpUyxR
 HZLq220RYnYSvkWauikq4M8fqFM8bdt6hLJnv7bVqllseROk9stCvjSiE3A9szH5
 OgtOfYV5GhOeb8pCZqJKlGDw+RoJ21jtNCgOr6DgkNKV9CX/kL/Puwv8gnA0B0eh
 dxCeB7f/gcLl7Cg3Z3gVVcGlgak6JWrLf5ITAJhBZ8Lv+AtL2DKmwEWS/iIMRmek
 tLdh/a9GiCitqS0bT7GE
 =tWPQ
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "Main excitement here is Peter Zijlstra's lockless rbtree optimization
  to speed module address lookup.  He found some abusers of the module
  lock doing that too.

  A little bit of parameter work here too; including Dan Streetman's
  breaking up the big param mutex so writing a parameter can load
  another module (yeah, really).  Unfortunately that broke the usual
  suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were
  appended too"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits)
  modules: only use mod->param_lock if CONFIG_MODULES
  param: fix module param locks when !CONFIG_SYSFS.
  rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE()
  module: add per-module param_lock
  module: make perm const
  params: suppress unused variable error, warn once just in case code changes.
  modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'.
  kernel/module.c: avoid ifdefs for sig_enforce declaration
  kernel/workqueue.c: remove ifdefs over wq_power_efficient
  kernel/params.c: export param_ops_bool_enable_only
  kernel/params.c: generalize bool_enable_only
  kernel/module.c: use generic module param operaters for sig_enforce
  kernel/params: constify struct kernel_param_ops uses
  sysfs: tightened sysfs permission checks
  module: Rework module_addr_{min,max}
  module: Use __module_address() for module_address_lookup()
  module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING
  module: Optimize __module_address() using a latched RB-tree
  rbtree: Implement generic latch_tree
  seqlock: Introduce raw_read_seqcount_latch()
  ...
2015-07-01 10:49:25 -07:00
Linus Torvalds
21dc2e6c6d Merge branch 'for-linus-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:

 - remove hppfs ("HonePot ProcFS")

 - initial support for musl libc

 - uaccess cleanup

 - random cleanups and bug fixes all over the place

* 'for-linus-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (21 commits)
  um: Don't pollute kernel namespace with uapi
  um: Include sys/types.h for makedev(), major(), minor()
  um: Do not use stdin and stdout identifiers for struct members
  um: Do not use __ptr_t type for stack_t's .ss pointer
  um: Fix mconsole dependency
  um: Handle tracehook_report_syscall_entry() result
  um: Remove copy&paste code from init.h
  um: Stop abusing __KERNEL__
  um: Catch unprotected user memory access
  um: Fix warning in setup_signal_stack_si()
  um: Rework uaccess code
  um: Add uaccess.h to ldt.c
  um: Add uaccess.h to syscalls_64.c
  um: Add asm/elf.h to vma.c
  um: Cleanup mem_32/64.c headers
  um: Remove hppfs
  um: Move syscall() declaration into os.h
  um: kernel: ksyms: Export symbol syscall() for fixing modpost issue
  um/os-Linux: Use char[] for syscall_stub declarations
  um: Use char[] for linker script address declarations
  ...
2015-06-28 13:55:08 -07:00
Linus Torvalds
d87823813f Char/Misc driver patches for 4.2-rc1
Here's the big char/misc driver pull request for 4.2-rc1.
 
 Lots of mei, extcon, coresight, uio, mic, and other driver updates in
 here.  Full details in the shortlog.  All of these have been in
 linux-next for some time with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlWNn0gACgkQMUfUDdst+ykCCQCgvdF4F2+Hy9+RATdk22ak1uq1
 JDMAoJTf4oyaIEdaiOKfEIWg9MasS42B
 =H5wD
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here's the big char/misc driver pull request for 4.2-rc1.

  Lots of mei, extcon, coresight, uio, mic, and other driver updates in
  here.  Full details in the shortlog.  All of these have been in
  linux-next for some time with no reported problems"

* tag 'char-misc-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (176 commits)
  mei: me: wait for power gating exit confirmation
  mei: reset flow control on the last client disconnection
  MAINTAINERS: mei: add mei_cl_bus.h to maintained file list
  misc: sram: sort and clean up included headers
  misc: sram: move reserved block logic out of probe function
  misc: sram: add private struct device and virt_base members
  misc: sram: report correct SRAM pool size
  misc: sram: bump error message level on unclean driver unbinding
  misc: sram: fix device node reference leak on error
  misc: sram: fix enabled clock leak on error path
  misc: mic: Fix reported static checker warning
  misc: mic: Fix randconfig build error by including errno.h
  uio: pruss: Drop depends on ARCH_DAVINCI_DA850 from config
  uio: pruss: Add CONFIG_HAS_IOMEM dependence
  uio: pruss: Include <linux/sizes.h>
  extcon: Redefine the unique id of supported external connectors without 'enum extcon' type
  char:xilinx_hwicap:buffer_icap - change 1/0 to true/false for bool type variable in function buffer_icap_set_configuration().
  Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion
  parport: check exclusive access before register
  w1: use correct lock on error in w1_seq_show()
  ...
2015-06-26 14:51:15 -07:00
Linus Torvalds
ad90fb9751 Merge branch 'for-4.2/sg' of git://git.kernel.dk/linux-block
Pull asm/scatterlist.h removal from Jens Axboe:
 "We don't have any specific arch scatterlist anymore, since parisc
  finally switched over.  Kill the include"

* 'for-4.2/sg' of git://git.kernel.dk/linux-block:
  remove scatterlist.h generation from arch Kbuild files
  remove <asm/scatterlist.h>
2015-06-25 15:22:36 -07:00
Richard Weinberger
da028d5e54 um: Don't pollute kernel namespace with uapi
Don't include ptrace uapi stuff in arch headers, it will
pollute the kernel namespace and conflict with existing
stuff.
In this case it fixes clashes with common names like R8.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-25 22:44:11 +02:00
Hans-Werner Hilse
8eeba4e9a7 um: Include sys/types.h for makedev(), major(), minor()
The functions in question are not part of the POSIX standard,
documentation however hints that the corresponding header shall
be sys/types.h. C libraries other than glibc, namely musl, did
not include that header via other ways and complained.

Signed-off-by: Hans-Werner Hilse <hwhilse@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-25 22:42:21 +02:00
Hans-Werner Hilse
f9bb3b5947 um: Do not use stdin and stdout identifiers for struct members
stdin, stdout and stderr are macros according to C89/C99.
Thus do not use them as struct member identifiers to avoid
bad results from macro expansion.

Signed-off-by: Hans-Werner Hilse <hwhilse@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-25 22:42:19 +02:00
Hans-Werner Hilse
9a75551aea um: Do not use __ptr_t type for stack_t's .ss pointer
__ptr_t type is a glibc-specific type, while the generally
documented type is a void*. That's what other C libraries use,
too.

Signed-off-by: Hans-Werner Hilse <hwhilse@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-25 22:42:17 +02:00
Laurent Dufour
2ae416b142 mm: new mm hook framework
CRIU is recreating the process memory layout by remapping the checkpointee
memory area on top of the current process (criu).  This includes remapping
the vDSO to the place it has at checkpoint time.

However some architectures like powerpc are keeping a reference to the
vDSO base address to build the signal return stack frame by calling the
vDSO sigreturn service.  So once the vDSO has been moved, this reference
is no more valid and the signal frame built later are not usable.

This patch serie is introducing a new mm hook framework, and a new
arch_remap hook which is called when mremap is done and the mm lock still
hold.  The next patch is adding the vDSO remap and unmap tracking to the
powerpc architecture.

This patch (of 3):

This patch introduces a new set of header file to manage mm hooks:
- per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
- a generic header (include/linux/mm-arch-hooks.h)

The architecture which need to overwrite a hook as to redefine it in its
header file, while architecture which doesn't need have nothing to do.

The default hooks are defined in the generic header and are used in the
case the architecture is not defining it.

In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
be moved here.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24 17:49:41 -07:00
Dan Streetman
b51d23e4e9 module: add per-module param_lock
Add a "param_lock" mutex to each module, and update params.c to use
the correct built-in or module mutex while locking kernel params.
Remove the kparam_block_sysfs_r/w() macros, replace them with direct
calls to kernel_param_[un]lock(module).

The kernel param code currently uses a single mutex to protect
modification of any and all kernel params.  While this generally works,
there is one specific problem with it; a module callback function
cannot safely load another module, i.e. with request_module() or even
with indirect calls such as crypto_has_alg().  If the module to be
loaded has any of its params configured (e.g. with a /etc/modprobe.d/*
config file), then the attempt will result in a deadlock between the
first module param callback waiting for modprobe, and modprobe trying to
lock the single kernel param mutex to set the new module's param.

This fixes that by using per-module mutexes, so that each individual module
is protected against concurrent changes in its own kernel params, but is
not blocked by changes to other module params.  All built-in modules
continue to use the built-in mutex, since they will always be loaded at
runtime and references (e.g. request_module(), crypto_has_alg()) to them
will never cause load-time param changing.

This also simplifies the interface used by modules to block sysfs access
to their params; while there are currently functions to block and unblock
sysfs param access which are split up by read and write and expect a single
kernel param to be passed, their actual operation is identical and applies
to all params, not just the one passed to them; they simply lock and unlock
the global param mutex.  They are replaced with direct calls to
kernel_param_[un]lock(THIS_MODULE), which locks THIS_MODULE's param_lock, or
if the module is built-in, it locks the built-in mutex.

Suggested-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-06-23 15:27:38 +09:30
Richard Weinberger
1eb7c6c70e um: Fix mconsole dependency
mconsole depends on CONFIG_PROC_FS.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31 23:27:26 +02:00
Richard Weinberger
5334cdae40 um: Handle tracehook_report_syscall_entry() result
tracehook_report_syscall_entry() is allowed to fail,
in case of failure we have to abort the current syscall.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31 22:59:03 +02:00
Richard Weinberger
30b11ee9ae um: Remove copy&paste code from init.h
As we got rid of the __KERNEL__ abuse, we can directly
include linux/compiler.h now.
This also allows gcc 5 to build UML.

Reported-by: Hans-Werner Hilse <hwhilse@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31 22:17:08 +02:00
Richard Weinberger
298e20ba8c um: Stop abusing __KERNEL__
Currently UML is abusing __KERNEL__ to distinguish between
kernel and host code (os-Linux). It is better to use a custom
define such that existing users of __KERNEL__ don't get confused.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31 22:05:32 +02:00
Richard Weinberger
d2313084e2 um: Catch unprotected user memory access
If the kernel tries to access user memory without copy_from_user()
a trap will happen as kernel and userspace run in different processes
on the host side. Currently this special page fault cannot be resolved
and will happen over and over again. As result UML will lockup.
This patch allows the page fault code to detect that situation and
causes a panic() such that the root cause of the unprotected memory
access can be found and fixed.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-31 19:21:51 +02:00