Commit Graph

1221 Commits

Author SHA1 Message Date
Martin Schwidefsky
1b948d6cae s390/mm,tlb: optimize TLB flushing for zEC12
The zEC12 machines introduced the local-clearing control for the IDTE
and IPTE instruction. If the control is set only the TLB of the local
CPU is cleared of entries, either all entries of a single address space
for IDTE, or the entry for a single page-table entry for IPTE.
Without the local-clearing control the TLB flush is broadcasted to all
CPUs in the configuration, which is expensive.

The reset of the bit mask of the CPUs that need flushing after a
non-local IDTE is tricky. As TLB entries for an address space remain
in the TLB even if the address space is detached a new bit field is
required to keep track of attached CPUs vs. CPUs in the need of a
flush. After a non-local flush with IDTE the bit-field of attached CPUs
is copied to the bit-field of CPUs in need of a flush. The ordering
of operations on cpu_attach_mask, attach_count and mm_cpumask(mm) is
such that an underindication in mm_cpumask(mm) is prevented but an
overindication in mm_cpumask(mm) is possible.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-04-03 14:31:00 +02:00
Martin Schwidefsky
02a8f3abb7 s390/mm,tlb: safeguard against speculative TLB creation
The principles of operations states that the CPU is allowed to create
TLB entries for an address space anytime while an ASCE is loaded to
the control register. This is true even if the CPU is running in the
kernel and the user address space is not (actively) accessed.

In theory this can affect two aspects of the TLB flush logic.
For full-mm flushes the ASCE of the dying process is still attached.
The approach to flush first with IDTE and then just free all page
tables can in theory lead to stale TLB entries. Use the batched
free of page tables for the full-mm flushes as well.

For operations that can have a stale ASCE in the control register,
e.g. a delayed update_user_asce in switch_mm, load the kernel ASCE
to prevent invalid TLBs from being created.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-04-03 14:30:55 +02:00
Thomas Huth
1dad093b66 s390/irq: Use defines for external interruption codes
Use the new defines for external interruption codes to get rid
of "magic" numbers in the s390 source code. And while we're at it,
also rename the (un-)register_external_interrupt function to
something shorter so that this patch does not exceed the 80
columns all over the place.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-04-03 14:30:52 +02:00
Thomas Huth
072c279029 s390/irq: Add defines for external interruption codes
Introduce defines for external interruption codes so that we
can get rid of some "magic" numbers in the s390 source code.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-04-03 14:30:49 +02:00
Linus Torvalds
159d8133d0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual rocket science -- mostly documentation and comment updates"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  sparse: fix comment
  doc: fix double words
  isdn: capi: fix "CAPI_VERSION" comment
  doc: DocBook: Fix typos in xml and template file
  Bluetooth: add module name for btwilink
  driver core: unexport static function create_syslog_header
  mmc: core: typo fix in printk specifier
  ARM: spear: clean up editing mistake
  net-sysfs: fix comment typo 'CONFIG_SYFS'
  doc: Insert MODULE_ in module-signing macros
  Documentation: update URL to hfsplus Technote 1150
  gpio: update path to documentation
  ixgbe: Fix format string in ixgbe_fcoe.
  Kconfig: Remove useless "default N" lines
  user_namespace.c: Remove duplicated word in comment
  CREDITS: fix formatting
  treewide: Fix typo in Documentation/DocBook
  mm: Fix warning on make htmldocs caused by slab.c
  ata: ata-samsung_cf: cleanup in header file
  idr: remove unused prototype of idr_free()
2014-04-02 16:23:38 -07:00
Linus Torvalds
7cbb39d4d4 Merge tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
 "PPC and ARM do not have much going on this time.  Most of the cool
  stuff, instead, is in s390 and (after a few releases) x86.

  ARM has some caching fixes and PPC has transactional memory support in
  guests.  MIPS has some fixes, with more probably coming in 3.16 as
  QEMU will soon get support for MIPS KVM.

  For x86 there are optimizations for debug registers, which trigger on
  some Windows games, and other important fixes for Windows guests.  We
  now expose to the guest Broadwell instruction set extensions and also
  Intel MPX.  There's also a fix/workaround for OS X guests, nested
  virtualization features (preemption timer), and a couple kvmclock
  refinements.

  For s390, the main news is asynchronous page faults, together with
  improvements to IRQs (floating irqs and adapter irqs) that speed up
  virtio devices"

* tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits)
  KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
  KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
  KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
  KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
  KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
  KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
  KVM: PPC: Book3S HV: Add transactional memory support
  KVM: Specify byte order for KVM_EXIT_MMIO
  KVM: vmx: fix MPX detection
  KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
  KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
  KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
  KVM: s390: clear local interrupts at cpu initial reset
  KVM: s390: Fix possible memory leak in SIGP functions
  KVM: s390: fix calculation of idle_mask array size
  KVM: s390: randomize sca address
  KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP
  KVM: Bump KVM_MAX_IRQ_ROUTES for s390
  KVM: s390: irq routing for adapter interrupts.
  KVM: s390: adapter interrupt sources
  ...
2014-04-02 14:50:10 -07:00
Linus Torvalds
158e0d3621 Driver core / sysfs patches for 3.15-rc1
Here's the big driver core / sysfs update for 3.15-rc1.
 
 Lots of kernfs updates to make it useful for other subsystems, and a few
 other tiny driver core patches.
 
 All have been in linux-next for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEYEABECAAYFAlM7A0wACgkQMUfUDdst+ynJNACfZlY+KNKIhNFt1OOW8rQfSZzy
 1PYAnjYuOoly01JlPrpJD5b4TdxaAq71
 =GVUg
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core and sysfs updates from Greg KH:
 "Here's the big driver core / sysfs update for 3.15-rc1.

  Lots of kernfs updates to make it useful for other subsystems, and a
  few other tiny driver core patches.

  All have been in linux-next for a while"

* tag 'driver-core-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (42 commits)
  Revert "sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()"
  kernfs: cache atomic_write_len in kernfs_open_file
  numa: fix NULL pointer access and memory leak in unregister_one_node()
  Revert "driver core: synchronize device shutdown"
  kernfs: fix off by one error.
  kernfs: remove duplicate dir.c at the top dir
  x86: align x86 arch with generic CPU modalias handling
  cpu: add generic support for CPU feature based module autoloading
  sysfs: create bin_attributes under the requested group
  driver core: unexport static function create_syslog_header
  firmware: use power efficient workqueue for unloading and aborting fw load
  firmware: give a protection when map page failed
  firmware: google memconsole driver fixes
  firmware: fix google/gsmi duplicate efivars_sysfs_init()
  drivers/base: delete non-required instances of include <linux/init.h>
  kernfs: fix kernfs_node_from_dentry()
  ACPI / platform: drop redundant ACPI_HANDLE check
  kernfs: fix hash calculation in kernfs_rename_ns()
  kernfs: add CONFIG_KERNFS
  sysfs, kobject: add sysfs wrapper for kernfs_enable_ns()
  ...
2014-04-01 16:28:19 -07:00
Heiko Carstens
0ccc8b7ac8 s390/bitops,atomic: add missing memory barriers
When reworking the bitops and atomic ops I missed that those instructions
that got atomic behaviour only perform a "specific-operand-serialization"
instead of a full "serialization".
The compare-and-swap instruction used before performs a full serialization
before and after the instruction is executed, which means it has full
memory barrier semantics.
In order to give the new bitops and atomic ops functions also full memory
barrier semantics add a "bcr 14,0" before and after each of those new
instructions which performs full serialization as well.

This restores memory barrier semantics for bitops and atomic ops functions
which return values, like e.g. atomic_add_return(), but not for functions
which do not return a value, like e.g. atomic_add().
This is consistent to other architectures and what common code requires.

Cc: stable@vger.kernel.org # v3.13+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-04-01 09:23:35 +02:00
Linus Torvalds
1f8c538ed6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "There are two memory management related changes, the CMMA support for
  KVM to avoid swap-in of freed pages and the split page table lock for
  the PMD level.  These two come with common code changes in mm/.

  A fix for the long standing theoretical TLB flush problem, this one
  comes with a common code change in kernel/sched/.

  Another set of changes is Heikos uaccess work, included is the initial
  set of patches with more to come.

  And fixes and cleanups as usual"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (36 commits)
  s390/con3270: optionally disable auto update
  s390/mm: remove unecessary parameter from pgste_ipte_notify
  s390/mm: remove unnecessary parameter from gmap_do_ipte_notify
  s390/mm: fixing comment so that parameter name match
  s390/smp: limit number of cpus in possible cpu mask
  hypfs: Add clarification for "weight_min" attribute
  s390: update defconfigs
  s390/ptrace: add support for PTRACE_SINGLEBLOCK
  s390/perf: make print_debug_cf() static
  s390/topology: Remove call to update_cpu_masks()
  s390/compat: remove compat exec domain
  s390: select CONFIG_TTY for use of tty in unconditional keyboard driver
  s390/appldata_os: fix cpu array size calculation
  s390/checksum: remove memset() within csum_partial_copy_from_user()
  s390/uaccess: remove copy_from_user_real()
  s390/sclp_early: Return correct HSA block count also for zero
  s390: add some drivers/subsystems to the MAINTAINERS file
  s390: improve debug feature usage
  s390/airq: add support for irq ranges
  s390/mm: enable split page table lock for PMD level
  ...
2014-03-31 14:35:30 -07:00
Linus Torvalds
190f918660 Merge branch 'compat' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 compat wrapper rework from Heiko Carstens:
 "S390 compat system call wrapper simplification work.

  The intention of this work is to get rid of all hand written assembly
  compat system call wrappers on s390, which perform proper sign or zero
  extension, or pointer conversion of compat system call parameters.
  Instead all of this should be done with C code eg by using Al's
  COMPAT_SYSCALL_DEFINEx() macro.

  Therefore all common code and s390 specific compat system calls have
  been converted to the COMPAT_SYSCALL_DEFINEx() macro.

  In order to generate correct code all compat system calls may only
  have eg compat_ulong_t parameters, but no unsigned long parameters.
  Those patches which change parameter types from unsigned long to
  compat_ulong_t parameters are separate in this series, but shouldn't
  cause any harm.

  The only compat system calls which intentionally have 64 bit
  parameters (preadv64 and pwritev64) in support of the x86/32 ABI
  haven't been changed, but are now only available if an architecture
  defines __ARCH_WANT_COMPAT_SYS_PREADV64/PWRITEV64.

  System calls which do not have a compat variant but still need proper
  zero extension on s390, like eg "long sys_brk(unsigned long brk)" will
  get a proper wrapper function with the new s390 specific
  COMPAT_SYSCALL_WRAPx() macro:

     COMPAT_SYSCALL_WRAP1(brk, unsigned long, brk);

  which generates the following code (simplified):

     asmlinkage long sys_brk(unsigned long brk);
     asmlinkage long compat_sys_brk(long brk)
     {
         return sys_brk((u32)brk);
     }

  Given that the C file which contains all the COMPAT_SYSCALL_WRAP lines
  includes both linux/syscall.h and linux/compat.h, it will generate
  build errors, if the declaration of sys_brk() doesn't match, or if
  there exists a non-matching compat_sys_brk() declaration.

  In addition this will intentionally result in a link error if
  somewhere else a compat_sys_brk() function exists, which probably
  should have been used instead.  Two more BUILD_BUG_ONs make sure the
  size and type of each compat syscall parameter can be handled
  correctly with the s390 specific macros.

  I converted the compat system calls step by step to verify the
  generated code is correct and matches the previous code.  In fact it
  did not always match, however that was always a bug in the hand
  written asm code.

  In result we get less code, less bugs, and much more sanity checking"

* 'compat' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (44 commits)
  s390/compat: add copyright statement
  compat: include linux/unistd.h within linux/compat.h
  s390/compat: get rid of compat wrapper assembly code
  s390/compat: build error for large compat syscall args
  mm/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  kexec/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  net/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  ipc/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  fs/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
  ipc/compat: convert to COMPAT_SYSCALL_DEFINE
  fs/compat: convert to COMPAT_SYSCALL_DEFINE
  security/compat: convert to COMPAT_SYSCALL_DEFINE
  mm/compat: convert to COMPAT_SYSCALL_DEFINE
  net/compat: convert to COMPAT_SYSCALL_DEFINE
  kernel/compat: convert to COMPAT_SYSCALL_DEFINE
  fs/compat: optional preadv64/pwrite64 compat system calls
  ipc/compat_sys_msgrcv: change msgtyp type from long to compat_long_t
  s390/compat: partial parameter conversion within syscall wrappers
  s390/compat: automatic zero, sign and pointer conversion of syscalls
  s390/compat: add sync_file_range and fallocate compat syscalls
  ...
2014-03-31 14:32:17 -07:00
Paolo Bonzini
f7b9ddb8a5 3 fixes
- memory leak on certain SIGP conditions
 - wrong size for idle bitmap (always too big)
 - clear local interrupts on initial CPU reset
 
 1 performance improvement
 - improve performance with many guests on certain workloads
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJTMXvEAAoJEBF7vIC1phx8ii8P/2XI/aXFlhkITD/79rghPbjo
 sE76o5Rlz4UzM8khgEW/4kajehww/1/hO68ojcy1xUE1eMHnjk4sF7c5ozbKX7Qw
 a411B/IrkDEz3aVFLx8KXu2qSdCs22WDgyYFUR4kOBjStP+TvN7KH1NYw84CO+pA
 7Mf+DAxF26X83q6nNvfnEq4cjrQEGmB0aRLkiT4PmHjs4WL+iimJmXeVTewhCtKm
 rE3N9AjpaZpqUQANXvdTJc5Cap/RbVMJf05EbIg5LrsEN+9xQHHRpkkjSRw+7XLx
 iY1uxgA8a92dGWY1RCSZe3lLS0ibg6LWH05hlhYOOCe1DBU0KVQJ5Txixu/ekm5j
 gv2BPpnquF+R2uYptTmF8Xq7TeP15kc3JjahrE8tZ16RH5dpZ7fn6fK/f3590JYB
 4rt0xt5diQwaFRDcgT+8zLGvIq8DZ4ZH6KNElXli8megdY1hiOIlkb1R7Hq/RIt1
 eacX2mycZAlf2ZUp3lVTHYxPL43WH2Qf2s4Y4mHlEmH8LQGGiFOl8Ne0Wgoeha9O
 JVvoHxTEqvzhWTgUi6n8cTlUsYvq6ICXhwCOPM5HLgbxCgbPbEv5EMOZxGAQJ0FK
 fQXasKpjxzPYyJ6XS8xeNel4yFQ+j8G1rvN4Q8kMLY4fjAj5sfm0WRrFw/gCb2ds
 ISfe6UOnoV9scrXvAMyH
 =7smd
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-20140325' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next

3 fixes
- memory leak on certain SIGP conditions
- wrong size for idle bitmap (always too big)
- clear local interrupts on initial CPU reset

1 performance improvement
- improve performance with many guests on certain workloads
2014-03-25 15:44:06 +01:00
Jens Freimann
609433fbed KVM: s390: fix calculation of idle_mask array size
We need BITS_TO_LONGS, not sizeof(long) to calculate
the correct size.

idle_mask is a bitmask, each bit representing the state
of a cpu. The desired outcome is an array of unsigned long
fields that can fit KVM_MAX_VCPUS bits. We should not use
sizeof(long) which returnes the size in bytes, but BITS_TO_LONGS

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-03-25 13:27:11 +01:00
Cornelia Huck
8422359877 KVM: s390: irq routing for adapter interrupts.
Introduce a new interrupt class for s390 adapter interrupts and enable
irqfds for s390.

This is depending on a new s390 specific vm capability, KVM_CAP_S390_IRQCHIP,
that needs to be enabled by userspace.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-03-21 13:43:00 +01:00
Cornelia Huck
841b91c584 KVM: s390: adapter interrupt sources
Add a new interface to register/deregister sources of adapter interrupts
identified by an unique id via the flic. Adapters may also be maskable
and carry a list of pinned pages.

These adapters will be used by irq routing later.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-03-21 13:42:49 +01:00
Dominik Dingel
6e5a40a49f s390/mm: remove unecessary parameter from pgste_ipte_notify
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-03-21 09:58:01 +01:00
Dominik Dingel
aaeff84a2d s390/mm: remove unnecessary parameter from gmap_do_ipte_notify
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-03-21 09:57:58 +01:00
Eric Paris
579ec9e1ab audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
The syscall.h headers were including linux/audit.h but really only
needed the uapi/linux/audit.h to get the requisite defines.  Switch to
the uapi headers.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
Cc: linux-s390@vger.kernel.org
Cc: x86@kernel.org
2014-03-20 10:11:59 -04:00
Eric Paris
5e937a9ae9 syscall_get_arch: remove useless function arguments
Every caller of syscall_get_arch() uses current for the task and no
implementors of the function need args.  So just get rid of both of
those things.  Admittedly, since these are inline functions we aren't
wasting stack space, but it just makes the prototypes better.

Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@linux-mips.org
Cc: linux390@de.ibm.com
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-arch@vger.kernel.org
2014-03-20 10:11:59 -04:00
Heiko Carstens
cf813db0b4 s390/smp: limit number of cpus in possible cpu mask
Limit the number of bits to the maximum number of cpus a machine
can have.
possible_cpu_mask typically will have more bits set than a machine
may physically have. This results in wasted memory during per-cpu
memory allocations, if the possible mask contains more cpus than
physically possible for a given configuration.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-03-17 15:53:06 +01:00
Martin Schwidefsky
818a330c4e s390/ptrace: add support for PTRACE_SINGLEBLOCK
The PTRACE_SINGLEBLOCK option is used to get control whenever
the inferior has executed a successful branch. The PER option to
implement block stepping is successful-branching event, bit 32
in the PER-event mask.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-03-14 12:59:38 +01:00
Heiko Carstens
9a205286bc s390/compat: build error for large compat syscall args
Enforce 32 bit types for all compat syscall argument types.

This way we can make sure that all arguments get correct sign
or zero extension. Otherwise incorrect code would be generated.

E.g. for a 'long' type the COMPAT_SYSCALL_DEFINE macro wouldn't
generate code that would cause sign extension of the passed in 32
bit user space parameter.
This can cause quite subtle bugs like e.g. the one that was fixed
with dfd948e32a "fs/compat: fix parameter handling for compat
readv/writev syscalls".

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2014-03-06 16:30:47 +01:00
Heiko Carstens
932602e238 fs/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
Some fs compat system calls have unsigned long parameters instead of
compat_ulong_t.
In order to allow the COMPAT_SYSCALL_DEFINE macro generate code that
performs proper zero and sign extension convert all 64 bit parameters
their corresponding 32 bit counterparts.

compat_sys_io_getevents() is a bit different: the non-compat version
has signed parameters for the "min_nr" and "nr" parameters while the
compat version has unsigned parameters.
So change this as well. For all practical purposes this shouldn't make
any difference (doesn't fix a real bug).
Also introduce a generic compat_aio_context_t type which can be used
everywhere.
The access_ok() check within compat_sys_io_getevents() got also removed
since the non-compat sys_io_getevents() should be able to handle
everything anyway.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2014-03-06 16:30:44 +01:00
Cornelia Huck
96b14536d9 virtio-ccw: virtio-ccw adapter interrupt support.
Implement the new CCW_CMD_SET_IND_ADAPTER command and try to enable
adapter interrupts for every device on the first startup. If the host
does not support adapter interrupts, fall back to normal I/O interrupts.

virtio-ccw adapter interrupts use the same isc as normal I/O subchannels
and share a summary indicator for all devices sharing the same indicator
area.

Indicator bits for the individual virtqueues may be contained in the same
indicator area for different devices.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-03-04 10:41:04 +01:00
Martin Schwidefsky
84ec96a615 s390/airq: add support for irq ranges
Add airq_iv_alloc and airq_iv_free to allocate and free consecutive
ranges of irqs from the interrupt vector.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-03-04 10:41:04 +01:00
Jens Freimann
1ee0bc559d KVM: s390: get rid of local_int array
We can use kvm_get_vcpu() now and don't need the
local_int array in the floating_int struct anymore.
This also means we don't have to hold the float_int.lock
in some places.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-03-04 10:41:03 +01:00
Christian Borntraeger
672550fb68 KVM: s390: Provide access to program parameter
commit d208c79d63 (KVM: s390: Enable
the LPP facility for guests) enabled the LPP instruction for guests.
We should expose the program parameter as a pseudo register for
migration/reset etc. Lets also reset this value on initial CPU
reset.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
2014-03-04 10:41:01 +01:00
Heiko Carstens
9c4d62fab4 s390/compat: convert system call wrappers to C part 11
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2014-03-04 09:05:44 +01:00
Heiko Carstens
d72d2bb5a6 s390/checksum: remove memset() within csum_partial_copy_from_user()
The memset() within csum_partial_copy_from_user() is rather pointless since
copy_from_user() already cleared the rest of the destination buffer if an
exception happened.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-24 17:14:08 +01:00
Heiko Carstens
823002023d s390/uaccess: remove copy_from_user_real()
There is no user left, so remove it.
It was also potentially broken, since the function didn't clear destination
memory if copy_from_user() failed. Which would allow for information leaks.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-24 17:14:00 +01:00
Martin Schwidefsky
fe7c30a420 s390/airq: add support for irq ranges
Add airq_iv_alloc and airq_iv_free to allocate and free consecutive
ranges of irqs from the interrupt vector.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:22 +01:00
Martin Schwidefsky
ec66ad66a0 s390/mm: enable split page table lock for PMD level
Add the pgtable_pmd_page_ctor/pgtable_pmd_page_dtor calls to the pmd
allocation and free functions and enable ARCH_ENABLE_SPLIT_PMD_PTLOCK
for 64 bit.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:22 +01:00
Heiko Carstens
db85eaeb52 s390/bitops: fix comment
Fix some numbers in the comments describing the layout of the bit maps.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:20 +01:00
Martin Schwidefsky
deedabb2b4 s390/kvm: set guest page states to stable on re-ipl
The guest page state needs to be reset to stable for all pages
on initial program load via diagnose 0x308.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:20 +01:00
Konstantin Weitz
b31288fa83 s390/kvm: support collaborative memory management
This patch enables Collaborative Memory Management (CMM) for kvm
on s390. CMM allows the guest to inform the host about page usage
(see arch/s390/mm/cmm.c). The host uses this information to avoid
swapping in unused pages in the page fault handler. Further, a CPU
provided list of unused invalid pages is processed to reclaim swap
space of not yet accessed unused pages.

[ Martin Schwidefsky: patch reordering and cleanup ]

Signed-off-by: Konstantin Weitz <konstantin.weitz@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:19 +01:00
Martin Schwidefsky
53e857f308 s390/mm,tlb: race of lazy TLB flush vs. recreation of TLB entries
Git commit 050eef364a "[S390] fix tlb flushing vs. concurrent
/proc accesses" introduced the attach counter to avoid using the
mm_users value to decide between IPTE for every PTE and lazy TLB
flushing with IDTE. That fixed the problem with mm_users but it
introduced another subtle race, fortunately one that is very hard
to hit.
The background is the requirement of the architecture that a valid
PTE may not be changed while it can be used concurrently by another
cpu. The decision between IPTE and lazy TLB flushing needs to be
done while the PTE is still valid. Now if the virtual cpu is
temporarily stopped after the decision to use lazy TLB flushing but
before the invalid bit of the PTE has been set, another cpu can attach
the mm, find that flush_mm is set, do the IDTE, return to userspace,
and recreate a TLB that uses the PTE in question. When the first,
stopped cpu continues it will change the PTE while it is attached on
another cpu. The first cpu will do another IDTE shortly after the
modification of the PTE which makes the race window quite short.

To fix this race the CPU that wants to attach the address space of a
user space thread needs to wait for the end of the PTE modification.
The number of concurrent TLB flushers for an mm is tracked in the
upper 16 bits of the attach_count and finish_arch_post_lock_switch
is used to wait for the end of the flush operation if required.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:18 +01:00
Heiko Carstens
ca04ddbf53 s390/setup: get rid of MACHINE_HAS_MVCOS machine flag
MACHINE_HAS_MVCOS is used exactly once when the machine is brought up.
There is no need to cache the flag in the machine_flags.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:15 +01:00
Heiko Carstens
211deca6bf s390/uaccess: consistent types
The types 'size_t' and 'unsigned long' have been used randomly for the
uaccess functions. This looks rather confusing.
So let's change all functions to use unsigned long instead and get rid
of size_t in order to have a consistent interface.

The only exception is strncpy_from_user() which uses 'long' since it
may return a signed value (-EFAULT).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:15 +01:00
Heiko Carstens
4f41c2b456 s390/uaccess: get rid of indirect function calls
There are only two uaccess variants on s390 left: the version that is used
if the mvcos instruction is available, and the page table walk variant.
So there is no need for expensive indirect function calls.

By default the mvcos variant will be called. If the mvcos instruction is not
available it will call the page table walk variant.

For minimal performance impact the "if (mvcos_is_available)" is implemented
with a jump label, which will be a six byte nop on machines with mvcos.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:14 +01:00
Heiko Carstens
cfa785e623 s390/uaccess: normalize order of parameters of indirect uaccess function calls
For some unknown reason the indirect uaccess functions on s390 implement a
different parameter order than what is usual.

e.g.:

unsigned long copy_to_user(void *to, const void *from, unsigned long n);
vs.
size_t (*copy_to_user)(size_t n, void __user * to, const void *from);

Let's get rid of this confusing parameter reordering.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:13 +01:00
Sebastian Ott
1e53209605 s390/cio: reorder initialization of ccw consoles
Drivers for ccw consoles use ccw_device_probe_console to receive
an initialized ccw device which is already enabled for interrupts.
After that the device driver does the initialization of its private
data. This can race with unsolicited interrupts which can happen
once the device is enabled for interrupts.

Split ccw_device_probe_console into ccw_device_create_console and
ccw_device_enable_console and reorder the initialization of the ccw
console drivers.

While at it mark these functions as __init.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:12 +01:00
Sebastian Ott
2253e8d792 s390/cio: fix driver callback initialization for ccw consoles
ccw consoles are in use before they can be properly registered with
the driver core. For devices which are in use by a device driver we
rely on the ccw_device's pointer to the driver callbacks to be valid.
For ccw consoles this pointer is NULL until they are registered later
during boot and we dereferenced this pointer. This worked by
chance on 64 bit builds (cdev->drv was NULL but the optional callback
cdev->drv->path_event was also NULL by coincidence) and was unnoticed
until we received reports about boot failures on 31 bit systems.
Fix it by initializing the driver pointer for ccw consoles.

Cc: <stable@vger.kernel.org> # 3.10+
Reported-by: Mike Frysinger <vapier@gentoo.org>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:11 +01:00
Jiri Kosina
d4263348f7 Merge branch 'master' into for-next 2014-02-20 14:54:28 +01:00
Masanari Iida
e227867f12 treewide: Fix typo in Documentation/DocBook
This patch fix spelling typo in Documentation/DocBook.
It is because .html and .xml files are generated by make htmldocs,
I have to fix a typo within the source files.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-02-19 14:58:17 +01:00
Tim Chen
ddf1d169c0 locking/mcs: Allow architecture specific asm files to be used for contended case
This patch allows each architecture to add its specific assembly optimized
arch_mcs_spin_lock_contended and arch_mcs_spinlock_uncontended for
MCS lock and unlock functions.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: AswinChandramouleeswaran <aswin@hp.com>
Cc: George Spelvin <linux@horizon.com>
Cc: Rik vanRiel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: MichelLespinasse <walken@google.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Figo.zhang" <figo1802@gmail.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew R Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1390347382.3138.67.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09 21:18:52 +01:00
Tim Chen
b119fa61d4 locking/mcs: Order the header files in Kbuild of each architecture in alphabetical order
We perform a clean up of the Kbuid files in each architecture.
We order the files in each Kbuild in alphabetical order
by running the below script.

for i in arch/*/include/asm/Kbuild
do
        cat $i | gawk '/^generic-y/ {
                i = 3;
                do {
                        for (; i <= NF; i++) {
                                if ($i == "\\") {
                                        getline;
                                        i = 1;
                                        continue;
                                }
                                if ($i != "")
                                        hdr[$i] = $i;
                        }
                        break;
                } while (1);
                next;
        }
        // {
                print $0;
        }
        END {
                n = asort(hdr);
                for (i = 1; i <= n; i++)
                        print "generic-y += " hdr[i];
        }' > ${i}.sorted;
        mv ${i}.sorted $i;
done

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew R Wilcox <matthew.r.wilcox@intel.com>
Cc: AswinChandramouleeswaran <aswin@hp.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "Figo.zhang" <figo1802@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: George Spelvin <linux@horizon.com>
Cc: MichelLespinasse <walken@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
[ Fixed build bug. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09 21:17:50 +01:00
Tejun Heo
0b60f9ead5 s390: use device_remove_file_self() instead of device_schedule_callback()
driver-core now supports synchrnous self-deletion of attributes and
the asynchrnous removal mechanism is scheduled for removal.  Use it
instead of device_schedule_callback().

* Conversions in arch/s390/pci/pci_sysfs.c and
  drivers/s390/block/dcssblk.c are straightforward.

* drivers/s390/cio/ccwgroup.c is a bit more tricky because
  ccwgroup_notifier() was (ab)using device_schedule_callback() to
  purely obtain a process context to kick off ungroup operation which
  may block from a notifier callback.

  Rename ccwgroup_ungroup_callback() to ccwgroup_ungroup() and make it
  take ccwgroup_device * instead.  The new function is now called
  directly from ccwgroup_ungroup_store().

  ccwgroup_notifier() chain is updated to explicitly bounce through
  ccwgroup_device->ungroup_work.  This also removes possible failure
  from memory pressure.

Only compile-tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-07 15:42:41 -08:00
Paolo Bonzini
f244d910ea Two new features are added by this patch set:
- The floating interrupt controller (flic) that allows us to inject,
   clear and inspect non-vcpu local interrupts. This also gives us an
   opportunity to fix deficiencies in our existing interrupt definitions.
 - Support for asynchronous page faults via the pfault mechanism. Testing
   show significant guest performance improvements under host swap.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJS6kZMAAoJEBF7vIC1phx8OOcP/jbMTi3pbCPGfyL4tCyyIL6d
 FYmoVqJMg1lTxLzhdqhli81ahT683I0/jrH17oz1O+PdgxP9annZnNfNs1bO8YzG
 F4XvmrgQ/JnJLl2xKq0MVPOUbLivMiVkCemhPnzg39JKJDTGPSKBLBTEso+dMVKb
 5qxo8UDijJ/TJS4CLYG3AcNWIMtMBlQWCmFMTvVUmaDe8iFEeMoWZRgBJ3X43Hxj
 uQr6zBvG7zd4+IgfrXGaExLJw9EsLIs8MwkqTvYbuDdxoU/jjyvo3xSR5T+3u3fp
 raK2RHirfb31XWNC33hTm4VvTHxBQ0gjUZgWb8AZe8UnhpHgnMS1QWGbU9Tj3RKm
 d5p+AWMT2l+2TzKQRp85NSJCySnQQIgI9Vh3A6MTAsanRpLncpHzOVaBvYKmphWV
 RG94pjxa3NNDvZ3m8V2htWKArlB+rr1S0nz7R5IHM/1IwmPj5VfianhfT3qmjexC
 0CvPlS3FXTbSiTC8xnGET4wvKNUBMThq6alP2/MArqM+28NjDpwOG8/hDpAAIksL
 COho3/csBHH5mpxI0odkfoyIJLDDdIPOWE3kIQBKkg/Qt3nPUcPOKamKTsP7s7RA
 T3K1Kz6bljkvY1eDeJ2pWB85XPGxSig2UVxSdTs+ywQ5UcQnEF3aN5dFKjewk8Xv
 bUgdLiW7ABkj+rCOGIJ8
 =z8I9
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-20140130' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

Two new features are added by this patch set:
- The floating interrupt controller (flic) that allows us to inject,
  clear and inspect non-vcpu local interrupts. This also gives us an
  opportunity to fix deficiencies in our existing interrupt definitions.
- Support for asynchronous page faults via the pfault mechanism. Testing
  show significant guest performance improvements under host swap.
2014-02-04 04:23:37 +01:00
Linus Torvalds
e2a0f813e0 Second batch of KVM updates. Some minor x86 fixes,
two s390 guest features that need some handling in the host,
 and all the PPC changes.  The PPC changes include support for
 little-endian guests and enablement for new POWER8 features.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJS6UF5AAoJEBvWZb6bTYby55kP/AgTJnyu7avN653/2aSHkjkx
 KgYSMYhZPIFoY5LyZuNetXaoXFRvCykux1VYSZ6V6s35h2PZ+hdJNbHGjFYKPGTq
 FQ92xQVNuWCAPxmFCjDNuDV/0BauG5y08/Orh/jpjz+GAfH43LruUQGbtXUuyJ8u
 vf+yTHniU5gguqsAmodqjHUgbf+GoPJ1j7hmRoWwt8IWm7Ns3v/IK4l0p6G0h26a
 RjE6aK+Tm208Yr5hD/dRAqeTbBNt3c4xub+QPsKoiEMaZBSuAOiux7D3Kx+If1gp
 WsmqEQxoymihVtkZhUFO9ONLJepvmG2QwJVVyMSUW9iqxX9rraXsvVyVMwcQAhog
 JuOAYxBftH07xu6Fs4eym5KvCFghM+EaJvxxt+kgnvdD4htK1+eK5trntc2zygSi
 /qGiIrkqjXpkskW8kujLayF0eAU3CrZvFWveEPBfFgYiOGX/2wzJCtSm/bt9Jo0M
 v60qgNFK3LNqAyeEfnm9VtlwGr6ZgsAB6DHNPX4fM5s2IBjL+qloXk/e/+aVKkW0
 I3yeRdy/ExhLAab6w81JtMeR7G3YS0UNuAEVvcoxzNb5wIBY8qnpfUzTKyMxQR94
 64EVpxWEYO1s55eCCyMujWrSvc+YAwhJcWHGKgC4K7mxxLD3FVyQXX6YZvgRozMX
 HjQju+DToj9CskyrFlRL
 =yd0Z
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
 "Second batch of KVM updates.  Some minor x86 fixes, two s390 guest
  features that need some handling in the host, and all the PPC changes.

  The PPC changes include support for little-endian guests and
  enablement for new POWER8 features"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (45 commits)
  x86, kvm: correctly access the KVM_CPUID_FEATURES leaf at 0x40000101
  x86, kvm: cache the base of the KVM cpuid leaves
  kvm: x86: move KVM_CAP_HYPERV_TIME outside #ifdef
  KVM: PPC: Book3S PR: Cope with doorbell interrupts
  KVM: PPC: Book3S HV: Add software abort codes for transactional memory
  KVM: PPC: Book3S HV: Add new state for transactional memory
  powerpc/Kconfig: Make TM select VSX and VMX
  KVM: PPC: Book3S HV: Basic little-endian guest support
  KVM: PPC: Book3S HV: Add support for DABRX register on POWER7
  KVM: PPC: Book3S HV: Prepare for host using hypervisor doorbells
  KVM: PPC: Book3S HV: Handle new LPCR bits on POWER8
  KVM: PPC: Book3S HV: Handle guest using doorbells for IPIs
  KVM: PPC: Book3S HV: Consolidate code that checks reason for wake from nap
  KVM: PPC: Book3S HV: Implement architecture compatibility modes for POWER8
  KVM: PPC: Book3S HV: Add handler for HV facility unavailable
  KVM: PPC: Book3S HV: Flush the correct number of TLB sets on POWER8
  KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs
  KVM: PPC: Book3S HV: Align physical and virtual CPU thread numbers
  KVM: PPC: Book3S HV: Don't set DABR on POWER8
  kvm/ppc: IRQ disabling cleanup
  ...
2014-01-31 08:37:32 -08:00
Dominik Dingel
3c038e6be0 KVM: async_pf: Async page fault support on s390
This patch enables async page faults for s390 kvm guests.
It provides the userspace API to enable and disable_wait this feature.
The disable_wait will enforce that the feature is off by waiting on it.
Also it includes the diagnose code, called by the guest to enable async page faults.

The async page faults will use an already existing guest interface for this
purpose, as described in "CP Programming Services (SC24-6084)".

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-01-30 13:11:02 +01:00
Dominik Dingel
24eb3a824c KVM: s390: Add FAULT_FLAG_RETRY_NOWAIT for guest fault
In the case of a fault, we will retry to exit sie64 but with gmap fault
indication for this thread set. This makes it possible to handle async
page faults.

Based on a patch from Martin Schwidefsky.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-01-30 12:50:39 +01:00
Jens Freimann
a91b8ebe86 KVM: s390: limit floating irqs
Userspace can flood the kernel with interrupts as of now, so let's
limit the number of pending floating interrupts injected via either
the floating interrupt controller or the KVM_S390_INTERRUPT ioctl.

We can have up to 4*64k pending subchannels + 8 adapter interrupts,
as well as up to ASYNC_PF_PER_VCPU*KVM_MAX_VCPUS pfault done interrupts.
There are also sclp and machine checks. This gives us
(4*65536+8+64*64+1+1) = 266250 interrupts.

Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-01-30 10:25:23 +01:00
Jens Freimann
c05c4186bb KVM: s390: add floating irq controller
This patch adds a floating irq controller as a kvm_device.
It will be necessary for migration of floating interrupts as well
as for hardening the reset code by allowing user space to explicitly
remove all pending floating interrupts.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-01-30 10:25:20 +01:00
Jens Freimann
81aa8efe01 KVM: s390: add and extend interrupt information data structs
With the currently available struct kvm_s390_interrupt it is not possible to
inject every kind of interrupt as defined in the z/Architecture. Add
additional interruption parameters to the structures and move it to kvm.h

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-01-29 22:00:41 +01:00
Linus Torvalds
e770d73ceb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 patches from Martin Schwidefsky:
 "A new binary interface to be able to query and modify the LPAR
  scheduler weight and cap settings.  Some improvements for the hvc
  terminal over iucv and a couple of bux fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/hypfs: add interface for diagnose 0x304
  s390: wire up sys_sched_setattr/sys_sched_getattr
  s390/uapi: fix struct statfs64 definition
  s390/uaccess: remove dead extern declarations, make functions static
  s390/uaccess: test if current->mm is set before walking page tables
  s390/zfcpdump: make zfcpdump depend on 64BIT
  s390/32bit: fix cmpxchg64
  s390/xpram: don't modify module parameters
  s390/zcrypt: remove zcrypt kmsg documentation again
  s390/hvc_iucv: Automatically assign free HVC terminal devices
  s390/hvc_iucv: Display connection details through device attributes
  s390/hvc_iucv: fix sparse warning
  s390/vmur: Link parent CCW device during UR device creation
2014-01-28 09:02:24 -08:00
Linus Torvalds
4ba9920e5e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) BPF debugger and asm tool by Daniel Borkmann.

 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.

 3) Correct reciprocal_divide and update users, from Hannes Frederic
    Sowa and Daniel Borkmann.

 4) Currently we only have a "set" operation for the hw timestamp socket
    ioctl, add a "get" operation to match.  From Ben Hutchings.

 5) Add better trace events for debugging driver datapath problems, also
    from Ben Hutchings.

 6) Implement auto corking in TCP, from Eric Dumazet.  Basically, if we
    have a small send and a previous packet is already in the qdisc or
    device queue, defer until TX completion or we get more data.

 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.

 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
    Borkmann.

 9) Share IP header compression code between Bluetooth and IEEE802154
    layers, from Jukka Rissanen.

10) Fix ipv6 router reachability probing, from Jiri Benc.

11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.

12) Support tunneling in GRO layer, from Jerry Chu.

13) Allow bonding to be configured fully using netlink, from Scott
    Feldman.

14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
    already get the TCI.  From Atzm Watanabe.

15) New "Heavy Hitter" qdisc, from Terry Lam.

16) Significantly improve the IPSEC support in pktgen, from Fan Du.

17) Allow ipv4 tunnels to cache routes, just like sockets.  From Tom
    Herbert.

18) Add Proportional Integral Enhanced packet scheduler, from Vijay
    Subramanian.

19) Allow openvswitch to mmap'd netlink, from Thomas Graf.

20) Key TCP metrics blobs also by source address, not just destination
    address.  From Christoph Paasch.

21) Support 10G in generic phylib.  From Andy Fleming.

22) Try to short-circuit GRO flow compares using device provided RX
    hash, if provided.  From Tom Herbert.

The wireless and netfilter folks have been busy little bees too.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
  net/cxgb4: Fix referencing freed adapter
  ipv6: reallocate addrconf router for ipv6 address when lo device up
  fib_frontend: fix possible NULL pointer dereference
  rtnetlink: remove IFLA_BOND_SLAVE definition
  rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
  qlcnic: update version to 5.3.55
  qlcnic: Enhance logic to calculate msix vectors.
  qlcnic: Refactor interrupt coalescing code for all adapters.
  qlcnic: Update poll controller code path
  qlcnic: Interrupt code cleanup
  qlcnic: Enhance Tx timeout debugging.
  qlcnic: Use bool for rx_mac_learn.
  bonding: fix u64 division
  rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
  sfc: Use the correct maximum TX DMA ring size for SFC9100
  Add Shradha Shah as the sfc driver maintainer.
  net/vxlan: Share RX skb de-marking and checksum checks with ovs
  tulip: cleanup by using ARRAY_SIZE()
  ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
  net/cxgb4: Don't retrieve stats during recovery
  ...
2014-01-25 11:17:34 -08:00
Martin Schwidefsky
07be038209 s390/hypfs: add interface for diagnose 0x304
To provide access to the set-partition-resource-parameter interface
to user space add a new attribute to hypfs/debugfs:
 * s390_hypsfs/diag_304
The data for the query-partition-resource-parameters command can
be access by a read on the attribute. All other diagnose 0x304
requests need to be submitted via ioctl with CAP_SYS_ADMIN rights.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-01-24 09:40:59 +01:00
Paolo Bonzini
c760f5e29d This deals with 2 guest features that need enablement in the kvm host:
- transactional execution
 - lpp sampling support
 
 In addition there is also a fix to the virtio-ccw guest driver. This will
 enable future features
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJS2SgQAAoJEBF7vIC1phx8f48P/Rfb/Gg6YT8impCxUr9xaCBM
 X5lI48HbY7o/b3pQ624VlBUTuv6Hwo0HTVwNExKK0XzS/e9LXFch5dZ03EvFnVYX
 3KcAOQ1mlJwYz2At8WdIHj+UHWSiVtNSq6T/rFILMBXqQw/d20NBG2t6J79Pa84G
 /WkexHv3Q9VKTWZUl05fmbnYTDtEPbVfTt85EbjaHxuUS8ahibYws+GSWcH2eDYe
 NYtXjrnDJwpoNM0OsyyGItiwNnIQ0ISzxwCzgtu97re9VTKEUoCqkEsMEf1lYr2+
 t35RKzuPSvIrVufYf1+L553n9RRAckdHyq/trV70QNj69RoVA8qBii8HuQhN+2WP
 z+GzCqFv5mMFG2dzoBnrKG77cMXuKFvV9AjyaKPKHg/sty18jXFWzl8YFHVIwngV
 /KvQx5/+GznsETI5mHAn7BHlOWm1+Wk+I9Mkh6XySlglxlDvH0LTybJDnehhTotX
 wqPj6X+Qjq1AytDCpExQzDNfeLZx8jYbus4KOo8vptXNRKEyWY2yg5XJxjyjbyp4
 0JOorFgl0zrV04+JMhWkLZY5sPzH7/tHe0hJ8VDzow2+IRwhEu31zLmAwFvsb/Ih
 6XL1ioncWpCFDGwLcayNMHKAU/k4C5jDxzFJHrLvK8qKMM/SA7LLkaPUhsuUr+oX
 SlO7Pk859ckzjc5xfCzd
 =GbUg
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-20140117' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-queue

This deals with 2 guest features that need enablement in the kvm host:
- transactional execution
- lpp sampling support

In addition there is also a fix to the virtio-ccw guest driver. This will
enable future features
2014-01-23 11:38:13 +01:00
Linus Torvalds
7ebd3faa9b First round of KVM updates for 3.14; PPC parts will come next week.
Nothing major here, just bugfixes all over the place.  The most
 interesting part is the ARM guys' virtualized interrupt controller
 overhaul, which lets userspace get/set the state and thus enables
 migration of ARM VMs.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJS3TVKAAoJEBvWZb6bTYbyIFgP/2cmt4ifCuFMaZv4+G1S8jZU
 uC9ZB/+7vzht/p6zAy+4BxurKbHmSBFkC1OKcxYuy7yB4CQkHabzj4V2vRtqFdwH
 5lExP9qh3kqaVLuhnvxLTmkktR3EW4PFy6OI53l5kRNktOXSuZ0aN6K3V7tCg/X0
 iL7ASo4bJKlxeWcDpmuVrNgAajmZVfXrjKY7robgBQno+yIsgKhRZRBQHjozA6B8
 FpCo/k48RZd/EzIbV/PDDRI4hmmry/lgrO9SKjzq56wSqff2bd/k/KYze4dbAPfd
 Ps60enPTuHmeEjjb4MMMU4EKHVdTQFUMx/xZCmT4xzoh8s4of6RHphXbfE0SUznQ
 dTveyEQAR7E3JNS0k1+3WEX5fWlFesp0hO2NeE0wzUq4TAr9ztgVO9NQ6Si15e7Z
 2HysO0T5Ojtt0lY08/PvS6i48eCAuuBomrejJS8hLW4SUZ5adn+yW4Qo7Fp9JeBR
 l9a3LsVT8BZMtUWrUuFcVhlM4MbzElUPjDbgWhR8UYU/kpfVZOQu8qWgGKR4UWXy
 X7/t9l/tjR99CmfMJBAOzJid+ScSpAfg77BdaKiQrVfVIJmsjEjlO8vUMyj5b1HF
 hPX5wNyJjHAOfridLeHSs4Rdm4a8sk8Az5d4h76pLVz8M4jyTi2v0rO3N4/dU/pu
 x7N8KR5hAj+mLBoM9/Al
 =8sYU
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "First round of KVM updates for 3.14; PPC parts will come next week.

  Nothing major here, just bugfixes all over the place.  The most
  interesting part is the ARM guys' virtualized interrupt controller
  overhaul, which lets userspace get/set the state and thus enables
  migration of ARM VMs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
  kvm: make KVM_MMU_AUDIT help text more readable
  KVM: s390: Fix memory access error detection
  KVM: nVMX: Update guest activity state field on L2 exits
  KVM: nVMX: Fix nested_run_pending on activity state HLT
  KVM: nVMX: Clean up handling of VMX-related MSRs
  KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
  KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
  KVM: nVMX: Leave VMX mode on clearing of feature control MSR
  KVM: VMX: Fix DR6 update on #DB exception
  KVM: SVM: Fix reading of DR6
  KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
  add support for Hyper-V reference time counter
  KVM: remove useless write to vcpu->hv_clock.tsc_timestamp
  KVM: x86: fix tsc catchup issue with tsc scaling
  KVM: x86: limit PIT timer frequency
  KVM: x86: handle invalid root_hpa everywhere
  kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
  kvm: vfio: silence GCC warning
  KVM: ARM: Remove duplicate include
  arm/arm64: KVM: relax the requirements of VMA alignment for THP
  ...
2014-01-22 21:40:43 -08:00
Heiko Carstens
38ea1f358b s390/32bit: fix cmpxchg64
Fix broken inline assembly contraints for cmpxchg64 on 32bit.

Fixes this crash:

specification exception: 0006 [#1] SMP
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0 #4
task: 005a16c8 ti: 00592000 task.ti: 00592000
Krnl PSW : 070ce000 8029abd6 (lockref_get+0x3e/0x9c)
...
Krnl Code: 8029abcc: a71a0001           ahi     %r1,1
           8029abd0: 1852               lr      %r5,%r2
          #8029abd2: bb40f064           cds     %r4,%r0,100(%r15)
          >8029abd6: 1943               cr      %r4,%r3
           8029abd8: 1815               lr      %r1,%r5
Call Trace:
([<0000000078e01870>] 0x78e01870)
 [<000000000021105a>] sysfs_mount+0xd2/0x1c8
 [<00000000001b551e>] mount_fs+0x3a/0x134
 [<00000000001ce768>] vfs_kern_mount+0x44/0x11c
 [<00000000001ce864>] kern_mount_data+0x24/0x3c
 [<00000000005cc4b8>] sysfs_init+0x74/0xd4
 [<00000000005cb5b4>] mnt_init+0xe0/0x1fc
 [<00000000005cb16a>] vfs_caches_init+0xb6/0x14c
 [<00000000005be794>] start_kernel+0x318/0x33c
 [<000000000010001c>] _stext+0x1c/0x80

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-01-22 14:02:15 +01:00
Linus Torvalds
6ffbe7d1fa Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking changes from Ingo Molnar:
 - futex performance increases: larger hashes, smarter wakeups
 - mutex debugging improvements
 - lots of SMP ordering documentation updates
 - introduce the smp_load_acquire(), smp_store_release() primitives.
   (There are WIP patches that make use of them - not yet merged)
 - lockdep micro-optimizations
 - lockdep improvement: better cover IRQ contexts
 - liblockdep at last. We'll continue to monitor how useful this is

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  futexes: Fix futex_hashsize initialization
  arch: Re-sort some Kbuild files to hopefully help avoid some conflicts
  futexes: Avoid taking the hb->lock if there's nothing to wake up
  futexes: Document multiprocessor ordering guarantees
  futexes: Increase hash table size for better performance
  futexes: Clean up various details
  arch: Introduce smp_load_acquire(), smp_store_release()
  arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h
  arch: Move smp_mb__{before,after}_atomic_{inc,dec}.h into asm/atomic.h
  locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
  mutexes: Give more informative mutex warning in the !lock->owner case
  powerpc: Full barrier for smp_mb__after_unlock_lock()
  rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods
  Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK
  locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier
  Documentation/memory-barriers.txt: Document ACCESS_ONCE()
  Documentation/memory-barriers.txt: Prohibit speculative writes
  Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt
  Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt
  Revert "smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency"
  ...
2014-01-20 10:23:08 -08:00
Linus Torvalds
f479c01c8e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "The bulk of the s390 updates for v3.14.

  New features are the perf support for the CPU-Measurement Sample
  Facility and the EP11 support for the crypto cards.  And the normal
  cleanups and bug-fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (44 commits)
  s390/cpum_sf: fix printk format warnings
  s390: Fix misspellings using 'codespell' tool
  s390/qdio: bridgeport support - CHSC part
  s390: delete new instances of __cpuinit usage
  s390/compat: fix PSW32_USER_BITS definition
  s390/zcrypt: add support for EP11 coprocessor cards
  s390/mm: optimize randomize_et_dyn for !PF_RANDOMIZE
  s390: use IS_ENABLED to check if a CONFIG is set to y or m
  s390/cio: use device_lock to synchronize calls to the ccwgroup driver
  s390/cio: use device_lock to synchronize calls to the ccw driver
  s390/cio: fix unlocked access of online member
  s390/cpum_sf: Add flag to process full SDBs only
  s390/cpum_sf: Add raw data sampling to support the diagnostic-sampling function
  s390/cpum_sf: Filter perf events based event->attr.exclude_* settings
  s390/cpum_sf: Detect KVM guest samples
  s390/cpum_sf: Add helper to read TOD from trailer entries
  s390/cpum_sf: Atomically reset trailer entry fields of sample-data-blocks
  s390/cpum_sf: Dynamically extend the sampling buffer if overflows occur
  s390/pci: reenable per default
  s390/pci/dma: fix accounting of allocated_pages
  ...
2014-01-20 09:23:31 -08:00
Michael Mueller
7feb6bb8e6 KVM: s390: enable Transactional Execution
This patch enables transactional execution for KVM guests
on s390 systems zec12 or later.

We rework the allocation of the page containing the sie_block
to also back the Interception Transaction Diagnostic Block.
If available the TE facilities will be enabled.

Setting bit 73 and 50 in vfacilities bitmask reveals the HW
facilities Transactional Memory and Constraint Transactional
Memory respectively to the KVM guest.

Furthermore, the patch restores the Program-Interruption TDB
from the Interception TDB in case a program interception has
occurred and the ITDB has a valid format.

Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-01-17 13:12:01 +01:00
Hendrik Brueckner
b4a960159e s390: Fix misspellings using 'codespell' tool
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-01-16 16:40:13 +01:00
Eugene Crosser
59b55a4df2 s390/qdio: bridgeport support - CHSC part
Introduce function for the "Perform network-subchannel operation"
CHSC command with operation code "bridgeport information",
and bit definitions for "characteristics" pertaning to this command.

Signed-off-by: Eugene Crosser <eugene.crosser@ru.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15 14:48:01 -08:00
David S. Miller
0a379e21c5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-14 14:42:42 -08:00
Eugene Crosser
1c59a861d6 s390/qdio: bridgeport support - CHSC part
Introduce function for the "Perform network-subchannel operation"
CHSC command with operation code "bridgeport information",
and bit definitions for "characteristics" pertaning to this command.

Signed-off-by: Eugene Crosser <eugene.crosser@ru.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-01-14 15:16:09 +01:00
Heiko Carstens
075dfd8210 s390/compat: fix PSW32_USER_BITS definition
PSW32_USER_BITS should define the primary address space for user space
instead of the home address space.
Symptom of this bug is that gdb doesn't work in compat mode.

The bug was introduced with e258d719ff "s390/uaccess: always run the kernel
in home space" and f26946d7ec "s390/compat: make psw32_user_bits a constant
value again".

Cc: stable@vger.kernel.org # v3.13+
Reported-by: Andreas Arnez <arnez@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-01-13 16:50:21 +01:00
Ingo Molnar
1c62448e39 Linux 3.13-rc8
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJS0miqAAoJEHm+PkMAQRiGbfgIAJSWEfo8ludknhPcHJabBtxu
 75SQAKJlL3sBVnxEc58Rtt8gsKYQIrm4IY5Slunklsn04RxuDUIQMgFoAYR5gQwz
 +Myqkw/HOqDe5VStGxtLYpWnfglxVwGDCd7ISfL9AOVy5adMWBxh4Tv+qqQc7aIZ
 eF7dy+DD+C6Q3Z5OoV8s0FZDxse29vOf17Nki7+7t8WMqyegYwjoOqNeqocGKsPi
 eHLrJgTl4T6jB4l9LKKC154DSKjKOTSwZMWgwK8mToyNLT/ufCiKgXloIjEvZZcY
 VVKUtncdHiTf+iqVojgpGBzOEeB5DM83iiapFeDiJg8C9yBzvT8lBtA9aPb5Wgw=
 =lEeV
 -----END PGP SIGNATURE-----

Merge tag 'v3.13-rc8' into core/locking

Refresh the tree with the latest fixes, before applying new changes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 11:44:41 +01:00
Peter Zijlstra
47933ad41a arch: Introduce smp_load_acquire(), smp_store_release()
A number of situations currently require the heavyweight smp_mb(),
even though there is no need to order prior stores against later
loads.  Many architectures have much cheaper ways to handle these
situations, but the Linux kernel currently has no portable way
to make use of them.

This commit therefore supplies smp_load_acquire() and
smp_store_release() to remedy this situation.  The new
smp_load_acquire() primitive orders the specified load against
any subsequent reads or writes, while the new smp_store_release()
primitive orders the specifed store against any prior reads or
writes.  These primitives allow array-based circular FIFOs to be
implemented without an smp_mb(), and also allow a theoretical
hole in rcu_assign_pointer() to be closed at no additional
expense on most architectures.

In addition, the RCU experience transitioning from explicit
smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
and rcu_assign_pointer(), respectively resulted in substantial
improvements in readability.  It therefore seems likely that
replacing other explicit barriers with smp_load_acquire() and
smp_store_release() will provide similar benefits.  It appears
that roughly half of the explicit barriers in core kernel code
might be so replaced.

[Changelog by PaulMck]

Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-12 10:37:17 +01:00
David S. Miller
143c905494 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/i40e/i40e_main.c
	drivers/net/macvtap.c

Both minor merge hassles, simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18 16:42:06 -05:00
Heiko Carstens
d80512f874 s390/smp: improve setup of possible cpu mask
Since under z/VM we cannot have more than 64 cpus, make sure the
cpu_possible_mask does not contain more bits.
This avoids wasting memory for dynamic per-cpu allocations if
CONFIG_NR_CPUS is larger than 64.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-18 17:35:18 +01:00
David S. Miller
e3fec2f74f lib: Add missing arch generic-y entries for asm-generic/hash.h
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-17 21:26:19 -05:00
Hendrik Brueckner
d7528862cf s390/cpum_sf: Add flag to process full SDBs only
Add the PERF_CPUM_SF_FULL_BLOCKS flag to process only sample-data-blocks that
have the block-full-indicator bit set.  Sample-data-blocks that are partially
filled are discarded.  Use this flag if the sampling buffer is likely to be
shared among perf events that use different sampling modes.  In such
environments, flushing sample-data-blocks that are not completely filled, might
cause invalid-data-formats.

Setting PERF_CPUM_SF_FULL_BLOCKS prevents potentially invalid sampling data to
be processed but, in contrast, also discards valid samples in partially filled
sample-data-blocks.  Note that sample-data-blocks might not become full for
small sampling frequencies or for workload that is scheduled for tiny intervals.

To sample with the PERF_CPUM_SF_FULL_BLOCKS flag, set the perf->attr.config1
to 0x0004.  For example:

	perf record -e cpum_sf/config=0xB000,config1=0x0004/

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:38:01 +01:00
Hendrik Brueckner
7e75fc3ff4 s390/cpum_sf: Add raw data sampling to support the diagnostic-sampling function
Also support the diagnostic-sampling function in addition to the basic-sampling
function.  Diagnostic-sampling data entries contain hardware model specific
sampling data and additional programs are required to analyze the data.

To deliver diagnostic-sampling, as well, as basis-sampling data entries to user
space, introduce support for sampling "raw data".  If this particular perf
sampling type (PERF_SAMPLE_RAW) is used, sampling data entries are copied
to user space.  External programs can then analyze these data.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:38:00 +01:00
Hendrik Brueckner
443e802bab s390/cpum_sf: Detect KVM guest samples
The host-program-parameter (hpp) value of basic sample-data-entries designates
a SIE control block that is set by the LPP instruction in sie64a().
Non-zero values indicate guest samples, a value of zero indicates a host sample.

For perf samples, host and guest samples are distinguished using particular
PERF_MISC_* flags.  The perf layer calls perf_misc_flags() to set the flags
based on the pt_regs content.  For each sample-data-entry, the cpum_sf PMU
creates a pt_regs structure with the sample-data information.  An additional
flag structure is added to easily distinguish between host and guest samples.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:59 +01:00
Hendrik Brueckner
443d4beb82 s390/cpum_sf: Add helper to read TOD from trailer entries
The trailer entry contains a timestamp of the time when the sample-data-block
became full.  The timestamp specifies a TOD (time-of-day) value in either the
STCK or STCKE format.

Provide a helper function to return the TOD value depending on the setting of
time format indicator.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:58 +01:00
Hendrik Brueckner
fcc77f5073 s390/cpum_sf: Atomically reset trailer entry fields of sample-data-blocks
Ensure to reset the sample-data-block full indicator and the overflow counter
at the same time.  This must be done atomically because the sampling hardware
is still active while full sample-data-block is processed.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:57 +01:00
Hendrik Brueckner
69f239ed33 s390/cpum_sf: Dynamically extend the sampling buffer if overflows occur
Improve the sampling buffer allocation and add a function to reallocate and
increase the sampling buffer structure.  The number of allocated buffer elements
(sample-data-blocks) are accounted.  You can control the minimum and maximum
number these sample-data-blocks through the cpum_sfb_size kernel parameter.

The number hardware sample overflows (if any) are also accounted and stored
per perf event.  During the PMU disable/enable calls, the accumulated overflow
counter is analyzed and, if necessary, the sampling buffer is dynamically
increased.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:57 +01:00
Sebastian Ott
aa3b7c2967 s390/pci: prevent inadvertently triggered bus scans
Initialization and scanning of the pci bus is omitted on older
machines without pci support or if pci=off was specified. Remember
the fact that we ran without pci support and prevent further bus
scans during resume from hibernate or after receiving hotplug
notifications.

Reported-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:54 +01:00
Hendrik Brueckner
e28bb79d99 s390/perf,oprofile: Share sampling facility
Introduce reserve/release functions to share the sampling facility
between perf and oprofile.
Also improve error handling for the sampling facility support in perf.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:52 +01:00
Hendrik Brueckner
8c069ff4bd s390/perf: add support for the CPU-Measurement Sampling Facility
Introduce a perf PMU, "cpum_sf", to support the CPU-Measurement
Sampling Facility.  You can control the sampling facility through
this perf PMU interfaces.  Perf sampling events are created for
hardware samples.

For details about the CPU-Measurement Sampling Facility, see
"The Load-Program-Parameter and the CPU-Measurement Facilities" (SA23-2260).

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:51 +01:00
Hendrik Brueckner
c716832513 s390/cpum_cf: Export event names in sysfs
Provide PMU event attributes for supported counters and export their symbolic
names to the sysfs "events" directory.

See the /sys/devices/cpum_cf/events/ directory for a list of available counters.
Note that you might require counter set authorizations for the LPAR to use them.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:50 +01:00
Hendrik Brueckner
cf48ad8327 s390/oprofile: move hwsampler interfaces to cpu_mf.h
Extract and move the oprofile hwsampler data structures and interfaces to
the cpu_mf.h header file which contains common interface definitions
for the various CPU-measurement facilities.   This change is necessary for
a new perf PMU.

Few interface names have been revised to fit to the latest CPU-measurement
facilities documentation.  Also declare the data structures as __packed and
correct checkpatch findings.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:50 +01:00
Hendrik Brueckner
52733e0152 s390/sclp_early: Add function to detect sclp console capabilities
Add SCLP console detect functions to encapsulate detection of SCLP console
capabilities, for example, VT220 support.  Reuse the sclp_send/receive masks
that were stored by the most recent sclp_set_event_mask() call to prevent
unnecessary SCLP calls.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-16 14:37:49 +01:00
Paolo Bonzini
6bb05ef785 Some further s390 patches for kvm-next.
Various improvements and bugfixes in the signal processor handling.
 Document kvm support for diagnose (s390 hypercalls). And last but
 not least, fix a bug in the s390 ioeventfd backend that was causing
 us grief in scenarios with 4G+ memory.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJSqLJhAAoJEN7Pa5PG8C+vfdgP/3TzhJDOlQQBUSrmgE7TzaXR
 qHXMVDsWRzrpkrXXjLnHrmwTl5+d5y6yz4TQGhpDEQsMSzTHtv0xq+2qdED92P5P
 L2RNTBFeilRMke1j71uWYszJBiiEoyePg1Cj0rJ84fGJobc6NgXIjX+pftvtsMpC
 Xid54/m3DmBKJBHIIQKz2EwUHc3/TQj3wnIPjK5f6Lb7d231131fn5heEQj9TVCh
 Vc2Ei/c/KGf8afgzPtA29zIQX8y2KuPcbh5ouGI9bHIgBuLN9FbYV3V9CsDNc2lM
 rKSCeXwExdaZbtQ/kStbIKRFLR6D7JFSjUZORGgOq03wMvGUztckJ1WhbAw+ufef
 vPDtKY4h9Jl1tEuErmnuecspJafjrsY3q1d1quxOOl+anU6CfX8DZ/4xbXtU8XYd
 CRHEsxvJNXKPge4jpCmukJP8iJ4HaGdZYeiy9mdNEAhQZhmeuPdIzt+DtJjNZpup
 C1ofz3a7mz5c+lTMWC2xOwLmqpPCioIqYUSKYmzWimWBXmpRqoJtOmZAvT+BqDY2
 +GTTk4JJoj4vhxQNRFROuvZ710H/NNF/Nm8Mex776IDXg3vNuPuxu9yFtf8k8fiY
 UcIZ8Y8PEYhpm4VN/fdFM6pwTvHqFHtAhqaofp3Tb6wE0JCiVZ7b2E5jJuIYE6wS
 EP+r1TCuGi5OH02doEKH
 =/dj0
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-20131211' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next

Some further s390 patches for kvm-next.

Various improvements and bugfixes in the signal processor handling.
Document kvm support for diagnose (s390 hypercalls). And last but
not least, fix a bug in the s390 ioeventfd backend that was causing
us grief in scenarios with 4G+ memory.
2013-12-12 11:41:33 +01:00
Thomas Huth
58bc33b2b7 KVM: s390: SIGP START has to report BUSY while stopping a CPU
Just like the RESTART order, the START order also has to report BUSY
while a STOP request is pending, to avoid that the START might be
ignored due to a race condition between the STOP and the START order.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2013-12-11 19:05:21 +01:00
Thomas Huth
b13d3580ee KVM: s390: Add the SIGP order CONDITIONAL EMERGENCY SIGNAL
This patch adds the missing SIGP order "conditional emergency
signal" by calling the "emergency signal" SIGP handler if the
required conditions are met.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2013-12-11 19:04:37 +01:00
Heiko Carstens
9c0952344d s390/smp,sclp: fix size of sclp_cpu_info structure
struct sclp_cpu_info contains entries only for 255 cpus, while the new
smp fallback sigp detection code will fill up to 256 entries.
Even though there is no machine available which has 256 cpus and where
in addition the fallback sigp cpu detection code will be used we better
fix this, to prevent out of bound accesses.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-12-02 15:31:08 +01:00
Martin Schwidefsky
79c74ecbeb s390/time,vdso: convert to the new update_vsyscall interface
Switch to the improved update_vsyscall interface that provides
sub-nanosecond precision for gettimeofday and clock_gettime.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-25 09:15:39 +01:00
Heiko Carstens
dba6bb6004 s390/mm: optimize copy_page
Always use the mvcl instruction to copy a page instead of mvpg or a
couple of mvc instructions.
Copying a huge page is 25% faster this way. Also bypass caches when
copying pages since only parts of a page will be used afterwards.
Especially when copying a huge page this would kick everything out
of the L1 and L2 data caches on a zEC12 machine.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-11-20 09:04:55 +01:00
Linus Torvalds
806dace637 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull second set of s390 patches from Martin Schwidefsky:
 "The handling of the PCI hotplug notifications has been improved, the
  zfcp dumper can now detect the HSA size dynamically and the default
  install kernel has been changed to the compressed bzImage.  And two
  bug-fixes for scm and 3720"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: implement hotplug notifications
  s390/scm_block: do not hide eadm subchannel dependency
  s390/sclp: Consolidate early sclp init calls to sclp_early_detect()
  s390/sclp: Move early code from sclp_cmd.c to sclp_early.c
  s390/sclp: Determine HSA size dynamically for zfcpdump
  s390/sclp: Move declarations for sclp_sdias into separate header file
  s390/pci: implement pcibios_remove_bus
  s390/pci: improve handling of bus resources
  s390/3270: fix missing device_destroy() call
  s390/boot: Install bzImage as default kernel image
2013-11-19 11:43:21 -08:00
Linus Torvalds
4007162647 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq cleanups from Ingo Molnar:
 "This is a multi-arch cleanup series from Thomas Gleixner, which we
  kept to near the end of the merge window, to not interfere with
  architecture updates.

  This series (motivated by the -rt kernel) unifies more aspects of IRQ
  handling and generalizes PREEMPT_ACTIVE"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  preempt: Make PREEMPT_ACTIVE generic
  sparc: Use preempt_schedule_irq
  ia64: Use preempt_schedule_irq
  m32r: Use preempt_schedule_irq
  hardirq: Make hardirq bits generic
  m68k: Simplify low level interrupt handling code
  genirq: Prevent spurious detection for unconditionally polled interrupts
2013-11-19 10:40:00 -08:00
Sebastian Ott
605c36986c s390/scm_block: do not hide eadm subchannel dependency
Stop hiding scm_block's dependency to the eadm subchannel driver
(by using functions provided by the eadm subchannel instead of
wrappers provided by the scm bus).

This will help userspace recognizing module dependencies (e.g. for
building a ramdisk). As a side effect we can get rid of some code
reimplementing refcounting between those modules.

Reported-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-15 14:08:42 +01:00
Michael Holzheu
7b50da53f6 s390/sclp: Consolidate early sclp init calls to sclp_early_detect()
The new function calls the old ones. The sclp_event_mask_early() is removed
and replaced by one invocation of sclp_set_event_mask(0, 0).

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-15 14:08:41 +01:00
Michael Holzheu
acf6a004e6 s390/sclp: Move early code from sclp_cmd.c to sclp_early.c
The early SCLP driver code in sclp_cmd.c belongs to sclp_early.c
because it is independent from the 'normal' SCLP driver. So move
it to sclp_early.c

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-15 14:08:41 +01:00
Michael Holzheu
e657d8fe2f s390/sclp: Determine HSA size dynamically for zfcpdump
Currently we have hardcoded the HSA size to 32 MiB. With this patch the
HSA size is determined dynamically via SCLP in early.c.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-15 14:08:40 +01:00
Sebastian Ott
7d594322b2 s390/pci: implement pcibios_remove_bus
Implement pcibios_remove_bus to free arch specific data when a pci
bus is deregistered. While at it remove a useless kzalloc/kfree
wrapper.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-15 14:08:38 +01:00
Sebastian Ott
7a572a3ab1 s390/pci: improve handling of bus resources
Cleanup the functions for allocation and setup of bus resources. Do
not allocate the same name for each resource but use a per-bus name.
Also provide means to cleanup all resources allocated by a bus.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-15 14:08:37 +01:00
Linus Torvalds
f080480488 Here are the 3.13 KVM changes. There was a lot of work on the PPC
side: the HV and emulation flavors can now coexist in a single kernel
 is probably the most interesting change from a user point of view.
 On the x86 side there are nested virtualization improvements and a
 few bugfixes.  ARM got transparent huge page support, improved
 overcommit, and support for big endian guests.
 
 Finally, there is a new interface to connect KVM with VFIO.  This
 helps with devices that use NoSnoop PCI transactions, letting the
 driver in the guest execute WBINVD instructions.  This includes
 some nVidia cards on Windows, that fail to start without these
 patches and the corresponding userspace changes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJShPAhAAoJEBvWZb6bTYbyl48P/297GgmELHAGBgjvb6q7yyGu
 L8+eHjKbh4XBAkPwyzbvUjuww5z2hM0N3JQ0BDV9oeXlO+zwwCEns/sg2Q5/NJXq
 XxnTeShaKnp9lqVBnE6G9rAOUWKoyLJ2wItlvUL8JlaO9xJ0Vmk0ta4n2Nv5GqDp
 db6UD7vju6rHtIAhNpvvAO51kAOwc01xxRixCVb7KUYOnmO9nvpixzoI/S0Rp1gu
 w/OWMfCosDzBoT+cOe79Yx1OKcpaVW94X6CH1s+ShCw3wcbCL2f13Ka8/E3FIcuq
 vkZaLBxio7vjUAHRjPObw0XBW4InXEbhI1DjzIvm8dmc4VsgmtLQkTCG8fj+jINc
 dlHQUq6Do+1F4zy6WMBUj8tNeP1Z9DsABp98rQwR8+BwHoQpGQBpAxW0TE0ZMngC
 t1caqyvjZ5pPpFUxSrAV+8Kg4AvobXPYOim0vqV7Qea07KhFcBXLCfF7BWdwq/Jc
 0CAOlsLL4mHGIQWZJuVGw0YGP7oATDCyewlBuDObx+szYCoV4fQGZVBEL0KwJx/1
 7lrLN7JWzRyw6xTgJ5VVwgYE1tUY4IFQcHu7/5N+dw8/xg9KWA3f4PeMavIKSf+R
 qteewbtmQsxUnvuQIBHLs8NRWPnBPy+F3Sc2ckeOLIe4pmfTte6shtTXcLDL+LqH
 NTmT/cfmYp2BRkiCfCiS
 =rWNf
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM changes from Paolo Bonzini:
 "Here are the 3.13 KVM changes.  There was a lot of work on the PPC
  side: the HV and emulation flavors can now coexist in a single kernel
  is probably the most interesting change from a user point of view.

  On the x86 side there are nested virtualization improvements and a few
  bugfixes.

  ARM got transparent huge page support, improved overcommit, and
  support for big endian guests.

  Finally, there is a new interface to connect KVM with VFIO.  This
  helps with devices that use NoSnoop PCI transactions, letting the
  driver in the guest execute WBINVD instructions.  This includes some
  nVidia cards on Windows, that fail to start without these patches and
  the corresponding userspace changes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits)
  kvm, vmx: Fix lazy FPU on nested guest
  arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu
  arm/arm64: KVM: MMIO support for BE guest
  kvm, cpuid: Fix sparse warning
  kvm: Delete prototype for non-existent function kvm_check_iopl
  kvm: Delete prototype for non-existent function complete_pio
  hung_task: add method to reset detector
  pvclock: detect watchdog reset at pvclock read
  kvm: optimize out smp_mb after srcu_read_unlock
  srcu: API for barrier after srcu read unlock
  KVM: remove vm mmap method
  KVM: IOMMU: hva align mapping page size
  KVM: x86: trace cpuid emulation when called from emulator
  KVM: emulator: cleanup decode_register_operand() a bit
  KVM: emulator: check rex prefix inside decode_register()
  KVM: x86: fix emulation of "movzbl %bpl, %eax"
  kvm_host: typo fix
  KVM: x86: emulate SAHF instruction
  MAINTAINERS: add tree for kvm.git
  Documentation/kvm: add a 00-INDEX file
  ...
2013-11-15 13:51:36 +09:00
Thomas Gleixner
00d1a39e69 preempt: Make PREEMPT_ACTIVE generic
No point in having this bit defined by architecture.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130917183629.090698799@linutronix.de
2013-11-13 20:21:47 +01:00
Thomas Gleixner
54197e43a4 hardirq: Make hardirq bits generic
There is no reason for per arch hardirq bits. Make them all generic

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130917183628.534494408@linutronix.de
2013-11-13 20:21:46 +01:00
Linus Torvalds
39cf275a1a Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
 "The main changes in this cycle are:

   - (much) improved CONFIG_NUMA_BALANCING support from Mel Gorman, Rik
     van Riel, Peter Zijlstra et al.  Yay!

   - optimize preemption counter handling: merge the NEED_RESCHED flag
     into the preempt_count variable, by Peter Zijlstra.

   - wait.h fixes and code reorganization from Peter Zijlstra

   - cfs_bandwidth fixes from Ben Segall

   - SMP load-balancer cleanups from Peter Zijstra

   - idle balancer improvements from Jason Low

   - other fixes and cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
  ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED
  stop_machine: Fix race between stop_two_cpus() and stop_cpus()
  sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus
  sched: Fix asymmetric scheduling for POWER7
  sched: Move completion code from core.c to completion.c
  sched: Move wait code from core.c to wait.c
  sched: Move wait.c into kernel/sched/
  sched/wait: Fix __wait_event_interruptible_lock_irq_timeout()
  sched: Avoid throttle_cfs_rq() racing with period_timer stopping
  sched: Guarantee new group-entities always have weight
  sched: Fix hrtimer_cancel()/rq->lock deadlock
  sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
  sched: Fix race on toggling cfs_bandwidth_used
  sched: Remove extra put_online_cpus() inside sched_setaffinity()
  sched/rt: Fix task_tick_rt() comment
  sched/wait: Fix build breakage
  sched/wait: Introduce prepare_to_wait_event()
  sched/wait: Add ___wait_cond_timeout() to wait_event*_timeout() too
  sched: Remove get_online_cpus() usage
  sched: Fix race in migrate_swap_stop()
  ...
2013-11-12 10:20:12 +09:00
Martin Schwidefsky
106078641f s390/mm,tlb: correct tlb flush on page table upgrade
The IDTE instruction used to flush TLB entries for a specific address
space uses the address-space-control element (ASCE) to identify
affected TLB entries. The upgrade of a page table adds a new top
level page table which changes the ASCE. The TLB entries associated
with the old ASCE need to be flushed and the ASCE for the address space
needs to be replaced synchronously on all CPUs which currently use it.
The concept of a lazy ASCE update with an exception handler is broken.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-04 13:51:47 +01:00
Ingo Molnar
fb10d5b7ef Merge branch 'linus' into sched/core
Resolve cherry-picking conflicts:

Conflicts:
	mm/huge_memory.c
	mm/memory.c
	mm/mprotect.c

See this upstream merge commit for more details:

  52469b4fcd Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-01 08:24:41 +01:00
Heiko Carstens
b226635ab8 s390/percpu: remove this_cpu_xor() implementation
this_cpu_xor() will be removed tree wide during the next merge window.
To avoid merge conflicts s390's removal comes via the s390 tree.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-31 09:53:58 +01:00
Martin Schwidefsky
7ab64a85e1 s390/time: fix get_tod_clock_ext inline assembly
The get_tod_clock_ext inline assembly does not specify its output
operands correctly. This can cause incorrect code to be generated.

Cc: stable@vger.kernel.org # 3.12
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-31 09:52:48 +01:00
Sebastian Ott
cb4deb6962 s390/pci: cleanup function information block
Cleanup function information block used as a modify pci function
parameter. Change reserved members to be anonymous. Fix the size
of the struct and add proper alignment information. Also put the
FIB on the stack.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:17 +02:00
Sebastian Ott
af0b129294 s390/pci: remove CONFIG_PCI_DEBUG dependancy
Add debugfs entries regardless of CONFIG_PCI_DEBUG.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:16 +02:00
Heiko Carstens
f84cd97e5c s390/percpu: make use of interlocked-access facility 1 instructions
Optimize this_cpu_* functions for 64 bit by making use of new instructions
that came with the interlocked-access facility 1 (load-and-*) and the
general-instructions-extension facility (asi, agsi).
That way we get rid of the compare-and-swap loop in most cases.
Code size reduction (defconfig, -march=z196): 11,555 bytes.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:13 +02:00
Heiko Carstens
0702fbf572 s390/percpu: use generic percpu ops for CONFIG_32BIT
Remove the special cases for the this_cpu_* functions for 32 bit
in order to make it easier to add additional code for 64 bit.
32 bit will use the generic implementation.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:13 +02:00
Heiko Carstens
f26946d7ec s390/compat: make psw32_user_bits a constant value again
Make psw32_user_bits a constant value again.
This is a leftover of the code which allowed to run the kernel either
in primary or home space which got removed with 9a905662 "s390/uaccess:
always run the kernel in home space".

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:12 +02:00
Heiko Carstens
5ebf250dab s390: fix handling of runtime instrumentation psw bit
Fix the following bugs:
- When returning from a signal the signal handler copies the saved psw mask
  from user space and uses parts of it. Especially it restores the RI bit
  unconditionally. If however the machine doesn't support RI, or RI is
  disabled for the task, the last lpswe instruction which returns to user
  space will generate a specification exception.
  To fix this check if the RI bit is allowed to be set and kill the task
  if not.
- In the compat mode signal handler code the RI bit of the psw mask gets
  propagated to the mask of the return psw: if user space enables RI in the
  signal handler, RI will also be enabled after the signal handler is
  finished.
  This is a different behaviour than with 64 bit tasks. So change this to
  match the 64 bit semantics, which restores the original RI bit value.
- Fix similar oddities within the ptrace code as well.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:11 +02:00
Martin Schwidefsky
4725c86055 s390: fix save and restore of the floating-point-control register
The FPC_VALID_MASK has been used to check the validity of the value
to be loaded into the floating-point-control register. With the
introduction of the floating-point extension facility and the
decimal-floating-point additional bits have been defined which need
to be checked in a non straight forward way. So far these bits have
been ignored which can cause an incorrect results for decimal-
floating-point operations, e.g. an incorrect rounding mode to be
set after signal return.

The static check with the FPC_VALID_MASK is replaced with a trial
load of the floating-point-control value, see test_fp_ctl.

In addition an information leak with the padding word between the
floating-point-control word and the floating-point registers in
the s390_fp_regs is fixed.

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:11 +02:00
Michael Holzheu
f5be85a2d3 s390: Remove unused declaration of zfcpdump_prefix_array[]
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:05 +02:00
Peter Oberparleiter
8a80b10895 s390/cio: fix error-prone defines
Missing parenthesis may cause problems when using the defines
together with operations of higher precedence.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:04 +02:00
Michael Holzheu
5895294274 s390: Remove zfcpdump NR_CPUS dependency
Currently zfpcdump can only collect registers for up to CONFIG_NR_CPUS
CPUss. This dependency is not necessary. So remove it by dynamically
allocating the save area array.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:04 +02:00
Chen Gang
72b7fb5fda s390/atomic: use 'unsigned int' instead of 'unsigned long' for atomic_*_mask()
The type of 'v->counter' is always 'int', and related inline assembly
code also process 'int', so use 'unsigned int' instead of 'unsigned
long' for the 'mask'.

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:02 +02:00
Martin Schwidefsky
127c1fefff s390/mm: do not initialize storage keys
With dirty and referenced bits implemented in software it is unnecessary
to initialize the storage key for every page. With this patch not a single
storage key operation is done for a system that does not use KVM.
For KVM set_pte_at/pgste_set_key will do the initialization for the guest
view of the storage key when the mapping for the page is established in
the host.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:17:00 +02:00
Heiko Carstens
12325f0978 s390: cleanup and add sanity checks to control register macros
- turn some macros into functions
- merge two almost identical versions for 32/64 bit
- add BUILD_BUG_ON() check to make sure the passed in array is large enough

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:59 +02:00
Martin Schwidefsky
e258d719ff s390/uaccess: always run the kernel in home space
Simplify the uaccess code by removing the user_mode=home option.
The kernel will now always run in the home space mode.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:57 +02:00
Heiko Carstens
7d7c7b24e4 s390/bitops: rename find_first_bit_left() to find_first_bit_inv()
find_first_bit_left() and friends have nothing to do with the normal
LSB0 bit numbering for big endian machines used in Linux (least
significant bit has bit number 0).
Instead they use MSB0 bit numbering, where the most signficant bit has
bit number 0. So rename find_first_bit_left() and friends to
find_first_bit_inv(), to avoid any confusion.
Also provide inv versions of set_bit, clear_bit and test_bit.

This also removes the confusing use of e.g. set_bit() in airq.c which
uses a "be_to_le" bit number conversion, which could imply that instead
set_bit_le() could be used. But that is entirely wrong since the _le
bitops variant uses yet another bit numbering scheme.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:56 +02:00
Heiko Carstens
b1cb7e2b6c s390/bitops: use flogr instruction to implement __ffs, ffs, __fls, fls and fls64
Since z9 109 we have the flogr instruction which can be used to implement
optimized versions of __ffs, ffs, __fls, fls and fls64.
So implement and use them, instead of the generic variants.
This reduces the size of the kernel image (defconfig, -march=z9-109)
by 19,648 bytes.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:55 +02:00
Heiko Carstens
746479cdcb s390/bitops: use generic find bit functions / reimplement _left variant
Just like all other architectures we should use out-of-line find bit
operations, since the inline variant bloat the size of the kernel image.
And also like all other architecures we should only supply optimized
variants of the __ffs, ffs, etc. primitives.

Therefore this patch removes the inlined s390 find bit functions and uses
the generic out-of-line variants instead.

The optimization of the primitives follows with the next patch.

With this patch also the functions find_first_bit_left() and
find_next_bit_left() have been reimplemented, since logically, they are
nothing else but a find_first_bit()/find_next_bit() implementation that
use an inverted __fls() instead of __ffs().
Also the restriction that these functions only work on machines which
support the "flogr" instruction is gone now.

This reduces the size of the kernel image (defconfig, -march=z9-109)
by 144,482 bytes.
Alone the size of the function build_sched_domains() gets reduced from
7 KB to 3,5 KB.

We also git rid of unused functions like find_first_bit_le()...

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:55 +02:00
Hendrik Brueckner
f1d86b61fb s390/s390dbf: add debug_level_enabled() function
Add the debug_level_enabled() function to check if debug events for
a particular level would be logged.  This might help to save cycles
for debug events that require additional information collection.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:53 +02:00
Heiko Carstens
4ae803253e s390/bitops: optimize set_bit() for constant values
Since zEC12 we have the interlocked-access facility 2 which allows to
use the instructions ni/oi/xi to update a single byte in storage with
compare-and-swap semantics.
So change set_bit(), clear_bit() and change_bit() to generate such code
instead of a compare-and-swap loop (or using the load-and-* instruction
family), if possible.
This reduces the text segment by yet another 8KB (defconfig).

Alternatively the long displacement variants niy/oiy/xiy could have
been used, but the extended displacement field is usually not needed
and therefore would only increase the size of the text segment again.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:53 +02:00
Heiko Carstens
370b0b5f77 s390/bitops: remove CONFIG_SMP / simplify non-atomic bitops
Remove CONFIG_SMP from bitops code. This reduces the C code significantly
but also generates better code for the SMP case.

This means that for !CONFIG_SMP set_bit() and friends now also have
compare and swap semantics (read: more code). However nobody really cares
for !CONFIG_SMP and this is the trade-off to simplify the SMP code which we
do care about.

The non-atomic bitops like __set_bit() now generate also better code
because the old code did not have a __builtin_contant_p() check for the
CONFIG_SMP case and therefore always generated the inline assembly variant.
However the inline assemblies for the non-atomic case now got completely
removed since gcc can produce better code, which accesses less memory
operands.

test_bit() got also a bit simplified since it did have a
__builtin_constant_p() check, however two identical code pathes for each
case (written differently).

In result this mainly reduces the to be maintained code but is not very
relevant for code generation, since there are not many non-atomic bitops
usages that we care about.
(code reduction defconfig kernel image before/after: 560 bytes).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:52 +02:00
Heiko Carstens
9a70a42835 s390/atomic: various small cleanups
- add a typecheck to the defines to make sure they operate on an atomic_t
- simplify inline assembly constraints
- keep variable names common between functions

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:51 +02:00
Heiko Carstens
5692e4d11c s390/atomic: optimize atomic_add() for constant values
If the interlocked-access facility 1 is available we can use the asi
and agsi instructions for interlocked updates if the to be added
value is a contanst and small (in the range of -128..127).
asi and agsi do not not return the old or new value, therefore these
instructions can only be used for atomic_(add|sub|inc|dec)[64].

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:51 +02:00
Heiko Carstens
1ffa11abfe s390/kprobes: allow kprobes only on known instructions
Since we have an in-kernel disassembler we can make sure that
there won't be any kprobes set on random data.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:50 +02:00
Heiko Carstens
0f20822a69 s390/dis: move disassembler function prototypes to proper header file
Now that the in-kernel disassembler has an own header file move the
disassembler related function prototypes to that header file.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:48 +02:00
Suzuki K. Poulose
648ae35c54 s390/dis: move common definitions to a header file
The patch moves some of the definitions to a
header file. No functional changes involved.

I have retained the Copyright Statement from the
original file.

Signed-off-by: Suzuki K Poulose <suzuki@in.ibm.com>
[Heiko Carstens: rename s390-dis.h to dis.h]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:48 +02:00
Heiko Carstens
75287430b4 s390/atomic: make use of interlocked-access facility 1 instructions
Same as for bitops: make use of the interlocked-access facility 1
instructions which allow to atomically update storage locations
without a compare-and-swap loop.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:46 +02:00
Heiko Carstens
86d51bc31f s390/atomic: implement atomic_sub_return() with atomic_add_return()
Get rid of the own atomic_sub_return() implementation. Otherwise we can't
make use of the interlocked-access facility 1 instructions for
atomic_sub_return(), since there is no "load and subtract" instruction
available.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:46 +02:00
Heiko Carstens
e344e52c7c s390/bitops: make use of interlocked-access facility 1 instructions
Make use of the interlocked-access facility 1 that got added with the
z196 architecure.
This facilility added new instructions which can atomically update a
storage location without a compare-and-swap loop. E.g. setting a bit
within a "long" can be done with a single instruction.

The size of the kernel image gets ~30kb smaller. Considering that there
are appr. 1900 bitops call sites this means that each one saves about
15-16 bytes per call site which is expected.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:44 +02:00
Martin Schwidefsky
8c071b0f19 s390/time: correct use of store clock fast
The result of the store-clock-fast (STCKF) instruction is a bit fuzzy.
It can happen that the value stored on one CPU is smaller than the value
stored on another CPU, although the order of the stores is the other
way around. This can cause deltas of get_tod_clock() values to become
negative when they should not be.

We need to be more careful with store-clock-fast, this patch partially
reverts git commit e4b7b4238e666682555461fa52eecd74652f36bb "time:
always use stckf instead of stck if available". The get_tod_clock()
function now uses the store-clock-extended (STCKE) instruction.
get_tod_clock_fast() can be used if the fuzziness of store-clock-fast
is acceptable e.g. for wait loops local to a CPU.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-22 09:16:40 +02:00
Gleb Natapov
13acfd5715 Powerpc KVM work is based on a commit after rc4.
Merging master into next to satisfy the dependencies.

Conflicts:
	arch/arm/kvm/reset.c
2013-10-17 17:41:49 +03:00
Martin Schwidefsky
af0ebc40a8 s390/mm,kvm: fix software dirty bits vs. kvm for old machines
For machines without enhanced supression on protection the software
dirty bit code forces the pte dirty bit and clears the page protection
bit in pgste_set_pte. This is done for all pte types, the check for
present ptes is missing. As a result swap ptes and other not-present
ptes can get corrupted.
Add a check for the _PAGE_PRESENT bit to pgste_set_pte before modifying
the pte value.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-15 13:47:57 +02:00
Christoffer Dall
a7efdf6bec KVM: s390: Get rid of KVM_HPAGE defines
Now when the main kvm code relying on these defines has been moved to
the x86 specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:12:13 +03:00
Ingo Molnar
ec0ad3d01f Merge branch 'core/urgent' into sched/core
Merge in asm goto fix, to be able to apply the asm/rmwcc.h fix.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-11 07:39:37 +02:00
Ingo Molnar
3f0116c323 compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down
a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto'
constructs, as outlined here:

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670

Implement a workaround suggested by Jakub Jelinek.

Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Suggested-by: Jakub Jelinek <jakub@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-11 07:39:14 +02:00
Ingo Molnar
37bf06375c Linux 3.12-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQEcBAABAgAGBQJSUc9zAAoJEHm+PkMAQRiG9DMH/AtpuAF6LlMRPjrCeuJQ1pyh
 T0IUO+CsLKO6qtM5IyweP8V6zaasNjIuW1+B6IwVIl8aOrM+M7CwRiKvpey26ldM
 I8G2ron7hqSOSQqSQs20jN2yGAqQGpYIbTmpdGLAjQ350NNNvEKthbP5SZR5PAmE
 UuIx5OGEkaOyZXvCZJXU9AZkCxbihlMSt2zFVxybq2pwnGezRUYgCigE81aeyE0I
 QLwzzMVdkCxtZEpkdJMpLILAz22jN4RoVDbXRa2XC7dA9I2PEEXI9CcLzqCsx2Ii
 8eYS+no2K5N2rrpER7JFUB2B/2X8FaVDE+aJBCkfbtwaYTV9UYLq3a/sKVpo1Cs=
 =xSFJ
 -----END PGP SIGNATURE-----

Merge tag 'v3.12-rc4' into sched/core

Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree
before applying more scheduler patches.

Conflicts:
	arch/avr32/include/asm/Kbuild

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-09 12:36:13 +02:00
Heiko Carstens
efc1d23b3d s390: enable ARCH_USE_CMPXCHG_LOCKREF
Enable ARCH_USE_CMPXCHG_LOCKREF since it shows performance improvements
with Linus' simple stat() test case of up to 50% on a 30 cpu system.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-09-28 12:46:29 +02:00
Heiko Carstens
083986e824 mutex: replace CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX with simple ifdef
Linus suggested to replace

 #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
 #define arch_mutex_cpu_relax() cpu_relax()
 #endif

with just a simple

  #ifndef arch_mutex_cpu_relax
  # define arch_mutex_cpu_relax() cpu_relax()
  #endif

to get rid of CONFIG_HAVE_CPU_RELAX_SIMPLE. So architectures can
simply define arch_mutex_cpu_relax if they want an architecture
specific function instead of having to add a select statement in
their Kconfig in addition.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-09-28 12:46:21 +02:00
Peter Zijlstra
a787870924 sched, arch: Create asm/preempt.h
In order to prepare to per-arch implementations of preempt_count move
the required bits into an asm-generic header and use this for all
archs.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-25 14:07:50 +02:00
Thomas Huth
6b948a7276 KVM: s390: Remove dead "rerun vcpu" code
The need for SIE_INTERCEPT_RERUNVCPU has been removed long ago already,
with the following commit:
	f7850c9288
	[S390] remove kvm mmu reload on s390
Since the remainders are dead code, they are now removed by this patch.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:17 +02:00
Michael Holzheu
6f79d33228 s390/vmcore: use vmcore for zfcpdump
Modify the s390 copy_oldmem_page() and remap_oldmem_pfn_range() function
for zfcpdump to read from the HSA memory if memory below HSA_SIZE bytes is
requested.  Otherwise real memory is used.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Jan Willeke <willeke@de.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:59:15 -07:00
Heiko Carstens
63c40436a1 s390/kprobes: add support for pc-relative long displacement instructions
With the general-instruction extension facility (z10) a couple of
instructions with a pc-relative long displacement were introduced.  The
kprobes support for these instructions however was never implemented.

In result, if anybody ever put a probe on any of these instructions the
result would have been random behaviour after the instruction got executed
within the insn slot.

So lets add the missing handling for these instructions.  Since all of the
new instructions have 32 bit signed displacement the easiest solution is
to allocate an insn slot that is within the same 2GB area like the
original instruction and patch the displacement field.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:58:52 -07:00
Linus Torvalds
e831cbfc1a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
 "This includes one bpf/jit bug fix where the jit compiler could
  sometimes write generated code out of bounds of the allocated memory
  area.

  The rest of the patches are only cleanups and minor improvements"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/irq: reduce size of external interrupt handler hash array
  s390/compat,uid16: use current_cred()
  s390/ap_bus: use and-mask instead of a cast
  s390/ftrace: avoid pointer arithmetics with function pointers
  s390: make various functions static, add declarations to header files
  s390/compat signal: add couple of __force annotations
  s390/mm: add __releases()/__acquires() annotations to gmap_alloc_table()
  s390: keep Kconfig sorted
  s390/irq: rework irq subclass handling
  s390/irq: use hlists for external interrupt handler array
  s390/dumpstack: convert print_symbol to %pSR
  s390/perf: Remove print_hex_dump_bytes() debug output
  s390: update defconfig
  s390/bpf,jit: fix address randomization
2013-09-11 08:36:03 -07:00
Linus Torvalds
b4b50fd78b ARM: SoC platform changes for 3.12
This branch contains mostly additions and changes to platform enablement
 and SoC-level drivers. Since there's sometimes a dependency on device-tree
 changes, there's also a fair amount of those in this branch.
 
 Pieces worth mentioning are:
 
 - Mbus driver for Marvell platforms, allowing kernel configuration
   and resource allocation of on-chip peripherals.
 - Enablement of the mbus infrastructure from Marvell PCI-e drivers.
 - Preparation of MSI support for Marvell platforms.
 - Addition of new PCI-e host controller driver for Tegra platforms
 - Some churn caused by sharing of macro names between i.MX 6Q and 6DL
   platforms in the device tree sources and header files.
 - Various suspend/PM updates for Tegra, including LP1 support.
 - Versatile Express support for MCPM, part of big little support.
 - Allwinner platform support for A20 and A31 SoCs (dual and quad Cortex-A7)
 - OMAP2+ support for DRA7, a new Cortex-A15-based SoC.
 
 The code that touches other architectures are patches moving
 MSI arch-specific functions over to weak symbols and removal of
 ARCH_SUPPORTS_MSI, acked by PCI maintainers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJSKhYmAAoJEIwa5zzehBx322AP/1ONYs8o8f7/Gzq6lZvTN6T3
 0pBTApg6Jfioi3lwKvUAEIcsW82YKQ+UZkbW66GQH6+Ri4aZJKZHuz0+JPU67OJ4
 LtSLuzVWrymy2VOOUvAnS/SXkOZw/pHhU4cLNHn1dMndhUL1Uqp9/XwuiHEQyFsP
 uOkpcBtIu0EWElov0PKKZ5SWBg8JJs2vy5ydiViGelWHCrZvDDZkWzIsDcBQxJLQ
 juzT4+JE+KOu7vKmfw78o6iHoCS2TBRAN9YUCajRb8Wl+out1hrTahHnDWaZ5Mce
 EskcQNkJROqFbjD4k3ABN4XGTv2VDmrztIwFe0SEQ7Dz/9ypCrBGT69uI9xIqTXr
 GwVRIwAUFTpMupK0gy93z1ajV3N0CXV79out9+jQNUQybYE+czp8QOyhmuc1tZx0
 8fn9jlBQe9Vy6yrs39gEcE7nUwrayeyQ+6UvqqwsE2pWZabNAnCMSPX5+QIu+T/3
 tQ7+jYmfFeserp1sIDOHOnxfhtW9EI6U9d1h/DUCwrsuFdkL9ha4M/vh9Pwgye98
 tBdz0T4yE39AJQwwFWRkv1jcQKcGu6WqJanmvS4KRBksGwuLWxy+ewOnkz2ifS25
 ZYSyxAryZRBvQRqlOK11rXPfRcbGcY0MG9lkKX96rGcyWEizgE1DdjxXD8HoIleN
 R8heV6GX5OzlFLGX2tKK
 =fJ5x
 -----END PGP SIGNATURE-----

Merge tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC platform changes from Olof Johansson:
 "This branch contains mostly additions and changes to platform
  enablement and SoC-level drivers.  Since there's sometimes a
  dependency on device-tree changes, there's also a fair amount of
  those in this branch.

  Pieces worth mentioning are:

   - Mbus driver for Marvell platforms, allowing kernel configuration
     and resource allocation of on-chip peripherals.
   - Enablement of the mbus infrastructure from Marvell PCI-e drivers.
   - Preparation of MSI support for Marvell platforms.
   - Addition of new PCI-e host controller driver for Tegra platforms
   - Some churn caused by sharing of macro names between i.MX 6Q and 6DL
     platforms in the device tree sources and header files.
   - Various suspend/PM updates for Tegra, including LP1 support.
   - Versatile Express support for MCPM, part of big little support.
   - Allwinner platform support for A20 and A31 SoCs (dual and quad
     Cortex-A7)
   - OMAP2+ support for DRA7, a new Cortex-A15-based SoC.

  The code that touches other architectures are patches moving MSI
  arch-specific functions over to weak symbols and removal of
  ARCH_SUPPORTS_MSI, acked by PCI maintainers"

* tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (266 commits)
  tegra-cpuidle: provide stub when !CONFIG_CPU_IDLE
  PCI: tegra: replace devm_request_and_ioremap by devm_ioremap_resource
  ARM: tegra: Drop ARCH_SUPPORTS_MSI and sort list
  ARM: dts: vf610-twr: enable i2c0 device
  ARM: dts: i.MX51: Add one more I2C2 pinmux entry
  ARM: dts: i.MX51: Move pins configuration under "iomuxc" label
  ARM: dtsi: imx6qdl-sabresd: Add USB OTG vbus pin to pinctrl_hog
  ARM: dtsi: imx6qdl-sabresd: Add USB host 1 VBUS regulator
  ARM: dts: imx27-phytec-phycore-som: Enable AUDMUX
  ARM: dts: i.MX27: Disable AUDMUX in the template
  ARM: dts: wandboard: Add support for SDIO bcm4329
  ARM: i.MX5 clocks: Remove optional clock setup (CKIH1) from i.MX51 template
  ARM: dts: imx53-qsb: Make USBH1 functional
  ARM i.MX6Q: dts: Enable I2C1 with EEPROM and PMIC on Phytec phyFLEX-i.MX6 Ouad module
  ARM i.MX6Q: dts: Enable SPI NOR flash on Phytec phyFLEX-i.MX6 Ouad module
  ARM: dts: imx6qdl-sabresd: Add touchscreen support
  ARM: imx: add ocram clock for imx53
  ARM: dts: imx: ocram size is different between imx6q and imx6dl
  ARM: dts: imx27-phytec-phycore-som: Fix regulator settings
  ARM: dts: i.MX27: Remove clock name from CPU node
  ...
2013-09-06 13:30:06 -07:00
Linus Torvalds
ae7a835cc5 Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Gleb Natapov:
 "The highlights of the release are nested EPT and pv-ticketlocks
  support (hypervisor part, guest part, which is most of the code, goes
  through tip tree).  Apart of that there are many fixes for all arches"

Fix up semantic conflicts as discussed in the pull request thread..

* 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (88 commits)
  ARM: KVM: Add newlines to panic strings
  ARM: KVM: Work around older compiler bug
  ARM: KVM: Simplify tracepoint text
  ARM: KVM: Fix kvm_set_pte assignment
  ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256
  ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs
  ARM: KVM: vgic: fix GICD_ICFGRn access
  ARM: KVM: vgic: simplify vgic_get_target_reg
  KVM: MMU: remove unused parameter
  KVM: PPC: Book3S PR: Rework kvmppc_mmu_book3s_64_xlate()
  KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls
  KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX
  KVM: x86: update masterclock when kvmclock_offset is calculated (v2)
  KVM: PPC: Book3S: Fix compile error in XICS emulation
  KVM: PPC: Book3S PR: return appropriate error when allocation fails
  arch: powerpc: kvm: add signed type cast for comparation
  KVM: x86: add comments where MMIO does not return to the emulator
  KVM: vmx: count exits to userspace during invalid guest emulation
  KVM: rename __kvm_io_bus_sort_cmp to kvm_io_bus_cmp
  kvm: optimize away THP checks in kvm_is_mmio_pfn()
  ...
2013-09-04 18:15:06 -07:00
Linus Torvalds
6832d9652f Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timers/nohz changes from Ingo Molnar:
 "It mostly contains fixes and full dynticks off-case optimizations, by
  Frederic Weisbecker"

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  nohz: Include local CPU in full dynticks global kick
  nohz: Optimize full dynticks's sched hooks with static keys
  nohz: Optimize full dynticks state checks with static keys
  nohz: Rename a few state variables
  vtime: Always debug check snapshot source _before_ updating it
  vtime: Always scale generic vtime accounting results
  vtime: Optimize full dynticks accounting off case with static keys
  vtime: Describe overriden functions in dedicated arch headers
  m68k: hardirq_count() only need preempt_mask.h
  hardirq: Split preempt count mask definitions
  context_tracking: Split low level state headers
  vtime: Fix racy cputime delta update
  vtime: Remove a few unneeded generic vtime state checks
  context_tracking: User/kernel broundary cross trace events
  context_tracking: Optimize context switch off case with static keys
  context_tracking: Optimize guest APIs off case with static key
  context_tracking: Optimize main APIs off case with static key
  context_tracking: Ground setup for static key use
  context_tracking: Remove full dynticks' hacky dependency on wide context tracking
  nohz: Only enable context tracking on full dynticks CPUs
  ...
2013-09-04 09:36:54 -07:00
Heiko Carstens
82003c3e60 s390/irq: rework irq subclass handling
Let's not add a function for every external interrupt subclass for
which we need reference counting. Just have two register/unregister
functions which have a subclass parameter:

void irq_subclass_register(enum irq_subclass subclass);
void irq_subclass_unregister(enum irq_subclass subclass);

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-09-04 17:19:13 +02:00
Sebastian Ott
57b5918c33 s390/pci: update function handle after resume from hibernate
Function handles may change while the system was in hibernation
use list pci functions and update the function handles.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-30 08:57:20 +02:00
Sebastian Ott
1d57896638 s390/pci: split lpf
List pci functions is used to query and iterate over pci functions.
This function currently has 2 users - initial device discovery and
rescan after a machine check. Instead of having a multipurpose
function pass a callback which gets called for each pci function.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-30 08:57:17 +02:00
Sebastian Ott
77e844b964 s390/hibernate: add early resume function
Some functions that do arch specific resume actions are called
directly from swsusp_asm64.S . Before we add another function call
provide a generic s390_early_resume function which can be used
for this purpose.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-30 08:57:15 +02:00
Sebastian Ott
67f43f38ee s390/pci/hotplug: convert to be builtin only
Convert s390' pci hotplug to be builtin only, with no module option.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-30 08:57:07 +02:00
Martin Schwidefsky
0944fe3f4a s390/mm: implement software referenced bits
The last remaining use for the storage key of the s390 architecture
is reference counting. The alternative is to make page table entries
invalid while they are old. On access the fault handler marks the
pte/pmd as young which makes the pte/pmd valid if the access rights
allow read access. The pte/pmd invalidations required for software
managed reference bits cost a bit of performance, on the other hand
the RRBE/RRBM instructions to read and reset the referenced bits are
quite expensive as well.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-29 13:20:11 +02:00
Martin Schwidefsky
0196c642f7 s390/pgtable: fix mprotect for single-threaded KVM guests
For a single-threaded KVM guest ptep_modify_prot_start will not use
IPTE, the invalid bit will therefore not be set. If DEBUG_VM is set
pgste_set_key called by ptep_modify_prot_commit will complain about
the missing invalid bit. ptep_modify_prot_start should set the
invalid bit in all cases.

Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-28 09:19:28 +02:00
Heiko Carstens
bee5c2863e s390/switch_to: fix save_access_regs() / restore_access_regs()
Fix broken contraints for both save_access_regs() and restore_access_regs().
The constraints are incorrect since they tell the compiler that the inline
assemblies only access the first element of an array of 16 elements.
Therefore the compiler could generate incorrect code.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:11 +02:00
Heiko Carstens
02aff3aa17 s390/bitops: fix inline assembly constraints
Fix inline assembly contraints for non atomic bitops functions.

This is broken since 2.6.34 987bcdac "[S390] use inline assembly
contraints available with gcc 3.3.3".

Reported-by: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Reported-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:10 +02:00
Martin Schwidefsky
5c474a1e22 s390/mm: introduce ptep_flush_lazy helper
Isolate the logic of IDTE vs. IPTE flushing of ptes in two functions,
ptep_flush_lazy and __tlb_flush_mm_lazy.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:09 +02:00
Martin Schwidefsky
d56c893d36 s390/pgtable: add pgste_get helper
ptep_modify_prot_start uses the pgste_set helper to store the pgste,
while ptep_modify_prot_commit uses its own pointer magic to retrieve
the value again. Add the pgste_get help function to keep things
symmetrical and improve readability.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:07 +02:00
Martin Schwidefsky
a055f66a3a s390/pgtable: skip pgste updates on full flush
On process exit there is no more need for the pgste information.
Skip the expensive storage key operations which should speed up
termination of KVM processes.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:06 +02:00
Martin Schwidefsky
e509861105 s390/mm: cleanup page table definitions
Improve the encoding of the different pte types and the naming of the
page, segment table and region table bits. Due to the different pte
encoding the hugetlbfs primitives need to be adapted as well. To improve
compatability with common code make the huge ptes use the encoding of
normal ptes. The conversion between the pte and pmd encoding for a huge
pte is done with set_huge_pte_at and huge_ptep_get.
Overall the code is now easier to understand.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:06 +02:00
Heiko Carstens
416fd0ffb1 s390/mm: remove dead pfmf inline assembly
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:05 +02:00
Martin Schwidefsky
1f44a22577 s390: convert interrupt handling to use generic hardirq
With the introduction of PCI it became apparent that s390 should
convert to generic hardirqs as too many drivers do not have the
correct dependency for GENERIC_HARDIRQS. On the architecture
level s390 does not have irq lines. It has external interrupts,
I/O interrupts and adapter interrupts. This patch hard-codes all
external interrupts as irq #1, all I/O interrupts as irq #2 and
all adapter interrupts as irq #3. The additional information from
the lowcore associated with the interrupt is stored in the
pt_regs of the interrupt frame, where the interrupt handler can
pick it up. For PCI/MSI interrupts the adapter interrupt handler
scans the relevant bit fields and calls generic_handle_irq with
the virtual irq number for the MSI interrupt.

Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:04 +02:00
Martin Schwidefsky
5d0d8f4353 s390/pci: use adapter interrupt vector helpers
Make use of the adapter interrupt helpers in the PCI code. This is
the first step to convert the MSI interrupt code to PCI domains.
The patch removes the limitation of 64 adapter interrupts per
PCI function.

Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:03 +02:00
Martin Schwidefsky
9389339f28 s390/pci: cleanup function names
Rename s390pci_xyz to zpci_xxz and set_irq_ctrl to zpci_set_irq_ctrl.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:03 +02:00
Martin Schwidefsky
a9a6f0341d s390/airq: introduce adapter interrupt vector helper
The PCI code is the first user of adapter interrupts vectors.
Add a set of helpers to airq.c to separate the adatper interrupt
code from the PCI bits. The helpers allow for adapter interrupt
vectors of any size.

Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:02 +02:00
Kevin Hilman
bfa664f21b ARM: tegra: core SoC enhancements for 3.12
This branch includes a number of enhancements to core SoC support for
 Tegra devices. The major new features are:
 
 * Adds a new CPU-power-gated cpuidle state for Tegra114.
 * Adds initial system suspend support for Tegra114, initially supporting
   just CPU-power-gating during suspend.
 * Adds "LP1" suspend mode support for all of Tegra20/30/114. This mode
   both gates CPU power, and places the DRAM into self-refresh mode.
 * A new DT-driven PCIe driver to Tegra20/30. The driver is also moved
   from arch/arm/mach-tegra/ to drivers/pci/host/.
 
 The PCIe driver work depends on the following tag from Thomas Petazzoni:
 git://git.infradead.org/linux-mvebu.git mis-3.12.2
 ... which is merged into the middle of this pull request.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJSDlwwAAoJEMzrak5tbycxR68QAJZ/Izc9Izj0JH8hmCEvMNfi
 ub1DQfWAy3oXk0ttkk+BMvuyD8JTvBr8LSK8GqjZs//rFGlW81A4NHTvCwoKZjKe
 hgrRgI2B1wj3Um1sp8le9D0klKrTcfmpXrOxH8ALgz0BIpMge8AGZHkV0SrfQa1z
 bKiISFVAw12WJCVrQ2nbzpZGU51lbyJ/+RghttM1a8LuS2P03CZgt2kqiytk3UVK
 uiGEy3sCkjXLFO3EsUvM6ha623S6BumCAYjNfgDowTVKaoEe1r2TD4bFeU6lGcXJ
 mlVTv0Kywazf4Q2gKzkbDz8UQMArW4hok2iILHzz+sf/Rn0hie5XVqhFlbBlcae8
 vyWsHmqvmE9BJAK2G2RLs9cJCTzEpEyAjUWfE3sIIa3ztSguT5+PHndDLR/d76aS
 j8L3FYReICZ1NuNw1JSQPFs9g2EWJbNRiy+8o9O2elsJMpLDBj/FcV6TVpudbBTI
 z7hvN+XSVYUaCVD4e8ma9YoC3VGseiAZvd+Y8hPd2MFBECVPNpy2bOacieU6Bgxh
 zjSBXZ/URxN3rTkv9+F3BLWAOfVmJYN0rKV9YfM/rqpWjc9iQx30m1fRZDnXWhvd
 ps8eFIYsKqc6v9AAugl/RexFy4Laav9eREjb0k2LA8ClLhK/qLLuiisVmKWS/grh
 lX9tzPEG2nZcjxSYaEjz
 =ve9i
 -----END PGP SIGNATURE-----

Merge tag 'tegra-for-3.12-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra into next/soc

From: Stephen Warren:
ARM: tegra: core SoC enhancements for 3.12

This branch includes a number of enhancements to core SoC support for
Tegra devices. The major new features are:

* Adds a new CPU-power-gated cpuidle state for Tegra114.
* Adds initial system suspend support for Tegra114, initially supporting
  just CPU-power-gating during suspend.
* Adds "LP1" suspend mode support for all of Tegra20/30/114. This mode
  both gates CPU power, and places the DRAM into self-refresh mode.
* A new DT-driven PCIe driver to Tegra20/30. The driver is also moved
  from arch/arm/mach-tegra/ to drivers/pci/host/.

The PCIe driver work depends on the following tag from Thomas Petazzoni:
git://git.infradead.org/linux-mvebu.git mis-3.12.2
... which is merged into the middle of this pull request.

* tag 'tegra-for-3.12-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/swarren/linux-tegra: (33 commits)
  ARM: tegra: disable LP2 cpuidle state if PCIe is enabled
  MAINTAINERS: Add myself as Tegra PCIe maintainer
  PCI: tegra: set up PADS_REFCLK_CFG1
  PCI: tegra: Add Tegra 30 PCIe support
  PCI: tegra: Move PCIe driver to drivers/pci/host
  PCI: msi: add default MSI operations for !HAVE_GENERIC_HARDIRQS platforms
  ARM: tegra: add LP1 suspend support for Tegra114
  ARM: tegra: add LP1 suspend support for Tegra20
  ARM: tegra: add LP1 suspend support for Tegra30
  ARM: tegra: add common LP1 suspend support
  clk: tegra114: add LP1 suspend/resume support
  ARM: tegra: config the polarity of the request of sys clock
  ARM: tegra: add common resume handling code for LP1 resuming
  ARM: pci: add ->add_bus() and ->remove_bus() hooks to hw_pci
  of: pci: add registry of MSI chips
  PCI: Introduce new MSI chip infrastructure
  PCI: remove ARCH_SUPPORTS_MSI kconfig option
  PCI: use weak functions for MSI arch-specific functions
  ARM: tegra: unify Tegra's Kconfig a bit more
  ARM: tegra: remove the limitation that Tegra114 can't support suspend
  ...

Signed-off-by: Kevin Hilman <khilman@linaro.org>
2013-08-21 10:17:18 -07:00
Guenter Roeck
215b28a530 s390: Fix broken build
Fix this build error:

  In file included from fs/exec.c:61:0:
  arch/s390/include/asm/tlb.h:35:23: error: expected identifier or '(' before 'unsigned'
  arch/s390/include/asm/tlb.h:36:1: warning: no semicolon at end of struct or union [enabled by default]
  arch/s390/include/asm/tlb.h: In function 'tlb_gather_mmu':
  arch/s390/include/asm/tlb.h:57:5: error: 'struct mmu_gather' has no member named 'end'

Broken due to commit 2b047252d0 ("Fix TLB gather virtual address range
invalidation corner cases").

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
[ Oh well. We had build testing for ppc amd um, but no s390  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-16 21:16:37 -07:00
Linus Torvalds
2b047252d0 Fix TLB gather virtual address range invalidation corner cases
Ben Tebulin reported:

 "Since v3.7.2 on two independent machines a very specific Git
  repository fails in 9/10 cases on git-fsck due to an SHA1/memory
  failures.  This only occurs on a very specific repository and can be
  reproduced stably on two independent laptops.  Git mailing list ran
  out of ideas and for me this looks like some very exotic kernel issue"

and bisected the failure to the backport of commit 53a59fc67f ("mm:
limit mmu_gather batching to fix soft lockups on !CONFIG_PREEMPT").

That commit itself is not actually buggy, but what it does is to make it
much more likely to hit the partial TLB invalidation case, since it
introduces a new case in tlb_next_batch() that previously only ever
happened when running out of memory.

The real bug is that the TLB gather virtual memory range setup is subtly
buggered.  It was introduced in commit 597e1c3580 ("mm/mmu_gather:
enable tlb flush range in generic mmu_gather"), and the range handling
was already fixed at least once in commit e6c495a96c ("mm: fix the TLB
range flushed when __tlb_remove_page() runs out of slots"), but that fix
was not complete.

The problem with the TLB gather virtual address range is that it isn't
set up by the initial tlb_gather_mmu() initialization (which didn't get
the TLB range information), but it is set up ad-hoc later by the
functions that actually flush the TLB.  And so any such case that forgot
to update the TLB range entries would potentially miss TLB invalidates.

Rather than try to figure out exactly which particular ad-hoc range
setup was missing (I personally suspect it's the hugetlb case in
zap_huge_pmd(), which didn't have the same logic as zap_pte_range()
did), this patch just gets rid of the problem at the source: make the
TLB range information available to tlb_gather_mmu(), and initialize it
when initializing all the other tlb gather fields.

This makes the patch larger, but conceptually much simpler.  And the end
result is much more understandable; even if you want to play games with
partial ranges when invalidating the TLB contents in chunks, now the
range information is always there, and anybody who doesn't want to
bother with it won't introduce subtle bugs.

Ben verified that this fixes his problem.

Reported-bisected-and-tested-by: Ben Tebulin <tebulin@googlemail.com>
Build-testing-by: Stephen Rothwell <sfr@canb.auug.org.au>
Build-testing-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-16 08:52:46 -07:00
Ingo Molnar
6f1d657668 Merge branch 'timers/nohz-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz
Pull nohz improvements from Frederic Weisbecker:

 " It mostly contains fixes and full dynticks off-case optimizations. I believe that
   distros want to enable this feature so it seems important to optimize the case
   where the "nohz_full=" parameter is empty. ie: I'm trying to remove any performance
   regression that comes with NO_HZ_FULL=y when the feature is not used.

   This patchset improves the current situation a lot (off-case appears to be around 11% faster
   with hackbench, although I guess it may vary depending on the configuration but it should be
   significantly faster in any case) now there is still some work to do: I can still observe a
   remaining loss of 1.6% throughput seen with hackbench compared to CONFIG_NO_HZ_FULL=n. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-14 17:58:56 +02:00
Frederic Weisbecker
a5725ac23b vtime: Describe overriden functions in dedicated arch headers
If the arch overrides some generic vtime APIs, let it describe
these on a dedicated and standalone header. This way it becomes
convenient to include it in vtime generic headers without irrelevant
stuff in such a low level header.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-08-14 17:14:53 +02:00
Thomas Petazzoni
4287d824f2 PCI: use weak functions for MSI arch-specific functions
Until now, the MSI architecture-specific functions could be overloaded
using a fairly complex set of #define and compile-time
conditionals. In order to prepare for the introduction of the msi_chip
infrastructure, it is desirable to switch all those functions to use
the 'weak' mechanism. This commit converts all the architectures that
were overidding those MSI functions to use the new strategy.

Note that we keep two separate, non-weak, functions
default_teardown_msi_irqs() and default_restore_msi_irqs() for the
default behavior of the arch_teardown_msi_irqs() and
arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI
code.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Daniel Price <daniel.price@gmail.com>
Tested-by: Thierry Reding <thierry.reding@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12 15:26:39 +00:00
Dominik Dingel
bf640876e2 KVM: s390: Make KVM_HVA_ERR_BAD usable on s390
Current common code uses PAGE_OFFSET to indicate a bad host virtual address.
As this check won't work on architectures that don't map kernel and user memory
into the same address space (e.g. s390), such architectures can now provide
their own KVM_HVA_ERR_BAD defines.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-29 09:03:53 +02:00
Martin Schwidefsky
ee6ee55bb5 KVM: s390: fix task size check
The gmap_map_segment function uses PGDIR_SIZE in the check for the
maximum address in the tasks address space. This incorrectly limits
the amount of memory usable for a kvm guest to 4TB. The correct limit
is (1UL << 53). As the TASK_SIZE has different values (4TB vs 8PB)
dependent on the existance of the fourth page table level, create
a new define 'TASK_MAX_SIZE' for (1UL << 53).

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-29 09:03:19 +02:00
Martin Schwidefsky
3eabaee998 KVM: s390: allow sie enablement for multi-threaded programs
Improve the code to upgrade the standard 2K page tables to 4K page tables
with PGSTEs to allow the operation to happen when the program is already
multi-threaded.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-29 09:03:09 +02:00
Martin Schwidefsky
3b0040a47a s390/bitops: fix find_next_bit_left
The find_next_bit_left function is broken if used with an offset which
is not a multiple of 64. The shift to mask the bits of a 64-bit word
not to search is in the wrong direction, the result can be either a
bit found smaller than the offset or failure to find a set bit.

Cc: <stable@vger.kernel.org> # v3.8+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-07-26 13:25:21 +02:00
Michael Mueller
64597f9dae s390/ptrace: PTRACE_TE_ABORT_RAND
The patch implements a s390 specific ptrace request
PTRACE_TE_ABORT_RAND to modify the randomness of spontaneous
aborts of memory transactions of the transaction execution
facility. The data argument of the ptrace request is used to
specify the levels of randomness, 0 for normal operation, 1 to
abort every transaction at a random instruction, and 2 to abort
a random transaction at a random instruction. The default is 0
for normal operation.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-07-16 12:21:56 +02:00
Linus Torvalds
65b97fb730 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Ben Herrenschmidt:
 "This is the powerpc changes for the 3.11 merge window.  In addition to
  the usual bug fixes and small updates, the main highlights are:

   - Support for transparent huge pages by Aneesh Kumar for 64-bit
     server processors.  This allows the use of 16M pages as transparent
     huge pages on kernels compiled with a 64K base page size.

   - Base VFIO support for KVM on power by Alexey Kardashevskiy

   - Wiring up of our nvram to the pstore infrastructure, including
     putting compressed oopses in there by Aruna Balakrishnaiah

   - Move, rework and improve our "EEH" (basically PCI error handling
     and recovery) infrastructure.  It is no longer specific to pseries
     but is now usable by the new "powernv" platform as well (no
     hypervisor) by Gavin Shan.

   - I fixed some bugs in our math-emu instruction decoding and made it
     usable to emulate some optional FP instructions on processors with
     hard FP that lack them (such as fsqrt on Freescale embedded
     processors).

   - Support for Power8 "Event Based Branch" facility by Michael
     Ellerman.  This facility allows what is basically "userspace
     interrupts" for performance monitor events.

   - A bunch of Transactional Memory vs.  Signals bug fixes and HW
     breakpoint/watchpoint fixes by Michael Neuling.

  And more ...  I appologize in advance if I've failed to highlight
  something that somebody deemed worth it."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits)
  pstore: Add hsize argument in write_buf call of pstore_ftrace_call
  powerpc/fsl: add MPIC timer wakeup support
  powerpc/mpic: create mpic subsystem object
  powerpc/mpic: add global timer support
  powerpc/mpic: add irq_set_wake support
  powerpc/85xx: enable coreint for all the 64bit boards
  powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx
  powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig
  powerpc/mpic: Add get_version API both for internal and external use
  powerpc: Handle both new style and old style reserve maps
  powerpc/hw_brk: Fix off by one error when validating DAWR region end
  powerpc/pseries: Support compression of oops text via pstore
  powerpc/pseries: Re-organise the oops compression code
  pstore: Pass header size in the pstore write callback
  powerpc/powernv: Fix iommu initialization again
  powerpc/pseries: Inform the hypervisor we are using EBB regs
  powerpc/perf: Add power8 EBB support
  powerpc/perf: Core EBB support for 64-bit book3s
  powerpc/perf: Drop MMCRA from thread_struct
  powerpc/perf: Don't enable if we have zero events
  ...
2013-07-04 10:29:23 -07:00
Linus Torvalds
fe489bf450 KVM fixes for 3.11
On the x86 side, there are some optimizations and documentation updates.
 The big ARM/KVM change for 3.11, support for AArch64, will come through
 Catalin Marinas's tree.  s390 and PPC have misc cleanups and bugfixes.
 
 There is a conflict due to "s390/pgtable: fix ipte notify bit" having
 entered 3.10 through Martin Schwidefsky's s390 tree.  This pull request
 has additional changes on top, so this tree's version is the correct one.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR0oU6AAoJEBvWZb6bTYbynnsP/RSUrrHrA8Wu1tqVfAKu+1y5
 6OIihqZ9x11/YMaNofAfv86jqxFu0/j7CzMGphNdjzujqKI+Q1tGe7oiVCmKzoG+
 UvSctWsz0lpllgBtnnrm5tcfmG6rrddhLtpA7m320+xCVx8KV5P4VfyHZEU+Ho8h
 ziPmb2mAQ65gBNX6nLHEJ3ITTgad6gt4NNbrKIYpyXuWZQJypzaRqT/vpc4md+Ed
 dCebMXsL1xgyb98EcnOdrWH1wV30MfucR7IpObOhXnnMKeeltqAQPvaOlKzZh4dK
 +QfxJfdRZVS0cepcxzx1Q2X3dgjoKQsHq1nlIyz3qu1vhtfaqBlixLZk0SguZ/R9
 1S1YqucZiLRO57RD4q0Ak5oxwobu18ZoqJZ6nledNdWwDe8bz/W2wGAeVty19ky0
 qstBdM9jnwXrc0qrVgZp3+s5dsx3NAm/KKZBoq4sXiDLd/yBzdEdWIVkIrU3X9wU
 3X26wOmBxtsB7so/JR7ciTsQHelmLicnVeXohAEP9CjIJffB81xVXnXs0P0SYuiQ
 RzbSCwjPzET4JBOaHWT0Dhv0DTS/EaI97KzlN32US3Bn3WiLlS1oDCoPFoaLqd2K
 LxQMsXS8anAWxFvexfSuUpbJGPnKSidSQoQmJeMGBa9QhmZCht3IL16/Fb641ToN
 xBohzi49L9FDbpOnTYfz
 =1zpG
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "On the x86 side, there are some optimizations and documentation
  updates.  The big ARM/KVM change for 3.11, support for AArch64, will
  come through Catalin Marinas's tree.  s390 and PPC have misc cleanups
  and bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (87 commits)
  KVM: PPC: Ignore PIR writes
  KVM: PPC: Book3S PR: Invalidate SLB entries properly
  KVM: PPC: Book3S PR: Allow guest to use 1TB segments
  KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match
  KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry
  KVM: PPC: Book3S PR: Fix proto-VSID calculations
  KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL
  KVM: Fix RTC interrupt coalescing tracking
  kvm: Add a tracepoint write_tsc_offset
  KVM: MMU: Inform users of mmio generation wraparound
  KVM: MMU: document fast invalidate all mmio sptes
  KVM: MMU: document fast invalidate all pages
  KVM: MMU: document fast page fault
  KVM: MMU: document mmio page fault
  KVM: MMU: document write_flooding_count
  KVM: MMU: document clear_spte_count
  KVM: MMU: drop kvm_mmu_zap_mmio_sptes
  KVM: MMU: init kvm generation close to mmio wrap-around value
  KVM: MMU: add tracepoint for check_mmio_spte
  KVM: MMU: fast invalidate all mmio sptes
  ...
2013-07-03 13:21:40 -07:00
Linus Torvalds
c1101cbc7d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "This is the bulk of the s390 patches for the 3.11 merge window.

  Notable enhancements are: the block timeout patches for dasd from
  Hannes, and more work on the PCI support front.  In addition some
  cleanup and the usual bug fixing."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
  s390/dasd: Fail all requests when DASD_FLAG_ABORTIO is set
  s390/dasd: Add 'timeout' attribute
  block: check for timeout function in blk_rq_timed_out()
  block/dasd: detailed I/O errors
  s390/dasd: Reduce amount of messages for specific errors
  s390/dasd: Implement block timeout handling
  s390/dasd: process all requests in the device tasklet
  s390/dasd: make number of retries configurable
  s390/dasd: Clarify comment
  s390/hwsampler: Updated misleading member names in hws_data_entry
  s390/appldata_net_sum: do not use static data
  s390/appldata_mem: do not use static data
  s390/vmwatchdog: do not use static data
  s390/airq: simplify adapter interrupt code
  s390/pci: remove per device debug attribute
  s390/dma: remove gratuitous brackets
  s390/facility: decompose test_facility()
  s390/sclp: remove duplicated include from sclp_ctl.c
  s390/irq: store interrupt information in pt_regs
  s390/drivers: Cocci spatch "ptr_ret.spatch"
  ...
2013-07-03 11:08:24 -07:00
Linus Torvalds
63580e51bb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS patches (part 1) from Al Viro:
 "The major change in this pile is ->readdir() replacement with
  ->iterate(), dealing with ->f_pos races in ->readdir() instances for
  good.

  There's a lot more, but I'd prefer to split the pull request into
  several stages and this is the first obvious cutoff point."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (67 commits)
  [readdir] constify ->actor
  [readdir] ->readdir() is gone
  [readdir] convert ecryptfs
  [readdir] convert coda
  [readdir] convert ocfs2
  [readdir] convert fatfs
  [readdir] convert xfs
  [readdir] convert btrfs
  [readdir] convert hostfs
  [readdir] convert afs
  [readdir] convert ncpfs
  [readdir] convert hfsplus
  [readdir] convert hfs
  [readdir] convert befs
  [readdir] convert cifs
  [readdir] convert freevxfs
  [readdir] convert fuse
  [readdir] convert hpfs
  reiserfs: switch reiserfs_readdir_dentry to inode
  reiserfs: is_privroot_deh() needs only directory inode, actually
  ...
2013-07-02 09:28:37 -07:00
Benjamin Herrenschmidt
24a72acac1 Linux 3.10
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL
 Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO
 Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9
 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS
 bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10
 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ=
 =Ob6Z
 -----END PGP SIGNATURE-----

Merge tag 'v3.10' into next

Merge 3.10 in order to get some of the last minute powerpc
changes, resolve conflicts and add additional fixes on top
of them.
2013-07-01 17:57:25 +10:00
Al Viro
40d158e618 consolidate io_remap_pfn_range definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:46:35 +04:00
Martin Schwidefsky
f4eae94f71 s390/airq: simplify adapter interrupt code
There are three users of adapter interrupts: AP, QDIO and PCI. Each
registers a single adapter interrupt with independent ISCs. Define
a "struct airq" with the interrupt handler, a pointer and a mask for
the local summary indicator and the ISC for the adapter interrupt
source. Convert the indicator array with its fixed number of adapter
interrupt sources per ISE to an array of hlists. This removes the
limitation to 32 adapter interrupts per ISC and allows for arbitrary
memory locations for the local summary indicator.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-26 21:10:28 +02:00
Martin Schwidefsky
386aa051fb s390/pci: remove per device debug attribute
The per-pci-device 'debug' attribute is ill defined. For each device
it prints the same information, the adapter interrupt bit vector for
irq numbers 0 & 1, the start of the global interrupt summary vector
and the global irq retries counter. Just remove the attribute and
the associated code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-26 21:10:27 +02:00
Sebastian Ott
a9a5250cc6 s390/dma: remove gratuitous brackets
Remove gratuitous brackets in dma_mapping_error.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-26 21:10:26 +02:00
Michael Mueller
32089246e3 s390/facility: decompose test_facility()
The patch decomposes the function test_facility() into its API
test_facility() and its implementation __test_facility(). This
allows to reuse the implementation with a different API.

Patch is used to prepare checkin of SIE satellite code.

Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-26 21:10:25 +02:00
Martin Schwidefsky
48f6b00c6e s390/irq: store interrupt information in pt_regs
Copy the interrupt parameters from the lowcore to the pt_regs structure
in entry[64].S and reduce the arguments of the low level interrupt handler
to the pt_regs pointer only. In addition move the test-pending-interrupt
loop from do_IRQ to entry[64].S to make sure that interrupt information
is always delivered via pt_regs.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-26 21:10:23 +02:00
Sebastian Ott
4bee2a5dce s390/pci: cleanup hotplug code
Provide wrappers for the [de]configure operations, add some error
handling, and use pci_scan_slot instead of pci_scan_single_device.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-26 21:10:07 +02:00
Christian Borntraeger
24d5dd0208 s390/kvm: Provide function for setting the guest storage key
From time to time we need to set the guest storage key. Lets
provide a helper function that handles the changes with all the
right locking and checking.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-26 21:10:04 +02:00
Sebastian Ott
92820a5f99 s390: remove virt_to_phys implementation
virt_to_phys on s390 currently uses the LRA instruction to translate
virtual to physical addresses. This creates an unnecessary overhead
and caused trouble with dma debugging code (when called with an
address pointing to a already unmapped page).
Just get rid of s390's implementation and use the one from
asm-generic/io.h .

Note: with this change virt_to_phys will no longer work on vmalloc'ed
addresses.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-26 21:10:02 +02:00
Thomas Huth
208dd7567d KVM: s390: Renamed PGM_PRIVILEGED_OPERATION
Renamed the PGM_PRIVILEGED_OPERATION define to PGM_PRIVILEGED_OP since this
define was way longer than the other PGM_* defines and caused the code often
to exceed the 80 columns limit when not split to multiple lines.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 23:31:04 +02:00
Aneesh Kumar K.V
6b0b50b061 mm/THP: add pmd args to pgtable deposit and withdraw APIs
This will be later used by powerpc THP support.  In powerpc we want to use
pgtable for storing the hash index values.  So instead of adding them to
mm_context list, we would like to store them in the second half of pmd

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 16:55:07 +10:00
Sebastian Ott
4026099a31 s390/dma: support debug_dma_mapping_error
Without this patch drivers will get blamed (CONFIG_DMA_API_DEBUG=y)
for not calling dma_mapping_error (even if they do).

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-19 09:25:42 +02:00
Sebastian Ott
73e5a84842 s390/dma: fix mapping_error detection
The map_page implementation of s390 returns DMA_ERROR_CODE in an error
situation. Correctly test if a mapping was erroneous (DMA_ERROR_CODE is
defined as ~0).

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-19 09:25:42 +02:00
Heinz Graalfs
b764bb1c50 KVM: s390,perf: Detect if perf samples belong to KVM host or guest
This patch is based on an original patch of David Hildenbrand.

The perf core implementation calls architecture specific code in order
to ask for specific information for a particular sample:

perf_instruction_pointer()
When perf core code asks for the instruction pointer, architecture
specific code must detect if a KVM guest was running when the sample
was taken. A sample can be associated with a  KVM guest when the PSW
supervisor state bit is set and the PSW instruction pointer part
contains the address of 'sie_exit'.
A KVM guest's instruction pointer information is then retrieved via
gpsw entry pointed to by the sie control-block.

perf_misc_flags()
perf code code calls this function in order to associate the kernel
vs. user state infomation with a particular sample. Architecture
specific code must also first detectif a KVM guest was running
at the time the sample was taken.

Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-17 17:10:23 +02:00
Christian Borntraeger
d0321a24bf KVM: s390: Use common waitqueue
Lets use the common waitqueue for kvm cpus on s390. By itself it is
just a cleanup, but it should also improve the accuracy of diag 0x44
which is implemented via kvm_vcpu_on_spin. kvm_vcpu_on_spin has
an explicit check for waiting on the waitqueue to optimize the
yielding.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-17 17:09:17 +02:00
Christian Borntraeger
69d0d3a316 KVM: s390: guest large pages
This patch enables kvm to give large pages to the guest. The heavy
lifting is done by the hardware, the host only has to take care
of the PFMF instruction, which is also part of EDAT-1.

We also support the non-quiescing key setting facility if the host
supports it, to behave similar to the interpretation of sske.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-17 17:05:07 +02:00
Christian Borntraeger
db70ccdfb9 KVM: s390: Provide function for setting the guest storage key
From time to time we need to set the guest storage key. Lets
provide a helper function that handles the changes with all the
right locking and checking.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-17 17:04:30 +02:00
Christian Borntraeger
a8f6e7f795 s390/pgtable: make pgste lock an explicit barrier
Getting and Releasing the pgste lock has lock semantics. Make the
code an explicit barrier.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-05 17:36:21 +02:00
Christian Borntraeger
3a82603be4 s390/pgtable: Save pgste during modify_prot_start/commit
In modify_prot_start we update the pgste value but never store it back
into the original location. Lets save the calculated result, since
modify_prot_commit will use the value of the pgste.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-05 17:36:20 +02:00
Christian Borntraeger
338679f7ba s390/pgtable: Fix guest overindication for change bit
When doing the transition invalid->valid in the host page table for
a guest, then the guest view of C/R is in the pgste. After validation
the view is pgste OR real key. We must zero out the real key C/R to
avoid guest over-indication for change (and reference).

Touching the real key is ok also for the host: The change bit is
tracked via write protection and the reference bit is also ok
because set_pte_at was called and  the page will be touched anyway
soon. Furthermore architecture defines reference as "substantially
accurate", over- and underindication are ok.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-06-05 17:36:19 +02:00
Christian Borntraeger
b56433cb78 s390/pgtable: Fix check for pgste/storage key handling
pte_present might return true on PAGE_TYPE_NONE, even if
the invalid bit is on. Modify the existing check of the
pgste functions to avoid crashes.

[ Martin Schwidefsky: added ptep_modify_prot_[start|commit] bits ]

Reported-by: Martin Schwidefky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-28 15:18:05 +02:00
Michael Holzheu
576ebd7492 kernel: Fix s390 absolute memory access for /dev/mem
On s390 the prefix page and absolute zero pages are not correctly
returned when reading /dev/mem. The reason is that the s390 asm/io.h
file includes the asm-generic/io.h file which then defines
xlate_dev_mem_ptr() and therefore overwrites the s390 specific
version that does the correct swap operation for prefix and absolute
zero pages. The problem is a regression that was introduced with git
commit cd248341 (s390/pci: base support).

To fix the problem add "#ifndef xlate_dev_mem_ptr" in asm-generic/io.h
and "#define xlate_dev_mem_ptr" in asm/io.h. This ensures that the
s390 version is used. For completeness also add the "#ifndef"
construct for xlate_dev_kmem_ptr().

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-22 09:45:57 +02:00
Sebastian Ott
abd9a0c360 s390/dma: do not call debug_dma after free
In dma_free_coherent call debug_dma_free_coherent before deallocating
the memory to avoid a possible use after free.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-22 09:45:56 +02:00
Christian Borntraeger
2c70fe4416 s390/kvm: Kick guests out of sie if prefix page host pte is touched
The guest prefix pages must be mapped writeable all the time
while SIE is running, otherwise the guest might see random
behaviour. (pinned at the pte level) Turns out that mlocking is
not enough, the page table entry (not the page) might change or
become r/o. This patch uses the gmap notifiers to kick guest
cpus out of SIE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-21 11:55:24 +03:00
Christian Borntraeger
49b99e1e0d s390/kvm: Provide a way to prevent reentering SIE
Lets provide functions to prevent KVM from reentering SIE and
to kick cpus out of SIE. We cannot use the common kvm_vcpu_kick code,
since we need to kick out guests in places that hold architecture
specific locks (e.g. pgste lock) which might be necessary on the
other cpus - so no waiting possible.

So lets provide a bit in a private field of the sie control block
that acts as a gate keeper, after we claimed we are in SIE.
Please note that we do not reuse prog0c, since we want to access
that bit without atomic ops.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-21 11:55:23 +03:00
Christian Borntraeger
95d38fd0bc s390/kvm: Mark if a cpu is in SIE
Lets track in a private bit if the sie control block is active.
We want to track this as closely as possible, so we also have to
instrument the interrupt and program check handler. Lets use the
existing HANDLE_SIE_INTERCEPT macro.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-21 11:55:21 +03:00
Martin Schwidefsky
0d0dafc1e4 s390/kvm: rename RCP_xxx defines to PGSTE_xxx
The RCP byte is a part of the PGSTE value, the existing RCP_xxx names
are inaccurate. As the defines describe bits and pieces of the PGSTE,
the names should start with PGSTE_. The KVM_UR_BIT and KVM_UC_BIT are
part of the PGSTE as well, give them better names as well.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-21 11:55:20 +03:00
Christian Borntraeger
eed3b1e55b s390/pgtable: fix ipte notify bit
Dont use the same bit as user referenced.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-05-21 11:55:11 +03:00
Christian Borntraeger
6b8224e40a s390/pgtable: fix ipte notify bit
Dont use the same bit as user referenced.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2013-05-17 11:47:05 +02:00
Heiko Carstens
aca9120977 s390/ftrace: fix mcount adjustment
Tony Jones reported that the ftrace self tests on s390 do not work:

<6>Testing dynamic ftrace ops #1: (0 0 0 0 0) FAILED!
<6>Testing tracer irqsoff:
<3>failed to start irqsoff tracer
<4>.. no entries found ..FAILED!
<6>Testing tracer wakeup:
<3>failed to start wakeup tracer
<4>.. no entries found ..FAILED!
<6>Testing tracer function_graph:
<4>Failed to init function_graph tracer, init returned -19
<4>FAILED!

This happens because we forgot to adjust the instruction pointer that gets
passed to the ftrace trace function by MCOUNT_INSN_SIZE.

In addition change MCOUNT_INSN_SIZE to the correct value on 31 bit.
It only worked so far because the to be patched instruction was identical.

Reported-by: Tony Jones <tonyj@suse.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-15 13:09:09 +02:00
Christian Borntraeger
617e164c35 s390: disable pfmf for clear page instruction
Wit the introduction of large pages Linux also used pfmf for page
clearing. The current implementation is not ideal, though:
- currently we set usage intent=0, but cleared pages are often used
directly after the clearing
- z/VM does not yet provide EDAT
- KVM does have to intercept PFMF even for resident pages

Lets just the mvcl loop in all cases until we have a well defined
pattern were pfmf is besser.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-07 14:11:55 +02:00
Heiko Carstens
996b4a7d8f s390/mem_detect: remove artificial kdump memory types
Simplify the memory detection code a bit by removing the CHUNK_OLDMEM
and CHUNK_CRASHK memory types.
They are not needed. Everything that is needed is a mechanism to
insert holes into the detected memory.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-03 14:21:15 +02:00
Martin Schwidefsky
d3383632d4 s390/mm: add pte invalidation notifier for kvm
Add a notifier for kvm to get control before a page table entry is
invalidated. The notifier is only called for ptes of an address space
with pgstes that have been explicitly marked to require notification.
Kvm will use this to get control before prefix pages of virtual CPU
are unmapped.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-03 14:21:12 +02:00
Heiko Carstens
df1bd59c5c s390/mem_detect: limit memory detection loop to "mem=" parameter
The current memory detection loop will detect all present memory of
a machine. This is true even if the user specified the "mem=" parameter
on the kernel command line.
This can be a problem since the memory detection may cause a fully
populated host page table for the guest, even for those parts of the
memory that the guest will never use afterwards.

So fix this and only detect memory up to a user supplied "mem=" limit
if specified.

Reported-by: Michael Johanssen <johanssn@de.ibm.com>
Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-02 15:50:26 +02:00
Heiko Carstens
118131a2d5 s390: get rid of odd global real_memory_size
The variable real_memory_size has odd semantics and has been used in
a broken way by e.g. the old kvm code.
Therefore get rid of it before anybody else makes use of it.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-02 15:50:25 +02:00
Linus Torvalds
08d7676083 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull compat cleanup from Al Viro:
 "Mostly about syscall wrappers this time; there will be another pile
  with patches in the same general area from various people, but I'd
  rather push those after both that and vfs.git pile are in."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  syscalls.h: slightly reduce the jungles of macros
  get rid of union semop in sys_semctl(2) arguments
  make do_mremap() static
  sparc: no need to sign-extend in sync_file_range() wrapper
  ppc compat wrappers for add_key(2) and request_key(2) are pointless
  x86: trim sys_ia32.h
  x86: sys32_kill and sys32_mprotect are pointless
  get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC
  merge compat sys_ipc instances
  consolidate compat lookup_dcookie()
  convert vmsplice to COMPAT_SYSCALL_DEFINE
  switch getrusage() to COMPAT_SYSCALL_DEFINE
  switch epoll_pwait to COMPAT_SYSCALL_DEFINE
  convert sendfile{,64} to COMPAT_SYSCALL_DEFINE
  switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE
  make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect
  make HAVE_SYSCALL_WRAPPERS unconditional
  consolidate cond_syscall and SYSCALL_ALIAS declarations
  teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long
  get rid of duplicate logics in __SC_....[1-6] definitions
2013-05-01 07:21:43 -07:00
Gerald Schaefer
106c992a5e mm/hugetlb: add more arch-defined huge_pte functions
Commit abf09bed3c ("s390/mm: implement software dirty bits")
introduced another difference in the pte layout vs.  the pmd layout on
s390, thoroughly breaking the s390 support for hugetlbfs.  This requires
replacing some more pte_xxx functions in mm/hugetlbfs.c with a
huge_pte_xxx version.

This patch introduces those huge_pte_xxx functions and their generic
implementation in asm-generic/hugetlb.h, which will now be included on
all architectures supporting hugetlbfs apart from s390.  This change
will be a no-op for those architectures.

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Hillf Danton <dhillf@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.cz>	[for !s390 parts]
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:33 -07:00
Linus Torvalds
d0b8883800 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 update from Martin Schwidefsky:
 "This is the first batch of s390 patches for the 3.10 merge window.

  Included are some performance enhancements: storage key
  initialization, zero page cache synonyms, system call micro
  optimization and the speedup patches for dasdfmt.  Sebastian managed
  to get rid of the special casing for the console device in the cio
  layer.  And the usual bunch of bug fixes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (59 commits)
  s390/pci: use pci_scan_root_bus
  s390/scm_blk: fix memleak in init function
  s390/scm_blk: allow more cluster size values
  s390/cio: fix irq statistics
  s390/memory hotplug: prevent offline of active memory increments
  s390: remove small stack config option
  s390: system call path micro optimization
  s390: lowcore stack pointer offsets
  s390/uapi: change struct statfs[64] member types to unsigned values
  s390/pci: return correct dma address for offset > PAGE_SIZE
  s390/ptrace: remove empty ifdefs
  s390/compat: remove ptrace compat definitions from uapi header file
  s390/compat: fix compile error for !COMPAT
  s390/compat: fix compat_sys_statfs() memory corruption
  s390/zcore: Fix HSA copy length for last block
  s390/mm,gmap: segment mapping race
  s390/mm,gmap: implement gmap_translate()
  s390/pci: remove disable_device implementation
  s390/pci: disable per default
  s390/pci: return error after failed pci ops
  ...
2013-04-29 08:19:39 -07:00
Heiko Carstens
581618f226 s390: remove small stack config option
We've seen repeatedly that 8KB stack size on 64 bit kernels
is not sufficient.
So simply remove the config option.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-26 09:07:08 +02:00
Martin Schwidefsky
616498813b s390: system call path micro optimization
Add a pointer to the system call table to the thread_info structure.
The TIF_31BIT bit is set or cleared by SET_PERSONALITY exactly once
for the lifetime of a process. With the pointer to the correct system
call table in thread_info the system call code in entry64.S path can
drop the check for TIF_31BIT which saves a couple of instructions.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-26 09:07:05 +02:00
Heiko Carstens
b8668fd0a7 s390/uapi: change struct statfs[64] member types to unsigned values
Kay Sievers reported that coreutils' stat tool has a problem with
s390's statfs[64] definition:

> The definition of struct statfs::f_type needs a fix. s390 is the only
> architecture in the kernel that uses an int and expects magic
> constants lager than INT_MAX to fit into.
>
> A fix is needed to make Fedora boot on s390, it currently fails to do
> so. Userspace does not want to add code to paper-over this issue.

[...]

> Even coreutils cannot handle it:
>   #define RAMFS_MAGIC  0x858458f6
>   # stat -f -c%t /
>   ffffffff858458f6
>
>   #define BTRFS_SUPER_MAGIC 0x9123683E
>   # stat -f -c%t /mnt
>   ffffffff9123683e

The bug is caused by an implicit sign extension within the stat tool:

out_uint_x (pformat, prefix_len, statfsbuf->f_type);

where the format finally will be "%lx".
A similar problem can be found in the 'tail' tool.
s390 is the only architecture which has an int type f_type member in
struct statfs[64]. Other architectures have either unsigned ints or
long values, so that the problem doesn't occur there.

Therefore change the type of the f_type member to unsigned int, so
that we get zero extension instead of sign extension when assignment to
a long value happens.

This patch changes the s390 uapi struct stafs[64] definition in the kernel
to contain only unsigned values.
This was true for 32 bit builds anyway, since we use the generic uapi
header file in that case. So lets not include conditionally the generic
uapi header file but have the s390 implementation completely independent.

Also fix the types of struct compat_stafs to match reality and move the
definition of struct compat_statfs64 to asm/compat.h since it is not part
of the api.

Reported-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-23 10:18:18 +02:00
Heiko Carstens
63dd9b44ac s390/ptrace: remove empty ifdefs
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-23 10:18:14 +02:00
Heiko Carstens
e4371f602e s390/compat: remove ptrace compat definitions from uapi header file
The compat definitions are not part of the uapi. So move them to
s390's private compat header file.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-23 10:18:12 +02:00
Heiko Carstens
0f58104c8c s390/compat: fix compile error for !COMPAT
Fix this one for !COMPAT:

compat.h: In function ‘arch_compat_alloc_user_space’:
compat.h:292:2: error: implicit declaration of function ‘is_compat_task’

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-23 10:18:10 +02:00
Heiko Carstens
a2aec0d3e2 s390/compat: fix compat_sys_statfs() memory corruption
The f_spare field within struct compat_statfs is four bytes larger
than within the native 31 bit struct statfs.
compat_sys_statfs() clears the f_spare field in user space which
means that in compat mode four bytes that are behind the user space
supplied struct compat_statfs will be corrupted (zeroed).

According to Thomas Gleixner's Linux 2.6 history tree this bug is
present since v2.5.74 87880da124 "[PATCH] s390: 31 bit compat.".
So it get's fixed shortly before its 10th anniversary. Tough luck.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-23 10:18:09 +02:00
Heiko Carstens
c5034945ce s390/mm,gmap: implement gmap_translate()
Implement gmap_translate() function which translates a guest absolute address
to a user space process address without establishing the guest page table
entries.

This is useful for kvm guest address translations where no memory access
is expected to happen soon (e.g. tprot exception handler).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-23 10:18:02 +02:00
Linus Torvalds
4f2e29031e s390: move dummy io_remap_pfn_range() to asm/pgtable.h
Commit b4cbb197c7 ("vm: add vm_iomap_memory() helper function") added
a helper function wrapper around io_remap_pfn_range(), and every other
architecture defined it in <asm/pgtable.h>.

The s390 choice of <asm/io.h> may make sense, but is not very convenient
for this case, and gratuitous differences like that cause unexpected errors like this:

   mm/memory.c: In function 'vm_iomap_memory':
   mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration]

Glory be the kbuild test robot who noticed this, bisected it, and
reported it to the guilty parties (ie me).

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-17 08:46:19 -07:00
Sebastian Ott
b170bad40d s390/pci: do not read data after failed load
If a pci load instruction fails the content of the register where the
data is stored is possibly unchanged. Fix the inline assembly wrapper
__pcilg to not return stale data. Additionally fix the callers of this
function who access uninitialized variables.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:38 +02:00
Sebastian Ott
f0bacb7fc4 s390/pci: add exception table to load/store instructions
Don't let pci_load and friends crash the kernel when called with
e.g. an invalid offset. Return -ENXIO instead.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:37 +02:00
Sebastian Ott
b2a9e87d2c s390/pci: rename instruction wrappers
Use distinct (and hopefully sane) names for the pci instruction
wrappers.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:37 +02:00
Sebastian Ott
cbcca5d070 s390/pci: uninline instruction wrappers
Uninline pci related instruction wrappers to de-bloat the code:
add/remove: 15/0 grow/shrink: 2/24 up/down: 1326/-12628 (-11302)

This is especially useful for the inlined pci read and write functions
which are used all over the kernel. Also remove the unused __stpcifc
while at it.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:37 +02:00
Sebastian Ott
cb65a669f6 s390/pci: do not modify function handles
Don't modify function handles to get a disabled handle - call
clp_disable_fh. With this change we also do no longer deconfigure
enabled functions.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:36 +02:00
Sebastian Ott
a2ab833360 s390/pci: debug device states
Use the debugfs to keep track of a pci function's status changes.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:35 +02:00
Sebastian Ott
f10ccca7a5 s390/cio: ccw_device_force_console don't use static variable
force_console is used to wake up the CCW based console device to
print a panic message in case something goes wrong in a suspend
or resume cycle. Stop using the static console_subchannel and add
a parameter to this function to specify which ccw device we have
to wake up.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:31 +02:00
Sebastian Ott
188561a462 s390/cio: wait_cons_dev don't use static variable
wait_cons_dev is used to busy wait for an interrupt on the console
ccw device. Stop using the static console_subchannel and add a
parameter to this function to specify on which ccw device/subchannel
we have to do the polling.

While at it rename the function to ccw_device_wait_idle and
move it to device.c

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:30 +02:00
Heiko Carstens
5294ee00a1 s390/bitops: get rid of __BITOPS_BARRIER()
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:30 +02:00
Akinobu Mita
01c2475f6d s390/bitops: remove unnecessary macro definitions in asm/bitops.h
Remove unused __BITOPS_ALIGN, and replace __BITOPS_WORDSIZE with
BITS_PER_LONG.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:29 +02:00
Stefan Raspl
0bcc94baca s390/dis: use explicit buf len
Pass buffer length in extra parameter.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-17 14:07:25 +02:00
Heiko Carstens
765a0cac56 s390/mm: provide emtpy check_pgt_cache() function
All architectures need to provide a check_pgt_cache() function. The s390 one
got lost somewhere.
So reintroduce it to prevent future compile errors e.g. if Thomas Gleixner's
idle loop rework patches get merged.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-02 08:53:11 +02:00
Heiko Carstens
ea81531de2 s390/uaccess: fix page table walk
When translating user space addresses to kernel addresses the follow_table()
function had two bugs:

- PROT_NONE mappings could be read accessed via the kernel mapping. That is
  e.g. putting a filename into a user page, then protecting the page with
  PROT_NONE and afterwards issuing the "open" syscall with a pointer to
  the filename would incorrectly succeed.

- when walking the page tables it used the pgd/pud/pmd/pte primitives which
  with dynamic page tables give no indication which real level of page tables
  is being walked (region2, region3, segment or page table). So in case of an
  exception the translation exception code passed to __handle_fault() is not
  necessarily correct.
  This is not really an issue since __handle_fault() doesn't evaluate the code.
  Only in case of e.g. a SIGBUS this code gets passed to user space. If user
  space can do something sane with the value is a different question though.

To fix these issues don't use any Linux primitives. Only walk the page tables
like the hardware would do it, however we leave quite some checks away since
we know that we only have full size page tables and each index is within bounds.

In theory this should fix all issues...

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-02 08:53:08 +02:00
Linus Torvalds
991657a39d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "A couple of bug fixes, the most hairy on is the flush_tlb_kernel_range
  fix.  Another case of "how could this ever have worked?"."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/kdump: Do not add standby memory for kdump
  drivers/i2c: remove !S390 dependency, add missing GENERIC_HARDIRQS dependencies
  s390/scm: process availability
  s390/scm_blk: suspend writes
  s390/scm_drv: extend notify callback
  s390/scm_blk: fix request number accounting
  s390/mm: fix flush_tlb_kernel_range()
  s390/mm: fix vmemmap size calculation
  s390: critical section cleanup vs. machine checks
2013-03-18 08:19:13 -07:00
Linus Torvalds
7c6baa304b Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc minor fixes mostly related to tracing"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  s390: Fix a header dependencies related build error
  tracing: update documentation of snapshot utility
  tracing: Do not return EINVAL in snapshot when not allocated
  tracing: Add help of snapshot feature when snapshot is empty
  ftrace: Update the kconfig for DYNAMIC_FTRACE
2013-03-11 07:54:29 -07:00
Li Zefan
cb16b91a44 s390: Fix a header dependencies related build error
Commit 877c685607
("perf: Remove include of cgroup.h from perf_event.h") caused
this build failure if PERF_EVENTS is enabled:

   In file included from arch/s390/include/asm/perf_event.h:9:0,
                    from include/linux/perf_event.h:24,
                    from kernel/events/ring_buffer.c:12:
   arch/s390/include/asm/cpu_mf.h: In function 'qctri':
   arch/s390/include/asm/cpu_mf.h:61:12: error: 'EINVAL' undeclared (first use in this function)

cpu_mf.h had an implicit errno.h dependency, which was added
indirectly via cgroups.h but not anymore. Add it explicitly.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Link: http://lkml.kernel.org/r/51385F79.7000106@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-11 10:43:35 +01:00
Sebastian Ott
aebfa669d9 s390/scm: process availability
Let the bus code process scm availability information and
notify scm device drivers about the new state.

Reviewed-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-03-07 09:52:24 +01:00
Sebastian Ott
4fa3c01964 s390/scm_blk: suspend writes
Stop writing to scm after certain error conditions such as a concurrent
firmware upgrade. Resume to normal state once scm_blk_set_available is
called (due to an scm availability notification).

Reviewed-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-03-07 09:52:23 +01:00