Lazy unmap (defer tlb flush after unmap until dma address reuse) can
greatly reduce the number of RPCIT instructions in the best case. In
reality we are often far away from the best case scenario because our
implementation suffers from the following problem:
To create dma addresses we maintain an iommu bitmap and a pointer into
that bitmap to mark the start of the next search. That pointer moves from
the start to the end of that bitmap and we issue a global tlb flush
once that pointer wraps around. To prevent address reuse before we issue
the tlb flush we even have to move the next pointer during unmaps - when
clearing a bit > next. This could lead to a situation where we only use
the rear part of that bitmap and issue more tlb flushes than expected.
To fix this we no longer clear bits during unmap but maintain a 2nd
bitmap which we use to mark addresses that can't be reused until we issue
the global tlb flush after wrap around.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Split dma_update_trans into __dma_update_trans which handles updating
the dma translation tables and __dma_purge_tlb which takes care of
purging associated entries in the dma translation lookaside buffer.
The map_sg API makes use of this split approach by calling
__dma_update_trans once per physically contiguous address range but
__dma_purge_tlb only once per dma contiguous address range.
This results in less invocations of the expensive RPCIT instruction
when using map_sg.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Our map_sg implementation mapped sg entries independently of each other.
For ease of use and possible performance improvements this patch changes
the implementation to try to map as many (likely physically non-contiguous)
sglist entries as possible into a contiguous DMA segment.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Simplify the code we use to calculate dma addresses by putting
everything related in a dma_alloc_address function. Also provide
a dma_free_address counterpart.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We calculate dma addresses using an iommu bitmap. Since commit
69eea95c ("s390/pci_dma: fix DMA table corruption with > 4 TB main memory")
we've made sure that addresses created using that bitmap are below
the maximum reported by firmware. Thus the additional check for
that address to be within range can be removed.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
These files were only including module.h for exception table
related functions. We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.
The additions of uaccess.h are to deal with implict includes like:
arch/s390/kernel/traps.c: In function 'do_report_trap':
arch/s390/kernel/traps.c:56:4: error: implicit declaration of function 'extable_fixup' [-Werror=implicit-function-declaration]
arch/s390/kernel/traps.c: In function 'illegal_op':
arch/s390/kernel/traps.c:173:3: error: implicit declaration of function 'get_user' [-Werror=implicit-function-declaration]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Export clp.h for usage by userspace.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This enables UBSAN for s390. We have to disable the null sanitizer
as s390 code does access memory via a null pointer (the prefix page).
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The combo of list_empty() check and return list_first_entry()
can be replaced with list_first_entry_or_null().
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
most unaligned accesses are reasonable efficient (no kernel emulation)
on s390, let's announce it
This also
- removes the ubsan false positives for unaligned accesses on s390 with
default config
- uses simpler arithmetic in several functions in several other areas
of the kernel like ethernet frame classification
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With git commit 0eab11c7e0
"s390/vx: allow to include vx-insn.h with .include"
and an older gcc we get errors like this:
{standard input}:6: Error: can't open asm/vx-insn.h for reading:
No such file or directory
arch/s390/kernel/fpu.c:57: Error: Unrecognized opcode: `vstm'
To solve this issue simply add the path to arch/s390/include to
all assembler runs.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Static analysis with cppcheck detected that ret is not initialized
and hence garbage is potentially being returned in the case where
prng_data->ppnows.reseed_counter <= prng_reseed_limit.
Thanks to Martin Schwidefsky for spotting a mistake in my original
fix.
Fixes: 0177db01ad ("s390/crypto: simplify return code handling")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The workqueue "appldata_wq" has been replaced with an ordered dedicated
workqueue.
WQ_MEM_RECLAIM has not been set since the workqueue is not being used on
a memory reclaim path.
The adapter->work_queue queues multiple work items viz
&adapter->scan_work, &port->rport_work, &adapter->ns_up_work,
&adapter->stat_work, adapter->work_queue, &adapter->events.work,
&port->gid_pn_work, &port->test_link_work. Hence, an ordered
dedicated workqueue has been used.
WQ_MEM_RECLAIM has been set to ensure forward progress under memory
pressure.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The double while loops of the CTR mode encryption / decryption functions
are overly complex for little gain. Simplify the functions to a single
while loop at the cost of an additional memcpy of a few bytes for every
4K page worth of data.
Adapt the other crypto functions to make them all look alike.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The CPACF code makes some assumptions about the availablity of hardware
support. E.g. if the machine supports KM(AES-256) without chaining it is
assumed that KMC(AES-256) with chaining is available as well. For the
existing CPUs this is true but the architecturally correct way is to
check each CPACF functions on its own. This is what the query function
of each instructions is all about.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The aes and the des module register multiple crypto algorithms
dependent on the availability of specific CPACF instructions.
To simplify the deregistration with crypto_unregister_alg add
an array with pointers to the successfully registered algorithms
and use it for the error handling in the init function and in
the module exit function.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The CPACF instructions can complete with three different condition codes:
CC=0 for successful completion, CC=1 if the protected key verification
failed, and CC=3 for partial completion.
The inline functions will restart the CPACF instruction for partial
completion, this removes the CC=3 case. The CC=1 case is only relevant
for the protected key functions of the KM, KMC, KMAC and KMCTR
instructions. As the protected key functions are not used by the
current code, there is no need for any kind of return code handling.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use a separate define for the decryption modifier bit instead of
duplicating the function codes for encryption / decrypton.
In addition use an unsigned type for the function code.
Reviewed-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The machine check handler will do one of two things if the floating-point
control, a floating point register or a vector register can not be
revalidated:
1) if the PSW indicates user mode the process is terminated
2) if the PSW indicates kernel mode the system is stopped
To unconditionally stop the system for 2) is incorrect.
There are three possible outcomes if the floating-point control, a
floating point register or a vector registers can not be revalidated:
1) The kernel is inside a kernel_fpu_begin/kernel_fpu_end block and
needs the register. The system is stopped.
2) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_CPU bit
is not set. The user space process needs the register and is killed.
3) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_FPU bit
is set. Neither the kernel nor the user space process needs the
lost register. Just revalidate it and continue.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In case of nested user of the FPU or vector registers in the kernel
the current code uses the mask of the FPU/vector registers of the
previous contexts to decide which registers to save and restore.
E.g. if the previous context used KERNEL_VXR_V0V7 and the next
context wants to use KERNEL_VXR_V24V31 the first 8 vector registers
are stored to the FPU state structure. But this is not necessary
as the next context does not use these registers.
Rework the FPU/vector register save and restore code. The new code
does a few things differently:
1) A lowcore field is used instead of a per-cpu variable.
2) The kernel_fpu_end function now has two parameters just like
kernel_fpu_begin. The register flags are required by both
functions to save / restore the minimal register set.
3) The inline functions kernel_fpu_begin/kernel_fpu_end now do the
update of the register masks. If the user space FPU registers
have already been stored neither save_fpu_regs nor the
__kernel_fpu_begin/__kernel_fpu_end functions have to be called
for the first context. In this case kernel_fpu_begin adds 7
instructions and kernel_fpu_end adds 4 instructions.
3) The inline assemblies in __kernel_fpu_begin / __kernel_fpu_end
to save / restore the vector registers are simplified a bit.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To make the vx-insn.h more versatile avoid cpp preprocessor macros
and allow to use plain numbers for vector and general purpose register
operands. With that you can emit an .include from a C file into the
assembler text and then use the vx-insn macros in inline assemblies.
For example:
asm (".include \"asm/vx-insn.h\"");
static inline void xor_vec(int x, int y, int z)
{
asm volatile("VX %0,%1,%2"
: : "i" (x), "i" (y), "i" (z));
}
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The increment might not be atomic and we're not holding the
timekeeper_lock. Therefore we might lose an update to count, resulting in
VDSO being trapped in a loop. As other archs also simply update the
values and count doesn't seem to have an impact on reloading of these
values in VDSO code, let's just remove the update of tb_update_count.
Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
By leaving fixup_cc unset, only the clock comparator of the cpu actually
doing the sync is fixed up until now.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There are still some etr leftovers and wrong comments, let's clean that up.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The way we call do_adjtimex() today is broken. It has 0 effect, as
ADJ_OFFSET_SINGLESHOT (0x0001) in the kernel maps to !ADJ_ADJTIME
(in contrast to user space where it maps to ADJ_OFFSET_SINGLESHOT |
ADJ_ADJTIME - 0x8001). !ADJ_ADJTIME will silently ignore all adjustments
without STA_PLL being active. We could switch to ADJ_ADJTIME or turn
STA_PLL on, but still we would run into some problems:
- Even when switching to nanoseconds, we lose accuracy.
- Successive calls to do_adjtimex() will simply overwrite any leftovers
from the previous call (if not fully handled)
- Anything that NTP does using the sysctl heavily interferes with our
use.
- !ADJ_ADJTIME will silently round stuff > or < than 0.5 seconds
Reusing do_adjtimex() here just feels wrong. The whole STP synchronization
works right now *somehow* only, as do_adjtimex() does nothing and our
TOD clock jumps in time, although it shouldn't. This is especially bad
as the clock could jump backwards in time. We will have to find another
way to fix this up.
As leap seconds are also not properly handled yet, let's just get rid of
all this complex logic altogether and use the correct clock_delta for
fixing up the clock comparator and keeping the sched_clock monotonic.
This change should have 0 effect on the current STP mechanism. Once we
know how to best handle sync events and leap second updates, we'll start
with a fresh implementation.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull facility mask patch from the KVM tree.
* tag 's390forkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
KVM: s390: generate facility mask from readable list
Automatically generate the KVM facility mask out of a readable list.
Manually changing the masks is very error prone, especially if the
special IBM bit numbering has to be considered.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The 'report_error' interface for PCI devices found on s390 can be
used by a user space program to inject an adapter error notification.
Add a new kernel interface zpci_report_error to allow a PCI device
driver to inject these error notifications without a detour over
user space.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge the __p[m|u]xdp_idte and __p[m|u]dp_idte_local functions into a
single __p[m|u]dp_idte function with an additional parameter.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge the __ptep_ipte and __ptep_ipte_local functions into a single
__ptep_ipte function with an additional parameter. The __pte_ipte_range
function is still extra as the while loops makes it hard to merge.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The __tlb_flush_mm() helper uses a global flush if the mm struct
has a gmap structure attached to it. Replace the global flush with
two individual flushes by means of the IDTE instruction if only a
single gmap is attached the the mm.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The local-clearing control of the IDTE instruction does not have any effect
for the clearing-by-ASCE operation. Only the invalidation-and-clearing
operation respects the local-clearing bit.
Remove __tlb_flush_idte_local and simplify the batched TLB flushing code.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull s390 fixes from Martin Schwidefsky:
"A couple of bug fixes, minor cleanup and a change to the default
config"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/dasd: fix failing CUIR assignment under LPAR
s390/pageattr: handle numpages parameter correctly
s390/dasd: fix hanging device after clear subchannel
s390/qdio: avoid reschedule of outbound tasklet once killed
s390/qdio: remove checks for ccw device internal state
s390/qdio: fix double return code evaluation
s390/qdio: get rid of spin_lock_irqsave usage
s390/cio: remove subchannel_id from ccw_device_private
s390/qdio: obtain subchannel_id via ccw_device_get_schid()
s390/cio: stop using subchannel_id from ccw_device_private
s390/config: make the vector optimized crc function builtin
s390/lib: fix memcmp and strstr
s390/crc32-vx: Fix checksum calculation for small sizes
s390: clarify compressed image code path
PPC splits debugfs initialization from creation of the xics device to
unlock the newly taken kvm lock earlier.
s390 prevents userspace from triggering two WARN_ON_ONCE.
MIPS fixes several issues in the management of TLB faults (Cc: stable).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCAAGBQJXrx2ZAAoJEED/6hsPKofoo/4H/jra5NNxvpo09LWlXTwGXxBH
cwcfDZSiOFxgvWztKJOIjPI4ETL3mnZvb9SFWBZZh1U0kfZ/TGiWouwaDNlBkPYj
I3YHuPI7if+yUOmJlI3N2hWa0Wo0qiMqIjKT0pQVSLLdK/CVE+xGyS+qtXTNXHQn
pFdKlYr//7OwQEY0ow1yj5VnsFrXB1JWFyB/+N5zaCfbCaQVyZAL7rj8SUbC/32W
CiNhrvatzierKIfPerWw8DvvBKhCgWaRuLl0W+uMncrC9Qepcx9moM2beD1txK2I
iHor1TDxUPifGQONfWMAlw87FluzHF4vQ5nN2jyTi8TT+CEfZpZ43Q+DY7okD4w=
=NQP9
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"KVM:
- lock kvm_device list to prevent corruption on device creation.
PPC:
- split debugfs initialization from creation of the xics device to
unlock the newly taken kvm lock earlier.
s390:
- prevent userspace from triggering two WARN_ON_ONCE.
MIPS:
- fix several issues in the management of TLB faults (Cc: stable)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
MIPS: KVM: Propagate kseg0/mapped tlb fault errors
MIPS: KVM: Fix gfn range check in kseg0 tlb faults
MIPS: KVM: Add missing gfn range check
MIPS: KVM: Fix mapped fault broken commpage handling
KVM: Protect device ops->create and list_add with kvm->lock
KVM: PPC: Move xics_debugfs_init out of create
KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed
KVM: s390: set the prefix initially properly
When triggering KVM_RUN without a user memory region being mapped
(KVM_SET_USER_MEMORY_REGION) a validity intercept occurs. This could
happen, if the user memory region was not mapped initially or if it
was unmapped after the vcpu is initialized. The function
kvm_s390_handle_requests checks for the KVM_REQ_MMU_RELOAD bit. The
check function always clears this bit. If gmap_mprotect_notify
returns an error code, the mapping failed, but the KVM_REQ_MMU_RELOAD
was not set anymore. So the next time kvm_s390_handle_requests is
called, the execution would fall trough the check for
KVM_REQ_MMU_RELOAD. The bit needs to be resetted, if
gmap_mprotect_notify returns an error code. Resetting the bit with
kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu) fixes the bug.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Julius Niedworok <jniedwor@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
When KVM_RUN is triggered on a VCPU without an initial reset, a
validity intercept occurs.
Setting the prefix will set the KVM_REQ_MMU_RELOAD bit initially,
thus preventing the bug.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Julius Niedworok <jniedwor@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
- Misc fixes and cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXq0ruAAoJECgfDbjSjVRp5P8H/2OlDJdSS1l+TwOXbY95ntQ1
vxUX4vGCX5IujC+Rbt7sQV2prE3b6IktFNagpbRoWn21JkpoDMvPtYJrn5BhLtoh
fvDkZE6Wo3QztFSjaUBZWEABBt03KPX0yrAIZplu8ne/Z8KAT3zK57BPnKfmxwv+
dpxt+1wlnqAvYsoUUQZBFT4Gmk2oDiTofiIbQq7W9W/fooznLtLB+ArYtdfNJizC
JnI/vJuWceEXfjT26HexCRhA2OZskrA4ZadDhOjAqkTPN5DHfweLDuHh7IsVfDd1
wXqjc4ks3cYG0CloJ2qY2K7RpDOFIxIizixeDIuAbn9aX4sPOYYfqRm+4iRwmqQ=
=9aUO
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost fixes and cleanups from Michael Tsirkin:
"Misc fixes and cleanups all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio/s390: deprecate old transport
virtio/s390: keep early_put_chars
virtio_blk: Fix a slient kernel panic
virtio-vsock: fix include guard typo
vhost/vsock: fix vhost virtio_vsock_pkt use-after-free
9p/trans_virtio: use kvfree() for iov_iter_get_pages_alloc()
virtio: fix error handling for debug builds
virtio: fix memory leak in virtqueue_add()
Both set_memory_ro() and set_memory_rw() will modify the page
attributes of at least one page, even if the numpages parameter is
zero.
The author expected that calling these functions with numpages == zero
would never happen. However with the new 444d13ff10 ("modules: add
ro_after_init support") feature this happens frequently.
Therefore do the right thing and make these two functions return
gracefully if nothing should be done.
Fixes crashes on module load like this one:
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 000003ff80008000 TEID: 000003ff80008407
Fault in home space mode while using kernel ASCE.
AS:0000000000d18007 R3:00000001e6aa4007 S:00000001e6a10800 P:00000001e34ee21d
Oops: 0004 ilc:3 [#1] SMP
Modules linked in: x_tables
CPU: 10 PID: 1 Comm: systemd Not tainted 4.7.0-11895-g3fa9045 #4
Hardware name: IBM 2964 N96 703 (LPAR)
task: 00000001e9118000 task.stack: 00000001e9120000
Krnl PSW : 0704e00180000000 00000000005677f8 (rb_erase+0xf0/0x4d0)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 000003ff80008b20 000003ff80008b20 000003ff80008b70 0000000000b9d608
000003ff80008b20 0000000000000000 00000001e9123e88 000003ff80008950
00000001e485ab40 000003ff00000000 000003ff80008b00 00000001e4858480
0000000100000000 000003ff80008b68 00000000001d5998 00000001e9123c28
Krnl Code: 00000000005677e8: ec1801c3007c cgij %r1,0,8,567b6e
00000000005677ee: e32010100020 cg %r2,16(%r1)
#00000000005677f4: a78401c2 brc 8,567b78
>00000000005677f8: e35010080024 stg %r5,8(%r1)
00000000005677fe: ec5801af007c cgij %r5,0,8,567b5c
0000000000567804: e30050000024 stg %r0,0(%r5)
000000000056780a: ebacf0680004 lmg %r10,%r12,104(%r15)
0000000000567810: 07fe bcr 15,%r14
Call Trace:
([<000003ff80008900>] __this_module+0x0/0xffffffffffffd700 [x_tables])
([<0000000000264fd4>] do_init_module+0x12c/0x220)
([<00000000001da14a>] load_module+0x24e2/0x2b10)
([<00000000001da976>] SyS_finit_module+0xbe/0xd8)
([<0000000000803b26>] system_call+0xd6/0x264)
Last Breaking-Event-Address:
[<000000000056771a>] rb_erase+0x12/0x4d0
Kernel panic - not syncing: Fatal exception: panic_on_oops
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: e8a97e42dc ("s390/pageattr: allow kernel page table splitting")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There only ever have been two host implementations of the old
s390-virtio (pre-ccw) transport: the experimental kuli userspace,
and qemu. As qemu switched its default to ccw with 2.4 (with most
users having used ccw well before that) and removed the old transport
entirely in 2.6, s390-virtio probably hasn't been in active use for
quite some time and is therefore likely to bitrot.
Let's start the slow march towards removing the code by deprecating
it.
Note that this also deprecates the early virtio console code, which
has been causing trouble in the guest without being wired up in any
relevant hypervisor code.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For all configs with CONFIG_BTRFS_FS = y we should also make the
optimized crc module builtin. Otherwise early mounts will fall
back to the software variant.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
if two string compare equal the clcle instruction will update the
string addresses to point _after_ the string. This might already
be on a different page, so we should not use these pointer to
calculate the difference as in that case the calculation of the
difference can cause oopses.
The return value of memcmp does not need the difference, we
can just reuse the condition code and return for CC=1 (All bytes
compared, first operand low) -1 and for CC=2 (All bytes compared,
first operand high) +1
strstr also does not need the diff.
While fixing this, make the common function clcle "correct on its
own" by using l1 instead of l2 for the first length. strstr will
call this with l2 for both strings.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: db7f5eef3d ("s390/lib: use basic blocks for inline assemblies")
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The current prealign logic will fail for sizes < alignment,
as the new datalen passed to the vector function is smaller
than zero. Being a size_t this gets wrapped to a huge
number causing memory overruns and wrong data.
Let's add an early exit if the size is smaller than the minimal
size with alignment. This will also avoid calling the software
fallback twice for all sizes smaller than the minimum size
(prealign + remaining)
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: f848dbd3bc ("s390/crc32-vx: add crypto API module for optimized CRC-32 algorithms")
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The way the decompressor is hooked into the start-up code is rather
subtle, with a mix of multiply-defined symbols and hardcoded address
literals. Add some comments at the junction points to clarify how it
works.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull kbuild updates from Michal Marek:
- GCC plugin support by Emese Revfy from grsecurity, with a fixup from
Kees Cook. The plugins are meant to be used for static analysis of
the kernel code. Two plugins are provided already.
- reduction of the gcc commandline by Arnd Bergmann.
- IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada
- bin2c fix by Michael Tautschnig
- setlocalversion fix by Wolfram Sang
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
gcc-plugins: disable under COMPILE_TEST
kbuild: Abort build on bad stack protector flag
scripts: Fix size mismatch of kexec_purgatory_size
kbuild: make samples depend on headers_install
Kbuild: don't add obj tree in additional includes
Kbuild: arch: look for generated headers in obtree
Kbuild: always prefix objtree in LINUXINCLUDE
Kbuild: avoid duplicate include path
Kbuild: don't add ../../ to include path
vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
kconfig.h: use already defined macros for IS_REACHABLE() define
export.h: use __is_defined() to check if __KSYM_* is defined
kconfig.h: use __is_defined() to check if MODULE is defined
kbuild: setlocalversion: print error to STDERR
Add sancov plugin
Add Cyclomatic complexity GCC plugin
GCC plugin infrastructure
Shared library support
VGIC implementation.
- s390: support for trapping software breakpoints, nested virtualization
(vSIE), the STHYI opcode, initial extensions for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
preliminary to this and the upcoming support for hardware virtualization
extensions.
- x86: support for execute-only mappings in nested EPT; reduced vmexit
latency for TSC deadline timer (by about 30%) on Intel hosts; support for
more than 255 vCPUs.
- PPC: bugfixes.
The ugly bit is the conflicts. A couple of them are simple conflicts due
to 4.7 fixes, but most of them are with other trees. There was definitely
too much reliance on Acked-by here. Some conflicts are for KVM patches
where _I_ gave my Acked-by, but the worst are for this pull request's
patches that touch files outside arch/*/kvm. KVM submaintainers should
probably learn to synchronize better with arch maintainers, with the
latter providing topic branches whenever possible instead of Acked-by.
This is what we do with arch/x86. And I should learn to refuse pull
requests when linux-next sends scary signals, even if that means that
submaintainers have to rebase their branches.
Anyhow, here's the list:
- arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
by the nvdimm tree. This tree adds handle_preemption_timer and
EXIT_REASON_PREEMPTION_TIMER at the same place. In general all mentions
of pcommit have to go.
There is also a conflict between a stable fix and this patch, where the
stable fix removed the vmx_create_pml_buffer function and its call.
- virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
This tree adds kvm_io_bus_get_dev at the same place.
- virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
file was completely removed for 4.8.
- include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
this is a change that should have gone in through the irqchip tree and
pulled by kvm-arm. I think I would have rejected this kvm-arm pull
request. The KVM version is the right one, except that it lacks
GITS_BASER_PAGES_SHIFT.
- arch/powerpc: what a mess. For the idle_book3s.S conflict, the KVM
tree is the right one; everything else is trivial. In this case I am
not quite sure what went wrong. The commit that is causing the mess
(fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
and arch/powerpc/kvm/. It's large, but at 396 insertions/5 deletions
I guessed that it wasn't really possible to split it and that the 5
deletions wouldn't conflict. That wasn't the case.
- arch/s390: also messy. First is hypfs_diag.c where the KVM tree
moved some code and the s390 tree patched it. You have to reapply the
relevant part of commits 6c22c98637, plus all of e030c1125e, to
arch/s390/kernel/diag.c. Or pick the linux-next conflict
resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
Second, there is a conflict in gmap.c between a stable fix and 4.8.
The KVM version here is the correct one.
I have pushed my resolution at refs/heads/merge-20160802 (commit
3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
=42ti
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
- ARM: GICv3 ITS emulation and various fixes. Removal of the
old VGIC implementation.
- s390: support for trapping software breakpoints, nested
virtualization (vSIE), the STHYI opcode, initial extensions
for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots
of cleanups, preliminary to this and the upcoming support for
hardware virtualization extensions.
- x86: support for execute-only mappings in nested EPT; reduced
vmexit latency for TSC deadline timer (by about 30%) on Intel
hosts; support for more than 255 vCPUs.
- PPC: bugfixes.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
KVM: PPC: Introduce KVM_CAP_PPC_HTM
MIPS: Select HAVE_KVM for MIPS64_R{2,6}
MIPS: KVM: Reset CP0_PageMask during host TLB flush
MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
MIPS: KVM: Sign extend MFC0/RDHWR results
MIPS: KVM: Fix 64-bit big endian dynamic translation
MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
MIPS: KVM: Use 64-bit CP0_EBase when appropriate
MIPS: KVM: Set CP0_Status.KX on MIPS64
MIPS: KVM: Make entry code MIPS64 friendly
MIPS: KVM: Use kmap instead of CKSEG0ADDR()
MIPS: KVM: Use virt_to_phys() to get commpage PFN
MIPS: Fix definition of KSEGX() for 64-bit
KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
kvm: x86: nVMX: maintain internal copy of current VMCS
KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
KVM: arm64: vgic-its: Simplify MAPI error handling
KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
...
This fixes the same issue Steven already fixed for x86
in following commit:
237d28db03 ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
It fixes the crash, that happens when function graph tracing
and jprobes are used simultaneously. Please refer to above
commit for details.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>