For now, only the owning VCPU thread (that has loaded the VCPU) can get a
consistent cpu timer value when calculating the delta. However, other
threads might also be interested in a more recent, consistent value. Of
special interest will be the timer callback of a VCPU that executes without
having the VCPU loaded and could run in parallel with the VCPU thread.
The cpu timer has a nice property: it is only updated by the owning VCPU
thread. And speaking about accounting, a consistent value can only be
calculated by looking at cputm_start and the cpu timer itself in
one shot, otherwise the result might be wrong.
As we only have one writing thread at a time (owning VCPU thread), we can
use a seqcount instead of a seqlock and retry if the VCPU refreshed its
cpu timer. This avoids any heavy locking and only introduces a counter
update/check plus a handful of smp_wmb().
The owning VCPU thread should never have to retry on reads, and also for
other threads this might be a very rare scenario.
Please note that we have to use the raw_* variants for locking the seqcount
as lockdep will produce false warnings otherwise. The rq->lock held during
vcpu_load/put is also acquired from hardirq context. Lockdep cannot know
that we avoid potential deadlocks by disabling preemption and thereby
disable concurrent write locking attempts (via vcpu_put/load).
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Architecturally we should only provide steal time if we are scheduled
away, and not if the host interprets a guest exit. We have to step
the guest CPU timer in these cases.
In the first shot, we will step the VCPU timer only during the kvm_run
ioctl. Therefore all time spent e.g. in interception handlers or on irq
delivery will be accounted for that VCPU.
We have to take care of a few special cases:
- Other VCPUs can test for pending irqs. We can only report a consistent
value for the VCPU thread itself when adding the delta.
- We have to take care of STP sync, therefore we have to extend
kvm_clock_sync() and disable preemption accordingly
- During any call to disable/enable/start/stop we could get premeempted
and therefore get start/stop calls. Therefore we have to make sure we
don't get into an inconsistent state.
Whenever a VCPU is scheduled out, sleeping, in user space or just about
to enter the SIE, the guest cpu timer isn't stepped.
Please note that all primitives are prepared to be called from both
environments (cpu timer accounting enabled or not), although not completely
used in this patch yet (e.g. kvm_s390_set_cpu_timer() will never be called
while cpu timer accounting is enabled).
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We want to manually step the cpu timer in certain scenarios in the future.
Let's abstract any access to the cpu timer, so we can hide the complexity
internally.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
By storing the cpu id, we have a way to verify if the current cpu is
owning a VCPU.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
DIAG 0x288 may occur now. Let's add its code to the diag table in
sie.h.
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
A KVM_GET_DIRTY_LOG ioctl might take a long time.
This can result in fatal signals seemingly being ignored.
Lets bail out during the dirty bit sync, if a fatal signal
is pending.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Dirty log query can take a long time for huge guests.
Holding the mmap_sem for very long times can cause some unwanted
latencies.
Turns out that we do not need to hold the mmap semaphore.
We hold the slots_lock for gfn->hva translation and walk the page
tables with that address, so no need to look at the VMAs. KVM also
holds a reference to the mm, which should prevent other things
going away. During the walk we take the necessary ptl locks.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Since commit 9977e886cb ("s390/kernel: lazy restore fpu registers"),
vregs in struct sie_page is unsed. We can safely remove the field and
the definition.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
On instruction-fetch exceptions, we have to forward the PSW by any
valid ilc and correctly use that ilc when injecting the irq. Injection
will already take care of rewinding the PSW if we injected a nullifying
program irq, so we don't need special handling prior to injection.
Until now, autodetection would have guessed an ilc of 0.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
On SIE faults, the ilc cannot be detected automatically, as the icptcode
is 0. The ilc indicated in the program irq will always be 0. Therefore we
have to manually specify the ilc in order to tell the guest which ilen was
used when forwarding the PSW.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Program irq injection during program irq intercepts is the last candidates
that injects nullifying irqs and relies on delivery to do the right thing.
As we should not rely on the icptcode during any delivery (because that
value will not be migrated), let's add a flag, telling prog IRQ delivery
to not rewind the PSW in case of nullifying prog IRQs.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
__extract_prog_irq() is used only once for getting the program check data
in one place. Let's combine it with an injection function to avoid a memset
and to prevent misuse on injection by simplifying the interface to only
have the VCPU as parameter.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's use our fresh new function read_guest_instr() to access
guest storage via the correct addressing schema.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
When an instruction is to be fetched, special handling applies to
secondary-space mode and access-register mode. The instruction is to be
fetched from primary space.
We can easily support this by selecting the right asce for translation.
Access registers will never be used during translation, so don't
include them in the interface. As we only want to read from the current
PSW address for now, let's also hide that detail.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We will need special handling when fetching instructions, so let's
introduce new guest access modes GACC_FETCH and GACC_STORE instead
of a write flag. An additional patch will then introduce GACC_IFETCH.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We have to migrate the program irq ilc and someday we will have to
specify the ilc without KVM trying to autodetect the value.
Let's reuse one of the spare fields in our program irq that should
always be set to 0 by user space. Because we also want to make use
of 0 ilcs ("not available"), we need a validity indicator.
If no valid ilc is given, we try to autodetect the ilc via the current
icptcode and icptstatus + parameter and store the valid ilc in the
irq structure.
This has a nice effect: QEMU's making use of KVM_S390_IRQ /
KVM_S390_SET_IRQ_STATE / KVM_S390_GET_IRQ_STATE for migration will
directly migrate the ilc without any changes.
Please note that we use bit 0 as validity and bit 1,2 for the ilc, so
by applying the ilc mask we directly get the ilen which is usually what
we work with.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We have some confusion about ilc vs. ilen in our current code. So let's
correctly use the term ilen when dealing with (ilc << 1).
Program irq injection didn't take care of the correct ilc in case of
irqs triggered by EXECUTE functions, let's provide one function
kvm_s390_get_ilen() to take care of all that.
Also, manually specifying in intercept handlers the size of the
instruction (and sometimes overwriting that value for EXECUTE internally)
doesn't make too much sense. So also provide the functions:
- kvm_s390_retry_instr to retry the currently intercepted instruction
- kvm_s390_rewind_psw to rewind the PSW without internal overwrites
- kvm_s390_forward_psw to forward the PSW
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
As we already store the floating point registers in the vector save area
in floating point register format when we don't have MACHINE_HAS_VX, we can
directly expose them to user space using a new sync flag.
The floating point registers will be valid when KVM_SYNC_FPRS is set. The
fpc will also be valid when KVM_SYNC_FPRS is set.
Either KVM_SYNC_FPRS or KVM_SYNC_VRS will be enabled, never both.
Let's also change two positions where we access vrs, making the code easier
to read and one comment superfluous.
Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If we have MACHINE_HAS_VX, the floating point registers are stored
in the vector register format, event if the guest isn't enabled for vector
registers. So we can allow KVM_SYNC_VRS as soon as MACHINE_HAS_VX is
available.
This can in return be used by user space to support floating point
registers via struct kvm_run when the machine has vector registers.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Different pieces of code checked for vcpu->arch.apic being (non-)NULL,
or used kvm_vcpu_has_lapic (more optimized) or lapic_in_kernel.
Replace everything with lapic_in_kernel's name and kvm_vcpu_has_lapic's
implementation.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do for kvm_cpu_has_pending_timer and kvm_inject_pending_timer_irqs
what the other irq.c routines have been doing.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Usually the in-kernel APIC's existence is checked in the caller. Do not
bother checking it again in lapic.c.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add host irq information in trace event, so we can better understand
which irq is in posted mode.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use vector-hashing to deliver lowest-priority interrupts for
VT-d posted-interrupts. This patch extends kvm_intr_is_single_vcpu()
to support lowest-priority handling.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use vector-hashing to deliver lowest-priority interrupts, As an
example, modern Intel CPUs in server platform use this method to
handle lowest-priority interrupts.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the interrupt is not single destination any more, we need
to change back IRTE to remapped mode explicitly.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is similar to the existing div_frac function, but it returns the
remainder too. Unlike div_frac, it can be used to implement long
division, e.g. (a << 64) / b for 32-bit a and b.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 29bb45f25f (regmap-mmio: Use native endianness for read/write)
attempted to fix some long standing bugs in the MMIO implementation for
big endian systems caused by duplicate byte swapping in both regmap and
readl()/writel() which affected MIPS systems as when they are in big
endian mode they flip the endianness of all registers in the system, not
just the CPU. MIPS systems had worked around this by declaring regmap
using IPs as little endian which is inaccurate, unfortunately the issue
had not been reported.
Sadly the fix makes things worse rather than better. By changing the
behaviour to match the documentation it caused behaviour changes for
other IPs which broke them and by using the __raw I/O accessors to avoid
the endianness swapping in readl()/writel() it removed some memory
ordering guarantees and could potentially generate unvirtualisable
instructions on some architectures.
Unfortunately sorting out all this mess in any half way sensible fashion
was far too invasive to go in during an -rc cycle so instead let's go
back to the old broken behaviour for v4.5, the better fixes are already
queued for v4.6. This does mean that we keep the broken MIPS DTs for
another release but that seems the least bad way of handling the
situation.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWtIjbAAoJECTWi3JdVIfQs8QH/jNpfio4klDkdlH/KpPZXlrp
FzASbGePNtLqZXFL5WcG//ni3EYdbaiXZIdLBKDx9K4F2ca9FAF8aAnZAZ5uefGx
bnloYpV34DqQwS5f9FrrNsm+YVTTuUIt0dx4ZRGCEdMTzW7i3efs/9eVEITUixK6
U1obTJovAl33bihadsC9hzJVwfOq3H4aFFWc/EFZzbQaU2/so2eiA1dhPr/YErRJ
dMR8drWxpYXuBsrk5T647R0sUw7pA4Zw+WAF032TPQf/1Fy9Vk1/yXbTyccZzFzo
bfupRA/HpeLNZ9cN9z9y3Fa0je4UNbBZh0poB5B773af84NnhX7Ytenjo+peVxI=
=+Q6E
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A single revert back to v4.4 endianness handling.
Commit 29bb45f25f ("regmap-mmio: Use native endianness for
read/write") attempted to fix some long standing bugs in the MMIO
implementation for big endian systems caused by duplicate byte
swapping in both regmap and readl()/writel(). Sadly the fix makes
things worse rather than better, so revert it for now"
* tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: mmio: Revert to v4.4 endianness handling
A few random fixes, mostly coming from the PMU work by Shannon:
- fix for injecting faults coming from the guest's userspace
- cleanup for our CPTR_EL2 accessors (reserved bits)
- fix for a bug impacting perf (user/kernel discrimination)
- fix for a 32bit sysreg handling bug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWqehPAAoJECPQ0LrRPXpDn6gP/2lrJ9lV5I3MxLzUytmRY8EM
Xl8WnNEQJ0e7oEdb1l6k4DR8D/HefzXpp/YWHY1WdDZSej0b2egro1xsFWdgaOr9
NVGJnoQBlCFqSIf2szml4ftpHXZZ/kMF/EvhtzEL6cpUdqeA/tkS6HoCMQknhCbx
3zOYnNKCGQUkFhTKJUSXB6NcZ/950uqkQxAdCPNUTGg1YzkNfbcgTewqKsmb25Cv
/sOUFmrq2AlnWkdH+QWP0BtNFUX9saOSXvxrABT6nfiXSpUeF6Rprcgi9gdoNhAD
mfE5IFw0dOEo2XThZTchKu3FBSMAkDadvC9yWFr88dr62E6EKFPzY3vHLCA8QoT/
zk5beGSjyWGe7FZZJ4CKdO4EWBZr/WSlSVzOfG4ZBVPUoh2AZcUEhzzrzTezzocO
71/5ZVpQ6O8+Pxwyy85Vd2drf7OZLagGNydNx46RHXrRxl+q0c5vFTVh4Txbd4YU
XNsd+kA62/OYyPHbtVzTzAPPKG7aM8hLzdy8dkTgvuDzWHmxFWhD/HgiMHfFrQqs
WCafvBhTc4375dvwYOupxaU2ncHKvt/zQJtBOw6bEwAIUa5c1IkIUr0i8XgRq6lr
x/YvhFIwiVyXVnrDt3ZSIx79Oajf541uJg7vLFyPBQkcnQlJ6T7oy7qJlqhM0567
Sr6G0/YXa1ccIfmKyeh4
=36kx
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/ARM fixes for v4.5-rc2
A few random fixes, mostly coming from the PMU work by Shannon:
- fix for injecting faults coming from the guest's userspace
- cleanup for our CPTR_EL2 accessors (reserved bits)
- fix for a bug impacting perf (user/kernel discrimination)
- fix for a 32bit sysreg handling bug
The first real batch of fixes for this release cycle, so there are a few more
than usual.
Most of these are fixes and tweaks to board support (DT bugfixes, etc). I've
also picked up a couple of small cleanups that seemed innocent enough that
there was little reason to wait (const/__initconst and Kconfig deps).
Quite a bit of the changes on OMAP were due to fixes to no longer write to
rodata from assembly when ARM_KERNMEM_PERMS was enabled, but there were also
other fixes.
Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC fixes
on OMAP5, and Nomadik had changes to MMC parameters in DT.
All in all, mostly the usual mix of various fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWt8K0AAoJEIwa5zzehBx3HxsQAJMqKkTCr/2hzHTw5V8sTgDf
zrVYEi5WF5IGLR4eON31rF31tbEmQd0bqVlsTLy/yK3hu1gTQwDyqBJqoEQBMQUW
lBShtVERP3mNUm0yICeupaWIhoRqaymlwFKKfq93f+YTn27pEDQ1ImEHuARlbAKa
3zCd91ClRRm3WxrBXj9srt/NyMX7BlcHLjcN1BurpVkR0aciW1B692Lb8LotEP4k
D1CLNZeQEwV+uOHcJsvjEdB/Uh42+dpsxbIAaBW2cFB0iuX3BsnmferoFe0cXmpC
wO5ffvzr0LCMsrUzUsbvn0RgRtMDi2RxrS1n0cXrAVPP6OEeOaMLwGdPUGvQ2EVI
cvCHpw3qXRz7CTERpy7bv0YugIY3vZPukJrne2ZEH7cpA/JLsuqlKm/cOmPRB7gJ
tC2mXlP5jHbbGRiq/Kk3QB7QsKIxHfIalCZMMiRe0ldWSDW6jDpvrv4Nsfzs3etN
LaB0iIm3f5DqOFjjZi+LVUJUGE3M8/3Fs2f70rCdPKDGq9fTqD3+2mK7l80ZaYXG
J3wPKM+9WXGISakS/biQzvYA9iDnbDZCTUxBIM6VlvcHmARJEH3TS5ZjR0eaIb7w
Sqx7e2ufm/2wpGINDoT1qms14cI8ayj7iq+8fDnI3R9XSXxeKk5J5jo9fKnbnDWP
4A4Ai+NYBv/rDWjkg19s
=1iBu
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"The first real batch of fixes for this release cycle, so there are a
few more than usual.
Most of these are fixes and tweaks to board support (DT bugfixes,
etc). I've also picked up a couple of small cleanups that seemed
innocent enough that there was little reason to wait (const/
__initconst and Kconfig deps).
Quite a bit of the changes on OMAP were due to fixes to no longer
write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but
there were also other fixes.
Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC
fixes on OMAP5, and Nomadik had changes to MMC parameters in DT.
All in all, mostly the usual mix of various fixes"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits)
ARM: multi_v7_defconfig: enable DW_WATCHDOG
ARM: nomadik: fix up SD/MMC DT settings
ARM64: tegra: Add chosen node for tegra132 norrin
ARM: realview: use "depends on" instead of "if" after prompt
ARM: tango: use "depends on" instead of "if" after prompt
ARM: tango: use const and __initconst for smp_operations
ARM: realview: use const and __initconst for smp_operations
bus: uniphier-system-bus: revive tristate prompt
arm64: dts: Add missing DMA Abort interrupt to Juno
bus: vexpress-config: Add missing of_node_put
ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings
ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux
ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux
ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency
ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2
ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address
ARM: dts: LogicPD Torpedo: Revert Duplicative Entries
ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types
ARM: dts: am4372: fix irq type for arm twd and global timer
ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type
...
Commit 16da306849 ("um: kill pfn_t") introduced a compile warning for
defconfig (SUBARCH=i386):
arch/um/kernel/skas/mmu.c:38:206:
warning: right shift count >= width of type [-Wshift-count-overflow]
Aforementioned patch changes the definition of the phys_to_pfn() macro
from
((pfn_t) ((p) >> PAGE_SHIFT))
to
((p) >> PAGE_SHIFT)
This effectively changes the phys_to_pfn() expansion's type from
unsigned long long to unsigned long.
Through the callchain init_stub_pte() => mk_pte(), the expansion of
phys_to_pfn() is (indirectly) fed into the 'phys' argument of the
pte_set_val(pte, phys, prot) macro, eventually leading to
(pte).pte_high = (phys) >> 32;
This results in the warning from above.
Since UML only deals with 32 bit addresses, the upper 32 bits from
'phys' used to be always zero anyway. Also, all page protection flags
defined by UML don't use any bits beyond bit 9. Since the contents of a
PTE are defined within architecture scope only, the ->pte_high member
can be safely removed.
Remove the ->pte_high member from struct pte_t.
Rename ->pte_low to ->pte.
Adapt the pte helper macros in arch/um/include/asm/page.h.
Noteworthy is the pte_copy() macro where a smp_wmb() gets dropped. This
write barrier doesn't seem to be paired with any read barrier though and
thus, was useless anyway.
Fixes: 16da306849 ("um: kill pfn_t")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 944d9fec8d ("hugetlb: add support for gigantic page allocation
at runtime") has added the runtime gigantic page allocation via
alloc_contig_range(), making this support available only when CONFIG_CMA
is enabled. Because it doesn't depend on MIGRATE_CMA pageblocks and the
associated infrastructure, it is possible with few simple adjustments to
require only CONFIG_MEMORY_ISOLATION instead of full CONFIG_CMA.
After this patch, alloc_contig_range() and related functions are
available and used for gigantic pages with just CONFIG_MEMORY_ISOLATION
enabled. Note CONFIG_CMA selects CONFIG_MEMORY_ISOLATION. This allows
supporting runtime gigantic pages without the CMA-specific checks in
page allocator fastpaths.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One of the randconfig build failed with the error:
arch/m32r/kernel/smp.c: In function 'smp_flush_tlb_mm':
arch/m32r/kernel/smp.c:283:20: error: subscripted value is neither array nor pointer nor vector
mmc = &mm->context[cpu_id];
^
arch/m32r/kernel/smp.c: In function 'smp_flush_tlb_page':
arch/m32r/kernel/smp.c:353:20: error: subscripted value is neither array nor pointer nor vector
mmc = &mm->context[cpu_id];
^
arch/m32r/kernel/smp.c: In function 'smp_invalidate_interrupt':
arch/m32r/kernel/smp.c:479:41: error: subscripted value is neither array nor pointer nor vector
unsigned long *mmc = &flush_mm->context[cpu_id];
It turned out that CONFIG_SMP was defined but CONFIG_MMU was not
defined. But arch/m32r/include/asm/mmu.h only defines mm_context_t as
an array when both CONFIG_SMP and CONFIG_MMU are defined. And
arch/m32r/kernel/smp.c is always using context as an array. So without
MMU SMP can not work.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 29bb45f25f (regmap-mmio: Use native endianness for read/write)
attempted to fix some long standing bugs in the MMIO implementation for
big endian systems caused by duplicate byte swapping in both regmap and
readl()/writel() which affected MIPS systems as when they are in big
endian mode they flip the endianness of all registers in the system, not
just the CPU. MIPS systems had worked around this by declaring regmap
using IPs as little endian which is inaccurate, unfortunately the issue
had not been reported.
Sadly the fix makes things worse rather than better. By changing the
behaviour to match the documentation it caused behaviour changes for
other IPs which broke them and by using the __raw I/O accessors to avoid
the endianness swapping in readl()/writel() it removed some memory
ordering guarantees and could potentially generate unvirtualisable
instructions on some architectures.
Unfortunately sorting out all this mess in any half way sensible fashion
was far too invasive to go in during an -rc cycle so instead let's go
back to the old broken behaviour for v4.5, the better fixes are already
queued for v4.6. This does mean that we keep the broken MIPS DTs for
another release but that seems the least bad way of handling the
situation.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Mark Brown <broonie@kernel.org>
- Add missing PAN toggling in the futex code
- Fix missing #include that briefly caused issues in -next
- Allow changing of vmalloc permissions with set_memory_* (used by bpf)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJWs2KkAAoJELescNyEwWM0kn4H+gIGvX7Dk792AJlLj8u1hIb6
m+Z4hgLcNIFZCWEGEEWOjHAJ1n2+SgRPdJgDmhN4KP+5Oh9+Qkqhj3wDy7RB05Bf
q3B/dZzr/rASRpycOyNL4CoyqScS7YsP9+X+7tNC8dAMr8UeYahvVKYVwzMlXfDh
4hh3gZkhBQ/hIUse02VtE+OR6lbZAwMYahrM2T9YHslNKanQkpezqBVD4Ic0J5ic
Jkpep8tBdTBLrFm4WERoO6vv8YCrXDP+6DutAO/nOmvHTr07LgkRwiOn+HmoCxiv
Ir5j01SV2SNlr7AfLBWBLN/NluoRByzmQ6SeEsQ3thN35fIsVndStKcAQAAu1nU=
=pr2h
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Nothing particularly interesting here, but all important fixes
nonetheless:
- Add missing PAN toggling in the futex code
- Fix missing #include that briefly caused issues in -next
- Allow changing of vmalloc permissions with set_memory_* (used by
bpf)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: asm: Explicitly include linux/personality.h in asm/page.h
arm64: futex.h: Add missing PAN toggling
arm64: allow vmalloc regions to be set with set_memory_*
The watchdog timer on the SoCFPGA platform is the Synopsys Designware watchdog.
Enable CONFIG_DW_WATCHDOG for the driver to get built.
Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
The DTSI file for the Nomadik does not properly specify how the
PL180 levelshifter is connected: the Nomadik actually needs all
the five st,sig-dir-* flags set to properly control all lines out.
Further this board supports full power cycling of the card, and
since this variant has no hardware clock gating, it needs a
ridiculously low frequency setting to keep up with the ever
overflowing FIFO.
The pin configuration set-up is a bit of a mystery, because of
course these pins are a mix of inputs and outputs. However the
reference implementation sets all pins to "output" with
unspecified initial value, so let's do that here as well.
Cc: stable@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
asm/page.h uses READ_IMPLIES_EXEC from linux/personality.h but does not
explicitly include it causing build failures in -next where whatever was
causing it to be implicitly included has changed to remove that
inclusion. Add an explicit inclusion to fix this.
Signed-off-by: Mark Brown <broonie@kernel.org>
[will: moved #include inside #ifndef __ASSEMBLY__ block]
Signed-off-by: Will Deacon <will.deacon@arm.com>
futex.h's futex_atomic_cmpxchg_inatomic() does not use the
__futex_atomic_op() macro and needs its own PAN toggling. This was missed
when the feature was implemented.
Fixes: 338d4f49d6 ("arm64: kernel: Add support for Privileged Access Never")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The range of set_memory_* is currently restricted to the module address
range because of difficulties in breaking down larger block sizes.
vmalloc maps PAGE_SIZE pages so it is safe to use as well. Update the
function ranges and add a comment explaining why the range is restricted
the way it is.
Suggested-by: Laura Abbott <labbott@fedoraproject.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull networking fixes from David Miller:
"This looks like a lot but it's a mixture of regression fixes as well
as fixes for longer standing issues.
1) Fix on-channel cancellation in mac80211, from Johannes Berg.
2) Handle CHECKSUM_COMPLETE properly in xt_TCPMSS netfilter xtables
module, from Eric Dumazet.
3) Avoid infinite loop in UDP SO_REUSEPORT logic, also from Eric
Dumazet.
4) Avoid a NULL deref if we try to set SO_REUSEPORT after a socket is
bound, from Craig Gallek.
5) GRO key comparisons don't take lightweight tunnels into account,
from Jesse Gross.
6) Fix struct pid leak via SCM credentials in AF_UNIX, from Eric
Dumazet.
7) We need to set the rtnl_link_ops of ipv6 SIT tunnels before we
register them, otherwise the NEWLINK netlink message is missing
the proper attributes. From Thadeu Lima de Souza Cascardo.
8) Several Spectrum chip bug fixes for mlxsw switch driver, from Ido
Schimmel
9) Handle fragments properly in ipv4 easly socket demux, from Eric
Dumazet.
10) Don't ignore the ifindex key specifier on ipv6 output route
lookups, from Paolo Abeni"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (128 commits)
tcp: avoid cwnd undo after receiving ECN
irda: fix a potential use-after-free in ircomm_param_request
net: tg3: avoid uninitialized variable warning
net: nb8800: avoid uninitialized variable warning
net: vxge: avoid unused function warnings
net: bgmac: clarify CONFIG_BCMA dependency
net: hp100: remove unnecessary #ifdefs
net: davinci_cpdma: use dma_addr_t for DMA address
ipv6/udp: use sticky pktinfo egress ifindex on connect()
ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()
netlink: not trim skb for mmaped socket when dump
vxlan: fix a out of bounds access in __vxlan_find_mac
net: dsa: mv88e6xxx: fix port VLAN maps
fib_trie: Fix shift by 32 in fib_table_lookup
net: moxart: use correct accessors for DMA memory
ipv4: ipconfig: avoid unused ic_proto_used symbol
bnxt_en: Fix crash in bnxt_free_tx_skbs() during tx timeout.
bnxt_en: Exclude rx_drop_pkts hw counter from the stack's rx_dropped counter.
bnxt_en: Ring free response from close path should use completion ring
net_sched: drr: check for NULL pointer in drr_dequeue
...
assembly fixes, the other things are mostly board related:
- A series of omap assembly code fixes to fix issues with rodata with
ARM_KERNMEM_PERMS enabled. We had several places writing to rodata,
which is bad. The fix in most cases is to load the value from data
section using a pointer. Let's also enable ARM_KERNMEM_PERMS so
DEBUG_RODATA gets selected by default. And while testing things,
I also added few more loadable driver modules to the defconfig that
I seem to need quite often.
- Fix a long standing omap5 RTC mystery and enable RTC where we need
to ensure the SoC msecure pin is high so we can write to the RTC
registers.
- Fix irq types for am437x
- A series of minor dts fixes for sbc-am57x and cl-som-am57x
- Fixes for torpedo dts to make WLAN behave and to remove a duplicate
i2c rate entry
This series also includes few minor changes that are not stricly
fixes, but would be good to get in during the early -rc cycle:
- Remove legacy mailbox platform data that is no longer needed
- Add the pdata-quirks needed for the new pwm-omap-dmtimer so
people can use it
- Enable ti,mbox-send-noirq that's needed by wkup_m3 driver
- Enable SPLIT and DWARF4 in omap2plus_defconfig as it makes the
initramfs quite a bit smaller
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWqT5XAAoJEBvUPslcq6Vz01kP/jdo4hXcvUgtG9W5yIxortiK
Sg2D0omkQzwgNHhh9K/ezOYaGwgRJ8grEkCYImlr7n/zGr7Mpt3eUiJC7gbV8xg4
nEPNxGoIQSQ3A0NVV+6gtnHHco4ajih/l7A+0UDZy/x375VExW46HS0KLWy2hov9
WgEJDNBIZBdBN3S3CJ2pO1+I4KHkk9vqaDHjfDaSnyQXRKxQTziubnk5KhfcYpMS
0fDY9BqJFDp0gbE3Dp3GOk/eEW+6XQAUFxK2i+rp3fmOhENBbbEAPWJ4qM8VFQr+
ITQdd2o/SXE3hnqoXLMpCBFPSBDD7UMoxIp3gtMu/YwRePw8zETeQKYuHwSO69oz
BKoKXJKg1WfiTquCzwijlqvOhMi0KzVSBi+X5MSQaUl+30qrHXdY4ecHvQAzp9vZ
OtkCLI5SLmxCRLQllssifey91IfaWEm01So/XgvSgqUVfTLrUcBU5emlgwK5NMy5
ya6NsOu0ME9k6GuGCWupGnVpUHlIAj4e60xisiVYI9GP4sN8aCey/RiOR+rWZW+C
YYYxttRqRlwKH1VHNow0aWCG15hNjWGW8XWCSNyJeCCigObEwQxBE1xfAwGeGZ5s
FA+ogfZesvPu/u1ychgF1e0w30P9AnWhTt1dTR7TNxLBqFsrbhWz+r0jOTzrlYaR
6l2eMQYPlQ4569rGIbdF
=NqvQ
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.5/fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omaps with the most intrusive stuff being read-only data
assembly fixes, the other things are mostly board related:
- A series of omap assembly code fixes to fix issues with rodata with
ARM_KERNMEM_PERMS enabled. We had several places writing to rodata,
which is bad. The fix in most cases is to load the value from data
section using a pointer. Let's also enable ARM_KERNMEM_PERMS so
DEBUG_RODATA gets selected by default. And while testing things,
I also added few more loadable driver modules to the defconfig that
I seem to need quite often.
- Fix a long standing omap5 RTC mystery and enable RTC where we need
to ensure the SoC msecure pin is high so we can write to the RTC
registers.
- Fix irq types for am437x
- A series of minor dts fixes for sbc-am57x and cl-som-am57x
- Fixes for torpedo dts to make WLAN behave and to remove a duplicate
i2c rate entry
This series also includes few minor changes that are not stricly
fixes, but would be good to get in during the early -rc cycle:
- Remove legacy mailbox platform data that is no longer needed
- Add the pdata-quirks needed for the new pwm-omap-dmtimer so
people can use it
- Enable ti,mbox-send-noirq that's needed by wkup_m3 driver
- Enable SPLIT and DWARF4 in omap2plus_defconfig as it makes the
initramfs quite a bit smaller
* tag 'omap-for-v4.5/fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (23 commits)
ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings
ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux
ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux
ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency
ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2
ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address
ARM: dts: LogicPD Torpedo: Revert Duplicative Entries
ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types
ARM: dts: am4372: fix irq type for arm twd and global timer
ARM: dts: Fix wl12xx missing clocks that cause hangs
ARM: OMAP: Add PWM dmtimer platform data quirks
ARM: omap2plus_defconfig: Enable ARM_KERNMEM_PERMS and few loadable modules
ARM: OMAP2+: Fix ppa_zero_params and ppa_por_params for rodata
ARM: OMAP2+: Fix l2_inv_api_params for rodata
ARM: OMAP2+: Fix save_secure_ram_context for rodata
ARM: OMAP2+: Fix l2dis_3630 for rodata
ARM: OMAP2+: Fix wait_dll_lock_timed for rodata
ARM: OMAP2+: Remove legacy mailbox device instantiation
ARM: dts: AM4372: Add ti,mbox-send-noirq to wkup_m3 mailbox
ARM: dts: AM33xx: Add ti,mbox-send-noirq to wkup_m3 mailbox
...
Signed-off-by: Olof Johansson <olof@lixom.net>
- sama5d4: error in DBGU index
- addition of phy properties in several boards
- at91sam9n12ek fix a panel compatible string
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJWqdjFAAoJEAf03oE53VmQG30H/R9luD6d9ebmhaOE7ay40HAy
RrG8wtRQ/zgxf37SntoJxyVxxXWsDsb7sOR1LRXiT7FEGWr3Eip7B1uwmasC8pky
ich2Yj5pGVUH+qscm26GDGyHbNwIrFOQyl1t/R6upVpITlXa0bpaEIXx3RejH8PN
Wk4pMZg/4OkUXlcmYNU0Rz8ban8GfJ428bkLxMKeUXUAjvevNlWqTvOqC+QIrzUC
w3iDoXfhc81sqrOzBzW44H28g7rh//d3TAfzMbM1BJti880QkP+CksG5qHvogI5f
olN4+9QmV9tLXOr2K6iUkM8dwzHYW/3PPBR1CHODjSLP0rQ2Assy+DPxyO64Rtk=
=jJrB
-----END PGP SIGNATURE-----
Merge tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91 into fixes
First fixes for 4.5. Only DT changes:
- sama5d4: error in DBGU index
- addition of phy properties in several boards
- at91sam9n12ek fix a panel compatible string
* tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91:
ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type
ARM: dts: at91: sama5d4 xplained: properly mux phy interrupt
ARM: dts: at91: sama5d4ek: add phy address and IRQ for macb0
ARM: dts: at91: sama5d2 xplained: add phy address and IRQ for macb0
ARM: dts: at91: at91sam9n12ek: fix panel compatible string
ARM: dts: at91: sama5d4: fix instance id of DBGU
Signed-off-by: Olof Johansson <olof@lixom.net>
The NVIDIA bootloader, nvtboot, expects the "chosen" node to be present
in the device-tree blob and if it is not then it fails to boot the kernel.
Add the chosen node so we can boot the kernel on Tegra132 Norrin with the
nvtboot bootloader.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
This platform recently moved to multi-platform, so missed the global
fixup by commit e324654294 ("ARM: use "depends on" for SoC configs
instead of "if" after prompt"). Fix it now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>