Commit Graph

17464 Commits

Author SHA1 Message Date
Paul Mackerras
d27998185d Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next
This merges in the ppc-kvm topic branch of the powerpc tree to get
two patches which are prerequisites for the following patch series,
plus another patch which touches both powerpc and KVM code.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-19 12:09:57 +11:00
Paul Mackerras
d075745d89 KVM: PPC: Book3S HV: Improve handling of debug-trigger HMIs on POWER9
Hypervisor maintenance interrupts (HMIs) are generated by various
causes, signalled by bits in the hypervisor maintenance exception
register (HMER).  In most cases calling OPAL to handle the interrupt
is the correct thing to do, but the "debug trigger" HMIs signalled by
PPC bit 17 (bit 46) of HMER are used to invoke software workarounds
for hardware bugs, and OPAL does not have any code to handle this
cause.  The debug trigger HMI is used in POWER9 DD2.0 and DD2.1 chips
to work around a hardware bug in executing vector load instructions to
cache inhibited memory.  In POWER9 DD2.2 chips, it is generated when
conditions are detected relating to threads being in TM (transactional
memory) suspended mode when the core SMT configuration needs to be
reconfigured.

The kernel currently has code to detect the vector CI load condition,
but only when the HMI occurs in the host, not when it occurs in a
guest.  If a HMI occurs in the guest, it is always passed to OPAL, and
then we always re-sync the timebase, because the HMI cause might have
been a timebase error, for which OPAL would re-sync the timebase, thus
removing the timebase offset which KVM applied for the guest.  Since
we don't know what OPAL did, we don't know whether to subtract the
timebase offset from the timebase, so instead we re-sync the timebase.

This adds code to determine explicitly what the cause of a debug
trigger HMI will be.  This is based on a new device-tree property
under the CPU nodes called ibm,hmi-special-triggers, if it is
present, or otherwise based on the PVR (processor version register).
The handling of debug trigger HMIs is pulled out into a separate
function which can be called from the KVM guest exit code.  If this
function handles and clears the HMI, and no other HMI causes remain,
then we skip calling OPAL and we proceed to subtract the guest
timebase offset from the timebase.

The overall handling for HMIs that occur in the host (i.e. not in a
KVM guest) is largely unchanged, except that we now don't set the flag
for the vector CI load workaround on DD2.2 processors.

This also removes a BUG_ON in the KVM code.  BUG_ON is generally not
useful in KVM guest entry/exit code since it is difficult to handle
the resulting trap gracefully.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-18 15:31:25 +11:00
Paul Mackerras
00608e1f00 KVM: PPC: Book3S HV: Allow HPT and radix on the same core for POWER9 v2.2
POWER9 chip versions starting with "Nimbus" v2.2 can support running
with some threads of a core in HPT mode and others in radix mode.
This means that we don't have to prohibit independent-threads mode
when running a HPT guest on a radix host, and we don't have to do any
of the synchronization between threads that was introduced in commit
c01015091a ("KVM: PPC: Book3S HV: Run HPT guests on POWER9 radix
hosts", 2017-10-19).

Rather than using up another CPU feature bit, we just do an
explicit test on the PVR (processor version register) at module
startup time to determine whether we have to take steps to avoid
having some threads in HPT mode and some in radix mode (so-called
"mixed mode").  We test for "Nimbus" (indicated by 0 or 1 in the top
nibble of the lower 16 bits) v2.2 or later, or "Cumulus" (indicated by
2 or 3 in that nibble) v1.1 or later.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-18 12:05:19 +11:00
Paul Mackerras
6964e6a4e4 KVM: PPC: Book3S HV: Do SLB load/unload with guest LPCR value loaded
This moves the code that loads and unloads the guest SLB values so that
it is done while the guest LPCR value is loaded in the LPCR register.
The reason for doing this is that on POWER9, the behaviour of the
slbmte instruction depends on the LPCR[UPRT] bit.  If UPRT is 1, as
it is for a radix host (or guest), the SLB index is truncated to
2 bits.  This means that for a HPT guest on a radix host, the SLB
was not being loaded correctly, causing the guest to crash.

The SLB is now loaded much later in the guest entry path, after the
LPCR is loaded, which for a secondary thread is after it sees that
the primary thread has switched the MMU to the guest.  The loop that
waits for the primary thread has a branch out to the exit code that
is taken if it sees that other threads have commenced exiting the
guest.  Since we have now not loaded the SLB at this point, we make
this path branch to a new label 'guest_bypass' and we move the SLB
unload code to before this label.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-17 11:19:02 +11:00
Paul Mackerras
43ff3f6523 KVM: PPC: Book3S HV: Make sure we don't re-enter guest without XIVE loaded
This fixes a bug where it is possible to enter a guest on a POWER9
system without having the XIVE (interrupt controller) context loaded.
This can happen because we unload the XIVE context from the CPU
before doing the real-mode handling for machine checks.  After the
real-mode handler runs, it is possible that we re-enter the guest
via a fast path which does not load the XIVE context.

To fix this, we move the unloading of the XIVE context to come after
the real-mode machine check handler is called.

Fixes: 5af5099385 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-17 11:18:48 +11:00
Paul Mackerras
5855564c8a KVM: PPC: Book3S HV: Enable migration of decrementer register
This adds a register identifier for use with the one_reg interface
to allow the decrementer expiry time to be read and written by
userspace.  The decrementer expiry time is in guest timebase units
and is equal to the sum of the decrementer and the guest timebase.
(The expiry time is used rather than the decrementer value itself
because the expiry time is not constantly changing, though the
decrementer value is, while the guest vcpu is not running.)

Without this, a guest vcpu migrated to a new host will see its
decrementer set to some random value.  On POWER8 and earlier, the
decrementer is 32 bits wide and counts down at 512MHz, so the
guest vcpu will potentially see no decrementer interrupts for up
to about 4 seconds, which will lead to a stall.  With POWER9, the
decrementer is now 56 bits side, so the stall can be much longer
(up to 2.23 years) and more noticeable.

To help work around the problem in cases where userspace has not been
updated to migrate the decrementer expiry time, we now set the
default decrementer expiry at vcpu creation time to the current time
rather than the maximum possible value.  This should mean an
immediate decrementer interrupt when a migrated vcpu starts
running.  In cases where the decrementer is 32 bits wide and more
than 4 seconds elapse between the creation of the vcpu and when it
first runs, the decrementer would have wrapped around to positive
values and there may still be a stall - but this is no worse than
the current situation.  In the large-decrementer case, we are sure
to get an immediate decrementer interrupt (assuming the time from
vcpu creation to first run is less than 2.23 years) and we thus
avoid a very long stall.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-16 11:54:45 +11:00
Benjamin Herrenschmidt
7f1c410da5 powerpc/xive: Add interrupt flag to disable automatic EOI
This will be used by KVM in order to keep escalation interrupts
in the non-EOI (masked) state after they fire. They will be
re-enabled directly in HW by KVM when needed.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-12 15:24:44 +11:00
Benjamin Herrenschmidt
12c1f339cd powerpc/xive: Move definition of ESB bits
From xive.h to xive-regs.h since it's a HW register definition
and it can be used from assembly

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-12 15:24:41 +11:00
Markus Elfring
1627301020 KVM: PPC: Use seq_puts() in kvmppc_exit_timing_show()
A headline should be quickly put into a sequence. Thus use the
function "seq_puts" instead of "seq_printf" for this purpose.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-11 20:36:06 +11:00
Alexander Graf
81ceca05a4 KVM: PPC: Book3S HV: Remove vcpu->arch.dec usage
On Book3S in HV mode, we don't use the vcpu->arch.dec field at all.
Instead, all logic is built around vcpu->arch.dec_expires.

So let's remove the one remaining piece of code that was setting it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-01-11 20:36:06 +11:00
Michael Neuling
191eccb158 powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
A new hypervisor call has been defined to communicate various
characteristics of the CPU to guests. Add definitions for the hcall
number, flags and a wrapper function.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-10 01:46:34 +11:00
Linus Torvalds
a0651c7fa2 powerpc fixes for 4.15 #3
Two fixes for nasty kexec/kdump crashes in certain configurations.
 
 A couple of minor fixes for the new TIDR code.
 
 A fix for an oops in a CXL error handling path.
 
 Thanks to:
   Andrew Donnellan, Christophe Lombard, David Gibson, Mahesh Salgaonkar, Vaibhav Jain.
 -----BEGIN PGP SIGNATURE-----
 
 iQIwBAABCAAaBQJaITuLExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYBS
 9A/+IYjSoDdTcJXqfR46xn2147RGBQIym2rqaBJd+WwZj8Br0Yap5hrPtr1zAilD
 75aj0CRR0Y91nodnfishjujsJZckyQYOHi0/WQluLbpRlWEmeQt47gDjz70Wt1T8
 BZUEVqPF2k6Mk5WJV6sSIHBtw2uKrl/lJZAUJbTobOWgsMdopO504MkFxvySWKMV
 AX7UEXrcxPLb/yVGk9Ih9iwXxm/ymvQrkljp4s3jWqkc7bWwN93CmimIQ+X6bop0
 yqmAzCiUJsPsulmkBkmsY78llPg0roUrh98R4JIe0+cUiQROa5Kvt/u0zohN/rqS
 6SkPT0ds2Fs1z5cHayyQWMN0j0A5sfwW2KRMLHCJjAwAxzoT2CdMZDv0+QLi0ETy
 RGtYvnew8eCqrfBpyBneEP1JySARJ85ML4rZvudewSHJoMzTkYDnSEKU8+wlqRIf
 KHdvHmErRMlF7OB6Om3Uxz6oIXan/Puj7HsdL8f7MazjFPqb/r+/AuTDzUov17Fs
 7Y0qVawFyJyAJ8zkUAGB1kN2FN+eYnsFxUa7ubpeJY7VX+8pUOwT24rFc803eAu4
 p/ad1CpBy+8xaq83WeaM6BpMqW80ao2BzzbQUhDcEQl4qovO/ZEZxQt0ySoQp+SY
 MqE8SnZMkL/30CasbKTAqmt+P44GCSYZVtOUwTmvLNMTjSg=
 =/LD/
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Two fixes for nasty kexec/kdump crashes in certain configurations.

  A couple of minor fixes for the new TIDR code.

  A fix for an oops in a CXL error handling path.

  Thanks to: Andrew Donnellan, Christophe Lombard, David Gibson, Mahesh
  Salgaonkar, Vaibhav Jain"

* tag 'powerpc-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Do not assign thread.tidr if already assigned
  powerpc: Avoid signed to unsigned conversion in set_thread_tidr()
  powerpc/kexec: Fix kexec/kdump in P9 guest kernels
  powerpc/powernv: Fix kexec crashes caused by tlbie tracing
  cxl: Check if vphb exists before iterating over AFU devices
2017-12-01 08:40:17 -05:00
Linus Torvalds
9e0600f5cf * x86 bugfixes: APIC, nested virtualization, IOAPIC
* PPC bugfix: HPT guests on a POWER9 radix host
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJaICi1AAoJEL/70l94x66DjvEIAIML/e9YX1YrJZi0rsB9cbm0
 Le3o5b3wKxPrlZdnpOZQ2mVWubUQdiHMPGX6BkpgyiJWUchnbj5ql1gUf5S0i3jk
 TOk6nae6DU94xBuboeqZJlmx2VfPY/fqzLWsX3HFHpnzRl4XvXL5o7cWguIxVcVO
 yU6bPgbAXyXSBennLWZxC3aQ2Ojikr3uxZQpUZTAPOW5hFINpCKCpqJBMxsb67wq
 rwI0cJhRl92mHpbe8qeNJhavqY5eviy9iPUaZrOW9P4yw1uqjTAjgsUc1ydiaZSV
 rOHeKBOgVfY/KBaNJKyKySfuL1MJ+DLcQqm9RlGpKNpFIeB0vvSf0gtmmqIAXIk=
 =kh2y
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:

 - x86 bugfixes: APIC, nested virtualization, IOAPIC

 - PPC bugfix: HPT guests on a POWER9 radix host

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
  KVM: Let KVM_SET_SIGNAL_MASK work as advertised
  KVM: VMX: Fix vmx->nested freeing when no SMI handler
  KVM: VMX: Fix rflags cache during vCPU reset
  KVM: X86: Fix softlockup when get the current kvmclock
  KVM: lapic: Fixup LDR on load in x2apic
  KVM: lapic: Split out x2apic ldr calculation
  KVM: PPC: Book3S HV: Fix migration and HPT resizing of HPT guests on radix hosts
  KVM: vmx: use X86_CR4_UMIP and X86_FEATURE_UMIP
  KVM: x86: Fix CPUID function for word 6 (80000001_ECX)
  KVM: nVMX: Fix vmx_check_nested_events() return value in case an event was reinjected to L2
  KVM: x86: ioapic: Preserve read-only values in the redirection table
  KVM: x86: ioapic: Clear Remote IRR when entry is switched to edge-triggered
  KVM: x86: ioapic: Remove redundant check for Remote IRR in ioapic_set_irq
  KVM: x86: ioapic: Don't fire level irq when Remote IRR set
  KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race
  KVM: x86: inject exceptions produced by x86_decode_insn
  KVM: x86: Allow suppressing prints on RDMSR/WRMSR of unhandled MSRs
  KVM: x86: fix em_fxstor() sleeping while in atomic
  KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure
  KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry
  ...
2017-11-30 08:15:19 -08:00
Dan Williams
e4e40e0263 mm: switch to 'define pmd_write' instead of __HAVE_ARCH_PMD_WRITE
In response to compile breakage introduced by a series that added the
pud_write helper to x86, Stephen notes:

    did you consider using the other paradigm:

    In arch include files:
    #define pud_write       pud_write
    static inline int pud_write(pud_t pud)
     .....

    Then in include/asm-generic/pgtable.h:

    #ifndef pud_write
    tatic inline int pud_write(pud_t pud)
    {
            ....
    }
    #endif

    If you had, then the powerpc code would have worked ... ;-) and many
    of the other interfaces in include/asm-generic/pgtable.h are
    protected that way ...

Given that some architecture already define pmd_write() as a macro, it's
a net reduction to drop the definition of __HAVE_ARCH_PMD_WRITE.

Link: http://lkml.kernel.org/r/151129126721.37405.13339850900081557813.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-29 18:40:42 -08:00
Vaibhav Jain
7e4d423326 powerpc: Do not assign thread.tidr if already assigned
If set_thread_tidr() is called twice for same task_struct then it will
allocate a new tidr value to it leaving the previous value still
dangling in the vas_thread_ida table.

To fix this the patch changes set_thread_tidr() to check if a tidr
value is already assigned to the task_struct and if yes then returns
zero.

Fixes: ec233ede4c86("powerpc: Add support for setting SPRN_TIDR")
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
[mpe: Modify to return 0 in the success case, not the TID value]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-29 19:56:18 +11:00
Vaibhav Jain
aca7573fde powerpc: Avoid signed to unsigned conversion in set_thread_tidr()
There is an unsafe signed to unsigned conversion in set_thread_tidr()
that may cause an error value to be assigned to SPRN_TIDR register and
used as thread-id.

The issue happens as assign_thread_tidr() returns an int and
thread.tidr is an unsigned-long. So a negative error code returned
from assign_thread_tidr() will fail the error check and gets assigned
as tidr as a large positive value.

To fix this the patch assigns the return value of assign_thread_tidr()
to a temporary int and assigns it to thread.tidr iff its '> 0'.

The patch shouldn't impact the calling convention of set_thread_tidr()
i.e all -ve return-values are error codes and a return value of '0'
indicates success.

Fixes: ec233ede4c86("powerpc: Add support for setting SPRN_TIDR")
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Christophe Lombard clombard@linux.vnet.ibm.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-29 19:36:13 +11:00
Paolo Bonzini
a63dd7480d PPC KVM fixes for 4.15
One commit here, that fixes a couple of bugs relating to the patch
 series that enables HPT guests to run on a radix host on POWER9
 systems.  This patch series went upstream in the 4.15 merge window,
 so no stable backport is required.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJaGKSnAAoJEJ2a6ncsY3GfF7MIANLLhznEMrWq8jw4g95WsJU1
 MkDGwp8kIdhIOM9HD6JRskoJZB5Mws2BWlQ5PSaVFxO6v6eUgNLaRb/UBxC1r7gU
 1f9/8corY4BNkezSdJqTL7Xgp13KjTU726OwYAqCPEyCSPEc9ciMyeIgyZuv2dPa
 Pju+u4tnA+9JJyskgNL+/ybOOZwVat91VmNUVRq29zP6+zo1tmIDxrQchy6Bqui/
 7Wg298G+yjAkJ8ktQu69ACk+0oEBGUOcLUlraqGSr9auR+b0nJ1PAGCDRaONdwgE
 +X+OE+t+UC6rU+coUXMwO+Id0X7HMdsLQd3066ODEtD55g8MIVZ126Wt8xDmj5o=
 =GSTh
 -----END PGP SIGNATURE-----

Merge tag 'kvm-ppc-fixes-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master

PPC KVM fixes for 4.15

One commit here, that fixes a couple of bugs relating to the patch
series that enables HPT guests to run on a radix host on POWER9
systems.  This patch series went upstream in the 4.15 merge window,
so no stable backport is required.
2017-11-27 17:54:13 +01:00
Jan H. Schönherr
20b7035c66 KVM: Let KVM_SET_SIGNAL_MASK work as advertised
KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
"any unblocked signal received [...] will cause KVM_RUN to return with
-EINTR" and that "the signal will only be delivered if not blocked by
the original signal mask".

This, however, is only true, when the calling task has a signal handler
registered for a signal. If not, signal evaluation is short-circuited for
SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
returning or the whole process is terminated.

Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
to that in do_sigtimedwait() to avoid short-circuiting of signals.

Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-11-27 17:53:47 +01:00
Linus Torvalds
844056fd74 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:

 - The final conversion of timer wheel timers to timer_setup().

   A few manual conversions and a large coccinelle assisted sweep and
   the removal of the old initialization mechanisms and the related
   code.

 - Remove the now unused VSYSCALL update code

 - Fix permissions of /proc/timer_list. I still need to get rid of that
   file completely

 - Rename a misnomed clocksource function and remove a stale declaration

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  m68k/macboing: Fix missed timer callback assignment
  treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
  timer: Remove redundant __setup_timer*() macros
  timer: Pass function down to initialization routines
  timer: Remove unused data arguments from macros
  timer: Switch callback prototype to take struct timer_list * argument
  timer: Pass timer_list pointer to callbacks unconditionally
  Coccinelle: Remove setup_timer.cocci
  timer: Remove setup_*timer() interface
  timer: Remove init_timer() interface
  treewide: setup_timer() -> timer_setup() (2 field)
  treewide: setup_timer() -> timer_setup()
  treewide: init_timer() -> setup_timer()
  treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
  s390: cmm: Convert timers to use timer_setup()
  lightnvm: Convert timers to use timer_setup()
  drivers/net: cris: Convert timers to use timer_setup()
  drm/vc4: Convert timers to use timer_setup()
  block/laptop_mode: Convert timers to use timer_setup()
  net/atm/mpc: Avoid open-coded assignment of timer callback function
  ...
2017-11-25 08:37:16 -10:00
Linus Torvalds
83ada03196 powerpc fixes for 4.15 #2
A small batch of fixes, about 50% tagged for stable and the rest for recently
 merged code.
 
 There's one more fix for the >128T handling on hash. Once a process had
 requested a single mmap above 128T we would then always search above 128T. The
 correct behaviour is to consider the hint address in isolation for each mmap
 request.
 
 Then a couple of fixes for the IMC PMU, a missing EXPORT_SYMBOL in VAS, a fix
 for STRICT_KERNEL_RWX on 32-bit, and a fix to correctly identify P9 DD2.1 but in
 code that is currently not used by default.
 
 Thanks to:
   Aneesh Kumar K.V, Christophe Leroy, Madhavan Srinivasan, Sukadev Bhattiprolu.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaF/VqAAoJEFHr6jzI4aWA994P/3NNXkSASJHjLrIlQAKXtmx9
 lrv1v+6MbPWhyB8Q8LVnnC3Ab2LTHnkccjq2Jw0bP0RQ86HF4mH7Sb7N5Wj0cG+M
 5NioikvGE057ncLfxVhesOK0C9Lhc7Zb+zphXZliYP76IGxwbxorJRepeZctVkyO
 KPMv4eaImdblVn71aoQQSlepON4+/rtiW2yo5u98uCqR+Ttds4J1fiDZ4TNrBYRP
 Ilh6DmA//CWvN+KsGT+brRd/PjEkxQKHyS8px3lxRl4cwCJucXPCik/Gn9t6OiMw
 3S6y1Mu8nrh4z+YepKv6APvl2DEwwXn8w9f85kn+QiE9Qp3Z/wckW9/4LT5FeuKE
 L8E3dKq2NzJ9oDs/20sVbBvVR7CUvBoyWytsXVkmmlC6sVReTrYAJ1UP9HnNvcF6
 be4zYUKusU83uG6saGgchRrPUrD31XKXw8Piv9EoWo1Uz7VgWCkxidclRNocgeDO
 k5VxYnRd9jPsv2pCzXH2YmuQAypGUh12IPTxEOnSt5uzXSXcamZJBLKp5fAJ/9dl
 jD6GlRQMX8JpNRJzxOBLly3CmwQBw2ekOuPLXI+M/ilks66AGK8lp4bg5cWwDGNe
 puzmRJ2mO3dnFlVUHBQ5LyX8ne4yunin1JZB1YQ4xm8yxZbGO2AdypEWMSkPKNPN
 fkrGPlwQ1JwFheMbHHLj
 =gv70
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "A small batch of fixes, about 50% tagged for stable and the rest for
  recently merged code.

  There's one more fix for the >128T handling on hash. Once a process
  had requested a single mmap above 128T we would then always search
  above 128T. The correct behaviour is to consider the hint address in
  isolation for each mmap request.

  Then a couple of fixes for the IMC PMU, a missing EXPORT_SYMBOL in
  VAS, a fix for STRICT_KERNEL_RWX on 32-bit, and a fix to correctly
  identify P9 DD2.1 but in code that is currently not used by default.

  Thanks to: Aneesh Kumar K.V, Christophe Leroy, Madhavan Srinivasan,
  Sukadev Bhattiprolu"

* tag 'powerpc-4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Fix Power9 DD2.1 logic in DT CPU features
  powerpc/perf: Fix IMC_MAX_PMU macro
  powerpc/perf: Fix pmu_count to count only nest imc pmus
  powerpc: Fix boot on BOOK3S_32 with CONFIG_STRICT_KERNEL_RWX
  powerpc/perf/imc: Use cpu_to_node() not topology_physical_package_id()
  powerpc/vas: Export chip_to_vas_id()
  powerpc/64s/slice: Use addr limit when computing slice mask
2017-11-24 19:40:12 -10:00
Michael Ellerman
2621e945fb powerpc/kexec: Fix kexec/kdump in P9 guest kernels
The code that cleans up the IAMR/AMOR before kexec'ing failed to
remember that when we're running as a guest AMOR is not writable, it's
hypervisor privileged.

They symptom is that the kexec stops before entering purgatory and
nothing else is seen on the console. If you examine the state of the
system all threads will be in the 0x700 program check handler.

Fix it by making the write to AMOR dependent on HV mode.

Fixes: 1e2a516e89 ("powerpc/kexec: Fix radix to hash kexec due to IAMR/AMOR")
Cc: stable@vger.kernel.org # v4.10+
Reported-by: Yilin Zhang <yilzhang@redhat.com>
Debugged-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-24 16:49:37 +11:00
Mahesh Salgaonkar
a3961f824c powerpc/powernv: Fix kexec crashes caused by tlbie tracing
Rebooting into a new kernel with kexec fails in trace_tlbie() which is
called from native_hpte_clear(). This happens if the running kernel
has CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints
always execute few RCU checks regardless of whether tracing is on or
off. We are already in the last phase of kexec sequence in real mode
with HILE_BE set. At this point the RCU check ends up in
RCU_LOCKDEP_WARN and causes kexec to fail.

Fix this by not calling trace_tlbie() from native_hpte_clear().

mpe: It's not safe to call trace points at this point in the kexec
path, even if we could avoid the RCU checks/warnings. The only
solution is to not call them.

Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-23 23:10:14 +11:00
Paul Mackerras
9aa6825bbb KVM: PPC: Book3S: Eliminate some unnecessary checks
In an excess of caution, commit 6f63e81bda ("KVM: PPC: Book3S: Add
MMIO emulation for FP and VSX instructions", 2017-02-21) included
checks for the case that vcpu->arch.mmio_vsx_copy_nums is less than
zero, even though its type is u8.  This causes a Coverity warning,
so we remove the check for < 0.  We also adjust the associated
comment to be more accurate ("4 or less" rather than "less than 4").

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-23 14:23:34 +11:00
Paul Mackerras
c0093f1a38 KVM: PPC: Book3S HV: Fix conditions for starting vcpu
This corrects the test that determines whether a vcpu that has just
become able to run in the guest (e.g. it has just finished handling
a hypercall or hypervisor page fault) and whose virtual core is
already running somewhere as a "piggybacked" vcore can start
immediately or not.  (A piggybacked vcore is one which is executing
along with another vcore as a result of dynamic micro-threading.)

Previously the test tried to lock the piggybacked vcore using
spin_trylock, which would always fail because the vcore was already
locked, and so the vcpu would have to wait until its vcore exited
the guest before it could enter.

In fact the vcpu can enter if its vcore is in VCORE_PIGGYBACK state
and not already exiting (or exited) the guest, so the test in
VCORE_PIGGYBACK state is basically the same as for VCORE_RUNNING
state.

Coverity detected this as a double unlock issue, which it isn't
because the spin_trylock would always fail.  This will fix the
apparent double unlock as well.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-23 14:23:22 +11:00
Paul Mackerras
4fcf361dbd KVM: PPC: Book3S HV: Remove useless statement
This removes a statement that has no effect.  It should have been
removed in commit 898b25b202 ("KVM: PPC: Book3S HV: Simplify dynamic
micro-threading code", 2017-06-22) along with the loop over the
piggy-backed virtual cores.

This issue was reported by Coverity.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-23 14:23:14 +11:00
Paul Mackerras
117647ff93 KVM: PPC: Book3S HV: Fix typo in kvmppc_hv_get_dirty_log_radix()
This fixes a typo where the intent was to assign to 'j' in order to
skip some number of bits in the dirty bitmap for a guest.  The effect
of the typo is benign since it means we just iterate through all the
bits rather than skipping bits which we know will be zero.  This issue
was found by Coverity.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-23 14:23:08 +11:00
Paul Mackerras
cda2eaa359 KVM: PPC: Book3S HV: Avoid shifts by negative amounts
The kvmppc_hpte_page_shifts function decodes the actual and base page
sizes for a HPTE, returning -1 if it doesn't recognize the page size
encoding.  This then gets used as a shift amount in various places,
which is undefined behaviour.  This was reported by Coverity.

In fact this should never occur, since we should only get HPTEs in the
HPT which have a recognized page size encoding.  The only place where
this might not be true is in the call to kvmppc_actual_pgsz() near the
beginning of kvmppc_do_h_enter(), where we are validating the HPTE
value passed in from the guest.

So to fix this and eliminate the undefined behaviour, we make
kvmppc_hpte_page_shifts return 0 for unrecognized page size encodings,
and make kvmppc_actual_pgsz() detect that case and return 0 for the
page size, which will then cause kvmppc_do_h_enter() to return an
error and refuse to insert any HPTE with an unrecognized page size
encoding.

To ensure that we don't get undefined behaviour in compute_tlbie_rb(),
we take the 4k page size path for any unrecognized page size encoding.
This should never be hit in practice because it is only used on HPTE
values which have previously been checked for having a recognized
page size encoding.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-23 14:23:00 +11:00
Paul Mackerras
ded13fc11b KVM: PPC: Book3S HV: Fix migration and HPT resizing of HPT guests on radix hosts
This fixes two errors that prevent a guest using the HPT MMU from
successfully migrating to a POWER9 host in radix MMU mode, or resizing
its HPT when running on a radix host.

The first bug was that commit 8dc6cca556 ("KVM: PPC: Book3S HV:
Don't rely on host's page size information", 2017-09-11) missed two
uses of hpte_base_page_size(), one in the HPT rehashing code and
one in kvm_htab_write() (which is used on the destination side in
migrating a HPT guest).  Instead we use kvmppc_hpte_base_page_shift().
Having the shift count means that we can use left and right shifts
instead of multiplication and division in a few places.

Along the way, this adds a check in kvm_htab_write() to ensure that the
page size encoding in the incoming HPTEs is recognized, and if not
return an EINVAL error to userspace.

The second bug was that kvm_htab_write was performing some but not all
of the functions of kvmhv_setup_mmu(), resulting in the destination VM
being left in radix mode as far as the hardware is concerned.  The
simplest fix for now is make kvm_htab_write() call
kvmppc_setup_partition_table() like kvmppc_hv_setup_htab_rma() does.
In future it would be better to refactor the code more extensively
to remove the duplication.

Fixes: 8dc6cca556 ("KVM: PPC: Book3S HV: Don't rely on host's page size information")
Fixes: 7a84084c60 ("KVM: PPC: Book3S HV: Set partition table rather than SDR1 on POWER9")
Reported-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-23 14:17:54 +11:00
Michael Ellerman
4d6c51b107 powerpc/64s: Fix Power9 DD2.1 logic in DT CPU features
I got the logic wrong in the DT CPU features code when I added the
Power9 DD2.1 feature. We should be setting the bit if we detect a
DD2.1, not clearing it if we detect a DD2.0.

This code isn't actually exercised at the moment so nothing is
actually broken.

Fixes: 3ffa9d9e2a ("powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-22 23:17:01 +11:00
Madhavan Srinivasan
73ce9aec65 powerpc/perf: Fix IMC_MAX_PMU macro
IMC_MAX_PMU is used for static storage (per_nest_pmu_arr) which holds
nest pmu information. Current value for the macro is 32 based on
the initial number of nest pmu units supported by the nest microcode.
But going forward, microcode could support more nest units. Instead
of static storage, patch to fix the code to dynamically allocate an
array based on the number of nest imc units found in the device tree.

Fixes:8f95faaac56c1 ('powerpc/powernv: Detect and create IMC device')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-22 23:04:23 +11:00
Madhavan Srinivasan
de34787f10 powerpc/perf: Fix pmu_count to count only nest imc pmus
"pmu_count" in opal_imc_counters_probe() is intended to hold
the number of successful nest imc pmu registerations. But
current code also counts other imc units like core_imc and
thread_imc. Patch add a check to count only nest imc pmus.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-22 23:04:23 +11:00
Christophe Leroy
252eb55816 powerpc: Fix boot on BOOK3S_32 with CONFIG_STRICT_KERNEL_RWX
On powerpc32, patch_instruction() is called by apply_feature_fixups()
which is called from early_init()

There is the following note in front of early_init():
 * Note that the kernel may be running at an address which is different
 * from the address that it was linked at, so we must use RELOC/PTRRELOC
 * to access static data (including strings).  -- paulus

Therefore, slab_is_available() cannot be called yet, and
text_poke_area must be addressed with PTRRELOC()

Fixes: 95902e6c88 ("powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-22 23:04:20 +11:00
Michael Ellerman
f3f1dfd600 powerpc/perf/imc: Use cpu_to_node() not topology_physical_package_id()
init_imc_pmu() uses topology_physical_package_id() to detect the
node id of the processor it is on to get local memory, but that's
wrong, and can lead to crashes. Fix it to use cpu_to_node().

Fixes: 885dcd709b ("powerpc/perf: Add nest IMC PMU support")
Cc: stable@vger.kernel.org # v4.14+
Reported-By: Rob Lippert <rlippert@google.com>
Tested-By: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-22 12:24:46 +11:00
Kees Cook
86cb30ec07 treewide: setup_timer() -> timer_setup() (2 field)
This converts all remaining setup_timer() calls that use a nested field
to reach a struct timer_list. Coccinelle does not have an easy way to
match multiple fields, so a new script is needed to change the matches of
"&_E->_timer" into "&_E->_field1._timer" in all the rules.

spatch --very-quiet --all-includes --include-headers \
	-I ./arch/x86/include -I ./arch/x86/include/generated \
	-I ./include -I ./arch/x86/include/uapi \
	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
	--dir . \
	--cocci-file ~/src/data/timer_setup-2fields.cocci

@fix_address_of depends@
expression e;
@@

 setup_timer(
-&(e)
+&e
 , ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _field1;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_field1._timer, NULL, _E);
+timer_setup(&_E->_field1._timer, NULL, 0);
|
-setup_timer(&_E->_field1._timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, NULL, 0);
|
-setup_timer(&_E._field1._timer, NULL, &_E);
+timer_setup(&_E._field1._timer, NULL, 0);
|
-setup_timer(&_E._field1._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _field1;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_field1._timer, _callback, _E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, &_callback, _E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
 _E->_field1._timer@_stl.function = _callback;
|
 _E->_field1._timer@_stl.function = &_callback;
|
 _E->_field1._timer@_stl.function = (_cast_func)_callback;
|
 _E->_field1._timer@_stl.function = (_cast_func)&_callback;
|
 _E._field1._timer@_stl.function = _callback;
|
 _E._field1._timer@_stl.function = &_callback;
|
 _E._field1._timer@_stl.function = (_cast_func)_callback;
|
 _E._field1._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
 depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
(
	... when != _origarg
	_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _field1._timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _field1._timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _field1._timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(void *)_origarg;
+from_timer(_handle, t, _field1._timer);
	... when != _origarg
)
 }

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
 depends on change_timer_function_usage &&
                     !change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
+	_handletype *_origarg = from_timer(_origarg, t, _field1._timer);
+
	... when != _origarg
-	(_handletype *)_origarg
+	_origarg
	... when != _origarg
 }

// Avoid already converted callbacks.
@match_callback_converted
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
	    !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

 void _callback(struct timer_list *t)
 { ... }

// callback(struct something *handle)
@change_callback_handle_arg
 depends on change_timer_function_usage &&
	    !match_callback_converted &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

 void _callback(
-_handletype *_handle
+struct timer_list *t
 )
 {
+	_handletype *_handle = from_timer(_handle, t, _field1._timer);
	...
 }

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
 depends on change_timer_function_usage &&
	    change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

 void _callback(struct timer_list *t)
 {
-	_handletype *_handle = from_timer(_handle, t, _field1._timer);
 }

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg &&
	    !change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_field1._timer, _callback, 0);
+setup_timer(&_E->_field1._timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._field1._timer, _callback, 0);
+setup_timer(&_E._field1._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
 _E->_field1._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_field1._timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_field1._timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_field1._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._field1._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._field1._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._field1._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._field1._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

 _callback(
(
-(_cast_data)_E
+&_E->_field1._timer
|
-(_cast_data)&_E
+&_E._field1._timer
|
-_E
+&_E->_field1._timer
)
 )

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _field1;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_field1._timer, _callback, 0);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, _callback, 0L);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, _callback, 0UL);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, 0);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, 0L);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, 0UL);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_field1._timer, _callback, 0);
+timer_setup(&_field1._timer, _callback, 0);
|
-setup_timer(&_field1._timer, _callback, 0L);
+timer_setup(&_field1._timer, _callback, 0);
|
-setup_timer(&_field1._timer, _callback, 0UL);
+timer_setup(&_field1._timer, _callback, 0);
|
-setup_timer(_field1._timer, _callback, 0);
+timer_setup(_field1._timer, _callback, 0);
|
-setup_timer(_field1._timer, _callback, 0L);
+timer_setup(_field1._timer, _callback, 0);
|
-setup_timer(_field1._timer, _callback, 0UL);
+timer_setup(_field1._timer, _callback, 0);
)

@change_callback_unused_data
 depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *unused
 )
 {
	... when != _origarg
 }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:09 -08:00
Kees Cook
e99e88a9d2 treewide: setup_timer() -> timer_setup()
This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.

Casting from unsigned long:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, ptr);

and forced object casts:

    void my_callback(struct something *ptr)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);

become:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

Direct function assignments:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    ptr->my_timer.function = my_callback;

have a temporary cast added, along with converting the args:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;

And finally, callbacks without a data assignment:

    void my_callback(unsigned long data)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, 0);

have their argument renamed to verify they're unused during conversion:

    void my_callback(struct timer_list *unused)
    {
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

The conversion is done with the following Coccinelle script:

spatch --very-quiet --all-includes --include-headers \
	-I ./arch/x86/include -I ./arch/x86/include/generated \
	-I ./include -I ./arch/x86/include/uapi \
	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
	--dir . \
	--cocci-file ~/src/data/timer_setup.cocci

@fix_address_of@
expression e;
@@

 setup_timer(
-&(e)
+&e
 , ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
 _E->_timer@_stl.function = _callback;
|
 _E->_timer@_stl.function = &_callback;
|
 _E->_timer@_stl.function = (_cast_func)_callback;
|
 _E->_timer@_stl.function = (_cast_func)&_callback;
|
 _E._timer@_stl.function = _callback;
|
 _E._timer@_stl.function = &_callback;
|
 _E._timer@_stl.function = (_cast_func)_callback;
|
 _E._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
 depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
(
	... when != _origarg
	_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
)
 }

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
 depends on change_timer_function_usage &&
                     !change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
+	_handletype *_origarg = from_timer(_origarg, t, _timer);
+
	... when != _origarg
-	(_handletype *)_origarg
+	_origarg
	... when != _origarg
 }

// Avoid already converted callbacks.
@match_callback_converted
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
	    !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

 void _callback(struct timer_list *t)
 { ... }

// callback(struct something *handle)
@change_callback_handle_arg
 depends on change_timer_function_usage &&
	    !match_callback_converted &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

 void _callback(
-_handletype *_handle
+struct timer_list *t
 )
 {
+	_handletype *_handle = from_timer(_handle, t, _timer);
	...
 }

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
 depends on change_timer_function_usage &&
	    change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

 void _callback(struct timer_list *t)
 {
-	_handletype *_handle = from_timer(_handle, t, _timer);
 }

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg &&
	    !change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
 _E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

 _callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
 )

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)

@change_callback_unused_data
 depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *unused
 )
 {
	... when != _origarg
 }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:07 -08:00
Sukadev Bhattiprolu
62b49c4210 powerpc/vas: Export chip_to_vas_id()
Export the symbol chip_to_vas_id() to fix a build failure when
CONFIG_CRYPTO_DEV_NX_COMPRESS_POWERNV=m.

Fixes: d4ef61b5e8 ("powerpc/vas, nx-842: Define and use chip_to_vas_id()")
Reported-by: Haren Myneni <hbabu@us.ibm.com>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-21 21:02:26 +11:00
Aneesh Kumar K.V
7a06c66835 powerpc/64s/slice: Use addr limit when computing slice mask
While computing slice mask for the free area we need make sure we only
search in the addr limit applicable for this mmap. We update the
slb_addr_limit after we request for a mmap above 128TB. But the
following mmap request with hint addr below 128TB should still limit
its search to below 128TB. ie. we should not use slb_addr_limit to
compute slice mask in this case. Instead, we should derive high addr
limit based on the mmap hint addr value.

Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-20 19:28:25 +11:00
Linus Torvalds
fa7f578076 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - a bit more MM

 - procfs updates

 - dynamic-debug fixes

 - lib/ updates

 - checkpatch

 - epoll

 - nilfs2

 - signals

 - rapidio

 - PID management cleanup and optimization

 - kcov updates

 - sysvipc updates

 - quite a few misc things all over the place

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
  EXPERT Kconfig menu: fix broken EXPERT menu
  include/asm-generic/topology.h: remove unused parent_node() macro
  arch/tile/include/asm/topology.h: remove unused parent_node() macro
  arch/sparc/include/asm/topology_64.h: remove unused parent_node() macro
  arch/sh/include/asm/topology.h: remove unused parent_node() macro
  arch/ia64/include/asm/topology.h: remove unused parent_node() macro
  drivers/pcmcia/sa1111_badge4.c: avoid unused function warning
  mm: add infrastructure for get_user_pages_fast() benchmarking
  sysvipc: make get_maxid O(1) again
  sysvipc: properly name ipc_addid() limit parameter
  sysvipc: duplicate lock comments wrt ipc_addid()
  sysvipc: unteach ids->next_id for !CHECKPOINT_RESTORE
  initramfs: use time64_t timestamps
  drivers/watchdog: make use of devm_register_reboot_notifier()
  kernel/reboot.c: add devm_register_reboot_notifier()
  kcov: update documentation
  Makefile: support flag -fsanitizer-coverage=trace-cmp
  kcov: support comparison operands collection
  kcov: remove pointless current != NULL check
  kernel/panic.c: add TAINT_AUX
  ...
2017-11-17 16:56:17 -08:00
Gargi Sharma
95846ecf9d pid: replace pid bitmap implementation with IDR API
Patch series "Replacing PID bitmap implementation with IDR API", v4.

This series replaces kernel bitmap implementation of PID allocation with
IDR API.  These patches are written to simplify the kernel by replacing
custom code with calls to generic code.

The following are the stats for pid and pid_namespace object files
before and after the replacement.  There is a noteworthy change between
the IDR and bitmap implementation.

Before
   text       data        bss        dec        hex    filename
   8447       3894         64      12405       3075    kernel/pid.o
After
   text       data        bss        dec        hex    filename
   3397        304          0       3701        e75    kernel/pid.o

Before
   text       data        bss        dec        hex    filename
   5692       1842        192       7726       1e2e    kernel/pid_namespace.o
After
   text       data        bss        dec        hex    filename
   2854        216         16       3086        c0e    kernel/pid_namespace.o

The following are the stats for ps, pstree and calling readdir on /proc
for 10,000 processes.

ps:
        With IDR API    With bitmap
real    0m1.479s        0m2.319s
user    0m0.070s        0m0.060s
sys     0m0.289s        0m0.516s

pstree:
        With IDR API    With bitmap
real    0m1.024s        0m1.794s
user    0m0.348s        0m0.612s
sys     0m0.184s        0m0.264s

proc:
        With IDR API    With bitmap
real    0m0.059s        0m0.074s
user    0m0.000s        0m0.004s
sys     0m0.016s        0m0.016s

This patch (of 2):

Replace the current bitmap implementation for Process ID allocation.
Functions that are no longer required, for example, free_pidmap(),
alloc_pidmap(), etc.  are removed.  The rest of the functions are
modified to use the IDR API.  The change was made to make the PID
allocation less complex by replacing custom code with calls to generic
API.

[gs051095@gmail.com: v6]
  Link: http://lkml.kernel.org/r/1507760379-21662-2-git-send-email-gs051095@gmail.com
[avagin@openvz.org: restore the old behaviour of the ns_last_pid sysctl]
  Link: http://lkml.kernel.org/r/20171106183144.16368-1-avagin@openvz.org
Link: http://lkml.kernel.org/r/1507583624-22146-2-git-send-email-gs051095@gmail.com
Signed-off-by: Gargi Sharma <gs051095@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:03 -08:00
Linus Torvalds
5a3e0b196b File locking related changes for v4.15
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaDuoWAAoJEAAOaEEZVoIVXEQP/jQYoU9hgvEj8j3ZIgi56SDJ
 pR45w2zcJz2/uU43DEKyShyLgsuoBbJQ3l/gGBH/tl+xGm9NzB0gatoEu9GmKNYz
 /IN6/vUFnoIAUyD+iMZbpmsYKIkz0z2YJo261IfspAwIft/cvHJnYYGQrP9YXg9F
 c7bdDuANTKocdQigc4BQyOe3OfIBGfTwJhuakO+1yuZmGOVNyxEcdYbMM8FiTfc8
 +62kvQQ3t7WMqSbM8M0QdGcYQjG0EwcVAuV7COurLJIva7hUkVel32MVUjoFcf28
 BnRu2ztFJCubm1HA85twlJDtpeXbcMqrUl/CcwRMpwDaePd5GVB1h5iKqbZ51BZ1
 fWT2STmt+8hY2B5eiXoYEaG3B7ZRr+r0oroxqOxpiZ/m4AVeouF+gPGv+NV5zgvD
 NGWC0MdklIJ4xaC99NEeP6kBhz0M74VKymFCTeHkVg9m4TqDepNvitKed0qagw19
 uw8seei7TOTm4o117+l55NHmyfTHXFO4U0WLTJyeZcoEnUs0rOcHeqyy0RwCBMrK
 W2fJtdBLFr+tBIIrID4TnPhhYtSvIPjz+FpiRDobqhgvMva/PIvLGTWK4unrgIjG
 ZQ7YGnwWda8GjqKhgZacn/BSXyJzOAF9hJp0mz2ORaOxaMarEV55duiZufCvGuZw
 uUQWRCKuQX7Oi05i9jXp
 =fCeF
 -----END PGP SIGNATURE-----

Merge tag 'locks-v4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking update from Jeff Layton:
 "A couple of fixes for a patch that went into v4.14, and the bug report
  just came in a few days ago.. It passes my (minimal) testing, and has
  been in linux-next for a few days now.

  I also would like to get my address changed in MAINTAINERS to clear
  that hurdle"

* tag 'locks-v4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall
  fcntl: don't leak fd reference when fixup_compat_flock fails
  MAINTAINERS: s/jlayton@poochiereds.net/jlayton@kernel.org/
2017-11-17 13:21:58 -08:00
Linus Torvalds
93f30c73ec Merge branch 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat and uaccess updates from Al Viro:

 - {get,put}_compat_sigset() series

 - assorted compat ioctl stuff

 - more set_fs() elimination

 - a few more timespec64 conversions

 - several removals of pointless access_ok() in places where it was
   followed only by non-__ variants of primitives

* 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (24 commits)
  coredump: call do_unlinkat directly instead of sys_unlink
  fs: expose do_unlinkat for built-in callers
  ext4: take handling of EXT4_IOC_GROUP_ADD into a helper, get rid of set_fs()
  ipmi: get rid of pointless access_ok()
  pi433: sanitize ioctl
  cxlflash: get rid of pointless access_ok()
  mtdchar: get rid of pointless access_ok()
  r128: switch compat ioctls to drm_ioctl_kernel()
  selection: get rid of field-by-field copyin
  VT_RESIZEX: get rid of field-by-field copyin
  i2c compat ioctls: move to ->compat_ioctl()
  sched_rr_get_interval(): move compat to native, get rid of set_fs()
  mips: switch to {get,put}_compat_sigset()
  sparc: switch to {get,put}_compat_sigset()
  s390: switch to {get,put}_compat_sigset()
  ppc: switch to {get,put}_compat_sigset()
  parisc: switch to {get,put}_compat_sigset()
  get_compat_sigset()
  get rid of {get,put}_compat_itimerspec()
  io_getevents: Use timespec64 to represent timeouts
  ...
2017-11-17 11:54:55 -08:00
Linus Torvalds
974aa5630b First batch of KVM changes for 4.15
Common:
  - Python 3 support in kvm_stat
 
  - Accounting of slabs to kmemcg
 
 ARM:
  - Optimized arch timer handling for KVM/ARM
 
  - Improvements to the VGIC ITS code and introduction of an ITS reset
    ioctl
 
  - Unification of the 32-bit fault injection logic
 
  - More exact external abort matching logic
 
 PPC:
  - Support for running hashed page table (HPT) MMU mode on a host that
    is using the radix MMU mode;  single threaded mode on POWER 9 is
    added as a pre-requisite
 
  - Resolution of merge conflicts with the last second 4.14 HPT fixes
 
  - Fixes and cleanups
 
 s390:
  - Some initial preparation patches for exitless interrupts and crypto
 
  - New capability for AIS migration
 
  - Fixes
 
 x86:
  - Improved emulation of LAPIC timer mode changes, MCi_STATUS MSRs, and
    after-reset state
 
  - Refined dependencies for VMX features
 
  - Fixes for nested SMI injection
 
  - A lot of cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJaDayXAAoJEED/6hsPKofo/3UH/3HvlcHt+ADTkCU1/iiKAs+i
 0zngIOXIxgHDnV0ww6bV+Znww0BzTYgKCAXX76z603jdpDwG/pzQQcbLDF5ZoJnD
 sQtF10gZinWaRsHlfbLqjrHGL2pGDHO1UKBKLJ0bAIyORPZBxs7i+VmrY/blnr9c
 0wsybJ8RbvwAxjsDL5jeX/z4NehPupmKUc4Lf0eZdSHwVOf9sjn+MP6jJ0r2JcIb
 D+zddPBiLStzN97t4gZpQsrlj3LKrDS+6hY+1TjSvlh+yHKFVFh58VhLm4DuDeb5
 bYOAlWJ/gAWEzfvr5Ld+Nd7SqWWn/14logPkQ4gcU4BI/neAOzk4c6hJfCHl1nk=
 =593n
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "First batch of KVM changes for 4.15

  Common:
   - Python 3 support in kvm_stat
   - Accounting of slabs to kmemcg

  ARM:
   - Optimized arch timer handling for KVM/ARM
   - Improvements to the VGIC ITS code and introduction of an ITS reset
     ioctl
   - Unification of the 32-bit fault injection logic
   - More exact external abort matching logic

  PPC:
   - Support for running hashed page table (HPT) MMU mode on a host that
     is using the radix MMU mode; single threaded mode on POWER 9 is
     added as a pre-requisite
   - Resolution of merge conflicts with the last second 4.14 HPT fixes
   - Fixes and cleanups

  s390:
   - Some initial preparation patches for exitless interrupts and crypto
   - New capability for AIS migration
   - Fixes

  x86:
   - Improved emulation of LAPIC timer mode changes, MCi_STATUS MSRs,
     and after-reset state
   - Refined dependencies for VMX features
   - Fixes for nested SMI injection
   - A lot of cleanups"

* tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (89 commits)
  KVM: s390: provide a capability for AIS state migration
  KVM: s390: clear_io_irq() requests are not expected for adapter interrupts
  KVM: s390: abstract conversion between isc and enum irq_types
  KVM: s390: vsie: use common code functions for pinning
  KVM: s390: SIE considerations for AP Queue virtualization
  KVM: s390: document memory ordering for kvm_s390_vcpu_wakeup
  KVM: PPC: Book3S HV: Cosmetic post-merge cleanups
  KVM: arm/arm64: fix the incompatible matching for external abort
  KVM: arm/arm64: Unify 32bit fault injection
  KVM: arm/arm64: vgic-its: Implement KVM_DEV_ARM_ITS_CTRL_RESET
  KVM: arm/arm64: Document KVM_DEV_ARM_ITS_CTRL_RESET
  KVM: arm/arm64: vgic-its: Free caches when GITS_BASER Valid bit is cleared
  KVM: arm/arm64: vgic-its: New helper functions to free the caches
  KVM: arm/arm64: vgic-its: Remove kvm_its_unmap_device
  arm/arm64: KVM: Load the timer state when enabling the timer
  KVM: arm/arm64: Rework kvm_timer_should_fire
  KVM: arm/arm64: Get rid of kvm_timer_flush_hwstate
  KVM: arm/arm64: Avoid phys timer emulation in vcpu entry/exit
  KVM: arm/arm64: Move phys_timer_emulate function
  KVM: arm/arm64: Use kvm_arm_timer_set/get_reg for guest register traps
  ...
2017-11-16 13:00:24 -08:00
Linus Torvalds
5b0e2cb020 powerpc updates for 4.15
Non-highlights:
 
  - Five fixes for the >128T address space handling, both to fix bugs in our
    implementation and to bring the semantics exactly into line with x86.
 
 Highlights:
 
  - Support for a new OPAL call on bare metal machines which gives us a true NMI
    (ie. is not masked by MSR[EE]=0) for debugging etc.
 
  - Support for Power9 DD2 in the CXL driver.
 
  - Improvements to machine check handling so that uncorrectable errors can be
    reported into the generic memory_failure() machinery.
 
  - Some fixes and improvements for VPHN, which is used under PowerVM to notify
    the Linux partition of topology changes.
 
  - Plumbing to enable TM (transactional memory) without suspend on some Power9
    processors (PPC_FEATURE2_HTM_NO_SUSPEND).
 
  - Support for emulating vector loads form cache-inhibited memory, on some
    Power9 revisions.
 
  - Disable the fast-endian switch "syscall" by default (behind a CONFIG), we
    believe it has never had any users.
 
  - A major rework of the API drivers use when initiating and waiting for long
    running operations performed by OPAL firmware, and changes to the
    powernv_flash driver to use the new API.
 
  - Several fixes for the handling of FP/VMX/VSX while processes are using
    transactional memory.
 
  - Optimisations of TLB range flushes when using the radix MMU on Power9.
 
  - Improvements to the VAS facility used to access coprocessors on Power9, and
    related improvements to the way the NX crypto driver handles requests.
 
  - Implementation of PMEM_API and UACCESS_FLUSHCACHE for 64-bit.
 
 Thanks to:
   Alexey Kardashevskiy, Alistair Popple, Allen Pais, Andrew Donnellan, Aneesh
   Kumar K.V, Arnd Bergmann, Balbir Singh, Benjamin Herrenschmidt, Breno Leitao,
   Christophe Leroy, Christophe Lombard, Cyril Bur, Frederic Barrat, Gautham R.
   Shenoy, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo Romero, Haren
   Myneni, Joel Stanley, Kamalesh Babulal, Kautuk Consul, Markus Elfring, Masami
   Hiramatsu, Michael Bringmann, Michael Neuling, Michal Suchanek, Naveen N. Rao,
   Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pedro Miraglia Franco de
   Carvalho, Philippe Bergheaud, Sandipan Das, Seth Forshee, Shriya, Stephen
   Rothwell, Stewart Smith, Sukadev Bhattiprolu, Tyrel Datwyler, Vaibhav Jain,
   Vaidyanathan Srinivasan, William A. Kennington III.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDXGuAAoJEFHr6jzI4aWAEqwP/0TA35KFAK6wqfkCf67z4q+O
 I+5piI4eDV4jdCakfoIN1JfjhQRULNePSoCHTccan30mu/bm30p69xtOLL2/h5xH
 Mhz/eDBAOo0lrT20nyZfYMW3FnM66wnNf++qJ0O+8L052r4WOB02J0k1uM1ST01D
 5Lb5mUoxRLRzCgKRYAYWJifn+IFPUB9NMsvMTym94krAFlIjIzMEQXhDoln+jJMr
 QmY5f1BTA/fLfXobn0zwoc/C1oa2PUtxd+rxbwGrLoZ6G843mMqUi90SMr5ybhXp
 RzepnBTj4by3vOsnk/X1mANyaZfLsunp75FwnjHdPzKrAS/TuPp8D/iSxxE/PzEq
 cLwJFBnFXSgQMefDErXxhHSDz2dAg5r14rsTpDcq2Ko8TPV4rPsuSfmbd9Txekb0
 yWHsjoJUBBMl2QcWqIHl+AlV8j1RklF6solcTBcGnH1CZJMfa05VKXV7xGEvOHa0
 RJ+/xPyR9KjoB/SUp++9Vmx/M6SwQYFOJlr3Zpg9LNtR8WpoPYu1E6eO+u1Hhzny
 eJqaNstH+i+VdY9eqszkAsEBh8o9M/+b+7Wx7TetvU+v368CbXtgFYs9qy2oZjPF
 t9sY/BHaHZ8eZ7I00an77a0fVV5B1PVASUtIz5CqkwGpMvX6Z6W2K/XUUFI61kuu
 E06HS6Ht8UPJAzrAPUMl
 =Rq81
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "A bit of a small release, I suspect in part due to me travelling for
  KS. But my backlog of patches to review is smaller than usual, so I
  think in part folks just didn't send as much this cycle.

  Non-highlights:

   - Five fixes for the >128T address space handling, both to fix bugs
     in our implementation and to bring the semantics exactly into line
     with x86.

  Highlights:

   - Support for a new OPAL call on bare metal machines which gives us a
     true NMI (ie. is not masked by MSR[EE]=0) for debugging etc.

   - Support for Power9 DD2 in the CXL driver.

   - Improvements to machine check handling so that uncorrectable errors
     can be reported into the generic memory_failure() machinery.

   - Some fixes and improvements for VPHN, which is used under PowerVM
     to notify the Linux partition of topology changes.

   - Plumbing to enable TM (transactional memory) without suspend on
     some Power9 processors (PPC_FEATURE2_HTM_NO_SUSPEND).

   - Support for emulating vector loads form cache-inhibited memory, on
     some Power9 revisions.

   - Disable the fast-endian switch "syscall" by default (behind a
     CONFIG), we believe it has never had any users.

   - A major rework of the API drivers use when initiating and waiting
     for long running operations performed by OPAL firmware, and changes
     to the powernv_flash driver to use the new API.

   - Several fixes for the handling of FP/VMX/VSX while processes are
     using transactional memory.

   - Optimisations of TLB range flushes when using the radix MMU on
     Power9.

   - Improvements to the VAS facility used to access coprocessors on
     Power9, and related improvements to the way the NX crypto driver
     handles requests.

   - Implementation of PMEM_API and UACCESS_FLUSHCACHE for 64-bit.

  Thanks to: Alexey Kardashevskiy, Alistair Popple, Allen Pais, Andrew
  Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Balbir Singh, Benjamin
  Herrenschmidt, Breno Leitao, Christophe Leroy, Christophe Lombard,
  Cyril Bur, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven,
  Guilherme G. Piccoli, Gustavo Romero, Haren Myneni, Joel Stanley,
  Kamalesh Babulal, Kautuk Consul, Markus Elfring, Masami Hiramatsu,
  Michael Bringmann, Michael Neuling, Michal Suchanek, Naveen N. Rao,
  Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pedro Miraglia
  Franco de Carvalho, Philippe Bergheaud, Sandipan Das, Seth Forshee,
  Shriya, Stephen Rothwell, Stewart Smith, Sukadev Bhattiprolu, Tyrel
  Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, and William A.
  Kennington III"

* tag 'powerpc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (151 commits)
  powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature
  powerpc/64s: Fix masking of SRR1 bits on instruction fault
  powerpc/64s: mm_context.addr_limit is only used on hash
  powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation
  powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary
  powerpc/64s/hash: Fix fork() with 512TB process address space
  powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
  powerpc/64s/hash: Fix 512T hint detection to use >= 128T
  powerpc: Fix DABR match on hash based systems
  powerpc/signal: Properly handle return value from uprobe_deny_signal()
  powerpc/fadump: use kstrtoint to handle sysfs store
  powerpc/lib: Implement UACCESS_FLUSHCACHE API
  powerpc/lib: Implement PMEM API
  powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  powerpc/powernv/npu: Use flush_all_mm() instead of flush_tlb_mm()
  powerpc/powernv/idle: Round up latency and residency values
  powerpc/kprobes: refactor kprobe_lookup_name for safer string operations
  powerpc/kprobes: Blacklist emulate_update_regs() from kprobes
  powerpc/kprobes: Do not disable interrupts for optprobes and kprobes_on_ftrace
  powerpc/kprobes: Disable preemption before invoking probe handler for optprobes
  ...
2017-11-16 12:47:46 -08:00
Linus Torvalds
e60e1ee606 main drm pull request for v4.15
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaCm8RAAoJEAx081l5xIa+zX0QAJSm31kCG3vdw2CNiRx25L3q
 3hcsEOgAjVJ9FQVGKFWjzb8TK35tSqtNx5kWIj0VGaIfBE5Bdg5SLLgKKUYas8rY
 4LaphqICq2uxu2BNa2tpiar/sHhAnuozwQ4czpVWXzlaISnb9yYzRl7gMuyUVGkx
 +Gih5VUhLmQC0HsRTLJ3vaZQoUsLAl2gAjKcWa1bx57j2S+iKOPfsLaq7VYo+y1I
 Njc+iSGqMhJzRLXVkxL2lQKaslp7R38Bbh5K4Kvyjkm4Aq7zErOF6irpOXKMcrGl
 mwnr89vf1G9thjikrBaXpKnuvdbWYveoN/ORMlTdCfxkFnChHLnm3bd7NJ49RXDN
 Hv/Iq9YYjmZ9GTatxnx7lWtmXnZXC5he1yn1JAuz/yt7/0b/Wx+Mu/wEpBXYNFTd
 1AZdD586i+AmPo3yDkqH9nBu8JC0W0AnS9VZma4LVvZOP2UfJmj5Im1CLHItbGDN
 FnUCkwyD/lJUUk+WgT+w/GOMJgmFHDiFFl4tFtYVVjrUirpCFVguSKG9xuv6tT8P
 8iRsoP7RrcmDN9ojN2SEHwcpsAv3HnKkDv+9+GIbWnrGsSbCPq8Qm+JDSvf4h22I
 K5lwNpJrcpSKI+q10L7w2xliTBwb98sJkWGA/rssomrdBOWteGZAyqFRYAVgQ+mJ
 x/nJurIqQYh2KQN9+uLG
 =xVV2
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.15.

  Core:
   - Atomic object lifetime fixes
   - Atomic iterator improvements
   - Sparse/smatch fixes
   - Legacy kms ioctls to be interruptible
   - EDID override improvements
   - fb/gem helper cleanups
   - Simple outreachy patches
   - Documentation improvements
   - Fix dma-buf rcu races
   - DRM mode object leasing for improving VR use cases.
   - vgaarb improvements for non-x86 platforms.

  New driver:
   - tve200: Faraday Technology TVE200 block.

     This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
     the StorLink SL3516 (later Cortina Systems CS3516) as well as the
     Grain Media GM8180.

  New bridges:
   - SiI9234 support

  New panels:
   - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
     LT089AC19000, Innolux AT043TN24

  i915:
   - Remove Coffeelake from alpha support
   - Cannonlake workarounds
   - Infoframe refactoring for DisplayPort
   - VBT updates
   - DisplayPort vswing/emph/buffer translation refactoring
   - CCS fixes
   - Restore GPU clock boost on missed vblanks
   - Scatter list updates for userptr allocations
   - Gen9+ transition watermarks
   - Display IPC (Isochronous Priority Control)
   - Private PAT management
   - GVT: improved error handling and pci config sanitizing
   - Execlist refactoring
   - Transparent Huge Page support
   - User defined priorities support
   - HuC/GuC firmware refactoring
   - DP MST fixes
   - eDP power sequencing fixes
   - Use RCU instead of stop_machine
   - PSR state tracking support
   - Eviction fixes
   - BDW DP aux channel timeout fixes
   - LSPCON fixes
   - Cannonlake PLL fixes

  amdgpu:
   - Per VM BO support
   - Powerplay cleanups
   - CI powerplay support
   - PASID mgr for kfd
   - SR-IOV fixes
   - initial GPU reset for vega10
   - Prime mmap support
   - TTM updates
   - Clock query interface for Raven
   - Fence to handle ioctl
   - UVD encode ring support on Polaris
   - Transparent huge page DMA support
   - Compute LRU pipe tweaks
   - BO flag to allow buffers to opt out of implicit sync
   - CTX priority setting API
   - VRAM lost infrastructure plumbing

  qxl:
   - fix flicker since atomic rework

  amdkfd:
   - Further improvements from internal AMD tree
   - Usermode events
   - Drop radeon support

  nouveau:
   - Pascal temperature sensor support
   - Improved BAR2 handling
   - MMU rework to support Pascal MMU

  exynos:
   - Improved HDMI/mixer support
   - HDMI audio interface support

  tegra:
   - Prep work for tegra186
   - Cleanup/fixes

  msm:
   - Preemption support for a5xx
   - Display fixes for 8x96 (snapdragon 820)
   - Async cursor plane fixes
   - FW loading rework
   - GPU debugging improvements

  vc4:
   - Prep for DSI panels
   - fix T-format tiling scanout
   - New madvise ioctl

  Rockchip:
   - LVDS support

  omapdrm:
   - omap4 HDMI CEC support

  etnaviv:
   - GPU performance counters groundwork

  sun4i:
   - refactor driver load + TCON backend
   - HDMI improvements
   - A31 support
   - Misc fixes

  udl:
   - Probe/EDID read fixes.

  tilcdc:
   - Misc fixes.

  pl111:
   - Support more variants

  adv7511:
   - Improve EDID handling.
   - HDMI CEC support

  sii8620:
   - Add remote control support"

* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
  drm/rockchip: analogix_dp: Use mutex rather than spinlock
  drm/mode_object: fix documentation for object lookups.
  drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
  drm/i915: Move init_clock_gating() back to where it was
  drm/i915: Prune the reservation shared fence array
  drm/i915: Idle the GPU before shinking everything
  drm/i915: Lock llist_del_first() vs llist_del_all()
  drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
  drm/i915: Disable lazy PPGTT page table optimization for vGPU
  drm/i915/execlists: Remove the priority "optimisation"
  drm/i915: Filter out spurious execlists context-switch interrupts
  drm/amdgpu: use irq-safe lock for kiq->ring_lock
  drm/amdgpu: bypass lru touch for KIQ ring submission
  drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
  drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
  drm/amd/powerplay: initialize a variable before using it
  drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
  drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
  drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
  drm/rockchip: add CONFIG_OF dependency for lvds
  ...
2017-11-15 20:42:10 -08:00
Linus Torvalds
7c225c69f8 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few misc bits

 - ocfs2 updates

 - almost all of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits)
  memory hotplug: fix comments when adding section
  mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP
  mm: simplify nodemask printing
  mm,oom_reaper: remove pointless kthread_run() error check
  mm/page_ext.c: check if page_ext is not prepared
  writeback: remove unused function parameter
  mm: do not rely on preempt_count in print_vma_addr
  mm, sparse: do not swamp log with huge vmemmap allocation failures
  mm/hmm: remove redundant variable align_end
  mm/list_lru.c: mark expected switch fall-through
  mm/shmem.c: mark expected switch fall-through
  mm/page_alloc.c: broken deferred calculation
  mm: don't warn about allocations which stall for too long
  fs: fuse: account fuse_inode slab memory as reclaimable
  mm, page_alloc: fix potential false positive in __zone_watermark_ok
  mm: mlock: remove lru_add_drain_all()
  mm, sysctl: make NUMA stats configurable
  shmem: convert shmem_init_inodecache() to void
  Unify migrate_pages and move_pages access checks
  mm, pagevec: rename pagevec drained field
  ...
2017-11-15 19:42:40 -08:00
Mel Gorman
2d4894b5d2 mm: remove cold parameter from free_hot_cold_page*
Most callers users of free_hot_cold_page claim the pages being released
are cache hot.  The exception is the page reclaim paths where it is
likely that enough pages will be freed in the near future that the
per-cpu lists are going to be recycled and the cache hotness information
is lost.  As no one really cares about the hotness of pages being
released to the allocator, just ditch the parameter.

The APIs are renamed to indicate that it's no longer about hot/cold
pages.  It should also be less confusing as there are subtle differences
between them.  __free_pages drops a reference and frees a page when the
refcount reaches zero.  free_hot_cold_page handled pages whose refcount
was already zero which is non-obvious from the name.  free_unref_page
should be more obvious.

No performance impact is expected as the overhead is marginal.  The
parameter is removed simply because it is a bit stupid to have a useless
parameter copied everywhere.

[mgorman@techsingularity.net: add pages to head, not tail]
  Link: http://lkml.kernel.org/r/20171019154321.qtpzaeftoyyw4iey@techsingularity.net
Link: http://lkml.kernel.org/r/20171018075952.10627-8-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:06 -08:00
Levin, Alexander (Sasha Levin)
75f296d93b kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK
Convert all allocations that used a NOTRACK flag to stop using it.

Link: http://lkml.kernel.org/r/20171007030159.22241-3-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Hansen <devtimhansen@gmail.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:04 -08:00
Kirill A. Shutemov
b4e98d9ac7 mm: account pud page tables
On a machine with 5-level paging support a process can allocate
significant amount of memory and stay unnoticed by oom-killer and memory
cgroup.  The trick is to allocate a lot of PUD page tables.  We don't
account PUD page tables, only PMD and PTE.

We already addressed the same issue for PMD page tables, see commit
dc6c9a35b6 ("mm: account pmd page tables to the process").
Introduction of 5-level paging brings the same issue for PUD page
tables.

The patch expands accounting to PUD level.

[kirill.shutemov@linux.intel.com: s/pmd_t/pud_t/]
  Link: http://lkml.kernel.org/r/20171004074305.x35eh5u7ybbt5kar@black.fi.intel.com
[heiko.carstens@de.ibm.com: s390/mm: fix pud table accounting]
  Link: http://lkml.kernel.org/r/20171103090551.18231-1-heiko.carstens@de.ibm.com
Link: http://lkml.kernel.org/r/20171002080427.3320-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:04 -08:00
Linus Torvalds
1b6115fbe3 pci-v4.15-changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDFIdAAoJEFmIoMA60/r8Jg4P/3IrmMNVnpqmYEZ7lRSW7UQ3
 8jtupbzIkbPsIAEhbJ7xqO7zKx85j6Og+ZSOv4a8u/tS6cd1aVZu2PpWsTkacez0
 7nLGVCSL3HZi5qcrtOvb2Pmke18SUKSPxVYSgS2ajQavB1oKaY03FbHDWyWidCZx
 qxkeGZOiUDw5kSGkQWyks1WgB0dd76rVbPcrKKEJueGgrdSm+EdgdDv8eT6bZInZ
 uMrCmSjNYTQP0KASCJJvgYOtJbdwvP6NuQTxzOlU2G+H2SqsLRjsz4UUR8FF06T5
 cndpgpG3QSAZLx7wCeWTvRorTEYORzKMoyw/AUjhiGbRep9Zw0aKNvCC99E6xjyD
 FECrk6kCrqZs7l+LVXK4SwpBXCVjNgRoFAHBEKF2X3/SWUkUhHXZHCVvMQB8LQiS
 2p8VRoYWw2aCLkHCGynuzToUrD2P2Pjxe5n/13aYVJkyBNfQqqZ3l2YHiZdpDO3j
 rgG6RW0WCrpZxfb/0WAbPnQ2qpZAwDPO6hOW7dIfTZabFVXRIkBvNq53by/0MxvP
 jyOcMTsq2l8y46f3VgNPUAHj0f52HwfZA3PQRPh+MQDz5385BJklDRWtfVM7cQx9
 IoeGkq1zLLvpOh63he/jnnRELxDvNVcxND8lOkenJlObj9kK63hUEcXg/zEMS4w3
 oetLw9TqE32Jb7GfpVSw
 =j4L3
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

  - detach driver before tearing down procfs/sysfs (Alex Williamson)

  - disable PCIe services during shutdown (Sinan Kaya)

  - fix ASPM oops on systems with no Root Ports (Ard Biesheuvel)

  - fix ASPM LTR_L1.2_THRESHOLD programming (Bjorn Helgaas)

  - fix ASPM Common_Mode_Restore_Time computation (Bjorn Helgaas)

  - fix portdrv MSI/MSI-X vector allocation (Dongdong Liu, Bjorn
    Helgaas)

  - report non-fatal AER errors only to the affected endpoint (Gabriele
    Paoloni)

  - distribute bus numbers, MMIO, and I/O space among hotplug bridges to
    allow more devices to be hot-added (Mika Westerberg)

  - fix pciehp races during initialization and surprise link down (Mika
    Westerberg)

  - handle surprise-removed devices in PME handling (Qiang)

  - support resizable BARs for large graphics devices (Christian König)

  - expose SR-IOV offset, stride, and VF device ID via sysfs (Filippo
    Sironi)

  - create SR-IOV virtfn/physfn sysfs links before attaching driver
    (Stuart Hayes)

  - fix SR-IOV "ARI Capable Hierarchy" restore issue (Tony Nguyen)

  - enforce Kconfig IOV/REALLOC dependency (Sascha El-Sharkawy)

  - avoid slot reset if bridge itself is broken (Jan Glauber)

  - clean up pci_reset_function() path (Jan H. Schönherr)

  - make pci_map_rom() fail if the option ROM is invalid (Changbin Du)

  - convert timers to timer_setup() (Kees Cook)

  - move PCI_QUIRKS to PCI bus Kconfig menu (Randy Dunlap)

  - constify pci_dev_type and intel_mid_pci_ops (Bhumika Goyal)

  - remove unnecessary pci_dev, pci_bus, resource, pcibios_set_master()
    declarations (Bjorn Helgaas)

  - fix endpoint framework overflows and BUG()s (Dan Carpenter)

  - fix endpoint framework issues (Kishon Vijay Abraham I)

  - avoid broken Cavium CN8xxx bus reset behavior (David Daney)

  - extend Cavium ACS capability quirks (Vadim Lomovtsev)

  - support Synopsys DesignWare RC in ECAM mode (Ard Biesheuvel)

  - turn off dra7xx clocks cleanly on shutdown (Keerthy)

  - fix Faraday probe error path (Wei Yongjun)

  - support HiSilicon STB SoC PCIe host controller (Jianguo Sun)

  - fix Hyper-V interrupt affinity issue (Dexuan Cui)

  - remove useless ACPI warning for Hyper-V pass-through devices (Vitaly
    Kuznetsov)

  - support multiple MSI on iProc (Sandor Bodo-Merle)

  - support Layerscape LS1012a and LS1046a PCIe host controllers (Hou
    Zhiqiang)

  - fix Layerscape default error response (Minghuan Lian)

  - support MSI on Tango host controller (Marc Gonzalez)

  - support Tegra186 PCIe host controller (Manikanta Maddireddy)

  - use generic accessors on Tegra when possible (Thierry Reding)

  - support V3 Semiconductor PCI host controller (Linus Walleij)

* tag 'pci-v4.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (85 commits)
  PCI/ASPM: Add L1 Substates definitions
  PCI/ASPM: Reformat ASPM register definitions
  PCI/ASPM: Use correct capability pointer to program LTR_L1.2_THRESHOLD
  PCI/ASPM: Account for downstream device's Port Common_Mode_Restore_Time
  PCI: xgene: Rename xgene_pcie_probe_bridge() to xgene_pcie_probe()
  PCI: xilinx: Rename xilinx_pcie_link_is_up() to xilinx_pcie_link_up()
  PCI: altera: Rename altera_pcie_link_is_up() to altera_pcie_link_up()
  PCI: Fix kernel-doc build warning
  PCI: Fail pci_map_rom() if the option ROM is invalid
  PCI: Move pci_map_rom() error path
  PCI: Move PCI_QUIRKS to the PCI bus menu
  alpha/PCI: Make pdev_save_srm_config() static
  PCI: Remove unused declarations
  PCI: Remove redundant pci_dev, pci_bus, resource declarations
  PCI: Remove redundant pcibios_set_master() declarations
  PCI/PME: Handle invalid data when reading Root Status
  PCI: hv: Use effective affinity mask
  PCI: pciehp: Do not clear Presence Detect Changed during initialization
  PCI: pciehp: Fix race condition handling surprise link down
  PCI: Distribute available resources to hotplug-capable bridges
  ...
2017-11-15 15:01:28 -08:00
Linus Torvalds
1be2172e96 Modules updates for v4.15
Summary of modules changes for the 4.15 merge window:
 
 - Treewide module_param_call() cleanup, fix up set/get function
   prototype mismatches, from Kees Cook
 
 - Minor code cleanups
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJaDCyzAAoJEMBFfjjOO8FyaYQP/AwHBy6XmwwVlWDP4BqIF6hL
 Vhy3ccVLYEORvePv68tWSRPUz5n6+1Ebqanmwtkw6i8l+KwxY2SfkZql09cARc33
 2iBE4bHF98iWQmnJbF6me80fedY9n5bZJNMQKEF9VozJWwTMOTQFTCfmyJRDBmk9
 iidQj6M3idbSUOYIJjvc40VGx5NyQWSr+FFfqsz1rU5iLGRGEvA3I2/CDT0oTuV6
 D4MmFxzE2Tv/vIMa2GzKJ1LGScuUfSjf93Lq9Kk0cG36qWao8l930CaXyVdE9WJv
 bkUzpf3QYv/rDX6QbAGA0cada13zd+dfBr8YhchclEAfJ+GDLjMEDu04NEmI6KUT
 5lP0Xw0xYNZQI7bkdxDMhsj5jaz/HJpXCjPCtZBnSEKiL4OPXVMe+pBHoCJ2/yFN
 6M716XpWYgUviUOdiE+chczB5p3z4FA6u2ykaM4Tlk0btZuHGxjcSWwvcIdlPmjm
 kY4AfDV6K0bfEBVguWPJicvrkx44atqT5nWbbPhDwTSavtsuRJLb3GCsHedx7K8h
 ZO47lCQFAWCtrycK1HYw+oupNC3hYWQ0SR42XRdGhL1bq26C+1sei1QhfqSgA9PQ
 7CwWH4UTOL9fhtrzSqZngYOh9sjQNFNefqQHcecNzcEjK2vjrgQZvRNWZKHSwaFs
 fbGX8juZWP4ypbK+irTB
 =c8vb
 -----END PGP SIGNATURE-----

Merge tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:
 "Summary of modules changes for the 4.15 merge window:

   - treewide module_param_call() cleanup, fix up set/get function
     prototype mismatches, from Kees Cook

   - minor code cleanups"

* tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: Do not paper over type mismatches in module_param_call()
  treewide: Fix function prototypes for module_param_call()
  module: Prepare to convert all module_param_call() prototypes
  kernel/module: Delete an error message for a failed memory allocation in add_module_usage()
2017-11-15 13:46:33 -08:00