Commit Graph

723290 Commits

Author SHA1 Message Date
Thomas Gleixner
cef31d9af9 posix-timer: Properly check sigevent->sigev_notify
timer_create() specifies via sigevent->sigev_notify the signal delivery for
the new timer. The valid modes are SIGEV_NONE, SIGEV_SIGNAL, SIGEV_THREAD
and (SIGEV_SIGNAL | SIGEV_THREAD_ID).

The sanity check in good_sigevent() is only checking the valid combination
for the SIGEV_THREAD_ID bit, i.e. SIGEV_SIGNAL, but if SIGEV_THREAD_ID is
not set it accepts any random value.

This has no real effects on the posix timer and signal delivery code, but
it affects show_timer() which handles the output of /proc/$PID/timers. That
function uses a string array to pretty print sigev_notify. The access to
that array has no bound checks, so random sigev_notify cause access beyond
the array bounds.

Add proper checks for the valid notify modes and remove the SIGEV_THREAD_ID
masking from various code pathes as SIGEV_NONE can never be set in
combination with SIGEV_THREAD_ID.

Reported-by: Eric Biggers <ebiggers3@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: stable@vger.kernel.org
2017-12-15 11:08:40 +01:00
Lan Tianyu
f298103359 KVM/x86: Check input paging mode when cs.l is set
Reported by syzkaller:
    WARNING: CPU: 0 PID: 27962 at arch/x86/kvm/emulate.c:5631 x86_emulate_insn+0x557/0x15f0 [kvm]
    Modules linked in: kvm_intel kvm [last unloaded: kvm]
    CPU: 0 PID: 27962 Comm: syz-executor Tainted: G    B   W        4.15.0-rc2-next-20171208+ #32
    Hardware name: Intel Corporation S1200SP/S1200SP, BIOS S1200SP.86B.01.03.0006.040720161253 04/07/2016
    RIP: 0010:x86_emulate_insn+0x557/0x15f0 [kvm]
    RSP: 0018:ffff8807234476d0 EFLAGS: 00010282
    RAX: 0000000000000000 RBX: ffff88072d0237a0 RCX: ffffffffa0065c4d
    RDX: 1ffff100e5a046f9 RSI: 0000000000000003 RDI: ffff88072d0237c8
    RBP: ffff880723447728 R08: ffff88072d020000 R09: ffffffffa008d240
    R10: 0000000000000002 R11: ffffed00e7d87db3 R12: ffff88072d0237c8
    R13: ffff88072d023870 R14: ffff88072d0238c2 R15: ffffffffa008d080
    FS:  00007f8a68666700(0000) GS:ffff880802200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000002009506c CR3: 000000071fec4005 CR4: 00000000003626f0
    Call Trace:
     x86_emulate_instruction+0x3bc/0xb70 [kvm]
     ? reexecute_instruction.part.162+0x130/0x130 [kvm]
     vmx_handle_exit+0x46d/0x14f0 [kvm_intel]
     ? trace_event_raw_event_kvm_entry+0xe7/0x150 [kvm]
     ? handle_vmfunc+0x2f0/0x2f0 [kvm_intel]
     ? wait_lapic_expire+0x25/0x270 [kvm]
     vcpu_enter_guest+0x720/0x1ef0 [kvm]
     ...

When CS.L is set, vcpu should run in the 64 bit paging mode.
Current kvm set_sregs function doesn't have such check when
userspace inputs sreg values. This will lead unexpected behavior.
This patch is to add checks for CS.L, EFER.LME, EFER.LMA and
CR4.PAE when get SREG inputs from userspace in order to avoid
unexpected behavior.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Tianyu Lan <tianyu.lan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-15 10:01:46 +01:00
Andreas Platschek
2610acf46b dmaengine: fsl-edma: disable clks on all error paths
Previously enabled clks are only disabled if clk_prepare_enable() fails.
However, there are other error paths were the previously enabled
clocks are not disabled.

To fix the problem, fsl_disable_clocks() now takes the number of clocks
that shall be disabled + unprepared. For existing calls were all clocks
were already successfully prepared + enabled, DMAMUX_NR is passed to
disable + unprepare all clocks.

In error paths were only some clocks were successfully prepared +
enabled the loop counter is passed, in order to disable + unprepare
all successfully prepared + enabled clocks.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Andreas Platschek <andreas.platschek@opentech.at>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2017-12-15 09:53:04 +05:30
Prasad B Munirathnam
5771cfffdf scsi: aacraid: Fix I/O drop during reset
"FIB_CONTEXT_FLAG_TIMEDOUT" flag is set in aac_eh_abort to indicate
command timeout. Using the same flag in reset handler causes the command
to time out and the I/Os were dropped.

Define a new flag "FIB_CONTEXT_FLAG_EH_RESET" to make sure I/O is
properly handled in eh_reset handler.

[mkp: tweaked commit message]

Signed-off-by: Prasad B Munirathnam <prasad.munirathnam@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-14 22:34:28 -05:00
Bart Van Assche
093b8886f4 scsi: core: Use blist_flags_t consistently
Use the type blist_flags_t for all variables that represent blacklist
flags. Additionally, suppress recently introduced sparse warnings
related to blacklist flags.

[mkp: fixed commit id]

Fixes: 5ebde4694e ("scsi: Use 'blist_flags_t' for scsi_devinfo flags")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-14 22:30:24 -05:00
Linus Torvalds
032b4cc8ff Power management fix for v4.15-rc4
This fixes an issue in two recent commits that may cause
 pm_runtime_enable() to be called for too many times for some
 devices during the "thaw" transition belonging to hibernation.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJaMyQRAAoJEILEb/54YlRx8C0P/j4/LO0dLY4bNPVfeMDpZrcz
 iXozHkIg2wKztuO7j+A3sRhShzu90wGjYsN9t3rrEsdqQn5GlB98vU+uak6heyRI
 kY/yLPYMnzyZbJ7Rlix6BtZgykhs7w20qxTMuy2ZV7AbeZwODVlm9swDswLcpI1D
 tZwbbj+g47I0uC2cuhc0GlcFWcTEyBp9RjsdRKbrFnrDPDOBhJg2MkQwPZ1deG0a
 q5PmTI0uskrqJif5ovEhMgQ24pE4C6t88HZj5Dg8JO3H4vaeNrTgHSFE8UjID3fM
 PH2DspE9vyTa0irxEcLQmq6rzHSZZkqdirzPn5HP8gz6zQVBMeAe277r6P2DrxxJ
 vygiG0pnDsntiA3ygfVMkVr/pL35Y6TPy2kJpKN1+LUDC5If4iObW91qu0CN5pWJ
 QxUS76TIGXTEIPhHLiy+OKPFTAYAADSeZAaedhdHLE2a3iPGB6TxT4RlPU0h3txD
 4+aQ4cbi2cyGTbPD83jaoKbzL5oRUxnnZYGJEXu/Opbay8LDTSHaTTYxAohTMUj5
 46eBEJ+hX8SDqVeAlX1UL37Wjx0zHYPyJSjYFIm5Imr/JovftG6X7lCDlFBA86VN
 DKw5Swc12QmaGmC9vQPMUwtn/QN2R0WCBm6Hisi/3OIu2/+pY/xqtBI2eEgExsgZ
 PYVvmIwqAl2+M6HcweqJ
 =0D8g
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "This fixes an issue in two recent commits that may cause
  pm_runtime_enable() to be called for too many times for some devices
  during the "thaw" transition belonging to hibernation"

* tag 'pm-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / sleep: Avoid excess pm_runtime_enable() calls in device_resume()
2017-12-14 18:25:03 -08:00
Linus Torvalds
0424378781 Various fix ups.
- Comment fixes
  - Build fix
  - Better memory alloction (don't use NR_CPUS)
  - Configuration fix
  - Build warning fix
  - Enhanced callback parameter (to simplify users of trace hooks)
  - Give up on stack tracing when RCU isn't watching (it's a lost cause)
 -----BEGIN PGP SIGNATURE-----
 
 iQHIBAABCgAyFiEEPm6V/WuN2kyArTUe1a05Y9njSUkFAlozKm0UHHJvc3RlZHRA
 Z29vZG1pcy5vcmcACgkQ1a05Y9njSUmhXwv7BEY923K3Nl3qC6LeYmNyrZ4g1PsD
 nbZ+ZjU3KlMPugGbnJCJbfsS0utUp2Wd9gHT32O4BUf0/Pxjo3utXvkzRQJ3SwHT
 X7QhXROkicAKRFrPxj0BaiLexC+yJR23wGp2YUVHLO4Aa/ptN8BJvH22+eDpsCLc
 f1DWJdvdbyPBUeoHNKevjvccsUYMlnBfe1jhJ9nRWHnq1axGV3bllcd6v4T07LgK
 LO28Krp4/V3tVN9Sq6jBoGTrULf5O1xtuBDVtANeXdha1oMUTYr4TUzTuOQxFgNu
 sCIpUWTKu+PNGmU5bhlryb20C2+PveE4EMK0InVVlqrhYJ/XjXFyRaZYYkM2NAKq
 XmeHVjMfc6Wmrd/nepuaTZvGGrK2kK/pCX/XuQthKKjAU6rO5X+FfGiSGodLPIYB
 1m+QvfX8Re2IJkswm3lS68LKtG7SnjcSB9sY6PhVe6C6oP2ya2O1hthJ+TkrfNbc
 lhE5HzfCJZ9ujSjcQoUGwjrhnIezQX4KnfJZ
 =WdMQ
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Various fix-ups:

   - comment fixes

   - build fix

   - better memory alloction (don't use NR_CPUS)

   - configuration fix

   - build warning fix

   - enhanced callback parameter (to simplify users of trace hooks)

   - give up on stack tracing when RCU isn't watching (it's a lost
     cause)"

* tag 'trace-v4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Have stack trace not record if RCU is not watching
  tracing: Pass export pointer as argument to ->write()
  ring-buffer: Remove unused function __rb_data_page_index()
  tracing: make PREEMPTIRQ_EVENTS depend on TRACING
  tracing: Allocate mask_str buffer dynamically
  tracing: always define trace_{irq,preempt}_{enable_disable}
  tracing: Fix code comments in trace.c
2017-12-14 18:21:33 -08:00
Steven Rostedt (VMware)
b00d607bb1 tracing: Have stack trace not record if RCU is not watching
The stack tracer records a stack dump whenever it sees a stack usage that is
more than what it ever saw before. This can happen at any function that is
being traced. If it happens when the CPU is going idle (or other strange
locations), RCU may not be watching, and in this case, the recording of the
stack trace will trigger a warning. There's been lots of efforts to make
hacks to allow stack tracing to proceed even if RCU is not watching, but
this only causes more issues to appear. Simply do not trace a stack if RCU
is not watching. It probably isn't a bad stack anyway.

Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-12-14 20:48:22 -05:00
Linus Torvalds
c4f988ee51 pci-v4.15-fixes-1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaMwhcAAoJEFmIoMA60/r88ygP/3fuGrYrolx1yOM/XKjiG1Za
 B+FF5IK0xcV4Xrdg4K9TrqCkVBlSmTkF2wvD1raFQhvPvzNnTMI0JvI5iwUogSlA
 EI8rKvKRT0QRtT27URhGWCZOrPMgylE7Vtq7ZGLin2o19A4JQw3TC8mz9QirSbii
 C5H+WZ6iAQ3ISjOuiFUREqQDCE8FzbYQC69uh/xMRshZFG1WPsX96ZRSvVfPQUeC
 XwvKqiO4s9I7zyxhS/WSRnnEHO7LNSRRiq8kzP0VKKMAfTo8gLYPoeIPakvXSsQZ
 j7N1ycWub9DSn7wjh9Zd922wp1EYfxkCW6jilVOd+BlmOxLjyPPGgahbNIrIojwa
 qiTY5SNdty3h27IT3Hep4yDBoqhXXPORXgh7QRlzizu8Te2SjHY7n9qi1S8Uobn+
 s98PoNYU+oelivi9MtKLKcqCaNL0RE8aCQfuA3TzCU/dzR4Trx8obV50f67vsL6e
 eK67S/KpOVUjHQmGec8iqnrJfHPZkxgaRTOVVYCXKidtJxTf/LyjDKYh8A2Q27A5
 wXdRaSxLw25r0alAWzUEIVj/iUPd4nYKY9RTeRPWiXU+wsEUcBO+js6HE1xZAAfk
 ULDqZKnkKsgW9aWlQU6DzzLsCjxPTRcC8SfK79K47Otua2laABzL47mzo2sSKKFs
 mNq3i4OiNFq8XCsN18lo
 =o3Rn
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - add a pci_get_domain_bus_and_slot() stub for the CONFIG_PCI=n case to
   avoid build breakage in the v4.16 merge window if a
   pci_get_bus_and_slot() -> pci_get_domain_bus_and_slot() patch gets
   merged before the PCI tree (Randy Dunlap)

 - fix an AMD boot regression in the 64bit BAR support added in v4.15
   (Christian König)

 - fix an R-Car use-after-free that causes a crash if no PCIe card is
   present (Geert Uytterhoeven)

* tag 'pci-v4.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: rcar: Fix use-after-free in probe error path
  x86/PCI: Only enable a 64bit BAR on single-socket AMD Family 15h
  x86/PCI: Fix infinite loop in search for 64bit BAR placement
  PCI: Add pci_get_domain_bus_and_slot() stub
2017-12-14 17:02:39 -08:00
Linus Torvalds
18d40eae7f Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "17 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  arch: define weak abort()
  mm, oom_reaper: fix memory corruption
  kernel: make groups_sort calling a responsibility group_info allocators
  mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()'
  tools/slabinfo-gnuplot: force to use bash shell
  kcov: fix comparison callback signature
  mm/slab.c: do not hash pointers when debugging slab
  mm/page_alloc.c: avoid excessive IRQ disabled times in free_unref_page_list()
  mm/memory.c: mark wp_huge_pmd() inline to prevent build failure
  scripts/faddr2line: fix CROSS_COMPILE unset error
  Documentation/vm/zswap.txt: update with same-value filled page feature
  exec: avoid gcc-8 warning for get_task_comm
  autofs: fix careless error in recent commit
  string.h: workaround for increased stack usage
  mm/kmemleak.c: make cond_resched() rate-limiting more efficient
  lib/rbtree,drm/mm: add rbtree_replace_node_cached()
  include/linux/idr.h: add #include <linux/bug.h>
2017-12-14 16:35:20 -08:00
Sudip Mukherjee
7c2c11b208 arch: define weak abort()
gcc toggle -fisolate-erroneous-paths-dereference (default at -O2
onwards) isolates faulty code paths such as null pointer access, divide
by zero etc.  If gcc port doesnt implement __builtin_trap, an abort() is
generated which causes kernel link error.

In this case, gcc is generating abort due to 'divide by zero' in
lib/mpi/mpih-div.c.

Currently 'frv' and 'arc' are failing.  Previously other arch was also
broken like m32r was fixed by commit d22e3d69ee ("m32r: fix build
failure").

Let's define this weak function which is common for all arch and fix the
problem permanently.  We can even remove the arch specific 'abort' after
this is done.

Link: http://lkml.kernel.org/r/1513118956-8718-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:49 -08:00
Michal Hocko
4837fe37ad mm, oom_reaper: fix memory corruption
David Rientjes has reported the following memory corruption while the
oom reaper tries to unmap the victims address space

  BUG: Bad page map in process oom_reaper  pte:6353826300000000 pmd:00000000
  addr:00007f50cab1d000 vm_flags:08100073 anon_vma:ffff9eea335603f0 mapping:          (null) index:7f50cab1d
  file:          (null) fault:          (null) mmap:          (null) readpage:          (null)
  CPU: 2 PID: 1001 Comm: oom_reaper
  Call Trace:
     unmap_page_range+0x1068/0x1130
     __oom_reap_task_mm+0xd5/0x16b
     oom_reaper+0xff/0x14c
     kthread+0xc1/0xe0

Tetsuo Handa has noticed that the synchronization inside exit_mmap is
insufficient.  We only synchronize with the oom reaper if
tsk_is_oom_victim which is not true if the final __mmput is called from
a different context than the oom victim exit path.  This can trivially
happen from context of any task which has grabbed mm reference (e.g.  to
read /proc/<pid>/ file which requires mm etc.).

The race would look like this

  oom_reaper		oom_victim		task
						mmget_not_zero
			do_exit
			  mmput
  __oom_reap_task_mm				mmput
  						  __mmput
						    exit_mmap
						      remove_vma
    unmap_page_range

Fix this issue by providing a new mm_is_oom_victim() helper which
operates on the mm struct rather than a task.  Any context which
operates on a remote mm struct should use this helper in place of
tsk_is_oom_victim.  The flag is set in mark_oom_victim and never cleared
so it is stable in the exit_mmap path.

Debugged by Tetsuo Handa.

Link: http://lkml.kernel.org/r/20171210095130.17110-1-mhocko@kernel.org
Fixes: 2129258024 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Andrea Argangeli <andrea@kernel.org>
Cc: <stable@vger.kernel.org>	[4.14]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:49 -08:00
Thiago Rafael Becker
bdcf0a423e kernel: make groups_sort calling a responsibility group_info allocators
In testing, we found that nfsd threads may call set_groups in parallel
for the same entry cached in auth.unix.gid, racing in the call of
groups_sort, corrupting the groups for that entry and leading to
permission denials for the client.

This patch:
 - Make groups_sort globally visible.
 - Move the call to groups_sort to the modifiers of group_info
 - Remove the call to groups_sort from set_groups

Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com
Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:49 -08:00
Christophe JAILLET
1f704fd0d1 mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()'
A semaphore is acquired before this check, so we must release it before
leaving.

Link: http://lkml.kernel.org/r/20171211211009.4971-1-christophe.jaillet@wanadoo.fr
Fixes: b7f0554a56 ("mm: fail get_vaddr_frames() for filesystem-dax mappings")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Sterba <dsterba@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Liu, Changcheng
0b265c3b3b tools/slabinfo-gnuplot: force to use bash shell
On some linux distributions, the default link of sh is dash which
deoesn't support split array like "${var//,/ }"

It's better to force to use bash shell directly.

Link: http://lkml.kernel.org/r/20171208093751.GA175471@sofia
Signed-off-by: Liu Changcheng <changcheng.liu@intel.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Dmitry Vyukov
689d77f001 kcov: fix comparison callback signature
Fix a silly copy-paste bug.  We truncated u32 args to u16.

Link: http://lkml.kernel.org/r/20171207101134.107168-1-dvyukov@google.com
Fixes: ded97d2c2b ("kcov: support comparison operands collection")
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: syzkaller@googlegroups.com
Cc: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Geert Uytterhoeven
85c3e4a5a1 mm/slab.c: do not hash pointers when debugging slab
If CONFIG_DEBUG_SLAB/CONFIG_DEBUG_SLAB_LEAK are enabled, the slab code
prints extra debug information when e.g.  corruption is detected.  This
includes pointers, which are not very useful when hashed.

Fix this by using %px to print unhashed pointers instead where it makes
sense, and by removing the printing of a last user pointer referring to
code.

[geert+renesas@glider.be: v2]
  Link: http://lkml.kernel.org/r/1513179267-2509-1-git-send-email-geert+renesas@glider.be
Link: http://lkml.kernel.org/r/1512641861-5113-1-git-send-email-geert+renesas@glider.be
Fixes: ad67b74d24 ("printk: hash addresses printed with %p")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Tobin C . Harding" <me@tobin.cc>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Lucas Stach
c24ad77d96 mm/page_alloc.c: avoid excessive IRQ disabled times in free_unref_page_list()
Since commit 9cca35d42e ("mm, page_alloc: enable/disable IRQs once
when freeing a list of pages") we see excessive IRQ disabled times of up
to 25ms on an embedded ARM system (tracing overhead included).

This is due to graphics buffers being freed back to the system via
release_pages().  Graphics buffers can be huge, so it's not hard to hit
cases where the list of pages to free has 2048 entries.  Disabling IRQs
while freeing all those pages is clearly not a good idea.

Introduce a batch limit, which allows IRQ servicing once every few
pages.  The batch count is the same as used in other parts of the MM
subsystem when dealing with IRQ disabled regions.

Link: http://lkml.kernel.org/r/20171207170314.4419-1-l.stach@pengutronix.de
Fixes: 9cca35d42e ("mm, page_alloc: enable/disable IRQs once when freeing a list of pages")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Geert Uytterhoeven
183f24aa5b mm/memory.c: mark wp_huge_pmd() inline to prevent build failure
With gcc 4.1.2:

    mm/memory.o: In function `wp_huge_pmd':
    memory.c:(.text+0x9b4): undefined reference to `do_huge_pmd_wp_page'

Interestingly, wp_huge_pmd() is emitted in the assembler output, but
never called.

Apparently replacing the call to pmd_write() in __handle_mm_fault() by a
call to the more complex pmd_access_permitted() reduced the ability of
the compiler to remove unused code.

Fix this by marking wp_huge_pmd() inline, like was done in commit
91a90140f9 ("mm/memory.c: mark create_huge_pmd() inline to prevent
build failure") for a similar problem.

[akpm@linux-foundation.org: add comment]
Link: http://lkml.kernel.org/r/1512335500-10889-1-git-send-email-geert@linux-m68k.org
Fixes: c7da82b894 ("mm: replace pmd_write with pmd_access_permitted in fault + gup paths")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Liu, Changcheng
4cc90b4cc3 scripts/faddr2line: fix CROSS_COMPILE unset error
faddr2line hit var unbound error when CROSS_COMPILE isn't set since
nounset option is set in bash script.

Link: http://lkml.kernel.org/r/20171206013022.GA83929@sofia
Fixes: 95a8798254 ("scripts/faddr2line: extend usage on generic arch")
Signed-off-by: Liu Changcheng <changcheng.liu@intel.com>
Reported-by: Richard Weinberger <richard.weinberger@gmail.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: NeilBrown <neilb@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Srividya Desireddy
51f73fffbf Documentation/vm/zswap.txt: update with same-value filled page feature
Update zswap document with details on same-value filled pages
identification feature.  The usage of zswap.same_filled_pages_enabled
module parameter is explained.

Link: http://lkml.kernel.org/r/20171206114852epcms5p6973b02a9f455d5d3c765eafda0fe2631@epcms5p6
Signed-off-by: Srividya Desireddy <srividya.dr@samsung.com>
Acked-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Arnd Bergmann
3756f6401c exec: avoid gcc-8 warning for get_task_comm
gcc-8 warns about using strncpy() with the source size as the limit:

  fs/exec.c:1223:32: error: argument to 'sizeof' in 'strncpy' call is the same expression as the source; did you mean to use the size of the destination? [-Werror=sizeof-pointer-memaccess]

This is indeed slightly suspicious, as it protects us from source
arguments without NUL-termination, but does not guarantee that the
destination is terminated.

This keeps the strncpy() to ensure we have properly padded target
buffer, but ensures that we use the correct length, by passing the
actual length of the destination buffer as well as adding a build-time
check to ensure it is exactly TASK_COMM_LEN.

There are only 23 callsites which I all reviewed to ensure this is
currently the case.  We could get away with doing only the check or
passing the right length, but it doesn't hurt to do both.

Link: http://lkml.kernel.org/r/20171205151724.1764896-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Kees Cook <keescook@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Aleksa Sarai <asarai@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
NeilBrown
302ec300ef autofs: fix careless error in recent commit
Commit ecc0c469f2 ("autofs: don't fail mount for transient error") was
meant to replace an 'if' with a 'switch', but instead added the 'switch'
leaving the case in place.

Link: http://lkml.kernel.org/r/87zi6wstmw.fsf@notabene.neil.brown.name
Fixes: ecc0c469f2 ("autofs: don't fail mount for transient error")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Ian Kent <raven@themaw.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Arnd Bergmann
146734b091 string.h: workaround for increased stack usage
The hardened strlen() function causes rather large stack usage in at
least one file in the kernel, in particular when CONFIG_KASAN is
enabled:

  drivers/media/usb/em28xx/em28xx-dvb.c: In function 'em28xx_dvb_init':
  drivers/media/usb/em28xx/em28xx-dvb.c:2062:1: error: the frame size of 3256 bytes is larger than 204 bytes [-Werror=frame-larger-than=]

Analyzing this problem led to the discovery that gcc fails to merge the
stack slots for the i2c_board_info[] structures after we strlcpy() into
them, due to the 'noreturn' attribute on the source string length check.

I reported this as a gcc bug, but it is unlikely to get fixed for gcc-8,
since it is relatively easy to work around, and it gets triggered
rarely.  An earlier workaround I did added an empty inline assembly
statement before the call to fortify_panic(), which works surprisingly
well, but is really ugly and unintuitive.

This is a new approach to the same problem, this time addressing it by
not calling the 'extern __real_strnlen()' function for string constants
where __builtin_strlen() is a compile-time constant and therefore known
to be safe.

We do this by checking if the last character in the string is a
compile-time constant '\0'.  If it is, we can assume that strlen() of
the string is also constant.

As a side-effect, this should also improve the object code output for
any other call of strlen() on a string constant.

[akpm@linux-foundation.org: add comment]
Link: http://lkml.kernel.org/r/20171205215143.3085755-1-arnd@arndb.de
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365
Link: https://patchwork.kernel.org/patch/9980413/
Link: https://patchwork.kernel.org/patch/9974047/
Fixes: 6974f0c455 ("include/linux/string.h: add the option of fortified string.h functions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Martin Wilck <mwilck@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Andrew Morton
13ab183d13 mm/kmemleak.c: make cond_resched() rate-limiting more efficient
Commit bde5f6bc68 ("kmemleak: add scheduling point to
kmemleak_scan()") tries to rate-limit the frequency of cond_resched()
calls, but does it in a way which might incur an expensive division
operation in the inner loop.  Simplify this.

Fixes: bde5f6bc68 ("kmemleak: add scheduling point to kmemleak_scan()")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Chris Wilson
338f1d9d1b lib/rbtree,drm/mm: add rbtree_replace_node_cached()
Add a variant of rbtree_replace_node() that maintains the leftmost cache
of struct rbtree_root_cached when replacing nodes within the rbtree.

As drm_mm is the only rb_replace_node() being used on an interval tree,
the mistake looks fairly self-contained.  Furthermore the only user of
drm_mm_replace_node() is its testsuite...

Testcase: igt/drm_mm/replace

Link: http://lkml.kernel.org/r/20171122100729.3742-1-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20171109212435.9265-1-chris@chris-wilson.co.uk
Fixes: f808c13fd3 ("lib/interval_tree: fast overlap detection")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Wei Wang
c47d7f56e9 include/linux/idr.h: add #include <linux/bug.h>
The <linux/bug.h> was removed from radix-tree.h by commit f5bba9d11a
("include/linux/radix-tree.h: remove unneeded #include <linux/bug.h>").

Since that commit, tools/testing/radix-tree/ couldn't pass compilation
due to tools/testing/radix-tree/idr.c:17: undefined reference to
WARN_ON_ONCE.  This patch adds the bug.h header to idr.h to solve the
issue.

Link: http://lkml.kernel.org/r/1511963726-34070-2-git-send-email-wei.w.wang@intel.com
Fixes: f5bba9d11a ("include/linux/radix-tree.h: remove unneeded #include <linux/bug.h>")
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-14 16:00:48 -08:00
Linus Torvalds
d455df0bcc Small SMB3 fixes for stable and 4.15rc
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQGcBAABAgAGBQJaMszhAAoJEIosvXAHck9R+gYMAJM6QM9sjiCf8xPh1YhPkGr4
 /yLqw6dyaicsPBo2YN6aY3tRNuAkTTbcVW6Sjaepk5WkqK3t//PYC0MzmS9cfDg+
 DdgtHwW5CoyB7cdzx0QzgAfoH3A7IRJoO9ezjiM/mkPURZlhJJTgFOhggkCGPzhU
 R7h39e7SNmg4kB2x/fx4HBWxdHrPj0AysDaxFZ83FiVtZojZ7X9tIRb5HT0PFCB5
 buoAjvtOuXueKN91Z/seSkSj0NqaANXYPXsBudMy7TlfDb/tko7LOy7TcmOn1tVy
 av51+oSTcWSgSLPnJ2LRNMfeguw39YJzcMhAdZh/4/Hik8c2MrBSTaKveJl9N1cf
 CDqRdKaoycjjhiTPgmreQUaL35rDhJ3LoYOqX2IMsGFjVjbI1S/8oIPJpL/JxZYd
 t7jxDPGNWjA6AppKo5C2kysjI0VPCvtiwxrm0aCBx6iVM8Hf/nxk9I0Dq7LLL179
 7vdYPoS4H4aip5XvDPV99Xus72qfErrnVJcYmOziqg==
 =QS2E
 -----END PGP SIGNATURE-----

Merge tag '4.15-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Small SMB3 fixes for stable and 4.15rc"

* tag '4.15-rc-smb3' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: don't log STATUS_NOT_FOUND errors for DFS
  cifs: fix NULL deref in SMB2_read
2017-12-14 11:51:21 -08:00
Linus Torvalds
e375922fc5 Merge tag 'drm-misc-fixes-2017-12-14' of git://anongit.freedesktop.org/drm/drm-misc
Pull drm fixes from Daniel Vetter:

 - two fixes for new core features

 - a corner case fix for the connnector_iter fix from last week (this
   one is cc: stable)

 - one vc4 fix

* tag 'drm-misc-fixes-2017-12-14' of git://anongit.freedesktop.org/drm/drm-misc:
  drm/drm_lease: Prevent deadlock in case drm_lease_create() fails
  drm: rework delayed connector cleanup in connector_iter
  drm: Update edid-derived drm_display_info fields at edid property set [v2]
  drm/vc4: Release fence after signalling
2017-12-14 11:45:53 -08:00
Mark Rutland
c2e90800ae virtio_mmio: fix devm cleanup
Recent rework of the virtio_mmio probe/remove paths balanced a
devm_ioremap() with an iounmap() rather than its devm variant. This ends
up corrupting the devm datastructures, and results in the following
boot-time splat on arm64 under QEMU 2.9.0:

[    3.450397] ------------[ cut here ]------------
[    3.453822] Trying to vfree() nonexistent vm area (00000000c05b4844)
[    3.460534] WARNING: CPU: 1 PID: 1 at mm/vmalloc.c:1525 __vunmap+0x1b8/0x220
[    3.475898] Kernel panic - not syncing: panic_on_warn set ...
[    3.475898]
[    3.493933] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc3 #1
[    3.513109] Hardware name: linux,dummy-virt (DT)
[    3.525382] Call trace:
[    3.531683]  dump_backtrace+0x0/0x368
[    3.543921]  show_stack+0x20/0x30
[    3.547767]  dump_stack+0x108/0x164
[    3.559584]  panic+0x25c/0x51c
[    3.569184]  __warn+0x29c/0x31c
[    3.576023]  report_bug+0x1d4/0x290
[    3.586069]  bug_handler.part.2+0x40/0x100
[    3.597820]  bug_handler+0x4c/0x88
[    3.608400]  brk_handler+0x11c/0x218
[    3.613430]  do_debug_exception+0xe8/0x318
[    3.627370]  el1_dbg+0x18/0x78
[    3.634037]  __vunmap+0x1b8/0x220
[    3.648747]  vunmap+0x6c/0xc0
[    3.653864]  __iounmap+0x44/0x58
[    3.659771]  devm_ioremap_release+0x34/0x68
[    3.672983]  release_nodes+0x404/0x880
[    3.683543]  devres_release_all+0x6c/0xe8
[    3.695692]  driver_probe_device+0x250/0x828
[    3.706187]  __driver_attach+0x190/0x210
[    3.717645]  bus_for_each_dev+0x14c/0x1f0
[    3.728633]  driver_attach+0x48/0x78
[    3.740249]  bus_add_driver+0x26c/0x5b8
[    3.752248]  driver_register+0x16c/0x398
[    3.757211]  __platform_driver_register+0xd8/0x128
[    3.770860]  virtio_mmio_init+0x1c/0x24
[    3.782671]  do_one_initcall+0xe0/0x398
[    3.791890]  kernel_init_freeable+0x594/0x660
[    3.798514]  kernel_init+0x18/0x190
[    3.810220]  ret_from_fork+0x10/0x18

To fix this, we can simply rip out the explicit cleanup that the devm
infrastructure will do for us when our probe function returns an error
code, or when our remove function returns.

We only need to ensure that we call put_device() if a call to
register_virtio_device() fails in the probe path.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 7eb781b1bb ("virtio_mmio: add cleanup for virtio_mmio_probe")
Fixes: 25f32223bc ("virtio_mmio: add cleanup for virtio_mmio_remove")
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: weiping zhang <zhangweiping@didichuxing.com>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2017-12-14 21:01:40 +02:00
Takashi Iwai
c1cfd9025c ALSA: rawmidi: Avoid racy info ioctl via ctl device
The rawmidi also allows to obtaining the information via ioctl of ctl
API.  It means that user can issue an ioctl to the rawmidi device even
when it's being removed as long as the control device is present.
Although the code has some protection via the global register_mutex,
its range is limited to the search of the corresponding rawmidi
object, and the mutex is already unlocked at accessing the rawmidi
object.  This may lead to a use-after-free.

For avoiding it, this patch widens the application of register_mutex
to the whole snd_rawmidi_info_select() function.  We have another
mutex per rawmidi object, but this operation isn't very hot path, so
it shouldn't matter from the performance POV.

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-12-14 16:52:31 +01:00
Dave Martin
3fab39997a arm64/sve: Report SVE to userspace via CPUID only if supported
Currently, the SVE field in ID_AA64PFR0_EL1 is visible
unconditionally to userspace via the CPU ID register emulation,
irrespective of the kernel config.  This means that if a kernel
configured with CONFIG_ARM64_SVE=n is run on SVE-capable hardware,
userspace will see SVE reported as present in the ID regs even
though the kernel forbids execution of SVE instructions.

This patch makes the exposure of the SVE field in ID_AA64PFR0_EL1
conditional on CONFIG_ARM64_SVE=y.

Since future architecture features are likely to encounter a
similar requirement, this patch adds a suitable helper macros for
use when declaring config-conditional ID register fields.

Fixes: 43994d824e ("arm64/sve: Detect SVE and activate runtime support")
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-14 15:14:30 +00:00
Mark Rutland
1d08a044cf arm64: fix CONFIG_DEBUG_WX address reporting
In ptdump_check_wx(), we pass walk_pgd() a start address of 0 (rather
than VA_START) for the init_mm. This means that any reported W&X
addresses are offset by VA_START, which is clearly wrong and can make
them appear like userspace addresses.

Fix this by telling the ptdump code that we're walking init_mm starting
at VA_START. We don't need to update the addr_markers, since these are
still valid bounds regardless.

Cc: <stable@vger.kernel.org>
Fixes: 1404d6f13e ("arm64: dump: Add checking for writable and exectuable pages")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <labbott@redhat.com>
Reported-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-14 10:18:23 +00:00
Amir Goldstein
da2e6b7eed ovl: fix overlay: warning prefix
Conform two stray warning messages to the standard overlayfs: prefix.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2017-12-14 11:14:52 +01:00
Stefan Raspl
cf656c7661 tools/kvm_stat: add line for totals
Add a line for the total number of events and current average at the
bottom of the body.
Note that both values exclude child trace events. I.e. if drilldown is
activated via interactive command 'x', only the totals are accounted, or
we'd be counting these twice (see previous commit "tools/kvm_stat: fix
child trace events accounting").

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:47 +01:00
Stefan Raspl
73fab6ffbd tools/kvm_stat: stop ignoring unhandled arguments
Unhandled arguments, which could easily include typos, are simply
ignored. We should be strict to avoid undetected typos.
To reproduce start kvm_stat with an extra argument, e.g.
'kvm_stat -d bnuh5ol' and note that this will actually work.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:46 +01:00
Stefan Raspl
822cfe3e48 tools/kvm_stat: suppress usage information on command line errors
Errors while parsing the '-g' command line argument result in display of
usage information prior to the error message. This is a bit confusing,
as the command line is syntactically correct.
To reproduce, run 'kvm_stat -g' and specify a non-existing or inactive
guest.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:46 +01:00
Stefan Raspl
08e20a6300 tools/kvm_stat: handle invalid regular expressions
Passing an invalid regular expression on the command line results in a
traceback. Note that interactive specification of invalid regular
expressions is not affected
To reproduce, run "kvm_stat -f '*'".

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:45 +01:00
Stefan Raspl
f3d11b0e86 tools/kvm_stat: add hint on '-f help' to man page
The man page update for this new functionality was omitted.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:44 +01:00
Stefan Raspl
fff8c9eb48 tools/kvm_stat: fix child trace events accounting
Child trace events were included in calculation of the overall total,
which is used for calculation of the percentages of the '%Total' column.
However, the parent trace envents' stats summarize the child trace
events, hence we'd incorrectly account for them twice, leading to
slightly wrong stats.
With this fix, we use the correct total. Consequently, the sum of the
child trace events' '%Total' column values is identical to the
respective value of the respective parent event. However, this also
means that the sum of the '%Total' column values will aggregate to more
than 100 percent.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:44 +01:00
Stefan Raspl
b74faa930d tools/kvm_stat: fix extra handling of 'help' with fields filter
Commit 67fbcd62f5 ("tools/kvm_stat: add '-f help' to get the available
event list") added support for '-f help'. However, the extra handling of
'help' will also take effect when 'help' is specified as a regex in
interactive mode via 'f'. This results in display of all events while
only those matching this regex should be shown.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:43 +01:00
Stefan Raspl
67c162b089 tools/kvm_stat: fix missing field update after filter change
When updating the fields filter, tracepoint events of fields previously
not visible were not enabled, as TracepointProvider.update_fields()
updated the member variable directly instead of using the setter, which
triggers the event enable/disable.
To reproduce, run 'kvm_stat -f kvm_exit', press 'c' to remove the
filter, and notice that no add'l fields that do not match the regex
'kvm_exit' will appear.
This issue was introduced by commit c469117df0 ("tools/kvm_stat:
simplify initializers").

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:42 +01:00
Stefan Raspl
faa0665041 tools/kvm_stat: fix drilldown in events-by-guests mode
When displaying debugfs events listed by guests, an attempt to switch to
reporting of stats for individual child trace events results in garbled
output. Reason is that when toggling drilldown, the update of the stats
doesn't honor when events are displayed by guests, as indicated by
Tui._display_guests.
To reproduce, run 'kvm_stat -d' and press 'b' followed by 'x'.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:42 +01:00
Stefan Raspl
19e8e54f43 tools/kvm_stat: fix command line option '-g'
Specifying a guest via '-g foo' always results in an error:
  $ kvm_stat -g foo
  Usage: kvm_stat [options]

  kvm_stat: error: Error while searching for guest "foo", use "-p" to
  specify a pid instead

Reason is that Tui.get_pid_from_gname() is not static, as it is supposed
to be.

Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:25:41 +01:00
Peter Xu
5663d8f9bb kvm: x86: fix WARN due to uninitialized guest FPU state
------------[ cut here ]------------
 Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers.
 WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90
 CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G    B      OE    4.15.0-rc2+ #10
 RIP: 0010:ex_handler_fprestore+0x88/0x90
 Call Trace:
  fixup_exception+0x4e/0x60
  do_general_protection+0xff/0x270
  general_protection+0x22/0x30
 RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm]
 RSP: 0018:ffff8803d5627810 EFLAGS: 00010246
  kvm_vcpu_reset+0x3b4/0x3c0 [kvm]
  kvm_apic_accept_events+0x1c0/0x240 [kvm]
  kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm]
  kvm_vcpu_ioctl+0x479/0x880 [kvm]
  do_vfs_ioctl+0x142/0x9a0
  SyS_ioctl+0x74/0x80
  do_syscall_64+0x15f/0x600

where kvm_put_guest_fpu is called without a prior kvm_load_guest_fpu.
To fix it, move kvm_load_guest_fpu to the very beginning of
kvm_arch_vcpu_ioctl_run.

Cc: stable@vger.kernel.org
Fixes: f775b13eed
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:24:35 +01:00
Wanpeng Li
d73235d17b KVM: X86: Fix load RFLAGS w/o the fixed bit
*** Guest State ***
 CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7
 CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffe871
 CR3 = 0x00000000fffbc000
 RSP = 0x0000000000000000  RIP = 0x0000000000000000
 RFLAGS=0x00000000         DR7 = 0x0000000000000400
        ^^^^^^^^^^

The failed vmentry is triggered by the following testcase when ept=Y:

    #include <unistd.h>
    #include <sys/syscall.h>
    #include <string.h>
    #include <stdint.h>
    #include <linux/kvm.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>

    long r[5];
    int main()
    {
    	r[2] = open("/dev/kvm", O_RDONLY);
    	r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
    	r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
    	struct kvm_regs regs = {
    		.rflags = 0,
    	};
    	ioctl(r[4], KVM_SET_REGS, &regs);
    	ioctl(r[4], KVM_RUN, 0);
    }

X86 RFLAGS bit 1 is fixed set, userspace can simply clearing bit 1
of RFLAGS with KVM_SET_REGS ioctl which results in vmentry fails.
This patch fixes it by oring X86_EFLAGS_FIXED during ioctl.

Cc: stable@vger.kernel.org
Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Quan Xu <quan.xu0@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:24:26 +01:00
Wanpeng Li
ed52870f46 KVM: MMU: Fix infinite loop when there is no available mmu page
The below test case can cause infinite loop in kvm when ept=0.

    #include <unistd.h>
    #include <sys/syscall.h>
    #include <string.h>
    #include <stdint.h>
    #include <linux/kvm.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>

    long r[5];
    int main()
    {
    	r[2] = open("/dev/kvm", O_RDONLY);
    	r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
    	r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
    	ioctl(r[4], KVM_RUN, 0);
    }

It doesn't setup the memory regions, mmu_alloc_shadow/direct_roots() in
kvm return 1 when kvm fails to allocate root page table which can result
in beblow infinite loop:

    vcpu_run() {
    	for (;;) {
	    	r = vcpu_enter_guest()::kvm_mmu_reload() returns 1
	    	if (r <= 0)
	    		break;
	    	if (need_resched())
	    		cond_resched();
      }
    }

This patch fixes it by returning -ENOSPC when there is no available kvm mmu
page for root page table.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 26eeb53cf0 (KVM: MMU: Bail out immediately if there is no available mmu page)
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-12-14 09:24:14 +01:00
Marius Vlad
bd36d3bab2 drm/drm_lease: Prevent deadlock in case drm_lease_create() fails
This case can been seen when creating the lease with the same objects passed.

[  605.515097] 2 locks held by testapp/3337:
[  605.519027]  #0:  (&dev->mode_config.idr_mutex){......}, at: [<ffff0000085f1664>] drm_mode_create_lease_ioctl+0x384/0x858
[  605.530045]  #1:  (&dev->mode_config.idr_mutex){......}, at: [<ffff0000085f11bc>] drm_lease_destroy+0x2c/0x110

Which was causing the process to hang:

[  605.398827] [<ffff0000080856cc>] __switch_to+0x94/0xa8
[  605.404030] [<ffff000008c05d00>] __schedule+0x1b0/0x698
[  605.409322] [<ffff000008c06224>] schedule+0x3c/0xa8
[  605.414260] [<ffff000008c06628>] schedule_preempt_disabled+0x20/0x38
[  605.420677] [<ffff000008c07370>] mutex_lock_nested+0x158/0x340
[  605.426572] [<ffff0000085f11bc>] drm_lease_destroy+0x2c/0x110
[  605.432389] [<ffff0000085cecf0>] drm_master_put+0xc0/0xc8
[  605.437845] [<ffff0000085f175c>] drm_mode_create_lease_ioctl+0x47c/0x858
[  605.444612] [<ffff0000085d4460>] drm_ioctl+0x198/0x448
[  605.449811] [<ffff000008201134>] do_vfs_ioctl+0xa4/0x748
[  605.455192] [<ffff000008201864>] SyS_ioctl+0x8c/0xa0
[  605.460216] [<ffff000008082f4c>] __sys_trace_return+0x0/0x4

drm_mode_create_lease_ioctl() calls drm_lease_create() which acquires a lock
on dev->mode_config.idr_mutex. In case of failure, drm_lease_create() calls
drm_master_put() which in turn tries to acquire the same lock when calling
drm_lease_destroy().

v2: - Reverse the order at exit in case of fail, so that unlocking takes place
before dropping the reference.
    - Include detail information about deadlock (Daniel Vetter)

Signed-off-by: Marius Vlad <marius-cristian.vlad@nxp.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20171213181048.32719-1-marius-cristian.vlad@nxp.com
2017-12-14 08:25:37 +01:00
Linus Torvalds
7c5cac1bc7 Changes since last update:
- Clean up duplicate includes
 - Remove ancient 'no-alloc' crap code that occasionally caused hard fs
   shutdowns due to lack of proper space reservations
 - Fix regression in FIEMAP behavior when reporting xattr extents
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJaK0JUAAoJEPh/dxk0SrTrWOcP/iDoE1nV8BHru8ynwCr0ABun
 Hc+dmtQ1uQezu1qewzWkxH/zkyvpMBtH3wkqkYQApbPw7jSN4WDUazEGPY4Ju6pJ
 gMyg64EEC6UEGN8B9M2mf1QB/Q/TjZSeFiKOLw78ikWYSG/dbf814zC2fyWO79eG
 mjGzNbdvBbId35HLd62vd8VAW7zYY3acOyzQEl41LqKoGXD9eFWIh/uvH0bGuxN3
 3YipW/PM7MBq+1rCi6pFVX+wt7pemi8hQ4vRZqMp24SB5JmvruP9E45iOt/8sep+
 D/x1YjDyhutshAjbXyIaruxeIfsrs/r/3SAkOQgktwc8ihadBTJF3TPL9aTUGwLS
 1dCL7Gd2Mx317yeHzSFs+FCq8pc+ioysbyZcCIlJPnhb1ZCaA98XD/desbNL/BY4
 uf/Uq/5dJ6Kwllzol1VVz4CVKne4x1vQhPuIT1/wYsd2tSIYiBg+XlFV67CB7Fsv
 9wRetybw2c22qINLNPc50tocGcormQT940PieketssFsOHa96GduT5Z5DEbZa7FV
 /yk68o50VU2zlKuAMtTYbLT+uL/TimgeHU1pSCXOwT2wvJA/O5hVQEadIZ51cMct
 KSFlY8xEGwDZM8S88Xf1H7yFmUpGvmAnIwPHCZSJur026rZMWeANl6MTZJTJSpTx
 Wdj87C+2s5awNUcZmX0n
 =cmic
 -----END PGP SIGNATURE-----

Merge tag 'xfs-4.15-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "Here are a few more bug fixes & cleanups for 4.15-rc4:

   - clean up duplicate includes

   - remove ancient 'no-alloc' crap code that occasionally caused hard
     fs shutdowns due to lack of proper space reservations

   - fix regression in FIEMAP behavior when reporting xattr extents"

* tag 'xfs-4.15-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: make iomap_begin functions trim iomaps consistently
  xfs: remove "no-allocation" reservations for file creations
  fs: xfs: remove duplicate includes
2017-12-13 20:15:49 -08:00
Linus Torvalds
4e746cf4f7 RISC-V Fixes for 4.15-rc4
This pull request contains three small fixes that I'm hoping to get into
 4.15-rc4:
 
 * A fix to a typo in sys_riscv_flush_icache.  This only effects error
   handling, but I think it's a small and obvious enough change that it's
   sane outside the merge window.
 * The addition of smp_mb__after_spinlock(), which was recently removed
   due to an incorrect comment.  This is largly a comment change (as
   there's a big one now), and while it's necessary for complience with
   the RISC-V memory model the lack of this fence shouldn't manifest as a
   bug on current implementations.  Nonetheless, it still seems saner to
   have the fence in 4.15.
 * The removal of some of the HVC_RISCV_SBI driver that snuck into the
   arch port.  This is compile-time dead code in 4.15 (as the driver
   isn't in yet), and during the review process we found a better way to
   implement early printk on RISC-V.  While this change doesn't do
   anything, it will make staging our HVC driver easier: without this
   change the HVC driver we hope to upstream won't build on 4.15 (because
   the 4.15 arch code would reference a function that no longer exists).
 
 Additionally, I'm instituting a bit of a process change so I don't
 submit things too quickly again:
 
 * All the patches I submit during an RC will be from the week before, so
   everyone has gotten a chance to see them and they've made it through
   our autobuilders and integration trees.
 * I'll cherry-pick (single patches) or merge (if it's a patch set) on
   top of the new RC on Monday morning.
 * I'll sign the tag on Monday morning, to let the autobuilders pick up
   exactly what I'm submitting.
 * Assuming nothing goes wrong, I'll mail the pull request out on
   Wednesday.  If something goes wrong, I'll wait at least a day after
   re-spinning the tag to let the autobuilders pick things up.
 
 Hopefully this will avoid any headaches in the future, barring any
 emergency fixes.
 
 I don't think this is the last patch set we'll want for 4.15: I think
 I'll want to remove some of the first-level irqchip driver that snuck in
 as well, which will look a lot like the HVC patch here.  This is pending
 some asm-generic cleanup I'm doing that I haven't quite gotten clean
 enough to send out yet, though, but hopefully it'll be ready by next
 week (and still OK for that late).
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlou0sQTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQebsD/4gPSr0adXxl3hXhrHWQIcigWM1QNgp
 EjNnsHiT1zLs9SNK+CBLQXEjXp/GwqfYDiE6yzaJEhoBm5rHB9c72IThCJkt6fRR
 5fuybYXxBXaXVxq0CbaItNzTdqm99iqSCD2p3mh7Dmw/X23EglkYdPqQlUIxWTtw
 BmlCaH8tn8PwGIgVATwJe8fWEe8tep8uPUW6NAMznLVCKKaOd56fkP+FVA1+M7u4
 7kaoG6B8tg3mEbHdsfvo4VSII7HcHsaTa4tOaD7RPjEDKdxFQbzY8HUP38BunqOu
 v9bsV0BXYvG5dHF0eWtQAfB+y5V7n1fipcfKtZl2HagN9hbKruwQMPZTO+xPT7Am
 F/VyYxoirzfY5quA0ancS+DV54LfcHC5MH9Xvhacvg7yY3zYTvhsZfEaND0GAMLT
 sqO3Mi24+yw5Cs24uSJEj6qkxxuHMZUePpzWJWET2vduXe+pqOYF8jQn13SqHyxs
 QoIaw18RqcBPtX6vABFoMKP6WHPk8No3iki5DrSVSQ0jN9r3HSBxvYCIdG89P6A+
 E0a65H91GhY/kVNJAE4YC8MF3e6D8/+KX9+h/q1wyueJV2tTE0cd0ZiQY7hN36HY
 /QIf8zH4tmYhHAJFNLQnK1sg1Ipino6ABHNv76MhZMxguYD4FAE0nAHsfv1AAa97
 Webx5zolwHEz8A==
 =FRZX
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-4.15-rc4-riscv_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "This contains three small fixes:

   - A fix to a typo in sys_riscv_flush_icache. This only effects error
     handling, but I think it's a small and obvious enough change that
     it's sane outside the merge window.

   - The addition of smp_mb__after_spinlock(), which was recently
     removed due to an incorrect comment. This is largly a comment
     change (as there's a big one now), and while it's necessary for
     complience with the RISC-V memory model the lack of this fence
     shouldn't manifest as a bug on current implementations.
     Nonetheless, it still seems saner to have the fence in 4.15.

   - The removal of some of the HVC_RISCV_SBI driver that snuck into the
     arch port. This is compile-time dead code in 4.15 (as the driver
     isn't in yet), and during the review process we found a better way
     to implement early printk on RISC-V. While this change doesn't do
     anything, it will make staging our HVC driver easier: without this
     change the HVC driver we hope to upstream won't build on 4.15
     (because the 4.15 arch code would reference a function that no
     longer exists).

  I don't think this is the last patch set we'll want for 4.15: I think
  I'll want to remove some of the first-level irqchip driver that snuck
  in as well, which will look a lot like the HVC patch here. This is
  pending some asm-generic cleanup I'm doing that I haven't quite gotten
  clean enough to send out yet, though, but hopefully it'll be ready by
  next week (and still OK for that late)"

 * tag 'riscv-for-linus-4.15-rc4-riscv_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux:
  RISC-V: Remove unused CONFIG_HVC_RISCV_SBI code
  RISC-V: Resurrect smp_mb__after_spinlock()
  RISC-V: Logical vs Bitwise typo
2017-12-13 20:13:05 -08:00