Commit Graph

824684 Commits

Author SHA1 Message Date
Linus Torvalds
b5dd0c658c Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - some of the rest of MM

 - various misc things

 - dynamic-debug updates

 - checkpatch

 - some epoll speedups

 - autofs

 - rapidio

 - lib/, lib/lzo/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
  samples/mic/mpssd/mpssd.h: remove duplicate header
  kernel/fork.c: remove duplicated include
  include/linux/relay.h: fix percpu annotation in struct rchan
  arch/nios2/mm/fault.c: remove duplicate include
  unicore32: stop printing the virtual memory layout
  MAINTAINERS: fix GTA02 entry and mark as orphan
  mm: create the new vm_fault_t type
  arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
  arch: simplify several early memory allocations
  openrisc: simplify pte_alloc_one_kernel()
  sh: prefer memblock APIs returning virtual address
  microblaze: prefer memblock API returning virtual address
  powerpc: prefer memblock APIs returning virtual address
  lib/lzo: separate lzo-rle from lzo
  lib/lzo: implement run-length encoding
  lib/lzo: fast 8-byte copy on arm64
  lib/lzo: 64-bit CTZ on arm64
  lib/lzo: tidy-up ifdefs
  ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
  ipc: annotate implicit fall through
  ...
2019-03-07 19:25:37 -08:00
Lee Duncan
a656183e6c scsi: libiscsi: Hold back_lock when calling iscsi_complete_task
If there is an error queueing an iscsi command in iscsi_queuecommand(), for
example if the transport fails to take the command in
sessuin->tt->xmit_task(), then the error path can call
iscsi_complete_task() without first aquiring the back_lock as
required. This can lead to things like ITT pool can get corrupt, resulting
in duplicate ITTs being sent out.

The solution is to hold the back_lock around iscsi_complete_task() calls,
and to add a little commenting to help others understand when back_lock
must be held.

Signed-off-by: Lee Duncan <lduncan@suse.com>
Acked-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-07 21:37:04 -05:00
Brajeswar Ghosh
fe0436e10c samples/mic/mpssd/mpssd.h: remove duplicate header
Remove duplicate headers which are included more than once

Link: http://lkml.kernel.org/r/20190114170033.GA3674@hp-pavilion-15-notebook-pc-brajeswar
Signed-off-by: Brajeswar Ghosh <brajeswar.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
YueHaibing
fd2081ffce kernel/fork.c: remove duplicated include
Remove duplicated include.

Link: http://lkml.kernel.org/r/20181209062952.17736-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Luc Van Oostenryck
62461ac2e5 include/linux/relay.h: fix percpu annotation in struct rchan
The percpu member of this structure is declared as:
	struct ... ** __percpu member;
So its type is:
	__percpu pointer to pointer to struct ...

But looking at how it's used, its type should be:
	pointer to __percpu pointer to struct ...
and it should thus be declared as:
	struct ... * __percpu *member;

So fix the placement of '__percpu' in the definition of this
structures.

This silents a few Sparse's warnings like:
	warning: incorrect type in initializer (different address spaces)
	  expected void const [noderef] <asn:3> *__vpp_verify
	  got struct sched_domain **

Link: http://lkml.kernel.org/r/20190118144902.79065-1-luc.vanoostenryck@gmail.com
Fixes: 017c59c042 ("relay: Use per CPU constructs for the relay channel buffer pointers")
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Sabyasachi Gupta
9587d19924 arch/nios2/mm/fault.c: remove duplicate include
Remove linux/ptrace.h which is included more than once

Link: http://lkml.kernel.org/r/5c45d345.1c69fb81.d90ed.8e05@mx.google.com
Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Geert Uytterhoeven
1476ea250c unicore32: stop printing the virtual memory layout
Since commit ad67b74d24 ("printk: hash addresses printed with %p"),
the virtual memory layout printed during boot up contains "ptrval"
instead of actual addresses.

Instead of changing the printing to "%px", and leaking virtual memory
layout information again, just remove the printing completely, cfr.
e.g.  commits 071929dbdd ("arm64: Stop printing the virtual memory
layout") and 31833332f7 ("m68k/mm: Stop printing the virtual memory
layout").

All interesting information (actual section sizes) is already printed by
mem_init_print_info() just above anyway.

Link: http://lkml.kernel.org/r/20190121152254.29079-1-geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Jann Horn
cb66cb4814 MAINTAINERS: fix GTA02 entry and mark as orphan
The entry for GTA02 never had paths listed; fix that.  commit 9d76295ac6
("[ARM] GTA02/FreeRunner: Add machine definition"), which added the entry
for GTA02, created two new files named
arch/arm/mach-s3c2442/{include/mach/gta02.h,mach-gta02.c}, which were then
renamed in commit dd6f01b5cc ("ARM: S3C2440: move mach-s3c2440/* into
mach-s3c24xx/") to
arch/arm/mach-s3c24xx/{include/mach/gta02.h,mach-gta02.c}.

Also, the GTA02 maintainer's email address is from a domain that doesn't
have an MX record anymore and appears to have expired.  Remove the
maintainer and mark the subsystem as orphan.

Link: http://lkml.kernel.org/r/20190215140444.37060-1-jannh@google.com
Signed-off-by: Jann Horn <jannh@google.com>
Cc: Nelson Castillo <arhuaco@freaks-unidos.net>
Cc: Nelson Castillo <nelsoneci@gmail.com>
Cc: Andy Green <andy@warmcat.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Souptick Joarder
3d3539018d mm: create the new vm_fault_t type
Page fault handlers are supposed to return VM_FAULT codes, but some
drivers/file systems mistakenly return error numbers.  Now that all
drivers/file systems have been converted to use the vm_fault_t return
type, change the type definition to no longer be compatible with 'int'.
By making it an unsigned int, the function prototype becomes
incompatible with a function which returns int.  Sparse will detect any
attempts to return a value which is not a VM_FAULT code.

VM_FAULT_SET_HINDEX and VM_FAULT_GET_HINDEX values are changed to avoid
conflict with other VM_FAULT codes.

[jrdr.linux@gmail.com: fix warnings]
  Link: http://lkml.kernel.org/r/20190109183742.GA24326@jordon-HP-15-Notebook-PC
Link: http://lkml.kernel.org/r/20190108183041.GA12137@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Mike Rapoport
c2938eeb88 arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
arm, s390 and unicore32 use oneliner wrappers for memblock_alloc().
Replace their usage with direct call to memblock_alloc().

Link: http://lkml.kernel.org/r/1546248566-14910-7-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Mike Rapoport
b63a07d69d arch: simplify several early memory allocations
There are several early memory allocations in arch/ code that use
memblock_phys_alloc() to allocate memory, convert the returned physical
address to the virtual address and then set the allocated memory to
zero.

Exactly the same behaviour can be achieved simply by calling
memblock_alloc(): it allocates the memory in the same way as
memblock_phys_alloc(), then it performs the phys_to_virt() conversion
and clears the allocated memory.

Replace the longer sequence with a simpler call to memblock_alloc().

Link: http://lkml.kernel.org/r/1546248566-14910-6-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Mike Rapoport
1e8ffd50fd openrisc: simplify pte_alloc_one_kernel()
The pte_alloc_one_kernel() function allocates a page using
__get_free_page(GFP_KERNEL) when mm initialization is complete and
memblock_phys_alloc() on the earlier stages.  The physical address of
the page allocated with memblock_phys_alloc() is converted to the
virtual address and in the both cases the allocated page is cleared
using clear_page().

The code is simplified by replacing __get_free_page() with
get_zeroed_page() and by replacing memblock_phys_alloc() with
memblock_alloc().

Link: http://lkml.kernel.org/r/1546248566-14910-5-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Stafford Horne <shorne@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Mike Rapoport
47f1e926ae sh: prefer memblock APIs returning virtual address
Rather than use the memblock_alloc_base that returns a physical address
and then convert this address to the virtual one, use appropriate
memblock function that returns a virtual address.

There is a small functional change in the allocation of then
NODE_DATA().  Instead of panicing if the local allocation failed, the
non-local allocation attempt will be made.

Link: http://lkml.kernel.org/r/1546248566-14910-4-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Mike Rapoport
3e5e79f240 microblaze: prefer memblock API returning virtual address
Rather than use the memblock_alloc_base that returns a physical address
and then convert this address to the virtual one, use appropriate
memblock function that returns a virtual address.

Link: http://lkml.kernel.org/r/1546248566-14910-3-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Mike Rapoport
f806714f70 powerpc: prefer memblock APIs returning virtual address
Patch series "memblock: simplify several early memory allocation", v4.

These patches simplify some of the early memory allocations by replacing
usage of older memblock APIs with newer and shinier ones.

Quite a few places in the arch/ code allocated memory using a memblock
API that returns a physical address of the allocated area, then
converted this physical address to a virtual one and then used memset(0)
to clear the allocated range.

More recent memblock APIs do all the three steps in one call and their
usage simplifies the code.

It's important to note that regardless of API used, the core allocation
is nearly identical for any set of memblock allocators: first it tries
to find a free memory with all the constraints specified by the caller
and then falls back to the allocation with some or all constraints
disabled.

The first three patches perform the conversion of call sites that have
exact requirements for the node and the possible memory range.

The fourth patch is a bit one-off as it simplifies openrisc's
implementation of pte_alloc_one_kernel(), and not only the memblock
usage.

The fifth patch takes care of simpler cases when the allocation can be
satisfied with a simple call to memblock_alloc().

The sixth patch removes one-liner wrappers for memblock_alloc on arm and
unicore32, as suggested by Christoph.

This patch (of 6):

There are a several places that allocate memory using memblock APIs that
return a physical address, convert the returned address to the virtual
address and frequently also memset(0) the allocated range.

Update these places to use memblock allocators already returning a
virtual address.  Use memblock functions that clear the allocated memory
instead of calling memset(0) where appropriate.

The calls to memblock_alloc_base() that were not followed by memset(0)
are replaced with memblock_alloc_try_nid_raw().  Since the latter does
not panic() when the allocation fails, the appropriate panic() calls are
added to the call sites.

Link: http://lkml.kernel.org/r/1546248566-14910-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mark Salter <msalter@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Dave Rodgman
45ec975efb lib/lzo: separate lzo-rle from lzo
To prevent any issues with persistent data, separate lzo-rle from lzo so
that it is treated as a separate algorithm, and lzo is still available.

Link: http://lkml.kernel.org/r/20190205155944.16007-3-dave.rodgman@arm.com
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Matt Sealey <matt.sealey@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <nitingupta910@gmail.com>
Cc: Richard Purdie <rpurdie@openedhand.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sonny Rao <sonnyrao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:03 -08:00
Dave Rodgman
5ee4014af9 lib/lzo: implement run-length encoding
Patch series "lib/lzo: run-length encoding support", v5.

Following on from the previous lzo-rle patchset:

  https://lkml.org/lkml/2018/11/30/972

This patchset contains only the RLE patches, and should be applied on
top of the non-RLE patches ( https://lkml.org/lkml/2019/2/5/366 ).

Previously, some questions were raised around the RLE patches.  I've
done some additional benchmarking to answer these questions.  In short:

 - RLE offers significant additional performance (data-dependent)

 - I didn't measure any regressions that were clearly outside the noise

One concern with this patchset was around performance - specifically,
measuring RLE impact separately from Matt Sealey's patches (CTZ & fast
copy).  I have done some additional benchmarking which I hope clarifies
the benefits of each part of the patchset.

Firstly, I've captured some memory via /dev/fmem from a Chromebook with
many tabs open which is starting to swap, and then split this into 4178
4k pages.  I've excluded the all-zero pages (as zram does), and also the
no-zero pages (which won't tell us anything about RLE performance).
This should give a realistic test dataset for zram.  What I found was
that the data is VERY bimodal: 44% of pages in this dataset contain 5%
or fewer zeros, and 44% contain over 90% zeros (30% if you include the
no-zero pages).  This supports the idea of special-casing zeros in zram.

Next, I've benchmarked four variants of lzo on these pages (on 64-bit
Arm at max frequency): baseline LZO; baseline + Matt Sealey's patches
(aka MS); baseline + RLE only; baseline + MS + RLE.  Numbers are for
weighted roundtrip throughput (the weighting reflects that zram does
more compression than decompression).

  https://drive.google.com/file/d/1VLtLjRVxgUNuWFOxaGPwJYhl_hMQXpHe/view?usp=sharing

Matt's patches help in all cases for Arm (and no effect on Intel), as
expected.

RLE also behaves as expected: with few zeros present, it makes no
difference; above ~75%, it gives a good improvement (50 - 300 MB/s on
top of the benefit from Matt's patches).

Best performance is seen with both MS and RLE patches.

Finally, I have benchmarked the same dataset on an x86-64 device.  Here,
the MS patches make no difference (as expected); RLE helps, similarly as
on Arm.  There were no definite regressions; allowing for observational
error, 0.1% (3/4178) of cases had a regression > 1 standard deviation,
of which the largest was 4.6% (1.2 standard deviations).  I think this
is probably within the noise.

  https://drive.google.com/file/d/1xCUVwmiGD0heEMx5gcVEmLBI4eLaageV/view?usp=sharing

One point to note is that the graphs show RLE appears to help very
slightly with no zeros present! This is because the extra code causes
the clang optimiser to change code layout in a way that happens to have
a significant benefit.  Taking baseline LZO and adding a do-nothing line
like "__builtin_prefetch(out_len);" immediately before the "goto next"
has the same effect.  So this is a real, but basically spurious effect -
it's small enough not to upset the overall findings.

This patch (of 3):

When using zram, we frequently encounter long runs of zero bytes.  This
adds a special case which identifies runs of zeros and encodes them
using run-length encoding.

This is faster for both compression and decompresion.  For high-entropy
data which doesn't hit this case, impact is minimal.

Compression ratio is within a few percent in all cases.

This modifies the bitstream in a way which is backwards compatible
(i.e., we can decompress old bitstreams, but old versions of lzo cannot
decompress new bitstreams).

Link: http://lkml.kernel.org/r/20190205155944.16007-2-dave.rodgman@arm.com
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Matt Sealey <matt.sealey@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <nitingupta910@gmail.com>
Cc: Richard Purdie <rpurdie@openedhand.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sonny Rao <sonnyrao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Matt Sealey
761b323850 lib/lzo: fast 8-byte copy on arm64
Enable faster 8-byte copies on arm64.

Link: http://lkml.kernel.org/r/20181127161913.23863-6-dave.rodgman@arm.com
Link: http://lkml.kernel.org/r/20190205141950.9058-4-dave.rodgman@arm.com
Signed-off-by: Matt Sealey <matt.sealey@arm.com>
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <nitingupta910@gmail.com>
Cc: Richard Purdie <rpurdie@openedhand.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Matt Sealey
433b3b3d9f lib/lzo: 64-bit CTZ on arm64
LZO leaves some performance on the table by not realising that arm64 can
optimize count-trailing-zeros bit operations.

Add CONFIG_ARM64 to the checked definitions alongside CONFIG_X86_64 to
enable the use of rbit/clz instructions on full 64-bit quantities.

Link: http://lkml.kernel.org/r/20181127161913.23863-5-dave.rodgman@arm.com
Link: http://lkml.kernel.org/r/20190205141950.9058-3-dave.rodgman@arm.com
Signed-off-by: Matt Sealey <matt.sealey@arm.com>
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <nitingupta910@gmail.com>
Cc: Richard Purdie <rpurdie@openedhand.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Dave Rodgman
95777591d0 lib/lzo: tidy-up ifdefs
Patch series "lib/lzo: performance improvements", v5.

This patch (of 3):

Modify the ifdefs in lzodefs.h to be more consistent with normal kernel
macros (e.g., change __aarch64__ to CONFIG_ARM64).

Link: http://lkml.kernel.org/r/20190205141950.9058-2-dave.rodgman@arm.com
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nitin Gupta <nitingupta910@gmail.com>
Cc: Richard Purdie <rpurdie@openedhand.com>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Matt Sealey <matt.sealey@arm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Gustavo A. R. Silva
4a2ae92993 ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
Use kvzalloc() instead of kvmalloc() and memset().

Also, make use of the struct_size() helper instead of the open-coded
version in order to avoid any potential type mistakes.

This code was detected with the help of Coccinelle.

Link: http://lkml.kernel.org/r/20190131214221.GA28930@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Mathieu Malaterre
667da6a268 ipc: annotate implicit fall through
There is a plan to build the kernel with -Wimplicit-fallthrough and this
place in the code produced a warning (W=1).

This commit remove the following warning:

  ipc/sem.c:1683:6: warning: this statement may fall through [-Wimplicit-fallthrough=]

Link: http://lkml.kernel.org/r/20190114203608.18218-1-malat@debian.org
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
David Engraf
e5eed351fd init/initramfs.c: provide more details in error messages
Use distinct error messages when archive decompression failed.

Link: http://lkml.kernel.org/r/20190212075635.7373-1-david.engraf@sysgo.com
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Anders Roxell
1a6a1dbeb7 lib/ubsan: default UBSAN_ALIGNMENT to not set
When booting an allmodconfig kernel, there are a lot of false-positives.
With a message like this 'UBSAN: Undefined behaviour in...' with a call
trace that follows.

UBSAN warnings are a result of enabling noisy CONFIG_UBSAN_ALIGNMENT
which is disabled by default if HAVE_EFFICIENT_UNALIGNED_ACCESS=y.

It's noisy even if don't have efficient unaligned access, e.g.  people
often add __cacheline_aligned_in_smp in structs, but forget to align
allocations of such struct (kmalloc() give 8-byte alignment in worst
case).

Rework so that when building a allmodconfig kernel that turns everything
into '=m' or '=y' will turn off UBSAN_ALIGNMENT.

[aryabinin@virtuozzo.com: changelog addition]
Link: http://lkml.kernel.org/r/20181217150326.30933-1-anders.roxell@linaro.org
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Jackie Liu
663cb6340c scripts/gdb: replace flags (MS_xyz -> SB_xyz)
Since commit 1751e8a6cb ("Rename superblock flags (MS_xyz ->
SB_xyz)"), scripts/gdb should be updated to replace MS_xyz with SB_xyz.

This change didn't directly affect the running operation of scripts/gdb
until commit e262e32d6b "vfs: Suppress MS_* flag defs within the
kernel unless explicitly enabled" removed the definitions used by
constants.py.

Update constants.py.in to utilise the new internal flags, matching the
implementation at fs/proc_namespace.c::show_sb_opts.

Note to stable, e262e32d6b landed in v5.0-rc1 (which was just
released), so we'll want this picked back to 5.0 stable once this patch
hits mainline (akpm just picked it up).  Without this, debugging a
kernel a kernel via GDB+QEMU is broken in the 5.0 release.

[kieran.bingham@ideasonboard.com: add fixes tag, reword commit message]
Link: http://lkml.kernel.org/r/20190305103014.25847-1-kieran.bingham@ideasonboard.com
Fixes: e262e32d6b "vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled"
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: Dan Robertson <danlrobertson89@gmail.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Elena Reshetova
39e07cb608 kcov: convert kcov.refcount to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:

 - counter is initialized to 1 using atomic_set()

 - a resource is freed upon counter reaching zero

 - once counter reaches zero, its further
   increments aren't allowed

 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided refcount_t
type and API that prevents accidental counter overflows and underflows.
This is important since overflows and underflows can lead to
use-after-free situation and be exploitable.

The variable kcov.refcount is used as pure reference counter.  Convert
it to refcount_t and fix up the operations.

**Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c have
different memory ordering guarantees than their atomic counterparts.

The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57
and it is hopefully soon in state to be merged to the documentation
tree.  Normally the differences should not matter since refcount_t
provides enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.  Please double check that you don't
have some undocumented memory guarantees for this variable usage.

For the kcov.refcount it might make a difference
in following places:
 - kcov_put(): decrement in refcount_dec_and_test() only
   provides RELEASE ordering and control dependency on success
   vs. fully ordered atomic counterpart

Link: http://lkml.kernel.org/r/1547634429-772-1-git-send-email-elena.reshetova@intel.com
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Greg Kroah-Hartman
ec9672d576 kcov: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Link: http://lkml.kernel.org/r/20190122152151.16139-46-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Masahiro Yamada
13610aa908 kernel/configs: use .incbin directive to embed config_data.gz
This slightly optimizes the kernel/configs.c build.

bin2c is not very efficient because it converts a data file into a huge
array to embed it into a *.c file.

Instead, we can use the .incbin directive.

Also, this simplifies the code; Makefile is cleaner, and the way to get
the offset/size of the config_data.gz is more straightforward.

I used the "asm" statement in *.c instead of splitting it into *.S
because MODULE_* tags are not supported in *.S files.

I also cleaned up kernel/.gitignore; "config_data.gz" is unneeded
because the top-level .gitignore takes care of the "*.gz" pattern.

[yamada.masahiro@socionext.com: v2]
  Link: http://lkml.kernel.org/r/1550108893-21226-1-git-send-email-yamada.masahiro@socionext.com
Link: http://lkml.kernel.org/r/1549941160-8084-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Alexey Brodkin
3337d5cfe5 configs: get rid of obsolete CONFIG_ENABLE_WARN_DEPRECATED
This Kconfig option was removed during v4.19 development in commit
771c035372 ("deprecate the '__deprecated' attribute warnings entirely
and for good") so there's no point to keep it in defconfigs any longer.

FWIW defconfigs were patched with:
--------------------------->8----------------------
find . -name *_defconfig -exec sed -i '/CONFIG_ENABLE_WARN_DEPRECATED/d' {} \;
--------------------------->8----------------------

Link: http://lkml.kernel.org/r/20190128152434.41969-1-abrodkin@synopsys.com
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Gustavo A. R. Silva
9abdb50cda kernel/gcov/gcc_3_4.c: use struct_size() in kzalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array.  For example:

  struct foo {
      int stuff;
      void *entry[];
  };

  instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

  instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Link: http://lkml.kernel.org/r/20190109172445.GA15908@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Christian Brauner
32a5ad9c22 sysctl: handle overflow for file-max
Currently, when writing

  echo 18446744073709551616 > /proc/sys/fs/file-max

/proc/sys/fs/file-max will overflow and be set to 0.  That quickly
crashes the system.

This commit sets the max and min value for file-max.  The max value is
set to long int.  Any higher value cannot currently be used as the
percpu counters are long ints and not unsigned integers.

Note that the file-max value is ultimately parsed via
__do_proc_doulongvec_minmax().  This function does not report error when
min or max are exceeded.  Which means if a value largen that long int is
written userspace will not receive an error instead the old value will be
kept.  There is an argument to be made that this should be changed and
__do_proc_doulongvec_minmax() should return an error when a dedicated min
or max value are exceeded.  However this has the potential to break
userspace so let's defer this to an RFC patch.

Link: http://lkml.kernel.org/r/20190107222700.15954-3-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Waiman Long <longman@redhat.com>
[christian@brauner.io: v4]
  Link: http://lkml.kernel.org/r/20190210203943.8227-3-christian@brauner.io
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Christian Brauner
7f2923c4f7 sysctl: handle overflow in proc_get_long
proc_get_long() is a funny function.  It uses simple_strtoul() and for a
good reason.  proc_get_long() wants to always succeed the parse and
return the maybe incorrect value and the trailing characters to check
against a pre-defined list of acceptable trailing values.  However,
simple_strtoul() explicitly ignores overflows which can cause funny
things like the following to happen:

  echo 18446744073709551616 > /proc/sys/fs/file-max
  cat /proc/sys/fs/file-max
  0

(Which will cause your system to silently die behind your back.)

On the other hand kstrtoul() does do overflow detection but does not
return the trailing characters, and also fails the parse when anything
other than '\n' is a trailing character whereas proc_get_long() wants to
be more lenient.

Now, before adding another kstrtoul() function let's simply add a static
parse strtoul_lenient() which:
 - fails on overflow with -ERANGE
 - returns the trailing characters to the caller

The reason why we should fail on ERANGE is that we already do a partial
fail on overflow right now.  Namely, when the TMPBUFLEN is exceeded.  So
we already reject values such as 184467440737095516160 (21 chars) but
accept values such as 18446744073709551616 (20 chars) but both are
overflows.  So we should just always reject 64bit overflows and not
special-case this based on the number of chars.

Link: http://lkml.kernel.org/r/20190107222700.15954-2-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Gustavo A. R. Silva
92bf501638 rapidio/mport_cdev: mark expected switch fall-through
In preparation for enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

This patch fixes the following warning:

  drivers/rapidio/devices/rio_mport_cdev.c: In function `mport_release_mapping':
  drivers/rapidio/devices/rio_mport_cdev.c:2151:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
     rio_unmap_inb_region(mport, map->phys_addr);
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    CC      drivers/regulator/fixed-helper.o
    CC      drivers/pinctrl/stm32/pinctrl-stm32f429.o
  drivers/rapidio/devices/rio_mport_cdev.c:2152:2: note: here
    case MAP_DMA:
    ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Link: http://lkml.kernel.org/r/20190212175014.GA14326@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Alexandre Bounine <alex.bou9@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:02 -08:00
Dan Carpenter
5ac188b12e drivers/rapidio/rio_cm.c: fix potential oops in riocm_ch_listen()
If riocm_get_channel() fails, then we should just return -EINVAL.
Calling riocm_put_channel() will trigger a NULL dereference and
generally we should call put() if the get() didn't succeed.

Link: http://lkml.kernel.org/r/20190110130230.GB27017@kadam
Fixes: b6e8d4aa11 ("rapidio: add RapidIO channelized messaging driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Johannes Weiner
4b04700275 kernel: workqueue: clarify wq_worker_last_func() caller requirements
This function can only be called safely from very specific scheduler
contexts.  Document those.

Link: http://lkml.kernel.org/r/20190206150528.31198-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Oleg Nesterov
6eb3c3d0a5 exec: increase BINPRM_BUF_SIZE to 256
Large enterprise clients often run applications out of networked file
systems where the IT mandated layout of project volumes can end up
leading to paths that are longer than 128 characters.  Bumping this up
to the next order of two solves this problem in all but the most
egregious case while still fitting into a 512b slab.

[oleg@redhat.com: update comment, per Kees]
Link: http://lkml.kernel.org/r/20181112160956.GA28472@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Ben Woodard <woodard@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Vineet Gupta
26e152252e fs/exec.c: replace opencoded set_mask_bits()
Link: http://lkml.kernel.org/r/1548275584-18096-2-git-send-email-vgupta@synopsys.com
Link: http://lkml.kernel.org/g/20150807115710.GA16897@redhat.com
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Hou Tao
67ceb1eca0 fat: enable .splice_write to support splice on O_DIRECT file
Now splice() on O_DIRECT-opened fat file will return -EFAULT, that is
because the default .splice_write, namely default_file_splice_write(),
will construct an ITER_KVEC iov_iter and dio_refill_pages() in dio path
can not handle it.

Fix it by implementing .splice_write through iter_file_splice_write().

Spotted by xfs-tests generic/091.

Link: http://lkml.kernel.org/r/20190210094754.56355-1-houtao1@huawei.com
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
NeilBrown
660c9fc72e autofs: clear O_NONBLOCK on the pipe
autofs does not expect the pipe it is given to have O_NONBLOCK set -
specifically if __kernel_write() in autofs_write() returns -EAGAIN, this
is treated as a fatal error and the pipe is closed.

For safety autofs should, therefore, clear the O_NONBLOCK flag.

Releases of systemd prior to 8th February 2019 used
  pipe2(p, O_NONBLOCK|O_CLOEXEC)
and thus (inadvertently) set this flag.

Link: http://lkml.kernel.org/r/154993550902.3321.1183632970046073478.stgit@pluto-themaw-net
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Ian Kent
874d22d62b fs/autofs/inode.c: use seq_puts() for simple strings in autofs_show_options()
Fix checkpatch.sh WARNING about the use of seq_printf() to print simple
strings in autofs_show_options(), use seq_puts() in this case.

Link: http://lkml.kernel.org/r/154889012613.4863.12231175554744203482.stgit@pluto-themaw-net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Ian Kent
60d6d04ca3 autofs: add ignore mount option
Add an autofs file system mount option that can be used to provide a
generic indicator to applications that the mount entry should be ignored
when displaying mount information.

In other OSes that provide autofs and that provide a mount list to user
space based on the kernel mount list a no-op mount option ("ignore" is
the one use on the most common OS) is allowed so that autofs file system
users can optionally use it.

The idea is that it be used by user space programs to exclude autofs
mounts from consideration when reading the mounts list.

Prior to the change to link /etc/mtab to /proc/self/mounts all I needed
to do to achieve this was to use mount(2) and not update the mtab but
now that no longer works.

I know the symlinking happened a long time ago and I considered doing
this then but, at the time I couldn't remember the commonly used option
name and thought persuading the various utility maintainers would be too
hard.

But now I have a RHEL request to do this for compatibility for a widely
used product so I want to go ahead with it and try and enlist the help
of some utility package maintainers.

Clearly, without the option nothing can be done so it's at least a
start.

Link: http://lkml.kernel.org/r/154725123970.11260.6113771566924907275.stgit@pluto-themaw-net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Valdis Kletnieks
8496ecd0be init/calibrate.c: provide proper prototype
Sparse issues a warning:

    CHECK   init/calibrate.c
  init/calibrate.c:271:28: warning: symbol 'calibration_delay_done' was not declared. Should it be static?

The actual issue is that it's a __weak symbol that archs can override
(in fact, ARM does so), but no prototype is provided.  Let's provide one
to prevent surprises.

Link: http://lkml.kernel.org/r/18827.1548750938@turing-police.cc.vt.edu
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Alexey Dobriyan
49ac981965 fs/binfmt_elf.c: spread const a little
Link: http://lkml.kernel.org/r/20190204202830.GC27482@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Alexey Dobriyan
93f044e282 fs/binfmt_elf.c: use list_for_each_entry()
[adobriyan@gmail.com: fixup compilation]
  Link: http://lkml.kernel.org/r/20190205064334.GA2152@avx2
Link: http://lkml.kernel.org/r/20190204202800.GB27482@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Alexey Dobriyan
faf1c31520 fs/binfmt_elf.c: don't be afraid of overflow
Number of ELF program headers is 16-bit by spec, so total size
comfortably fits into "unsigned int".

Space savings: 7 bytes!

	add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-7 (-7)
	Function                                     old     new   delta
	load_elf_phdrs                               137     130      -7

Link: http://lkml.kernel.org/r/20190204202715.GA27482@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Roman Penyaev
a218cc4914 epoll: use rwlock in order to reduce ep_poll_callback() contention
The goal of this patch is to reduce contention of ep_poll_callback()
which can be called concurrently from different CPUs in case of high
events rates and many fds per epoll.  Problem can be very well
reproduced by generating events (write to pipe or eventfd) from many
threads, while consumer thread does polling.  In other words this patch
increases the bandwidth of events which can be delivered from sources to
the poller by adding poll items in a lockless way to the list.

The main change is in replacement of the spinlock with a rwlock, which
is taken on read in ep_poll_callback(), and then by adding poll items to
the tail of the list using xchg atomic instruction.  Write lock is taken
everywhere else in order to stop list modifications and guarantee that
list updates are fully completed (I assume that write side of a rwlock
does not starve, it seems qrwlock implementation has these guarantees).

The following are some microbenchmark results based on the test [1]
which starts threads which generate N events each.  The test ends when
all events are successfully fetched by the poller thread:

 spinlock
 ========

 threads  events/ms  run-time ms
       8       6402        12495
      16       7045        22709
      32       7395        43268

 rwlock + xchg
 =============

 threads  events/ms  run-time ms
       8      10038         7969
      16      12178        13138
      32      13223        24199

According to the results bandwidth of delivered events is significantly
increased, thus execution time is reduced.

This patch was tested with different sort of microbenchmarks and
artificial delays (e.g.  "udelay(get_random_int() & 0xff)") introduced
in kernel on paths where items are added to lists.

[1] https://github.com/rouming/test-tools/blob/master/stress-epoll.c

Link: http://lkml.kernel.org/r/20190103150104.17128-5-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Roman Penyaev
c3e320b615 epoll: unify awaking of wakeup source on ep_poll_callback() path
Original comment "Activate ep->ws since epi->ws may get deactivated at
any time" indeed sounds loud, but it is incorrect, because the path
where we check epi->ws is a path where insert to ovflist happens, i.e.
ep_scan_ready_list() has taken ep->mtx and waits for this callback to
finish, thus ep_modify() (which unregisters wakeup source) waits for
ep_scan_ready_list().

Here in this patch I simply call ep_pm_stay_awake_rcu(), which is a bit
extra for this path (indirectly protected by main ep->mtx, so even rcu
is not needed), but I do not want to create another naked
__ep_pm_stay_awake() variant only for this particular case, so rcu variant
is just better for all the cases.

Link: http://lkml.kernel.org/r/20190103150104.17128-4-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Roman Penyaev
c141175d01 epoll: make sure all elements in ready list are in FIFO order
Patch series "use rwlock in order to reduce ep_poll_callback()
contention", v3.

The last patch targets the contention problem in ep_poll_callback(),
which can be very well reproduced by generating events (write to pipe or
eventfd) from many threads, while consumer thread does polling.

The following are some microbenchmark results based on the test [1]
which starts threads which generate N events each.  The test ends when
all events are successfully fetched by the poller thread:

 spinlock
 ========

 threads  events/ms  run-time ms
       8       6402        12495
      16       7045        22709
      32       7395        43268

 rwlock + xchg
 =============

 threads  events/ms  run-time ms
       8      10038         7969
      16      12178        13138
      32      13223        24199

According to the results bandwidth of delivered events is significantly
increased, thus execution time is reduced.

This patch (of 4):

All coming events are stored in FIFO order and this is also should be
applicable to ->ovflist, which originally is stack, i.e.  LIFO.

Thus to keep correct FIFO order ->ovflist should reversed by adding
elements to the head of the read list but not to the tail.

Link: http://lkml.kernel.org/r/20190103150104.17128-2-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Joe Perches
a8da38a9cf checkpatch: add test for SPDX-License-Identifier on wrong line #
Warn when any SPDX-License-Identifier: tag is not created on the proper
line number.

Link: http://lkml.kernel.org/r/9b74ee87f8c1b8fd310e213fcb4994d58610fcb6.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: "Enrico Weigelt, metux IT consult" <lkml@metux.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00
Vadim Bendebury
98005e8c74 checkpatch: allow reporting C99 style comments
Presently C99 style comments are removed unconditionally before actual
patch validity check happens.  This is a problem for some third party
projects which use checkpatch.pl but do not allow C99 style comments.

This patch adds yet another variable, named C99_COMMENT_TOLERANCE.  If
it is included in the --ignore command line or config file options list,
C99 comments in the patch are reported as errors.

Tested by processing a patch with a C99 style comment, it passes the
check just fine unless '--ignore C99_COMMENT_TOLERANCE' is present in
.checkpatch.conf.

Link: http://lkml.kernel.org/r/20190110224957.25008-1-vbendeb@chromium.org
Signed-off-by: Vadim Bendebury <vbendeb@chromium.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-07 18:32:01 -08:00