Commit Graph

944 Commits

Author SHA1 Message Date
Alexey Budankov
902a8dcc5b doc/admin-guide: Update perf-security.rst with CAP_PERFMON information
Update perf-security.rst documentation file with the information
related to usage of CAP_PERFMON capability to secure performance
monitoring and observability operations in system.

Committer notes:

While testing 'perf top' under cap_perfmon I noticed that it needs
some more capability and Alexey pointed out cap_ipc_lock, as needed by
this kernel chunk:

  kernel/events/core.c: 6101
       if ((locked > lock_limit) && perf_is_paranoid() &&
               !capable(CAP_IPC_LOCK)) {
               ret = -EPERM;
               goto unlock;
       }

So I added it to the documentation, and also mentioned that if the
libcap version doesn't yet supports 'cap_perfmon', its numeric value can
be used instead, i.e. if:

	# setcap "cap_perfmon,cap_ipc_lock,cap_sys_ptrace,cap_syslog=ep" perf

Fails, try:

	# setcap "38,cap_ipc_lock,cap_sys_ptrace,cap_syslog=ep" perf

I also added a paragraph stating that using an unpatched libcap will
fail the check for CAP_PERFMON, as it checks the cap number against a
maximum to see if it is valid, which makes it use as the default the
'cycles:u' event, even tho a cap_perfmon capable perf binary can get
kernel samples, to workaround that just use, e.g.:

  # perf top -e cycles
  # perf record -e cycles

And it will sample kernel and user modes.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: James Morris <jmorris@namei.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-doc@vger.kernel.org
Cc: linux-man@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: selinux@vger.kernel.org
Link: http://lore.kernel.org/lkml/17278551-9399-9ebe-d665-8827016a217d@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-16 12:19:10 -03:00
Linus Torvalds
0785249f8b Time(keeping) updates:
- Fix the time_for_children symlink in /proc/$PID/ so it properly reflects
    that it part of the 'time' namespace
 
  - Add the missing userns limit for the allowed number of time namespaces,
    which was half defined but the actual array member was not added.  This
    went unnoticed as the array has an exessive empty member at the end but
    introduced a user visible regression as the output was corrupted.
 
  - Prevent further silent ucount corruption by adding a BUILD_BUG_ON() to
    catch half updated data.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6TFe4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYob4PD/47Qwz2z2mEeO037VbbI2gY4yl/raFo
 5KPWmnwonKrtVaYAldLutA3iaG7bBbUX5fRvbSRNTS6CJIHwwfLSx7/CeWMmIXEX
 0zsBsn5QXjG89lJZXM+ot74yzjvkeoad2g0jEHv92v0WDSXFiAWhkBUwknfNFbpa
 csEjkdpyn2zTVBGBzKVHWHXddkY0o0Q0JOy0EiH09rHGpQktPoLJdYp73VCygoJd
 NRAXhTmBQq85RMcSB3eVTbSPpIuBUzZke9zoio7YZwEjl6bkvSqetPmTdIr57u4s
 ex3PX++64EXD7r8ZW36fPGDqu6v0CH2ILK7QVhwyHAYJo2LQKVd+v25muaFrzfpn
 dSG1SqabWqdIHUoW/76ORyecAFLTzGwDu07UH+6VJbXeLfmuhe/LI3hdDQFph9NQ
 BOBKhaHm8aXmAmvrkxbbAikSkJYVHrAIp5abI4PSYoPaqK1DWnSPaT1cqtaIUgYL
 Mk15z19V9np4lMCH2cucAlap8U9EvQEIfCRRdl+crDu17ZzGID1pwhY2DA8adqcT
 SUfwzzUaykd5TZtDeIe+6G9fsgf/wbSTSSbrNGKlLXDbxx+iNVXErkmx0JXLEHV4
 47cmBwQZ255DzjMfuS4HzCck2MaaP8mDWgcbszgkP+GFnkf9EAP5XNp9st937mbG
 rzP+NkjNCldN9w==
 =wOiC
 -----END PGP SIGNATURE-----

Merge tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull time(keeping) updates from Thomas Gleixner:

 - Fix the time_for_children symlink in /proc/$PID/ so it properly
   reflects that it part of the 'time' namespace

 - Add the missing userns limit for the allowed number of time
   namespaces, which was half defined but the actual array member was
   not added. This went unnoticed as the array has an exessive empty
   member at the end but introduced a user visible regression as the
   output was corrupted.

 - Prevent further silent ucount corruption by adding a BUILD_BUG_ON()
   to catch half updated data.

* tag 'timers-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ucount: Make sure ucounts in /proc/sys/user don't regress again
  time/namespace: Add max_time_namespaces ucount
  time/namespace: Fix time_for_children symlink
2020-04-12 10:13:14 -07:00
Linus Torvalds
5b8b9d0c6d Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:

 - Almost all of the rest of MM (memcg, slab-generic, slab, pagealloc,
   gup, hugetlb, pagemap, memremap)

 - Various other things (hfs, ocfs2, kmod, misc, seqfile)

* akpm: (34 commits)
  ipc/util.c: sysvipc_find_ipc() should increase position index
  kernel/gcov/fs.c: gcov_seq_next() should increase position index
  fs/seq_file.c: seq_read(): add info message about buggy .next functions
  drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings
  change email address for Pali Rohár
  selftests: kmod: test disabling module autoloading
  selftests: kmod: fix handling test numbers above 9
  docs: admin-guide: document the kernel.modprobe sysctl
  fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once()
  kmod: make request_module() return an error when autoloading is disabled
  mm/memremap: set caching mode for PCI P2PDMA memory to WC
  mm/memory_hotplug: add pgprot_t to mhp_params
  powerpc/mm: thread pgprot_t through create_section_mapping()
  x86/mm: introduce __set_memory_prot()
  x86/mm: thread pgprot_t through init_memory_mapping()
  mm/memory_hotplug: rename mhp_restrictions to mhp_params
  mm/memory_hotplug: drop the flags field from struct mhp_restrictions
  mm/special: create generic fallbacks for pte_special() and pte_mkspecial()
  mm/vma: introduce VM_ACCESS_FLAGS
  mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS
  ...
2020-04-10 17:57:48 -07:00
Linus Torvalds
ca6151a978 A handful of late-arriving fixes for the documentation tree.
-----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl6QnP0PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Yf2kH+gO2r5g/OzT32Szm3ZgRYgnJ3u6ON6MXllCd
 pxY+M3Yw/3Q4W1CMkg4Oo8Hq97dTSNtPriDpWMBDj6+QvBF1DzG797ThiHHNpTEy
 i0CV5uawvRuWHTr/jOdRZDRosegbTdgqRbj8MWBw9JjWalHky5QUG0StubXy0TPD
 ahj+QRN3oxFqlHr4OtVL8NXTeVwjiXPiRFDrcy/eEw2jb2hkiLhWfElFsh7KhE9L
 K0PPLGKJMqSM6PYjqQqJx+M363Pm/nNpyw5z55pTMn0l5kbG5tu6CIYMlMJmzYBE
 8vwJnxpiA3QRrqG5TtgVLS+yFRxO1AXdY/F6N5qt2hiczUyXlf8=
 =C8i4
 -----END PGP SIGNATURE-----

Merge tag 'docs-5.7-2' of git://git.lwn.net/linux

Pull Documentation fixes from Jonathan Corbet:
 "A handful of late-arriving fixes for the documentation tree"

* tag 'docs-5.7-2' of git://git.lwn.net/linux:
  Documentation: android: binderfs: add 'stats' mount option
  Documentation: driver-api/usb/writing_usb_driver.rst Updates documentation links
  docs: driver-api: address duplicate label warning
  Documentation: sysrq: fix RST formatting
  docs: kernel-parameters.txt: Fix broken references
  docs: kernel-parameters.txt: Remove nompx
  docs: filesystems: fix typo in qnx6.rst
2020-04-10 17:53:43 -07:00
Eric Biggers
6e71582506 docs: admin-guide: document the kernel.modprobe sysctl
Document the kernel.modprobe sysctl in the same place that all the other
kernel.* sysctls are documented.  Make sure to mention how to use this
sysctl to completely disable module autoloading, and how this sysctl
relates to CONFIG_STATIC_USERMODEHELPER.

[ebiggers@google.com: v5]
  Link: http://lkml.kernel.org/r/20200318230515.171692-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: NeilBrown <neilb@suse.com>
Link: http://lkml.kernel.org/r/20200312202552.241885-4-ebiggers@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:22 -07:00
Roman Gushchin
cf11e85fc0 mm: hugetlb: optionally allocate gigantic hugepages using cma
Commit 944d9fec8d ("hugetlb: add support for gigantic page allocation
at runtime") has added the run-time allocation of gigantic pages.

However it actually works only at early stages of the system loading,
when the majority of memory is free.  After some time the memory gets
fragmented by non-movable pages, so the chances to find a contiguous 1GB
block are getting close to zero.  Even dropping caches manually doesn't
help a lot.

At large scale rebooting servers in order to allocate gigantic hugepages
is quite expensive and complex.  At the same time keeping some constant
percentage of memory in reserved hugepages even if the workload isn't
using it is a big waste: not all workloads can benefit from using 1 GB
pages.

The following solution can solve the problem:
1) On boot time a dedicated cma area* is reserved. The size is passed
   as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the
   cma allocator and the dedicated cma area

In this case gigantic hugepages can be allocated successfully with a
high probability, however the memory isn't completely wasted if nobody
is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
etc.

* On a multi-node machine a per-node cma area is allocated on each node.
  Following gigantic hugetlb allocation are using the first available
  numa node if the mask isn't specified by a user.

Usage:
1) configure the kernel to allocate a cma area for hugetlb allocations:
   pass hugetlb_cma=10G as a kernel argument

2) allocate hugetlb pages as usual, e.g.
   echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

If the option isn't enabled or the allocation of the cma area failed,
the current behavior of the system is preserved.

x86 and arm-64 are covered by this patch, other architectures can be
trivially added later.

The patch contains clean-ups and fixes proposed and implemented by Aslan
Bakirov and Randy Dunlap.  It also contains ideas and suggestions
proposed by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Randy Dunlap
befacdcf47 Documentation: android: binderfs: add 'stats' mount option
Add documentation of the binderfs 'stats' mount option.

Description taken from the commit message.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/baa0aa81-007d-af46-16a5-91fead0bd1b9@infradead.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-04-10 10:14:53 -06:00
Alyssa Ross
8699039042 Documentation: sysrq: fix RST formatting
"On x86" and "On SPARC" are now definition list terms, like
"On PowerPC", "On other", and "On all".

The Credits list is now a bulleted list, like lots of Credits lists in
other files.  This prevents the list from becoming a single long,
unpunctuated sentence in the generated documentation.

I also did a couple of other tiny readability improvements to the
"How do I use the magic SysRq key?" section while I was there.

Signed-off-by: Alyssa Ross <hi@alyssa.is>
Link: https://lore.kernel.org/r/20200403170701.10852-1-hi@alyssa.is
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-04-07 13:32:15 -06:00
Jimmy Assarsson
cd4ca34153 docs: kernel-parameters.txt: Fix broken references
Fix remaining broken references in kernel-parameters.txt.

Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Jimmy Assarsson <jimmyassarsson@gmail.com>
Link: https://lore.kernel.org/r/20200402172614.3020-2-jimmyassarsson@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-04-07 13:29:35 -06:00
Jimmy Assarsson
ed01b03018 docs: kernel-parameters.txt: Remove nompx
x86/mpx was removed in commit 45fc24e89b
("x86/mpx: remove MPX from arch/x86"), this removes the documentation of
parameter nompx.

Fixes: 45fc24e89b ("x86/mpx: remove MPX from arch/x86")
Signed-off-by: Jimmy Assarsson <jimmyassarsson@gmail.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20200402172614.3020-1-jimmyassarsson@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-04-07 13:28:56 -06:00
Baoquan He
f3cd4c865b mm/memory_hotplug.c: only respect mem= parameter during boot stage
In commit 357b4da50a ("x86: respect memory size limiting via mem=
parameter") a global varialbe max_mem_size is added to store the value
parsed from 'mem= ', then checked when memory region is added.  This truly
stops those DIMMs from being added into system memory during boot-time.

However, it also limits the later memory hotplug functionality.  Any DIMM
can't be hotplugged any more if its region is beyond the max_mem_size.  We
will get errors like:

[  216.387164] acpi PNP0C80:02: add_memory failed
[  216.389301] acpi PNP0C80:02: acpi_memory_enable_device() error
[  216.392187] acpi PNP0C80:02: Enumeration failure

This will cause issue in a known use case where 'mem=' is added to the
hypervisor.  The memory that lies after 'mem=' boundary will be assigned
to KVM guests.  After commit 357b4da50a merged, memory can't be extended
dynamically if system memory on hypervisor is not sufficient.

So fix it by also checking if it's during boot-time restricting to add
memory.  Otherwise, skip the restriction.

And also add this use case to document of 'mem=' kernel parameter.

Fixes: 357b4da50a ("x86: respect memory size limiting via mem= parameter")
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Link: http://lkml.kernel.org/r/20200204050643.20925-1-bhe@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07 10:43:40 -07:00
Martin Cracauer
57e5d4f278 userfaultfd: wp: UFFDIO_REGISTER_MODE_WP documentation update
Add documentation about the write protection support.

[peterx@redhat.com: rewrite in rst format; fixups here and there]
Signed-off-by: Martin Cracauer <cracauer@cons.org>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shli@fb.com>
Link: http://lkml.kernel.org/r/20200220163112.11409-17-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07 10:43:39 -07:00
David Rientjes
85b9f46e8e mm, thp: track fallbacks due to failed memcg charges separately
The thp_fault_fallback and thp_file_fallback vmstats are incremented if
either the hugepage allocation fails through the page allocator or the
hugepage charge fails through mem cgroup.

This patch leaves this field untouched but adds two new fields,
thp_{fault,file}_fallback_charge, which is incremented only when the mem
cgroup charge fails.

This distinguishes between attempted hugepage allocations that fail due to
fragmentation (or low memory conditions) and those that fail due to mem
cgroup limits.  That can be used to determine the impact of fragmentation
on the system by excluding faults that failed due to memcg usage.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jeremy Cline <jcline@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2003061422070.7412@chino.kir.corp.google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07 10:43:38 -07:00
David Rientjes
dcdf11ee14 mm, shmem: add vmstat for hugepage fallback
The existing thp_fault_fallback indicates when thp attempts to allocate a
hugepage but fails, or if the hugepage cannot be charged to the mem cgroup
hierarchy.

Extend this to shmem as well.  Adds a new thp_file_fallback to complement
thp_file_alloc that gets incremented when a hugepage is attempted to be
allocated but fails, or if it cannot be charged to the mem cgroup
hierarchy.

Additionally, remove the check for CONFIG_TRANSPARENT_HUGE_PAGECACHE from
shmem_alloc_hugepage() since it is only called with this configuration
option.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jeremy Cline <jcline@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2003061421240.7412@chino.kir.corp.google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07 10:43:38 -07:00
Dmitry Safonov
eeec26d5da time/namespace: Add max_time_namespaces ucount
Michael noticed that userns limit for number of time namespaces is missing.

Furthermore, time namespace introduced UCOUNT_TIME_NAMESPACES, but didn't
introduce an array member in user_table[]. It would make array's
initialisation OOB write, but by luck the user_table array has an excessive
empty member (all accesses to the array are limited with UCOUNT_COUNTS - so
it silently reuses the last free member.

Fixes user-visible regression: max_inotify_instances by reason of the
missing UCOUNT_ENTRY() has limited max number of namespaces instead of the
number of inotify instances.

Fixes: 769071ac9f ("ns: Introduce Time Namespace")
Reported-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andrei Vagin <avagin@gmail.com>
Acked-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: stable@kernel.org
Link: https://lkml.kernel.org/r/20200406171342.128733-1-dima@arista.com
2020-04-07 12:37:21 +02:00
Linus Torvalds
7e63420847 Additional ACPI updates for 5.7-rc1
- Update the ACPICA code in the kernel to upstream revision 20200326
    including:
 
    * Fix for a typo in a comment field (Bob Moore).
    * acpiExec namespace init file fixes (Bob Moore).
    * Addition of NHLT to the known tables list (Cezary Rojewski).
    * Conversion of PlatformCommChannel ASL keyword to PCC (Erik
      Kaneda).
    * acpiexec cleanup (Erik Kaneda).
    * WSMT-related typo fix (Erik Kaneda).
    * sprintf() utility function fix (John Levon).
    * IVRS IVHD type 11h parsing implementation (Michał Żygowski).
    * IVRS IVHD type 10h reserved field name fix (Michał Żygowski).
 
  - Fix ACPI-related CPU hotplug deadlock on x86 (Qian Cai).
 
  - Fix Intel Tiger Lake ACPI device IDs in several places (Gayatri
    Kammela).
 
  - Add ACPI backlight blacklist entry for Acer Aspire 5783z (Hans
    de Goede).
 
  - Fix documentation of the "acpi_backlight" kernel command line
    switch (Randy Dunlap).
 
  - Clean up the acpi_get_psd_map() CPPC library routine (Liguang
    Zhang).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl6LQEASHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxqfoQAI/GPq7xhb8jOofmTfLxa4ahO2NDxK1E
 Ye4Tcm8JLv78hro7iMUlbPsRXm15lyDxMldGRfxsiLFTF2xQtYhdTnPx+KZ439j+
 QokMHUT6gFEMAV7OPFvXd2r58ShJJHezobbn241zTILx1c3ai66dCQrqyhYjlZ28
 0hUCyY4ilgXWuYInlckGW3Rp/Qxc9IVOxzFUV90EW9pTb4vKzoqznjNm+dpY8rHm
 QFNb2BkTJygOPmJiumi/yJX+74YSZrzW5fS1PDQS4Lr46j0imvWVVataMd1qbQ0+
 fDhvhL7IimHiM/qZg67hKpsAt6AcQPhaZ6JyoEGUoafxpBQN0a7b5rMwmL0P/HWV
 pL5mKM+jc7zh0HTb+xkpNotJxT+KBFo1jTRxGyVAnK8SThzlyFhKhetiOwaHCIDv
 dNYao6bCNsuGLh3T/09xbAmEeCSt7k+ok892N4o9wzqNfoDg6fX/c0M5ZD1F+Awb
 l9agU7XChziyDJwAqTbqndx71DK4ALrhZa1tNKA5PGTY8b5XrojoKsOyYk6PYA1x
 CqU20muRV4VAzB0pvdiwBc2Yrtfiv32mv5jMNrqrrv3D6S6R8vBUNhHlWKu/75a9
 9muIoEHWnK0/a9kmVJG8CUSXTTTPQpvOovesznruTxvGx3Mp9gw+d3/1tjuA/QNM
 ZoOOru5AEyi+
 =b3Dp
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI updates from Rafael Wysocki:
 "Additional ACPI updates.

  These update the ACPICA code in the kernel to the 20200326 upstream
  revision, fix an ACPI-related CPU hotplug deadlock on x86, update
  Intel Tiger Lake device IDs in some places, add a new ACPI backlight
  blacklist entry, update the "acpi_backlight" kernel command line
  switch documentation and clean up a CPPC library routine.

  Specifics:

   - Update the ACPICA code in the kernel to upstream revision 20200326
     including:
      * Fix for a typo in a comment field (Bob Moore)
      * acpiExec namespace init file fixes (Bob Moore)
      * Addition of NHLT to the known tables list (Cezary Rojewski)
      * Conversion of PlatformCommChannel ASL keyword to PCC (Erik
        Kaneda)
      * acpiexec cleanup (Erik Kaneda)
      * WSMT-related typo fix (Erik Kaneda)
      * sprintf() utility function fix (John Levon)
      * IVRS IVHD type 11h parsing implementation (Michał Żygowski)
      * IVRS IVHD type 10h reserved field name fix (Michał Żygowski)

   - Fix ACPI-related CPU hotplug deadlock on x86 (Qian Cai)

   - Fix Intel Tiger Lake ACPI device IDs in several places (Gayatri
     Kammela)

   - Add ACPI backlight blacklist entry for Acer Aspire 5783z (Hans de
     Goede)

   - Fix documentation of the "acpi_backlight" kernel command line
     switch (Randy Dunlap)

   - Clean up the acpi_get_psd_map() CPPC library routine (Liguang
     Zhang)"

* tag 'acpi-5.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  x86: ACPI: fix CPU hotplug deadlock
  thermal: int340x_thermal: fix: Update Tiger Lake ACPI device IDs
  platform/x86: intel-hid: fix: Update Tiger Lake ACPI device ID
  ACPI: Update Tiger Lake ACPI device IDs
  ACPI: video: Use native backlight on Acer Aspire 5783z
  ACPI: video: Docs update for "acpi_backlight" kernel parameter options
  ACPICA: Update version 20200326
  ACPICA: Fixes for acpiExec namespace init file
  ACPICA: Add NHLT table signature
  ACPICA: WSMT: Fix typo, no functional change
  ACPICA: utilities: fix sprintf()
  ACPICA: acpiexec: remove redeclaration of acpi_gbl_db_opt_no_region_support
  ACPICA: Change PlatformCommChannel ASL keyword to PCC
  ACPICA: Fix IVRS IVHD type 10h reserved field name
  ACPICA: Implement IVRS IVHD type 11h parsing
  ACPICA: Fix a typo in a comment field
  ACPI: CPPC: clean up acpi_get_psd_map()
2020-04-06 10:35:06 -07:00
Linus Torvalds
ef05db16bb Additional power management updates for 5.7-rc1
- Fix corner-case suspend-to-idle wakeup issue on systems where
    the ACPI SCI is shared with another wakeup source (Hans de Goede).
 
  - Add document describing system-wide suspend and resume code flows
    to the admin guide (Rafael Wysocki).
 
  - Add kernel command line option to set pm_debug_messages (Chen Yu).
 
  - Choose schedutil as the preferred scaling governor by default on
    ARM big.LITTLE systems and on x86 systems using the intel_pstate
    driver in the passive mode (Linus Walleij, Rafael Wysocki).
 
  - Drop racy and redundant checks from the PM core's device_prepare()
    routine (Rafael Wysocki).
 
  - Make resume from hibernation take the hibernation_restore() return
    value into account (Dexuan Cui).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl6LN9kSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxqj0P/3Uh16YIKOUeXosqtptJJiWgZoWu82aR
 H2uzc9hCV6k/tnSG/Y3ZGgI4dUZGcJ94By80XoWceZ97CC1YuBHpF/dHSupTCcvt
 tBEP9H2qbSLi9d2PhlEUJ0B6PBpWEjCzOb519Tzj4sd67FGhsKeihXl+WXdgaaKZ
 hpdj+tBkQ6Dxb7CXVw/mUpWlg/nmF+yRqpK5By+V0WMOlQtg4Ecb3lMqaJPr7b4F
 W/I5m4PE9QG+VetSSXoe7CKfLo2fsDtyHH6ioTJ9xklcOflIblZ3MYf3sKgOuuny
 m/6Zm71HmQwWRkjA/V+C/eTr0VX2TuMUALfA8i+uZBo//I58TTJq9oqosu8eao2P
 5MR79Vwa+eqtYRgpHCTM4JO1iyKiE+FTfVAPOS4hHOKjqY9BUj0YtWHxk8abdzQd
 owv9DjEtGKcipAnCKtWDIS9ikZ5TOsYiGgWyJQnoonpFY+0s3DwtRdi+gBLJliSj
 5j7sRv9wJrJRFZ7tRMHHAqoQkSYqf4YdBuRBG7F/M44anveORBJNaTbPgm30qbUP
 FXD9hTJn/4Q2nmBSzqvEv0+TJrWoCyKduzr5uNsR0eyg48w9mrlUDDkW7dOMb3DU
 WV6QGNomMuKIY4DHKCo3/k21NmzW2JN7zhUrErCnl0bXjV2AM71Cb2xjUD1eNOqV
 9LxPDvAmZoGt
 =QFDr
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "Additional power management updates.

  These fix a corner-case suspend-to-idle wakeup issue on systems where
  the ACPI SCI is shared with another wakeup source, add a kernel
  command line option to set pm_debug_messages via the kernel command
  line, add a document desctibing system-wide suspend and resume code
  flows, modify cpufreq Kconfig to choose schedutil as the preferred
  governor by default in a couple of cases and do some assorted
  cleanups.

  Specifics:

   - Fix corner-case suspend-to-idle wakeup issue on systems where the
     ACPI SCI is shared with another wakeup source (Hans de Goede).

   - Add document describing system-wide suspend and resume code flows
     to the admin guide (Rafael Wysocki).

   - Add kernel command line option to set pm_debug_messages (Chen Yu).

   - Choose schedutil as the preferred scaling governor by default on
     ARM big.LITTLE systems and on x86 systems using the intel_pstate
     driver in the passive mode (Linus Walleij, Rafael Wysocki).

   - Drop racy and redundant checks from the PM core's device_prepare()
     routine (Rafael Wysocki).

   - Make resume from hibernation take the hibernation_restore() return
     value into account (Dexuan Cui)"

* tag 'pm-5.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  platform/x86: intel_int0002_vgpio: Use acpi_register_wakeup_handler()
  ACPI: PM: Add acpi_[un]register_wakeup_handler()
  Documentation: PM: sleep: Document system-wide suspend code flows
  cpufreq: Select schedutil when using big.LITTLE
  PM: sleep: Add pm_debug_messages kernel command line option
  PM: sleep: core: Drop racy and redundant checks from device_prepare()
  PM: hibernate: Propagate the return value of hibernation_restore()
  cpufreq: intel_pstate: Select schedutil as the default governor
2020-04-06 10:14:39 -07:00
Linus Torvalds
0ad5b053d4 Char/Misc driver patches for 5.7-rc1
Here is the big set of char/misc/other driver patches for 5.7-rc1.
 
 Lots of things in here, and it's later than expected due to some reverts
 to resolve some reported issues.  All is now clean with no reported
 problems in linux-next.
 
 Included in here is:
 	- interconnect updates
 	- mei driver updates
 	- uio updates
 	- nvmem driver updates
 	- soundwire updates
 	- binderfs updates
 	- coresight updates
 	- habanalabs updates
 	- mhi new bus type and core
 	- extcon driver updates
 	- some Kconfig cleanups
 	- other small misc driver cleanups and updates
 
 As mentioned, all have been in linux-next for a while, and with the last
 two reverts, all is calm and good.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXodfvA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynzCQCfROhar3E8EhYEqSOP6xq6uhX9uegAnRgGY2rs
 rN4JJpOcTddvZcVlD+vo
 =ocWk
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char/misc/other driver patches for 5.7-rc1.

  Lots of things in here, and it's later than expected due to some
  reverts to resolve some reported issues. All is now clean with no
  reported problems in linux-next.

  Included in here is:
   - interconnect updates
   - mei driver updates
   - uio updates
   - nvmem driver updates
   - soundwire updates
   - binderfs updates
   - coresight updates
   - habanalabs updates
   - mhi new bus type and core
   - extcon driver updates
   - some Kconfig cleanups
   - other small misc driver cleanups and updates

  As mentioned, all have been in linux-next for a while, and with the
  last two reverts, all is calm and good"

* tag 'char-misc-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (174 commits)
  Revert "driver core: platform: Initialize dma_parms for platform devices"
  Revert "amba: Initialize dma_parms for amba devices"
  amba: Initialize dma_parms for amba devices
  driver core: platform: Initialize dma_parms for platform devices
  bus: mhi: core: Drop the references to mhi_dev in mhi_destroy_device()
  bus: mhi: core: Initialize bhie field in mhi_cntrl for RDDM capture
  bus: mhi: core: Add support for reading MHI info from device
  misc: rtsx: set correct pcr_ops for rts522A
  speakup: misc: Use dynamic minor numbers for speakup devices
  mei: me: add cedar fork device ids
  coresight: do not use the BIT() macro in the UAPI header
  Documentation: provide IBM contacts for embargoed hardware
  nvmem: core: remove nvmem_sysfs_get_groups()
  nvmem: core: use is_bin_visible for permissions
  nvmem: core: use device_register and device_unregister
  nvmem: core: add root_only member to nvmem device struct
  extcon: axp288: Add wakeup support
  extcon: Mark extcon_get_edev_name() function as exported symbol
  extcon: palmas: Hide error messages if gpio returns -EPROBE_DEFER
  dt-bindings: extcon: usbc-cros-ec: convert extcon-usbc-cros-ec.txt to yaml format
  ...
2020-04-03 13:22:40 -07:00
Linus Torvalds
d883600523 Merge branch 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:

 - Christian extended clone3 so that processes can be spawned into
   cgroups directly.

   This is not only neat in terms of semantics but also avoids grabbing
   the global cgroup_threadgroup_rwsem for migration.

 - Daniel added !root xattr support to cgroupfs.

   Userland already uses xattrs on cgroupfs for bookkeeping. This will
   allow delegated cgroups to support such usages.

 - Prateek tried to make cpuset hotplug handling synchronous but that
   led to possible deadlock scenarios. Reverted.

 - Other minor changes including release_agent_path handling cleanup.

* 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  docs: cgroup-v1: Document the cpuset_v2_mode mount option
  Revert "cpuset: Make cpuset hotplug synchronous"
  cgroupfs: Support user xattrs
  kernfs: Add option to enable user xattrs
  kernfs: Add removed_size out param for simple_xattr_set
  kernfs: kvmalloc xattr value instead of kmalloc
  cgroup: Restructure release_agent_path handling
  selftests/cgroup: add tests for cloning into cgroups
  clone3: allow spawning processes into cgroups
  cgroup: add cgroup_may_write() helper
  cgroup: refactor fork helpers
  cgroup: add cgroup_get_from_file() helper
  cgroup: unify attach permission checking
  cpuset: Make cpuset hotplug synchronous
  cgroup.c: Use built-in RCU list checking
  kselftest/cgroup: add cgroup destruction test
  cgroup: Clean up css_set task traversal
2020-04-03 11:30:20 -07:00
Waiman Long
0c05b9bdbf docs: cgroup-v1: Document the cpuset_v2_mode mount option
The cpuset in cgroup v1 accepts a special "cpuset_v2_mode" mount
option that make cpuset.cpus and cpuset.mems behave more like those in
cgroup v2.  Document it to make other people more aware of this feature
that can be useful in some circumstances.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2020-04-03 11:42:56 -04:00
Rafael J. Wysocki
4506c531f1 Documentation: PM: sleep: Document system-wide suspend code flows
Add a document describing high-level system-wide suspend code flows
in Linux.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-03 11:41:01 +02:00
Linus Torvalds
8c1b724ddb ARM:
* GICv4.1 support
 * 32bit host removal
 
 PPC:
 * secure (encrypted) using under the Protected Execution Framework
 ultravisor
 
 s390:
 * allow disabling GISA (hardware interrupt injection) and protected
 VMs/ultravisor support.
 
 x86:
 * New dirty bitmap flag that sets all bits in the bitmap when dirty
 page logging is enabled; this is faster because it doesn't require bulk
 modification of the page tables.
 * Initial work on making nested SVM event injection more similar to VMX,
 and less buggy.
 * Various cleanups to MMU code (though the big ones and related
 optimizations were delayed to 5.8).  Instead of using cr3 in function
 names which occasionally means eptp, KVM too has standardized on "pgd".
 * A large refactoring of CPUID features, which now use an array that
 parallels the core x86_features.
 * Some removal of pointer chasing from kvm_x86_ops, which will also be
 switched to static calls as soon as they are available.
 * New Tigerlake CPUID features.
 * More bugfixes, optimizations and cleanups.
 
 Generic:
 * selftests: cleanups, new MMU notifier stress test, steal-time test
 * CSV output for kvm_stat.
 
 KVM/MIPS has been broken since 5.5, it does not compile due to a patch committed
 by MIPS maintainers.  I had already prepared a fix, but the MIPS maintainers
 prefer to fix it in generic code rather than KVM so they are taking care of it.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl6GOnIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMfxwf/ZKLZiRoaovXCOG71M/eHtQb8ZIqU
 3MPy+On3eC5Sk/aBxWUL9EFZsbYG6kYdbZ1VOvG9XPBoLlnkDSm/IR0kaELHtnjj
 oGVda/tvGn46Ne39y8xBptmb91WDcWH0vFthT/CwlMxAw3xjr+gG7Qyo+8F2CW6m
 SSSuLiHSBnyO1cQKruBTHZ8qnR8LlnfXEqtd6Y4LFLic0LbLIoIdRcT3wjQrcZrm
 Djd7wbTEYZjUfoqZ72ekwEDUsONcDLDSKcguDO9pSMSCGhpxCVT5Vy68KRpoIMs2
 nzNWDKjvqQo5zb2+GWxJgkd12Hv+n7PCXZMbVrWBu1pQsewUns9m4mkpGw==
 =6fGt
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:
   - GICv4.1 support

   - 32bit host removal

  PPC:
   - secure (encrypted) using under the Protected Execution Framework
     ultravisor

  s390:
   - allow disabling GISA (hardware interrupt injection) and protected
     VMs/ultravisor support.

  x86:
   - New dirty bitmap flag that sets all bits in the bitmap when dirty
     page logging is enabled; this is faster because it doesn't require
     bulk modification of the page tables.

   - Initial work on making nested SVM event injection more similar to
     VMX, and less buggy.

   - Various cleanups to MMU code (though the big ones and related
     optimizations were delayed to 5.8). Instead of using cr3 in
     function names which occasionally means eptp, KVM too has
     standardized on "pgd".

   - A large refactoring of CPUID features, which now use an array that
     parallels the core x86_features.

   - Some removal of pointer chasing from kvm_x86_ops, which will also
     be switched to static calls as soon as they are available.

   - New Tigerlake CPUID features.

   - More bugfixes, optimizations and cleanups.

  Generic:
   - selftests: cleanups, new MMU notifier stress test, steal-time test

   - CSV output for kvm_stat"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (277 commits)
  x86/kvm: fix a missing-prototypes "vmread_error"
  KVM: x86: Fix BUILD_BUG() in __cpuid_entry_get_reg() w/ CONFIG_UBSAN=y
  KVM: VMX: Add a trampoline to fix VMREAD error handling
  KVM: SVM: Annotate svm_x86_ops as __initdata
  KVM: VMX: Annotate vmx_x86_ops as __initdata
  KVM: x86: Drop __exit from kvm_x86_ops' hardware_unsetup()
  KVM: x86: Copy kvm_x86_ops by value to eliminate layer of indirection
  KVM: x86: Set kvm_x86_ops only after ->hardware_setup() completes
  KVM: VMX: Configure runtime hooks using vmx_x86_ops
  KVM: VMX: Move hardware_setup() definition below vmx_x86_ops
  KVM: x86: Move init-only kvm_x86_ops to separate struct
  KVM: Pass kvm_init()'s opaque param to additional arch funcs
  s390/gmap: return proper error code on ksm unsharing
  KVM: selftests: Fix cosmetic copy-paste error in vm_mem_region_move()
  KVM: Fix out of range accesses to memslots
  KVM: X86: Micro-optimize IPI fastpath delay
  KVM: X86: Delay read msr data iff writes ICR MSR
  KVM: PPC: Book3S HV: Add a capability for enabling secure guests
  KVM: arm64: GICv4.1: Expose HW-based SGIs in debugfs
  KVM: arm64: GICv4.1: Allow non-trapping WFI when using HW SGIs
  ...
2020-04-02 15:13:15 -07:00
Linus Torvalds
6cad420cc6 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:
 "A large amount of MM, plenty more to come.

  Subsystems affected by this patch series:
   - tools
   - kthread
   - kbuild
   - scripts
   - ocfs2
   - vfs
   - mm: slub, kmemleak, pagecache, gup, swap, memcg, pagemap, mremap,
         sparsemem, kasan, pagealloc, vmscan, compaction, mempolicy,
         hugetlbfs, hugetlb"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits)
  include/linux/huge_mm.h: check PageTail in hpage_nr_pages even when !THP
  mm/hugetlb: fix build failure with HUGETLB_PAGE but not HUGEBTLBFS
  selftests/vm: fix map_hugetlb length used for testing read and write
  mm/hugetlb: remove unnecessary memory fetch in PageHeadHuge()
  mm/hugetlb.c: clean code by removing unnecessary initialization
  hugetlb_cgroup: add hugetlb_cgroup reservation docs
  hugetlb_cgroup: add hugetlb_cgroup reservation tests
  hugetlb: support file_region coalescing again
  hugetlb_cgroup: support noreserve mappings
  hugetlb_cgroup: add accounting for shared mappings
  hugetlb: disable region_add file_region coalescing
  hugetlb_cgroup: add reservation accounting for private mappings
  mm/hugetlb_cgroup: fix hugetlb_cgroup migration
  hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations
  hugetlb_cgroup: add hugetlb_cgroup reservation counter
  hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race
  hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
  mm/memblock.c: remove redundant assignment to variable max_addr
  mm: mempolicy: require at least one nodeid for MPOL_PREFERRED
  mm: mempolicy: use VM_BUG_ON_VMA in queue_pages_test_walk()
  ...
2020-04-02 13:55:34 -07:00
Mina Almasry
6566704daf hugetlb_cgroup: add hugetlb_cgroup reservation docs
Add docs for how to use hugetlb_cgroup reservations, and their behavior.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Link: http://lkml.kernel.org/r/20200211213128.73302-9-almasrymina@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02 09:35:32 -07:00
Sebastian Andrzej Siewior
6923aa0d8c mm/compaction: Disable compact_unevictable_allowed on RT
Since commit 5bbe3547aa ("mm: allow compaction of unevictable pages")
it is allowed to examine mlocked pages and compact them by default.  On
-RT even minor pagefaults are problematic because it may take a few 100us
to resolve them and until then the task is blocked.

Make compact_unevictable_allowed = 0 default and issue a warning on RT if
it is changed.

[bigeasy@linutronix.de: v5]
  Link: https://lore.kernel.org/linux-mm/20190710144138.qyn4tuttdq6h7kqx@linutronix.de/
  Link: http://lkml.kernel.org/r/20200319165536.ovi75tsr2seared4@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/linux-mm/20190710144138.qyn4tuttdq6h7kqx@linutronix.de/
Link: http://lkml.kernel.org/r/20200303202225.nhqc3v5gwlb7x6et@linutronix.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02 09:35:31 -07:00
Johannes Weiner
8a931f8013 mm: memcontrol: recursive memory.low protection
Right now, the effective protection of any given cgroup is capped by its
own explicit memory.low setting, regardless of what the parent says.  The
reasons for this are mostly historical and ease of implementation: to make
delegation of memory.low safe, effective protection is the min() of all
memory.low up the tree.

Unfortunately, this limitation makes it impossible to protect an entire
subtree from another without forcing the user to make explicit protection
allocations all the way to the leaf cgroups - something that is highly
undesirable in real life scenarios.

Consider memory in a data center host.  At the cgroup top level, we have a
distinction between system management software and the actual workload the
system is executing.  Both branches are further subdivided into individual
services, job components etc.

We want to protect the workload as a whole from the system management
software, but that doesn't mean we want to protect and prioritize
individual workload wrt each other.  Their memory demand can vary over
time, and we'd want the VM to simply cache the hottest data within the
workload subtree.  Yet, the current memory.low limitations force us to
allocate a fixed amount of protection to each workload component in order
to get protection from system management software in general.  This
results in very inefficient resource distribution.

Another concern with mandating downward allocation is that, as the
complexity of the cgroup tree grows, it gets harder for the lower levels
to be informed about decisions made at the host-level.  Consider a
container inside a namespace that in turn creates its own nested tree of
cgroups to run multiple workloads.  It'd be extremely difficult to
configure memory.low parameters in those leaf cgroups that on one hand
balance pressure among siblings as the container desires, while also
reflecting the host-level protection from e.g.  rpm upgrades, that lie
beyond one or more delegation and namespacing points in the tree.

It's highly unusual from a cgroup interface POV that nested levels have to
be aware of and reflect decisions made at higher levels for them to be
effective.

To enable such use cases and scale configurability for complex trees, this
patch implements a resource inheritance model for memory that is similar
to how the CPU and the IO controller implement work-conserving resource
allocations: a share of a resource allocated to a subree always applies to
the entire subtree recursively, while allowing, but not mandating,
children to further specify distribution rules.

That means that if protection is explicitly allocated among siblings,
those configured shares are being followed during page reclaim just like
they are now.  However, if the memory.low set at a higher level is not
fully claimed by the children in that subtree, the "floating" remainder is
applied to each cgroup in the tree in proportion to its size.  Since
reclaim pressure is applied in proportion to size as well, each child in
that tree gets the same boost, and the effect is neutral among siblings -
with respect to each other, they behave as if no memory control was
enabled at all, and the VM simply balances the memory demands optimally
within the subtree.  But collectively those cgroups enjoy a boost over the
cgroups in neighboring trees.

E.g.  a leaf cgroup with a memory.low setting of 0 no longer means that
it's not getting a share of the hierarchically assigned resource, just
that it doesn't claim a fixed amount of it to protect from its siblings.

This allows us to recursively protect one subtree (workload) from another
(system management), while letting subgroups compete freely among each
other - without having to assign fixed shares to each leaf, and without
nested groups having to echo higher-level settings.

The floating protection composes naturally with fixed protection.
Consider the following example tree:

		A            A: low = 2G
               / \          A1: low = 1G
              A1 A2         A2: low = 0G

As outside pressure is applied to this tree, A1 will enjoy a fixed
protection from A2 of 1G, but the remaining, unclaimed 1G from A is split
evenly among A1 and A2, coming out to 1.5G and 0.5G.

There is a slight risk of regressing theoretical setups where the
top-level cgroups don't know about the true budgeting and set bogusly high
"bypass" values that are meaningfully allocated down the tree.  Such
setups would rely on unclaimed protection to be discarded, and
distributing it would change the intended behavior.  Be safe and hide the
new behavior behind a mount option, 'memory_recursiveprot'.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Chris Down <chris@chrisdown.name>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Koutný <mkoutny@suse.com>
Link: http://lkml.kernel.org/r/20200227195606.46212-4-hannes@cmpxchg.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02 09:35:28 -07:00
Chen Yu
db96a75946 PM: sleep: Add pm_debug_messages kernel command line option
Debug messages from the system suspend/hibernation infrastructure
are disabled by default, and can only be enabled after the system
has boot up via /sys/power/pm_debug_messages.

This makes the hibernation resume hard to track as it involves system
boot up across hibernation.  There's no chance for software_resume()
to track the resume process, for example.

Add a kernel command line option to set pm_debug_messages during
boot up.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-02 15:29:56 +02:00
Linus Torvalds
69c1fd9726 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "My attempt to revitalize trivial queue I've been neglecting for years
  (what a disaster that was for this world, right? :) ) with patches
  collected from backlog that were still relevant and not applied
  elsewhere in the meantime"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  err.h: remove deprecated PTR_RET for good
  blk-mq: Fix typo in comment
  x86/boot: Fix comment spelling
  sh: mach-highlander: Fix comment spelling
  s390/dasd: Fix comment spelling
  mfd: wm8994: Fix comment spelling
  docs: Add reference in binfmt-misc.rst
  genirq: fix kerneldoc comment for irq_desc
  drm/amdgpu: fix two documentation mismatch issues
  HID: fix Kconfig word ordering
  list/hashtable: minor documentation corrections.
2020-04-01 14:52:59 -07:00
Randy Dunlap
5fd769c2bf ACPI: video: Docs update for "acpi_backlight" kernel parameter options
Update the Documentation for "acpi_backlight" by adding 2 new
options (native and none).

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-01 11:15:55 +02:00
Linus Torvalds
29d9f30d4c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Fix the iwlwifi regression, from Johannes Berg.

   2) Support BSS coloring and 802.11 encapsulation offloading in
      hardware, from John Crispin.

   3) Fix some potential Spectre issues in qtnfmac, from Sergey
      Matyukevich.

   4) Add TTL decrement action to openvswitch, from Matteo Croce.

   5) Allow paralleization through flow_action setup by not taking the
      RTNL mutex, from Vlad Buslov.

   6) A lot of zero-length array to flexible-array conversions, from
      Gustavo A. R. Silva.

   7) Align XDP statistics names across several drivers for consistency,
      from Lorenzo Bianconi.

   8) Add various pieces of infrastructure for offloading conntrack, and
      make use of it in mlx5 driver, from Paul Blakey.

   9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki.

  10) Lots of parallelization improvements during configuration changes
      in mlxsw driver, from Ido Schimmel.

  11) Add support to devlink for generic packet traps, which report
      packets dropped during ACL processing. And use them in mlxsw
      driver. From Jiri Pirko.

  12) Support bcmgenet on ACPI, from Jeremy Linton.

  13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei
      Starovoitov, and your's truly.

  14) Support XDP meta-data in virtio_net, from Yuya Kusakabe.

  15) Fix sysfs permissions when network devices change namespaces, from
      Christian Brauner.

  16) Add a flags element to ethtool_ops so that drivers can more simply
      indicate which coalescing parameters they actually support, and
      therefore the generic layer can validate the user's ethtool
      request. Use this in all drivers, from Jakub Kicinski.

  17) Offload FIFO qdisc in mlxsw, from Petr Machata.

  18) Support UDP sockets in sockmap, from Lorenz Bauer.

  19) Fix stretch ACK bugs in several TCP congestion control modules,
      from Pengcheng Yang.

  20) Support virtual functiosn in octeontx2 driver, from Tomasz
      Duszynski.

  21) Add region operations for devlink and use it in ice driver to dump
      NVM contents, from Jacob Keller.

  22) Add support for hw offload of MACSEC, from Antoine Tenart.

  23) Add support for BPF programs that can be attached to LSM hooks,
      from KP Singh.

  24) Support for multiple paths, path managers, and counters in MPTCP.
      From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti,
      and others.

  25) More progress on adding the netlink interface to ethtool, from
      Michal Kubecek"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits)
  net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline
  cxgb4/chcr: nic-tls stats in ethtool
  net: dsa: fix oops while probing Marvell DSA switches
  net/bpfilter: remove superfluous testing message
  net: macb: Fix handling of fixed-link node
  net: dsa: ksz: Select KSZ protocol tag
  netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write
  net: stmmac: add EHL 2.5Gbps PCI info and PCI ID
  net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID
  net: stmmac: create dwmac-intel.c to contain all Intel platform
  net: dsa: bcm_sf2: Support specifying VLAN tag egress rule
  net: dsa: bcm_sf2: Add support for matching VLAN TCI
  net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions
  net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT
  net: dsa: bcm_sf2: Disable learning for ASP port
  net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge
  net: dsa: b53: Prevent tagged VLAN on port 7 for 7278
  net: dsa: b53: Restore VLAN entries upon (re)configuration
  net: dsa: bcm_sf2: Fix overflow checks
  hv_netvsc: Remove unnecessary round_up for recv_completion_cnt
  ...
2020-03-31 17:29:33 -07:00
Linus Torvalds
b3aa112d57 selinux/stable-5.7 PR 20200330
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl6Ch6wUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPdcg/9FDMS/n0Xl1HQBUyu26EwLu3aUpNE
 BdghXW1LKSTp7MrOENE60PGzZSAiC07ci1DqFd7zfLPZf2q5IwPwOBa/Avy8z95V
 oHKqcMT6WO1SPOm/PxZn16FCKyET4gZDTXvHBAyiyFsbk36R522ZY615P9T3eLu/
 ZA1NFsSjj68SqMCUlAWfeqjcbQiX63bryEpugOIg0qWy7R/+rtWxj9TjriZ+v9tq
 uC45UcjBqphpmoPG8BifA3jjyclwO3DeQb5u7E8//HPPraGeB19ntsymUg7ljoGk
 NrqCkZtv6E+FRCDTR5f0O7M1T4BWJodxw2NwngnTwKByLC25EZaGx80o+VyMt0eT
 Pog+++JZaa5zZr2OYOtdlPVMLc2ALL6p/8lHOqFU3GKfIf04hWOm6/Lb2IWoXs3f
 CG2b6vzoXYyjbF0Q7kxadb8uBY2S1Ds+CVu2HMBBsXsPdwbbtFWOT/6aRAQu61qn
 PW+f47NR8By3SO6nMzWts2SZEERZNIEdSKeXHuR7As1jFMXrHLItreb4GCSPay5h
 2bzRpxt2br5CDLh7Jv2pZnHtUqBWOSbslTix77+Z/hPKaNowvD9v3tc5hX87rDmB
 dYXROD6/KoyXFYDcMdphlvORFhqGqd5bEYuHHum/VjSIRh237+/nxFY/vZ4i4bzU
 2fvpAmUlVX1c4rw=
 =LlWA
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20200330' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull SELinux updates from Paul Moore:
 "We've got twenty SELinux patches for the v5.7 merge window, the
  highlights are below:

   - Deprecate setting /sys/fs/selinux/checkreqprot to 1.

     This flag was originally created to deal with legacy userspace and
     the READ_IMPLIES_EXEC personality flag. We changed the default from
     1 to 0 back in Linux v4.4 and now we are taking the next step of
     deprecating it, at some point in the future we will take the final
     step of rejecting 1.

   - Allow kernfs symlinks to inherit the SELinux label of the parent
     directory. In order to preserve backwards compatibility this is
     protected by the genfs_seclabel_symlinks SELinux policy capability.

   - Optimize how we store filename transitions in the kernel, resulting
     in some significant improvements to policy load times.

   - Do a better job calculating our internal hash table sizes which
     resulted in additional policy load improvements and likely general
     SELinux performance improvements as well.

   - Remove the unused initial SIDs (labels) and improve how we handle
     initial SIDs.

   - Enable per-file labeling for the bpf filesystem.

   - Ensure that we properly label NFS v4.2 filesystems to avoid a
     temporary unlabeled condition.

   - Add some missing XFS quota command types to the SELinux quota
     access controls.

   - Fix a problem where we were not updating the seq_file position
     index correctly in selinuxfs.

   - We consolidate some duplicated code into helper functions.

   - A number of list to array conversions.

   - Update Stephen Smalley's email address in MAINTAINERS"

* tag 'selinux-pr-20200330' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: clean up indentation issue with assignment statement
  NFS: Ensure security label is set for root inode
  MAINTAINERS: Update my email address
  selinux: avtab_init() and cond_policydb_init() return void
  selinux: clean up error path in policydb_init()
  selinux: remove unused initial SIDs and improve handling
  selinux: reduce the use of hard-coded hash sizes
  selinux: Add xfs quota command types
  selinux: optimize storage of filename transitions
  selinux: factor out loop body from filename_trans_read()
  security: selinux: allow per-file labeling for bpffs
  selinux: generalize evaluate_cond_node()
  selinux: convert cond_expr to array
  selinux: convert cond_av_list to array
  selinux: convert cond_list to array
  selinux: sel_avc_get_stat_idx should increase position index
  selinux: allow kernfs symlinks to inherit parent directory context
  selinux: simplify evaluate_cond_node()
  Documentation,selinux: deprecate setting checkreqprot to 1
  selinux: move status variables out of selinux_ss
2020-03-31 15:07:55 -07:00
Linus Torvalds
42595ce90b Merge branch 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vmware updates from Ingo Molnar:
 "The main change in this tree is the addition of 'steal time clock
  support' for VMware guests"

* 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vmware: Use bool type for vmw_sched_clock
  x86/vmware: Enable steal time accounting
  x86/vmware: Add steal time clock support for VMware guests
  x86/vmware: Remove vmware_sched_clock_setup()
  x86/vmware: Make vmware_select_hypercall() __init
2020-03-31 12:09:51 -07:00
Paolo Bonzini
cf39d37539 KVM/arm updates for Linux 5.7
- GICv4.1 support
 - 32bit host removal
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl6DKKIPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDDe0P/30Oda6HJdcUY+g0dnHkH8N7t+VKjPPnihlX
 WBaT0Y4SzMsfAtG5lQqS48A50dXKWW70QvwkZjxu7abQhYFWGd2SGtTQxwqJXT8J
 I6MBh4r9xrIfiqzVT2BXslA6id5H6wCyyFI6vKm/IFkIu1J6JtwnKakQ0CIddS1d
 Blbgj5jcxGw+2xOppHCQXbWwwDdmYWkMZEBZjmhkezddqLDK+oaAUiUhHHHizTsB
 kLjgqYBVENpR1zDIsGpQAJloKXAiHfBQshQAmnhnBNzXE60LZ0n0/iODU9U5FDEO
 5j0DRWccKvsIMsUh7JpPr5xerGJ0rqk1IwPC2JcyzfRbvRLMpK1IOWfhI5Tg5lbP
 4Ev96QLEMBnKOWMSE0MqnMdq6JPzDLA6WZ28HZe2nc3/oWNgsSDtlXigx4xFFxTX
 zfc2YpAgFu3xJkPf8PtWTFvItm0AvFNFynPg0Rr/NsGf/FGeszYR4cLcHmv5NlWS
 IiV4+lgnlmr2LZr3VjUaumbtWIpuVF4Db5Al2K2E/PCN7ObfEkyCweDic8ophkH8
 sMS9TI38aH1Efy+I2Nfxxqpy8BcElZAMrAWt9R27A4JRLHdr7j5DsGnyRigXHgRe
 pFgbqtk/EjWkHwjaJVg8kPxf2+2P05VZsQeGG721nbKAIKDetM3RA2BflexdsptY
 kXplNsVr
 =eILh
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm updates for Linux 5.7

- GICv4.1 support
- 32bit host removal
2020-03-31 10:44:53 -04:00
Linus Torvalds
2853d5fafb Support for "split lock" detection:
- Atomic operations (lock prefixed instructions) which span two cache
     lines have to acquire the global bus lock. This is at least 1k cycles
     slower than an atomic operation within a cache line and disrupts
     performance on other cores. Aside of performance disruption this is
     a unpriviledged form of DoS.
 
     Some newer CPUs have the capability to raise an #AC trap when such an
     operation is attempted. The detection is by default enabled in warning
     mode which will warn once when a user space application is caught. A
     command line option allows to disable the detection or to select fatal
     mode which will terminate offending applications with SIGBUS.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl6B/uMTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYocsAD/9yqpw+XlPKNPsfbm9sbirBDfTrENcL
 F44iwn4WnrjoW/gnnZCYmPxJFsTtGVPqxHdUf4eyGemg9r9ZEO0DQftmUHC5Z6KX
 aa/b5JoeM61wp9HlpVlD4D1jVt4pWyQODQeZnUXE4DEzmRc3cD/5lSU+/VeaIwwz
 lxwUemqmXK7ucH2KA7smOGsl2nU6ED84q3mdOB1b4Cw+gWYMUnPJnuS/ipriBRx4
 BYbMItcxsFvtdO9Hx8PvGd5LUK0wW8JOWrYQICD2kLpZtHtGeaHpBzFzL0+nMU7d
 1epyDqJQDmX+PAzvj+EYyn3HTfobZlckn+tbxMQkkS+oDk1ywOZd+BancClvn5/5
 jMfPIQJF5bGASVnzGMWhzVdwthTZiMG4d1iKsUWOA/hN0ch0+rm1BqraToabsEFg
 Sv7/rvl9KtSOtMJTeAmMhlZUMBj9m8BtPFjniDwp6nw/upGgJdST5mrKFNYZvqOj
 JnXsEMr/nJVW6bnUvT6LF66xbHlzHdxtodkQWqF+IEsyRaOz1zAGpQamP98KxNLc
 dq/XYoEe1KqIFbg4BkNP+GeDL3FQDxjFNwPQnnjQEzWRbjkHlfmq1uKCsR2r8mBO
 fYNJ1X8lTyGV0kx/ERpWGazzabpzh+8Lr1yMhnoA3EWvlzUjmpN2PFI4oTpTrtzT
 c/q16SCxim3NWA==
 =D9x8
 -----END PGP SIGNATURE-----

Merge tag 'x86-splitlock-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 splitlock updates from Thomas Gleixner:
 "Support for 'split lock' detection:

  Atomic operations (lock prefixed instructions) which span two cache
  lines have to acquire the global bus lock. This is at least 1k cycles
  slower than an atomic operation within a cache line and disrupts
  performance on other cores. Aside of performance disruption this is a
  unpriviledged form of DoS.

  Some newer CPUs have the capability to raise an #AC trap when such an
  operation is attempted. The detection is by default enabled in warning
  mode which will warn once when a user space application is caught. A
  command line option allows to disable the detection or to select fatal
  mode which will terminate offending applications with SIGBUS"

* tag 'x86-splitlock-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR
  x86/split_lock: Rework the initialization flow of split lock detection
  x86/split_lock: Enable split lock detection by kernel
2020-03-30 19:35:52 -07:00
Linus Torvalds
642e53ead6 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Various NUMA scheduling updates: harmonize the load-balancer and
     NUMA placement logic to not work against each other. The intended
     result is better locality, better utilization and fewer migrations.

   - Introduce Thermal Pressure tracking and optimizations, to improve
     task placement on thermally overloaded systems.

   - Implement frequency invariant scheduler accounting on (some) x86
     CPUs. This is done by observing and sampling the 'recent' CPU
     frequency average at ~tick boundaries. The CPU provides this data
     via the APERF/MPERF MSRs. This hopefully makes our capacity
     estimates more precise and keeps tasks on the same CPU better even
     if it might seem overloaded at a lower momentary frequency. (As
     usual, turbo mode is a complication that we resolve by observing
     the maximum frequency and renormalizing to it.)

   - Add asymmetric CPU capacity wakeup scan to improve capacity
     utilization on asymmetric topologies. (big.LITTLE systems)

   - PSI fixes and optimizations.

   - RT scheduling capacity awareness fixes & improvements.

   - Optimize the CONFIG_RT_GROUP_SCHED constraints code.

   - Misc fixes, cleanups and optimizations - see the changelog for
     details"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits)
  threads: Update PID limit comment according to futex UAPI change
  sched/fair: Fix condition of avg_load calculation
  sched/rt: cpupri_find: Trigger a full search as fallback
  kthread: Do not preempt current task if it is going to call schedule()
  sched/fair: Improve spreading of utilization
  sched: Avoid scale real weight down to zero
  psi: Move PF_MEMSTALL out of task->flags
  MAINTAINERS: Add maintenance information for psi
  psi: Optimize switching tasks inside shared cgroups
  psi: Fix cpu.pressure for cpu.max and competing cgroups
  sched/core: Distribute tasks within affinity masks
  sched/fair: Fix enqueue_task_fair warning
  thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code
  sched/rt: Remove unnecessary push for unfit tasks
  sched/rt: Allow pulling unfitting task
  sched/rt: Optimize cpupri_find() on non-heterogenous systems
  sched/rt: Re-instate old behavior in select_task_rq_rt()
  sched/rt: cpupri_find: Implement fallback mechanism for !fit case
  sched/fair: Fix reordering of enqueue/dequeue_task_fair()
  sched/fair: Fix runnable_avg for throttled cfs
  ...
2020-03-30 17:01:51 -07:00
Linus Torvalds
7c4fa15071 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Make kfree_rcu() use kfree_bulk() for added performance

   - RCU updates

   - Callback-overload handling updates

   - Tasks-RCU KCSAN and sparse updates

   - Locking torture test and RCU torture test updates

   - Documentation updates

   - Miscellaneous fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
  rcu: Make rcu_barrier() account for offline no-CBs CPUs
  rcu: Mark rcu_state.gp_seq to detect concurrent writes
  Documentation/memory-barriers: Fix typos
  doc: Add rcutorture scripting to torture.txt
  doc/RCU/rcu: Use https instead of http if possible
  doc/RCU/rcu: Use absolute paths for non-rst files
  doc/RCU/rcu: Use ':ref:' for links to other docs
  doc/RCU/listRCU: Update example function name
  doc/RCU/listRCU: Fix typos in a example code snippets
  doc/RCU/Design: Remove remaining HTML tags in ReST files
  doc: Add some more RCU list patterns in the kernel
  rcutorture: Set KCSAN Kconfig options to detect more data races
  rcutorture: Manually clean up after rcu_barrier() failure
  rcutorture: Make rcu_torture_barrier_cbs() post from corresponding CPU
  rcuperf: Measure memory footprint during kfree_rcu() test
  rcutorture: Annotation lockless accesses to rcu_torture_current
  rcutorture: Add READ_ONCE() to rcu_torture_count and rcu_torture_batch
  rcutorture: Fix stray access to rcu_fwd_cb_nodelay
  rcutorture: Fix rcu_torture_one_read()/rcu_torture_writer() data race
  rcutorture: Make kvm-find-errors.sh abort on bad directory
  ...
2020-03-30 15:52:00 -07:00
Linus Torvalds
6d90508121 ACPI updates for 5.7-rc1
- Update the ACPICA code in the kernel to the 20200214 upstream
    release including:
 
    * Fix to re-enable the sleep button after wakeup (Anchal Agarwal).
    * Fixes for mistakes in comments and typos (Bob Moore).
    * ASL-ASL+ converter updates (Erik Kaneda).
    * Type casting cleanups (Sven Barth).
 
  - Clean up the intialization of the EC driver and eliminate some
    dead code from it (Rafael Wysocki).
 
  - Clean up the quirk tables in the AC and battery drivers (Hans de
    Goede).
 
  - Fix the global lock handling on x86 to ignore unspecified bit
    positions in the global lock field (Jan Engelhardt).
 
  - Add a new "tiny" driver for ACPI button devices exposed by VMs to
    guest kernels to send signals directly to init (Josh Triplett).
 
  - Add a kernel parameter to disable ACPI BGRT on x86 (Alex Hung).
 
  - Make the ACPI PCI host bridge and fan drivers use scnprintf() to
    avoid potential buffer overflows (Takashi Iwai).
 
  - Clean up assorted pieces of code:
 
    * Reorder "asmlinkage" to make g++ happy (Alexey Dobriyan).
    * Drop unneeded variable initialization (Colin Ian King).
    * Add missing __acquires/__releases annotations (Jules Irenge).
    * Replace list_for_each_safe() with list_for_each_entry_safe()
      (chenqiwu).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl6CCQASHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx15UQAISSZxFTq6huh9c3r0xEgddamhn7VOX+
 phjRuTmPzRn2RqFt7Q/ypiy5qqRgBko7oR0UyMJeHc7YPYcJ2nrRx/6Ymg46nmac
 mdIwTG3y1bH6cD/Fz8cM+9ZCtQl8iZRf36zvlY/8fNpk+Cj98et+x+wbUN8GMO9F
 9anHpPKk7hHCwxSN/SnyrJGJpjKdW057sv9sYwgR65XnM35dGxExQNjqtQVFk/ih
 N7TKVHUAlEE06liS0QYCeugsZsu5/GviU/1uy3qwg+Fxcxw7muHfG/impZwFhdjn
 QrdnFOGz9lFXzY+ynQplW0tJtt1AvOLJzQtzGVOxurTJIgz1pEJnptvDXFWP2YBX
 aESfuFt47bzi/NT1f31L3YQ3vuOJczwkS/QlDxv4TJh6rFdZFnQQNo+iIxBAlB6n
 xSsADFbZ3OaAU2VcjVn6WSL7iD3znnIBZp/xQIybb+9BUoDhSXCTH7rNT7p025cR
 g4KGAevlNDEVKIsZs3UHRQYpFQ+qHDM3WNiAiIEyF9cdenSXEMKrBnEYKSbV7DnI
 rBYexFTvjAyVEb6qnuaQDwHHKhu5Xc0JebIXeTjByg993Y8SFLll7a5d40H71S6Z
 /nG4mOa8+Qt6MqhwvkXLu/cxrXgNmnCG8W9RH0/2sQs25AMys9SESo1jsvEeCS2o
 tC2xCpKl2TlU
 =kQmH
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:

   - Update the ACPICA code in the kernel to the 20200214 upstream
     release including:

       * Fix to re-enable the sleep button after wakeup (Anchal
         Agarwal).

       * Fixes for mistakes in comments and typos (Bob Moore).

       * ASL-ASL+ converter updates (Erik Kaneda).

       * Type casting cleanups (Sven Barth).

   - Clean up the intialization of the EC driver and eliminate some dead
     code from it (Rafael Wysocki).

   - Clean up the quirk tables in the AC and battery drivers (Hans de
     Goede).

   - Fix the global lock handling on x86 to ignore unspecified bit
     positions in the global lock field (Jan Engelhardt).

   - Add a new "tiny" driver for ACPI button devices exposed by VMs to
     guest kernels to send signals directly to init (Josh Triplett).

   - Add a kernel parameter to disable ACPI BGRT on x86 (Alex Hung).

   - Make the ACPI PCI host bridge and fan drivers use scnprintf() to
     avoid potential buffer overflows (Takashi Iwai).

   - Clean up assorted pieces of code:

       * Reorder "asmlinkage" to make g++ happy (Alexey Dobriyan).

       * Drop unneeded variable initialization (Colin Ian King).

       * Add missing __acquires/__releases annotations (Jules Irenge).

       * Replace list_for_each_safe() with list_for_each_entry_safe()
         (chenqiwu)"

* tag 'acpi-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (31 commits)
  ACPICA: Update version to 20200214
  ACPI: PCI: Use scnprintf() for avoiding potential buffer overflow
  ACPI: fan: Use scnprintf() for avoiding potential buffer overflow
  ACPI: EC: Eliminate EC_FLAGS_QUERY_HANDSHAKE
  ACPI: EC: Do not clear boot_ec_is_ecdt in acpi_ec_add()
  ACPI: EC: Simplify acpi_ec_ecdt_start() and acpi_ec_init()
  ACPI: EC: Consolidate event handler installation code
  acpi/x86: ignore unspecified bit positions in the ACPI global lock field
  acpi/x86: add a kernel parameter to disable ACPI BGRT
  x86/acpi: make "asmlinkage" part first thing in the function definition
  ACPI: list_for_each_safe() -> list_for_each_entry_safe()
  ACPI: video: remove redundant assignments to variable result
  ACPI: OSL: Add missing __acquires/__releases annotations
  ACPI / battery: Cleanup Lenovo Ideapad Miix 320 DMI table entry
  ACPI / AC: Cleanup DMI quirk table
  ACPI: EC: Use fast path in acpi_ec_add() for DSDT boot EC
  ACPI: EC: Simplify acpi_ec_add()
  ACPI: EC: Drop AE_NOT_FOUND special case from ec_install_handlers()
  ACPI: EC: Avoid passing redundant argument to functions
  ACPI: EC: Avoid printing confusing messages in acpi_ec_setup()
  ...
2020-03-30 15:17:04 -07:00
Linus Torvalds
49835c15a5 Power management updates for 5.7-rc1
- Clean up and rework the PM QoS API to simplify the code and
    reduce the size of it (Rafael Wysocki).
 
  - Fix a suspend-to-idle wakeup regression on Dell XPS13 9370
    and similar platforms where the USB plug/unplug events are
    handled by the EC (Rafael Wysocki).
 
  - CLean up the intel_idle and PSCI cpuidle drivers (Rafael Wysocki,
    Ulf Hansson).
 
  - Extend the haltpoll cpuidle driver so that it can be forced to
    run on some systems where it refused to load (Maciej Szmigiero).
 
  - Convert several cpufreq documents to the .rst format and move the
    legacy driver documentation into one common file (Mauro Carvalho
    Chehab, Rafael Wysocki).
 
  - Update several cpufreq drivers:
 
    * Extend and fix the imx-cpufreq-dt driver (Anson Huang).
 
    * Improve the -EPROBE_DEFER handling and fix unwanted CPU
      overclocking on i.MX6ULL in imx6q-cpufreq (Anson Huang,
      Christoph Niedermaier).
 
    * Add support for Krait based SoCs to the qcom driver (Ansuel
      Smith).
 
    * Add support for OPP_PLUS to ti-cpufreq (Lokesh Vutla).
 
    * Add platform specific intermediate callbacks support to
      cpufreq-dt and update the imx6q driver (Peng Fan).
 
    * Simplify and consolidate some pieces of the intel_pstate driver
      and update its documentation (Rafael Wysocki, Alex Hung).
 
  - Fix several devfreq issues:
 
    * Remove unneeded extern keyword from a devfreq header file
      and use the DEVFREQ_GOV_UPDATE_INTERNAL event name instead of
      DEVFREQ_GOV_INTERNAL (Chanwoo Choi).
 
    * Fix the handling of dev_pm_qos_remove_request() result (Leonard
      Crestez).
 
    * Use constant name for userspace governor (Pierre Kuo).
 
    * Get rid of doc warnings and fix a typo (Christophe JAILLET).
 
  - Use built-in RCU list checking in some places in the PM core to
    avoid false-positive RCU usage warnings (Madhuparna Bhowmik).
 
  - Add explicit READ_ONCE()/WRITE_ONCE() annotations to low-level
    PM QoS routines (Qian Cai).
 
  - Fix removal of wakeup sources to avoid NULL pointer dereferences
    in a corner case (Neeraj Upadhyay).
 
  - Clean up the handling of hibernate compat ioctls and fix the
    related documentation (Eric Biggers).
 
  - Update the idle_inject power capping driver to use variable-length
    arrays instead of zero-length arrays (Gustavo Silva).
 
  - Fix list format in a PM QoS document (Randy Dunlap).
 
  - Make the cpufreq stats module use scnprintf() to avoid potential
    buffer overflows (Takashi Iwai).
 
  - Add pm_runtime_get_if_active() to PM-runtime API (Sakari Ailus).
 
  - Allow no domain-idle-states DT property in generic PM domains (Ulf
    Hansson).
 
  - Fix a broken y-axis scale in the intel_pstate_tracer utility (Doug
    Smythies).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl6B/YkSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxEjIP/jXoO1pAxq7BMx7naZnZL7pzemJfAGR7
 HVnRLDo0IlsSwI7Jvuy13a0eI+EcGPA6pRo5qnBM4TZCIFsHoO5Yle47ndNGsi8r
 Jd3T89oT3I+fXI4KTfWO0n+K/F6mv8/CTZDz/E7Z6zirpFxyyZQxgIsAT76RcZom
 xhWna9vygOlBnFsQaAeph+GzoXBWIylaMZfylUeT3v4c4DLH6FzcbnINPkgJsZCw
 Ayt1bmE0L9yiqCizEto91eaDObkxTHVFGr2OVNa/Y/SVW+VUThUJrXqV28opQxPZ
 h4TiQorpTX1CwMmiXZwmoeqqsiVXrm0KyhK0lwc5tZ9FnZWiW4qjJ487Eu6TjOmh
 gecT+M2Yexy0BvUGN0wIdaCLtfmf2Hjxk0trxM2blAh3uoFjf3UJ9SLNkRjlu2/b
 QqWmIRRPljD5fEUid5lVV4EAXuITUzWMJeia+FiAsgx1SF3pZPar80f+FGrYfaJN
 wL2BTwBx1aXpPpAkEX0kM9Rkf6oJsFATR3p7DNzyZ1bMrQUxiToWRlQBID5H6G4v
 /kAkSTQjNQVwkkylUzTLOlcmL56sCvc0YPdybH62OsLXs9K4gyC8v6tEdtdA5qtw
 0Up9DrYbNKKv6GrSXf8eyk2Q2CEqfRXHv2ACNnkLRXZ6fWnFiTfMgNj7zqtrfna7
 tJBvrV9/ACXE
 =cBQd
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These clean up and rework the PM QoS API, address a suspend-to-idle
  wakeup regression on some ACPI-based platforms, clean up and extend a
  few cpuidle drivers, update multiple cpufreq drivers and cpufreq
  documentation, and fix a number of issues in devfreq and several other
  things all over.

  Specifics:

   - Clean up and rework the PM QoS API to simplify the code and reduce
     the size of it (Rafael Wysocki).

   - Fix a suspend-to-idle wakeup regression on Dell XPS13 9370 and
     similar platforms where the USB plug/unplug events are handled by
     the EC (Rafael Wysocki).

   - CLean up the intel_idle and PSCI cpuidle drivers (Rafael Wysocki,
     Ulf Hansson).

   - Extend the haltpoll cpuidle driver so that it can be forced to run
     on some systems where it refused to load (Maciej Szmigiero).

   - Convert several cpufreq documents to the .rst format and move the
     legacy driver documentation into one common file (Mauro Carvalho
     Chehab, Rafael Wysocki).

   - Update several cpufreq drivers:

        * Extend and fix the imx-cpufreq-dt driver (Anson Huang).

        * Improve the -EPROBE_DEFER handling and fix unwanted CPU
          overclocking on i.MX6ULL in imx6q-cpufreq (Anson Huang,
          Christoph Niedermaier).

        * Add support for Krait based SoCs to the qcom driver (Ansuel
          Smith).

        * Add support for OPP_PLUS to ti-cpufreq (Lokesh Vutla).

        * Add platform specific intermediate callbacks support to
          cpufreq-dt and update the imx6q driver (Peng Fan).

        * Simplify and consolidate some pieces of the intel_pstate
          driver and update its documentation (Rafael Wysocki, Alex
          Hung).

   - Fix several devfreq issues:

        * Remove unneeded extern keyword from a devfreq header file and
          use the DEVFREQ_GOV_UPDATE_INTERNAL event name instead of
          DEVFREQ_GOV_INTERNAL (Chanwoo Choi).

        * Fix the handling of dev_pm_qos_remove_request() result
          (Leonard Crestez).

        * Use constant name for userspace governor (Pierre Kuo).

        * Get rid of doc warnings and fix a typo (Christophe JAILLET).

   - Use built-in RCU list checking in some places in the PM core to
     avoid false-positive RCU usage warnings (Madhuparna Bhowmik).

   - Add explicit READ_ONCE()/WRITE_ONCE() annotations to low-level PM
     QoS routines (Qian Cai).

   - Fix removal of wakeup sources to avoid NULL pointer dereferences in
     a corner case (Neeraj Upadhyay).

   - Clean up the handling of hibernate compat ioctls and fix the
     related documentation (Eric Biggers).

   - Update the idle_inject power capping driver to use variable-length
     arrays instead of zero-length arrays (Gustavo Silva).

   - Fix list format in a PM QoS document (Randy Dunlap).

   - Make the cpufreq stats module use scnprintf() to avoid potential
     buffer overflows (Takashi Iwai).

   - Add pm_runtime_get_if_active() to PM-runtime API (Sakari Ailus).

   - Allow no domain-idle-states DT property in generic PM domains (Ulf
     Hansson).

   - Fix a broken y-axis scale in the intel_pstate_tracer utility (Doug
     Smythies)"

* tag 'pm-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (78 commits)
  cpufreq: intel_pstate: Simplify intel_pstate_cpu_init()
  tools/power/x86/intel_pstate_tracer: fix a broken y-axis scale
  ACPI: PM: s2idle: Refine active GPEs check
  ACPICA: Allow acpi_any_gpe_status_set() to skip one GPE
  PM: sleep: wakeup: Skip wakeup_source_sysfs_remove() if device is not there
  PM / devfreq: Get rid of some doc warnings
  PM / devfreq: Fix handling dev_pm_qos_remove_request result
  PM / devfreq: Fix a typo in a comment
  PM / devfreq: Change to DEVFREQ_GOV_UPDATE_INTERVAL event name
  PM / devfreq: Remove unneeded extern keyword
  PM / devfreq: Use constant name of userspace governor
  ACPI: PM: s2idle: Fix comment in acpi_s2idle_prepare_late()
  cpufreq: qcom: Add support for krait based socs
  cpufreq: imx6q-cpufreq: Improve the logic of -EPROBE_DEFER handling
  cpufreq: Use scnprintf() for avoiding potential buffer overflow
  cpuidle: psci: Split psci_dt_cpu_init_idle()
  PM / Domains: Allow no domain-idle-states DT property in genpd when parsing
  PM / hibernate: Remove unnecessary compat ioctl overrides
  PM: hibernate: fix docs for ioctls that return loff_t via pointer
  Documentation: intel_pstate: update links for references
  ...
2020-03-30 15:05:01 -07:00
Linus Torvalds
59838093be Driver core patches for 5.7-rc1
Here is the "big" set of driver core changes for 5.7-rc1.
 
 Nothing huge in here, just lots of little firmware core changes and use
 of new apis, a libfs fix, a debugfs api change, and some driver core
 deferred probe rework.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXoHLIg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yle2ACgjJJzRJl9Ckae3ms+9CS4OSFFZPsAoKSrXmFc
 Z7goYQdZo1zz8c0RYDrJ
 =Y91m
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the "big" set of driver core changes for 5.7-rc1.

  Nothing huge in here, just lots of little firmware core changes and
  use of new apis, a libfs fix, a debugfs api change, and some driver
  core deferred probe rework.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (44 commits)
  Revert "driver core: Set fw_devlink to "permissive" behavior by default"
  driver core: Set fw_devlink to "permissive" behavior by default
  driver core: Replace open-coded list_last_entry()
  driver core: Read atomic counter once in driver_probe_done()
  libfs: fix infoleak in simple_attr_read()
  driver core: Add device links from fwnode only for the primary device
  platform/x86: touchscreen_dmi: Add info for the Chuwi Vi8 Plus tablet
  platform/x86: touchscreen_dmi: Add EFI embedded firmware info support
  Input: icn8505 - Switch to firmware_request_platform for retreiving the fw
  Input: silead - Switch to firmware_request_platform for retreiving the fw
  selftests: firmware: Add firmware_request_platform tests
  test_firmware: add support for firmware_request_platform
  firmware: Add new platform fallback mechanism and firmware_request_platform()
  Revert "drivers: base: power: wakeup.c: Use built-in RCU list checking"
  drivers: base: power: wakeup.c: Use built-in RCU list checking
  component: allow missing unbind callback
  debugfs: remove return value of debugfs_create_file_size()
  debugfs: Check module state before warning in {full/open}_proxy_open()
  firmware: fix a double abort case with fw_load_sysfs_fallback
  arch_topology: Fix putting invalid cpu clk
  ...
2020-03-30 13:59:52 -07:00
Linus Torvalds
481ed297d9 This has been a busy cycle for documentation work. Highlights include:
- Lots of RST conversion work by Mauro, Daniel ALmeida, and others.
     Maybe someday we'll get to the end of this stuff...maybe...
 
   - Some organizational work to bring some order to the core-api manual.
 
   - Various new docs and additions to the existing documentation.
 
   - Typo fixes, warning fixes, ...
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl6BLf4PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YLhkIAIhcg6gxp0oZZ3KDfQyhvej0EWQGVDNkmloQ
 O1VOSV3RJsZL9HwN9xSNnNfN5+hw5RUYVbn1s201uj6kovZY9qcTpHP2LCizUeGb
 eFkSTmzkyAuAbJjuVLgMPDerJPEew0HnudiToeSpQeoIL1WB6YGd4/5H/cN1KLex
 8ggjllcY0wOgbiFffmK6+tavDv7vT0lKTdwKRYh2nxu7zrPVVd1ZnW+RtntdTVQt
 i+xwV6/YdWtg5C53IwBPpeyubX40vqaIjU8rzpLq5SCVbsZN14sSR709m1AYCOK0
 i4VDWEhfA2XBi6Nycl5U0czuGziwoHrTgSCkS1mmSDujnpgfKM8=
 =6YOS
 -----END PGP SIGNATURE-----

Merge tag 'docs-5.7' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "This has been a busy cycle for documentation work.

  Highlights include:

   - Lots of RST conversion work by Mauro, Daniel ALmeida, and others.
     Maybe someday we'll get to the end of this stuff...maybe...

   - Some organizational work to bring some order to the core-api
     manual.

   - Various new docs and additions to the existing documentation.

   - Typo fixes, warning fixes, ..."

* tag 'docs-5.7' of git://git.lwn.net/linux: (123 commits)
  Documentation: x86: exception-tables: document CONFIG_BUILDTIME_TABLE_SORT
  MAINTAINERS: adjust to filesystem doc ReST conversion
  docs: deprecated.rst: Add BUG()-family
  doc: zh_CN: add translation for virtiofs
  doc: zh_CN: index files in filesystems subdirectory
  docs: locking: Drop :c:func: throughout
  docs: locking: Add 'need' to hardirq section
  docs: conf.py: avoid thousands of duplicate label warning on Sphinx
  docs: prevent warnings due to autosectionlabel
  docs: fix reference to core-api/namespaces.rst
  docs: fix pointers to io-mapping.rst and io_ordering.rst files
  Documentation: Better document the softlockup_panic sysctl
  docs: hw-vuln: tsx_async_abort.rst: get rid of an unused ref
  docs: perf: imx-ddr.rst: get rid of a warning
  docs: filesystems: fuse.rst: supress a Sphinx warning
  docs: translations: it: avoid duplicate refs at programming-language.rst
  docs: driver.rst: supress two ReSt warnings
  docs: trace: events.rst: convert some new stuff to ReST format
  Documentation: Add io_ordering.rst to driver-api manual
  Documentation: Add io-mapping.rst to driver-api manual
  ...
2020-03-30 12:45:23 -07:00
Rafael J. Wysocki
2409000a0c Merge branches 'pm-devfreq', 'powercap' and 'pm-docs'
* pm-devfreq:
  PM / devfreq: Get rid of some doc warnings
  PM / devfreq: Fix handling dev_pm_qos_remove_request result
  PM / devfreq: Fix a typo in a comment
  PM / devfreq: Change to DEVFREQ_GOV_UPDATE_INTERVAL event name
  PM / devfreq: Remove unneeded extern keyword
  PM / devfreq: Use constant name of userspace governor

* powercap:
  powercap: idle_inject: Replace zero-length array with flexible-array member

* pm-docs:
  docs: cpu-freq: convert cpufreq-stats.txt to ReST
  docs: cpu-freq: convert cpu-drivers.txt to ReST
  docs: cpu-freq: convert core.txt to ReST
  docs: cpu-freq: convert index.txt to ReST
  docs: cpufreq: fix a broken reference
  Documentation: cpufreq: Move legacy driver documentation
2020-03-30 14:48:31 +02:00
Rafael J. Wysocki
0411f0d10e Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: intel_pstate: Simplify intel_pstate_cpu_init()
  cpufreq: qcom: Add support for krait based socs
  cpufreq: imx6q-cpufreq: Improve the logic of -EPROBE_DEFER handling
  cpufreq: Use scnprintf() for avoiding potential buffer overflow
  Documentation: intel_pstate: update links for references
  cpufreq: intel_pstate: Consolidate policy verification
  cpufreq: dt: Allow platform specific intermediate callbacks
  cpufreq: imx-cpufreq-dt: Correct i.MX8MP's market segment fuse location
  cpufreq: imx6q: read OCOTP through nvmem for imx6q
  cpufreq: imx6q: fix error handling
  cpufreq: imx-cpufreq-dt: Add "cpu-supply" property check
  cpufreq: ti-cpufreq: Add support for OPP_PLUS
  cpufreq: imx6q: Fixes unwanted cpu overclocking on i.MX6ULL
2020-03-30 14:46:27 +02:00
Rafael J. Wysocki
8f1073ed8c Merge branch 'pm-qos'
* pm-qos: (30 commits)
  PM: QoS: annotate data races in pm_qos_*_value()
  Documentation: power: fix pm_qos_interface.rst format warning
  PM: QoS: Make CPU latency QoS depend on CONFIG_CPU_IDLE
  Documentation: PM: QoS: Update to reflect previous code changes
  PM: QoS: Update file information comments
  PM: QoS: Drop PM_QOS_CPU_DMA_LATENCY and rename related functions
  sound: Call cpu_latency_qos_*() instead of pm_qos_*()
  drivers: usb: Call cpu_latency_qos_*() instead of pm_qos_*()
  drivers: tty: Call cpu_latency_qos_*() instead of pm_qos_*()
  drivers: spi: Call cpu_latency_qos_*() instead of pm_qos_*()
  drivers: net: Call cpu_latency_qos_*() instead of pm_qos_*()
  drivers: mmc: Call cpu_latency_qos_*() instead of pm_qos_*()
  drivers: media: Call cpu_latency_qos_*() instead of pm_qos_*()
  drivers: hsi: Call cpu_latency_qos_*() instead of pm_qos_*()
  drm: i915: Call cpu_latency_qos_*() instead of pm_qos_*()
  x86: platform: iosf_mbi: Call cpu_latency_qos_*() instead of pm_qos_*()
  cpuidle: Call cpu_latency_qos_limit() instead of pm_qos_request()
  PM: QoS: Add CPU latency QoS API wrappers
  PM: QoS: Adjust pm_qos_request() signature and reorder pm_qos.h
  PM: QoS: Simplify definitions of CPU latency QoS trace events
  ...
2020-03-30 14:45:57 +02:00
Konstantin Khlebnikov
2b8bd42361 block/diskstats: more accurate approximation of io_ticks for slow disks
Currently io_ticks is approximated by adding one at each start and end of
requests if jiffies counter has changed. This works perfectly for requests
shorter than a jiffy or if one of requests starts/ends at each jiffy.

If disk executes just one request at a time and they are longer than two
jiffies then only first and last jiffies will be accounted.

Fix is simple: at the end of request add up into io_ticks jiffies passed
since last update rather than just one jiffy.

Example: common HDD executes random read 4k requests around 12ms.

fio --name=test --filename=/dev/sdb --rw=randread --direct=1 --runtime=30 &
iostat -x 10 sdb

Note changes of iostat's "%util" 8,43% -> 99,99% before/after patch:

Before:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0,00     0,00   82,60    0,00   330,40     0,00     8,00     0,96   12,09   12,09    0,00   1,02   8,43

After:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0,00     0,00   82,50    0,00   330,00     0,00     8,00     1,00   12,10   12,10    0,00  12,12  99,99

Now io_ticks does not loose time between start and end of requests, but
for queue-depth > 1 some I/O time between adjacent starts might be lost.

For load estimation "%util" is not as useful as average queue length,
but it clearly shows how often disk queue is completely empty.

Fixes: 5b18b5a737 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-25 08:49:08 -06:00
Ingo Molnar
baf5fe7618 Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU changes from Paul E. McKenney:

 - Make kfree_rcu() use kfree_bulk() for added performance
 - RCU updates
 - Callback-overload handling updates
 - Tasks-RCU KCSAN and sparse updates
 - Locking torture test and RCU torture test updates
 - Documentation updates
 - Miscellaneous fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-03-24 10:10:09 +01:00
Alexey Makhalov
e73a8f38f8 x86/vmware: Enable steal time accounting
Set paravirt_steal_rq_enabled if steal clock present.
paravirt_steal_rq_enabled is used in sched/core.c to adjust task
progress by offsetting stolen time. Use 'no-steal-acc' off switch (share
same name with KVM) to disable steal time accounting.

Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200323195707.31242-5-amakhalov@vmware.com
2020-03-24 10:06:27 +01:00
Paul E. McKenney
aa93ec620b Merge branches 'doc.2020.02.27a', 'fixes.2020.03.21a', 'kfree_rcu.2020.02.20a', 'locktorture.2020.02.20a', 'ovld.2020.02.20a', 'rcu-tasks.2020.02.20a', 'srcu.2020.02.20a' and 'torture.2020.02.20a' into HEAD
doc.2020.02.27a: Documentation updates.
fixes.2020.03.21a: Miscellaneous fixes.
kfree_rcu.2020.02.20a: Updates to kfree_rcu().
locktorture.2020.02.20a: Lock torture-test updates.
ovld.2020.02.20a: Updates to callback-overload handling.
rcu-tasks.2020.02.20a: RCU-tasks updates.
srcu.2020.02.20a: SRCU updates.
torture.2020.02.20a: Torture-test updates.
2020-03-21 17:15:11 -07:00
Romuald Brunet
a7d47e59e7 docs: Add reference in binfmt-misc.rst
While looking at the binfmt-misc documentation I noticed that there is a
reference to the “java” example that could be a direct link.

This patch simply adds the link without changing the HTML output.

Signed-off-by: Romuald Brunet <romuald@chivil.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-03-17 20:41:00 +01:00
Alex Hung
c1f59a3782 Documentation: intel_pstate: update links for references
URLs for presentation and Intel Software Developer’s Manual are updated
as they were using "http" which are gradually replaced by "https".

Signed-off-by: Alex Hung <alex.hung@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-03-14 11:46:02 +01:00
Alex Hung
1ffb8d032d acpi/x86: add a kernel parameter to disable ACPI BGRT
BGRT is for displaying seamless OEM logo from booting to login screen;
however, this mechanism does not always work well on all configurations
and the OEM logo can be displayed multiple times. This looks worse than
without BGRT enabled.

This patch adds a kernel parameter to disable BGRT in boot time. This is
easier than re-compiling a kernel with CONFIG_ACPI_BGRT disabled.

Signed-off-by: Alex Hung <alex.hung@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-03-14 10:36:49 +01:00