Commit Graph

1455 Commits

Author SHA1 Message Date
Colin Ian King
773aaadcd4 xen/pvcalls: remove redundant check for irq >= 0
This is a moot point, but irq is always less than zero at the label
out_error, so the check for irq >= 0 is redundant and can be removed.

Detected by CoverityScan, CID#1460371 ("Logically dead code")

Fixes: cb1c7d9bbc ("xen/pvcalls: implement connect command")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-03 11:38:45 -04:00
Colin Ian King
95110ac88d xen/pvcalls: fix unsigned less than zero error check
The check on bedata->ref is never true because ref is an unsigned
integer. Fix this by assigning signed int ret to the return of the
call to gnttab_claim_grant_reference so the -ve return can be checked.

Detected by CoverityScan, CID#1460358 ("Unsigned compared against 0")

Fixes: 2196819099 ("xen/pvcalls: connect to the backend")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-03 11:38:27 -04:00
Gustavo A. R. Silva
3d8765d4f5 xen/pvcalls-front: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Notice that in this particular case I placed the "fall through" comment
on its own line, which is what GCC is expecting to find.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-03 11:37:29 -04:00
Gustavo A. R. Silva
5fa916f7ac xen: xenbus_probe_frontend: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 146562
Addresses-Coverity-ID: 146563
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-03 11:36:07 -04:00
Dongli Zhang
5e25f5db6a xen/time: do not decrease steal time after live migration on xen
After guest live migration on xen, steal time in /proc/stat
(cpustat[CPUTIME_STEAL]) might decrease because steal returned by
xen_steal_lock() might be less than this_rq()->prev_steal_time which is
derived from previous return value of xen_steal_clock().

For instance, steal time of each vcpu is 335 before live migration.

cpu  198 0 368 200064 1962 0 0 1340 0 0
cpu0 38 0 81 50063 492 0 0 335 0 0
cpu1 65 0 97 49763 634 0 0 335 0 0
cpu2 38 0 81 50098 462 0 0 335 0 0
cpu3 56 0 107 50138 374 0 0 335 0 0

After live migration, steal time is reduced to 312.

cpu  200 0 370 200330 1971 0 0 1248 0 0
cpu0 38 0 82 50123 500 0 0 312 0 0
cpu1 65 0 97 49832 634 0 0 312 0 0
cpu2 39 0 82 50167 462 0 0 312 0 0
cpu3 56 0 107 50207 374 0 0 312 0 0

Since runstate times are cumulative and cleared during xen live migration
by xen hypervisor, the idea of this patch is to accumulate runstate times
to global percpu variables before live migration suspend. Once guest VM is
resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new
runstate times and previously accumulated times stored in global percpu
variables.

Comment above HYPERVISOR_suspend() has been removed as it is inaccurate:
the call can return an error code (e.g., possibly -EPERM in the future).

Similar and more severe issue would impact prior linux 4.8-4.10 as
discussed by Michael Las at
https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
which would overflow steal time and lead to 100% st usage in top command
for linux 4.8-4.10. A backport of this patch would fix that issue.

[boris: added linux/slab.h to driver/xen/time.c, slightly reformatted
        commit message]

References: https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-02 16:49:41 -04:00
Linus Torvalds
ead751507d License cleanup: add SPDX license identifiers to some files
Many source files in the tree are missing licensing information, which
 makes it harder for compliance tools to determine the correct license.
 
 By default all files without license information are under the default
 license of the kernel, which is GPL version 2.
 
 Update the files which contain no license information with the 'GPL-2.0'
 SPDX license identifier.  The SPDX identifier is a legally binding
 shorthand, which can be used instead of the full boiler plate text.
 
 This patch is based on work done by Thomas Gleixner and Kate Stewart and
 Philippe Ombredanne.
 
 How this work was done:
 
 Patches were generated and checked against linux-4.14-rc6 for a subset of
 the use cases:
  - file had no licensing information it it.
  - file was a */uapi/* one with no licensing information in it,
  - file was a */uapi/* one with existing licensing information,
 
 Further patches will be generated in subsequent months to fix up cases
 where non-standard license headers were used, and references to license
 had to be inferred by heuristics based on keywords.
 
 The analysis to determine which SPDX License Identifier to be applied to
 a file was done in a spreadsheet of side by side results from of the
 output of two independent scanners (ScanCode & Windriver) producing SPDX
 tag:value files created by Philippe Ombredanne.  Philippe prepared the
 base worksheet, and did an initial spot review of a few 1000 files.
 
 The 4.13 kernel was the starting point of the analysis with 60,537 files
 assessed.  Kate Stewart did a file by file comparison of the scanner
 results in the spreadsheet to determine which SPDX license identifier(s)
 to be applied to the file. She confirmed any determination that was not
 immediately clear with lawyers working with the Linux Foundation.
 
 Criteria used to select files for SPDX license identifier tagging was:
  - Files considered eligible had to be source code files.
  - Make and config files were included as candidates if they contained >5
    lines of source
  - File already had some variant of a license header in it (even if <5
    lines).
 
 All documentation files were explicitly excluded.
 
 The following heuristics were used to determine which SPDX license
 identifiers to apply.
 
  - when both scanners couldn't find any license traces, file was
    considered to have no license information in it, and the top level
    COPYING file license applied.
 
    For non */uapi/* files that summary was:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|-------
    GPL-2.0                                              11139
 
    and resulted in the first patch in this series.
 
    If that file was a */uapi/* path one, it was "GPL-2.0 WITH
    Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note                        930
 
    and resulted in the second patch in this series.
 
  - if a file had some form of licensing information in it, and was one
    of the */uapi/* ones, it was denoted with the Linux-syscall-note if
    any GPL family license was found in the file or had no licensing in
    it (per prior point).  Results summary:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|------
    GPL-2.0 WITH Linux-syscall-note                       270
    GPL-2.0+ WITH Linux-syscall-note                      169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
    LGPL-2.1+ WITH Linux-syscall-note                      15
    GPL-1.0+ WITH Linux-syscall-note                       14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
    LGPL-2.0+ WITH Linux-syscall-note                       4
    LGPL-2.1 WITH Linux-syscall-note                        3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
 
    and that resulted in the third patch in this series.
 
  - when the two scanners agreed on the detected license(s), that became
    the concluded license(s).
 
  - when there was disagreement between the two scanners (one detected a
    license but the other didn't, or they both detected different
    licenses) a manual inspection of the file occurred.
 
  - In most cases a manual inspection of the information in the file
    resulted in a clear resolution of the license that should apply (and
    which scanner probably needed to revisit its heuristics).
 
  - When it was not immediately clear, the license identifier was
    confirmed with lawyers working with the Linux Foundation.
 
  - If there was any question as to the appropriate license identifier,
    the file was flagged for further research and to be revisited later
    in time.
 
 In total, over 70 hours of logged manual review was done on the
 spreadsheet to determine the SPDX license identifiers to apply to the
 source files by Kate, Philippe, Thomas and, in some cases, confirmation
 by lawyers working with the Linux Foundation.
 
 Kate also obtained a third independent scan of the 4.13 code base from
 FOSSology, and compared selected files where the other two scanners
 disagreed against that SPDX file, to see if there was new insights.  The
 Windriver scanner is based on an older version of FOSSology in part, so
 they are related.
 
 Thomas did random spot checks in about 500 files from the spreadsheets
 for the uapi headers and agreed with SPDX license identifier in the
 files he inspected. For the non-uapi files Thomas did random spot checks
 in about 15000 files.
 
 In initial set of patches against 4.14-rc6, 3 files were found to have
 copy/paste license identifier errors, and have been fixed to reflect the
 correct identifier.
 
 Additionally Philippe spent 10 hours this week doing a detailed manual
 inspection and review of the 12,461 patched files from the initial patch
 version early this week with:
  - a full scancode scan run, collecting the matched texts, detected
    license ids and scores
  - reviewing anything where there was a license detected (about 500+
    files) to ensure that the applied SPDX license was correct
  - reviewing anything where there was no detection but the patch license
    was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
    SPDX license was correct
 
 This produced a worksheet with 20 files needing minor correction.  This
 worksheet was then exported into 3 different .csv files for the
 different types of files to be modified.
 
 These .csv files were then reviewed by Greg.  Thomas wrote a script to
 parse the csv files and add the proper SPDX tag to the file, in the
 format that the file expected.  This script was further refined by Greg
 based on the output to detect more types of files automatically and to
 distinguish between header and source .c files (which need different
 comment types.)  Finally Greg ran the script using the .csv files to
 generate the patches.
 
 Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
 Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
 Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWfswbQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykvEwCfXU1MuYFQGgMdDmAZXEc+xFXZvqgAoKEcHDNA
 6dVh26uchcEQLN/XqUDt
 =x306
 -----END PGP SIGNATURE-----

Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull initial SPDX identifiers from Greg KH:
 "License cleanup: add SPDX license identifiers to some files

  Many source files in the tree are missing licensing information, which
  makes it harder for compliance tools to determine the correct license.

  By default all files without license information are under the default
  license of the kernel, which is GPL version 2.

  Update the files which contain no license information with the
  'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
  binding shorthand, which can be used instead of the full boiler plate
  text.

  This patch is based on work done by Thomas Gleixner and Kate Stewart
  and Philippe Ombredanne.

  How this work was done:

  Patches were generated and checked against linux-4.14-rc6 for a subset
  of the use cases:

   - file had no licensing information it it.

   - file was a */uapi/* one with no licensing information in it,

   - file was a */uapi/* one with existing licensing information,

  Further patches will be generated in subsequent months to fix up cases
  where non-standard license headers were used, and references to
  license had to be inferred by heuristics based on keywords.

  The analysis to determine which SPDX License Identifier to be applied
  to a file was done in a spreadsheet of side by side results from of
  the output of two independent scanners (ScanCode & Windriver)
  producing SPDX tag:value files created by Philippe Ombredanne.
  Philippe prepared the base worksheet, and did an initial spot review
  of a few 1000 files.

  The 4.13 kernel was the starting point of the analysis with 60,537
  files assessed. Kate Stewart did a file by file comparison of the
  scanner results in the spreadsheet to determine which SPDX license
  identifier(s) to be applied to the file. She confirmed any
  determination that was not immediately clear with lawyers working with
  the Linux Foundation.

  Criteria used to select files for SPDX license identifier tagging was:

   - Files considered eligible had to be source code files.

   - Make and config files were included as candidates if they contained
     >5 lines of source

   - File already had some variant of a license header in it (even if <5
     lines).

  All documentation files were explicitly excluded.

  The following heuristics were used to determine which SPDX license
  identifiers to apply.

   - when both scanners couldn't find any license traces, file was
     considered to have no license information in it, and the top level
     COPYING file license applied.

     For non */uapi/* files that summary was:

       SPDX license identifier                            # files
       ---------------------------------------------------|-------
       GPL-2.0                                              11139

     and resulted in the first patch in this series.

     If that file was a */uapi/* path one, it was "GPL-2.0 WITH
     Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
     was:

       SPDX license identifier                            # files
       ---------------------------------------------------|-------
       GPL-2.0 WITH Linux-syscall-note                        930

     and resulted in the second patch in this series.

   - if a file had some form of licensing information in it, and was one
     of the */uapi/* ones, it was denoted with the Linux-syscall-note if
     any GPL family license was found in the file or had no licensing in
     it (per prior point). Results summary:

       SPDX license identifier                            # files
       ---------------------------------------------------|------
       GPL-2.0 WITH Linux-syscall-note                       270
       GPL-2.0+ WITH Linux-syscall-note                      169
       ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
       ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
       LGPL-2.1+ WITH Linux-syscall-note                      15
       GPL-1.0+ WITH Linux-syscall-note                       14
       ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
       LGPL-2.0+ WITH Linux-syscall-note                       4
       LGPL-2.1 WITH Linux-syscall-note                        3
       ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
       ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

     and that resulted in the third patch in this series.

   - when the two scanners agreed on the detected license(s), that
     became the concluded license(s).

   - when there was disagreement between the two scanners (one detected
     a license but the other didn't, or they both detected different
     licenses) a manual inspection of the file occurred.

   - In most cases a manual inspection of the information in the file
     resulted in a clear resolution of the license that should apply
     (and which scanner probably needed to revisit its heuristics).

   - When it was not immediately clear, the license identifier was
     confirmed with lawyers working with the Linux Foundation.

   - If there was any question as to the appropriate license identifier,
     the file was flagged for further research and to be revisited later
     in time.

  In total, over 70 hours of logged manual review was done on the
  spreadsheet to determine the SPDX license identifiers to apply to the
  source files by Kate, Philippe, Thomas and, in some cases,
  confirmation by lawyers working with the Linux Foundation.

  Kate also obtained a third independent scan of the 4.13 code base from
  FOSSology, and compared selected files where the other two scanners
  disagreed against that SPDX file, to see if there was new insights.
  The Windriver scanner is based on an older version of FOSSology in
  part, so they are related.

  Thomas did random spot checks in about 500 files from the spreadsheets
  for the uapi headers and agreed with SPDX license identifier in the
  files he inspected. For the non-uapi files Thomas did random spot
  checks in about 15000 files.

  In initial set of patches against 4.14-rc6, 3 files were found to have
  copy/paste license identifier errors, and have been fixed to reflect
  the correct identifier.

  Additionally Philippe spent 10 hours this week doing a detailed manual
  inspection and review of the 12,461 patched files from the initial
  patch version early this week with:

   - a full scancode scan run, collecting the matched texts, detected
     license ids and scores

   - reviewing anything where there was a license detected (about 500+
     files) to ensure that the applied SPDX license was correct

   - reviewing anything where there was no detection but the patch
     license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
     applied SPDX license was correct

  This produced a worksheet with 20 files needing minor correction. This
  worksheet was then exported into 3 different .csv files for the
  different types of files to be modified.

  These .csv files were then reviewed by Greg. Thomas wrote a script to
  parse the csv files and add the proper SPDX tag to the file, in the
  format that the file expected. This script was further refined by Greg
  based on the output to detect more types of files automatically and to
  distinguish between header and source .c files (which need different
  comment types.) Finally Greg ran the script using the .csv files to
  generate the patches.

  Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
  Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  License cleanup: add SPDX license identifier to uapi header files with a license
  License cleanup: add SPDX license identifier to uapi header files with no license
  License cleanup: add SPDX GPL-2.0 license identifier to files with no license
2017-11-02 10:04:46 -07:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Stefano Stabellini
5eee149ab9 xen: introduce a Kconfig option to enable the pvcalls frontend
Also add pvcalls-front to the Makefile.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
235a71c539 xen/pvcalls: implement release command
Send PVCALLS_RELEASE to the backend and wait for a reply. Take both
in_mutex and out_mutex to avoid concurrent accesses. Then, free the
socket.

For passive sockets, check whether we have already pre-allocated an
active socket for the purpose of being accepted. If so, free that as
well.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
5842c83596 xen/pvcalls: implement poll command
For active sockets, check the indexes and use the inflight_conn_req
waitqueue to wait.

For passive sockets if an accept is outstanding
(PVCALLS_FLAG_ACCEPT_INFLIGHT), check if it has been answered by looking
at bedata->rsp[req_id]. If so, return POLLIN.  Otherwise use the
inflight_accept_req waitqueue.

If no accepts are inflight, send PVCALLS_POLL to the backend. If we have
outstanding POLL requests awaiting for a response use the inflight_req
waitqueue: inflight_req is awaken when a new response is received; on
wakeup we check whether the POLL response is arrived by looking at the
PVCALLS_FLAG_POLL_RET flag. We set the flag from
pvcalls_front_event_handler, if the response was for a POLL command.

In pvcalls_front_event_handler, get the struct sock_mapping from the
poll id (we previously converted struct sock_mapping* to uintptr_t and
used it as id).

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
ae0d04052e xen/pvcalls: implement recvmsg
Implement recvmsg by copying data from the "in" ring. If not enough data
is available and the recvmsg call is blocking, then wait on the
inflight_conn_req waitqueue. Take the active socket in_mutex so that
only one function can access the ring at any given time.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
45ddce214a xen/pvcalls: implement sendmsg
Send data to an active socket by copying data to the "out" ring. Take
the active socket out_mutex so that only one function can access the
ring at any given time.

If not enough room is available on the ring, rather than returning
immediately or sleep-waiting, spin for up to 5000 cycles. This small
optimization turns out to improve performance significantly.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
9774c6cca2 xen/pvcalls: implement accept command
Introduce a waitqueue to allow only one outstanding accept command at
any given time and to implement polling on the passive socket. Introduce
a flags field to keep track of in-flight accept and poll commands.

Send PVCALLS_ACCEPT to the backend. Allocate a new active socket. Make
sure that only one accept command is executed at any given time by
setting PVCALLS_FLAG_ACCEPT_INFLIGHT and waiting on the
inflight_accept_req waitqueue.

Convert the new struct sock_mapping pointer into an uintptr_t and use it
as id for the new socket to pass to the backend.

Check if the accept call is non-blocking: in that case after sending the
ACCEPT command to the backend store the sock_mapping pointer of the new
struct and the inflight req_id then return -EAGAIN (which will respond
only when there is something to accept). Next time accept is called,
we'll check if the ACCEPT command has been answered, if so we'll pick up
where we left off, otherwise we return -EAGAIN again.

Note that, differently from the other commands, we can use
wait_event_interruptible (instead of wait_event) in the case of accept
as we are able to track the req_id of the ACCEPT response that we are
waiting.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
1853f11d72 xen/pvcalls: implement listen command
Send PVCALLS_LISTEN to the backend.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
67ea9893cb xen/pvcalls: implement bind command
Send PVCALLS_BIND to the backend. Introduce a new structure, part of
struct sock_mapping, to store information specific to passive sockets.

Introduce a status field to keep track of the status of the passive
socket.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
cb1c7d9bbc xen/pvcalls: implement connect command
Send PVCALLS_CONNECT to the backend. Allocate a new ring and evtchn for
the active socket.

Introduce fields in struct sock_mapping to keep track of active sockets.
Introduce a waitqueue to allow the frontend to wait on data coming from
the backend on the active socket (recvmsg command).

Two mutexes (one of reads and one for writes) will be used to protect
the active socket in and out rings from concurrent accesses.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
2195046bfd xen/pvcalls: implement socket command and handle events
Send a PVCALLS_SOCKET command to the backend, use the masked
req_prod_pvt as req_id. This way, req_id is guaranteed to be between 0
and PVCALLS_NR_REQ_PER_RING. We already have a slot in the rsp array
ready for the response, and there cannot be two outstanding responses
with the same req_id.

Wait for the response by waiting on the inflight_req waitqueue and
check for the req_id field in rsp[req_id]. Use atomic accesses and
barriers to read the field. Note that the barriers are simple smp
barriers (as opposed to virt barriers) because they are for internal
frontend synchronization, not frontend<->backend communication.

Once a response is received, clear the corresponding rsp slot by setting
req_id to PVCALLS_INVALID_ID. Note that PVCALLS_INVALID_ID is invalid
only from the frontend point of view. It is not part of the PVCalls
protocol.

pvcalls_front_event_handler is in charge of copying responses from the
ring to the appropriate rsp slot. It is done by copying the body of the
response first, then by copying req_id atomically. After the copies,
wake up anybody waiting on waitqueue.

socket_lock protects accesses to the ring.

Convert the pointer to sock_mapping into an uintptr_t and use it as
id for the new socket to pass to the backend. The struct will be fully
initialized later on connect or bind.

sock->sk->sk_send_head is not used for ip sockets: reuse the field to
store a pointer to the struct sock_mapping corresponding to the socket.
This way, we can easily get the struct sock_mapping from the struct
socket.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
2196819099 xen/pvcalls: connect to the backend
Implement the probe function for the pvcalls frontend. Read the
supported versions, max-page-order and function-calls nodes from
xenstore.

Only one frontend<->backend connection is supported at any given time
for a guest. Store the active frontend device to a static pointer.

Introduce a stub functions for the event handler.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
aa7ba37678 xen/pvcalls: implement frontend disconnect
Introduce a data structure named pvcalls_bedata. It contains pointers to
the command ring, the event channel, a list of active sockets and a list
of passive sockets. Lists accesses are protected by a spin_lock.

Introduce a waitqueue to allow waiting for a response on commands sent
to the backend.

Introduce an array of struct xen_pvcalls_response to store commands
responses.

Introduce a new struct sock_mapping to keep track of sockets.  In this
patch the struct sock_mapping is minimal, the fields will be added by
the next patches.

pvcalls_refcount is used to keep count of the outstanding pvcalls users.
Only remove connections once the refcount is zero.

Implement pvcalls frontend removal function. Go through the list of
active and passive sockets and free them all, one at a time.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Stefano Stabellini
416efba0bd xen/pvcalls: introduce the pvcalls xenbus frontend
Introduce a xenbus frontend for the pvcalls protocol, as defined by
https://xenbits.xen.org/docs/unstable/misc/pvcalls.html.

This patch only adds the stubs, the code will be added by the following
patches.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-31 09:05:53 -04:00
Linus Torvalds
11224e1fc4 xen: fixes for 4.14-rc7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJZ8uDpAAoJELDendYovxMvnYAH/iYlLBkNhw2yLScYxMNuMo60
 8W82/70UNdC2ZIWlIKQSDsvlU0Omy9Iu51zBrE6SEVKpISxrOvtYO5JiaZGhPAqY
 2/Jpeuawdm44uaFPFwajLRsHIhgyuAxMxj7Y+TLFGW/+X6FrmFg5G3CNt5pRf0Ah
 xraD8O5MYG6FfqxftCLMD8cKlxqslZwZUFuf5CjxSKbw4HTWcTEA7a86toONUI9L
 hJjmAD6VRW/PgEVrLklQBRRwiSsV1nyBLrY5Q+sSEy9BGGvblO/yEEg8uO6sYmJ0
 a9bycTASUbisV1LvCp7HcHFn6h60CZV2XwNgwRaToEF8ebycAw5hq6l7t8pFTNI=
 =EvbJ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.14c-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - a fix for the Xen gntdev device repairing an issue in case of partial
   failure of mapping multiple pages of another domain

 - a fix of a regression in the Xen balloon driver introduced in 4.13

 - a build fix for Xen on ARM which will trigger e.g. for Linux RT

 - a maintainers update for pvops (not really Xen, but carrying through
   this tree just for convenience)

* tag 'for-linus-4.14c-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  maintainers: drop Chris Wright from pvops
  arm/xen: don't inclide rwlock.h directly.
  xen: fix booting ballooned down hvm guest
  xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap()
2017-10-27 20:41:05 -07:00
Juergen Gross
5266b8e444 xen: fix booting ballooned down hvm guest
Commit 96edd61dcf ("xen/balloon: don't
online new memory initially") introduced a regression when booting a
HVM domain with memory less than mem-max: instead of ballooning down
immediately the system would try to use the memory up to mem-max
resulting in Xen crashing the domain.

For HVM domains the current size will be reflected in Xenstore node
memory/static-max instead of memory/target.

Additionally we have to trigger the ballooning process at once.

Cc: <stable@vger.kernel.org> # 4.13
Fixes: 96edd61dcf ("xen/balloon: don't
       online new memory initially")

Reported-by: Simon Gaiser <hw42@ipsumj.de>
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-26 08:11:44 -04:00
Juergen Gross
298d275d4d xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap()
In case gntdev_mmap() succeeds only partially in mapping grant pages
it will leave some vital information uninitialized needed later for
cleanup. This will lead to an out of bounds array access when unmapping
the already mapped pages.

So just initialize the data needed for unmapping the pages a little bit
earlier.

Cc: <stable@vger.kernel.org>
Reported-by: Arthur Borsboom <arthurborsboom@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-25 12:48:13 -04:00
Al Viro
83683dc6e0 xen: don't open-code iov_iter_kvec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-10-11 22:34:31 -04:00
Kees Cook
1d27e3e225 timer: Remove expires and data arguments from DEFINE_TIMER
Drop the arguments from the macro and adjust all callers with the
following script:

  perl -pi -e 's/DEFINE_TIMER\((.*), 0, 0\);/DEFINE_TIMER($1);/g;' \
    $(git grep DEFINE_TIMER | cut -d: -f1 | sort -u | grep -v timer.h)

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # for m68k parts
Acked-by: Guenter Roeck <linux@roeck-us.net> # for watchdog parts
Acked-by: David S. Miller <davem@davemloft.net> # for networking parts
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Kalle Valo <kvalo@codeaurora.org> # for wireless parts
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-mips@linux-mips.org
Cc: Petr Mladek <pmladek@suse.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Kalle Valo <kvalo@qca.qualcomm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: linux1394-devel@lists.sourceforge.net
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: linux-s390@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Michael Reed <mdr@sgi.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-pm@vger.kernel.org
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mark Gross <mark.gross@intel.com>
Cc: linux-watchdog@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: netdev@vger.kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Link: https://lkml.kernel.org/r/1507159627-127660-11-git-send-email-keescook@chromium.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-10-05 15:01:20 +02:00
Linus Torvalds
9f2a5128b9 xen: fixes for 4.14-rc3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJZzmQ4AAoJELDendYovxMvL1oIAIBiL7SCEX4mlCjNBYBGw8N+
 4pcYUcKPu07JeAQGC7SOEjcwjrSUw+b6NJIZLCHcPAG/JyejwBAbuztmRgqsIqN9
 sVa/7GjecigsE+Jw3gT1OHDxxLMsyk2pa+poeTVdjjqFNOGRzWhG3D5dZGgOUMkF
 o8KaPgh2jyA2rg6SnxEDXy9aEpDFOO6Yb9cxApwdC+Y399zPEdqauEzFunxzIoa+
 S155tI9rr2HcXUp/DxAk/C6PaSmKfEszuKKyvvjFE8latHCaUEJ+HLacURuJUu7C
 pEc2gOTOo4dkYyDLLIQeCyGbRnH4B1GF9cv0vF//1gfAJzVGtJxmwj1qlezVDCQ=
 =jfCe
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - avoid a warning when compiling with clang

 - consider read-only bits in xen-pciback when writing to a BAR

 - fix a boot crash of pv-domains

* tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping
  xen-pciback: relax BAR sizing write value check
  x86/xen: clean up clang build warning
2017-09-29 12:24:28 -07:00
Jan Beulich
8c28ef3f1c xen-pciback: relax BAR sizing write value check
Just like done in d2bd05d88d ("xen-pciback: return proper values during
BAR sizing") for the ROM BAR, ordinary ones also shouldn't compare the
written value directly against ~0, but consider the r/o bits at the
bottom (if any).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-09-28 08:26:25 -04:00
Linus Torvalds
0a8abd97dc xen: Fixes for rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJZxSV8AAoJELDendYovxMvf5YH/jUEgeFVgP0KRjNsvgp4gu88
 4BW7nWYtFt4gGE1KnrKEPbg5Je0OwkpW7vUXxvLwDGWymtHMzVuuB2xxwzkePyzS
 17Kzmb/JiuaNpVF4+5v3JvAw3b9iHrZ7T6cXOQgm28agd3m/y+9FSyzoMoNNRdGG
 xURwUyK1idRqtkQV5VsQAK0Z1lVF7YhhaxWXBtClsqnKWoeLBLc8fpRJmUNruA33
 E2Sdi06mNNN3xudu1s2edC5hAO4EgVKmonnmyHRYonIYwuqSND8fhEXj+PRdHj7s
 lLVRQixd3raBiSscLASaQ7I/66frBm+TXzmoHAVtYkdlXBJlisTIvQlMPtAwu60=
 =c3HX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.14b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "A fix for a missing __init annotation and two cleanup patches"

* tag 'for-linus-4.14b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen, arm64: drop dummy lookup_address()
  xen: don't compile pv-specific parts if XEN_PV isn't configured
  xen: x86: mark xen_find_pt_base as __init
2017-09-22 06:40:47 -10:00
Juergen Gross
fe9c1c9555 xen: don't compile pv-specific parts if XEN_PV isn't configured
xenbus_client.c contains some functions specific for pv guests.
Enclose them with #ifdef CONFIG_XEN_PV to avoid compiling them when
they are not needed (e.g. on ARM).

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-09-18 09:13:23 -04:00
Michal Hocko
0ee931c4e3 mm: treewide: remove GFP_TEMPORARY allocation flag
GFP_TEMPORARY was introduced by commit e12ba74d8f ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  It's
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation.  As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.

The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing caller provide a way to reclaim the allocated memory.  So
this is rather misleading and hard to evaluate for any benefits.

I have checked some random users and none of them has added the flag
with a specific justification.  I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring.  This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.

I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL.  Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.

I can see reasons we might want some gfp flag to reflect shorterm
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.

This was been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
seems to be a heuristic without any measured advantage for most (if not
all) its current users.  The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers.  So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.

[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
  Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-13 18:53:16 -07:00
Linus Torvalds
3ee31b89d9 xen: fixes and features for 4.14
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJZsBbYAAoJELDendYovxMv4hoH/39psrSeHw2hPX78KJ6orq4v
 mTVEP2gLA/qxaM03EnFljXfd88J8NcJsxv7vVjh/U4xRwntvAMovCkygkkO1aw93
 nZEhUq6IGupr8KzmqQi5U7WtiWAXFwDbGSasnOKEj/lLa7E0/9MsYYQ01FS6oFkc
 c9CHONaCWepdz0Xpt7s6BKyzo74ZbJeCc5rUZU81oH40XphaZEoy8E9NOgDdfz3l
 VvPSaxZvebynT8JKDe4KxrMPpBjhr7mwgLcXk/Zy2EzOzxFSxXLsDAnwjtCW1gTh
 lPLD4TkgtziDfPfZXxFH3J34IUe1tZ2M+7Cz157FBu6BKX/g9ETQT24DXWDzFuI=
 =cgfV
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.14b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - the new pvcalls backend for routing socket calls from a guest to dom0

 - some cleanups of Xen code

 - a fix for wrong usage of {get,put}_cpu()

* tag 'for-linus-4.14b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (27 commits)
  xen/mmu: set MMU_NORMAL_PT_UPDATE in remap_area_mfn_pte_fn
  xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guests
  xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init()
  xen/pvcalls: use WARN_ON(1) instead of __WARN()
  xen: remove not used trace functions
  xen: remove unused function xen_set_domain_pte()
  xen: remove tests for pvh mode in pure pv paths
  xen-platform: constify pci_device_id.
  xen: cleanup xen.h
  xen: introduce a Kconfig option to enable the pvcalls backend
  xen/pvcalls: implement write
  xen/pvcalls: implement read
  xen/pvcalls: implement the ioworker functions
  xen/pvcalls: disconnect and module_exit
  xen/pvcalls: implement release command
  xen/pvcalls: implement poll command
  xen/pvcalls: implement accept command
  xen/pvcalls: implement listen command
  xen/pvcalls: implement bind command
  xen/pvcalls: implement connect command
  ...
2017-09-07 10:24:21 -07:00
Linus Torvalds
44b1671fae Driver core update for 4.14-rc1
Here is the "big" driver core update for 4.14-rc1.
 
 It's really not all that big, the largest thing here being some firmware
 tests to help ensure that that crazy api is working properly.
 
 There's also a new uevent for when a driver is bound or unbound from a
 device, fixing a hole in the driver model that's been there since the
 very beginning.  Many thanks to Dmitry for being persistent and pointing
 out how wrong I was about this all along :)
 
 Patches for the new uevents are already in the systemd tree, if people
 want to play around with them.
 
 Otherwise just a number of other small api changes and updates here,
 nothing major.  All of these patches have been in linux-next for a
 while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWa1/IQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yn8jACfdQg+YXGxTExonxnyiWgoDMMSO2gAn1ETOaak
 itLO5ll4b6EQ0r3pU27d
 =pCYl
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core update from Greg KH:
 "Here is the "big" driver core update for 4.14-rc1.

  It's really not all that big, the largest thing here being some
  firmware tests to help ensure that that crazy api is working properly.

  There's also a new uevent for when a driver is bound or unbound from a
  device, fixing a hole in the driver model that's been there since the
  very beginning. Many thanks to Dmitry for being persistent and
  pointing out how wrong I was about this all along :)

  Patches for the new uevents are already in the systemd tree, if people
  want to play around with them.

  Otherwise just a number of other small api changes and updates here,
  nothing major. All of these patches have been in linux-next for a
  while with no reported issues"

* tag 'driver-core-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (28 commits)
  driver core: bus: Fix a potential double free
  Do not disable driver and bus shutdown hook when class shutdown hook is set.
  base: topology: constify attribute_group structures.
  base: Convert to using %pOF instead of full_name
  kernfs: Clarify lockdep name for kn->count
  fbdev: uvesafb: remove DRIVER_ATTR() usage
  xen: xen-pciback: remove DRIVER_ATTR() usage
  driver core: Document struct device:dma_ops
  mod_devicetable: Remove excess description from structured comment
  test_firmware: add batched firmware tests
  firmware: enable a debug print for batched requests
  firmware: define pr_fmt
  firmware: send -EINTR on signal abort on fallback mechanism
  test_firmware: add test case for SIGCHLD on sync fallback
  initcall_debug: add deferred probe times
  Input: axp20x-pek - switch to using devm_device_add_group()
  Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01
  Input: gpio_keys - use devm_device_add_group() for attributes
  driver core: add devm_device_add_group() and friends
  driver core: add device_{add|remove}_group() helpers
  ...
2017-09-05 10:41:21 -07:00
Linus Torvalds
24e700e291 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Thomas Gleixner:
 "This update provides:

   - Cleanup of the IDT management including the removal of the extra
     tracing IDT. A first step to cleanup the vector management code.

   - The removal of the paravirt op adjust_exception_frame. This is a
     XEN specific issue, but merged through this branch to avoid nasty
     merge collisions

   - Prevent dmesg spam about the TSC DEADLINE bug, when the CPU has
     disabled the TSC DEADLINE timer in CPUID.

   - Adjust a debug message in the ioapic code to print out the
     information correctly"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  x86/idt: Fix the X86_TRAP_BP gate
  x86/xen: Get rid of paravirt op adjust_exception_frame
  x86/eisa: Add missing include
  x86/idt: Remove superfluous ALIGNment
  x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature
  x86/idt: Remove the tracing IDT leftovers
  x86/idt: Hide set_intr_gate()
  x86/idt: Simplify alloc_intr_gate()
  x86/idt: Deinline setup functions
  x86/idt: Remove unused functions/inlines
  x86/idt: Move interrupt gate initialization to IDT code
  x86/idt: Move APIC gate initialization to tables
  x86/idt: Move regular trap init to tables
  x86/idt: Move IST stack based traps to table init
  x86/idt: Move debug stack init to table based
  x86/idt: Switch early trap init to IDT tables
  x86/idt: Prepare for table based init
  x86/idt: Move early IDT setup out of 32-bit asm
  x86/idt: Move early IDT handler setup to IDT code
  x86/idt: Consolidate IDT invalidation
  ...
2017-09-04 17:43:56 -07:00
Jérôme Glisse
a81461b054 xen/gntdev: update to new mmu_notifier semantic
Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: xen-devel@lists.xenproject.org (moderated for non-subscribers)
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-31 16:13:00 -07:00
Boris Ostrovsky
b194da25ca xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guests
Commit aba831a69632 ("xen: remove tests for pvh mode in pure pv paths")
removed XENFEAT_auto_translated_physmap test in xen_alloc_p2m_entry()
since it is assumed that the routine is never called by non-PV guests.

However, alloc_xenballooned_pages() may make this call on a PVH guest.
Prevent this from happening by adding XENFEAT_auto_translated_physmap
check there.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Fixes: aba831a69632 ("xen: remove tests for pvh mode in pure pv paths")
2017-08-31 09:45:55 -04:00
Julien Grall
22f12f0df8 xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init()
When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following
splat appears:

[    0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[    0.019717] ASID allocator initialised with 65536 entries
[    0.020019] xen:grant_table: Grant tables using version 1 layout
[    0.020051] Grant table initialized
[    0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046
[    0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[    0.020123] no locks held by swapper/0/1.
[    0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598
[    0.020166] Hardware name: FVP Base (DT)
[    0.020182] Call trace:
[    0.020199] [<ffff00000808a5c0>] dump_backtrace+0x0/0x270
[    0.020222] [<ffff00000808a95c>] show_stack+0x24/0x30
[    0.020244] [<ffff000008c1ef20>] dump_stack+0xb8/0xf0
[    0.020267] [<ffff0000081128c0>] ___might_sleep+0x1c8/0x1f8
[    0.020291] [<ffff000008112948>] __might_sleep+0x58/0x90
[    0.020313] [<ffff0000082171b8>] __alloc_pages_nodemask+0x1c0/0x12e8
[    0.020338] [<ffff00000827a110>] alloc_page_interleave+0x38/0x88
[    0.020363] [<ffff00000827a904>] alloc_pages_current+0xdc/0xf0
[    0.020387] [<ffff000008211f38>] __get_free_pages+0x28/0x50
[    0.020411] [<ffff0000086566a4>] evtchn_fifo_alloc_control_block+0x2c/0xa0
[    0.020437] [<ffff0000091747b0>] xen_evtchn_fifo_init+0x38/0xb4
[    0.020461] [<ffff0000091746c0>] xen_init_IRQ+0x44/0xc8
[    0.020484] [<ffff000009128adc>] xen_guest_init+0x250/0x300
[    0.020507] [<ffff000008083974>] do_one_initcall+0x44/0x130
[    0.020531] [<ffff000009120df8>] kernel_init_freeable+0x120/0x288
[    0.020556] [<ffff000008c31ca8>] kernel_init+0x18/0x110
[    0.020578] [<ffff000008083710>] ret_from_fork+0x10/0x40
[    0.020606] xen:events: Using FIFO-based ABI
[    0.020658] Xen: initializing cpu0
[    0.027727] Hierarchical SRCU implementation.
[    0.036235] EFI services will not be available.
[    0.043810] smp: Bringing up secondary CPUs ...

This is because get_cpu() in xen_evtchn_fifo_init() will disable
preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set).

xen_evtchn_fifo_init() will always be called before SMP is initialized,
so {get,put}_cpu() could be replaced by a simple smp_processor_id().

This also avoid to modify evtchn_fifo_alloc_control_block that will be
called in other context.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reported-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Fixes: 1fe565517b ("xen/events: use the FIFO-based ABI if available")
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Arnd Bergmann
fefcfb9935 xen/pvcalls: use WARN_ON(1) instead of __WARN()
__WARN() is an internal helper that is only available on
some architectures, but causes a build error e.g. on ARM64
in some configurations:

drivers/xen/pvcalls-back.c: In function 'set_backend_state':
drivers/xen/pvcalls-back.c:1097:5: error: implicit declaration of function '__WARN' [-Werror=implicit-function-declaration]

Unfortunately, there is no equivalent of BUG() that takes no
arguments, but WARN_ON(1) is commonly used in other drivers
and works on all configurations.

Fixes: 7160378206b2 ("xen/pvcalls: xenbus state handling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Arvind Yadav
fff219d90c xen-platform: constify pci_device_id.
pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
42d3078a8a xen: introduce a Kconfig option to enable the pvcalls backend
Also add pvcalls-back to the Makefile.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
5ad9918ffc xen/pvcalls: implement write
When the other end notifies us that there is data to be written
(pvcalls_back_conn_event), increment the io and write counters, and
schedule the ioworker.

Implement the write function called by ioworker by reading the data from
the data ring, writing it to the socket by calling inet_sendmsg.

Set out_error on error.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
b3f9f773af xen/pvcalls: implement read
When an active socket has data available, increment the io and read
counters, and schedule the ioworker.

Implement the read function by reading from the socket, writing the data
to the data ring.

Set in_error on error.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
5d520d8580 xen/pvcalls: implement the ioworker functions
We have one ioworker per socket. Each ioworker goes through the list of
outstanding read/write requests. Once all requests have been dealt with,
it returns.

We use one atomic counter per socket for "read" operations and one
for "write" operations to keep track of the reads/writes to do.

We also use one atomic counter ("io") per ioworker to keep track of how
many outstanding requests we have in total assigned to the ioworker. The
ioworker finishes when there are none.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
0a85d23b81 xen/pvcalls: disconnect and module_exit
Implement backend_disconnect. Call pvcalls_back_release_active on active
sockets and pvcalls_back_release_passive on passive sockets.

Implement module_exit by calling backend_disconnect on frontend
connections.

[ boris: fixed long lines ]

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
a51729cb9b xen/pvcalls: implement release command
Release both active and passive sockets. For active sockets, make sure
to avoid possible conflicts with the ioworker reading/writing to those
sockets concurrently. Set map->release to let the ioworker know
atomically that the socket will be released soon, then wait until the
ioworker finishes (flush_work).

Unmap indexes pages and data rings.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
3cf33a587d xen/pvcalls: implement poll command
Implement poll on passive sockets by requesting a delayed response with
mappass->reqcopy, and reply back when there is data on the passive
socket.

Poll on active socket is unimplemented as by the spec, as the frontend
should just wait for events and check the indexes on the indexes page.

Only support one outstanding poll (or accept) request for every passive
socket at any given time.

[ boris: fixed long lines ]

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
6f474e7116 xen/pvcalls: implement accept command
Implement the accept command by calling inet_accept. To avoid blocking
in the kernel, call inet_accept(O_NONBLOCK) from a workqueue, which get
scheduled on sk_data_ready (for a passive socket, it means that there
are connections to accept).

Use the reqcopy field to store the request. Accept the new socket from
the delayed work function, create a new sock_mapping for it, map
the indexes page and data ring, and reply to the other end. Allocate an
ioworker for the socket.

Only support one outstanding blocking accept request for every socket at
any time.

Add a field to sock_mapping to remember the passive socket from which an
active socket was created.

[ boris: fixed whitespaces ]

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
8ce3f7626f xen/pvcalls: implement listen command
Call inet_listen to implement the listen command.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
331a63e6f8 xen/pvcalls: implement bind command
Allocate a socket. Track the allocated passive sockets with a new data
structure named sockpass_mapping. It contains an unbound workqueue to
schedule delayed work for the accept and poll commands. It also has a
reqcopy field to be used to store a copy of a request for delayed work.
Reads/writes to it are protected by a lock (the "copy_lock" spinlock).
Initialize the workqueue in pvcalls_back_bind.

Implement the bind command with inet_bind.

The pass_sk_data_ready event handler will be added later.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
5db4d286a8 xen/pvcalls: implement connect command
Allocate a socket. Keep track of socket <-> ring mappings with a new data
structure, called sock_mapping. Implement the connect command by calling
inet_stream_connect, and mapping the new indexes page and data ring.
Allocate a workqueue and a work_struct, called ioworker, to perform
reads and writes to the socket.

When an active socket is closed (sk_state_change), set in_error to
-ENOTCONN and notify the other end, as specified by the protocol.

sk_data_ready and pvcalls_back_ioworker will be implemented later.

[ boris: fixed whitespaces ]

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
fb0298754a xen/pvcalls: implement socket command
Just reply with success to the other end for now. Delay the allocation
of the actual socket to bind and/or connect.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
b1efa69317 xen/pvcalls: handle commands from the frontend
When the other end notifies us that there are commands to be read
(pvcalls_back_event), wake up the backend thread to parse the command.

The command ring works like most other Xen rings, so use the usual
ring macros to read and write to it. The functions implementing the
commands are empty stubs for now.

[ boris: fixed whitespaces ]

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
d0e4d560c2 xen/pvcalls: connect to a frontend
Introduce a per-frontend data structure named pvcalls_fedata. It
contains pointers to the command ring, its event channel, a list of
active sockets and a tree of passive sockets (passing sockets need to be
looked up from the id on listen, accept and poll commands, while active
sockets only on release).

It also has an unbound workqueue to schedule the work of parsing and
executing commands on the command ring. socket_lock protects the two
lists. In pvcalls_back_global, keep a list of connected frontends.

[ boris: fixed whitespaces/long lines ]

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
0a9c75c2c7 xen/pvcalls: xenbus state handling
Introduce the code to handle xenbus state changes.

Implement the probe function for the pvcalls backend. Write the
supported versions, max-page-order and function-calls nodes to xenstore,
as required by the protocol.

Introduce stub functions for disconnecting/connecting to a frontend.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
9be07334f9 xen/pvcalls: initialize the module and register the xenbus backend
Keep a list of connected frontends. Use a semaphore to protect list
accesses.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Stefano Stabellini
72e59c30df xen/pvcalls: introduce the pvcalls xenbus backend
Introduce a xenbus backend for the pvcalls protocol, as defined by
https://xenbits.xen.org/docs/unstable/misc/pvcalls.html.

This patch only adds the stubs, the code will be added by the following
patches.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-08-31 09:45:55 -04:00
Thomas Gleixner
4447ac1195 x86/idt: Simplify alloc_intr_gate()
The only users of alloc_intr_gate() are hypervisors, which both check the
used_vectors bitmap whether they have allocated the gate already. Move that
check into alloc_intr_gate() and simplify the users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20170828064959.580830286@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-29 12:07:28 +02:00
Greg Kroah-Hartman
0538dcb017 xen: xen-pciback: remove DRIVER_ATTR() usage
It's better to be explicit and use the DRIVER_ATTR_RW() and
DRIVER_ATTR_RO() macros when defining a driver's sysfs file.

Bonus is this fixes up a checkpatch.pl warning.

This is part of a series to drop DRIVER_ATTR() from the tree entirely.

Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-28 16:42:30 +02:00
Linus Torvalds
f7bbf0754b Kbuild fixes for v4.13
- fix linker script regression caused by dead code elimination support
 
 - fix typos and outdated comments
 
 - specify kselftest-clean as a PHONY target
 
 - fix "make dtbs_install" when $(srctree) includes shell special
   characters like '~'
 
 - Move -fshort-wchar to the global option list because defining it
   partially emits warnings
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZnvezAAoJED2LAQed4NsGTK0P/jZY3jNT3tzgldH5ggCeA7Z2
 qekRVR3z0CCm3FU+NlWLVNsiyWDRxuN93lIVML8f7wt/kuJDNxycx5G7+Yiw60yH
 h41gibJJdS3GL+VzfGUk1WeZkoAiGVVbLvENJzOQRIeHblsi8Ris+pBwT3MY/WKl
 XM0k+xsc/abB3dnhgws8WiUvrNhRiqsciq7IBkDPzxGa4JC53+9Yjjc1HA6gah+g
 gKX2qZmsjKm9Hsueot9clTttT+iDHmJ0QetHeMGy0T6bqjlRzG+H37ZGtGv3ZP4p
 wcsct8QJt+8/TSEtvkROg2apBUlrEkblvsL/G2UFew42Fz4Z3XLariHTcrDoYuHY
 IiUc/tUbmx6Ft72ZvepD+qN+rxQiWKqRzTD+7sDDNPPbdgC4pwZ2Z6m7Oih8mRT5
 jGlex4PN/rA5ZvTDFF8c/1GhStZrfxM7A7t6k4Snui6hOpxcpK9vHDgKrfGjk83h
 xkwvrPIXnreaAvUYz18JhTU1Zuzj5vwIZgPI1bRl3fi4JmagsAwl+IJawWXMFODA
 WONJnqavCYRcdffD0cGfJqA8YSmpxfiDxtP7NiOt05jagDHDGH6bbsUR5Oni/9wU
 9D8wdn4XdrrYGG2kEVMDa0x4Yn8vVuPfyxy5DgcRciMcYCUfNHOIHFutRvFhwcqm
 rV7ege5oIH8DdsWoE5pj
 =m8Zm
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - fix linker script regression caused by dead code elimination support

 - fix typos and outdated comments

 - specify kselftest-clean as a PHONY target

 - fix "make dtbs_install" when $(srctree) includes shell special
   characters like '~'

 - Move -fshort-wchar to the global option list because defining it
   partially emits warnings

* tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: update comments of Makefile.asm-generic
  kbuild: Do not use hyphen in exported variable name
  Makefile: add kselftest-clean to PHONY target list
  Kbuild: use -fshort-wchar globally
  fixdep: trivial: typo fix and correction
  kbuild: trivial cleanups on the comments
  kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured
2017-08-24 14:22:27 -07:00
Arnd Bergmann
8c97023cf0 Kbuild: use -fshort-wchar globally
Commit 971a69db7d ("Xen: don't warn about 2-byte wchar_t in efi")
added the --no-wchar-size-warning to the Makefile to avoid this
harmless warning:

arm-linux-gnueabi-ld: warning: drivers/xen/efi.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail

Changing kbuild to use thin archives instead of recursive linking
unfortunately brings the same warning back during the final link.

The kernel does not use wchar_t string literals at this point, and
xen does not use wchar_t at all (only efi_char16_t), so the flag
has no effect, but as pointed out by Jan Beulich, adding a wchar_t
string literal would be bad here.

Since wchar_t is always defined as u16, independent of the toolchain
default, always passing -fshort-wchar is correct and lets us
remove the Xen specific hack along with fixing the warning.

Link: https://patchwork.kernel.org/patch/9275217/
Fixes: 971a69db7d ("Xen: don't warn about 2-byte wchar_t in efi")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-08-21 09:05:59 +09:00
Jens Axboe
3e09fc802d Merge branch 'stable/for-jens-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
Pull xen block changes from Konrad:

Two fixes, both of them spotted by Amazon:

 1) Fix in Xen-blkfront caused by the re-write in 4.8 time-frame.
 2) Fix in the xen_biovec_phys_mergeable which allowed guest
    requests when using NVMe - to slurp up more data than allowed
    leading to an XSA (which has been made public today).
2017-08-16 09:56:34 -06:00
Roger Pau Monne
462cdace79 xen: fix bio vec merging
The current test for bio vec merging is not fully accurate and can be
tricked into merging bios when certain grant combinations are used.
The result of these malicious bio merges is a bio that extends past
the memory page used by any of the originating bios.

Take into account the following scenario, where a guest creates two
grant references that point to the same mfn, ie: grant 1 -> mfn A,
grant 2 -> mfn A.

These references are then used in a PV block request, and mapped by
the backend domain, thus obtaining two different pfns that point to
the same mfn, pfn B -> mfn A, pfn C -> mfn A.

If those grants happen to be used in two consecutive sectors of a disk
IO operation becoming two different bios in the backend domain, the
checks in xen_biovec_phys_mergeable will succeed, because bfn1 == bfn2
(they both point to the same mfn). However due to the bio merging,
the backend domain will end up with a bio that expands past mfn A into
mfn A + 1.

Fix this by making sure the check in xen_biovec_phys_mergeable takes
into account the offset and the length of the bio, this basically
replicates whats done in __BIOVEC_PHYS_MERGEABLE using mfns (bus
addresses). While there also remove the usage of
__BIOVEC_PHYS_MERGEABLE, since that's already checked by the callers
of xen_biovec_phys_mergeable.

CC: stable@vger.kernel.org
Reported-by: "Jan H. Schönherr" <jschoenh@amazon.de>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-08-15 10:32:49 -04:00
Liu Shuo
020db9d3c1 xen/events: Fix interrupt lost during irq_disable and irq_enable
Here is a device has xen-pirq-MSI interrupt. Dom0 might lost interrupt
during driver irq_disable/irq_enable. Here is the scenario,
 1. irq_disable -> disable_dynirq -> mask_evtchn(irq channel)
 2. dev interrupt raised by HW and Xen mark its evtchn as pending
 3. irq_enable -> startup_pirq -> eoi_pirq ->
    clear_evtchn(channel of irq) -> clear pending status
 4. consume_one_event process the irq event without pending bit assert
    which result in interrupt lost once
 5. No HW interrupt raising anymore.

Now use enable_dynirq for enable_pirq of xen_pirq_chip to remove
eoi_pirq when irq_enable.

Signed-off-by: Liu Shuo <shuo.a.liu@intel.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-08-11 16:46:01 +02:00
Juergen Gross
529871bb3c xen: avoid deadlock in xenbus
When starting the xenwatch thread a theoretical deadlock situation is
possible:

xs_init() contains:

    task = kthread_run(xenwatch_thread, NULL, "xenwatch");
    if (IS_ERR(task))
        return PTR_ERR(task);
    xenwatch_pid = task->pid;

And xenwatch_thread() does:

    mutex_lock(&xenwatch_mutex);
    ...
    event->handle->callback();
    ...
    mutex_unlock(&xenwatch_mutex);

The callback could call unregister_xenbus_watch() which does:

    ...
    if (current->pid != xenwatch_pid)
        mutex_lock(&xenwatch_mutex);
    ...

In case a watch is firing before xenwatch_pid could be set and the
callback of that watch unregisters a watch, then a self-deadlock would
occur.

Avoid this by setting xenwatch_pid in xenwatch_thread().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-08-11 16:45:56 +02:00
Juergen Gross
e91b2b1194 xen: dont fiddle with event channel masking in suspend/resume
Instead of fiddling with masking the event channels during suspend
and resume handling let do the irq subsystem do its job. It will do
the mask and unmask operations as needed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-07-27 19:55:46 +02:00
Gustavo A. R. Silva
039937308e xen: selfballoon: remove unnecessary static in frontswap_selfshrink()
Remove unnecessary static on local variables last_frontswap_pages and
tgt_frontswap_pages. Such variables are initialized before being used,
on every execution path throughout the function. The statics have no
benefit and, removing them reduce the code size.

This issue was detected using Coccinelle and the following semantic patch:

@bad exists@
position p;
identifier x;
type T;
@@

static T x@p;
...
x = <+...x...+>

@@
identifier x;
expression e;
type T;
position p != bad.p;
@@

-static
 T x@p;
 ... when != x
     when strict
?x = e;

You can see a significant difference in the code size after executing
the size command, before and after the code change:

before:
   text	   data	    bss	    dec	    hex	filename
   5633	   3452	    384	   9469	   24fd	drivers/xen/xen-selfballoon.o

after:
   text	   data	    bss	    dec	    hex	filename
   5576	   3308	    256	   9140	   23b4	drivers/xen/xen-selfballoon.o

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-07-27 19:55:41 +02:00
Punit Agrawal
d02ca074f6 xen: Drop un-informative message during boot
On systems that are not booted as a Xen domain, the xenfs driver prints
the following message during boot.

[    3.460595] xenfs: not registering filesystem on non-xen platform

As the user chose not to boot a Xen domain, this message does not
provide useful information. Drop this message.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-07-27 19:55:34 +02:00
Juergen Gross
96edd61dcf xen/balloon: don't online new memory initially
When setting up the Xenstore watch for the memory target size the new
watch will fire at once. Don't try to reach the configured target size
by onlining new memory in this case, as the current memory size will
be smaller in almost all cases due to e.g. BIOS reserved pages.

Onlining new memory will lead to more problems e.g. undesired conflicts
with NVMe devices meant to be operated as block devices.

Instead remember the difference between target size and current size
when the watch fires for the first time and apply it to any further
size changes, too.

In order to avoid races between balloon.c and xen-balloon.c init calls
do the xen-balloon.c initialization from balloon.c.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-07-23 08:13:18 +02:00
Wengang Wang
29d11cfd86 xen/grant-table: log the lack of grants
log a message when we enter this situation:
1) we already allocated the max number of available grants from hypervisor
and
2) we still need more (but the request fails because of 1)).

Sometimes the lack of grants causes IO hangs in xen_blkfront devices.
Adding this log would help debuging.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-07-23 08:09:43 +02:00
Linus Torvalds
48ea2cedde Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "It's been usually busy for summer, with most of the efforts centered
  around TCMU developments and various target-core + fabric driver bug
  fixing activities. Not particularly large in terms of LoC, but lots of
  smaller patches from many different folks.

  The highlights include:

   - ibmvscsis logical partition manager support (Michael Cyr + Bryant
     Ly)

   - Convert target/iblock WRITE_SAME to blkdev_issue_zeroout (hch +
     nab)

   - Add support for TMR percpu LUN reference counting (nab)

   - Fix a potential deadlock between EXTENDED_COPY and iscsi shutdown
     (Bart)

   - Fix COMPARE_AND_WRITE caw_sem leak during se_cmd quiesce (Jiang Yi)

   - Fix TMCU module removal (Xiubo Li)

   - Fix iser-target OOPs during login failure (Andrea Righi + Sagi)

   - Breakup target-core free_device backend driver callback (mnc)

   - Perform TCMU add/delete/reconfig synchronously (mnc)

   - Fix TCMU multiple UIO open/close sequences (mnc)

   - Fix TCMU CHECK_CONDITION sense handling (mnc)

   - Fix target-core SAM_STAT_BUSY + TASK_SET_FULL handling (mnc + nab)

   - Introduce TYPE_ZBC support in PSCSI (Damien Le Moal)

   - Fix possible TCMU memory leak + OOPs when recalculating cmd base
     size (Xiubo Li + Bryant Ly + Damien Le Moal + mnc)

   - Add login_keys_workaround attribute for non RFC initiators (Robert
     LeBlanc + Arun Easi + nab)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (68 commits)
  iscsi-target: Add login_keys_workaround attribute for non RFC initiators
  Revert "qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT"
  tcmu: clean up the code and with one small fix
  tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size
  target: export lio pgr/alua support as device attr
  target: Fix return sense reason in target_scsi3_emulate_pr_out
  target: Fix cmd size for PR-OUT in passthrough_parse_cdb
  tcmu: Fix dev_config_store
  target: pscsi: Introduce TYPE_ZBC support
  target: Use macro for WRITE_VERIFY_32 operation codes
  target: fix SAM_STAT_BUSY/TASK_SET_FULL handling
  target: remove transport_complete
  pscsi: finish cmd processing from pscsi_req_done
  tcmu: fix sense handling during completion
  target: add helper to copy sense to se_cmd buffer
  target: do not require a transport_complete for SCF_TRANSPORT_TASK_SENSE
  target: make device_mutex and device_list static
  tcmu: Fix flushing cmd entry dcache page
  tcmu: fix multiple uio open/close sequences
  tcmu: drop configured check in destroy
  ...
2017-07-13 14:27:32 -07:00
Bart Van Assche
af90e84d1f xen/scsiback: Make TMF processing slightly faster
Target drivers must guarantee that struct se_cmd and struct se_tmr_req
exist as long as target_tmr_work() is in progress. Since the last
access by the LIO core is a call to .check_stop_free() and since the
Xen scsiback .check_stop_free() drops a reference to the TMF, it is
already guaranteed that the struct se_cmd that corresponds to the TMF
exists as long as target_tmr_work() is in progress. Hence change the
second argument of transport_generic_free_cmd() from 1 into 0.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:58:03 -07:00
Bart Van Assche
e3eac12442 xen/scsiback: Replace a waitqueue and a counter by a completion
This patch simplifies the implementation of the scsiback driver
but does not change its behavior.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:58:02 -07:00
Bart Van Assche
9f4ab18ac5 xen/scsiback: Fix a TMR related use-after-free
scsiback_release_cmd() must not dereference se_cmd->se_tmr_req
because that memory is freed by target_free_cmd_mem() before
scsiback_release_cmd() is called. Fix this use-after-free by
inlining struct scsiback_tmr into struct vscsibk_pend.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: David Disseldorp <ddiss@suse.de>
Cc: xen-devel@lists.xenproject.org
Cc: <stable@vger.kernel.org> # 3.18+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 22:58:01 -07:00
Linus Torvalds
f72e24a124 This is the first pull request for the new dma-mapping subsystem
In this new subsystem we'll try to properly maintain all the generic
 code related to dma-mapping, and will further consolidate arch code
 into common helpers.
 
 This pull request contains:
 
  - removal of the DMA_ERROR_CODE macro, replacing it with calls
    to ->mapping_error so that the dma_map_ops instances are
    more self contained and can be shared across architectures (me)
  - removal of the ->set_dma_mask method, which duplicates the
    ->dma_capable one in terms of functionality, but requires more
    duplicate code.
  - various updates for the coherent dma pool and related arm code
    (Vladimir)
  - various smaller cleanups (me)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlldmw0LHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYOiKA/+Ln1mFLSf3nfTzIHa24Bbk8ZTGr0B8TD4Vmyyt8iG
 oO3AeaTLn3d6ugbH/uih/tPz8PuyXsdiTC1rI/ejDMiwMTSjW6phSiIHGcStSR9X
 VFNhmMFacp7QpUpvxceV0XZYKDViAoQgHeGdp3l+K5h/v4AYePV/v/5RjQPaEyOh
 YLbCzETO+24mRWdJxdAqtTW4ovYhzj6XsiJ+pAjlV0+SWU6m5L5E+VAPNi1vqv1H
 1O2KeCFvVYEpcnfL3qnkw2timcjmfCfeFAd9mCUAc8mSRBfs3QgDTKw3XdHdtRml
 LU2WuA5cpMrOdBO4mVra2plo8E2szvpB1OZZXoKKdCpK3VGwVpVHcTvClK2Ks/3B
 GDLieroEQNu2ZIUIdWXf/g2x6le3BcC9MmpkAhnGPqCZ7skaIBO5Cjpxm0zTJAPl
 PPY3CMBBEktAvys6DcudOYGixNjKUuAm5lnfpcfTEklFdG0AjhdK/jZOplAFA6w4
 LCiy0rGHM8ZbVAaFxbYoFCqgcjnv6EjSiqkJxVI4fu/Q7v9YXfdPnEmE0PJwCVo5
 +i7aCLgrYshTdHr/F3e5EuofHN3TDHwXNJKGh/x97t+6tt326QMvDKX059Kxst7R
 rFukGbrYvG8Y7yXwrSDbusl443ta0Ht7T1oL4YUoJTZp0nScAyEluDTmrH1JVCsT
 R4o=
 =0Fso
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping infrastructure from Christoph Hellwig:
 "This is the first pull request for the new dma-mapping subsystem

  In this new subsystem we'll try to properly maintain all the generic
  code related to dma-mapping, and will further consolidate arch code
  into common helpers.

  This pull request contains:

   - removal of the DMA_ERROR_CODE macro, replacing it with calls to
     ->mapping_error so that the dma_map_ops instances are more self
     contained and can be shared across architectures (me)

   - removal of the ->set_dma_mask method, which duplicates the
     ->dma_capable one in terms of functionality, but requires more
     duplicate code.

   - various updates for the coherent dma pool and related arm code
     (Vladimir)

   - various smaller cleanups (me)"

* tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits)
  ARM: dma-mapping: Remove traces of NOMMU code
  ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus
  ARM: NOMMU: Introduce dma operations for noMMU
  drivers: dma-mapping: allow dma_common_mmap() for NOMMU
  drivers: dma-coherent: Introduce default DMA pool
  drivers: dma-coherent: Account dma_pfn_offset when used with device tree
  dma: Take into account dma_pfn_offset
  dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs
  dma-mapping: remove dmam_free_noncoherent
  crypto: qat - avoid an uninitialized variable warning
  au1100fb: remove a bogus dma_free_nonconsistent call
  MAINTAINERS: add entry for dma mapping helpers
  powerpc: merge __dma_set_mask into dma_set_mask
  dma-mapping: remove the set_dma_mask method
  powerpc/cell: use the dma_supported method for ops switching
  powerpc/cell: clean up fixed mapping dma_ops initialization
  tile: remove dma_supported and mapping_error methods
  xen-swiotlb: remove xen_swiotlb_set_dma_mask
  arm: implement ->dma_supported instead of ->set_dma_mask
  mips/loongson64: implement ->dma_supported instead of ->set_dma_mask
  ...
2017-07-06 19:20:54 -07:00
Linus Torvalds
6e6c5b9606 xen: features and fixes for 4.13-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJZXdVXAAoJELDendYovxMvVA0IAITmvH21SDTFiilKCOrxhCv0
 W3q3cOhZA4D+UtTqqIm/os/et08n72864s0mUFoY4PxETaUsb1jBav7z7Tod2c6B
 wh26UgIAhVO3ZewFSmpdPYoW0l3elC5JUMkVMfwSvHkROaU+YDEYUsLWGuIHZiiy
 V/kIskcKe08HLObU//BMjfFusmMHmQSg+TruyqRWodlWj4Rwm7q5fNZ/xaap1UCM
 O7GcHyq1k699w5YYTlIEkLWsX/pGM+auGSlT1xdjJEc2bpjH8ps0xbvAn6dsAKsE
 yoDyxQWtX2wBUXCqF0hXYAB2r1iFx2aFfLQjwc7p+V6BvxpWwSsC7Ur4QIDnm3E=
 =OLb7
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "Other than fixes and cleanups it contains:

   - support > 32 VCPUs at domain restore

   - support for new sysfs nodes related to Xen

   - some performance tuning for Linux running as Xen guest"

* tag 'for-linus-4.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: allow userspace access during hypercalls
  x86: xen: remove unnecessary variable in xen_foreach_remap_area()
  xen: allocate page for shared info page from low memory
  xen: avoid deadlock in xenbus driver
  xen: add sysfs node for hypervisor build id
  xen: sync include/xen/interface/version.h
  xen: add sysfs node for guest type
  doc,xen: document hypervisor sysfs nodes for xen
  xen/vcpu: Handle xen_vcpu_setup() failure at boot
  xen/vcpu: Handle xen_vcpu_setup() failure in hotplug
  xen/pv: Fix OOPS on restore for a PV, !SMP domain
  xen/pvh*: Support > 32 VCPUs at domain restore
  xen/vcpu: Simplify xen_vcpu related code
  xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU
  xen: avoid type warning in xchg_xen_ulong
  xen: fix HYPERVISOR_dm_op() prototype
  xen: don't print error message in case of missing Xenstore entry
  arm/xen: Adjust one function call together with a variable assignment
  arm/xen: Delete an error message for a failed memory allocation in __set_phys_to_machine_multi()
  arm/xen: Improve a size determination in __set_phys_to_machine_multi()
2017-07-06 19:11:24 -07:00
Linus Torvalds
4422d80ed7 Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Thomas Gleixner:
 "The RAS updates for the 4.13 merge window:

   - Cleanup of the MCE injection facility (Borsilav Petkov)

   - Rework of the AMD/SMCA handling (Yazen Ghannam)

   - Enhancements for ACPI/APEI to handle new notitication types (Shiju
     Jose)

   - atomic_t to refcount_t conversion (Elena Reshetova)

   - A few fixes and enhancements all over the place"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  RAS/CEC: Check the correct variable in the debugfs error handling
  x86/mce: Always save severity in machine_check_poll()
  x86/MCE, xen/mcelog: Make /dev/mcelog registration messages more precise
  x86/mce: Update bootlog description to reflect behavior on AMD
  x86/mce: Don't disable MCA banks when offlining a CPU on AMD
  x86/mce/mce-inject: Preset the MCE injection struct
  x86/mce: Clean up include files
  x86/mce: Get rid of register_mce_write_callback()
  x86/mce: Merge mce_amd_inj into mce-inject
  x86/mce/AMD: Use saved threshold block info in interrupt handler
  x86/mce/AMD: Use msr_stat when clearing MCA_STATUS
  x86/mce/AMD: Carve out SMCA bank configuration
  x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers
  x86/mce: Convert threshold_bank.cpus from atomic_t to refcount_t
  RAS: Make local function parse_ras_param() static
  ACPI/APEI: Handle GSIV and GPIO notification types
2017-07-03 18:33:03 -07:00
Linus Torvalds
03ffbcdd78 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The irq department delivers:

   - Expand the generic infrastructure handling the irq migration on CPU
     hotplug and convert X86 over to it. (Thomas Gleixner)

     Aside of consolidating code this is a preparatory change for:

   - Finalizing the affinity management for multi-queue devices. The
     main change here is to shut down interrupts which are affine to a
     outgoing CPU and reenabling them when the CPU comes online again.
     That avoids moving interrupts pointlessly around and breaking and
     reestablishing affinities for no value. (Christoph Hellwig)

     Note: This contains also the BLOCK-MQ and NVME changes which depend
     on the rework of the irq core infrastructure. Jens acked them and
     agreed that they should go with the irq changes.

   - Consolidation of irq domain code (Marc Zyngier)

   - State tracking consolidation in the core code (Jeffy Chen)

   - Add debug infrastructure for hierarchical irq domains (Thomas
     Gleixner)

   - Infrastructure enhancement for managing generic interrupt chips via
     devmem (Bartosz Golaszewski)

   - Constification work all over the place (Tobias Klauser)

   - Two new interrupt controller drivers for MVEBU (Thomas Petazzoni)

   - The usual set of fixes, updates and enhancements all over the
     place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits)
  irqchip/or1k-pic: Fix interrupt acknowledgement
  irqchip/irq-mvebu-gicp: Allocate enough memory for spi_bitmap
  irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity
  nvme: Allocate queues for all possible CPUs
  blk-mq: Create hctx for each present CPU
  blk-mq: Include all present CPUs in the default queue mapping
  genirq: Avoid unnecessary low level irq function calls
  genirq: Set irq masked state when initializing irq_desc
  genirq/timings: Add infrastructure for estimating the next interrupt arrival time
  genirq/timings: Add infrastructure to track the interrupt timings
  genirq/debugfs: Remove pointless NULL pointer check
  irqchip/gic-v3-its: Don't assume GICv3 hardware supports 16bit INTID
  irqchip/gic-v3-its: Add ACPI NUMA node mapping
  irqchip/gic-v3-its-platform-msi: Make of_device_ids const
  irqchip/gic-v3-its: Make of_device_ids const
  irqchip/irq-mvebu-icu: Add new driver for Marvell ICU
  irqchip/irq-mvebu-gicp: Add new driver for Marvell GICP
  dt-bindings/interrupt-controller: Add DT binding for the Marvell ICU
  genirq/irqdomain: Remove auto-recursive hierarchy support
  irqchip/MSI: Use irq_domain_update_bus_token instead of an open coded access
  ...
2017-07-03 16:50:31 -07:00
Linus Torvalds
9bd42183b9 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Add the SYSTEM_SCHEDULING bootup state to move various scheduler
     debug checks earlier into the bootup. This turns silent and
     sporadically deadly bugs into nice, deterministic splats. Fix some
     of the splats that triggered. (Thomas Gleixner)

   - A round of restructuring and refactoring of the load-balancing and
     topology code (Peter Zijlstra)

   - Another round of consolidating ~20 of incremental scheduler code
     history: this time in terms of wait-queue nomenclature. (I didn't
     get much feedback on these renaming patches, and we can still
     easily change any names I might have misplaced, so if anyone hates
     a new name, please holler and I'll fix it.) (Ingo Molnar)

   - sched/numa improvements, fixes and updates (Rik van Riel)

   - Another round of x86/tsc scheduler clock code improvements, in hope
     of making it more robust (Peter Zijlstra)

   - Improve NOHZ behavior (Frederic Weisbecker)

   - Deadline scheduler improvements and fixes (Luca Abeni, Daniel
     Bristot de Oliveira)

   - Simplify and optimize the topology setup code (Lauro Ramos
     Venancio)

   - Debloat and decouple scheduler code some more (Nicolas Pitre)

   - Simplify code by making better use of llist primitives (Byungchul
     Park)

   - ... plus other fixes and improvements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
  sched/cputime: Refactor the cputime_adjust() code
  sched/debug: Expose the number of RT/DL tasks that can migrate
  sched/numa: Hide numa_wake_affine() from UP build
  sched/fair: Remove effective_load()
  sched/numa: Implement NUMA node level wake_affine()
  sched/fair: Simplify wake_affine() for the single socket case
  sched/numa: Override part of migrate_degrades_locality() when idle balancing
  sched/rt: Move RT related code from sched/core.c to sched/rt.c
  sched/deadline: Move DL related code from sched/core.c to sched/deadline.c
  sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled
  sched/fair: Spare idle load balancing on nohz_full CPUs
  nohz: Move idle balancer registration to the idle path
  sched/loadavg: Generalize "_idle" naming to "_nohz"
  sched/core: Drop the unused try_get_task_struct() helper function
  sched/fair: WARN() and refuse to set buddy when !se->on_rq
  sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well
  sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming
  sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c
  sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h>
  sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h>
  ...
2017-07-03 13:08:04 -07:00
Linus Torvalds
81e3e04489 UUID/GUID updates:
- introduce the new uuid_t/guid_t types that are going to replace
    the somewhat confusing uuid_be/uuid_le types and make the terminology
    fit the various specs, as well as the userspace libuuid library.
    (me, based on a previous version from Amir)
  - consolidated generic uuid/guid helper functions lifted from XFS
    and libnvdimm (Amir and me)
  - conversions to the new types and helpers (Amir, Andy and me)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAllZfmILHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMvyg/9EvWHOOsSdeDykCK3KdH2uIqnxwpl+m7ljccaGJIc
 MmaH0KnsP9p/Cuw5hESh2tYlmCYN7pmYziNXpf/LRS65/HpEYbs4oMqo8UQsN0UM
 2IXHfXY0HnCoG5OixH8RNbFTkxuGphsTY8meaiDr6aAmqChDQI2yGgQLo3WM2/Qe
 R9N1KoBWH/bqY6dHv+urlFwtsREm2fBH+8ovVma3TO73uZCzJGLJBWy3anmZN+08
 uYfdbLSyRN0T8rqemVdzsZ2SrpHYkIsYGUZV43F581vp8e/3OKMoMxpWRRd9fEsa
 MXmoaHcLJoBsyVSFR9lcx3axKrhAgBPZljASbbA0h49JneWXrzghnKBQZG2SnEdA
 ktHQ2sE4Yb5TZSvvWEKMQa3kXhEfIbTwgvbHpcDr5BUZX8WvEw2Zq8e7+Mi4+KJw
 QkvFC1S96tRYO2bxdJX638uSesGUhSidb+hJ/edaOCB/GK+sLhUdDTJgwDpUGmyA
 xVXTF51ramRS2vhlbzN79x9g33igIoNnG4/PV0FPvpCTSqxkHmPc5mK6Vals1lqt
 cW6XfUjSQECq5nmTBtYDTbA/T+8HhBgSQnrrvmferjJzZUFGr/7MXl+Evz2x4CjX
 OBQoAMu241w6Vp3zoXqxzv+muZ/NLar52M/zbi9TUjE0GvvRNkHvgCC4NmpIlWYJ
 Sxg=
 =J/4P
 -----END PGP SIGNATURE-----

Merge tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid

Pull uuid subsystem from Christoph Hellwig:
 "This is the new uuid subsystem, in which Amir, Andy and I have started
  consolidating our uuid/guid helpers and improving the types used for
  them. Note that various other subsystems have pulled in this tree, so
  I'd like it to go in early.

  UUID/GUID summary:

   - introduce the new uuid_t/guid_t types that are going to replace the
     somewhat confusing uuid_be/uuid_le types and make the terminology
     fit the various specs, as well as the userspace libuuid library.
     (me, based on a previous version from Amir)

   - consolidated generic uuid/guid helper functions lifted from XFS and
     libnvdimm (Amir and me)

   - conversions to the new types and helpers (Amir, Andy and me)"

* tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid: (34 commits)
  ACPI: hns_dsaf_acpi_dsm_guid can be static
  mmc: sdhci-pci: make guid intel_dsm_guid static
  uuid: Take const on input of uuid_is_null() and guid_is_null()
  thermal: int340x_thermal: fix compile after the UUID API switch
  thermal: int340x_thermal: Switch to use new generic UUID API
  acpi: always include uuid.h
  ACPI: Switch to use generic guid_t in acpi_evaluate_dsm()
  ACPI / extlog: Switch to use new generic UUID API
  ACPI / bus: Switch to use new generic UUID API
  ACPI / APEI: Switch to use new generic UUID API
  acpi, nfit: Switch to use new generic UUID API
  MAINTAINERS: add uuid entry
  tmpfs: generate random sb->s_uuid
  scsi_debug: switch to uuid_t
  nvme: switch to uuid_t
  sysctl: switch to use uuid_t
  partitions/ldm: switch to use uuid_t
  overlayfs: use uuid_t instead of uuid_be
  fs: switch ->s_uuid to uuid_t
  ima/policy: switch to use uuid_t
  ...
2017-07-03 09:55:26 -07:00
Christoph Hellwig
a88f540101 xen-swiotlb: remove xen_swiotlb_set_dma_mask
This just duplicates the generic implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28 06:54:50 -07:00
Juergen Gross
1a3fc2c402 xen: avoid deadlock in xenbus driver
There has been a report about a deadlock in the xenbus driver:

[  247.979498] ======================================================
[  247.985688] WARNING: possible circular locking dependency detected
[  247.991882] 4.12.0-rc4-00022-gc4b25c0 #575 Not tainted
[  247.997040] ------------------------------------------------------
[  248.003232] xenbus/91 is trying to acquire lock:
[  248.007875]  (&u->msgbuffer_mutex){+.+.+.}, at: [<ffff00000863e904>]
xenbus_dev_queue_reply+0x3c/0x230
[  248.017163]
[  248.017163] but task is already holding lock:
[  248.023096]  (xb_write_mutex){+.+...}, at: [<ffff00000863a940>]
xenbus_thread+0x5f0/0x798
[  248.031267]
[  248.031267] which lock already depends on the new lock.
[  248.031267]
[  248.039615]
[  248.039615] the existing dependency chain (in reverse order) is:
[  248.047176]
[  248.047176] -> #1 (xb_write_mutex){+.+...}:
[  248.052943]        __lock_acquire+0x1728/0x1778
[  248.057498]        lock_acquire+0xc4/0x288
[  248.061630]        __mutex_lock+0x84/0x868
[  248.065755]        mutex_lock_nested+0x3c/0x50
[  248.070227]        xs_send+0x164/0x1f8
[  248.074015]        xenbus_dev_request_and_reply+0x6c/0x88
[  248.079427]        xenbus_file_write+0x260/0x420
[  248.084073]        __vfs_write+0x48/0x138
[  248.088113]        vfs_write+0xa8/0x1b8
[  248.091983]        SyS_write+0x54/0xb0
[  248.095768]        el0_svc_naked+0x24/0x28
[  248.099897]
[  248.099897] -> #0 (&u->msgbuffer_mutex){+.+.+.}:
[  248.106088]        print_circular_bug+0x80/0x2e0
[  248.110730]        __lock_acquire+0x1768/0x1778
[  248.115288]        lock_acquire+0xc4/0x288
[  248.119417]        __mutex_lock+0x84/0x868
[  248.123545]        mutex_lock_nested+0x3c/0x50
[  248.128016]        xenbus_dev_queue_reply+0x3c/0x230
[  248.133005]        xenbus_thread+0x788/0x798
[  248.137306]        kthread+0x110/0x140
[  248.141087]        ret_from_fork+0x10/0x40

It is rather easy to avoid by dropping xb_write_mutex before calling
xenbus_dev_queue_reply().

Fixes: fd8aa9095a ("xen: optimize xenbus
driver for multiple concurrent xenstore accesses").

Cc: <stable@vger.kernel.org> # 4.11
Reported-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-06-25 13:11:22 +02:00
Thomas Gleixner
ef1c2cc885 xen/events: Add support for effective affinity mask
Update the effective affinity mask when an interrupt was successfully
targeted to a CPU.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20170619235446.799944725@linutronix.de
2017-06-22 18:21:23 +02:00
Juergen Gross
b867059018 x86/MCE, xen/mcelog: Make /dev/mcelog registration messages more precise
When running under Xen as dom0, /dev/mcelog is being provided by Xen
instead of the normal mcelog character device of the MCE core. Convert
an error message being issued by the MCE core in this case to an
informative message that Xen has registered the device.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: xen-devel@lists.xenproject.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170614084059.19294-1-jgross@suse.com
2017-06-20 23:25:19 +02:00
Ingo Molnar
902b319413 Merge branch 'WIP.sched/core' into sched/core
Conflicts:
	kernel/sched/Makefile

Pick up the waitqueue related renames - it didn't get much feedback,
so it appears to be uncontroversial. Famous last words? ;-)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 12:28:21 +02:00
Christoph Hellwig
4d048dbc05 xen-swiotlb: implement ->mapping_error
DMA_ERROR_CODE is going to go away, so don't rely on it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-06-20 11:13:00 +02:00
Christoph Hellwig
dceb1a6819 xen-swiotlb: consolidate xen_swiotlb_dma_ops
ARM and x86 had duplicated versions of the dma_ops structure, the
only difference is that x86 hasn't wired up the set_dma_mask,
mmap, and get_sgtable ops yet.  On x86 all of them are identical
to the generic version, so they aren't needed but harmless.

All the symbols used only for xen_swiotlb_dma_ops can now be marked
static as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2017-06-20 11:12:59 +02:00
Juergen Gross
84b7625728 xen: add sysfs node for hypervisor build id
For support of Xen hypervisor live patching the hypervisor build id is
needed. Add a node /sys/hypervisor/properties/buildid containing the
information.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-06-15 08:50:37 +02:00
Juergen Gross
4a4c29c96d xen: add sysfs node for guest type
Currently there is no reliable user interface inside a Xen guest to
determine its type (e.g. HVM, PV or PVH). Instead of letting user mode
try to determine this by various rather hacky mechanisms (parsing of
boot messages before they are gone, trying to make use of known subtle
differences in behavior of some instructions), add a sysfs node
/sys/hypervisor/guest_type to explicitly deliver this information as
it is known to the kernel.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-06-15 08:50:32 +02:00
Anoob Soman
c48f64ab47 xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU
A HVM domian booting generates around 200K (evtchn:qemu-dm xen-dyn)
interrupts,in a short period of time. All these evtchn:qemu-dm are bound
to VCPU 0, until irqbalance sees these IRQ and moves it to a different VCPU.
In one configuration, irqbalance runs every 10 seconds, which means
irqbalance doesn't get to see these burst of interrupts and doesn't
re-balance interrupts most of the time, making all evtchn:qemu-dm to be
processed by VCPU0. This cause VCPU0 to spend most of time processing
hardirq and very little time on softirq. Moreover, if dom0 kernel PREEMPTION
is disabled, VCPU0 never runs watchdog (process context), triggering a
softlockup detection code to panic.

Binding evtchn:qemu-dm to next online VCPU, will spread hardirq
processing evenly across different CPU. Later, irqbalance will try to balance
evtchn:qemu-dm, if required.

Signed-off-by: Anoob Soman <anoob.soman@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-06-13 15:30:27 +02:00
Linus Torvalds
eb4125dfdb xen: fix for 4.12 rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJZOscBAAoJELDendYovxMvN8EH+wXMRtFufmdXxyh3Wi5IHbfg
 B56J4mjrOFpw+NnNHZk5H0cUSDwYb14dRCEnLNIXUpzCAb0mRMhPclhe07IMLqe1
 FEqz6qWAh301mugqu6PlXaPZs9af7A6t6LEnfbAxXzgthWEhfzOecOXo0D5oV9sN
 e4qFfoY9/5IoSShbEuHVLf5OBs4S5rhyQ0DNCEfqnHKvCn0VlRlBQMTrYTNZG28O
 jgAWdxIPKXxCy2hoVV/vovuan1F38v9ZeWyVbf03IGfAGjVBFHzIbd9dH1OJm6X0
 H/RGfJW6VPvswEsZXD6z0UkMW1IXa8fKCjwtvkVf5BFrKDJi4QUB/wZuteqmxrY=
 =BUAE
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "A fix for Xen on ARM when dealing with 64kB page size of a guest"

* tag 'for-linus-4.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/privcmd: Support correctly 64KB page granularity when mapping memory
2017-06-09 09:59:51 -07:00
Julien Grall
753c09b565 xen/privcmd: Support correctly 64KB page granularity when mapping memory
Commit 5995a68 "xen/privcmd: Add support for Linux 64KB page granularity" did
not go far enough to support 64KB in mmap_batch_fn.

The variable 'nr' is the number of 4KB chunk to map. However, when Linux
is using 64KB page granularity the array of pages (vma->vm_private_data)
contain one page per 64KB. Fix it by incrementing st->index correctly.

Furthermore, st->va is not correctly incremented as PAGE_SIZE !=
XEN_PAGE_SIZE.

Fixes: 5995a68 ("xen/privcmd: Add support for Linux 64KB page granularity")
CC: stable@vger.kernel.org
Reported-by: Feng Kan <fkan@apm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-06-07 11:23:14 +02:00
Juergen Gross
4e93b6481c xen: don't print error message in case of missing Xenstore entry
When registering for the Xenstore watch of the node control/sysrq the
handler will be called at once. Don't issue an error message if the
Xenstore node isn't there, as it will be created only when an event
is being triggered.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-06-07 09:48:02 +02:00
Christoph Hellwig
85787090a2 fs: switch ->s_uuid to uuid_t
For some file systems we still memcpy into it, but in various places this
already allows us to use the proper uuid helpers.  More to come..

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> (Changes to IMA/EVM)
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2017-06-05 16:59:12 +02:00
Thomas Gleixner
69a78ff226 init: Introduce SYSTEM_SCHEDULING state
might_sleep() debugging and smp_processor_id() debugging should be active
right after the scheduler starts working. The init task can invoke
smp_processor_id() from preemptible context as it is pinned on the boot cpu
until sched_smp_init() removes the pinning and lets it schedule on all non
isolated cpus.

Add a new state which allows to enable those checks earlier and add it to
the xen do_poweroff() function.

No functional change.

Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20170516184736.196214622@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-23 10:01:38 +02:00
Linus Torvalds
11fbf53d66 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted bits and pieces from various people. No common topic in this
  pile, sorry"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs/affs: add rename exchange
  fs/affs: add rename2 to prepare multiple methods
  Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx()
  fs: don't set *REFERENCED on single use objects
  fs: compat: Remove warning from COMPATIBLE_IOCTL
  remove pointless extern of atime_need_update_rcu()
  fs: completely ignore unknown open flags
  fs: add a VALID_OPEN_FLAGS
  fs: remove _submit_bh()
  fs: constify tree_descr arrays passed to simple_fill_super()
  fs: drop duplicate header percpu-rwsem.h
  fs/affs: bugfix: Write files greater than page size on OFS
  fs/affs: bugfix: enable writes on OFS disks
  fs/affs: remove node generation check
  fs/affs: import amigaffs.h
  fs/affs: bugfix: make symbolic links work again
2017-05-09 09:12:53 -07:00
Michal Hocko
752ade68cb treewide: use kv[mz]alloc* rather than opencoded variants
There are many code paths opencoding kvmalloc.  Let's use the helper
instead.  The main difference to kvmalloc is that those users are
usually not considering all the aspects of the memory allocator.  E.g.
allocation requests <= 32kB (with 4kB pages) are basically never failing
and invoke OOM killer to satisfy the allocation.  This sounds too
disruptive for something that has a reasonable fallback - the vmalloc.
On the other hand those requests might fallback to vmalloc even when the
memory allocator would succeed after several more reclaim/compaction
attempts previously.  There is no guarantee something like that happens
though.

This patch converts many of those places to kv[mz]alloc* helpers because
they are more conservative.

Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390
Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph
Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Santosh Raspatur <santosh@chelsio.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: "Yan, Zheng" <zyan@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-08 17:15:13 -07:00
Julien Grall
e371fd7607 xen: Implement EFI reset_system callback
When rebooting DOM0 with ACPI on ARM64, the kernel is crashing with the stack
trace [1].

This is happening because when EFI runtimes are enabled, the reset code
(see machine_restart) will first try to use EFI restart method.

However, the EFI restart code is expecting the reset_system callback to
be always set. This is not the case for Xen and will lead to crash.

The EFI restart helper is used in multiple places and some of them don't
not have fallback (see machine_power_off). So implement reset_system
callback as a call to xen_reboot when using EFI Xen.

[   36.999270] reboot: Restarting system
[   37.002921] Internal error: Attempting to execute userspace memory: 86000004 [#1] PREEMPT SMP
[   37.011460] Modules linked in:
[   37.014598] CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.11.0-rc1-00003-g1e248b60a39b-dirty #506
[   37.023903] Hardware name: (null) (DT)
[   37.027734] task: ffff800902068000 task.stack: ffff800902064000
[   37.033739] PC is at 0x0
[   37.036359] LR is at efi_reboot+0x94/0xd0
[   37.040438] pc : [<0000000000000000>] lr : [<ffff00000880f2c4>] pstate: 404001c5
[   37.047920] sp : ffff800902067cf0
[   37.051314] x29: ffff800902067cf0 x28: ffff800902068000
[   37.056709] x27: ffff000008992000 x26: 000000000000008e
[   37.062104] x25: 0000000000000123 x24: 0000000000000015
[   37.067499] x23: 0000000000000000 x22: ffff000008e6e250
[   37.072894] x21: ffff000008e6e000 x20: 0000000000000000
[   37.078289] x19: ffff000008e5d4c8 x18: 0000000000000010
[   37.083684] x17: 0000ffffa7c27470 x16: 00000000deadbeef
[   37.089079] x15: 0000000000000006 x14: ffff000088f42bef
[   37.094474] x13: ffff000008f42bfd x12: ffff000008e706c0
[   37.099870] x11: ffff000008e70000 x10: 0000000005f5e0ff
[   37.105265] x9 : ffff800902067a50 x8 : 6974726174736552
[   37.110660] x7 : ffff000008cc6fb8 x6 : ffff000008cc6fb0
[   37.116055] x5 : ffff000008c97dd8 x4 : 0000000000000000
[   37.121453] x3 : 0000000000000000 x2 : 0000000000000000
[   37.126845] x1 : 0000000000000000 x0 : 0000000000000000
[   37.132239]
[   37.133808] Process systemd-shutdow (pid: 1, stack limit = 0xffff800902064000)
[   37.141118] Stack: (0xffff800902067cf0 to 0xffff800902068000)
[   37.146949] 7ce0:                                   ffff800902067d40 ffff000008085334
[   37.154869] 7d00: 0000000000000000 ffff000008f3b000 ffff800902067d40 ffff0000080852e0
[   37.162787] 7d20: ffff000008cc6fb0 ffff000008cc6fb8 ffff000008c7f580 ffff000008c97dd8
[   37.170706] 7d40: ffff800902067d60 ffff0000080e2c2c 0000000000000000 0000000001234567
[   37.178624] 7d60: ffff800902067d80 ffff0000080e2ee8 0000000000000000 ffff0000080e2df4
[   37.186544] 7d80: 0000000000000000 ffff0000080830f0 0000000000000000 00008008ff1c1000
[   37.194462] 7da0: ffffffffffffffff 0000ffffa7c4b1cc 0000000000000000 0000000000000024
[   37.202380] 7dc0: ffff800902067dd0 0000000000000005 0000fffff24743c8 0000000000000004
[   37.210299] 7de0: 0000fffff2475f03 0000000000000010 0000fffff2474418 0000000000000005
[   37.218218] 7e00: 0000fffff2474578 000000000000000a 0000aaaad6b722c0 0000000000000001
[   37.226136] 7e20: 0000000000000123 0000000000000038 ffff800902067e50 ffff0000081e7294
[   37.234055] 7e40: ffff800902067e60 ffff0000081e935c ffff800902067e60 ffff0000081e9388
[   37.241973] 7e60: ffff800902067eb0 ffff0000081ea388 0000000000000000 00008008ff1c1000
[   37.249892] 7e80: ffffffffffffffff 0000ffffa7c4a79c 0000000000000000 ffff000000020000
[   37.257810] 7ea0: 0000010000000004 0000000000000000 0000000000000000 ffff0000080830f0
[   37.265729] 7ec0: fffffffffee1dead 0000000028121969 0000000001234567 0000000000000000
[   37.273651] 7ee0: ffffffffffffffff 8080000000800000 0000800000008080 feffa9a9d4ff2d66
[   37.281567] 7f00: 000000000000008e feffa9a9d5b60e0f 7f7fffffffff7f7f 0101010101010101
[   37.289485] 7f20: 0000000000000010 0000000000000008 000000000000003a 0000ffffa7ccf588
[   37.297404] 7f40: 0000aaaad6b87d00 0000ffffa7c4b1b0 0000fffff2474be0 0000aaaad6b88000
[   37.305326] 7f60: 0000fffff2474fb0 0000000001234567 0000000000000000 0000000000000000
[   37.313240] 7f80: 0000000000000000 0000000000000001 0000aaaad6b70d4d 0000000000000000
[   37.321159] 7fa0: 0000000000000001 0000fffff2474ea0 0000aaaad6b5e2e0 0000fffff2474e80
[   37.329078] 7fc0: 0000ffffa7c4b1cc 0000000000000000 fffffffffee1dead 000000000000008e
[   37.336997] 7fe0: 0000000000000000 0000000000000000 9ce839cffee77eab fafdbf9f7ed57f2f
[   37.344911] Call trace:
[   37.347437] Exception stack(0xffff800902067b20 to 0xffff800902067c50)
[   37.353970] 7b20: ffff000008e5d4c8 0001000000000000 0000000080f82000 0000000000000000
[   37.361883] 7b40: ffff800902067b60 ffff000008e17000 ffff000008f44c68 00000001081081b4
[   37.369802] 7b60: ffff800902067bf0 ffff000008108478 0000000000000000 ffff000008c235b0
[   37.377721] 7b80: ffff800902067ce0 0000000000000000 0000000000000000 0000000000000015
[   37.385643] 7ba0: 0000000000000123 000000000000008e ffff000008992000 ffff800902068000
[   37.393557] 7bc0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[   37.401477] 7be0: 0000000000000000 ffff000008c97dd8 ffff000008cc6fb0 ffff000008cc6fb8
[   37.409396] 7c00: 6974726174736552 ffff800902067a50 0000000005f5e0ff ffff000008e70000
[   37.417318] 7c20: ffff000008e706c0 ffff000008f42bfd ffff000088f42bef 0000000000000006
[   37.425234] 7c40: 00000000deadbeef 0000ffffa7c27470
[   37.430190] [<          (null)>]           (null)
[   37.434982] [<ffff000008085334>] machine_restart+0x6c/0x70
[   37.440550] [<ffff0000080e2c2c>] kernel_restart+0x6c/0x78
[   37.446030] [<ffff0000080e2ee8>] SyS_reboot+0x130/0x228
[   37.451337] [<ffff0000080830f0>] el0_svc_naked+0x24/0x28
[   37.456737] Code: bad PC value
[   37.459891] ---[ end trace 76e2fc17e050aecd ]---

Signed-off-by: Julien Grall <julien.grall@arm.com>

--

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org

The x86 code has theoritically a similar issue, altought EFI does not
seem to be the preferred method. I have only built test it on x86.

This should also probably be fixed in stable tree.

    Changes in v2:
        - Implement xen_efi_reset_system using xen_reboot
        - Move xen_efi_reset_system in drivers/xen/efi.c
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 12:06:50 +02:00
Boris Ostrovsky
84d582d236 xen: Revert commits da72ff5bfc and 72a9b18629
Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741)
established that commit 72a9b18629 ("xen: Remove event channel
notification through Xen PCI platform device") (and thus commit
da72ff5bfc ("partially revert "xen: Remove event channel
notification through Xen PCI platform device"")) are unnecessary and,
in fact, prevent HVM guests from booting on Xen releases prior to 4.0

Therefore we revert both of those commits.

The summary of that discussion is below:

  Here is the brief summary of the current situation:

  Before the offending commit (72a9b18629):

  1) INTx does not work because of the reset_watches path.
  2) The reset_watches path is only taken if you have Xen > 4.0
  3) The Linux Kernel by default will use vector inject if the hypervisor
     support. So even INTx does not work no body running the kernel with
     Xen > 4.0 would notice. Unless he explicitly disabled this feature
     either in the kernel or in Xen (and this can only be disabled by
     modifying the code, not user-supported way to do it).

  After the offending commit (+ partial revert):

  1) INTx is no longer support for HVM (only for PV guests).
  2) Any HVM guest The kernel will not boot on Xen < 4.0 which does
     not have vector injection support. Since the only other mode
     supported is INTx which.

  So based on this summary, I think before commit (72a9b18629) we were
  in much better position from a user point of view.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:18:05 +02:00
Stefano Stabellini
d5ff5061c3 xen/arm,arm64: rename __generic_dma_ops to xen_get_dma_ops
Now that __generic_dma_ops is a xen specific function, rename it to
xen_get_dma_ops. Change all the call sites appropriately.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: linux@armlinux.org.uk
CC: catalin.marinas@arm.com
CC: will.deacon@arm.com
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
CC: Julien Grall <julien.grall@arm.com>
2017-05-02 11:14:47 +02:00
Vitaly Kuznetsov
4fee9ad84e xen/balloon: decorate PV-only parts with #ifdef CONFIG_XEN_PV
Balloon driver uses several PV-only concepts (xen_start_info,
xen_extra_mem,..) and it seems the simpliest solution to make HVM-only
build happy is to decorate these parts with #ifdefs.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2017-05-02 11:09:56 +02:00
Eric Biggers
cda37124f4 fs: constify tree_descr arrays passed to simple_fill_super()
simple_fill_super() is passed an array of tree_descr structures which
describe the files to create in the filesystem's root directory.  Since
these arrays are never modified intentionally, they should be 'const' so
that they are placed in .rodata and benefit from memory protection.
This patch updates the function signature and all users, and also
constifies tree_descr.name.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-26 23:54:06 -04:00