License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner, Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to licenses
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files, created by Philippe Ombredanne. Philippe prepared the
base worksheet and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file-by-file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
The criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when neither scanner could find any license traces, the file was
considered to have no license information in it, and the top-level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If the file was a */uapi/* one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results of that were:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL-family license was found in the file, or if it had no licensing
in it (per the prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review of the spreadsheet was
done by Kate, Philippe and Thomas to determine the SPDX license
identifiers to apply to the source files, with confirmation by lawyers
working with the Linux Foundation in some cases.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files, Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally, Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 files patched by the initial patch
version from early this week, with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types). Finally Greg ran the script using the .csv files to
generate the patches.
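As a rough illustration of the comment-type distinction mentioned above, the
sketch below picks the SPDX line format from the file name; it is only an
example, not the actual script used for this series:

#include <stdio.h>
#include <string.h>

/* Print the SPDX tag in the comment style the file type expects. */
static void print_spdx_line(const char *path, const char *id)
{
        const char *dot = strrchr(path, '.');

        if (dot && strcmp(dot, ".c") == 0)
                printf("// SPDX-License-Identifier: %s\n", id);
        else
                printf("/* SPDX-License-Identifier: %s */\n", id);
}

int main(void)
{
        print_spdx_line("kernel/fork.c", "GPL-2.0");
        print_spdx_line("include/linux/mm.h", "GPL-2.0");
        return 0;
}
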
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __ASM_POWERPC_MMU_CONTEXT_H
#define __ASM_POWERPC_MMU_CONTEXT_H
#ifdef __KERNEL__

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/spinlock.h>
#include <asm/mmu.h>
#include <asm/cputable.h>
#include <asm/cputhreads.h>

/*
 * Most of the context management is out of line
 */
extern int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
extern void destroy_context(struct mm_struct *mm);
#ifdef CONFIG_SPAPR_TCE_IOMMU
struct mm_iommu_table_group_mem_t;

extern int isolate_lru_page(struct page *page);	/* from internal.h */
extern bool mm_iommu_preregistered(struct mm_struct *mm);
extern long mm_iommu_new(struct mm_struct *mm,
                unsigned long ua, unsigned long entries,
                struct mm_iommu_table_group_mem_t **pmem);
extern long mm_iommu_newdev(struct mm_struct *mm, unsigned long ua,
                unsigned long entries, unsigned long dev_hpa,
                struct mm_iommu_table_group_mem_t **pmem);
extern long mm_iommu_put(struct mm_struct *mm,
                struct mm_iommu_table_group_mem_t *mem);
extern void mm_iommu_init(struct mm_struct *mm);
extern void mm_iommu_cleanup(struct mm_struct *mm);
extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm,
                unsigned long ua, unsigned long size);
extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(
                struct mm_struct *mm, unsigned long ua, unsigned long size);
extern struct mm_iommu_table_group_mem_t *mm_iommu_get(struct mm_struct *mm,
                unsigned long ua, unsigned long entries);
extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
                unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
                unsigned long ua, unsigned int pageshift, unsigned long *hpa);
extern void mm_iommu_ua_mark_dirty_rm(struct mm_struct *mm, unsigned long ua);
extern bool mm_iommu_is_devmem(struct mm_struct *mm, unsigned long hpa,
                unsigned int pageshift, unsigned long *size);
extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
#else
static inline bool mm_iommu_is_devmem(struct mm_struct *mm, unsigned long hpa,
                unsigned int pageshift, unsigned long *size)
{
        return false;
}
static inline void mm_iommu_init(struct mm_struct *mm) { }
#endif
extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
extern void set_context(unsigned long id, pgd_t *pgd);

#ifdef CONFIG_PPC_BOOK3S_64

powerpc/mm/radix: Workaround prefetch issue with KVM
There's a somewhat architectural issue with Radix MMU and KVM.
When coming out of a guest with AIL (Alternate Interrupt Location, ie,
MMU enabled), we start executing hypervisor code with the PID register
still containing whatever the guest has been using.
The problem is that the CPU can (and will) then start prefetching or
speculatively load from whatever host context has that same PID (if
any), thus bringing translations for that context into the TLB, which
Linux doesn't know about.
This can cause stale translations and subsequent crashes.
Fixing this in a way that is neither racy nor a huge performance
impact is difficult. We could just make the host invalidations always
use broadcast forms but that would hurt single threaded programs for
example.
We chose to fix it instead by partitioning the PID space between guest
and host. This is possible because today Linux only uses 19 out of the
20 bits of PID space, so existing guests will work if we make the host
use the top half of the 20-bit space.
We additionally add support for a property to indicate to Linux the
size of the PID register which will be useful if we eventually have
processors with a larger PID space available.
There is still an issue with malicious guests purposefully setting the
PID register to a value in the host's PID range. Hopefully future HW
can prevent that, but in the meantime, we handle it with a pair of
kludges:
- On the way out of a guest, before we clear the current VCPU in the
PACA, we check the PID and if it's outside of the permitted range
we flush the TLB for that PID.
- When context switching, if the mm is "new" on that CPU (the
corresponding bit was set for the first time in the mm cpumask), we
check if any sibling thread is in KVM (has a non-NULL VCPU pointer
in the PACA). If that is the case, we also flush the PID for that
CPU (core).
This second part is needed to handle the case where a process is
migrated (or starts a new pthread) on a sibling thread of the CPU
coming out of KVM, as there's a window where stale translations can
exist before we detect it and flush them out.
A future optimization could be added by keeping track of whether the
PID has ever been used and avoid doing that for completely fresh PIDs.
We could similarly mark PIDs that have been the subject of a global
invalidation as "fresh". But for now this will do.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rework the asm to build with CONFIG_PPC_RADIX_MMU=n, drop
unneeded include of kvm_book3s_asm.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
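To make the first of those kludges concrete, the sketch below shows the kind
of guest-exit check being described; the function names and the 20-bit PID
width are assumptions for illustration only, not the kernel's actual
exit-path code:

void example_flush_tlb_pid(unsigned int pid);	/* hypothetical flush helper */

static inline void example_guest_exit_pid_check(unsigned int guest_pid)
{
        /* Assumed 20-bit PID register; the host owns the top half. */
        const unsigned int pid_bits = 20;
        const unsigned int host_pid_base = 1u << (pid_bits - 1);

        /* A PID left behind by the guest must stay in the bottom half. */
        if (guest_pid >= host_pid_base)
                example_flush_tlb_pid(guest_pid);
}
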
extern void radix__switch_mmu_context(struct mm_struct *prev,
                                      struct mm_struct *next);
static inline void switch_mmu_context(struct mm_struct *prev,
                                      struct mm_struct *next,
                                      struct task_struct *tsk)
{
        if (radix_enabled())
                return radix__switch_mmu_context(prev, next);
        return switch_slb(tsk, next);
}

extern int hash__alloc_context_id(void);
extern void hash__reserve_context_id(int id);
extern void __destroy_context(int context_id);
static inline void mmu_context_init(void) { }

static inline int alloc_extended_context(struct mm_struct *mm,
                                         unsigned long ea)
{
        int context_id;

        int index = ea >> MAX_EA_BITS_PER_CONTEXT;

        context_id = hash__alloc_context_id();
        if (context_id < 0)
                return context_id;

        VM_WARN_ON(mm->context.extended_id[index]);
        mm->context.extended_id[index] = context_id;
        return context_id;
}

static inline bool need_extra_context(struct mm_struct *mm, unsigned long ea)
{
        int context_id;

        context_id = get_user_context(&mm->context, ea);
        if (!context_id)
                return true;
        return false;
}

#else
extern void switch_mmu_context(struct mm_struct *prev, struct mm_struct *next,
                               struct task_struct *tsk);
extern unsigned long __init_new_context(void);
extern void __destroy_context(unsigned long context_id);
extern void mmu_context_init(void);
static inline int alloc_extended_context(struct mm_struct *mm,
                                         unsigned long ea)
{
        /* non book3s_64 should never find this called */
        WARN_ON(1);
        return -ENOMEM;
}

static inline bool need_extra_context(struct mm_struct *mm, unsigned long ea)
{
        return false;
}
#endif

#if defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE) && defined(CONFIG_PPC_RADIX_MMU)
extern void radix_kvm_prefetch_workaround(struct mm_struct *mm);
#else
static inline void radix_kvm_prefetch_workaround(struct mm_struct *mm) { }
#endif

extern void switch_cop(struct mm_struct *next);
extern int use_cop(unsigned long acop, struct mm_struct *mm);
extern void drop_cop(unsigned long acop, struct mm_struct *mm);

#ifdef CONFIG_PPC_BOOK3S_64
static inline void inc_mm_active_cpus(struct mm_struct *mm)
{
        atomic_inc(&mm->context.active_cpus);
}

static inline void dec_mm_active_cpus(struct mm_struct *mm)
{
        atomic_dec(&mm->context.active_cpus);
}

static inline void mm_context_add_copro(struct mm_struct *mm)
{
        /*
         * If any copro is in use, increment the active CPU count
         * in order to force TLB invalidations to be global as to
         * propagate to the Nest MMU.
         */
        if (atomic_inc_return(&mm->context.copros) == 1)
                inc_mm_active_cpus(mm);
}

static inline void mm_context_remove_copro(struct mm_struct *mm)
{
        int c;

        /*
         * When removing the last copro, we need to broadcast a global
         * flush of the full mm, as the next TLBI may be local and the
         * nMMU and/or PSL need to be cleaned up.
         *
         * Both the 'copros' and 'active_cpus' counts are looked at in
         * flush_all_mm() to determine the scope (local/global) of the
         * TLBIs, so we need to flush first before decrementing
         * 'copros'. If this API is used by several callers for the
         * same context, it can lead to over-flushing. It's hopefully
         * not common enough to be a problem.
         *
         * Skip on hash, as we don't know how to do the proper flush
         * for the time being. Invalidations will remain global if
         * used on hash. Note that we can't drop 'copros' either, as
         * it could make some invalidations local with no flush
         * in-between.
         */
        if (radix_enabled()) {
                flush_all_mm(mm);

                c = atomic_dec_if_positive(&mm->context.copros);
                /* Detect imbalance between add and remove */
                WARN_ON(c < 0);

                if (c == 0)
                        dec_mm_active_cpus(mm);
        }
}
#else
static inline void inc_mm_active_cpus(struct mm_struct *mm) { }
static inline void dec_mm_active_cpus(struct mm_struct *mm) { }
static inline void mm_context_add_copro(struct mm_struct *mm) { }
static inline void mm_context_remove_copro(struct mm_struct *mm) { }
#endif

extern void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
                               struct task_struct *tsk);

powerpc/mm: Ensure IRQs are off in switch_mm()
powerpc expects IRQs to already be (soft) disabled when switch_mm() is
called, as made clear in the commit message of 9c1e105238c4 ("powerpc: Allow
perf_counters to access user memory at interrupt time").
Aside from any race conditions that might exist between switch_mm() and an IRQ,
there is also an unconditional hard_irq_disable() in switch_slb(). If that isn't
followed at some point by an IRQ enable then interrupts will remain disabled
until we return to userspace.
It is true that when switch_mm() is called from the scheduler IRQs are off, but
not when it's called by use_mm(). Looking closer we see that last year in commit
f98db6013c55 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler")
this was made more explicit by the addition of switch_mm_irqs_off() which is now
called by the scheduler, vs switch_mm() which is used by use_mm().
Arguably it is a bug in use_mm() to call switch_mm() in a different context than
it expects, but fixing that will take time.
This was discovered recently when vhost started throwing warnings such as:
BUG: sleeping function called from invalid context at kernel/mutex.c:578
in_atomic(): 0, irqs_disabled(): 1, pid: 10768, name: vhost-10760
no locks held by vhost-10760/10768.
irq event stamp: 10
hardirqs last enabled at (9): _raw_spin_unlock_irq+0x40/0x80
hardirqs last disabled at (10): switch_slb+0x2e4/0x490
softirqs last enabled at (0): copy_process+0x5e8/0x1260
softirqs last disabled at (0): (null)
Call Trace:
show_stack+0x88/0x390 (unreliable)
dump_stack+0x30/0x44
__might_sleep+0x1c4/0x2d0
mutex_lock_nested+0x74/0x5c0
cgroup_attach_task_all+0x5c/0x180
vhost_attach_cgroups_work+0x58/0x80 [vhost]
vhost_worker+0x24c/0x3d0 [vhost]
kthread+0xec/0x100
ret_from_kernel_thread+0x5c/0xd4
Prior to commit 04b96e5528ca ("vhost: lockless enqueuing") (Aug 2016) the
vhost_worker() would do a spin_unlock_irq() not long after calling use_mm(),
which had the effect of reenabling IRQs. Since that commit removed the locking
in vhost_worker() the body of the vhost_worker() loop now runs with interrupts
off causing the warnings.
This patch addresses the problem by making the powerpc code mirror the x86 code,
ie. we disable interrupts in switch_mm(), and optimise the scheduler case by
defining switch_mm_irqs_off().
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[mpe: Flesh out/rewrite change log, add stable]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
                             struct task_struct *tsk)
{
        unsigned long flags;

        local_irq_save(flags);
        switch_mm_irqs_off(prev, next, tsk);
        local_irq_restore(flags);
}
#define switch_mm_irqs_off switch_mm_irqs_off

#define deactivate_mm(tsk,mm)		do { } while (0)

/*
 * After we have set current->mm to a new value, this activates
 * the context for the new mm so we see the new mappings.
 */
static inline void activate_mm(struct mm_struct *prev, struct mm_struct *next)
{
        switch_mm(prev, next, current);
}

/* We don't currently use enter_lazy_tlb() for anything */
static inline void enter_lazy_tlb(struct mm_struct *mm,
                                  struct task_struct *tsk)
{
        /* 64-bit Book3E keeps track of current PGD in the PACA */
#ifdef CONFIG_PPC_BOOK3E_64
        get_paca()->pgd = NULL;
#endif
}

extern void arch_exit_mmap(struct mm_struct *mm);

static inline void arch_unmap(struct mm_struct *mm,
                              unsigned long start, unsigned long end)
{
        if (start <= mm->context.vdso_base && mm->context.vdso_base < end)
                mm->context.vdso_base = 0;
}

static inline void arch_bprm_mm_init(struct mm_struct *mm,
                                     struct vm_area_struct *vma)
{
}

#ifdef CONFIG_PPC_MEM_KEYS
bool arch_vma_access_permitted(struct vm_area_struct *vma, bool write,
                               bool execute, bool foreign);
void arch_dup_pkeys(struct mm_struct *oldmm, struct mm_struct *mm);
#else /* CONFIG_PPC_MEM_KEYS */

mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
Today, for normal faults and page table walks, we check the VMA
and/or PTE to ensure that it is compatible with the action. For
instance, if we get a write fault on a non-writeable VMA, we
SIGSEGV.
We try to do the same thing for protection keys. Basically, we
try to make sure that if a user does this:
mprotect(ptr, size, PROT_NONE);
*ptr = foo;
they see the same effects with protection keys when they do this:
mprotect(ptr, size, PROT_READ|PROT_WRITE);
set_pkey(ptr, size, 4);
wrpkru(0xffffff3f); // access disable pkey 4
*ptr = foo;
The state to do that checking is in the VMA, but we also
sometimes have to do it on the page tables only, like when doing
a get_user_pages_fast() where we have no VMA.
We add two functions and expose them to generic code:
arch_pte_access_permitted(pte_flags, write)
arch_vma_access_permitted(vma, write)
These are, of course, backed up in x86 arch code with checks
against the PTE or VMA's protection key.
But, there are also cases where we do not want to respect
protection keys. When we ptrace(), for instance, we do not want
to apply the tracer's PKRU permissions to the PTEs from the
process being traced.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: Dominik Vogt <vogt@linux.vnet.ibm.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Shachar Raindel <raindel@mellanox.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20160212210219.14D5D715@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
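As a hedged illustration of how generic code might consult that hook, the
wrapper below is an example only, not code from this header or from the
generic GUP path:

/*
 * Sketch: ask the architecture whether this VMA may be touched for a
 * normal, non-remote access.  Remote (ptrace-style) callers would pass
 * foreign = true instead.
 */
static inline bool example_may_access(struct vm_area_struct *vma, bool write)
{
        return arch_vma_access_permitted(vma, write, false, false);
}
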
static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
                bool write, bool execute, bool foreign)
{
        /* by default, allow everything */
        return true;
}

#define pkey_mm_init(mm)
#define thread_pkey_regs_save(thread)
#define thread_pkey_regs_restore(new_thread, old_thread)
#define thread_pkey_regs_init(thread)
#define arch_dup_pkeys(oldmm, mm)

static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
{
        return 0x0UL;
}

#endif /* CONFIG_PPC_MEM_KEYS */

static inline int arch_dup_mmap(struct mm_struct *oldmm,
                                struct mm_struct *mm)
{
        arch_dup_pkeys(oldmm, mm);
        return 0;
}

#endif /* __KERNEL__ */
#endif /* __ASM_POWERPC_MMU_CONTEXT_H */