Commit Graph

141145 Commits

Author SHA1 Message Date
Christophe Leroy
252eb55816 powerpc: Fix boot on BOOK3S_32 with CONFIG_STRICT_KERNEL_RWX
On powerpc32, patch_instruction() is called by apply_feature_fixups()
which is called from early_init()

There is the following note in front of early_init():
 * Note that the kernel may be running at an address which is different
 * from the address that it was linked at, so we must use RELOC/PTRRELOC
 * to access static data (including strings).  -- paulus

Therefore, slab_is_available() cannot be called yet, and
text_poke_area must be addressed with PTRRELOC()

Fixes: 95902e6c88 ("powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-22 23:04:20 +11:00
Andrey Ryabinin
f68d62a567 x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
[ Note, this commit is a cherry-picked version of:

    d17a1d97dc: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")

  ... for easier x86 entry code testing and back-porting. ]

The KASAN shadow is currently mapped using vmemmap_populate() since that
provides a semi-convenient way to map pages into init_top_pgt.  However,
since that no longer zeroes the mapped pages, it is not suitable for
KASAN, which requires zeroed shadow memory.

Add kasan_populate_shadow() interface and use it instead of
vmemmap_populate().  Besides, this allows us to take advantage of
gigantic pages and use them to populate the shadow, which should save us
some memory wasted on page tables and reduce TLB pressure.

Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-22 07:18:35 +01:00
Andy Lutomirski
548c3050ea x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF
before it.  This means that users of entry_SYSCALL_64_after_hwframe()
were responsible for invoking TRACE_IRQS_OFF, and the one and only
user (Xen, added in the same commit) got it wrong.

I think this would manifest as a warning if a Xen PV guest with
CONFIG_DEBUG_LOCKDEP=y were used with context tracking.  (The
context tracking bit is to cause lockdep to get invoked before we
turn IRQs back on.)  I haven't tested that for real yet because I
can't get a kernel configured like that to boot at all on Xen PV.

Move TRACE_IRQS_OFF below the label.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 8a9949bc71 ("x86/xen/64: Rearrange the SYSCALL entries")
Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-22 06:35:48 +01:00
Michael Ellerman
f3f1dfd600 powerpc/perf/imc: Use cpu_to_node() not topology_physical_package_id()
init_imc_pmu() uses topology_physical_package_id() to detect the
node id of the processor it is on to get local memory, but that's
wrong, and can lead to crashes. Fix it to use cpu_to_node().

Fixes: 885dcd709b ("powerpc/perf: Add nest IMC PMU support")
Cc: stable@vger.kernel.org # v4.14+
Reported-By: Rob Lippert <rlippert@google.com>
Tested-By: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-22 12:24:46 +11:00
Kees Cook
86cb30ec07 treewide: setup_timer() -> timer_setup() (2 field)
This converts all remaining setup_timer() calls that use a nested field
to reach a struct timer_list. Coccinelle does not have an easy way to
match multiple fields, so a new script is needed to change the matches of
"&_E->_timer" into "&_E->_field1._timer" in all the rules.

spatch --very-quiet --all-includes --include-headers \
	-I ./arch/x86/include -I ./arch/x86/include/generated \
	-I ./include -I ./arch/x86/include/uapi \
	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
	--dir . \
	--cocci-file ~/src/data/timer_setup-2fields.cocci

@fix_address_of depends@
expression e;
@@

 setup_timer(
-&(e)
+&e
 , ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _field1;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_field1._timer, NULL, _E);
+timer_setup(&_E->_field1._timer, NULL, 0);
|
-setup_timer(&_E->_field1._timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, NULL, 0);
|
-setup_timer(&_E._field1._timer, NULL, &_E);
+timer_setup(&_E._field1._timer, NULL, 0);
|
-setup_timer(&_E._field1._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _field1;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_field1._timer, _callback, _E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, &_callback, _E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._field1._timer, _callback, 0);
|
 _E->_field1._timer@_stl.function = _callback;
|
 _E->_field1._timer@_stl.function = &_callback;
|
 _E->_field1._timer@_stl.function = (_cast_func)_callback;
|
 _E->_field1._timer@_stl.function = (_cast_func)&_callback;
|
 _E._field1._timer@_stl.function = _callback;
|
 _E._field1._timer@_stl.function = &_callback;
|
 _E._field1._timer@_stl.function = (_cast_func)_callback;
|
 _E._field1._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
 depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
(
	... when != _origarg
	_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _field1._timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _field1._timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _field1._timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(void *)_origarg;
+from_timer(_handle, t, _field1._timer);
	... when != _origarg
)
 }

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
 depends on change_timer_function_usage &&
                     !change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
+	_handletype *_origarg = from_timer(_origarg, t, _field1._timer);
+
	... when != _origarg
-	(_handletype *)_origarg
+	_origarg
	... when != _origarg
 }

// Avoid already converted callbacks.
@match_callback_converted
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
	    !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

 void _callback(struct timer_list *t)
 { ... }

// callback(struct something *handle)
@change_callback_handle_arg
 depends on change_timer_function_usage &&
	    !match_callback_converted &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

 void _callback(
-_handletype *_handle
+struct timer_list *t
 )
 {
+	_handletype *_handle = from_timer(_handle, t, _field1._timer);
	...
 }

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
 depends on change_timer_function_usage &&
	    change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

 void _callback(struct timer_list *t)
 {
-	_handletype *_handle = from_timer(_handle, t, _field1._timer);
 }

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg &&
	    !change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_field1._timer, _callback, 0);
+setup_timer(&_E->_field1._timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._field1._timer, _callback, 0);
+setup_timer(&_E._field1._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
 _E->_field1._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_field1._timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_field1._timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_field1._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._field1._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._field1._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._field1._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._field1._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._field1;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

 _callback(
(
-(_cast_data)_E
+&_E->_field1._timer
|
-(_cast_data)&_E
+&_E._field1._timer
|
-_E
+&_E->_field1._timer
)
 )

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _field1;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_field1._timer, _callback, 0);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, _callback, 0L);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E->_field1._timer, _callback, 0UL);
+timer_setup(&_E->_field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, 0);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, 0L);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_E._field1._timer, _callback, 0UL);
+timer_setup(&_E._field1._timer, _callback, 0);
|
-setup_timer(&_field1._timer, _callback, 0);
+timer_setup(&_field1._timer, _callback, 0);
|
-setup_timer(&_field1._timer, _callback, 0L);
+timer_setup(&_field1._timer, _callback, 0);
|
-setup_timer(&_field1._timer, _callback, 0UL);
+timer_setup(&_field1._timer, _callback, 0);
|
-setup_timer(_field1._timer, _callback, 0);
+timer_setup(_field1._timer, _callback, 0);
|
-setup_timer(_field1._timer, _callback, 0L);
+timer_setup(_field1._timer, _callback, 0);
|
-setup_timer(_field1._timer, _callback, 0UL);
+timer_setup(_field1._timer, _callback, 0);
)

@change_callback_unused_data
 depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *unused
 )
 {
	... when != _origarg
 }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:09 -08:00
Kees Cook
e99e88a9d2 treewide: setup_timer() -> timer_setup()
This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.

Casting from unsigned long:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, ptr);

and forced object casts:

    void my_callback(struct something *ptr)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);

become:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

Direct function assignments:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    ptr->my_timer.function = my_callback;

have a temporary cast added, along with converting the args:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;

And finally, callbacks without a data assignment:

    void my_callback(unsigned long data)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, 0);

have their argument renamed to verify they're unused during conversion:

    void my_callback(struct timer_list *unused)
    {
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

The conversion is done with the following Coccinelle script:

spatch --very-quiet --all-includes --include-headers \
	-I ./arch/x86/include -I ./arch/x86/include/generated \
	-I ./include -I ./arch/x86/include/uapi \
	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
	--dir . \
	--cocci-file ~/src/data/timer_setup.cocci

@fix_address_of@
expression e;
@@

 setup_timer(
-&(e)
+&e
 , ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
 _E->_timer@_stl.function = _callback;
|
 _E->_timer@_stl.function = &_callback;
|
 _E->_timer@_stl.function = (_cast_func)_callback;
|
 _E->_timer@_stl.function = (_cast_func)&_callback;
|
 _E._timer@_stl.function = _callback;
|
 _E._timer@_stl.function = &_callback;
|
 _E._timer@_stl.function = (_cast_func)_callback;
|
 _E._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
 depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
(
	... when != _origarg
	_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
)
 }

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
 depends on change_timer_function_usage &&
                     !change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
+	_handletype *_origarg = from_timer(_origarg, t, _timer);
+
	... when != _origarg
-	(_handletype *)_origarg
+	_origarg
	... when != _origarg
 }

// Avoid already converted callbacks.
@match_callback_converted
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
	    !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

 void _callback(struct timer_list *t)
 { ... }

// callback(struct something *handle)
@change_callback_handle_arg
 depends on change_timer_function_usage &&
	    !match_callback_converted &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

 void _callback(
-_handletype *_handle
+struct timer_list *t
 )
 {
+	_handletype *_handle = from_timer(_handle, t, _timer);
	...
 }

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
 depends on change_timer_function_usage &&
	    change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

 void _callback(struct timer_list *t)
 {
-	_handletype *_handle = from_timer(_handle, t, _timer);
 }

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg &&
	    !change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
 _E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

 _callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
 )

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)

@change_callback_unused_data
 depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *unused
 )
 {
	... when != _origarg
 }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:07 -08:00
Kees Cook
b9eaf18722 treewide: init_timer() -> setup_timer()
This mechanically converts all remaining cases of ancient open-coded timer
setup with the old setup_timer() API, which is the first step in timer
conversions. This has no behavioral changes, since it ultimately just
changes the order of assignment to fields of struct timer_list when
finding variations of:

    init_timer(&t);
    f.function = timer_callback;
    t.data = timer_callback_arg;

to be converted into:

    setup_timer(&t, timer_callback, timer_callback_arg);

The conversion is done with the following Coccinelle script, which
is an improved version of scripts/cocci/api/setup_timer.cocci, in the
following ways:
 - assignments-before-init_timer() cases
 - limit the .data case removal to the specific struct timer_list instance
 - handling calls by dereference (timer->field vs timer.field)

spatch --very-quiet --all-includes --include-headers \
	-I ./arch/x86/include -I ./arch/x86/include/generated \
	-I ./include -I ./arch/x86/include/uapi \
	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
	--dir . \
	--cocci-file ~/src/data/setup_timer.cocci

@fix_address_of@
expression e;
@@

 init_timer(
-&(e)
+&e
 , ...)

// Match the common cases first to avoid Coccinelle parsing loops with
// "... when" clauses.

@match_immediate_function_data_after_init_timer@
expression e, func, da;
@@

-init_timer
+setup_timer
 ( \(&e\|e\)
+, func, da
 );
(
-\(e.function\|e->function\) = func;
-\(e.data\|e->data\) = da;
|
-\(e.data\|e->data\) = da;
-\(e.function\|e->function\) = func;
)

@match_immediate_function_data_before_init_timer@
expression e, func, da;
@@

(
-\(e.function\|e->function\) = func;
-\(e.data\|e->data\) = da;
|
-\(e.data\|e->data\) = da;
-\(e.function\|e->function\) = func;
)
-init_timer
+setup_timer
 ( \(&e\|e\)
+, func, da
 );

@match_function_and_data_after_init_timer@
expression e, e2, e3, e4, e5, func, da;
@@

-init_timer
+setup_timer
 ( \(&e\|e\)
+, func, da
 );
 ... when != func = e2
     when != da = e3
(
-e.function = func;
... when != da = e4
-e.data = da;
|
-e->function = func;
... when != da = e4
-e->data = da;
|
-e.data = da;
... when != func = e5
-e.function = func;
|
-e->data = da;
... when != func = e5
-e->function = func;
)

@match_function_and_data_before_init_timer@
expression e, e2, e3, e4, e5, func, da;
@@
(
-e.function = func;
... when != da = e4
-e.data = da;
|
-e->function = func;
... when != da = e4
-e->data = da;
|
-e.data = da;
... when != func = e5
-e.function = func;
|
-e->data = da;
... when != func = e5
-e->function = func;
)
... when != func = e2
    when != da = e3
-init_timer
+setup_timer
 ( \(&e\|e\)
+, func, da
 );

@r1 exists@
expression t;
identifier f;
position p;
@@

f(...) { ... when any
  init_timer@p(\(&t\|t\))
  ... when any
}

@r2 exists@
expression r1.t;
identifier g != r1.f;
expression e8;
@@

g(...) { ... when any
  \(t.data\|t->data\) = e8
  ... when any
}

// It is dangerous to use setup_timer if data field is initialized
// in another function.
@script:python depends on r2@
p << r1.p;
@@

cocci.include_match(False)

@r3@
expression r1.t, func, e7;
position r1.p;
@@

(
-init_timer@p(&t);
+setup_timer(&t, func, 0UL);
... when != func = e7
-t.function = func;
|
-t.function = func;
... when != func = e7
-init_timer@p(&t);
+setup_timer(&t, func, 0UL);
|
-init_timer@p(t);
+setup_timer(t, func, 0UL);
... when != func = e7
-t->function = func;
|
-t->function = func;
... when != func = e7
-init_timer@p(t);
+setup_timer(t, func, 0UL);
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:06 -08:00
Kees Cook
24ed960abf treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
This changes all DEFINE_TIMER() callbacks to use a struct timer_list
pointer instead of unsigned long. Since the data argument has already been
removed, none of these callbacks are using their argument currently, so
this renames the argument to "unused".

Done using the following semantic patch:

@match_define_timer@
declarer name DEFINE_TIMER;
identifier _timer, _callback;
@@

 DEFINE_TIMER(_timer, _callback);

@change_callback depends on match_define_timer@
identifier match_define_timer._callback;
type _origtype;
identifier _origarg;
@@

 void
-_callback(_origtype _origarg)
+_callback(struct timer_list *unused)
 { ... }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:05 -08:00
Kees Cook
6cc73a06da s390: cmm: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:46:44 -08:00
Vineet Gupta
82385732b1 ARC: perf: avoid vmalloc backed mmap
For non-alising Dcache, vmalloc is not needed.

vmalloc triggers additonal D-TLB Misses in the perf interrupt code path
making it slightly inefficient as evident from hackbench runs below.

| [ARCLinux]# perf stat -e dTLB-load-misses --repeat 5 hackbench
| Running with 10*40 (== 400) tasks.
| Time: 35.060
| ...
| Performance counter stats for 'hackbench' (5 runs):

Before:      399235      dTLB-load-misses ( +-  2.08% )
After :      397676      dTLB-load-misses ( +-  2.27% )

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-11-21 15:21:08 -08:00
Vineet Gupta
5b9027d6d0 ARCv2: perf: optimize given that num counters <= 32
use ffz primitive which maps to ARCv2 instruction, vs. non atomic
__test_and_set_bit

It is unlikely if we will even have more than 32 counters, but still add
a BUILD_BUG to catch that

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-11-21 15:20:55 -08:00
Vineet Gupta
4d43129040 ARCv2: perf: tweak overflow interrupt
Current perf ISR loops thru all 32 counters, checking for each if it
caused the interrupt. Instead only loop thru counters which actually
interrupted (typically 1).

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-11-21 15:20:31 -08:00
Philip Derrin
400eeffaff ARM: 8722/1: mm: make STRICT_KERNEL_RWX effective for LPAE
Currently, for ARM kernels with CONFIG_ARM_LPAE and
CONFIG_STRICT_KERNEL_RWX enabled, the 2MiB pages mapping the
kernel code and rodata are writable. They are marked read-only in
a software bit (L_PMD_SECT_RDONLY) but the hardware read-only bit
is not set (PMD_SECT_AP2).

For user mappings, the logic that propagates the software bit
to the hardware bit is in set_pmd_at(); but for the kernel,
section_update() writes the PMDs directly, skipping this logic.

The fix is to set PMD_SECT_AP2 for read-only sections in
section_update(), at the same time as L_PMD_SECT_RDONLY.

Fixes: 1e3479225a ("ARM: 8275/1: mm: fix PMD_SECT_RDONLY undeclared compile error")
Signed-off-by: Philip Derrin <philip@cog.systems>
Reported-by: Neil Dick <neil@cog.systems>
Tested-by: Neil Dick <neil@cog.systems>
Tested-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-11-21 15:10:07 +00:00
Philip Derrin
3b0c0c922f ARM: 8721/1: mm: dump: check hardware RO bit for LPAE
When CONFIG_ARM_LPAE is set, the PMD dump relies on the software
read-only bit to determine whether a page is writable. This
concealed a bug which left the kernel text section writable
(AP2=0) while marked read-only in the software bit.

In a kernel with the AP2 bug, the dump looks like this:

    ---[ Kernel Mapping ]---
    0xc0000000-0xc0200000           2M RW NX SHD
    0xc0200000-0xc0600000           4M ro x  SHD
    0xc0600000-0xc0800000           2M ro NX SHD
    0xc0800000-0xc4800000          64M RW NX SHD

The fix is to check that the software and hardware bits are both
set before displaying "ro". The dump then shows the true perms:

    ---[ Kernel Mapping ]---
    0xc0000000-0xc0200000           2M RW NX SHD
    0xc0200000-0xc0600000           4M RW x  SHD
    0xc0600000-0xc0800000           2M RW NX SHD
    0xc0800000-0xc4800000          64M RW NX SHD

Fixes: ded9477984 ("ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE")
Signed-off-by: Philip Derrin <philip@cog.systems>
Tested-by: Neil Dick <neil@cog.systems>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-11-21 15:10:07 +00:00
Russell King
29337b60ab ARM: make decompressor debug output user selectable
Make the decompressor debug output user selectable, otherwise merely
enabling DEBUG_LL causes the decompressor to become board specific,
thereby preventing a multi-platform kernel from booting.  Enabling
DEBUG_LL doesn't cause the kernel itself to become platform specific
unless EARLY_PRINTK is enabled, or one of the debugging routines is
added in a path that results in it being called.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-11-21 14:46:11 +00:00
Russell King
1ee5e87f86 ARM: fix get_user_pages_fast
Ensure that get_user_pages_fast() is not able to access memory which
has been mapped with PROT_NONE.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-11-21 14:45:36 +00:00
Sukadev Bhattiprolu
62b49c4210 powerpc/vas: Export chip_to_vas_id()
Export the symbol chip_to_vas_id() to fix a build failure when
CONFIG_CRYPTO_DEV_NX_COMPRESS_POWERNV=m.

Fixes: d4ef61b5e8 ("powerpc/vas, nx-842: Define and use chip_to_vas_id()")
Reported-by: Haren Myneni <hbabu@us.ibm.com>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-21 21:02:26 +11:00
Ricardo Neri
fd11a6496e x86/umip: Print a warning into the syslog if UMIP-protected instructions are used
Print a rate-limited warning when a user-space program attempts to execute
any of the instructions that UMIP protects (i.e., SGDT, SIDT, SLDT, STR
and SMSW).

This is useful, because when CONFIG_X86_INTEL_UMIP=y is selected and
supported by the hardware, user space programs that try to execute such
instructions will receive a SIGSEGV signal that they might not expect.

In the specific cases for which emulation is provided (instructions SGDT,
SIDT and SMSW in protected and virtual-8086 modes), no signal is
generated. However, a warning is helpful to encourage updates in such
programs to avoid the use of such instructions.

Warnings are printed via a customized printk() function that also provides
information about the program that attempted to use the affected
instructions.

Utility macros are defined to wrap umip_printk() for the error and warning
kernel log levels.

While here, replace an existing call to the generic rate-limited pr_err()
with the new umip_pr_err().

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: ricardo.neri@intel.com
Link: http://lkml.kernel.org/r/1511233476-17088-1-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-21 08:13:43 +01:00
Aneesh Kumar K.V
7a06c66835 powerpc/64s/slice: Use addr limit when computing slice mask
While computing slice mask for the free area we need make sure we only
search in the addr limit applicable for this mmap. We update the
slb_addr_limit after we request for a mmap above 128TB. But the
following mmap request with hint addr below 128TB should still limit
its search to below 128TB. ie. we should not use slb_addr_limit to
compute slice mask in this case. Instead, we should derive high addr
limit based on the mmap hint addr value.

Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-20 19:28:25 +11:00
Heiko Carstens
de35089cc8 s390/disassembler: remove confusing code
When searching the opcode offset table within find_insn() the check
"entry->opcode == 0" was intended to clarify that 1-byte opcodes, the
first one being 0, are special.

However there is no mnemonic for an illegal opcode starting with 0.
Therefore there is also no opcode offset table entry that matches,
which again means that the check never is true. Therefore just remove
the confusing check, and add a comment which hopefully explains how
this works.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-20 08:51:02 +01:00
Heiko Carstens
3241d3eb7d s390: rework __switch_to() to allow larger task_struct offsets
If GCC_PLUGIN_RANDSTRUCT is enabled the members of task_struct will be
shuffled around. The offsets of the "pid" and "stack" members within
task_struct may not necessarily fit into 12 bits anymore, which causes
compile errors within __switch_to, since instructions are used, which
only have a 12 bit displacement field.

Therefore rework __switch_to, to allow for larger offsets.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-20 08:51:01 +01:00
Thomas Richter
38389ec84e s390/topology: fix compile error in file arch/s390/kernel/smp.c
Commit 1887aa07b6
("s390/topology: add detection of dedicated vs shared CPUs")
introduced following compiler error when CONFIG_SCHED_TOPOLOGY is not set.

 CC      arch/s390/kernel/smp.o
...
arch/s390/kernel/smp.c: In function ‘smp_start_secondary’:
arch/s390/kernel/smp.c:812:6: error: implicit declaration of function
	‘topology_cpu_dedicated’; did you mean ‘topology_cpu_init’?

This patch fixes the compiler error by adding function
topology_cpu_dedicated() to return false when this config option is
not defined.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-20 08:51:01 +01:00
Linus Torvalds
1deab8ce2c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:

 1) Add missing cmpxchg64() for 32-bit sparc.

 2) Timer conversions from Allen Pais and Kees Cook.

 3) vDSO support, from Nagarathnam Muthusamy.

 4) Fix sparc64 huge page table walks based upon bug report by Al Viro,
    from Nitin Gupta.

 5) Optimized fls() for T4 and above, from Vijay Kumar.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix page table walk for PUD hugepages
  sparc64: Convert timers to user timer_setup()
  sparc64: convert mdesc_handle.refcnt from atomic_t to refcount_t
  sparc/led: Convert timers to use timer_setup()
  sparc64: Use sparc optimized fls and __fls for T4 and above
  sparc64: SPARC optimized __fls function
  sparc64: SPARC optimized fls function
  sparc64: Define SPARC default __fls function
  sparc64: Define SPARC default fls function
  vDSO for sparc
  sparc32: Add cmpxchg64().
  sbus: char: Move D7S_MINOR to include/linux/miscdevice.h
  sparc: time: Remove unneeded linux/miscdevice.h include
  sparc64: mmu_context: Add missing include files
2017-11-17 20:21:44 -08:00
Masahiro Yamada
bf070bb0e6 kbuild: remove all dummy assignments to obj-
Now kbuild core scripts create empty built-in.o where necessary.
Remove "obj- := dummy.o" tricks.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-18 11:46:06 +09:00
Linus Torvalds
09bd7c75e5 Kbuild updates for v4.15
One of the most remarkable improvements in this cycle is, Kbuild is
 now able to cache the result of shell commands.  Some variables are
 expensive to compute, for example, $(call cc-option,...) invokes the
 compiler.  It is not efficient to redo this computation every time,
 even when we are not actually building anything.  Kbuild creates a
 hidden file ".cache.mk" that contains invoked shell commands and
 their results.  The speed-up should be noticeable.
 
 Summary:
 
 - Fix arch build issues (hexagon, sh)
 
 - Clean up various Makefiles and scripts
 
 - Fix wrong usage of {CFLAGS,LDFLAGS}_MODULE in arch Makefiles
 
 - Cache variables that are expensive to compute
 
 - Improve cc-ldopton and ld-option for Clang
 
 - Optimize output directory creation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDxaLAAoJED2LAQed4NsGIHQP/isMxxaIxIAWU56+ZcII74k7
 639VgrKi9n5y25d1dBRTQg+vReHE6E2JbkCqpVOu11t7m0LT7yUK8v3WwyLf1qTN
 GxnqZ/WMQU5/AYVqIWo8jN4FGHpivHJ6qbeiNJM9qN4RAkzG0sZUq746VaFZYmIR
 Lu0Gf4m4qjifkkhXsQdWT5i7yNTidPqaL6GNb+FcFkEHlVre8jma0kJlgfHxru84
 WmETpjQXvHAZ/R61vY6ekAWpqFhw3ecJY96A9npnx+SQVQdSNAdpaU0SK29jB0ON
 /SAfpHg9oa/gD0LFOKV6zkjnAkd4TEjrJEiHHhz5gjT/SbS3T1llBIGZ1oV4X7Y0
 Vlh9KWlm1FJJI4SIzc9qUaQMp6JtLfEfHKJCc45xVaN3fNrDnR8jl80x5+95ELga
 dCkZgnq5u82MtTysCbHBESwDYQaVPyIrh7In+mduglaCqhqj9KoDjoLoiGfCg7SA
 3tPflYVd629w5l5GrazJ40jWn1+ggMtgMOVooJNJ+dINCP+GxsUpH84Ww2Pdic+/
 qLdud6TeqxrZDGzWXqKNLu8alM8NGgSr101l9gIf1oqSyy63duBpMrxGDoIJS3FU
 rFDoFFUhlfkAXNbQHtVGNzKtcpCjURh992j9Fa1+NfMwSce5IHkMwTvPmNSRowi8
 0llLjXhD/bxK6FpdvlV8
 =zIdO
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:
 "One of the most remarkable improvements in this cycle is, Kbuild is
  now able to cache the result of shell commands. Some variables are
  expensive to compute, for example, $(call cc-option,...) invokes the
  compiler. It is not efficient to redo this computation every time,
  even when we are not actually building anything. Kbuild creates a
  hidden file ".cache.mk" that contains invoked shell commands and their
  results. The speed-up should be noticeable.

  Summary:

   - Fix arch build issues (hexagon, sh)

   - Clean up various Makefiles and scripts

   - Fix wrong usage of {CFLAGS,LDFLAGS}_MODULE in arch Makefiles

   - Cache variables that are expensive to compute

   - Improve cc-ldopton and ld-option for Clang

   - Optimize output directory creation"

* tag 'kbuild-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits)
  kbuild: move coccicheck help from scripts/Makefile.help to top Makefile
  sh: decompressor: add shipped files to .gitignore
  frv: .gitignore: ignore vmlinux.lds
  selinux: remove unnecessary assignment to subdir-
  kbuild: specify FORCE in Makefile.headersinst as .PHONY target
  kbuild: remove redundant mkdir from ./Kbuild
  kbuild: optimize object directory creation for incremental build
  kbuild: create object directories simpler and faster
  kbuild: filter-out PHONY targets from "targets"
  kbuild: remove redundant $(wildcard ...) for cmd_files calculation
  kbuild: create directory for make cache only when necessary
  sh: select KBUILD_DEFCONFIG depending on ARCH
  kbuild: fix linker feature test macros when cross compiling with Clang
  kbuild: shrink .cache.mk when it exceeds 1000 lines
  kbuild: do not call cc-option before KBUILD_CFLAGS initialization
  kbuild: Cache a few more calls to the compiler
  kbuild: Add a cache for generated variables
  kbuild: add forward declaration of default target to Makefile.asm-generic
  kbuild: remove KBUILD_SUBDIR_ASFLAGS and KBUILD_SUBDIR_CCFLAGS
  hexagon/kbuild: replace CFLAGS_MODULE with KBUILD_CFLAGS_MODULE
  ...
2017-11-17 17:45:29 -08:00
Linus Torvalds
fa7f578076 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - a bit more MM

 - procfs updates

 - dynamic-debug fixes

 - lib/ updates

 - checkpatch

 - epoll

 - nilfs2

 - signals

 - rapidio

 - PID management cleanup and optimization

 - kcov updates

 - sysvipc updates

 - quite a few misc things all over the place

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
  EXPERT Kconfig menu: fix broken EXPERT menu
  include/asm-generic/topology.h: remove unused parent_node() macro
  arch/tile/include/asm/topology.h: remove unused parent_node() macro
  arch/sparc/include/asm/topology_64.h: remove unused parent_node() macro
  arch/sh/include/asm/topology.h: remove unused parent_node() macro
  arch/ia64/include/asm/topology.h: remove unused parent_node() macro
  drivers/pcmcia/sa1111_badge4.c: avoid unused function warning
  mm: add infrastructure for get_user_pages_fast() benchmarking
  sysvipc: make get_maxid O(1) again
  sysvipc: properly name ipc_addid() limit parameter
  sysvipc: duplicate lock comments wrt ipc_addid()
  sysvipc: unteach ids->next_id for !CHECKPOINT_RESTORE
  initramfs: use time64_t timestamps
  drivers/watchdog: make use of devm_register_reboot_notifier()
  kernel/reboot.c: add devm_register_reboot_notifier()
  kcov: update documentation
  Makefile: support flag -fsanitizer-coverage=trace-cmp
  kcov: support comparison operands collection
  kcov: remove pointless current != NULL check
  kernel/panic.c: add TAINT_AUX
  ...
2017-11-17 16:56:17 -08:00
Dou Liyang
52563d05f2 arch/tile/include/asm/topology.h: remove unused parent_node() macro
Commit a7be6e5a7f ("mm: drop useless local parameters of
__register_one_node()") removed the last user of parent_node().

The parent_node() macro in tile platform is unnecessary.

Remove it for cleanup.

Link: http://lkml.kernel.org/r/1504234599-29533-7-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:04 -08:00
Dou Liyang
5f4cdac6bc arch/sparc/include/asm/topology_64.h: remove unused parent_node() macro
Commit a7be6e5a7f ("mm: drop useless local parameters of
__register_one_node()") removed the last user of parent_node().

The parent_node() macro in SPARC64 platform is unnecessary.

Remove it for cleanup.

Link: http://lkml.kernel.org/r/1504234599-29533-6-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:04 -08:00
Dou Liyang
ece15787e7 arch/sh/include/asm/topology.h: remove unused parent_node() macro
Commit a7be6e5a7f ("mm: drop useless local parameters of
__register_one_node()") removed the last user of parent_node().

The parent_node() macro in SUPERH platform is unnecessary.

Remove it for cleanup.

Link: http://lkml.kernel.org/r/1504234599-29533-5-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:04 -08:00
Dou Liyang
5eb9e8ac9a arch/ia64/include/asm/topology.h: remove unused parent_node() macro
Commit a7be6e5a7f ("mm: drop useless local parameters of
__register_one_node()") removed the last user of parent_node().

The parent_node() macro in IA64(Itanium) platform is unnecessary.

Remove it for cleanup.

Link: http://lkml.kernel.org/r/1504234599-29533-2-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:04 -08:00
Gargi Sharma
e8cfbc245e pid: remove pidhash
pidhash is no longer required as all the information can be looked up
from idr tree.  nr_hashed represented the number of pids that had been
hashed.  Since, nr_hashed and PIDNS_HASH_ADDING are no longer relevant,
it has been renamed to pid_allocated and PIDNS_ADDING respectively.

[gs051095@gmail.com: v6]
  Link: http://lkml.kernel.org/r/1507760379-21662-3-git-send-email-gs051095@gmail.com
Link: http://lkml.kernel.org/r/1507583624-22146-3-git-send-email-gs051095@gmail.com
Signed-off-by: Gargi Sharma <gs051095@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>	[ia64]
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:04 -08:00
Gargi Sharma
95846ecf9d pid: replace pid bitmap implementation with IDR API
Patch series "Replacing PID bitmap implementation with IDR API", v4.

This series replaces kernel bitmap implementation of PID allocation with
IDR API.  These patches are written to simplify the kernel by replacing
custom code with calls to generic code.

The following are the stats for pid and pid_namespace object files
before and after the replacement.  There is a noteworthy change between
the IDR and bitmap implementation.

Before
   text       data        bss        dec        hex    filename
   8447       3894         64      12405       3075    kernel/pid.o
After
   text       data        bss        dec        hex    filename
   3397        304          0       3701        e75    kernel/pid.o

Before
   text       data        bss        dec        hex    filename
   5692       1842        192       7726       1e2e    kernel/pid_namespace.o
After
   text       data        bss        dec        hex    filename
   2854        216         16       3086        c0e    kernel/pid_namespace.o

The following are the stats for ps, pstree and calling readdir on /proc
for 10,000 processes.

ps:
        With IDR API    With bitmap
real    0m1.479s        0m2.319s
user    0m0.070s        0m0.060s
sys     0m0.289s        0m0.516s

pstree:
        With IDR API    With bitmap
real    0m1.024s        0m1.794s
user    0m0.348s        0m0.612s
sys     0m0.184s        0m0.264s

proc:
        With IDR API    With bitmap
real    0m0.059s        0m0.074s
user    0m0.000s        0m0.004s
sys     0m0.016s        0m0.016s

This patch (of 2):

Replace the current bitmap implementation for Process ID allocation.
Functions that are no longer required, for example, free_pidmap(),
alloc_pidmap(), etc.  are removed.  The rest of the functions are
modified to use the IDR API.  The change was made to make the PID
allocation less complex by replacing custom code with calls to generic
API.

[gs051095@gmail.com: v6]
  Link: http://lkml.kernel.org/r/1507760379-21662-2-git-send-email-gs051095@gmail.com
[avagin@openvz.org: restore the old behaviour of the ns_last_pid sysctl]
  Link: http://lkml.kernel.org/r/20171106183144.16368-1-avagin@openvz.org
Link: http://lkml.kernel.org/r/1507583624-22146-2-git-send-email-gs051095@gmail.com
Signed-off-by: Gargi Sharma <gs051095@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:03 -08:00
Kees Cook
2a8358d8a3 bug: define the "cut here" string in a single place
The "cut here" string is used in a few paths.  Define it in a single
place.

Link: http://lkml.kernel.org/r/1510100869-73751-3-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:01 -08:00
Kees Cook
fb6cc4ac15 sh/boot: add static stack-protector to pre-kernel
The sh decompressor code triggers stack-protector code generation when
using CONFIG_CC_STACKPROTECTOR_STRONG.  As done for arm and mips, add a
simple static stack-protector canary.  As this wasn't protected before,
the risk of using a weak canary is minimized.  Once the kernel is
actually up, a better canary is chosen.

Link: http://lkml.kernel.org/r/1506972007-80614-2-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michal Marek <mmarek@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-17 16:10:00 -08:00
Linus Torvalds
e75080f185 Two power management fixes for v4.15-rc1
This is the change making /proc/cpuinfo on x86 report current
 CPU frequency in "cpu MHz" again in all cases and an additional
 one dealing with an overzealous check in one of the helper
 routines in the runtime PM framework.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJaDvBIAAoJEILEb/54YlRxZ58QAJP6p53XDcml8Risw9CrpnZV
 6kBdFTYn6JSJiE4cALTER14ScqHQdTP2M6QJPDDLV5LwiQFa5fJYsSNP7F1Dpg4r
 8V3QNZbBjpyc8rSGRUkjY7+WsvUUb2UWzEkLIUjOWIT4mfC969JxV/fBYEL7ZDn9
 Wg7q79qI5Tss9PU2GUmaFtdkR0lqUIdNrrWe+qyLl0XHkrmU8DGL4XkPykdkwX0L
 gn0i/RrK+5DBUVPR1qQTU2CO3751IdIDktpK3RLmWl/yb4TqlM4WKIhIZvvglc2g
 S+OWGg/E4CNU6/EcGllNCPENAH7v0FNvvLMslPs6ao+wGQBcgO4R5d70dzobph/i
 P1ns6iJbd+lgRlGSQBReVo/FWcwi4HrINRxAB4W88dBBxchHdt+G3/Juq6GiGEJi
 mOh3ZHWd0J3mQEIWLKEcm5nHwIeY9yhCFJIpr5azte7JIz1fDuMnnp2gYl1SOVCK
 CHv0uD8Mw7hQFC0Dzje8T0Hr29MBwpEJiXE4Eh+Fp4zWiI7BYd1TNtp5WPDtchhv
 weqFqgDArN5gpkrZuSsxxg8eeRRwPeQR/mCyxofmsQ5lplCVJi8Ieqcf/KZrCy/c
 1vHGJsn9ec2dNeQKTFFT5luznQSSSXoZCXprumFuTp2804E3Hpkf/UnAldc4EYSn
 SwzAOO3gNA76eaFikvTK
 =h6Ux
 -----END PGP SIGNATURE-----

Merge tag 'pm-fixes-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull two power management fixes from Rafael Wysocki:
 "This is the change making /proc/cpuinfo on x86 report current CPU
  frequency in "cpu MHz" again in all cases and an additional one
  dealing with an overzealous check in one of the helper routines in the
  runtime PM framework"

* tag 'pm-fixes-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / runtime: Drop children check from __pm_runtime_set_status()
  x86 / CPU: Always show current CPU frequency in /proc/cpuinfo
2017-11-17 14:49:25 -08:00
Linus Torvalds
e29116758c Merge branch 'parisc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
 "Highlights:

   - one important fix from Dave to prevent kernel crash when userspace
     hands over invalid values to our in-kernel CAS implementation.

   - added CPU topology support, including multi-core scheduler support
     on PA8900 CPUs

  Minor changes:

   - minor fixes for sparse (from Luc)

   - drop duplicates for CPU_BIG_ENDIAN from parisc and sparc top
     Kconfig files (from Babu)

   - reorganized parisc PDC (firmware-access) header files for usage
     from userspace. Required for upcoming qemu parisc emulator and
     SeaBIOS fork to support parisc"

* 'parisc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  arch: Fix duplicates in Kconfig for parisc and sparc
  parisc: Make some PDC structures accessible in uapi headers
  parisc: Pass endianness info to sparse
  parisc: Add CPU topology support
  parisc: Fix validity check of pointer size argument in new CAS implementation
2017-11-17 14:26:14 -08:00
Linus Torvalds
e71d5126e7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull second round of s390 updates from Martin Schwidefsky:

 - rework of the vdso code to avoid the use of the access register mode

 - use perf AUX buffers for the transport of diagnostic sample data

 - add perf_regs and user stack dump support

 - enable perf call graphs for user space programs

 - add perf register support for floating-point registers

 - all remaining s390 related timer_setup conversions

 - bug fixes and cleanups

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (30 commits)
  s390: remove unused parameter from Makefile
  zfcp: purely mechanical update using timer API, plus blank lines
  s390/scsi: Convert timers to use timer_setup()
  s390/cpum_sf: correctly set the PID and TID in perf samples
  s390/cpum_sf: load program parameter at sampler enablement
  s390/perf: add perf register support for floating-point registers
  s390/perf: extend perf_regs support to include floating-point registers
  s390/perf: define common DWARF register string table
  s390/perf: add support for perf_regs and libdw
  s390/perf: add perf_regs support and user stack dump
  s390/cpum_sf: do not register PMU if no sampling mode is authorized
  s390/cpumf: remove raw event support in basic-only sampling mode
  s390/perf: add callback to perf to enable using AUX buffer
  s390/cpumf: enable using AUX buffer
  s390/cpumf: introduce AUX buffer for dump diagnostic sample data
  s390/disassembler: increase show_code buffer size
  s390: Remove CONFIG_HARDENED_USERCOPY
  s390: enable CPU alternatives unconditionally
  s390/nmi: remove unused code
  s390/mm: remove unused code
  ...
2017-11-17 14:23:52 -08:00
Linus Torvalds
5a3e0b196b File locking related changes for v4.15
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaDuoWAAoJEAAOaEEZVoIVXEQP/jQYoU9hgvEj8j3ZIgi56SDJ
 pR45w2zcJz2/uU43DEKyShyLgsuoBbJQ3l/gGBH/tl+xGm9NzB0gatoEu9GmKNYz
 /IN6/vUFnoIAUyD+iMZbpmsYKIkz0z2YJo261IfspAwIft/cvHJnYYGQrP9YXg9F
 c7bdDuANTKocdQigc4BQyOe3OfIBGfTwJhuakO+1yuZmGOVNyxEcdYbMM8FiTfc8
 +62kvQQ3t7WMqSbM8M0QdGcYQjG0EwcVAuV7COurLJIva7hUkVel32MVUjoFcf28
 BnRu2ztFJCubm1HA85twlJDtpeXbcMqrUl/CcwRMpwDaePd5GVB1h5iKqbZ51BZ1
 fWT2STmt+8hY2B5eiXoYEaG3B7ZRr+r0oroxqOxpiZ/m4AVeouF+gPGv+NV5zgvD
 NGWC0MdklIJ4xaC99NEeP6kBhz0M74VKymFCTeHkVg9m4TqDepNvitKed0qagw19
 uw8seei7TOTm4o117+l55NHmyfTHXFO4U0WLTJyeZcoEnUs0rOcHeqyy0RwCBMrK
 W2fJtdBLFr+tBIIrID4TnPhhYtSvIPjz+FpiRDobqhgvMva/PIvLGTWK4unrgIjG
 ZQ7YGnwWda8GjqKhgZacn/BSXyJzOAF9hJp0mz2ORaOxaMarEV55duiZufCvGuZw
 uUQWRCKuQX7Oi05i9jXp
 =fCeF
 -----END PGP SIGNATURE-----

Merge tag 'locks-v4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking update from Jeff Layton:
 "A couple of fixes for a patch that went into v4.14, and the bug report
  just came in a few days ago.. It passes my (minimal) testing, and has
  been in linux-next for a few days now.

  I also would like to get my address changed in MAINTAINERS to clear
  that hurdle"

* tag 'locks-v4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall
  fcntl: don't leak fd reference when fixup_compat_flock fails
  MAINTAINERS: s/jlayton@poochiereds.net/jlayton@kernel.org/
2017-11-17 13:21:58 -08:00
Linus Torvalds
93f30c73ec Merge branch 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat and uaccess updates from Al Viro:

 - {get,put}_compat_sigset() series

 - assorted compat ioctl stuff

 - more set_fs() elimination

 - a few more timespec64 conversions

 - several removals of pointless access_ok() in places where it was
   followed only by non-__ variants of primitives

* 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (24 commits)
  coredump: call do_unlinkat directly instead of sys_unlink
  fs: expose do_unlinkat for built-in callers
  ext4: take handling of EXT4_IOC_GROUP_ADD into a helper, get rid of set_fs()
  ipmi: get rid of pointless access_ok()
  pi433: sanitize ioctl
  cxlflash: get rid of pointless access_ok()
  mtdchar: get rid of pointless access_ok()
  r128: switch compat ioctls to drm_ioctl_kernel()
  selection: get rid of field-by-field copyin
  VT_RESIZEX: get rid of field-by-field copyin
  i2c compat ioctls: move to ->compat_ioctl()
  sched_rr_get_interval(): move compat to native, get rid of set_fs()
  mips: switch to {get,put}_compat_sigset()
  sparc: switch to {get,put}_compat_sigset()
  s390: switch to {get,put}_compat_sigset()
  ppc: switch to {get,put}_compat_sigset()
  parisc: switch to {get,put}_compat_sigset()
  get_compat_sigset()
  get rid of {get,put}_compat_itimerspec()
  io_getevents: Use timespec64 to represent timeouts
  ...
2017-11-17 11:54:55 -08:00
Linus Torvalds
a3841f94c7 libnvdimm for 4.15
* Introduce MAP_SYNC and MAP_SHARED_VALIDATE, a mechanism to enable
  'userspace flush' of persistent memory updates via filesystem-dax
   mappings. It arranges for any filesystem metadata updates that may be
   required to satisfy a write fault to also be flushed ("on disk") before
   the kernel returns to userspace from the fault handler. Effectively
   every write-fault that dirties metadata completes an fsync() before
   returning from the fault handler. The new MAP_SHARED_VALIDATE mapping
   type guarantees that the MAP_SYNC flag is validated as supported by the
   filesystem's ->mmap() file operation.
 
 * Add support for the standard ACPI 6.2 label access methods that
   replace the NVDIMM_FAMILY_INTEL (vendor specific) label methods. This
   enables interoperability with environments that only implement the
   standardized methods.
 
 * Add support for the ACPI 6.2 NVDIMM media error injection methods.
 
 * Add support for the NVDIMM_FAMILY_INTEL v1.6 DIMM commands for latch
   last shutdown status, firmware update, SMART error injection, and
   SMART alarm threshold control.
 
 * Cleanup physical address information disclosures to be root-only.
 
 * Fix revalidation of the DIMM "locked label area" status to support
   dynamic unlock of the label area.
 
 * Expand unit test infrastructure to mock the ACPI 6.2 Translate SPA
   (system-physical-address) command and error injection commands.
 
 Acknowledgements that came after the commits were pushed to -next:
 
 957ac8c421 dax: fix PMD faults on zero-length files
 Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
 
 a39e596baa xfs: support for synchronous DAX faults
 Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
 
 7b565c9f96 xfs: Implement xfs_filemap_pfn_mkwrite() using __xfs_filemap_fault()
 Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaDfvcAAoJEB7SkWpmfYgCk7sP/2qJhBH+VTTdg2osDnhAdAhI
 co/AGEmsHFlUCMBb/Ek7UnMAmhBYiJU2q4ywPsNFBpusXpMlqNy5Iwo7k4/wQHE/
 SJcIM0g4zg0ViFuUhwV+C2T0R5UzFR8JLd9EYWj/YS6aJpurtotm5l4UStaM0Hzo
 AhxSXJLrBDuqCpbOxbctfiGEmdRL7aRfBEAARTNRKBn/iXxJUcYHlp62rtXQS+t4
 I6LC/URCWTNTTMGmzW6TRsgSD9WMfd19xKcGzN3qL6ee0KFccxN4ctFqHA/sFGOh
 iYLeR0XJUjJxyp+PkWGteXPVZL0Kj3bD/lSTG+Co5bm/ra8a/sh3TSFfgFyoBZD1
 EqMN8Ryf80hGp3FabeH2Iw2SviYPZpHSWgjddjxLD0RA6OmpzINc+Wm8eqApjMME
 sbZDTOijiab4QMQ0XamF4GuDHyQtawv5Y/w2Ehhl1tmiqW+5tKhsKqxkQt+/V3Yt
 RTVSRe2Pkway66b+cD64IdQ6L2tyonPnmi5IzgkKOhlOEGomy+4/U2Jt2bMbhzq6
 ymszKmXp2XI8P06wU8sHrIUeXO5I9qoKn/fZA73Eb8aIzgJe3tBE/5+Ab7RG6HB9
 1OVfcMWoXU1gNgNktTs63X1Lsg4aW9kt/K4fPHHcqUcaliEJpJTlAbg9GLF2buoW
 nQ+0fTRgMRihE3ZA0Fs3
 =h2vZ
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm and dax updates from Dan Williams:
 "Save for a few late fixes, all of these commits have shipped in -next
  releases since before the merge window opened, and 0day has given a
  build success notification.

  The ext4 touches came from Jan, and the xfs touches have Darrick's
  reviewed-by. An xfstest for the MAP_SYNC feature has been through
  a few round of reviews and is on track to be merged.

   - Introduce MAP_SYNC and MAP_SHARED_VALIDATE, a mechanism to enable
     'userspace flush' of persistent memory updates via filesystem-dax
     mappings. It arranges for any filesystem metadata updates that may
     be required to satisfy a write fault to also be flushed ("on disk")
     before the kernel returns to userspace from the fault handler.
     Effectively every write-fault that dirties metadata completes an
     fsync() before returning from the fault handler. The new
     MAP_SHARED_VALIDATE mapping type guarantees that the MAP_SYNC flag
     is validated as supported by the filesystem's ->mmap() file
     operation.

   - Add support for the standard ACPI 6.2 label access methods that
     replace the NVDIMM_FAMILY_INTEL (vendor specific) label methods.
     This enables interoperability with environments that only implement
     the standardized methods.

   - Add support for the ACPI 6.2 NVDIMM media error injection methods.

   - Add support for the NVDIMM_FAMILY_INTEL v1.6 DIMM commands for
     latch last shutdown status, firmware update, SMART error injection,
     and SMART alarm threshold control.

   - Cleanup physical address information disclosures to be root-only.

   - Fix revalidation of the DIMM "locked label area" status to support
     dynamic unlock of the label area.

   - Expand unit test infrastructure to mock the ACPI 6.2 Translate SPA
     (system-physical-address) command and error injection commands.

  Acknowledgements that came after the commits were pushed to -next:

   - 957ac8c421 ("dax: fix PMD faults on zero-length files"):
       Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>

   - a39e596baa ("xfs: support for synchronous DAX faults") and
     7b565c9f96 ("xfs: Implement xfs_filemap_pfn_mkwrite() using __xfs_filemap_fault()")
        Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>"

* tag 'libnvdimm-for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (49 commits)
  acpi, nfit: add 'Enable Latch System Shutdown Status' command support
  dax: fix general protection fault in dax_alloc_inode
  dax: fix PMD faults on zero-length files
  dax: stop requiring a live device for dax_flush()
  brd: remove dax support
  dax: quiet bdev_dax_supported()
  fs, dax: unify IOMAP_F_DIRTY read vs write handling policy in the dax core
  tools/testing/nvdimm: unit test clear-error commands
  acpi, nfit: validate commands against the device type
  tools/testing/nvdimm: stricter bounds checking for error injection commands
  xfs: support for synchronous DAX faults
  xfs: Implement xfs_filemap_pfn_mkwrite() using __xfs_filemap_fault()
  ext4: Support for synchronous DAX faults
  ext4: Simplify error handling in ext4_dax_huge_fault()
  dax: Implement dax_finish_sync_fault()
  dax, iomap: Add support for synchronous faults
  mm: Define MAP_SYNC and VM_SYNC flags
  dax: Allow tuning whether dax_insert_mapping_entry() dirties entry
  dax: Allow dax_iomap_fault() to return pfn
  dax: Fix comment describing dax_iomap_fault()
  ...
2017-11-17 09:51:57 -08:00
Prarit Bhargava
b4c0a7326f x86/smpboot: Fix __max_logical_packages estimate
A system booted with a small number of cores enabled per package
panics because the estimate of __max_logical_packages is too low.

This occurs when the total number of active cores across all packages is
less than the maximum core count for a single package. e.g.:

  On a 4 package system with 20 cores/package where only 4 cores are
  enabled on each package, the value of __max_logical_packages is
  calculated as DIV_ROUND_UP(16 / 20) = 1 and not 4.

Calculate __max_logical_packages after the cpu enumeration has completed.
Use the boot cpu's data to extrapolate the number of packages.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20171114124257.22013-4-prarit@redhat.com
2017-11-17 16:22:31 +01:00
Andi Kleen
30bb981185 x86/topology: Avoid wasting 128k for package id array
Analyzing large early boot allocations unveiled the logical package id
storage as a prominent memory waste. Since commit 1f12e32f4c
("x86/topology: Create logical package id") every 64-bit system allocates a
128k array to convert logical package ids.

This happens because the array is sized for MAX_LOCAL_APIC which is always
32k on 64bit systems, and it needs 4 bytes for each entry.

This is fairly wasteful, especially for the common case of having only one
socket, which uses exactly 4 byte out of 128K.

There is no user of the package id map which is performance critical, so
the lookup is not required to be O(1). Store the logical processor id in
cpu_data and use a loop based lookup.

To keep the mapping stable accross cpu hotplug operations, add a flag to
cpu_data which is set when the CPU is brought up the first time. When the
flag is set, then cpu_data is not reinitialized by copying boot_cpu_data on
subsequent bringups.

[ tglx: Rename the flag to 'initialized', use proper pointers instead of
  	repeated cpu_data(x) evaluation and massage changelog. ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20171114124257.22013-3-prarit@redhat.com
2017-11-17 16:22:30 +01:00
Andi Kleen
d46b4c1ce5 perf/x86/intel/uncore: Cache logical pkg id in uncore driver
The SNB-EP uncore driver is the only user of topology_phys_to_logical_pkg
in a performance critical path.

Change it query the logical pkg ID only once at initialization time and
then cache it in box structure. This allows to change the logical package
management without affecting the performance critical path.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: He Chen <he.chen@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Piotr Luc <piotr.luc@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20171114124257.22013-2-prarit@redhat.com
2017-11-17 16:22:30 +01:00
Vikas C Sajjan
4ee2ec1b12 x86/acpi: Reduce code duplication in mp_override_legacy_irq()
The new function mp_register_ioapic_irq() is a subset of the code in
mp_override_legacy_irq().

Replace the code duplication by invoking mp_register_ioapic_irq() from
mp_override_legacy_irq().

Signed-off-by: Vikas C Sajjan <vikas.cha.sajjan@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: kkamagui@gmail.com
Cc: linux-acpi@vger.kernel.org
Link: https://lkml.kernel.org/r/1510848825-21965-3-git-send-email-vikas.cha.sajjan@hpe.com
2017-11-17 15:30:33 +01:00
Vikas C Sajjan
252714155f x86/acpi: Handle SCI interrupts above legacy space gracefully
Platforms which support only IOAPIC mode, pass the SCI information above
the legacy space (0-15) via the FADT mechanism and not via MADT.

In such cases mp_override_legacy_irq() which is invoked from
acpi_sci_ioapic_setup() to register SCI interrupts fails for interrupts
greater equal 16, since it is meant to handle only the legacy space and
emits error "Invalid bus_irq %u for legacy override".

Add a new function to handle SCI interrupts >= 16 and invoke it
conditionally in acpi_sci_ioapic_setup().

The code duplication due to this new function will be cleaned up in a
separate patch.

Co-developed-by: Sunil V L <sunil.vl@hpe.com>
Signed-off-by: Vikas C Sajjan <vikas.cha.sajjan@hpe.com>
Signed-off-by: Sunil V L <sunil.vl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Abdul Lateef Attar <abdul-lateef.attar@hpe.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: kkamagui@gmail.com
Cc: linux-acpi@vger.kernel.org
Link: https://lkml.kernel.org/r/1510848825-21965-2-git-send-email-vikas.cha.sajjan@hpe.com
2017-11-17 15:30:33 +01:00
Tom Lendacky
ac5292e9a2 x86/boot: Fix boot failure when SMP MP-table is based at 0
When crosvm is used to boot a kernel as a VM, the SMP MP-table is found
at physical address 0x0. This causes mpf_base to be set to 0 and a
subsequent "if (!mpf_base)" check in default_get_smp_config() results in
the MP-table not being parsed.  Further into the boot this results in an
oops when attempting a read_apic_id().

Add a boolean variable that is set to true when the MP-table is found.
Use this variable for testing if the MP-table was found so that even a
value of 0 for mpf_base will result in continued parsing of the MP-table.

Fixes: 5997efb967 ("x86/boot: Use memremap() to map the MPF and MPC data")
Reported-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: regression@leemhuis.info
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20171106201753.23059.86674.stgit@tlendack-t1.amdoffice.net
2017-11-17 15:30:33 +01:00
Babu Moger
7b8b098c47 arch: Fix duplicates in Kconfig for parisc and sparc
Fix duplicates for sparc and parisc. This was due these following commits.

1. commit 4c97a0c8fe ("arch: define CPU_BIG_ENDIAN for all fixed big
   endian archs")
2. commit 97d9f96916 ("arch/sparc: Define config parameter
   CPU_BIG_ENDIAN")
3. commit 74ad3d28af ("parisc: Define CONFIG_CPU_BIG_ENDIAN")

Remove duplicates.

Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2017-11-17 15:27:54 +01:00
Helge Deller
bc5a768e56 parisc: Make some PDC structures accessible in uapi headers
While working on a qemu and SeaBIOS-port to parisc, those PDC structures are
useful to have accessible from userspace.

Signed-off-by: Helge Deller <deller@gmx.de>
2017-11-17 15:27:42 +01:00
Luc Van Oostenryck
3744d988c0 parisc: Pass endianness info to sparse
parisc is big-endian only but sparse assumes the same endianness as the
building machine.
This is problematic for code which expect __BYTE_ORDER__ being correctly
predefined by the compiler which sparse can then pre-process differently
from what gcc would.

Fix this by letting sparse know about the architecture endianness.

To: James Bottomley <jejb@parisc-linux.org>
To: Helge Deller <deller@gmx.de>
CC: linux-parisc@vger.kernel.org
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2017-11-17 15:27:33 +01:00
Helge Deller
bf7b4c1b3c parisc: Add CPU topology support
Add topology support, including multi-core scheduler support on
PA8800/PA8900 CPUs and enhanced output in /proc/cpuinfo, e.g.
lscpu now reports on a single-socket, dual-core machine:

Architecture:          parisc64
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
CPU family:            PA-RISC 2.0
Model name:            PA8800 (Mako)

Signed-off-by: Helge Deller <deller@gmx.de>
2017-11-17 15:27:22 +01:00
John David Anglin
05f016d2ca parisc: Fix validity check of pointer size argument in new CAS implementation
As noted by Christoph Biedl, passing a pointer size of 4 in the new CAS
implementation causes a kernel crash.  The attached patch corrects the
off by one error in the argument validity check.

In reviewing the code, I noticed that we only perform word operations
with the pointer size argument.  The subi instruction intentionally uses
a word condition on 64-bit kernels.  Nullification was used instead of a
cmpib instruction as the branch should never be taken.  The shlw
pseudo-operation generates a depw,z instruction and it clears the target
before doing a shift left word deposit.  Thus, we don't need to clip the
upper 32 bits of this argument on 64-bit kernels.

Tested with a gcc testsuite run with a 64-bit kernel.  The gcc atomic
code in libgcc is the only direct user of the new CAS implementation
that I am aware of.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 3.13+
Signed-off-by: Helge Deller <deller@gmx.de>
2017-11-17 15:27:13 +01:00
Paolo Bonzini
c4ad77e0d4 KVM: vmx: use X86_CR4_UMIP and X86_FEATURE_UMIP
These bits were not defined until now in common code, but they are
now that the kernel supports UMIP too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-11-17 13:20:23 +01:00
Janakarajan Natarajan
50a671d4d1 KVM: x86: Fix CPUID function for word 6 (80000001_ECX)
The function for CPUID 80000001 ECX is set to 0xc0000001. Set it to
0x80000001.

Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Fixes: d6321d4933 ("KVM: x86: generalize guest_cpuid_has_ helpers")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:22 +01:00
Liran Alon
917dc6068b KVM: nVMX: Fix vmx_check_nested_events() return value in case an event was reinjected to L2
vmx_check_nested_events() should return -EBUSY only in case there is a
pending L1 event which requires a VMExit from L2 to L1 but such a
VMExit is currently blocked. Such VMExits are blocked either
because nested_run_pending=1 or an event was reinjected to L2.
vmx_check_nested_events() should return 0 in case there are no
pending L1 events which requires a VMExit from L2 to L1 or if
a VMExit from L2 to L1 was done internally.

However, upstream commit which introduced blocking in case an event was
reinjected to L2 (commit acc9ab6013 ("KVM: nVMX: Fix pending events
injection")) contains a bug: It returns -EBUSY even if there are no
pending L1 events which requires VMExit from L2 to L1.

This commit fix this issue.

Fixes: acc9ab6013 ("KVM: nVMX: Fix pending events injection")

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:21 +01:00
Nikita Leshenko
b200dded0a KVM: x86: ioapic: Preserve read-only values in the redirection table
According to 82093AA (IOAPIC) manual, Remote IRR and Delivery Status are
read-only. QEMU implements the bits as RO in commit 479c2a1cb7fb
("ioapic: keep RO bits for IOAPIC entry").

Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:21 +01:00
Nikita Leshenko
a8bfec2930 KVM: x86: ioapic: Clear Remote IRR when entry is switched to edge-triggered
Some OSes (Linux, Xen) use this behavior to clear the Remote IRR bit for
IOAPICs without an EOI register. They simulate the EOI message manually
by changing the trigger mode to edge and then back to level, with the
entry being masked during this.

QEMU implements this feature in commit ed1263c363c9
("ioapic: clear remote irr bit for edge-triggered interrupts")

As a side effect, this commit removes an incorrect behavior where Remote
IRR was cleared when the redirection table entry was rewritten. This is not
consistent with the manual and also opens an opportunity for a strange
behavior when a redirection table entry is modified from an interrupt
handler that handles the same entry: The modification will clear the
Remote IRR bit even though the interrupt handler is still running.

Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:20 +01:00
Nikita Leshenko
7d2253684d KVM: x86: ioapic: Remove redundant check for Remote IRR in ioapic_set_irq
Remote IRR for level-triggered interrupts was previously checked in
ioapic_set_irq, but since we now have a check in ioapic_service we
can remove the redundant check from ioapic_set_irq.

This commit doesn't change semantics.

Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:19 +01:00
Nikita Leshenko
da3fe7bdfa KVM: x86: ioapic: Don't fire level irq when Remote IRR set
Avoid firing a level-triggered interrupt that has the Remote IRR bit set,
because that means that some CPU is already processing it. The Remote
IRR bit will be cleared after an EOI and the interrupt will refire
if the irq line is still asserted.

This behavior is aligned with QEMU's IOAPIC implementation that was
introduced by commit f99b86b94987
("x86: ioapic: ignore level irq during processing") in QEMU.

Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:18 +01:00
Nikita Leshenko
0fc5a36dd6 KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race
KVM uses ioapic_handled_vectors to track vectors that need to notify the
IOAPIC on EOI. The problem is that IOAPIC can be reconfigured while an
interrupt with old configuration is pending or running and
ioapic_handled_vectors only remembers the newest configuration;
thus EOI from the old interrupt is not delievered to the IOAPIC.

A previous commit db2bdcbbbd
("KVM: x86: fix edge EOI and IOAPIC reconfig race")
addressed this issue by adding pending edge-triggered interrupts to
ioapic_handled_vectors, fixing this race for edge-triggered interrupts.
The commit explicitly ignored level-triggered interrupts,
but this race applies to them as well:

1) IOAPIC sends a level triggered interrupt vector to VCPU0
2) VCPU0's handler deasserts the irq line and reconfigures the IOAPIC
   to route the vector to VCPU1. The reconfiguration rewrites only the
   upper 32 bits of the IOREDTBLn register. (Causes KVM to update
   ioapic_handled_vectors for VCPU0 and it no longer includes the vector.)
3) VCPU0 sends EOI for the vector, but it's not delievered to the
   IOAPIC because the ioapic_handled_vectors doesn't include the vector.
4) New interrupts are not delievered to VCPU1 because remote_irr bit
   is set forever.

Therefore, the correct behavior is to add all pending and running
interrupts to ioapic_handled_vectors.

This commit introduces a slight performance hit similar to
commit db2bdcbbbd ("KVM: x86: fix edge EOI and IOAPIC reconfig race")
for the rare case that the vector is reused by a non-IOAPIC source on
VCPU0. We prefer to keep solution simple and not handle this case just
as the original commit does.

Fixes: db2bdcbbbd ("KVM: x86: fix edge EOI and IOAPIC reconfig race")

Signed-off-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:17 +01:00
Paolo Bonzini
6ea6e84309 KVM: x86: inject exceptions produced by x86_decode_insn
Sometimes, a processor might execute an instruction while another
processor is updating the page tables for that instruction's code page,
but before the TLB shootdown completes.  The interesting case happens
if the page is in the TLB.

In general, the processor will succeed in executing the instruction and
nothing bad happens.  However, what if the instruction is an MMIO access?
If *that* happens, KVM invokes the emulator, and the emulator gets the
updated page tables.  If the update side had marked the code page as non
present, the page table walk then will fail and so will x86_decode_insn.

Unfortunately, even though kvm_fetch_guest_virt is correctly returning
X86EMUL_PROPAGATE_FAULT, x86_decode_insn's caller treats the failure as
a fatal error if the instruction cannot simply be reexecuted (as is the
case for MMIO).  And this in fact happened sometimes when rebooting
Windows 2012r2 guests.  Just checking ctxt->have_exception and injecting
the exception if true is enough to fix the case.

Thanks to Eduardo Habkost for helping in the debugging of this issue.

Reported-by: Yanan Fu <yfu@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:16 +01:00
Eyal Moscovici
fab0aa3b77 KVM: x86: Allow suppressing prints on RDMSR/WRMSR of unhandled MSRs
Some guests use these unhandled MSRs very frequently.
This cause dmesg to be populated with lots of aggregated messages on
usage of ignored MSRs. As ignore_msrs=true means that the user is
well-aware his guest use ignored MSRs, allow to also disable the
prints on their usage.

An example of such guest is ESXi which tends to access a lot to MSR
0x34 (MSR_SMI_COUNT) very frequently.

In addition, we have observed this to cause unnecessary delays to
guest execution. Such an example is ESXi which experience networking
delays in it's guests (L2 guests) because of these prints (even when
prints are rate-limited). This can easily be reproduced by pinging
from one L2 guest to another.  Once in a while, a peak in ping RTT
will be observed. Removing these unhandled MSR prints solves the
issue.

Because these prints can help diagnose issues with guests,
this commit only suppress them by a module parameter instead of
removing them from code entirely.

Signed-off-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[Changed suppress_ignore_msrs_prints to report_ignored_msrs - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:16 +01:00
David Hildenbrand
4d772cb85f KVM: x86: fix em_fxstor() sleeping while in atomic
Commit 9d643f6312 ("KVM: x86: avoid large stack allocations in
em_fxrstor") optimize the stack size, but introduced a guest memory access
which might sleep while in atomic.

Fix it by introducing, again, a second fxregs_state. Try to avoid
large stacks by using noinline. Add some helpful comments.

Reported by syzbot:

in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: syzkaller879109
2 locks held by syzkaller879109/2909:
  #0:  (&vcpu->mutex){+.+.}, at: [<ffffffff8106222c>] vcpu_load+0x1c/0x70
arch/x86/kvm/../../../virt/kvm/kvm_main.c:154
  #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_enter_guest
arch/x86/kvm/x86.c:6983 [inline]
  #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_run
arch/x86/kvm/x86.c:7061 [inline]
  #1:  (&kvm->srcu){....}, at: [<ffffffff810dd162>]
kvm_arch_vcpu_ioctl_run+0x1bc2/0x58b0 arch/x86/kvm/x86.c:7222
CPU: 1 PID: 2909 Comm: syzkaller879109 Not tainted 4.13.0-rc4-next-20170811
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:16 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:52
  ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6014
  __might_sleep+0x95/0x190 kernel/sched/core.c:5967
  __might_fault+0xab/0x1d0 mm/memory.c:4383
  __copy_from_user include/linux/uaccess.h:71 [inline]
  __kvm_read_guest_page+0x58/0xa0
arch/x86/kvm/../../../virt/kvm/kvm_main.c:1771
  kvm_vcpu_read_guest_page+0x44/0x60
arch/x86/kvm/../../../virt/kvm/kvm_main.c:1791
  kvm_read_guest_virt_helper+0x76/0x140 arch/x86/kvm/x86.c:4407
  kvm_read_guest_virt_system+0x3c/0x50 arch/x86/kvm/x86.c:4466
  segmented_read_std+0x10c/0x180 arch/x86/kvm/emulate.c:819
  em_fxrstor+0x27b/0x410 arch/x86/kvm/emulate.c:4022
  x86_emulate_insn+0x55d/0x3c50 arch/x86/kvm/emulate.c:5471
  x86_emulate_instruction+0x411/0x1ca0 arch/x86/kvm/x86.c:5698
  kvm_mmu_page_fault+0x18b/0x2c0 arch/x86/kvm/mmu.c:4854
  handle_ept_violation+0x1fc/0x5e0 arch/x86/kvm/vmx.c:6400
  vmx_handle_exit+0x281/0x1ab0 arch/x86/kvm/vmx.c:8718
  vcpu_enter_guest arch/x86/kvm/x86.c:6999 [inline]
  vcpu_run arch/x86/kvm/x86.c:7061 [inline]
  kvm_arch_vcpu_ioctl_run+0x1cee/0x58b0 arch/x86/kvm/x86.c:7222
  kvm_vcpu_ioctl+0x64c/0x1010 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2591
  vfs_ioctl fs/ioctl.c:45 [inline]
  do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
  SYSC_ioctl fs/ioctl.c:700 [inline]
  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
  entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x437fc9
RSP: 002b:00007ffc7b4d5ab8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000437fc9
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000020ae8000
R10: 0000000000009120 R11: 0000000000000206 R12: 0000000000000000
R13: 0000000000000004 R14: 0000000000000004 R15: 0000000020077000

Fixes: 9d643f6312 ("KVM: x86: avoid large stack allocations in em_fxrstor")
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:15 +01:00
Wanpeng Li
5af4157388 KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure
Commit 4f350c6dbc (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure
properly) can result in L1(run kvm-unit-tests/run_tests.sh vmx_controls in L1)
null pointer deference and also L0 calltrace when EPT=0 on both L0 and L1.

In L1:

BUG: unable to handle kernel paging request at ffffffffc015bf8f
 IP: vmx_vcpu_run+0x202/0x510 [kvm_intel]
 PGD 146e13067 P4D 146e13067 PUD 146e15067 PMD 3d2686067 PTE 3d4af9161
 Oops: 0003 [#1] PREEMPT SMP
 CPU: 2 PID: 1798 Comm: qemu-system-x86 Not tainted 4.14.0-rc4+ #6
 RIP: 0010:vmx_vcpu_run+0x202/0x510 [kvm_intel]
 Call Trace:
 WARNING: kernel stack frame pointer at ffffb86f4988bc18 in qemu-system-x86:1798 has bad value 0000000000000002

In L0:

-----------[ cut here ]------------
 WARNING: CPU: 6 PID: 4460 at /home/kernel/linux/arch/x86/kvm//vmx.c:9845 vmx_inject_page_fault_nested+0x130/0x140 [kvm_intel]
 CPU: 6 PID: 4460 Comm: qemu-system-x86 Tainted: G           OE   4.14.0-rc7+ #25
 RIP: 0010:vmx_inject_page_fault_nested+0x130/0x140 [kvm_intel]
 Call Trace:
  paging64_page_fault+0x500/0xde0 [kvm]
  ? paging32_gva_to_gpa_nested+0x120/0x120 [kvm]
  ? nonpaging_page_fault+0x3b0/0x3b0 [kvm]
  ? __asan_storeN+0x12/0x20
  ? paging64_gva_to_gpa+0xb0/0x120 [kvm]
  ? paging64_walk_addr_generic+0x11a0/0x11a0 [kvm]
  ? lock_acquire+0x2c0/0x2c0
  ? vmx_read_guest_seg_ar+0x97/0x100 [kvm_intel]
  ? vmx_get_segment+0x2a6/0x310 [kvm_intel]
  ? sched_clock+0x1f/0x30
  ? check_chain_key+0x137/0x1e0
  ? __lock_acquire+0x83c/0x2420
  ? kvm_multiple_exception+0xf2/0x220 [kvm]
  ? debug_check_no_locks_freed+0x240/0x240
  ? debug_smp_processor_id+0x17/0x20
  ? __lock_is_held+0x9e/0x100
  kvm_mmu_page_fault+0x90/0x180 [kvm]
  kvm_handle_page_fault+0x15c/0x310 [kvm]
  ? __lock_is_held+0x9e/0x100
  handle_exception+0x3c7/0x4d0 [kvm_intel]
  vmx_handle_exit+0x103/0x1010 [kvm_intel]
  ? kvm_arch_vcpu_ioctl_run+0x1628/0x2e20 [kvm]

The commit avoids to load host state of vmcs12 as vmcs01's guest state
since vmcs12 is not modified (except for the VM-instruction error field)
if the checking of vmcs control area fails. However, the mmu context is
switched to nested mmu in prepare_vmcs02() and it will not be reloaded
since load_vmcs12_host_state() is skipped when nested VMLAUNCH/VMRESUME
fails. This patch fixes it by reloading mmu context when nested
VMLAUNCH/VMRESUME fails.

Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:14 +01:00
Wanpeng Li
f1b026a331 KVM: nVMX: Validate the IA32_BNDCFGS on nested VM-entry
According to the SDM, if the "load IA32_BNDCFGS" VM-entry controls is 1, the
following checks are performed on the field for the IA32_BNDCFGS MSR:
 - Bits reserved in the IA32_BNDCFGS MSR must be 0.
 - The linear address in bits 63:12 must be canonical.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:13 +01:00
Wanpeng Li
3853be2603 KVM: X86: Fix operand/address-size during instruction decoding
Pedro reported:
  During tests that we conducted on KVM, we noticed that executing a "PUSH %ES"
  instruction under KVM produces different results on both memory and the SP
  register depending on whether EPT support is enabled. With EPT the SP is
  reduced by 4 bytes (and the written value is 0-padded) but without EPT support
  it is only reduced by 2 bytes. The difference can be observed when the CS.DB
  field is 1 (32-bit) but not when it's 0 (16-bit).

The internal segment descriptor cache exist even in real/vm8096 mode. The CS.D
also should be respected instead of just default operand/address-size/66H
prefix/67H prefix during instruction decoding. This patch fixes it by also
adjusting operand/address-size according to CS.D.

Reported-by: Pedro Fonseca <pfonseca@cs.washington.edu>
Tested-by: Pedro Fonseca <pfonseca@cs.washington.edu>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Pedro Fonseca <pfonseca@cs.washington.edu>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:12 +01:00
Liran Alon
9b8ae63798 KVM: x86: Don't re-execute instruction when not passing CR2 value
In case of instruction-decode failure or emulation failure,
x86_emulate_instruction() will call reexecute_instruction() which will
attempt to use the cr2 value passed to x86_emulate_instruction().
However, when x86_emulate_instruction() is called from
emulate_instruction(), cr2 is not passed (passed as 0) and therefore
it doesn't make sense to execute reexecute_instruction() logic at all.

Fixes: 51d8b66199 ("KVM: cleanup emulate_instruction")

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:12 +01:00
Liran Alon
1f4dcb3b21 KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
On this case, handle_emulation_failure() fills kvm_run with
internal-error information which it expects to be delivered
to user-mode for further processing.
However, the code reports a wrong return-value which makes KVM to never
return to user-mode on this scenario.

Fixes: 6d77dbfc88 ("KVM: inject #UD if instruction emulation fails and exit to
userspace")

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:11 +01:00
Liran Alon
61cb57c9ed KVM: x86: Exit to user-mode on #UD intercept when emulator requires
Instruction emulation after trapping a #UD exception can result in an
MMIO access, for example when emulating a MOVBE on a processor that
doesn't support the instruction.  In this case, the #UD vmexit handler
must exit to user mode, but there wasn't any code to do so.  Add it for
both VMX and SVM.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:10 +01:00
Liran Alon
ac9b305caa KVM: nVMX/nSVM: Don't intercept #UD when running L2
When running L2, #UD should be intercepted by L1 or just forwarded
directly to L2. It should not reach L0 x86 emulator.
Therefore, set intercept for #UD only based on L1 exception-bitmap.

Also add WARN_ON_ONCE() on L0 #UD intercept handlers to make sure
it is never reached while running L2.

This improves commit ae1f576707 ("KVM: nVMX: Do not emulate #UD while
in guest mode") by removing an unnecessary exit from L2 to L0 on #UD
when L1 doesn't intercept it.

In addition, SVM L0 #UD intercept handler doesn't handle correctly the
case it is raised from L2. In this case, it should forward the #UD to
guest instead of x86 emulator. As done in VMX #UD intercept handler.
This commit fixes this issue as-well.

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:09 +01:00
Liran Alon
51c4b8bba6 KVM: x86: pvclock: Handle first-time write to pvclock-page contains random junk
When guest passes KVM it's pvclock-page GPA via WRMSR to
MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize
pvclock-page to some start-values. It just requests a clock-update which
will happen before entering to guest.

The clock-update logic will call kvm_setup_pvclock_page() to update the
pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly*
assumes that the version-field is initialized to an even number. This is
wrong because at first-time write, field could be any-value.

Fix simply makes sure that if first-time version-field is odd, increment
it once more to make it even and only then start standard logic.
This follows same logic as done in other pvclock shared-pages (See
kvm_write_wall_clock() and record_steal_time()).

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:08 +01:00
Paolo Bonzini
d02fcf5077 kvm: vmx: Allow disabling virtual NMI support
To simplify testing of these rarely used code paths, add a module parameter
that turns it on.  One eventinj.flat test (NMI after iret) fails when
loading kvm_intel with vnmi=0.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:07 +01:00
Paolo Bonzini
8a1b43922d kvm: vmx: Reinstate support for CPUs without virtual NMI
This is more or less a revert of commit 2c82878b0c ("KVM: VMX: require
virtual NMI support", 2017-03-27); it turns out that Core 2 Duo machines
only had virtual NMIs in some SKUs.

The revert is not trivial because in the meanwhile there have been several
fixes to nested NMI injection.  Therefore, the entire vNMI state is moved
to struct loaded_vmcs.

Another change compared to before the patch is a simplification here:

       if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked &&
           !(is_guest_mode(vcpu) && nested_cpu_has_virtual_nmis(
                                       get_vmcs12(vcpu))))) {

The final condition here is always true (because nested_cpu_has_virtual_nmis
is always false) and is removed.

Fixes: 2c82878b0c
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1490803
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:07 +01:00
Paolo Bonzini
15038e1472 KVM: SVM: obey guest PAT
For many years some users of assigned devices have reported worse
performance on AMD processors with NPT than on AMD without NPT,
Intel or bare metal.

The reason turned out to be that SVM is discarding the guest PAT
setting and uses the default (PA0=PA4=WB, PA1=PA5=WT, PA2=PA6=UC-,
PA3=UC).  The guest might be using a different setting, and
especially might want write combining but isn't getting it
(instead getting slow UC or UC- accesses).

Thanks a lot to geoff@hostfission.com for noticing the relation
to the g_pat setting.  The patch has been tested also by a bunch
of people on VFIO users forums.

Fixes: 709ddebf81
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196409
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2017-11-17 13:20:06 +01:00
Paolo Bonzini
fc3790fa07 GICv4 Support for KVM/ARM for v4.15
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJaBYxhAAoJEEtpOizt6ddyOc4H/1qADSdnZFVVE5v15Y+E8HLv
 EOXAo/yYJg26fY/TBIXo7gxSZFCd0Ah703aucPGTRFyOb8t0VqIvI07rS1u4sKPp
 mxfidYIZwLMibgno8NBdWB2mFeXrNlWTmwNt/IoO0iMn7IGqQZ/FZdf3GmWEVEsG
 CU/DrQRXArJqS77NuZtkhhZOKBxB0lQNv52DkVgy/QlcBagAI14hbezkLQAco4oT
 NUC4GyXn9yHzpTfhuQXv5hLd4xCqg9e51OgYNSL9oC/JXSByd7edQuqpd4fmnG4Y
 qoDPJ11wmkuUKEDaGbC7nZWIaiVc/TfJy2Hwj3bUVwQFbopCeYhQqCDUSKftncA=
 =o4u7
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-gicv4-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

GICv4 Support for KVM/ARM for v4.15
2017-11-17 13:20:01 +01:00
Linus Torvalds
cf9b0772f2 ARM: SoC driver updates for v4.15
This branch contains platform-related driver updates for ARM and ARM64,
 these are the areas that bring the changes:
 
 New drivers:
  - Driver support for Renesas R-Car V3M (R8A77970)
  - Power management support for Amlogic GX
  - A new driver for the Tegra BPMP thermal sensor
  - A new bus driver for Technologic Systems NBUS
 
 Changes for subsystems that prefer to merge through arm-soc:
  - The usual updates for reset controller drivers from Philipp Zabel,
    with five added drivers for SoCs in the arc, meson, socfpa, uniphier
    and mediatek families.
  - Updates to the ARM SCPI and PSCI frameworks, from Sudeep Holla,
    Heiner Kallweit and Lorenzo Pieralisi.
 
 Changes specific to some ARM-based SoC
  - The Freescale/NXP DPAA QBMan drivers from PowerPC can now work
    on ARM as well.
  - Several changes for power management on Broadcom SoCs
  - Various improvements on Qualcomm, Broadcom, Amlogic, Atmel, Mediatek
  - Minor Cleanups for Samsung, TI OMAP SoCs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDggbAAoJEGCrR//JCVInIeQQAN1MDyO1UaWiFYnbkVOgzFcj
 dqbFOc41DBE/90JoBWE8kR/rjyF83OqztiaYpx9viu2qMMBZVcOwxhCUthWK59c/
 IujYdw4zGevLscF+jdrLbXgk97nfaWebsHyTAF307WAdZVJxiVGGzQEcgm71d6Zp
 CXjLiUii4winHUMK9FLRY2st0HKAevXhuvZJVV432+sTg3p7fGVilYeGOL5G62WO
 zQfCisqzC5q677kGGyUlPRGlHWMPkllsTTnfXcmV/FUiGyVa3lUWY5sEu+wCl96O
 U1ffPENeNj/A/4fa1dbErtbiNnC2z/+jf+Dg7Cn8w/dPk4Suf0ppjP8RqIGyxmDl
 Wm/UxbwDClxaeF4GSaYh2yKgGRJMH5N87bJnZRINE5ccGiol8Ww/34bFG0xNnfyh
 jSAFAc318AFG62WD4lvqWc7LSpzOYxp/MNqIFXKN692St/MJLkx8/q0nTwY1qPY0
 3SELz9II3hz+3MfDRqtRi7hZpkgHgQ+UG7S5+Xhmqrl309GOEldCjPVJhhXxWoxK
 ZPtZOuyYvGhIC+YAnHaN6lUjADIdNJZHwbuXFImx85oKHVofoxHbcni5vk8Uu7z1
 sQNYOtdDGaPG/2u9RJdJlPg/jIgLKxxt/Xm9TYVawpZ5hFANhBTtIq5ExCRAil68
 j9sMOrpZ1DzCQyR7zN2v
 =qDhq
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Arnd Bergmann:
 "This branch contains platform-related driver updates for ARM and
  ARM64, these are the areas that bring the changes:

  New drivers:

   - driver support for Renesas R-Car V3M (R8A77970)

   - power management support for Amlogic GX

   - a new driver for the Tegra BPMP thermal sensor

   - a new bus driver for Technologic Systems NBUS

  Changes for subsystems that prefer to merge through arm-soc:

   - the usual updates for reset controller drivers from Philipp Zabel,
     with five added drivers for SoCs in the arc, meson, socfpa,
     uniphier and mediatek families

   - updates to the ARM SCPI and PSCI frameworks, from Sudeep Holla,
     Heiner Kallweit and Lorenzo Pieralisi

  Changes specific to some ARM-based SoC

   - the Freescale/NXP DPAA QBMan drivers from PowerPC can now work on
     ARM as well

   - several changes for power management on Broadcom SoCs

   - various improvements on Qualcomm, Broadcom, Amlogic, Atmel,
     Mediatek

   - minor Cleanups for Samsung, TI OMAP SoCs"

[ NOTE! This doesn't work without the previous ARM SoC device-tree pull,
  because the R8A77970 driver is missing a header file that came from
  that pull.

  The fact that this got merged afterwards only fixes it at this point,
  and bisection of that driver will fail if/when you walk into the
  history of that driver.           - Linus ]

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (96 commits)
  soc: amlogic: meson-gx-pwrc-vpu: fix power-off when powered by bootloader
  bus: add driver for the Technologic Systems NBUS
  memory: omap-gpmc: Remove deprecated gpmc_update_nand_reg()
  soc: qcom: remove unused label
  soc: amlogic: gx pm domain: add PM and OF dependencies
  drivers/firmware: psci_checker: Add missing destroy_timer_on_stack()
  dt-bindings: power: add amlogic meson power domain bindings
  soc: amlogic: add Meson GX VPU Domains driver
  soc: qcom: Remote filesystem memory driver
  dt-binding: soc: qcom: Add binding for rmtfs memory
  of: reserved_mem: Accessor for acquiring reserved_mem
  of/platform: Generalize /reserved-memory handling
  soc: mediatek: pwrap: fix fatal compiler error
  soc: mediatek: pwrap: fix compiler errors
  arm64: mediatek: cleanup message for platform selection
  soc: Allow test-building of MediaTek drivers
  soc: mediatek: place Kconfig for all SoC drivers under menu
  soc: mediatek: pwrap: add support for MT7622 SoC
  soc: mediatek: pwrap: add common way for setup CS timing extenstion
  soc: mediatek: pwrap: add MediaTek MT6380 as one slave of pwrap
  ..
2017-11-16 16:05:01 -08:00
Linus Torvalds
527d147074 ARM: Device-tree updates for 4.15
We add device tree files for a couple of additional SoCs in various areas:
 
 Allwinner R40/V40 for entertainment, Broadcom Hurricane 2 for networking,
 Amlogic A113D for audio, and Renesas R-Car V3M for automotive.
 
 As usual, lots of new boards get added based on those and other SoCs:
 
  - Actions S500 based CubieBoard6 single-board computer
 
  - Amlogic Meson-AXG A113D based development board
  - Amlogic S912 based Khadas VIM2 single-board computer
  - Amlogic S912 based Tronsmart Vega S96 set-top-box
 
  - Allwinner H5 based NanoPi NEO Plus2 single-board computer
  - Allwinner R40 based Banana Pi M2 Ultra and Berry single-board computers
  - Allwinner A83T based TBS A711 Tablet
 
  - Broadcom Hurricane 2 based Ubiquiti UniFi Switch 8
  - Broadcom bcm47xx based Luxul XAP-1440/XAP-810/ABR-4500/XBR-4500
      wireless access points and routers
 
  - NXP i.MX51 based Zodiac Inflight Innovations RDU1 board
  - NXP i.MX53 based GE Healthcare PPD biometric monitor
  - NXP i.MX6 based Pistachio single-board computer
  - NXP i.MX6 based Vining-2000 automotive diagnostic interface
  - NXP i.MX6 based Ka-Ro TX6 Computer-on-Module in additional variants
 
  - Qualcomm MSM8974 (Snapdragon 800) based Fairphone 2 phone
  - Qualcomm MSM8974pro (Snapdragon 801) based Sony Xperia Z2 Tablet
 
  - Realtek RTD1295 based set-top-boxes MeLE V9 and PROBOX2 AVA
 
  - Renesas R-Car V3M (R8A77970) SoC and "Eagle" reference board
  - Renesas H3ULCB and M3ULCB "Kingfisher" extension infotainment boards
  - Renasas r8a7745 based iWave G22D-SODIMM SoM
 
  - Rockchip rk3288 based Amarula Vyasa single-board computer
 
  - Samsung Exynos5800 based Odroid HC1 single-board computer
 
 For existing SoC support, there was a lot of ongoing work, as usual
 most of that concentrated on the Renesas, Rockchip, OMAP, i.MX, Amlogic
 and Allwinner platforms, but others were also active.
 
 Rob Herring and many others worked on reducing the number of issues that
 the latest version of 'dtc' now warns about. Unfortunately there is still
 a lot left to do.
 
 A rework of the ARM foundation model introduced several new files
 for common variations of the model.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDhcfAAoJEGCrR//JCVIngu0QAI2ntVotaOAOaCurNCnoVwI1
 j+eKwHGTawQRcSHWN8C+p4FzzaOmw+vvbOyewky8PWaDOCkK6yWEHRf3hb2la2jw
 j9prht28R1RAHIRPuah4SxKHYoT4VW9q/2hMHJ2BiNDOMX54xE7j2cUvWSsIRz5o
 id2QqKsp2OIDNQAXAA4N25FjdBCYvSik80panSdJITtJODIj6UfmcXSgqkoQ3TTV
 rwVyFtryl9Si3eyZYcfB2/0ILKuaMC8gl7IX9z+PkRqu9XN7i6bZKZlMMtpJqX3u
 Ad89kLkFqNhiwZ77bIoRRl+0NEoSu5hTPLHRqghS6gPfDY2JT6igf0rGC8twjfea
 fzGOBWr6NlIlUmR4smS0GyE/3YsfOQvYWjE+zx5qkmay30TORVTZBzsBR+kQJzKK
 tnbO1zvst1ECtk9e8np0di4NAo9rwM37dxpu4aspP1Umxw1K68VSNE3RhGl8UUwW
 oNvHa8hD8Ck0QDBNltrkmKBVoIYKRU3XhXrRXVjRQdu6Xitml0XYBi80V0h33EE3
 162UXDEMu1/aqRRZUtKw7+yozT8fqOHjH8Zrv2zCVGg0HEwVohcWv/BPXbrg0abJ
 wXYS8VocZJP6Nb4FQMe+cRbBUHoBgBQqbsF60tWiYsjv0zoc5hogLWcZYqzDcIO6
 06OBR3HgUW27urUn/JBu
 =TnSo
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM device-tree updates from Arnd Bergmann:
 "We add device tree files for a couple of additional SoCs in various
  areas:

  Allwinner R40/V40 for entertainment, Broadcom Hurricane 2 for
  networking, Amlogic A113D for audio, and Renesas R-Car V3M for
  automotive.

  As usual, lots of new boards get added based on those and other SoCs:

   - Actions S500 based CubieBoard6 single-board computer

   - Amlogic Meson-AXG A113D based development board
   - Amlogic S912 based Khadas VIM2 single-board computer
   - Amlogic S912 based Tronsmart Vega S96 set-top-box

   - Allwinner H5 based NanoPi NEO Plus2 single-board computer
   - Allwinner R40 based Banana Pi M2 Ultra and Berry single-board computers
   - Allwinner A83T based TBS A711 Tablet

   - Broadcom Hurricane 2 based Ubiquiti UniFi Switch 8
   - Broadcom bcm47xx based Luxul XAP-1440/XAP-810/ABR-4500/XBR-4500
     wireless access points and routers

   - NXP i.MX51 based Zodiac Inflight Innovations RDU1 board
   - NXP i.MX53 based GE Healthcare PPD biometric monitor
   - NXP i.MX6 based Pistachio single-board computer
   - NXP i.MX6 based Vining-2000 automotive diagnostic interface
   - NXP i.MX6 based Ka-Ro TX6 Computer-on-Module in additional variants

   - Qualcomm MSM8974 (Snapdragon 800) based Fairphone 2 phone
   - Qualcomm MSM8974pro (Snapdragon 801) based Sony Xperia Z2 Tablet

   - Realtek RTD1295 based set-top-boxes MeLE V9 and PROBOX2 AVA

   - Renesas R-Car V3M (R8A77970) SoC and "Eagle" reference board
   - Renesas H3ULCB and M3ULCB "Kingfisher" extension infotainment boards
   - Renasas r8a7745 based iWave G22D-SODIMM SoM

   - Rockchip rk3288 based Amarula Vyasa single-board computer

   - Samsung Exynos5800 based Odroid HC1 single-board computer

  For existing SoC support, there was a lot of ongoing work, as usual
  most of that concentrated on the Renesas, Rockchip, OMAP, i.MX,
  Amlogic and Allwinner platforms, but others were also active.

  Rob Herring and many others worked on reducing the number of issues
  that the latest version of 'dtc' now warns about. Unfortunately there
  is still a lot left to do.

  A rework of the ARM foundation model introduced several new files for
  common variations of the model"

* tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (599 commits)
  arm64: dts: uniphier: route on-board device IRQ to GPIO controller for PXs3
  dt-bindings: bus: Add documentation for the Technologic Systems NBUS
  arm64: dts: actions: s900-bubblegum-96: Add fake uart5 clock
  ARM: dts: owl-s500: Add CubieBoard6
  dt-bindings: arm: actions: Add CubieBoard6
  ARM: dts: owl-s500-guitar-bb-rev-b: Add fake uart3 clock
  ARM: dts: owl-s500: Set power domains for CPU2 and CPU3
  arm: dts: mt7623: remove unused compatible string for pio node
  arm: dts: mt7623: update usb related nodes
  arm: dts: mt7623: update crypto node
  ARM: dts: sun8i: a711: Enable USB OTG
  ARM: dts: sun8i: a711: Add regulator support
  ARM: dts: sun8i: a83t: bananapi-m3: Enable AP6212 WiFi on mmc1
  ARM: dts: sun8i: a83t: cubietruck-plus: Enable AP6330 WiFi on mmc1
  ARM: dts: sun8i: a83t: Move mmc1 pinctrl setting to dtsi file
  ARM: dts: sun8i: a83t: allwinner-h8homlet-v2: Add AXP818 regulator nodes
  ARM: dts: sun8i: a83t: bananapi-m3: Add AXP813 regulator nodes
  ARM: dts: sun8i: a83t: cubietruck-plus: Add AXP818 regulator nodes
  ARM: dts: sunxi: Add dtsi for AXP81x PMIC
  arm64: dts: allwinner: H5: Restore EMAC changes
  ...
2017-11-16 15:48:26 -08:00
Linus Torvalds
8c60969856 ARM: SoC platform updates for 4.15
Most of the commits are for defconfig changes, to enable newly added
 drivers or features that people have started using. For the changed
 lines lines, we have mostly cleanups, the affected platforms are
 OMAP, Versatile, EP93xx, Samsung, Broadcom, i.MX, and Actions.
 
 The largest single change is the introduction of the TI "sysc" bus
 driver, with the intention of cleaning up more legacy code.
 
 Two new SoC platforms get added this time:
 - Allwinner R40 is a modernized version of the A20 chip, now
   with a Quad-Core ARM Cortex-A7. According to the manufacturer,
   it is intended for "Smart Hardware"
 - Broadcom Hurricane 2 (Aka Strataconnect BCM5334X) is a family
   of chips meant for managed gigabit ethernet switches, based
   around a Cortex-A9 CPU.
 
 Finally, we gain SMP support for two platforms: Renesas R-Car E2
 and Amlogic Meson8/8b, which were previously added but only supported
 uniprocessor operation.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDgf/AAoJEGCrR//JCVIntcMQAKI2q0Dr2giWtKSoH9GDh5co
 137MamTj1YExIcmtbDVO22jV4WSKhIduo+rRBYmQ/uvrkUe9tf7I172JeAIzMzGf
 HGYJ6fxpaEMUAbUlNcjuXJc7jQXNKLBK2X9CMuwXX3X3HddxKkL38D1d/Mxv5RGu
 G1pEe0j734Qio9LpACnb0xnluwyUBJOYNwo7Agj5RWzOrXZ+TdwkiIW0JdQiG7Z5
 wabzDa7OW1maB+hVYMAM3wHcqO7DKEvGvjYLRoT12cnOLXq7BNbXqXFufuMUNmNE
 ABhWA1h9SYrXT3n5pQLwoonvvTsI7KXCefrZ0wuxbjrdD4yGW1gmgpRee9RfoggD
 A6/62wpmSS61X5QWC6BLEa5v/o5NKewndyWhnjLllgJX8sRUbnPQa/xKv7ngdlN5
 7YL5HWoNpMQv7fEweSc6j5l/F3yRBndn9TpeKiqCiUiNDrIGlZYhYKIcr9rGESFk
 pu2KgK+e9+1k7F4s7LotsA65Q5bZIMveyyVtx0XHXz1G4O8NksoQCLJ3wcqQ2pzI
 WpyOO5R1CNltPhKGC7EP3OZcIMlCtCnsNcedb/AGHgPS+ert2UxBnlSeSMBQlLZY
 4fDwEAlA1qx9PuG9N3xrK/gAFiFLafK2sNxtVc7NSmXkkdm3xgJ95Y9sa72Y2qNO
 rU2LL8SM7cOwhXHrlEFB
 =jlJ2
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC platform updates from Arnd Bergmann:
 "Most of the commits are for defconfig changes, to enable newly added
  drivers or features that people have started using. For the changed
  lines lines, we have mostly cleanups, the affected platforms are OMAP,
  Versatile, EP93xx, Samsung, Broadcom, i.MX, and Actions.

  The largest single change is the introduction of the TI "sysc" bus
  driver, with the intention of cleaning up more legacy code.

  Two new SoC platforms get added this time:

   - Allwinner R40 is a modernized version of the A20 chip, now with a
     Quad-Core ARM Cortex-A7. According to the manufacturer, it is
     intended for "Smart Hardware"

   - Broadcom Hurricane 2 (Aka Strataconnect BCM5334X) is a family of
     chips meant for managed gigabit ethernet switches, based around a
     Cortex-A9 CPU.

  Finally, we gain SMP support for two platforms: Renesas R-Car E2 and
  Amlogic Meson8/8b, which were previously added but only supported
  uniprocessor operation"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (118 commits)
  ARM: multi_v7_defconfig: Select RPMSG_VIRTIO as module
  ARM: multi_v7_defconfig: enable CONFIG_GPIO_UNIPHIER
  arm64: defconfig: enable CONFIG_GPIO_UNIPHIER
  ARM: meson: enable MESON_IRQ_GPIO in Kconfig for meson8b
  ARM: meson: Add SMP bringup code for Meson8 and Meson8b
  ARM: smp_scu: allow the platform code to read the SCU CPU status
  ARM: smp_scu: add a helper for powering on a specific CPU
  dt-bindings: Amlogic: Add Meson8 and Meson8b SMP related documentation
  ARM: OMAP3: Delete an unnecessary variable initialisation in omap3xxx_hwmod_init()
  ARM: OMAP3: Use common error handling code in omap3xxx_hwmod_init()
  ARM: defconfig: select the right SX150X driver
  arm64: defconfig: Enable QCOM_IOMMU
  arm64: Add ThunderX drivers to defconfig
  arm64: defconfig: Enable Tegra PCI controller
  cpufreq: imx6q: Move speed grading check to cpufreq driver
  arm64: defconfig: re-enable Qualcomm DB410c USB
  ARM: configs: stm32: Add MDMA support in STM32 defconfig
  ARM: imx: Enable cpuidle for i.MX6DL starting at 1.1
  bus: ti-sysc: Fix unbalanced pm_runtime_enable by adding remove
  bus: ti-sysc: mark PM functions as __maybe_unused
  ...
2017-11-16 14:05:12 -08:00
Linus Torvalds
051089a2ee xen: features and fixes for v4.15-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJaDdh4AAoJELDendYovxMvPFAH/2QjTys2ydIAdmwke4odpJ7U
 xuy7HOQCzOeZ5YsZthzCBsN90VmnDM7X7CcB8weSdjcKlXMSAWD+J1RgkL2iAJhI
 8tzIEXECrlNuz4V5mX9TmMgtPCr4qzU3fsts0pZy4fYDq1PVWDefqOwEtbpbWabb
 wRSMq/nTb9iASTMgheSC0WfhJneqtJ+J20zrzkGPCBPRFcwfppeP8/7vpkmJslBi
 eH/pfchICM4w093T/BfavnsPvhLdjgRuwVzn6+e46s4tLnZAxnLRVQ7SXZXzBORq
 /dL/qC0XH3YXdU+XfIs//giZsmLns6SxZaMr4vs6TxFtuzZBKpLtkOKo9zndvxk=
 =sZY5
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "Xen features and fixes for v4.15-rc1

  Apart from several small fixes it contains the following features:

   - a series by Joao Martins to add vdso support of the pv clock
     interface

   - a series by Juergen Gross to add support for Xen pv guests to be
     able to run on 5 level paging hosts

   - a series by Stefano Stabellini adding the Xen pvcalls frontend
     driver using a paravirtualized socket interface"

* tag 'for-linus-4.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (34 commits)
  xen/pvcalls: fix potential endless loop in pvcalls-front.c
  xen/pvcalls: Add MODULE_LICENSE()
  MAINTAINERS: xen, kvm: track pvclock-abi.h changes
  x86/xen/time: setup vcpu 0 time info page
  x86/xen/time: set pvclock flags on xen_time_init()
  x86/pvclock: add setter for pvclock_pvti_cpu0_va
  ptp_kvm: probe for kvm guest availability
  xen/privcmd: remove unused variable pageidx
  xen: select grant interface version
  xen: update arch/x86/include/asm/xen/cpuid.h
  xen: add grant interface version dependent constants to gnttab_ops
  xen: limit grant v2 interface to the v1 functionality
  xen: re-introduce support for grant v2 interface
  xen: support priv-mapping in an HVM tools domain
  xen/pvcalls: remove redundant check for irq >= 0
  xen/pvcalls: fix unsigned less than zero error check
  xen/time: Return -ENODEV from xen_get_wallclock()
  xen/pvcalls-front: mark expected switch fall-through
  xen: xenbus_probe_frontend: mark expected switch fall-throughs
  xen/time: do not decrease steal time after live migration on xen
  ...
2017-11-16 13:06:27 -08:00
Linus Torvalds
974aa5630b First batch of KVM changes for 4.15
Common:
  - Python 3 support in kvm_stat
 
  - Accounting of slabs to kmemcg
 
 ARM:
  - Optimized arch timer handling for KVM/ARM
 
  - Improvements to the VGIC ITS code and introduction of an ITS reset
    ioctl
 
  - Unification of the 32-bit fault injection logic
 
  - More exact external abort matching logic
 
 PPC:
  - Support for running hashed page table (HPT) MMU mode on a host that
    is using the radix MMU mode;  single threaded mode on POWER 9 is
    added as a pre-requisite
 
  - Resolution of merge conflicts with the last second 4.14 HPT fixes
 
  - Fixes and cleanups
 
 s390:
  - Some initial preparation patches for exitless interrupts and crypto
 
  - New capability for AIS migration
 
  - Fixes
 
 x86:
  - Improved emulation of LAPIC timer mode changes, MCi_STATUS MSRs, and
    after-reset state
 
  - Refined dependencies for VMX features
 
  - Fixes for nested SMI injection
 
  - A lot of cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJaDayXAAoJEED/6hsPKofo/3UH/3HvlcHt+ADTkCU1/iiKAs+i
 0zngIOXIxgHDnV0ww6bV+Znww0BzTYgKCAXX76z603jdpDwG/pzQQcbLDF5ZoJnD
 sQtF10gZinWaRsHlfbLqjrHGL2pGDHO1UKBKLJ0bAIyORPZBxs7i+VmrY/blnr9c
 0wsybJ8RbvwAxjsDL5jeX/z4NehPupmKUc4Lf0eZdSHwVOf9sjn+MP6jJ0r2JcIb
 D+zddPBiLStzN97t4gZpQsrlj3LKrDS+6hY+1TjSvlh+yHKFVFh58VhLm4DuDeb5
 bYOAlWJ/gAWEzfvr5Ld+Nd7SqWWn/14logPkQ4gcU4BI/neAOzk4c6hJfCHl1nk=
 =593n
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "First batch of KVM changes for 4.15

  Common:
   - Python 3 support in kvm_stat
   - Accounting of slabs to kmemcg

  ARM:
   - Optimized arch timer handling for KVM/ARM
   - Improvements to the VGIC ITS code and introduction of an ITS reset
     ioctl
   - Unification of the 32-bit fault injection logic
   - More exact external abort matching logic

  PPC:
   - Support for running hashed page table (HPT) MMU mode on a host that
     is using the radix MMU mode; single threaded mode on POWER 9 is
     added as a pre-requisite
   - Resolution of merge conflicts with the last second 4.14 HPT fixes
   - Fixes and cleanups

  s390:
   - Some initial preparation patches for exitless interrupts and crypto
   - New capability for AIS migration
   - Fixes

  x86:
   - Improved emulation of LAPIC timer mode changes, MCi_STATUS MSRs,
     and after-reset state
   - Refined dependencies for VMX features
   - Fixes for nested SMI injection
   - A lot of cleanups"

* tag 'kvm-4.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (89 commits)
  KVM: s390: provide a capability for AIS state migration
  KVM: s390: clear_io_irq() requests are not expected for adapter interrupts
  KVM: s390: abstract conversion between isc and enum irq_types
  KVM: s390: vsie: use common code functions for pinning
  KVM: s390: SIE considerations for AP Queue virtualization
  KVM: s390: document memory ordering for kvm_s390_vcpu_wakeup
  KVM: PPC: Book3S HV: Cosmetic post-merge cleanups
  KVM: arm/arm64: fix the incompatible matching for external abort
  KVM: arm/arm64: Unify 32bit fault injection
  KVM: arm/arm64: vgic-its: Implement KVM_DEV_ARM_ITS_CTRL_RESET
  KVM: arm/arm64: Document KVM_DEV_ARM_ITS_CTRL_RESET
  KVM: arm/arm64: vgic-its: Free caches when GITS_BASER Valid bit is cleared
  KVM: arm/arm64: vgic-its: New helper functions to free the caches
  KVM: arm/arm64: vgic-its: Remove kvm_its_unmap_device
  arm/arm64: KVM: Load the timer state when enabling the timer
  KVM: arm/arm64: Rework kvm_timer_should_fire
  KVM: arm/arm64: Get rid of kvm_timer_flush_hwstate
  KVM: arm/arm64: Avoid phys timer emulation in vcpu entry/exit
  KVM: arm/arm64: Move phys_timer_emulate function
  KVM: arm/arm64: Use kvm_arm_timer_set/get_reg for guest register traps
  ...
2017-11-16 13:00:24 -08:00
Linus Torvalds
441692aafc Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:

 - add support for ELF fdpic binaries on both MMU and noMMU platforms

 - linker script cleanups

 - support for compressed .data section for XIP images

 - discard memblock arrays when possible

 - various cleanups

 - atomic DMA pool updates

 - better diagnostics of missing/corrupt device tree

 - export information to allow userspace kexec tool to place images more
   inteligently, so that the device tree isn't overwritten by the
   booting kernel

 - make early_printk more efficient on semihosted systems

 - noMMU cleanups

 - SA1111 PCMCIA update in preparation for further cleanups

* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (38 commits)
  ARM: 8719/1: NOMMU: work around maybe-uninitialized warning
  ARM: 8717/2: debug printch/printascii: translate '\n' to "\r\n" not "\n\r"
  ARM: 8713/1: NOMMU: Support MPU in XIP configuration
  ARM: 8712/1: NOMMU: Use more MPU regions to cover memory
  ARM: 8711/1: V7M: Add support for MPU to M-class
  ARM: 8710/1: Kconfig: Kill CONFIG_VECTORS_BASE
  ARM: 8709/1: NOMMU: Disallow MPU for XIP
  ARM: 8708/1: NOMMU: Rework MPU to be mostly done in C
  ARM: 8707/1: NOMMU: Update MPU accessors to use cp15 helpers
  ARM: 8706/1: NOMMU: Move out MPU setup in separate module
  ARM: 8702/1: head-common.S: Clear lr before jumping to start_kernel()
  ARM: 8705/1: early_printk: use printascii() rather than printch()
  ARM: 8703/1: debug.S: move hexbuf to a writable section
  ARM: add additional table to compressed kernel
  ARM: decompressor: fix BSS size calculation
  pcmcia: sa1111: remove special sa1111 mmio accessors
  pcmcia: sa1111: use sa1111_get_irq() to obtain IRQ resources
  ARM: better diagnostics with missing/corrupt dtb
  ARM: 8699/1: dma-mapping: Remove init_dma_coherent_pool_size()
  ARM: 8698/1: dma-mapping: Mark atomic_pool as __ro_after_init
  ..
2017-11-16 12:50:35 -08:00
Linus Torvalds
5b0e2cb020 powerpc updates for 4.15
Non-highlights:
 
  - Five fixes for the >128T address space handling, both to fix bugs in our
    implementation and to bring the semantics exactly into line with x86.
 
 Highlights:
 
  - Support for a new OPAL call on bare metal machines which gives us a true NMI
    (ie. is not masked by MSR[EE]=0) for debugging etc.
 
  - Support for Power9 DD2 in the CXL driver.
 
  - Improvements to machine check handling so that uncorrectable errors can be
    reported into the generic memory_failure() machinery.
 
  - Some fixes and improvements for VPHN, which is used under PowerVM to notify
    the Linux partition of topology changes.
 
  - Plumbing to enable TM (transactional memory) without suspend on some Power9
    processors (PPC_FEATURE2_HTM_NO_SUSPEND).
 
  - Support for emulating vector loads form cache-inhibited memory, on some
    Power9 revisions.
 
  - Disable the fast-endian switch "syscall" by default (behind a CONFIG), we
    believe it has never had any users.
 
  - A major rework of the API drivers use when initiating and waiting for long
    running operations performed by OPAL firmware, and changes to the
    powernv_flash driver to use the new API.
 
  - Several fixes for the handling of FP/VMX/VSX while processes are using
    transactional memory.
 
  - Optimisations of TLB range flushes when using the radix MMU on Power9.
 
  - Improvements to the VAS facility used to access coprocessors on Power9, and
    related improvements to the way the NX crypto driver handles requests.
 
  - Implementation of PMEM_API and UACCESS_FLUSHCACHE for 64-bit.
 
 Thanks to:
   Alexey Kardashevskiy, Alistair Popple, Allen Pais, Andrew Donnellan, Aneesh
   Kumar K.V, Arnd Bergmann, Balbir Singh, Benjamin Herrenschmidt, Breno Leitao,
   Christophe Leroy, Christophe Lombard, Cyril Bur, Frederic Barrat, Gautham R.
   Shenoy, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo Romero, Haren
   Myneni, Joel Stanley, Kamalesh Babulal, Kautuk Consul, Markus Elfring, Masami
   Hiramatsu, Michael Bringmann, Michael Neuling, Michal Suchanek, Naveen N. Rao,
   Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pedro Miraglia Franco de
   Carvalho, Philippe Bergheaud, Sandipan Das, Seth Forshee, Shriya, Stephen
   Rothwell, Stewart Smith, Sukadev Bhattiprolu, Tyrel Datwyler, Vaibhav Jain,
   Vaidyanathan Srinivasan, William A. Kennington III.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDXGuAAoJEFHr6jzI4aWAEqwP/0TA35KFAK6wqfkCf67z4q+O
 I+5piI4eDV4jdCakfoIN1JfjhQRULNePSoCHTccan30mu/bm30p69xtOLL2/h5xH
 Mhz/eDBAOo0lrT20nyZfYMW3FnM66wnNf++qJ0O+8L052r4WOB02J0k1uM1ST01D
 5Lb5mUoxRLRzCgKRYAYWJifn+IFPUB9NMsvMTym94krAFlIjIzMEQXhDoln+jJMr
 QmY5f1BTA/fLfXobn0zwoc/C1oa2PUtxd+rxbwGrLoZ6G843mMqUi90SMr5ybhXp
 RzepnBTj4by3vOsnk/X1mANyaZfLsunp75FwnjHdPzKrAS/TuPp8D/iSxxE/PzEq
 cLwJFBnFXSgQMefDErXxhHSDz2dAg5r14rsTpDcq2Ko8TPV4rPsuSfmbd9Txekb0
 yWHsjoJUBBMl2QcWqIHl+AlV8j1RklF6solcTBcGnH1CZJMfa05VKXV7xGEvOHa0
 RJ+/xPyR9KjoB/SUp++9Vmx/M6SwQYFOJlr3Zpg9LNtR8WpoPYu1E6eO+u1Hhzny
 eJqaNstH+i+VdY9eqszkAsEBh8o9M/+b+7Wx7TetvU+v368CbXtgFYs9qy2oZjPF
 t9sY/BHaHZ8eZ7I00an77a0fVV5B1PVASUtIz5CqkwGpMvX6Z6W2K/XUUFI61kuu
 E06HS6Ht8UPJAzrAPUMl
 =Rq81
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "A bit of a small release, I suspect in part due to me travelling for
  KS. But my backlog of patches to review is smaller than usual, so I
  think in part folks just didn't send as much this cycle.

  Non-highlights:

   - Five fixes for the >128T address space handling, both to fix bugs
     in our implementation and to bring the semantics exactly into line
     with x86.

  Highlights:

   - Support for a new OPAL call on bare metal machines which gives us a
     true NMI (ie. is not masked by MSR[EE]=0) for debugging etc.

   - Support for Power9 DD2 in the CXL driver.

   - Improvements to machine check handling so that uncorrectable errors
     can be reported into the generic memory_failure() machinery.

   - Some fixes and improvements for VPHN, which is used under PowerVM
     to notify the Linux partition of topology changes.

   - Plumbing to enable TM (transactional memory) without suspend on
     some Power9 processors (PPC_FEATURE2_HTM_NO_SUSPEND).

   - Support for emulating vector loads form cache-inhibited memory, on
     some Power9 revisions.

   - Disable the fast-endian switch "syscall" by default (behind a
     CONFIG), we believe it has never had any users.

   - A major rework of the API drivers use when initiating and waiting
     for long running operations performed by OPAL firmware, and changes
     to the powernv_flash driver to use the new API.

   - Several fixes for the handling of FP/VMX/VSX while processes are
     using transactional memory.

   - Optimisations of TLB range flushes when using the radix MMU on
     Power9.

   - Improvements to the VAS facility used to access coprocessors on
     Power9, and related improvements to the way the NX crypto driver
     handles requests.

   - Implementation of PMEM_API and UACCESS_FLUSHCACHE for 64-bit.

  Thanks to: Alexey Kardashevskiy, Alistair Popple, Allen Pais, Andrew
  Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Balbir Singh, Benjamin
  Herrenschmidt, Breno Leitao, Christophe Leroy, Christophe Lombard,
  Cyril Bur, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven,
  Guilherme G. Piccoli, Gustavo Romero, Haren Myneni, Joel Stanley,
  Kamalesh Babulal, Kautuk Consul, Markus Elfring, Masami Hiramatsu,
  Michael Bringmann, Michael Neuling, Michal Suchanek, Naveen N. Rao,
  Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pedro Miraglia
  Franco de Carvalho, Philippe Bergheaud, Sandipan Das, Seth Forshee,
  Shriya, Stephen Rothwell, Stewart Smith, Sukadev Bhattiprolu, Tyrel
  Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, and William A.
  Kennington III"

* tag 'powerpc-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (151 commits)
  powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature
  powerpc/64s: Fix masking of SRR1 bits on instruction fault
  powerpc/64s: mm_context.addr_limit is only used on hash
  powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation
  powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary
  powerpc/64s/hash: Fix fork() with 512TB process address space
  powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
  powerpc/64s/hash: Fix 512T hint detection to use >= 128T
  powerpc: Fix DABR match on hash based systems
  powerpc/signal: Properly handle return value from uprobe_deny_signal()
  powerpc/fadump: use kstrtoint to handle sysfs store
  powerpc/lib: Implement UACCESS_FLUSHCACHE API
  powerpc/lib: Implement PMEM API
  powerpc/powernv/npu: Don't explicitly flush nmmu tlb
  powerpc/powernv/npu: Use flush_all_mm() instead of flush_tlb_mm()
  powerpc/powernv/idle: Round up latency and residency values
  powerpc/kprobes: refactor kprobe_lookup_name for safer string operations
  powerpc/kprobes: Blacklist emulate_update_regs() from kprobes
  powerpc/kprobes: Do not disable interrupts for optprobes and kprobes_on_ftrace
  powerpc/kprobes: Disable preemption before invoking probe handler for optprobes
  ...
2017-11-16 12:47:46 -08:00
Linus Torvalds
487e2c9f44 AFS development
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAWgm9V/Sw1s6N8H32AQK5mQ//QGUDZLXsUPCtq0XJq0V+r4MUjNp9tCZR
 htiuNrEkHSyPpYgCcQ2Aqdl9kndwVXcE7lWT99mp/a0zwNAsp9GOGVhCXUd5R86G
 XlrBuUYVvBJk18tDsUNWdjRQ0gMHgQSlEnEbsaGiU1bVrpXatI9hL8qoeO78Iy7+
 eaJUQLCuCVJq7qMQGhC0hg338vmHVeYhnViXIxq+HFjsMmR9IVanuK+sQr6NSJxS
 F6RkPxBUPWkRVMHmxTLWj/XSHZwtwu+Mnc/UFYsAPLKEbY0cIohsI8EgfE8U7geU
 yRVnu3MIOXUXUrZizj9SwVYWdJfneRlINqMbHIO8QXMKR38tnQ0C2/7bgBsXiNPv
 YdiAyeqL4nM+JthV/rgA3hWgupwBlSb4ubclTphDNxMs5MBIUIK3XUt9GOXDDUZz
 2FT/FdrphM2UORaI2AEOi4Q0/nHdin+3rld8fjV0Ree/TPNXwcrOmvy8yGnxFCEp
 5b7YLwKrffZGnnS965dhZlnFR6hjndmzFgHdyRrJwc80hXi1Q/+W4F19MoYkkoVK
 G/gLvD3FbmygmFnjCik9TjUrro6vQxo56H/TuWgHTvYriNGH+D/D7EGUwg4GiXZZ
 +7vrNw660uXmZiu9i0YacCRyD8lvm7QpmWLb+uHwzfsBE1+C8UetyQ+egSWVdWJO
 KwPspygWXD4=
 =3vy0
 -----END PGP SIGNATURE-----

Merge tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS updates from David Howells:
 "kAFS filesystem driver overhaul.

  The major points of the overhaul are:

   (1) Preliminary groundwork is laid for supporting network-namespacing
       of kAFS. The remainder of the namespacing work requires some way
       to pass namespace information to submounts triggered by an
       automount. This requires something like the mount overhaul that's
       in progress.

   (2) sockaddr_rxrpc is used in preference to in_addr for holding
       addresses internally and add support for talking to the YFS VL
       server. With this, kAFS can do everything over IPv6 as well as
       IPv4 if it's talking to servers that support it.

   (3) Callback handling is overhauled to be generally passive rather
       than active. 'Callbacks' are promises by the server to tell us
       about data and metadata changes. Callbacks are now checked when
       we next touch an inode rather than actively going and looking for
       it where possible.

   (4) File access permit caching is overhauled to store the caching
       information per-inode rather than per-directory, shared over
       subordinate files. Whilst older AFS servers only allow ACLs on
       directories (shared to the files in that directory), newer AFS
       servers break that restriction.

       To improve memory usage and to make it easier to do mass-key
       removal, permit combinations are cached and shared.

   (5) Cell database management is overhauled to allow lighter locks to
       be used and to make cell records autonomous state machines that
       look after getting their own DNS records and cleaning themselves
       up, in particular preventing races in acquiring and relinquishing
       the fscache token for the cell.

   (6) Volume caching is overhauled. The afs_vlocation record is got rid
       of to simplify things and the superblock is now keyed on the cell
       and the numeric volume ID only. The volume record is tied to a
       superblock and normal superblock management is used to mediate
       the lifetime of the volume fscache token.

   (7) File server record caching is overhauled to make server records
       independent of cells and volumes. A server can be in multiple
       cells (in such a case, the administrator must make sure that the
       VL services for all cells correctly reflect the volumes shared
       between those cells).

       Server records are now indexed using the UUID of the server
       rather than the address since a server can have multiple
       addresses.

   (8) File server rotation is overhauled to handle VMOVED, VBUSY (and
       similar), VOFFLINE and VNOVOL indications and to handle rotation
       both of servers and addresses of those servers. The rotation will
       also wait and retry if the server says it is busy.

   (9) Data writeback is overhauled. Each inode no longer stores a list
       of modified sections tagged with the key that authorised it in
       favour of noting the modified region of a page in page->private
       and storing a list of keys that made modifications in the inode.

       This simplifies things and allows other keys to be used to
       actually write to the server if a key that made a modification
       becomes useless.

  (10) Writable mmap() is implemented. This allows a kernel to be build
       entirely on AFS.

  Note that Pre AFS-3.4 servers are no longer supported, though this can
  be added back if necessary (AFS-3.4 was released in 1998)"

* tag 'afs-next-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (35 commits)
  afs: Protect call->state changes against signals
  afs: Trace page dirty/clean
  afs: Implement shared-writeable mmap
  afs: Get rid of the afs_writeback record
  afs: Introduce a file-private data record
  afs: Use a dynamic port if 7001 is in use
  afs: Fix directory read/modify race
  afs: Trace the sending of pages
  afs: Trace the initiation and completion of client calls
  afs: Fix documentation on # vs % prefix in mount source specification
  afs: Fix total-length calculation for multiple-page send
  afs: Only progress call state at end of Tx phase from rxrpc callback
  afs: Make use of the YFS service upgrade to fully support IPv6
  afs: Overhaul volume and server record caching and fileserver rotation
  afs: Move server rotation code into its own file
  afs: Add an address list concept
  afs: Overhaul cell database management
  afs: Overhaul permit caching
  afs: Overhaul the callback handling
  afs: Rename struct afs_call server member to cm_server
  ...
2017-11-16 11:41:22 -08:00
Linus Torvalds
b630a23a73 This is the bulk of pin control changes for the v4.15
kernel cycle:
 
 Core:
 
 - The pin control Kconfig entry PINCTRL is now turned into
   a menuconfig option. This obviously has the implication of
   making the subsystem menu visible in menuconfig. This is
   happening because of two things:
 
   - Intel have started to deploy and depend on pin controllers
     in a way that is affecting users directly. This happens
     on the highly integrated laptop chipsets named after
     geographical places: baytrail, broxton, cannonlake,
     cedarfork, cherryview, denverton, geminilake, lewisburg,
     merrifield, sunrisepoint... It started a while back and
     now it is ever more evident that this is crucial
     infrastructure for x86 laptops and not an embedded
     obscurity anymore. Users need to be aware.
 
   - Pin control expanders on I2C and SPI that are
     arch-agnostic. Currently Semtech SX150X and Microchip
     MCP28x08 but more are expected. Users will have to be
     able to configure these in directly for their set-up.
 
 - Just go and select GPIOLIB now that we made sure that
   GPIOLIB is a very vanilla subsystem. Do not depend on
   it, if we need it, select it.
 
 - Exposing the pin control subsystem in menuconfig uncovered
   a bunch of obscure bugs that are now hopefully fixed,
   all more or less pertaining to Blackfin.
 
 - Unified namespace for cross-calls between pin control and
   GPIO.
 
 - New support for clock skew/delay generic DT bindings
   and generic pin config options for this.
 
 - Minor documentation improvements.
 
 Various:
 
 - The Renesas SH-PFC pin controller has evolved a lot. It seems
   Renesas are churning out new SoCs by the minute.
 
 - A bunch of non-critical fixes for the Rockchip driver.
 
 - Improve the use of library functions instead of open coding.
 
 - Support the MCP28018 variant in the MCP28x08 driver.
 
 - Static constifying.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaDV9TAAoJEEEQszewGV1zf0AQAIlHxM8B0mJPOFv7WdPIHs8j
 GSGAPv0rPobdgZI8vegosIQmAiry5jjaHP6VGOrK5n8FRxfBLd89NLT7dgK7J9Yx
 tYcQRQn1/MqZKaIjWWgTes3okEr9s77Of3aWkA9gyvBjTGoo2hu8BTwZOYuPrIPP
 aYcI7VR0VbTe7FQR1QRtKBXnBTXfznF1j5ckKNY4ahgIPcUgxyh6EA1E61rDorLK
 gvwwzoBqIKQAcnapgarF7YOJjoE0i7ZoSlhL0b0nvhcgolyK/zLN4xujLcTGPeTJ
 hQwe7LhxtvtmJmu0jRMuetDLFT52d6eq8ttyFBMULkgRzcgMv6GZZXUy4k92t7ZT
 F2DRbAjyAlxkhUhQ8BORzEXwfWYITt1M49jWQqugdDR2fV/MAlF8motOkVBl73iS
 zHIQ/ZDcAD+PlwTHiDyDOUxj7qyDs2MkTLTzfXc0koOQZOqskDHQ1dIf3UzLzZ9S
 /dx339/ejwP73E0lzOsanhianfonqWZ3Apn3aRG18uqCt2+eHySWpxyRANuOlBZI
 czERg+47wDfng24xyuH0EElgbS5G0Bt1lT5zLVLdFEvoLmcBHVKqaCkiuvYXOjVM
 GyMRvQPiJbhT6qiJ+aSP8t/utl1aUhXQLtrUnXxu8qv9tQ6jgmqiQd9855Uvrzb0
 ZR2wyNc2jtWzwCfrkWjt
 =kj/b
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "This is the bulk of pin control changes for the v4.15 kernel cycle:

  Core:

   - The pin control Kconfig entry PINCTRL is now turned into a
     menuconfig option. This obviously has the implication of making the
     subsystem menu visible in menuconfig. This is happening because of
     two things:

      (a) Intel have started to deploy and depend on pin controllers in
          a way that is affecting users directly. This happens on the
          highly integrated laptop chipsets named after geographical
          places: baytrail, broxton, cannonlake, cedarfork, cherryview,
          denverton, geminilake, lewisburg, merrifield, sunrisepoint...
          It started a while back and now it is ever more evident that
          this is crucial infrastructure for x86 laptops and not an
          embedded obscurity anymore. Users need to be aware.

      (b) Pin control expanders on I2C and SPI that are arch-agnostic.
          Currently Semtech SX150X and Microchip MCP28x08 but more are
          expected. Users will have to be able to configure these in
          directly for their set-up.

   - Just go and select GPIOLIB now that we made sure that GPIOLIB is a
     very vanilla subsystem. Do not depend on it, if we need it, select
     it.

   - Exposing the pin control subsystem in menuconfig uncovered a bunch
     of obscure bugs that are now hopefully fixed, all more or less
     pertaining to Blackfin.

   - Unified namespace for cross-calls between pin control and GPIO.

   - New support for clock skew/delay generic DT bindings and generic
     pin config options for this.

   - Minor documentation improvements.

  Various:

   - The Renesas SH-PFC pin controller has evolved a lot. It seems
     Renesas are churning out new SoCs by the minute.

   - A bunch of non-critical fixes for the Rockchip driver.

   - Improve the use of library functions instead of open coding.

   - Support the MCP28018 variant in the MCP28x08 driver.

   - Static constifying"

* tag 'pinctrl-v4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (91 commits)
  pinctrl: gemini: Fix missing pad descriptions
  pinctrl: Add some depends on HAS_IOMEM
  pinctrl: samsung/s3c24xx: add CONFIG_OF dependency
  pinctrl: gemini: Fix GMAC groups
  pinctrl: qcom: spmi-gpio: Add pmi8994 gpio support
  pinctrl: ti-iodelay: remove redundant unused variable dev
  pinctrl: max77620: Use common error handling code in max77620_pinconf_set()
  pinctrl: gemini: Implement clock skew/delay config
  pinctrl: gemini: Use generic DT parser
  pinctrl: Add skew-delay pin config and bindings
  pinctrl: armada-37xx: Add edge both type gpio irq support
  pinctrl: uniphier: remove eMMC hardware reset pin-mux
  pinctrl: rockchip: Add iomux-route switching support for rk3288
  pinctrl: intel: Add Intel Cedar Fork PCH pin controller support
  pinctrl: intel: Make offset to interrupt status register configurable
  pinctrl: sunxi: Enforce the strict mode by default
  pinctrl: sunxi: Disable strict mode for old pinctrl drivers
  pinctrl: sunxi: Introduce the strict flag
  pinctrl: sh-pfc: Save/restore registers for PSCI system suspend
  pinctrl: sh-pfc: r8a7796: Use generic IOCTRL register description
  ...
2017-11-16 10:57:11 -08:00
Linus Torvalds
2bf16b7a73 Char/Misc patches for 4.15-rc1
Here is the big set of char/misc and other driver subsystem patches for
 4.15-rc1.
 
 There are small changes all over here, hyperv driver updates, pcmcia
 driver updates, w1 driver updats, vme driver updates, nvmem driver
 updates, and lots of other little one-off driver updates as well.  The
 shortlog has the full details.
 
 Note, there will be a merge conflict in drivers/misc/lkdtm_core.c when
 merging to your tree as one lkdtm patch came in through the perf tree as
 well as this one.  The resolution is to take the const change that this
 tree provides.
 
 All of these have been in linux-next for quite a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWg2Lnw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymTUwCgwp46+I8yPlgDH8oe5TxyyJnpdHQAn1XW0i+a
 sBi6WS87In5v1QO1Rgfc
 =dH2a
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-4.15-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc updates from Greg KH:
 "Here is the big set of char/misc and other driver subsystem patches
  for 4.15-rc1.

  There are small changes all over here, hyperv driver updates, pcmcia
  driver updates, w1 driver updats, vme driver updates, nvmem driver
  updates, and lots of other little one-off driver updates as well. The
  shortlog has the full details.

  All of these have been in linux-next for quite a while with no
  reported issues"

* tag 'char-misc-4.15-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (90 commits)
  VME: Return -EBUSY when DMA list in use
  w1: keep balance of mutex locks and refcnts
  MAINTAINERS: Update VME subsystem tree.
  nvmem: sunxi-sid: add support for A64/H5's SID controller
  nvmem: imx-ocotp: Update module description
  nvmem: imx-ocotp: Enable i.MX7D OTP write support
  nvmem: imx-ocotp: Add i.MX7D timing write clock setup support
  nvmem: imx-ocotp: Move i.MX6 write clock setup to dedicated function
  nvmem: imx-ocotp: Add support for banked OTP addressing
  nvmem: imx-ocotp: Pass parameters via a struct
  nvmem: imx-ocotp: Restrict OTP write to IMX6 processors
  nvmem: uniphier: add UniPhier eFuse driver
  dt-bindings: nvmem: add description for UniPhier eFuse
  nvmem: set nvmem->owner to nvmem->dev->driver->owner if unset
  nvmem: qfprom: fix different address space warnings of sparse
  nvmem: mtk-efuse: fix different address space warnings of sparse
  nvmem: mtk-efuse: use stack for nvmem_config instead of malloc'ing it
  nvmem: imx-iim: use stack for nvmem_config instead of malloc'ing it
  thunderbolt: tb: fix use after free in tb_activate_pcie_devices
  MAINTAINERS: Add git tree for Thunderbolt development
  ...
2017-11-16 09:10:59 -08:00
Masahiro Yamada
ba5b5034bd arm64: dts: uniphier: route on-board device IRQ to GPIO controller for PXs3
Commit 429f203eb7 ("arm64: dts: uniphier: route on-board device IRQ
to GPIO controller") missed to update this DTS.  It becames a real
problem when arm and arm64 trees are merged together.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2017-11-16 17:16:55 +01:00
Masahiro Yamada
52c291a36c sh: decompressor: add shipped files to .gitignore
These files are copied from arch/sh/lib, so should be ignored by git.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-17 00:33:08 +09:00
Masahiro Yamada
380a1edbcb frv: .gitignore: ignore vmlinux.lds
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2017-11-17 00:33:08 +09:00
Heiko Carstens
ab35727eb8 s390: remove unused parameter from Makefile
Remove unused parameter from the call function,
which I accidentally added.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:21 +01:00
Hendrik Brueckner
544e8dd7a8 s390/cpum_sf: correctly set the PID and TID in perf samples
The hardware sampler creates samples that are processed at a later
point in time.  The PID and TID values of the perf samples that are
created for hardware samples are initialized with values from the
current task.  Hence, the PID and TID values are not correct and
perf samples are associated with wrong processes.

The PID and TID values are obtained from the Host Program Parameter
(HPP) field in the basic-sampling data entries.  These PIDs are
valid in the init PID namespace.  Ensure that the PIDs in the perf
samples are resolved considering the PID namespace in which the
perf event was created.

To correct the PID and TID values in the created perf samples,
a special overflow handler is installed.  It replaces the default
overflow handler and does not become effective if any other
overflow handler is used.  With the special overflow handler most
of the perf samples are associated with the right processes.
For processes, that are no longer exist, the association might
still be wrong.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:17 +01:00
Hendrik Brueckner
d4c7e649d7 s390/cpum_sf: load program parameter at sampler enablement
The lpp instruction is used to place the PID of the current
task in the program-parameter (PP) register.  The register
contents is then included in the sampling data entries.

The lpp instruction loads the PP register only when at least
one sampling function is enabled.  Otherwise it is executed
as a no-op.

Linux calls lpp at context switch.  If the context switch
happens before the sampler is enabled, the PP register is
empty.  That means, the PID of the task that is sampled is
not stored in sampling data until the next context switch.

Hence, always call lpp when enabling the sampler.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:16 +01:00
Hendrik Brueckner
0da0017f72 s390/perf: extend perf_regs support to include floating-point registers
Extend the perf register support to also export floating-point register
contents for user space tasks.  Floating-point registers might be used
in leaf functions to contain the return address.  Hence, they are required
for proper DWARF unwinding.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:14 +01:00
Heiko Carstens
c33eff6005 s390/perf: add perf_regs support and user stack dump
Add s390 support to dump user stack to user space for DWARF
stack unwinding.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:11 +01:00
Hendrik Brueckner
9232c3c741 s390/cpum_sf: do not register PMU if no sampling mode is authorized
Previously, the cpum_sf PMU was registered even if there is no
sampling mode authorized.  Add a check and register cpum_sf only
at least one sampling mode is authorized.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:10 +01:00
Pu Hou
3d43b981eb s390/cpumf: remove raw event support in basic-only sampling mode
Raw sample was implemented to export the diagnostic samples.
With having this achieved with AUX buffers, there is no requirement
for basic samples to export raw data.  In particular, most basic
sampling information are consumed for creating the perf event sample.

Signed-off-by: Pu Hou <bjhoupu@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:09 +01:00
Pu Hou
cbf6948f36 s390/cpumf: enable using AUX buffer
Modify PMU callback to use AUX buffer for diagnostic mode sampling.
Basic-mode sampling still use orignal way.

Signed-off-by: Pu Hou <bjhoupu@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:07 +01:00
Pu Hou
ca5955cdea s390/cpumf: introduce AUX buffer for dump diagnostic sample data
Current implementation uses a private buffer for cpumf to dump samples.
Samples first go to this buffer. Then copy to ring buffer allocated
by perf core. With AUX buffer, this copy is not needed. AUX buffer is
shared and zero-copy mapped to user space. The trailer information at
the end of each SDB(sample data block) is also exported to user space.
AUX buffer is used when diagnostic sampling mode is enabled.

This patch contains functions to setup/free AUX buffer or to begin/end
sampling per-cpu. Also include function called in interrupt to
collect samples.

Signed-off-by: Pu Hou <bjhoupu@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 15:06:06 +01:00
Radim Krčmář
a6014f1ab7 KVM: s390: fixes and improvements for 4.15
- Some initial preparation patches for exitless interrupts and crypto
 - New capability for AIS migration
 - Fixes
 - merge of the sthyi tree from the base s390 team, which moves the sthyi
 out of KVM into a shared function also for non-KVM
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJaBHkvAAoJEBF7vIC1phx84nwP/AvR7yRRqdeWvpZ+T9hiwscR
 p7AY5jnbVun7QtqR3yK+Z0IuZzU3gWheDNB4ZPegLLgzxN+ge4C45cZbpKJZUYXf
 Fef8kdXs7Agi6oRU+xXKgYipot4g3VBdRGrfktUMiYD/LC7WpDlJybF0UW45FCKk
 ECBumYXQ+6Jo5pplF3VH8XUwZb1IK3+//WaMLOToYyYuijCpcE0KfoKqCMrc39CU
 GMQVq87IXnhKFeIAt4upiaXXHK/0mGBHdkOG6ILwdNMDfCDiBgynU4HALz+wf/IE
 cvukxsTbHWDcgHMznBgDmnmb4DiWqahtatqGpXEpzzabgjIfkHKagYlaRb+VwwNm
 vIDhm18Wq66nnmrNkpwbn1SB2OoOh7ug+YLrc3XTnGnSWWfZGNmrgOj7K9xTE9yZ
 VmQReOT70KB/DYqZDVAlgNFqWA6MyQxcdxKDaLAqGWt2TN/uXrbdBd8Aa4AF1mtb
 V8aFchFPauuj60cqG91NnKVVdVQ1qB+ZG7AzKo9BPa7iCH8IW5UlPL4FXz3lKdDu
 /KWfL/nUFu8hPSmZijdHRKwdekXHfadHXsvGLVSsLxe/DYWaWTAgnwWceUAXUnvK
 ZUDVKUT9S30dIZwbxSLoou03Bu8WZVefxd6/wOi3g2BYRrqZN+w0lhkWIQwKdMo4
 VUP8fkMR9p6cAnitHh3i
 =9mW5
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

KVM: s390: fixes and improvements for 4.15

- Some initial preparation patches for exitless interrupts and crypto
- New capability for AIS migration
- Fixes
- merge of the sthyi tree from the base s390 team, which moves the sthyi
out of KVM into a shared function also for non-KVM
2017-11-16 14:39:46 +01:00
Vasily Gorbik
b192571d1a s390/disassembler: increase show_code buffer size
Current buffer size of 64 is too small. objdump shows that there are
instructions which would require up to 75 bytes buffer (with current
formating). 128 bytes "ought to be enough for anybody".

Also replaces 8 spaces with a single tab to reduce the memory footprint.

Fixes the following KASAN finding:

BUG: KASAN: stack-out-of-bounds in number+0x3fe/0x538
Write of size 1 at addr 000000005a4a75a0 by task bash/1282

CPU: 1 PID: 1282 Comm: bash Not tainted 4.14.0+ #215
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
Call Trace:
([<000000000011eeb6>] show_stack+0x56/0x88)
 [<0000000000e1ce1a>] dump_stack+0x15a/0x1b0
 [<00000000004e2994>] print_address_description+0xf4/0x288
 [<00000000004e2cf2>] kasan_report+0x13a/0x230
 [<0000000000e38ae6>] number+0x3fe/0x538
 [<0000000000e3dfe4>] vsnprintf+0x194/0x948
 [<0000000000e3ea42>] sprintf+0xa2/0xb8
 [<00000000001198dc>] print_insn+0x374/0x500
 [<0000000000119346>] show_code+0x4ee/0x538
 [<000000000011f234>] show_registers+0x34c/0x388
 [<000000000011f2ae>] show_regs+0x3e/0xa8
 [<000000000011f502>] die+0x1ea/0x2e8
 [<0000000000138f0e>] do_no_context+0x106/0x168
 [<0000000000139a1a>] do_protection_exception+0x4da/0x7d0
 [<0000000000e55914>] pgm_check_handler+0x16c/0x1c0
 [<000000000090639e>] sysrq_handle_crash+0x46/0x58
([<0000000000000007>] 0x7)
 [<00000000009073fa>] __handle_sysrq+0x102/0x218
 [<0000000000907c06>] write_sysrq_trigger+0xd6/0x100
 [<000000000061d67a>] proc_reg_write+0xb2/0x128
 [<0000000000520be6>] __vfs_write+0xee/0x368
 [<0000000000521222>] vfs_write+0x21a/0x278
 [<000000000052156a>] SyS_write+0xda/0x178
 [<0000000000e555cc>] system_call+0xc4/0x270

The buggy address belongs to the page:
page:000003d1016929c0 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x0()
raw: 0000000000000000 0000000000000000 0000000000000000 ffffffff00000000
raw: 0000000000000100 0000000000000200 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 000000005a4a7480: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
 000000005a4a7500: 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 00
>000000005a4a7580: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
                               ^
 000000005a4a7600: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f8
 000000005a4a7680: f2 f2 f2 f2 f2 f2 f8 f8 f2 f2 f3 f3 f3 f3 00 00
==================================================================

Cc: <stable@vger.kernel.org>
Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 13:12:44 +01:00
Michael Holzheu
6470c0cc48 s390: Remove CONFIG_HARDENED_USERCOPY
When running the crash tool on a s390 live system we get a kernel panic
for reading memory within the kernel image:

 # uname -a
   Linux r3545011 4.14.0-rc8-00066-g1c9dbd4615fd #45 SMP PREEMPT Fri Nov 10 16:16:22 CET 2017 s390x s390x s390x GNU/Linux
 # crash /boot/vmlinux-devel /dev/mem
 # crash> rd 0x100000

 usercopy: kernel memory exposure attempt detected from 0000000000100000 (<kernel text>) (8 bytes)
 ------------[ cut here ]------------
 kernel BUG at mm/usercopy.c:72!
 illegal operation: 0001 ilc:1 [#1] PREEMPT SMP.
 Modules linked in:
 CPU: 0 PID: 1461 Comm: crash Not tainted 4.14.0-rc8-00066-g1c9dbd4615fd-dirty #46
 Hardware name: IBM 2827 H66 706 (z/VM 6.3.0)
 task: 000000001ad10100 task.stack: 000000001df78000
 Krnl PSW : 0704d00180000000 000000000038165c (__check_object_size+0x164/0x1d0)
            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
 Krnl GPRS: 0000000012440e1d 0000000080000000 0000000000000061 00000000001cabc0
            00000000001cc6d6 0000000000000000 0000000000cc4ed2 0000000000001000
            000003ffc22fdd20 0000000000000008 0000000000100008 0000000000000001
            0000000000000008 0000000000100000 0000000000381658 000000001df7bc90
 Krnl Code: 000000000038164c: c020004a1c4a        larl    %r2,cc4ee0
            0000000000381652: c0e5fff2581b        brasl   %r14,1cc688
           #0000000000381658: a7f40001            brc     15,38165a
           >000000000038165c: eb42000c000c        srlg    %r4,%r2,12
            0000000000381662: eb32001c000c        srlg    %r3,%r2,28
            0000000000381668: c0110003ffff        lgfi    %r1,262143
            000000000038166e: ec31ff752065        clgrj   %r3,%r1,2,381558
            0000000000381674: a7f4ff67            brc     15,381542
 Call Trace:
 ([<0000000000381658>] __check_object_size+0x160/0x1d0)
  [<000000000082263a>] read_mem+0xaa/0x130.
  [<0000000000386182>] __vfs_read+0x42/0x168.
  [<000000000038632e>] vfs_read+0x86/0x140.
  [<0000000000386a26>] SyS_read+0x66/0xc0.
  [<0000000000ace6a4>] system_call+0xc4/0x2b0.
 INFO: lockdep is turned off.
 Last Breaking-Event-Address:
  [<0000000000381658>] __check_object_size+0x160/0x1d0

 Kernel panic - not syncing: Fatal exception: panic_on_oops

With CONFIG_HARDENED_USERCOPY copy_to_user() checks in __check_object_size()
if the source address is within the kernel image. When the crash tool reads
from 0x100000, this check leads to the kernel BUG().

So disable the kernel config option until this bug is fixed.

Corresponding bug report on LKML: https://lkml.org/lkml/2017/11/10/341

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-16 13:12:21 +01:00
Craig Bergstrom
be62a32044 x86/mm: Limit mmap() of /dev/mem to valid physical addresses
One thing /dev/mem access APIs should verify is that there's no way
that excessively large pfn's can leak into the high bits of the
page table entry.

In particular, if people can use "very large physical page addresses"
through /dev/mem to set the bits past bit 58 - SOFTW4 and permission
key bits and NX bit, that could *really* confuse the kernel.

We had an earlier attempt:

  ce56a86e2a ("x86/mm: Limit mmap() of /dev/mem to valid physical addresses")

... which turned out to be too restrictive (breaking mem=... bootups for example) and
had to be reverted in:

  90edaac627 ("Revert "x86/mm: Limit mmap() of /dev/mem to valid physical addresses"")

This v2 attempt modifies the original patch and makes sure that mmap(/dev/mem)
limits the pfns so that it at least fits in the actual pteval_t architecturally:

 - Make sure mmap_mem() actually validates that the offset fits in phys_addr_t

    ( This may be indirectly true due to some other check, but it's not
      entirely obvious. )

 - Change valid_mmap_phys_addr_range() to just use phys_addr_valid()
   on the top byte

    ( Top byte is sufficient, because mmap_mem() has already checked that
      it cannot wrap. )

 - Add a few comments about what the valid_phys_addr_range() vs.
   valid_mmap_phys_addr_range() difference is.

Signed-off-by: Craig Bergstrom <craigb@google.com>
[ Fixed the checks and added comments. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ Collected the discussion and patches into a commit. ]
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Sean Young <sean@mess.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/CA+55aFyEcOMb657vWSmrM13OxmHxC-XxeBmNis=DwVvpJUOogQ@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-16 12:49:48 +01:00
Kirill A. Shutemov
1e0f25dbf2 x86/mm: Prevent non-MAP_FIXED mapping across DEFAULT_MAP_WINDOW border
In case of 5-level paging, the kernel does not place any mapping above
47-bit, unless userspace explicitly asks for it.

Userspace can request an allocation from the full address space by
specifying the mmap address hint above 47-bit.

Nicholas noticed that the current implementation violates this interface:

  If user space requests a mapping at the end of the 47-bit address space
  with a length which causes the mapping to cross the 47-bit border
  (DEFAULT_MAP_WINDOW), then the vma is partially in the address space
  below and above.

Sanity check the mmap address hint so that start and end of the resulting
vma are on the same side of the 47-bit border. If that's not the case fall
back to the code path which ignores the address hint and allocate from the
regular address space below 47-bit.

To make the checks consistent, mask out the address hints lower bits
(either PAGE_MASK or huge_page_mask()) instead of using ALIGN() which can
push them up to the next boundary.

[ tglx: Moved the address check to a function and massaged comment and
  	changelog ]

Reported-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: linux-mm@kvack.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lkml.kernel.org/r/20171115143607.81541-1-kirill.shutemov@linux.intel.com
2017-11-16 11:43:11 +01:00
Linus Torvalds
e60e1ee606 main drm pull request for v4.15
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaCm8RAAoJEAx081l5xIa+zX0QAJSm31kCG3vdw2CNiRx25L3q
 3hcsEOgAjVJ9FQVGKFWjzb8TK35tSqtNx5kWIj0VGaIfBE5Bdg5SLLgKKUYas8rY
 4LaphqICq2uxu2BNa2tpiar/sHhAnuozwQ4czpVWXzlaISnb9yYzRl7gMuyUVGkx
 +Gih5VUhLmQC0HsRTLJ3vaZQoUsLAl2gAjKcWa1bx57j2S+iKOPfsLaq7VYo+y1I
 Njc+iSGqMhJzRLXVkxL2lQKaslp7R38Bbh5K4Kvyjkm4Aq7zErOF6irpOXKMcrGl
 mwnr89vf1G9thjikrBaXpKnuvdbWYveoN/ORMlTdCfxkFnChHLnm3bd7NJ49RXDN
 Hv/Iq9YYjmZ9GTatxnx7lWtmXnZXC5he1yn1JAuz/yt7/0b/Wx+Mu/wEpBXYNFTd
 1AZdD586i+AmPo3yDkqH9nBu8JC0W0AnS9VZma4LVvZOP2UfJmj5Im1CLHItbGDN
 FnUCkwyD/lJUUk+WgT+w/GOMJgmFHDiFFl4tFtYVVjrUirpCFVguSKG9xuv6tT8P
 8iRsoP7RrcmDN9ojN2SEHwcpsAv3HnKkDv+9+GIbWnrGsSbCPq8Qm+JDSvf4h22I
 K5lwNpJrcpSKI+q10L7w2xliTBwb98sJkWGA/rssomrdBOWteGZAyqFRYAVgQ+mJ
 x/nJurIqQYh2KQN9+uLG
 =xVV2
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.15.

  Core:
   - Atomic object lifetime fixes
   - Atomic iterator improvements
   - Sparse/smatch fixes
   - Legacy kms ioctls to be interruptible
   - EDID override improvements
   - fb/gem helper cleanups
   - Simple outreachy patches
   - Documentation improvements
   - Fix dma-buf rcu races
   - DRM mode object leasing for improving VR use cases.
   - vgaarb improvements for non-x86 platforms.

  New driver:
   - tve200: Faraday Technology TVE200 block.

     This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
     the StorLink SL3516 (later Cortina Systems CS3516) as well as the
     Grain Media GM8180.

  New bridges:
   - SiI9234 support

  New panels:
   - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
     LT089AC19000, Innolux AT043TN24

  i915:
   - Remove Coffeelake from alpha support
   - Cannonlake workarounds
   - Infoframe refactoring for DisplayPort
   - VBT updates
   - DisplayPort vswing/emph/buffer translation refactoring
   - CCS fixes
   - Restore GPU clock boost on missed vblanks
   - Scatter list updates for userptr allocations
   - Gen9+ transition watermarks
   - Display IPC (Isochronous Priority Control)
   - Private PAT management
   - GVT: improved error handling and pci config sanitizing
   - Execlist refactoring
   - Transparent Huge Page support
   - User defined priorities support
   - HuC/GuC firmware refactoring
   - DP MST fixes
   - eDP power sequencing fixes
   - Use RCU instead of stop_machine
   - PSR state tracking support
   - Eviction fixes
   - BDW DP aux channel timeout fixes
   - LSPCON fixes
   - Cannonlake PLL fixes

  amdgpu:
   - Per VM BO support
   - Powerplay cleanups
   - CI powerplay support
   - PASID mgr for kfd
   - SR-IOV fixes
   - initial GPU reset for vega10
   - Prime mmap support
   - TTM updates
   - Clock query interface for Raven
   - Fence to handle ioctl
   - UVD encode ring support on Polaris
   - Transparent huge page DMA support
   - Compute LRU pipe tweaks
   - BO flag to allow buffers to opt out of implicit sync
   - CTX priority setting API
   - VRAM lost infrastructure plumbing

  qxl:
   - fix flicker since atomic rework

  amdkfd:
   - Further improvements from internal AMD tree
   - Usermode events
   - Drop radeon support

  nouveau:
   - Pascal temperature sensor support
   - Improved BAR2 handling
   - MMU rework to support Pascal MMU

  exynos:
   - Improved HDMI/mixer support
   - HDMI audio interface support

  tegra:
   - Prep work for tegra186
   - Cleanup/fixes

  msm:
   - Preemption support for a5xx
   - Display fixes for 8x96 (snapdragon 820)
   - Async cursor plane fixes
   - FW loading rework
   - GPU debugging improvements

  vc4:
   - Prep for DSI panels
   - fix T-format tiling scanout
   - New madvise ioctl

  Rockchip:
   - LVDS support

  omapdrm:
   - omap4 HDMI CEC support

  etnaviv:
   - GPU performance counters groundwork

  sun4i:
   - refactor driver load + TCON backend
   - HDMI improvements
   - A31 support
   - Misc fixes

  udl:
   - Probe/EDID read fixes.

  tilcdc:
   - Misc fixes.

  pl111:
   - Support more variants

  adv7511:
   - Improve EDID handling.
   - HDMI CEC support

  sii8620:
   - Add remote control support"

* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
  drm/rockchip: analogix_dp: Use mutex rather than spinlock
  drm/mode_object: fix documentation for object lookups.
  drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
  drm/i915: Move init_clock_gating() back to where it was
  drm/i915: Prune the reservation shared fence array
  drm/i915: Idle the GPU before shinking everything
  drm/i915: Lock llist_del_first() vs llist_del_all()
  drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
  drm/i915: Disable lazy PPGTT page table optimization for vGPU
  drm/i915/execlists: Remove the priority "optimisation"
  drm/i915: Filter out spurious execlists context-switch interrupts
  drm/amdgpu: use irq-safe lock for kiq->ring_lock
  drm/amdgpu: bypass lru touch for KIQ ring submission
  drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
  drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
  drm/amd/powerplay: initialize a variable before using it
  drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
  drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
  drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
  drm/rockchip: add CONFIG_OF dependency for lvds
  ...
2017-11-15 20:42:10 -08:00
Linus Torvalds
5d352e69c6 media updates for v4.15-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaC/4EAAoJEAhfPr2O5OEVQnsP/2JNpdLuzgwNp0p2gXrvK5pl
 KOsA6Fld6RNmpuel8eHARbbDTPKF1Y1bvYXVo7lPhXb7KuM2IzG56VxoNech/5pX
 eflKwnpV/Ns/ZMLYue7Rqdw0iZnjWESBWf5lzg9MvzwhZBaPlRwqu/aOJy360AZr
 FnjKHtU/6WUIOCB8r0qLBDR/epoh7y2lKfjDTcEBrURrFEsTajdyd59npdMSIQqO
 iUeeBVEIUKyytYDQNM/VOsBnh0G+2inLuykF8Nd6pYs8O0iUEUpZYwdGuwGUG1HB
 VmCcRGU62efl5nu8zQMPnwAvjXwZAh8vbS0ha+B1vBJh1RwNVUz0kKIKEgAaOMZ3
 zZa3NLfDP4cHgYtr2Xw2vSvJvDwQecmiItJKeZ/Id4cPy03TKEV1KEaHCQJHwbDn
 RP/o9C+5gagMO/oIvZPQ+esVZXQ4prAzOdX53N7HPn4Wn+k4clkI0+hMvMGf67mo
 EYOguCqbN2D0e11BLiPP1bRbGZRSI8I9xcKuhcw4ajJHbRRkrjl8EW7V6c8CuMkd
 0Wj5oidFleJ0Vy+qQOPqXN1FwR7AbHNtI38JfWNz324AIrFCQERpfXVmKwRPZfl4
 YXgGIA9fil3a01YJCtxc0PsXlRkveKJ8hKCLpjXbjNTh1oSbgrDxx5sMx9PO6WqJ
 VOb6fL17rwTXlKV/GeU/
 =d9nT
 -----END PGP SIGNATURE-----

Merge tag 'media/v4.15-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - Documentation for digital TV (both kAPI and uAPI) are now in sync
   with the implementation (except for legacy/deprecated ioctls). This
   is a major step, as there were always a gap there

 - New sensor driver: imx274

 - New cec driver: cec-gpio

 - New platform driver for rockship rga and tegra CEC

 - New RC driver: tango-ir

 - Several cleanups at atomisp driver

 - Core improvements for RC, CEC, V4L2 async probing support and DVB

 - Lots of drivers cleanup, fixes and improvements.

* tag 'media/v4.15-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (332 commits)
  dvb_frontend: don't use-after-free the frontend struct
  media: dib0700: fix invalid dvb_detach argument
  media: v4l2-ctrls: Don't validate BITMASK twice
  media: s5p-mfc: fix lockdep warning
  media: dvb-core: always call invoke_release() in fe_free()
  media: usb: dvb-usb-v2: dvb_usb_core: remove redundant code in dvb_usb_fe_sleep
  media: au0828: make const array addr_list static
  media: cx88: make const arrays default_addr_list and pvr2000_addr_list static
  media: drxd: make const array fastIncrDecLUT static
  media: usb: fix spelling mistake: "synchronuously" -> "synchronously"
  media: ddbridge: fix build warnings
  media: av7110: avoid 2038 overflow in debug print
  media: Don't do DMA on stack for firmware upload in the AS102 driver
  media: v4l: async: fix unregister for implicitly registered sub-device notifiers
  media: v4l: async: fix return of unitialized variable ret
  media: imx274: fix missing return assignment from call to imx274_mode_regs
  media: camss-vfe: always initialize reg at vfe_set_xbar_cfg()
  media: atomisp: make function calls cleaner
  media: atomisp: get rid of storage_class.h
  media: atomisp: get rid of wrong stddef.h include
  ...
2017-11-15 20:30:12 -08:00
Linus Torvalds
7c225c69f8 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few misc bits

 - ocfs2 updates

 - almost all of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits)
  memory hotplug: fix comments when adding section
  mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP
  mm: simplify nodemask printing
  mm,oom_reaper: remove pointless kthread_run() error check
  mm/page_ext.c: check if page_ext is not prepared
  writeback: remove unused function parameter
  mm: do not rely on preempt_count in print_vma_addr
  mm, sparse: do not swamp log with huge vmemmap allocation failures
  mm/hmm: remove redundant variable align_end
  mm/list_lru.c: mark expected switch fall-through
  mm/shmem.c: mark expected switch fall-through
  mm/page_alloc.c: broken deferred calculation
  mm: don't warn about allocations which stall for too long
  fs: fuse: account fuse_inode slab memory as reclaimable
  mm, page_alloc: fix potential false positive in __zone_watermark_ok
  mm: mlock: remove lru_add_drain_all()
  mm, sysctl: make NUMA stats configurable
  shmem: convert shmem_init_inodecache() to void
  Unify migrate_pages and move_pages access checks
  mm, pagevec: rename pagevec drained field
  ...
2017-11-15 19:42:40 -08:00
Michal Hocko
fcdaf842bd mm, sparse: do not swamp log with huge vmemmap allocation failures
While doing memory hotplug tests under heavy memory pressure we have
noticed too many page allocation failures when allocating vmemmap memmap
backed by huge page

  kworker/u3072:1: page allocation failure: order:9, mode:0x24084c0(GFP_KERNEL|__GFP_REPEAT|__GFP_ZERO)
  [...]
  Call Trace:
    dump_trace+0x59/0x310
    show_stack_log_lvl+0xea/0x170
    show_stack+0x21/0x40
    dump_stack+0x5c/0x7c
    warn_alloc_failed+0xe2/0x150
    __alloc_pages_nodemask+0x3ed/0xb20
    alloc_pages_current+0x7f/0x100
    vmemmap_alloc_block+0x79/0xb6
    __vmemmap_alloc_block_buf+0x136/0x145
    vmemmap_populate+0xd2/0x2b9
    sparse_mem_map_populate+0x23/0x30
    sparse_add_one_section+0x68/0x18e
    __add_pages+0x10a/0x1d0
    arch_add_memory+0x4a/0xc0
    add_memory_resource+0x89/0x160
    add_memory+0x6d/0xd0
    acpi_memory_device_add+0x181/0x251
    acpi_bus_attach+0xfd/0x19b
    acpi_bus_scan+0x59/0x69
    acpi_device_hotplug+0xd2/0x41f
    acpi_hotplug_work_fn+0x1a/0x23
    process_one_work+0x14e/0x410
    worker_thread+0x116/0x490
    kthread+0xbd/0xe0
    ret_from_fork+0x3f/0x70

and we do see many of those because essentially every allocation fails
for each memory section.  This is an excessive way to tell the user that
there is nothing to really worry about because we do have a fallback
mechanism to use base pages.  The only downside might be a performance
degradation due to TLB pressure.

This patch changes vmemmap_alloc_block() to use __GFP_NOWARN and warn
explicitly once on the first allocation failure.  This will reduce the
noise in the kernel log considerably, while we still have an indication
that a performance might be impacted.

[mhocko@kernel.org: forgot to git add the follow up fix]
  Link: http://lkml.kernel.org/r/20171107090635.c27thtse2lchjgvb@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20171106092228.31098-1-mhocko@kernel.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Joe Perches <joe@perches.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:07 -08:00
Mel Gorman
2d4894b5d2 mm: remove cold parameter from free_hot_cold_page*
Most callers users of free_hot_cold_page claim the pages being released
are cache hot.  The exception is the page reclaim paths where it is
likely that enough pages will be freed in the near future that the
per-cpu lists are going to be recycled and the cache hotness information
is lost.  As no one really cares about the hotness of pages being
released to the allocator, just ditch the parameter.

The APIs are renamed to indicate that it's no longer about hot/cold
pages.  It should also be less confusing as there are subtle differences
between them.  __free_pages drops a reference and frees a page when the
refcount reaches zero.  free_hot_cold_page handled pages whose refcount
was already zero which is non-obvious from the name.  free_unref_page
should be more obvious.

No performance impact is expected as the overhead is marginal.  The
parameter is removed simply because it is a bit stupid to have a useless
parameter copied everywhere.

[mgorman@techsingularity.net: add pages to head, not tail]
  Link: http://lkml.kernel.org/r/20171019154321.qtpzaeftoyyw4iey@techsingularity.net
Link: http://lkml.kernel.org/r/20171018075952.10627-8-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:06 -08:00
Pavel Tatashin
78c943662f sparc64: optimize struct page zeroing
Add an optimized mm_zero_struct_page(), so struct page's are zeroed
without calling memset().  We do eight to ten regular stores based on
the size of struct page.  Compiler optimizes out the conditions of
switch() statement.

SPARC-M6 with 15T of memory, single thread performance:

                               BASE            FIX  OPTIMIZED_FIX
        bootmem_init   28.440467985s   2.305674818s   2.305161615s
free_area_init_nodes  202.845901673s 225.343084508s 172.556506560s
                      --------------------------------------------
Total                 231.286369658s 227.648759326s 174.861668175s

BASE:  current linux
FIX:   This patch series without "optimized struct page zeroing"
OPTIMIZED_FIX: This patch series including the current patch.

bootmem_init() is where memory for struct pages is zeroed during
allocation.  Note, about two seconds in this function is a fixed time:
it does not increase as memory is increased.

Link: http://lkml.kernel.org/r/20171013173214.27300-11-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Will Deacon
e17d8025f0 arm64/mm/kasan: don't use vmemmap_populate() to initialize shadow
The kasan shadow is currently mapped using vmemmap_populate() since that
provides a semi-convenient way to map pages into init_top_pgt.  However,
since that no longer zeroes the mapped pages, it is not suitable for
kasan, which requires zeroed shadow memory.

Add kasan_populate_shadow() interface and use it instead of
vmemmap_populate().  Besides, this allows us to take advantage of
gigantic pages and use them to populate the shadow, which should save us
some memory wasted on page tables and reduce TLB pressure.

Link: http://lkml.kernel.org/r/20171103185147.2688-3-pasha.tatashin@oracle.com
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Andrey Ryabinin
d17a1d97dc x86/mm/kasan: don't use vmemmap_populate() to initialize shadow
The kasan shadow is currently mapped using vmemmap_populate() since that
provides a semi-convenient way to map pages into init_top_pgt.  However,
since that no longer zeroes the mapped pages, it is not suitable for
kasan, which requires zeroed shadow memory.

Add kasan_populate_shadow() interface and use it instead of
vmemmap_populate().  Besides, this allows us to take advantage of
gigantic pages and use them to populate the shadow, which should save us
some memory wasted on page tables and reduce TLB pressure.

Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Pavel Tatashin
df8ee57889 sparc64: simplify vmemmap_populate
Remove duplicating code by using common functions vmemmap_pud_populate
and vmemmap_pgd_populate.

Link: http://lkml.kernel.org/r/20171013173214.27300-5-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Pavel Tatashin
2a20aa1710 sparc64/mm: set fields in deferred pages
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to
first initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled there is a case where we set
some fields prior to initializing:

mem_init() {
     register_page_bootmem_info();
     free_all_bootmem();
     ...
}

When register_page_bootmem_info() is called only non-deferred struct
pages are initialized.  But, this function goes through some reserved
pages which might be part of the deferred, and thus are not yet
initialized.

mem_init
register_page_bootmem_info
register_page_bootmem_info_node
 get_page_bootmem
  .. setting fields here ..
  such as: page->freelist = (void *)type;

free_all_bootmem()
free_low_memory_core_early()
 for_each_reserved_mem_region()
  reserve_bootmem_region()
   init_reserved_page() <- Only if this is deferred reserved page
    __init_single_pfn()
     __init_single_page()
      memset(0) <-- Loose the set fields here

We end up with similar issue as in the previous patch, where currently
we do not observe problem as memory is zeroed.  But, if flag asserts are
changed we can start hitting issues.

Also, because in this patch series we will stop zeroing struct page
memory during allocation, we must make sure that struct pages are
properly initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Link: http://lkml.kernel.org/r/20171013173214.27300-4-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Pavel Tatashin
353b1e7b58 x86/mm: set fields in deferred pages
Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to
first initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled, however, we set fields in
register_page_bootmem_info that are subsequently clobbered right after
in free_all_bootmem:

        mem_init() {
                register_page_bootmem_info();
                free_all_bootmem();
                ...
        }

When register_page_bootmem_info() is called only non-deferred struct
pages are initialized.  But, this function goes through some reserved
pages which might be part of the deferred, and thus are not yet
initialized.

  mem_init
   register_page_bootmem_info
    register_page_bootmem_info_node
     get_page_bootmem
      .. setting fields here ..
      such as: page->freelist = (void *)type;

  free_all_bootmem()
   free_low_memory_core_early()
    for_each_reserved_mem_region()
     reserve_bootmem_region()
      init_reserved_page() <- Only if this is deferred reserved page
       __init_single_pfn()
        __init_single_page()
            memset(0) <-- Loose the set fields here

We end up with issue where, currently we do not observe problem as
memory is explicitly zeroed.  But, if flag asserts are changed we can
start hitting issues.

Also, because in this patch series we will stop zeroing struct page
memory during allocation, we must make sure that struct pages are
properly initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Link: http://lkml.kernel.org/r/20171013173214.27300-3-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Tested-by: Bob Picco <bob.picco@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Levin, Alexander (Sasha Levin)
4675ff05de kmemcheck: rip it out
Fix up makefiles, remove references, and git rm kmemcheck.

Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Tim Hansen <devtimhansen@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Levin, Alexander (Sasha Levin)
d8be75663c kmemcheck: remove whats left of NOTRACK flags
Now that kmemcheck is gone, we don't need the NOTRACK flags.

Link: http://lkml.kernel.org/r/20171007030159.22241-5-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Hansen <devtimhansen@gmail.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:05 -08:00
Levin, Alexander (Sasha Levin)
75f296d93b kmemcheck: stop using GFP_NOTRACK and SLAB_NOTRACK
Convert all allocations that used a NOTRACK flag to stop using it.

Link: http://lkml.kernel.org/r/20171007030159.22241-3-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Hansen <devtimhansen@gmail.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:04 -08:00
Levin, Alexander (Sasha Levin)
4950276672 kmemcheck: remove annotations
Patch series "kmemcheck: kill kmemcheck", v2.

As discussed at LSF/MM, kill kmemcheck.

KASan is a replacement that is able to work without the limitation of
kmemcheck (single CPU, slow).  KASan is already upstream.

We are also not aware of any users of kmemcheck (or users who don't
consider KASan as a suitable replacement).

The only objection was that since KASAN wasn't supported by all GCC
versions provided by distros at that time we should hold off for 2
years, and try again.

Now that 2 years have passed, and all distros provide gcc that supports
KASAN, kill kmemcheck again for the very same reasons.

This patch (of 4):

Remove kmemcheck annotations, and calls to kmemcheck from the kernel.

[alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs]
  Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com
Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Hansen <devtimhansen@gmail.com>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:04 -08:00
Kirill A. Shutemov
c4812909f5 mm: introduce wrappers to access mm->nr_ptes
Let's add wrappers for ->nr_ptes with the same interface as for nr_pmd
and nr_pud.

The patch also makes nr_ptes accounting dependent onto CONFIG_MMU.  Page
table accounting doesn't make sense if you don't have page tables.

It's preparation for consolidation of page-table counters in mm_struct.

Link: http://lkml.kernel.org/r/20171006100651.44742-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:04 -08:00
Kirill A. Shutemov
b4e98d9ac7 mm: account pud page tables
On a machine with 5-level paging support a process can allocate
significant amount of memory and stay unnoticed by oom-killer and memory
cgroup.  The trick is to allocate a lot of PUD page tables.  We don't
account PUD page tables, only PMD and PTE.

We already addressed the same issue for PMD page tables, see commit
dc6c9a35b6 ("mm: account pmd page tables to the process").
Introduction of 5-level paging brings the same issue for PUD page
tables.

The patch expands accounting to PUD level.

[kirill.shutemov@linux.intel.com: s/pmd_t/pud_t/]
  Link: http://lkml.kernel.org/r/20171004074305.x35eh5u7ybbt5kar@black.fi.intel.com
[heiko.carstens@de.ibm.com: s390/mm: fix pud table accounting]
  Link: http://lkml.kernel.org/r/20171103090551.18231-1-heiko.carstens@de.ibm.com
Link: http://lkml.kernel.org/r/20171002080427.3320-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:04 -08:00
Michal Hocko
8745808fda mm, arch: remove empty_bad_page*
empty_bad_page() and empty_bad_pte_table() seem to be relics from old
days which is not used by any code for a long time.  I have tried to
find when exactly but this is not really all that straightforward due to
many code movements - traces disappear around 2.4 times.

Anyway no code really references neither empty_bad_page nor
empty_bad_pte_table.  We only allocate the storage which is not used by
anybody so remove them.

Link: http://lkml.kernel.org/r/20171004150045.30755-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Ralf Baechle <ralf@linus-mips.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David Howells <dhowells@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:03 -08:00
Geert Uytterhoeven
c95f121142 m32r: fix endianness constraints
The m32r Kconfig provides both CPU_BIG_ENDIAN and CPU_LITTLE_ENDIAN
configuration options.  As they are user-selectable and independent,
this allows invalid configurations:

  - All m32r defconfigs build a big endian kernel, but CPU_BIG_ENDIAN is
    not set, causing compiler warnings like:

	include/linux/byteorder/big_endian.h:7:2: warning: #warning inconsistent configuration, needs CONFIG_CPU_BIG_ENDIAN [-Wcpp]
	 #warning inconsistent configuration, needs CONFIG_CPU_BIG_ENDIAN
	  ^

  - Since commit 5bdfca6435 ("m32r: define CPU_BIG_ENDIAN"),
    building an allmodconfig or allyesconfig enables both
    CONFIG_CPU_BIG_ENDIAN and CONFIG_CPU_LITTLE_ENDIAN.
    While this did get rid of the warning above, both options are
    obviously mutually exclusive.

Fix this by making only CPU_LITTLE_ENDIAN configurable by the user, as
before, and by making sure exactly one of CPU_BIG_ENDIAN and
CPU_LITTLE_ENDIAN is always enabled.

Link: http://lkml.kernel.org/r/1509361505-18150-1-git-send-email-geert@linux-m68k.org
Fixes: 5bdfca6435 ("m32r: define CPU_BIG_ENDIAN")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:00 -08:00
Masahiro Yamada
8a78756eb5 kbuild: create object directories simpler and faster
For the out-of-tree build, scripts/Makefile.build creates output
directories, but this operation is not efficient.

scripts/Makefile.lib calculates obj-dirs as follows:

  obj-dirs := $(dir $(multi-objs) $(obj-y))

Please notice $(sort ...) is not used here.  Usually the result is
as many "./" as objects here.

For a lot of duplicated paths, the following command is invoked.

  _dummy := $(foreach d,$(obj-dirs), $(shell [ -d $(d) ] || mkdir -p $(d)))

Then, the costly shell command is run over and over again.

I see many points for optimization:

[1] Use $(sort ...) to cut down duplicated paths before passing them
    to system call
[2] Use single $(shell ...) instead of repeating it with $(foreach ...)
    This will reduce forking.
[3] We can calculate obj-dirs more simply.  Most of objects are already
    accumulated in $(targets).  So, $(dir $(targets)) is fine and more
    comprehensive.

I also removed ugly code in arch/x86/entry/vdso/Makefile.  This is now
really unnecessary.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
2017-11-16 09:07:35 +09:00
Linus Torvalds
1b6115fbe3 pci-v4.15-changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDFIdAAoJEFmIoMA60/r8Jg4P/3IrmMNVnpqmYEZ7lRSW7UQ3
 8jtupbzIkbPsIAEhbJ7xqO7zKx85j6Og+ZSOv4a8u/tS6cd1aVZu2PpWsTkacez0
 7nLGVCSL3HZi5qcrtOvb2Pmke18SUKSPxVYSgS2ajQavB1oKaY03FbHDWyWidCZx
 qxkeGZOiUDw5kSGkQWyks1WgB0dd76rVbPcrKKEJueGgrdSm+EdgdDv8eT6bZInZ
 uMrCmSjNYTQP0KASCJJvgYOtJbdwvP6NuQTxzOlU2G+H2SqsLRjsz4UUR8FF06T5
 cndpgpG3QSAZLx7wCeWTvRorTEYORzKMoyw/AUjhiGbRep9Zw0aKNvCC99E6xjyD
 FECrk6kCrqZs7l+LVXK4SwpBXCVjNgRoFAHBEKF2X3/SWUkUhHXZHCVvMQB8LQiS
 2p8VRoYWw2aCLkHCGynuzToUrD2P2Pjxe5n/13aYVJkyBNfQqqZ3l2YHiZdpDO3j
 rgG6RW0WCrpZxfb/0WAbPnQ2qpZAwDPO6hOW7dIfTZabFVXRIkBvNq53by/0MxvP
 jyOcMTsq2l8y46f3VgNPUAHj0f52HwfZA3PQRPh+MQDz5385BJklDRWtfVM7cQx9
 IoeGkq1zLLvpOh63he/jnnRELxDvNVcxND8lOkenJlObj9kK63hUEcXg/zEMS4w3
 oetLw9TqE32Jb7GfpVSw
 =j4L3
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

  - detach driver before tearing down procfs/sysfs (Alex Williamson)

  - disable PCIe services during shutdown (Sinan Kaya)

  - fix ASPM oops on systems with no Root Ports (Ard Biesheuvel)

  - fix ASPM LTR_L1.2_THRESHOLD programming (Bjorn Helgaas)

  - fix ASPM Common_Mode_Restore_Time computation (Bjorn Helgaas)

  - fix portdrv MSI/MSI-X vector allocation (Dongdong Liu, Bjorn
    Helgaas)

  - report non-fatal AER errors only to the affected endpoint (Gabriele
    Paoloni)

  - distribute bus numbers, MMIO, and I/O space among hotplug bridges to
    allow more devices to be hot-added (Mika Westerberg)

  - fix pciehp races during initialization and surprise link down (Mika
    Westerberg)

  - handle surprise-removed devices in PME handling (Qiang)

  - support resizable BARs for large graphics devices (Christian König)

  - expose SR-IOV offset, stride, and VF device ID via sysfs (Filippo
    Sironi)

  - create SR-IOV virtfn/physfn sysfs links before attaching driver
    (Stuart Hayes)

  - fix SR-IOV "ARI Capable Hierarchy" restore issue (Tony Nguyen)

  - enforce Kconfig IOV/REALLOC dependency (Sascha El-Sharkawy)

  - avoid slot reset if bridge itself is broken (Jan Glauber)

  - clean up pci_reset_function() path (Jan H. Schönherr)

  - make pci_map_rom() fail if the option ROM is invalid (Changbin Du)

  - convert timers to timer_setup() (Kees Cook)

  - move PCI_QUIRKS to PCI bus Kconfig menu (Randy Dunlap)

  - constify pci_dev_type and intel_mid_pci_ops (Bhumika Goyal)

  - remove unnecessary pci_dev, pci_bus, resource, pcibios_set_master()
    declarations (Bjorn Helgaas)

  - fix endpoint framework overflows and BUG()s (Dan Carpenter)

  - fix endpoint framework issues (Kishon Vijay Abraham I)

  - avoid broken Cavium CN8xxx bus reset behavior (David Daney)

  - extend Cavium ACS capability quirks (Vadim Lomovtsev)

  - support Synopsys DesignWare RC in ECAM mode (Ard Biesheuvel)

  - turn off dra7xx clocks cleanly on shutdown (Keerthy)

  - fix Faraday probe error path (Wei Yongjun)

  - support HiSilicon STB SoC PCIe host controller (Jianguo Sun)

  - fix Hyper-V interrupt affinity issue (Dexuan Cui)

  - remove useless ACPI warning for Hyper-V pass-through devices (Vitaly
    Kuznetsov)

  - support multiple MSI on iProc (Sandor Bodo-Merle)

  - support Layerscape LS1012a and LS1046a PCIe host controllers (Hou
    Zhiqiang)

  - fix Layerscape default error response (Minghuan Lian)

  - support MSI on Tango host controller (Marc Gonzalez)

  - support Tegra186 PCIe host controller (Manikanta Maddireddy)

  - use generic accessors on Tegra when possible (Thierry Reding)

  - support V3 Semiconductor PCI host controller (Linus Walleij)

* tag 'pci-v4.15-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (85 commits)
  PCI/ASPM: Add L1 Substates definitions
  PCI/ASPM: Reformat ASPM register definitions
  PCI/ASPM: Use correct capability pointer to program LTR_L1.2_THRESHOLD
  PCI/ASPM: Account for downstream device's Port Common_Mode_Restore_Time
  PCI: xgene: Rename xgene_pcie_probe_bridge() to xgene_pcie_probe()
  PCI: xilinx: Rename xilinx_pcie_link_is_up() to xilinx_pcie_link_up()
  PCI: altera: Rename altera_pcie_link_is_up() to altera_pcie_link_up()
  PCI: Fix kernel-doc build warning
  PCI: Fail pci_map_rom() if the option ROM is invalid
  PCI: Move pci_map_rom() error path
  PCI: Move PCI_QUIRKS to the PCI bus menu
  alpha/PCI: Make pdev_save_srm_config() static
  PCI: Remove unused declarations
  PCI: Remove redundant pci_dev, pci_bus, resource declarations
  PCI: Remove redundant pcibios_set_master() declarations
  PCI/PME: Handle invalid data when reading Root Status
  PCI: hv: Use effective affinity mask
  PCI: pciehp: Do not clear Presence Detect Changed during initialization
  PCI: pciehp: Fix race condition handling surprise link down
  PCI: Distribute available resources to hotplug-capable bridges
  ...
2017-11-15 15:01:28 -08:00
Linus Torvalds
1be2172e96 Modules updates for v4.15
Summary of modules changes for the 4.15 merge window:
 
 - Treewide module_param_call() cleanup, fix up set/get function
   prototype mismatches, from Kees Cook
 
 - Minor code cleanups
 
 Signed-off-by: Jessica Yu <jeyu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJaDCyzAAoJEMBFfjjOO8FyaYQP/AwHBy6XmwwVlWDP4BqIF6hL
 Vhy3ccVLYEORvePv68tWSRPUz5n6+1Ebqanmwtkw6i8l+KwxY2SfkZql09cARc33
 2iBE4bHF98iWQmnJbF6me80fedY9n5bZJNMQKEF9VozJWwTMOTQFTCfmyJRDBmk9
 iidQj6M3idbSUOYIJjvc40VGx5NyQWSr+FFfqsz1rU5iLGRGEvA3I2/CDT0oTuV6
 D4MmFxzE2Tv/vIMa2GzKJ1LGScuUfSjf93Lq9Kk0cG36qWao8l930CaXyVdE9WJv
 bkUzpf3QYv/rDX6QbAGA0cada13zd+dfBr8YhchclEAfJ+GDLjMEDu04NEmI6KUT
 5lP0Xw0xYNZQI7bkdxDMhsj5jaz/HJpXCjPCtZBnSEKiL4OPXVMe+pBHoCJ2/yFN
 6M716XpWYgUviUOdiE+chczB5p3z4FA6u2ykaM4Tlk0btZuHGxjcSWwvcIdlPmjm
 kY4AfDV6K0bfEBVguWPJicvrkx44atqT5nWbbPhDwTSavtsuRJLb3GCsHedx7K8h
 ZO47lCQFAWCtrycK1HYw+oupNC3hYWQ0SR42XRdGhL1bq26C+1sei1QhfqSgA9PQ
 7CwWH4UTOL9fhtrzSqZngYOh9sjQNFNefqQHcecNzcEjK2vjrgQZvRNWZKHSwaFs
 fbGX8juZWP4ypbK+irTB
 =c8vb
 -----END PGP SIGNATURE-----

Merge tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull module updates from Jessica Yu:
 "Summary of modules changes for the 4.15 merge window:

   - treewide module_param_call() cleanup, fix up set/get function
     prototype mismatches, from Kees Cook

   - minor code cleanups"

* tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: Do not paper over type mismatches in module_param_call()
  treewide: Fix function prototypes for module_param_call()
  module: Prepare to convert all module_param_call() prototypes
  kernel/module: Delete an error message for a failed memory allocation in add_module_usage()
2017-11-15 13:46:33 -08:00
Linus Torvalds
5bbcc0f595 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Maintain the TCP retransmit queue using an rbtree, with 1GB
      windows at 100Gb this really has become necessary. From Eric
      Dumazet.

   2) Multi-program support for cgroup+bpf, from Alexei Starovoitov.

   3) Perform broadcast flooding in hardware in mv88e6xxx, from Andrew
      Lunn.

   4) Add meter action support to openvswitch, from Andy Zhou.

   5) Add a data meta pointer for BPF accessible packets, from Daniel
      Borkmann.

   6) Namespace-ify almost all TCP sysctl knobs, from Eric Dumazet.

   7) Turn on Broadcom Tags in b53 driver, from Florian Fainelli.

   8) More work to move the RTNL mutex down, from Florian Westphal.

   9) Add 'bpftool' utility, to help with bpf program introspection.
      From Jakub Kicinski.

  10) Add new 'cpumap' type for XDP_REDIRECT action, from Jesper
      Dangaard Brouer.

  11) Support 'blocks' of transformations in the packet scheduler which
      can span multiple network devices, from Jiri Pirko.

  12) TC flower offload support in cxgb4, from Kumar Sanghvi.

  13) Priority based stream scheduler for SCTP, from Marcelo Ricardo
      Leitner.

  14) Thunderbolt networking driver, from Amir Levy and Mika Westerberg.

  15) Add RED qdisc offloadability, and use it in mlxsw driver. From
      Nogah Frankel.

  16) eBPF based device controller for cgroup v2, from Roman Gushchin.

  17) Add some fundamental tracepoints for TCP, from Song Liu.

  18) Remove garbage collection from ipv6 route layer, this is a
      significant accomplishment. From Wei Wang.

  19) Add multicast route offload support to mlxsw, from Yotam Gigi"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2177 commits)
  tcp: highest_sack fix
  geneve: fix fill_info when link down
  bpf: fix lockdep splat
  net: cdc_ncm: GetNtbFormat endian fix
  openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start
  netem: remove unnecessary 64 bit modulus
  netem: use 64 bit divide by rate
  tcp: Namespace-ify sysctl_tcp_default_congestion_control
  net: Protect iterations over net::fib_notifier_ops in fib_seq_sum()
  ipv6: set all.accept_dad to 0 by default
  uapi: fix linux/tls.h userspace compilation error
  usbnet: ipheth: prevent TX queue timeouts when device not ready
  vhost_net: conditionally enable tx polling
  uapi: fix linux/rxrpc.h userspace compilation errors
  net: stmmac: fix LPI transitioning for dwmac4
  atm: horizon: Fix irq release error
  net-sysfs: trigger netlink notification on ifalias change via sysfs
  openvswitch: Using kfree_rcu() to simplify the code
  openvswitch: Make local function ovs_nsh_key_attr_size() static
  openvswitch: Fix return value check in ovs_meter_cmd_features()
  ...
2017-11-15 11:56:19 -08:00
Linus Torvalds
892204e06c MIPS changes for 4.15
These are the main MIPS changes for 4.15.
 
 Fixes:
 - ralink: Fix MT7620 PCI build issues (4.5)
 - Disable cmpxchg64() and HAVE_VIRT_CPU_ACCOUNTING_GEN for 32-bit SMP
   (4.1)
 - Fix MIPS64 FP save/restore on 32-bit kernels (4.0)
 - ptrace: Pick up ptrace/seccomp changed syscall numbers (3.19)
 - ralink: Fix MT7628 pinmux (3.19)
 - BCM47XX: Fix LED inversion on WRT54GSv1 (3.17)
 - Fix n32 core dumping as o32 since regset support (3.13)
 - ralink: Drop obsolete USB_ARCH_HAS_HCD select
 
 Build system:
 - Default to "generic" (multiplatform) system type instead of IP22
 - Use generic little endian MIPS32 r2 configuration as default defconfig
   instead of ip22_defconfig
 
 FPU emulation:
 - Fix exception generation for certain R6 FPU instructions
 
 SMP:
 - Allow __cpu_number_map to be larger than NR_CPUS for sparse CPU id
   spaces
 
 Miscellaneous:
 - Add iomem resource for kernel bss section for kexec/kdump
 - Atomics: Nudge writes on bit unlock
 - DT files: Standardise "ok" -> "okay"
 
 Platform support:
 
  BMIPS:
  - Enable HARDIRQS_SW_RESEND
 
  Broadcom BCM63XX:
  - Add clkdev lookup support
  - Update clk driver, UART driver, DTs to handle named refclk from DTs
  - Split apart various clocks to more closely match hardware
  - Add ethernet clocks
 
  Cavium Octeon:
  - Remove usage of cvmx_wait() in favour of __delay()
 
  ImgTec Pistachio:
  - DT: Drop deprecated dwmmc num-slots property
 
  Ingenic JZ4780:
  - Add NFS root to Ci20 defconfig
  - Add watchdog to Ci20 DT & defconfig, and allow building of watchdog
    driver with this SoC
 
  Generic (multiplatform):
  - Migrate xilfpga (MIPSfpga) platform to the generic platform
 
  Lantiq xway:
  - Fix ASC0/ASC1 clocks
 
 Minor cleanups:
 - Define virt_to_pfn()
 - Make thread_saved_pc static
 - Simplify 32-bit sign extension in __read_64bit_c0_split()
 - DMA: Use vma_pages() helper
 - FPU emulation: Replace unsigned with unsigned int
 - MM: Removed unused lastpfn
 - Alchemy: Make clk_ops const
 - Lasat: Use setup_timer() helper
 - ralink: Use BIT() in MT7620 PCI driver
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEd80NauSabkiESfLYbAtpk944dnoFAloJ8ecACgkQbAtpk944
 dnqF3w/+IPcxcYl7QpVFvM3MsDgxJI8ENIkY5ffMi1UVM8gApAHuFSnGotikS8C8
 jjnFyorrOkUKuuX9m9pfwfmvMHAy8j77so7kp2vpGjihe4iFntYJxJYUpYq8Ru8M
 jNzikrPbFv6eQyjwFEGuqxrJmsgTlJGiWA04a33LCfiFz5RZUSloHfPkjWiyWM1s
 xrbkbZpwvyX3jw39vguZvz5qjuUPViy/YOSyMhmTqnqDXqGmwlHgzev1/HEzISVe
 eN5n6bHGX5Dis4bCBPZuYbr6m96/z+xTKCKC7mlH0OnG/WWQtv9LFFU7o+ffRsI/
 nPKEN/TFFA7V0b9zI/lxfVSoZ67IZa5TDA+PLnzX9UQAxOA/wgFHPOgqJZN3/BXo
 OBgTuguwq9D22uSrvrMoqmcU+zDXG4ZQQCgv7mUUw2E9gHnsYJykhVa4kQVj9MxE
 LkixhhE+Qabsh6L3wDtBntpgoOd58dxNiMJ7UAzDW3rmyjo+EEWN1eeCxQCrewlf
 1aJaHeRoEOt/k7oPZWCd1InJ3vEsrNcO74KSZuQ+q0ytuqYOLUZ7ZXteA86VzroI
 4qcftvR4cVOCz86B6NZdQQVOM95P7vgqBMJqh52i1pjQlVdvE92MBgzbm4BSOUAL
 Y+hybhhIwJriF8WtTq2goL8osvMODM1uM3Zlm0XtA5JfUYbWK/E=
 =xbL0
 -----END PGP SIGNATURE-----

Merge tag 'mips_4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips

Pull MIPS updates from James Hogan:
 "These are the main MIPS changes for 4.15.

  Fixes:
   - ralink: Fix MT7620 PCI build issues (4.5)
   - Disable cmpxchg64() and HAVE_VIRT_CPU_ACCOUNTING_GEN for 32-bit SMP
     (4.1)
   - Fix MIPS64 FP save/restore on 32-bit kernels (4.0)
   - ptrace: Pick up ptrace/seccomp changed syscall numbers (3.19)
   - ralink: Fix MT7628 pinmux (3.19)
   - BCM47XX: Fix LED inversion on WRT54GSv1 (3.17)
   - Fix n32 core dumping as o32 since regset support (3.13)
   - ralink: Drop obsolete USB_ARCH_HAS_HCD select

  Build system:
   - Default to "generic" (multiplatform) system type instead of IP22
   - Use generic little endian MIPS32 r2 configuration as default
     defconfig instead of ip22_defconfig

  FPU emulation:
   - Fix exception generation for certain R6 FPU instructions

  SMP:
   - Allow __cpu_number_map to be larger than NR_CPUS for sparse CPU id
     spaces

  Miscellaneous:
   - Add iomem resource for kernel bss section for kexec/kdump
   - Atomics: Nudge writes on bit unlock
   - DT files: Standardise "ok" -> "okay"

  Minor cleanups:
   - Define virt_to_pfn()
   - Make thread_saved_pc static
   - Simplify 32-bit sign extension in __read_64bit_c0_split()
   - DMA: Use vma_pages() helper
   - FPU emulation: Replace unsigned with unsigned int
   - MM: Removed unused lastpfn
   - Alchemy: Make clk_ops const
   - Lasat: Use setup_timer() helper
   - ralink: Use BIT() in MT7620 PCI driver

  Platform support:

  BMIPS:
  - Enable HARDIRQS_SW_RESEND

  Broadcom BCM63XX:
  - Add clkdev lookup support
  - Update clk driver, UART driver, DTs to handle named refclk from DTs
  - Split apart various clocks to more closely match hardware
  - Add ethernet clocks

  Cavium Octeon:
  - Remove usage of cvmx_wait() in favour of __delay()

  ImgTec Pistachio:
  - DT: Drop deprecated dwmmc num-slots property

  Ingenic JZ4780:
  - Add NFS root to Ci20 defconfig
  - Add watchdog to Ci20 DT & defconfig, and allow building of watchdog
    driver with this SoC

  Generic (multiplatform):
  - Migrate xilfpga (MIPSfpga) platform to the generic platform

  Lantiq xway:
  - Fix ASC0/ASC1 clocks"

* tag 'mips_4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips: (46 commits)
  MIPS: Add iomem resource for kernel bss section.
  MIPS: cmpxchg64() and HAVE_VIRT_CPU_ACCOUNTING_GEN don't work for 32-bit SMP
  MIPS: BMIPS: Enable HARDIRQS_SW_RESEND
  MIPS: pci: Make use of the BIT() macro inside the mt7620 driver
  MIPS: pci: Remove KERN_WARN instance inside the mt7620 driver
  MIPS: pci: Remove duplicate define in mt7620 driver
  MIPS: ralink: Fix typo in mt7628 pinmux function
  MIPS: ralink: Fix MT7628 pinmux
  MIPS: Fix odd fp register warnings with MIPS64r2
  watchdog: jz4780: Allow selection of jz4740-wdt driver
  MIPS/ptrace: Update syscall nr on register changes
  MIPS/ptrace: Pick up ptrace/seccomp changed syscalls
  MIPS: Fix an n32 core file generation regset support regression
  MIPS: Fix MIPS64 FP save/restore on 32-bit kernels
  MIPS: page.h: Define virt_to_pfn()
  MIPS: Xilfpga: Switch to using generic defconfigs
  MIPS: generic: Add support for MIPSfpga
  MIPS: Set defconfig target to a generic system for 32r2el
  MIPS: Kconfig: Set default MIPS system type as generic
  MIPS: DTS: Remove num-slots from Pistachio SoC
  ...
2017-11-15 11:36:08 -08:00
Linus Torvalds
c9b012e5f4 arm64 updates for 4.15
Plenty of acronym soup here:
 
 - Initial support for the Scalable Vector Extension (SVE)
 - Improved handling for SError interrupts (required to handle RAS events)
 - Enable GCC support for 128-bit integer types
 - Remove kernel text addresses from backtraces and register dumps
 - Use of WFE to implement long delay()s
 - ACPI IORT updates from Lorenzo Pieralisi
 - Perf PMU driver for the Statistical Profiling Extension (SPE)
 - Perf PMU driver for Hisilicon's system PMUs
 - Misc cleanups and non-critical fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJaCcLqAAoJELescNyEwWM0JREH/2FbmD/khGzEtP8LW+o9D8iV
 TBM02uWQxS1bbO1pV2vb+512YQO+iWfeQwJH9Jv2FZcrMvFv7uGRnYgAnJuXNGrl
 W+LL6OhN22A24LSawC437RU3Xe7GqrtONIY/yLeJBPablfcDGzPK1eHRA0pUzcyX
 VlyDruSHWX44VGBPV6JRd3x0vxpV8syeKOjbRvopRfn3Nwkbd76V3YSfEgwoTG5W
 ET1sOnXLmHHdeifn/l1Am5FX1FYstpcd7usUTJ4Oto8y7e09tw3bGJCD0aMJ3vow
 v1pCUWohEw7fHqoPc9rTrc1QEnkdML4vjJvMPUzwyTfPrN+7uEuMIEeJierW+qE=
 =0qrg
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The big highlight is support for the Scalable Vector Extension (SVE)
  which required extensive ABI work to ensure we don't break existing
  applications by blowing away their signal stack with the rather large
  new vector context (<= 2 kbit per vector register). There's further
  work to be done optimising things like exception return, but the ABI
  is solid now.

  Much of the line count comes from some new PMU drivers we have, but
  they're pretty self-contained and I suspect we'll have more of them in
  future.

  Plenty of acronym soup here:

   - initial support for the Scalable Vector Extension (SVE)

   - improved handling for SError interrupts (required to handle RAS
     events)

   - enable GCC support for 128-bit integer types

   - remove kernel text addresses from backtraces and register dumps

   - use of WFE to implement long delay()s

   - ACPI IORT updates from Lorenzo Pieralisi

   - perf PMU driver for the Statistical Profiling Extension (SPE)

   - perf PMU driver for Hisilicon's system PMUs

   - misc cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
  arm64: Make ARMV8_DEPRECATED depend on SYSCTL
  arm64: Implement __lshrti3 library function
  arm64: support __int128 on gcc 5+
  arm64/sve: Add documentation
  arm64/sve: Detect SVE and activate runtime support
  arm64/sve: KVM: Hide SVE from CPU features exposed to guests
  arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
  arm64/sve: KVM: Prevent guests from using SVE
  arm64/sve: Add sysctl to set the default vector length for new processes
  arm64/sve: Add prctl controls for userspace vector length management
  arm64/sve: ptrace and ELF coredump support
  arm64/sve: Preserve SVE registers around EFI runtime service calls
  arm64/sve: Preserve SVE registers around kernel-mode NEON use
  arm64/sve: Probe SVE capabilities and usable vector lengths
  arm64: cpufeature: Move sys_caps_initialised declarations
  arm64/sve: Backend logic for setting the vector length
  arm64/sve: Signal handling support
  arm64/sve: Support vector length resetting for new processes
  arm64/sve: Core task context handling
  arm64/sve: Low-level CPU setup
  ...
2017-11-15 10:56:56 -08:00
Linus Torvalds
b293fca43b RISC-V Port for Linux 4.15 v9
This tag contains the core RISC-V Linux port, which has been through
 nine rounds of review on various mailing lists.  The port is not
 complete: there's some cleanup patches moving through the review
 process, a whole bunch of drivers that need some work, and a lot of
 feature additions that will be needed.
 
 The patches contained in this tag have been through nine rounds of
 review on the various mailing lists.  I have some outstanding cleanup
 patches, but since there's been so much review on these patches I
 thought it would be best to submit them as-is and then submit explicit
 cleanup patches so everyone can review them.  This first patch set is
 big enough that it's a bit of a pain to constantly rewrite, and it's
 caused a few headaches with various contributors.
 
 The port is definately a work in progress.  While what's there builds
 and boots with 4.14, it's a bit hard to actually see anything happen
 because there are no device drivers yet.  I maintain a staging branch
 that contains all the device drivers and cleanup that actually works,
 but those patches won't all be ready for a while.  I'd like to get what
 we currently have into your tree so everyone can start working from a
 single base -- of particular importance is allowing the glibc
 upstreaming process to proceed so we can sort out any possibly lingering
 user-visible ABI problems we might have.
 
 Copied below is the ChangeLog that contains the history of this patch
 set:
 
 (v9) As per suggestions on our v8 patch set, I've split the core architecture code
 out from our drivers and would like to submit this patch set to be included
 into linux-next, with the goal being to be merged in during the next merge
 window.  This patch set is based on 4.14-rc2, but if it's better to have it
 based on something else then I can change it around.
 
 This patch set contains just the core arch code for RISC-V, so while it builds
 an nominally boots, you can't print or take an interrupt so it's not that
 useful.  If you're looking to actually boot a system it would probably be
 better to use the full patch set listed below.
 
 We've collected a handful of tags from reviewers, and the remainder of the
 patch set only got minimal feedback last time.  Here's what changed:
 
  * We now use the device tree to initialize the timer driver so it's less
    tighly coupled with the arch port.
  * I cleaned up the defconfigs -- there's actually now just one, and it's
    empty.  For now I think we're OK with what the kernel sets as defaults, but
    I anticipate we'll begin to expand this as people start to use the port
    more.
  * The VDSO symbols version is sane.
  * We WFI while spinning in the boot loop.
  * A handful of comments have been added.
 
 While there are still a handful of FIXMEs in this patch set, we've started to
 get enough interest from various users and contributors that maintaining an out
 of tree patch set is starting to become a big burden.  Hopefully the patches
 are good enough to merge now, which will at least get everyone working in a
 more reasonable manner as we clean up the remaining issues.
 
 This patch set is also availiable on github
 
   https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v9-arch
 
 as is the entire patch set necessary to get a more functional RISC-V system up
 and running, including a handful of patches that aren't ready for upstream yet.
 
   https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v9
 
 Hopefully I've managed to get everyone's feedback
 
 Here's the change highlights from the whole patch set:
 
 (v8) I know it may not be the ideal time to submit a patch set right now, as
 it's the middle of the merge window, but things have calmed down quite a bit in
 the last month so I thought it would be good to get everyone on the same page.
 There's been a handful of changes since the last patch set, but most of them
 are fairly minor:
 
 * We changed PAGE_OFFSET to allowing mapping more physical memory on 64-bit
   systems.  This is user configurable, as it triggers a different code model
   that generates slightly less efficient code.
 * The device tree binding documentation is back, I'd managed to lose it at some
   point.
 * We now pass the atomic64 test suite.  The SBI timer driver has been
 * refactored.
 
 (v7) It's been a while since my last patch set, but the changes han been fairly
 minimal:
 
  * The PCI cleanup patches have been dropped, we'll do them as a separate patch
    set later.
  * We've the Kconfig entries from CONFIG_ISA_* to CONFIG_RISCV_ISA_*, to make
    grep easier.
  * There have been a handful of memory model related tweaks in I/O land,
    particularly relating the PCI and the upcoming platform specification.
    There are significant comments in the relevant files.  This is still a WIP,
    but I think we're close to getting as good as we're going to get until we
    end up with some more specifications.
 
 (v6) As it's been only a day since the v5 patch set, the changes are pretty
 minimal:
 
  * The patch set is now based on linux-next/master, which I believe is a better
    base now that we're getting closer to upstream.
  * EARLY_PRINTK is no longer an option.  Since the SBI console is reasonable,
    there's no penalty to enabling it (and thus no benefit to disabling it).
  * The mmap syscalls were refactored a bit.
 
 (v5) Things have really started to calm down, so this is fairly similar to the
 v4 patch set.  The most interesting changes include:
 
  * We've moved back to a single patch set.
 
  * SMP support has been fixed, I was accidentally running on a non-SMP
    configuration.  There were various mistakes all over the tree as a result of
    this.
 
  * The cmpxchg syscalls have been removed, as they were deemed a bad idea.  As
    a result, RISC-V Linux systems mandate the A extension.  The corresponding
    Kconfig entry to enable builds on non-A systems has been removed.
 
  * A few more atomic fixes: mostly fence changes, but those resulted in a
    handful of additional macros that were no longer necessary.
 
  * riscv_early_sie has been removed.
 
 (v4) There have only been a few changes since the v3 patch set:
 
  * The cmpxchg64 syscall is no longer enabled on 32-bit systems.  It's not
    possible to provide this on SMP systems, and it's not necessary as glibc
    knows not to call it.
 
  * We provide a ELF_HWCAP so users can determine the ISA of the machine the
    kernel is running on.
 
  * The multi-line comments are in a better form.
 
  * There were a handful of headers that could be replaced with the asm-generic
    versions, and a few unnecessary definitions.
 
  * We no longer use printk, but instead use pr_*.
 
  * A few Kconfig and defconfig entries have been cleaned up.
 
 (v3) A highlight of the changes since the v2 patch set includes:
 
  * We've split out all our drivers into separate patch sets, which I've already
    sent out to the relevant maintainers.  I haven't included those patches in
    this patch set, but some of them are necessary to build our port.  A git
    tree that contains all our patch sets merged together lives at
    <https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v3>.
 
  * The patch set is now split up differently: rather than being split per
    directory it is split per topic.  Hopefully this will make it easier to
    review the port on the mailing list.  The split is a bit rough, so you
    probably still want to look at the patch set as a whole.
 
  * atomic.h has been completely rewritten and is hopefully now correct.  I've
    attempted to sanitize the various other memory model related code as well,
    and I think it should all be sane now aside from a handful of FIXMEs
    commented in the code.
 
  * We've changed the cmpexchg syscall to always exist and to not be
    multiplexed.  There is also a VDSO entry for compare and exchange, which
    allows kernels with the A extension to execute user code without the A
    extension reasonably fast.
 
  * Our user-visible register state now contains enough space for the Q
    extension for 128-bit floating point, as well as a few words to allow
    extensibility to future ISA extensions like the eventual V extension for
    vectors.
 
  * A handful of driver cleanups, but these have been split into separate patch
    sets now so I won't duplicate them here.
 
 (v2) A highlight of the changes since the v1 patch set includes:
 
   * We've split out our drivers into the right places, which means now there's
     a lot more patches.  I'll be submitting these patches to various subsystem
     maintainers and including them in any future RISC-V patch sets until
     they've been merged.
 
   * The SBI console driver has been completely rewritten to use the HVC helpers
     and is now significantly smaller.
 
   * We've begun to use weaker barriers as opposed to just the big "fence".
     There's still some work to do here, specifically:
     - We need fences in the relaxed MMIO functions.
     - The non-relaxed MMIO functions are missing R/W bits on their fences.
     - Many AMOs need the aq and rl bits set.
 
   * We now have thread_info in task_struct.  As a result, sscratch now contains
     TP instead of SP.  This was necessary because thread_info is no longer on
     the stack.
 
   * A few shared routines have been added that we use instead of creating
     another arch copy.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAloLD8sTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQbCZEAC2IgWFOAhYDIv4s39jC/iuGcofuuwC
 atTVgKSM8tUES5wBomoVxRH1yjDvmyb2jeq3gsp6gWPcchUpLMdfwf2MwW3NV3Mw
 ESCZPwYiuFhORh1Jt5RSespjK+V9qMvCW0iU6cPE/9kAlPfMGGDv2vEttOFgOGEm
 yVb1i0gHBcdzbw5H0xszBionUAQVXOFqkfO8AW8VPtFMdzZB6t9OBXRgHJLdWgmK
 2Zr5pFN75uivNh4RI1KXHpUeD1kLRVICzG7Ak/aQCfKxWsJutFI1dnLFZmFOIoTf
 2wgW4KsDsZakcA9rILtfo3SFH+mSD5PWzvv5G44yf9sEkGG9bSgxl29GeJYL7NzG
 3Da9FVMvzjIhmxamPGHfFOFTxTud9+6GU6Lj0iBLpHzpcttjhNgE2NXzcY8r1uMD
 BcSwkK3duybjeiZLpwnxOywZidCQDv6pZYyc50WBtV/oUG1fncj8DT2ZTIqGv1V8
 L6D/MXSr1jt9oJeWzfDCxHlaGaHL6grrmyJ8L1tQKPjMp+DbBPFbMLfvbn/dlsat
 mPqmfQZ4zydOVO53k6KiHozGQh6K+cuXMvNxrb9pCRy3etFV2wfTNxtbdeJSa7gj
 xarC6vSia8KFVyXp5nydSks5woHGJFQ1kQYSLEORUWiL5zWILbtI6POzOZeYHgej
 BvTzVq0AVIbxjA==
 =xDIk
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-4.15-arch-v9-premerge' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux

Pull RISC-V architecture support from Palmer Dabbelt:
 "This contains the core RISC-V Linux port, which has been through nine
  rounds of review on various mailing lists. The port is not complete:
  there's some cleanup patches moving through the review process, a
  whole bunch of drivers that need some work, and a lot of feature
  additions that will be needed.

  The patches contained in this tag have been through nine rounds of
  review on the various mailing lists. I have some outstanding cleanup
  patches, but since there's been so much review on these patches I
  thought it would be best to submit them as-is and then submit explicit
  cleanup patches so everyone can review them. This first patch set is
  big enough that it's a bit of a pain to constantly rewrite, and it's
  caused a few headaches with various contributors.

  The port is definately a work in progress. While what's there builds
  and boots with 4.14, it's a bit hard to actually see anything happen
  because there are no device drivers yet. I maintain a staging branch
  that contains all the device drivers and cleanup that actually works,
  but those patches won't all be ready for a while. I'd like to get what
  we currently have into your tree so everyone can start working from a
  single base -- of particular importance is allowing the glibc
  upstreaming process to proceed so we can sort out any possibly
  lingering user-visible ABI problems we might have.

  Copied below is the ChangeLog that contains the history of this patch
  set:

   (v9) As per suggestions on our v8 patch set, I've split the core
        architecture code out from our drivers and would like to submit
        this patch set to be included into linux-next, with the goal
        being to be merged in during the next merge window. This patch
        set is based on 4.14-rc2, but if it's better to have it based on
        something else then I can change it around.

        This patch set contains just the core arch code for RISC-V, so
        while it builds an nominally boots, you can't print or take an
        interrupt so it's not that useful. If you're looking to actually
        boot a system it would probably be better to use the full patch
        set listed below.

        We've collected a handful of tags from reviewers, and the
        remainder of the patch set only got minimal feedback last time.
        Here's what changed:

         - We now use the device tree to initialize the timer driver so
           it's less tighly coupled with the arch port.

         - I cleaned up the defconfigs -- there's actually now just one,
           and it's empty. For now I think we're OK with what the kernel
           sets as defaults, but I anticipate we'll begin to expand this
           as people start to use the port more.

         - The VDSO symbols version is sane.

         - We WFI while spinning in the boot loop.

         - A handful of comments have been added.

        While there are still a handful of FIXMEs in this patch set,
        we've started to get enough interest from various users and
        contributors that maintaining an out of tree patch set is
        starting to become a big burden. Hopefully the patches are good
        enough to merge now, which will at least get everyone working in
        a more reasonable manner as we clean up the remaining issues.

   (v8) I know it may not be the ideal time to submit a patch set right
        now, as it's the middle of the merge window, but things have
        calmed down quite a bit in the last month so I thought it would
        be good to get everyone on the same page. There's been a handful
        of changes since the last patch set, but most of them are fairly
        minor:

         - We changed PAGE_OFFSET to allowing mapping more physical
           memory on 64-bit systems. This is user configurable, as it
           triggers a different code model that generates slightly less
           efficient code.

         - The device tree binding documentation is back, I'd managed to
           lose it at some point.

         - We now pass the atomic64 test suite

         - The SBI timer driver has been refactored.

   (v7) It's been a while since my last patch set, but the changes han
        been fairly minimal:

         - The PCI cleanup patches have been dropped, we'll do them as a
           separate patch set later.

         - We've the Kconfig entries from CONFIG_ISA_* to
           CONFIG_RISCV_ISA_*, to make grep easier.

         - There have been a handful of memory model related tweaks in
           I/O land, particularly relating the PCI and the upcoming
           platform specification. There are significant comments in the
           relevant files. This is still a WIP, but I think we're close
           to getting as good as we're going to get until we end up with
           some more specifications.

   (v6) As it's been only a day since the v5 patch set, the changes are
        pretty minimal:

         - The patch set is now based on linux-next/master, which I
           believe is a better base now that we're getting closer to
           upstream.

         - EARLY_PRINTK is no longer an option. Since the SBI console is
           reasonable, there's no penalty to enabling it (and thus no
           benefit to disabling it).

         - The mmap syscalls were refactored a bit.

   (v5) Things have really started to calm down, so this is fairly
        similar to the v4 patch set. The most interesting changes
        include:

         - We've moved back to a single patch set.

         - SMP support has been fixed, I was accidentally running on a
           non-SMP configuration. There were various mistakes all over
           the tree as a result of this.

         - The cmpxchg syscalls have been removed, as they were deemed a
           bad idea. As a result, RISC-V Linux systems mandate the A
           extension. The corresponding Kconfig entry to enable builds
           on non-A systems has been removed.

         - A few more atomic fixes: mostly fence changes, but those
           resulted in a handful of additional macros that were no
           longer necessary.

         - riscv_early_sie has been removed.

   (v4) There have only been a few changes since the v3 patch set:

         - The cmpxchg64 syscall is no longer enabled on 32-bit systems.
           It's not possible to provide this on SMP systems, and it's
           not necessary as glibc knows not to call it.

         - We provide a ELF_HWCAP so users can determine the ISA of the
           machine the kernel is running on.

         - The multi-line comments are in a better form.

         - There were a handful of headers that could be replaced with
           the asm-generic versions, and a few unnecessary definitions.

         - We no longer use printk, but instead use pr_*.

         - A few Kconfig and defconfig entries have been cleaned up.

   (v3) A highlight of the changes since the v2 patch set includes:

         - We've split out all our drivers into separate patch sets,
           which I've already sent out to the relevant maintainers. I
           haven't included those patches in this patch set, but some of
           them are necessary to build our port.

         - The patch set is now split up differently: rather than being
           split per directory it is split per topic. Hopefully this
           will make it easier to review the port on the mailing list.
           The split is a bit rough, so you probably still want to look
           at the patch set as a whole.

         - atomic.h has been completely rewritten and is hopefully now
           correct. I've attempted to sanitize the various other memory
           model related code as well, and I think it should all be sane
           now aside from a handful of FIXMEs commented in the code.

         - We've changed the cmpexchg syscall to always exist and to not
           be multiplexed. There is also a VDSO entry for compare and
           exchange, which allows kernels with the A extension to
           execute user code without the A extension reasonably fast.

         - Our user-visible register state now contains enough space for
           the Q extension for 128-bit floating point, as well as a few
           words to allow extensibility to future ISA extensions like
           the eventual V extension for vectors.

         - A handful of driver cleanups, but these have been split into
           separate patch sets now so I won't duplicate them here.

   (v2) A highlight of the changes since the v1 patch set includes:

         - We've split out our drivers into the right places, which
           means now there's a lot more patches. I'll be submitting
           these patches to various subsystem maintainers and including
           them in any future RISC-V patch sets until they've been
           merged.

         - The SBI console driver has been completely rewritten to use
           the HVC helpers and is now significantly smaller.

         - We've begun to use weaker barriers as opposed to just the big
           "fence". There's still some work to do here, specifically:
            - We need fences in the relaxed MMIO functions.
            - The non-relaxed MMIO functions are missing R/W bits on their fences.
            - Many AMOs need the aq and rl bits set.

         - We now have thread_info in task_struct. As a result, sscratch
           now contains TP instead of SP. This was necessary because
           thread_info is no longer on the stack.

         - A few shared routines have been added that we use instead of
           creating another arch copy"

Reviewed-by: Arnd Bergmann <arnd@arndb.de>

* tag 'riscv-for-linus-4.15-arch-v9-premerge' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux:
  RISC-V: Build Infrastructure
  RISC-V: User-facing API
  RISC-V: Paging and MMU
  RISC-V: Device, timer, IRQs, and the SBI
  RISC-V: Task implementation
  RISC-V: ELF and module implementation
  RISC-V: Generic library routines and assembly
  RISC-V: Atomic and Locking Code
  RISC-V: Init and Halt Code
  dt-bindings: RISC-V CPU Bindings
  lib: Add shared copies of some GCC library routines
  MAINTAINERS: Add RISC-V
2017-11-15 10:49:15 -08:00
Rafael J. Wysocki
7d5905dc14 x86 / CPU: Always show current CPU frequency in /proc/cpuinfo
After commit 890da9cf09 (Revert "x86: do not use cpufreq_quick_get()
for /proc/cpuinfo "cpu MHz"") the "cpu MHz" number in /proc/cpuinfo
on x86 can be either the nominal CPU frequency (which is constant)
or the frequency most recently requested by a scaling governor in
cpufreq, depending on the cpufreq configuration.  That is somewhat
inconsistent and is different from what it was before 4.13, so in
order to restore the previous behavior, make it report the current
CPU frequency like the scaling_cur_freq sysfs file in cpufreq.

To that end, modify the /proc/cpuinfo implementation on x86 to use
aperfmperf_snapshot_khz() to snapshot the APERF and MPERF feedback
registers, if available, and use their values to compute the CPU
frequency to be reported as "cpu MHz".

However, do that carefully enough to avoid accumulating delays that
lead to unacceptable access times for /proc/cpuinfo on systems with
many CPUs.  Run aperfmperf_snapshot_khz() once on all CPUs
asynchronously at the /proc/cpuinfo open time, add a single delay
upfront (if necessary) at that point and simply compute the current
frequency while running show_cpuinfo() for each individual CPU.

Also, to avoid slowing down /proc/cpuinfo accesses too much, reduce
the default delay between consecutive APERF and MPERF reads to 10 ms,
which should be sufficient to get large enough numbers for the
frequency computation in all cases.

Fixes: 890da9cf09 (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
2017-11-15 19:46:50 +01:00
Linus Torvalds
9682b3dea2 Merge branch 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "The usual rocket-science from trivial tree for 4.15"

* 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  MAINTAINERS: relinquish kconfig
  MAINTAINERS: Update my email address
  treewide: Fix typos in Kconfig
  kfifo: Fix comments
  init/Kconfig: Fix module signing document location
  misc: ibmasm: Return error on error path
  HID: logitech-hidpp: fix mistake in printk, "feeback" -> "feedback"
  MAINTAINERS: Correct path to uDraw PS3 driver
  tracing: Fix doc mistakes in trace sample
  tracing: Kconfig text fixes for CONFIG_HWLAT_TRACER
  MIPS: Alchemy: Remove reverted CONFIG_NETLINK_MMAP from db1xxx_defconfig
  mm/huge_memory.c: fixup grammar in comment
  lib/xz: Add fall-through comments to a switch statement
2017-11-15 10:14:11 -08:00
Eugeniy Paltsev
ff64d695f9 ARC: [plat-axs10x] DTS: Add reset controller node to manage ethernet reset
DW ethernet controller on axs10x hangs sometimes after SW reset.
Invoke the newly aded driver (reset-axs10x.c) by adding the DT bits.

With this in place, we don't need the open-coded quirk in platform
code, so get rid of it as well !

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-11-15 09:40:43 -08:00
Jeff Layton
4d2dc2cc76 fcntl: don't cap l_start and l_end values for F_GETLK64 in compat syscall
Currently, we're capping the values too low in the F_GETLK64 case. The
fields in that structure are 64-bit values, so we shouldn't need to do
any sort of fixup there.

Make sure we check that assumption at build time in the future however
by ensuring that the sizes we're copying will fit.

With this, we no longer need COMPAT_LOFF_T_MAX either, so remove it.

Fixes: 94073ad77f (fs/locks: don't mess with the address limit in compat_fcntl64)
Reported-by: Vitaly Lipatov <lav@etersoft.ru>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
2017-11-15 08:08:36 -05:00
Nitin Gupta
70f3c8b7c2 sparc64: Fix page table walk for PUD hugepages
For a PUD hugepage entry, we need to propagate bits [32:22]
from virtual address to resolve at 4M granularity. However,
the current code was incorrectly propagating bits [29:19].
This bug can cause incorrect data to be returned for pages
backed with 16G hugepages.

Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:37:43 +09:00
Allen Pais
ff0296877a sparc64: Convert timers to user timer_setup()
Switch to using the new timer_setup() and from_timer()
in LDOM Virtual I/O handshake.

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:36:27 +09:00
Elena Reshetova
cdf5976fcb sparc64: convert mdesc_handle.refcnt from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable mdesc_handle.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:28:23 +09:00
Kees Cook
68fa10dce0 sparc/led: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. Adds a static variable to hold timeout
value.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geliang Tang <geliangtang@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: sparclinux@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:27:50 +09:00
Vijay Kumar
46ad8d2d22 sparc64: Use sparc optimized fls and __fls for T4 and above
For T4 and above, patch fls and __fls functions
at the boot time to use lzcnt instruction.

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:26:46 +09:00
Vijay Kumar
2b41ce5df2 sparc64: SPARC optimized __fls function
Defined SPARC optimized __fls using lzcnt opcode.

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:26:46 +09:00
Vijay Kumar
70cbec0c53 sparc64: SPARC optimized fls function
Defined SPARC optimized fls using lzcnt opcode.

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:26:46 +09:00
Vijay Kumar
be52bbe3ea sparc64: Define SPARC default __fls function
__fls will now require a boot time patching on T4 and above.
Redefining it under arch/sparc/lib.

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:26:46 +09:00
Vijay Kumar
41413a6035 sparc64: Define SPARC default fls function
fls will now require a boot time patching on T4 and above.
Redefining it under arch/sparc/lib.

Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:26:45 +09:00
Nagarathnam Muthusamy
9a08862a5d vDSO for sparc
Following patch is based on work done by Nick Alcock on 64-bit vDSO for sparc
in Oracle linux. I have extended it to include support for 32-bit vDSO for sparc
on 64-bit kernel.

vDSO for sparc is based on the X86 implementation. This patch
provides vDSO support for both 64-bit and 32-bit programs on 64-bit kernel.
vDSO will be disabled on 32-bit linux kernel on sparc.

*) vclock_gettime.c contains all the vdso functions. Since data page is mapped
   before the vdso code page, the pointer to data page is got by subracting offset
   from an address in the vdso code page. The return address stored in
   %i7 is used for this purpose.
*) During compilation, both 32-bit and 64-bit vdso images are compiled and are
   converted into raw bytes by vdso2c program to be ready for mapping into the
   process. 32-bit images are compiled only if CONFIG_COMPAT is enabled. vdso2c
   generates two files vdso-image-64.c and vdso-image-32.c which contains the
   respective vDSO image in C structure.
*) During vdso initialization, required number of vdso pages are allocated and
   raw bytes are copied into the pages.
*) During every exec, these pages are mapped into the process through
   arch_setup_additional_pages and the location of mapping is passed on to the
   process through aux vector AT_SYSINFO_EHDR which is used by glibc.
*) A new update_vsyscall routine for sparc is added to keep the data page in
   vdso updated.
*) As vDSO cannot contain dynamically relocatable references, a new version of
   cpu_relax is added for the use of vDSO.

This change also requires a putback to glibc to use vDSO. For testing,
programs planning to try vDSO can be compiled against the generated
vdso(64/32).so in the source.

Testing:

========
[root@localhost ~]# cat vdso_test.c
int main() {
        struct timespec tv_start, tv_end;
        struct timeval tv_tmp;
	int i;
        int count = 1 * 1000 * 10000;
	long long diff;

        clock_gettime(0, &tv_start);
        for (i = 0; i < count; i++)
              gettimeofday(&tv_tmp, NULL);
        clock_gettime(0, &tv_end);
        diff = (long long)(tv_end.tv_sec -
		tv_start.tv_sec)*(1*1000*1000*1000);
        diff += (tv_end.tv_nsec - tv_start.tv_nsec);
	printf("Start sec: %d\n", tv_start.tv_sec);
	printf("End sec  : %d\n", tv_end.tv_sec);
        printf("%d cycles in %lld ns = %f ns/cycle\n", count, diff,
		(double)diff / (double)count);
        return 0;
}

[root@localhost ~]# cc vdso_test.c -o t32_without_fix -m32 -lrt
[root@localhost ~]# ./t32_without_fix
Start sec: 1502396130
End sec  : 1502396140
10000000 cycles in 9565148528 ns = 956.514853 ns/cycle
[root@localhost ~]# cc vdso_test.c -o t32_with_fix -m32 ./vdso32.so.dbg
[root@localhost ~]# ./t32_with_fix
Start sec: 1502396168
End sec  : 1502396169
10000000 cycles in 798141262 ns = 79.814126 ns/cycle
[root@localhost ~]# cc vdso_test.c -o t64_without_fix -m64 -lrt
[root@localhost ~]# ./t64_without_fix
Start sec: 1502396208
End sec  : 1502396218
10000000 cycles in 9846091800 ns = 984.609180 ns/cycle
[root@localhost ~]# cc vdso_test.c -o t64_with_fix -m64 ./vdso64.so.dbg
[root@localhost ~]# ./t64_with_fix
Start sec: 1502396257
End sec  : 1502396257
10000000 cycles in 380984048 ns = 38.098405 ns/cycle

V1 to V2 Changes:
=================
	Added hot patching code to switch the read stick instruction to read
tick instruction based on the hardware.

V2 to V3 Changes:
=================
	Merged latest changes from sparc-next and moved the initialization
of clocksource_tick.archdata.vclock_mode to time_init_early. Disabled
queued spinlock and rwlock configuration when simulating 32-bit config
to compile 32-bit VDSO.

V3 to V4 Changes:
=================
	Hardcoded the page size as 8192 in linker script for both 64-bit and
32-bit binaries. Removed unused variables in vdso2c.h. Added -mv8plus flag to
Makefile to prevent the generation of relocation entries for __lshrdi3 in 32-bit
vdso binary.

Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-15 14:21:03 +09:00
Michael Ellerman
3ffa9d9e2a powerpc/64s: Fix Power9 DD2.0 workarounds by adding DD2.1 feature
Recently we added a CPU feature for Power9 DD2.0, to capture the fact
that some workarounds are required only on Power9 DD1 and DD2.0 but
not DD2.1 or later.

Then in commit 9d2f510a66 ("powerpc/64s/idle: avoid POWER9 DD1 and
DD2.0 ERAT workaround on DD2.1") and commit e3646330cf
"powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on
DD2.1") we changed CPU_FTR_SECTIONs to check for DD1 or DD20, eg:

  BEGIN_FTR_SECTION
          PPC_INVALIDATE_ERAT
  END_FTR_SECTION_IFSET(CPU_FTR_POWER9_DD1 | CPU_FTR_POWER9_DD20)

Unfortunately although this reads as "if set DD1 or DD2.0", the or is
a bitwise or and actually generates a mask of both bits. The code that
does the feature patching then checks that the value of the CPU
features masked with that mask are equal to the mask.

So the end result is we're checking for DD1 and DD20 being set, which
never happens. Yes the API is terrible.

Removing the ERAT workaround on DD2.0 results in random SEGVs, the
system tends to boot, but things randomly die including sometimes
dhclient, udev etc.

To fix the problem and hopefully avoid it in future, we remove the
DD2.0 CPU feature and instead add a DD2.1 (or later) feature. This
allows us to easily express that the workarounds are required if DD2.1
is not set.

At some point we will drop the DD1 workarounds entirely and some of
this can be cleaned up.

Fixes: 9d2f510a66 ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 ERAT workaround on DD2.1")
Fixes: e3646330cf ("powerpc/64s/idle: avoid POWER9 DD1 and DD2.0 PMU workaround on DD2.1")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-15 14:25:42 +11:00
Linus Torvalds
37cb8e1f8e DeviceTree for 4.15:
- kbuild cleanups and improvements for dtbs
 
 - Code clean-up of overlay code and fixing for some long standing memory
   leak and race condition in applying overlays
 
 - Improvements to DT memory usage making sysfs/kobjects optional and
   skipping unflattening of disabled nodes. This is part of kernel
   tinification efforts.
 
 - Final piece of removing storing the full path for every DT node. The
   prerequisite conversion of printk's to use device_node format
   specifier happened in 4.14.
 
 - Sync with current upstream dtc. This brings additional checks to dtb
   compiling.
 
 - Binding doc tree wide removal of leading 0s from examples
 
 - RTC binding documentation adding missing devices and some
   consolidation of duplicated bindings
 
 - Vendor prefix documentation for nutsboard, Silicon Storage Technology,
   shimafuji, Tecon Microprocessor Technologies, DH electronics GmbH,
   Opal Kelly, and Next Thing
 -----BEGIN PGP SIGNATURE-----
 
 iQItBAABCAAXBQJaCwaSEBxyb2JoQGtlcm5lbC5vcmcACgkQ+vtdtY28YcNzeA/8
 C8uQhSsX2+UQZvFzcEA8KQAMGT3kYdrcf+gidRKwCEUWg1qscUEpTb3n3Rm5NUbU
 RPD1s6GSlh6fJCMHDTQ6Tti/T59L7nZa2/AIGmUishGu4x4q1o18AobpFJmYP/EM
 SJPwnmm5RV9WcZFao1y+sY3Xtn8DStxHO4cS+dyF5/EvPN9D8nbLJfu7bgTBAZww
 HktIMB9kx+GTipRQZBvBwXoy5MJjthIZub4XwzesA4tGananj4cXlc0xaVxpdYy3
 5bO6q5F7cbrZ2uyrF+oIChpCENK4VaXh80m0WHc8EzaG++shzEkR4he1vYkwnV+I
 OYo4vsUg9dP8rBksUG1eYhS8fJKPvEBRNP7ETT5utVBy5I/tDEbo/crmQZRTIDIC
 hZbhcdZlISZj0DzkMK2ZHQV9UYtRWzXrJbZHFIPP12GCyvXVxYJUIWb9iYnUYSon
 KugygsFSpZHMWmfAhemw5/ctJZ19qhM5UIl2KZk5tMBHAf466ILmZjg0me6fYkOp
 eADfwHJ1dLMdK79CVMHSfp+vArcZXp35B16c3sWpJB36Il97Mc/9siEufCL4GKX7
 IBBnQBlbpSBKBejWVyI7Ip/Xp5u4qAQD+ZMJ9oLqBRqfWerHbDuOERlEOgwGqJYr
 9v4HvP7V8eVUvAdqXka4EBfCyAgUzXDAxG2Dfmv9vGU=
 =jgpN
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree updates from Rob Herring:
 "A bigger diffstat than usual with the kbuild changes and a tree wide
  fix in the binding documentation.

  Summary:

   - kbuild cleanups and improvements for dtbs

   - Code clean-up of overlay code and fixing for some long standing
     memory leak and race condition in applying overlays

   - Improvements to DT memory usage making sysfs/kobjects optional and
     skipping unflattening of disabled nodes. This is part of kernel
     tinification efforts.

   - Final piece of removing storing the full path for every DT node.
     The prerequisite conversion of printk's to use device_node format
     specifier happened in 4.14.

   - Sync with current upstream dtc. This brings additional checks to
     dtb compiling.

   - Binding doc tree wide removal of leading 0s from examples

   - RTC binding documentation adding missing devices and some
     consolidation of duplicated bindings

   - Vendor prefix documentation for nutsboard, Silicon Storage
     Technology, shimafuji, Tecon Microprocessor Technologies, DH
     electronics GmbH, Opal Kelly, and Next Thing"

* tag 'devicetree-for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (55 commits)
  dt-bindings: usb: add #phy-cells to usb-nop-xceiv
  dt-bindings: Remove leading zeros from bindings notation
  kbuild: handle dtb-y and CONFIG_OF_ALL_DTBS natively in Makefile.lib
  MIPS: dts: remove bogus bcm96358nb4ser.dtb from dtb-y entry
  kbuild: clean up *.dtb and *.dtb.S patterns from top-level Makefile
  .gitignore: move *.dtb and *.dtb.S patterns to the top-level .gitignore
  .gitignore: sort normal pattern rules alphabetically
  dt-bindings: add vendor prefix for Next Thing Co.
  scripts/dtc: Update to upstream version v1.4.5-6-gc1e55a5513e9
  of: dynamic: fix memory leak related to properties of __of_node_dup
  of: overlay: make pr_err() string unique
  of: overlay: pr_err from return NOTIFY_OK to overlay apply/remove
  of: overlay: remove unneeded check for NULL kbasename()
  of: overlay: remove a dependency on device node full_name
  of: overlay: simplify applying symbols from an overlay
  of: overlay: avoid race condition between applying multiple overlays
  of: overlay: loosen overly strict phandle clash check
  of: overlay: expand check of whether overlay changeset can be removed
  of: overlay: detect cases where device tree may become corrupt
  of: overlay: minor restructuring
  ...
2017-11-14 18:25:40 -08:00
Linus Torvalds
4008e6a9bc Merge branch 'i2c/for-4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
 "This contains two bigger than usual tree-wide changes this time. They
  all have proper acks, caused no merge conflicts in linux-next where
  they have been for a while. They are namely:

   - to-gpiod conversion of the i2c-gpio driver and its users (touching
     arch/* and drivers/mfd/*)

   - adding a sbs-manager based on I2C core updates to SMBus alerts
     (touching drivers/power/*)

  Other notable changes:

   - i2c_boardinfo can now carry a dev_name to be used when the device
     is created. This is because some devices in ACPI world need fixed
     names to find the regulators.

   - the designware driver got a long discussed overhaul of its PM
     handling. img-scb and davinci got PM support, too.

   - at24 driver has way better OF support. And it has a new maintainer.
     Thanks Bartosz for stepping up!

  The rest is regular driver updates and fixes"

* 'i2c/for-4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (55 commits)
  ARM: sa1100: simpad: Correct I2C GPIO offsets
  i2c: aspeed: Deassert reset in probe
  eeprom: at24: Add OF device ID table
  MAINTAINERS: new maintainer for AT24 driver
  i2c: nuc900: remove platform_data, too
  i2c: thunderx: Remove duplicate NULL check
  i2c: taos-evm: Remove duplicate NULL check
  i2c: Make i2c_unregister_device() NULL-aware
  i2c: xgene-slimpro: Support v2
  i2c: mpc: remove useless variable initialization
  i2c: omap: Trigger bus recovery in lockup case
  i2c: gpio: Add support for named gpios in DT
  dt-bindings: i2c: i2c-gpio: Add support for named gpios
  i2c: gpio: Local vars in probe
  i2c: gpio: Augment all boardfiles to use open drain
  i2c: gpio: Enforce open drain through gpiolib
  gpio: Make it possible for consumers to enforce open drain
  i2c: gpio: Convert to use descriptors
  power: supply: sbs-message: fix some code style issues
  power: supply: sbs-battery: remove unchecked return var
  ...
2017-11-14 17:52:21 -08:00
Linus Torvalds
e37e0ee019 A couple of dma-mapping updates:
- turn dma_cache_sync into a dma_map_ops instance and remove
    implementation that purely are dead because the architecture
    doesn't support noncoherent allocations
  - add a flag for busses that need DMA configuration (Robin Murphy)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAloLSrYLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYOMuQ//XXD94uNPYavrgXzGsAtg+I+LEm+xyk4T0dX5fxfj
 amXX49MHoGemjsBgzJlkQMMFqwDEdkKyEuFnEuy6OeowYCyD6zW0MJ3MwP9OosNJ
 PNTdGZIfSvxPYEW8cR9AdK3iQ2loMBZnYhd+O/oVjSugULLW2DNa7r2VRktcCKoh
 8Ob/8gL6Y9xEYJBRszhrBwKTa/hU8IThxxozBFzN7I3LIKyFboSTcwXGLAHow43g
 4anCTjWTaDcoU2JwY6UTRKRRTV+gD0ZRcsZfd8lNNb5rtMVZkBVOHbF14SMAmw1r
 kSgRcU3+WIFPhK/8wBYqtGZZGnOgFBTHVeqow3AdS728pBWlWl8niTK0DiIgCd3m
 qzScF6SqfN1bCZkZAy8FUV2l0DPYKS6lvyNkf00Eb2W/f6LEqAcjCi2QDDxRfaw+
 Vm97nPUiM+uXNy/6KtAy6ChdprSqx12/edXPp7Y3H2rS/+Dmr6exeix+wb7QUN8W
 JI7ZRHo4JLaJZk/XrZtGX/6jnN1Jo7vfApQOmYDY7kE1iGtOU/LQQj8gcZRVQxML
 4soN6ivSmZX2k03LabWHpYQ8QiyCSYChLC+Az7rQH47LDLeu1IdTJu6orpXpaxyo
 ymzEWlHbmF7mE66X4g/Up/eAYk2YLUA3rKLGVjAIaWDBzHftSFg5EaAnqMADC1G2
 hSo=
 =ALJf
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.15' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - turn dma_cache_sync into a dma_map_ops instance and remove
   implementation that purely are dead because the architecture doesn't
   support noncoherent allocations

 - add a flag for busses that need DMA configuration (Robin Murphy)

* tag 'dma-mapping-4.15' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: turn dma_cache_sync into a dma_map_ops method
  sh: make dma_cache_sync a no-op
  xtensa: make dma_cache_sync a no-op
  unicore32: make dma_cache_sync a no-op
  powerpc: make dma_cache_sync a no-op
  mn10300: make dma_cache_sync a no-op
  microblaze: make dma_cache_sync a no-op
  ia64: make dma_cache_sync a no-op
  frv: make dma_cache_sync a no-op
  x86: make dma_cache_sync a no-op
  floppy: consolidate the dummy fd_cacheflush definition
  drivers: flag buses which demand DMA configuration
2017-11-14 16:54:12 -08:00
Linus Torvalds
e2c5923c34 Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block
Pull core block layer updates from Jens Axboe:
 "This is the main pull request for block storage for 4.15-rc1.

  Nothing out of the ordinary in here, and no API changes or anything
  like that. Just various new features for drivers, core changes, etc.
  In particular, this pull request contains:

   - A patch series from Bart, closing the whole on blk/scsi-mq queue
     quescing.

   - A series from Christoph, building towards hidden gendisks (for
     multipath) and ability to move bio chains around.

   - NVMe
        - Support for native multipath for NVMe (Christoph).
        - Userspace notifications for AENs (Keith).
        - Command side-effects support (Keith).
        - SGL support (Chaitanya Kulkarni)
        - FC fixes and improvements (James Smart)
        - Lots of fixes and tweaks (Various)

   - bcache
        - New maintainer (Michael Lyle)
        - Writeback control improvements (Michael)
        - Various fixes (Coly, Elena, Eric, Liang, et al)

   - lightnvm updates, mostly centered around the pblk interface
     (Javier, Hans, and Rakesh).

   - Removal of unused bio/bvec kmap atomic interfaces (me, Christoph)

   - Writeback series that fix the much discussed hundreds of millions
     of sync-all units. This goes all the way, as discussed previously
     (me).

   - Fix for missing wakeup on writeback timer adjustments (Yafang
     Shao).

   - Fix laptop mode on blk-mq (me).

   - {mq,name} tupple lookup for IO schedulers, allowing us to have
     alias names. This means you can use 'deadline' on both !mq and on
     mq (where it's called mq-deadline). (me).

   - blktrace race fix, oopsing on sg load (me).

   - blk-mq optimizations (me).

   - Obscure waitqueue race fix for kyber (Omar).

   - NBD fixes (Josef).

   - Disable writeback throttling by default on bfq, like we do on cfq
     (Luca Miccio).

   - Series from Ming that enable us to treat flush requests on blk-mq
     like any other request. This is a really nice cleanup.

   - Series from Ming that improves merging on blk-mq with schedulers,
     getting us closer to flipping the switch on scsi-mq again.

   - BFQ updates (Paolo).

   - blk-mq atomic flags memory ordering fixes (Peter Z).

   - Loop cgroup support (Shaohua).

   - Lots of minor fixes from lots of different folks, both for core and
     driver code"

* 'for-4.15/block' of git://git.kernel.dk/linux-block: (294 commits)
  nvme: fix visibility of "uuid" ns attribute
  blk-mq: fixup some comment typos and lengths
  ide: ide-atapi: fix compile error with defining macro DEBUG
  blk-mq: improve tag waiting setup for non-shared tags
  brd: remove unused brd_mutex
  blk-mq: only run the hardware queue if IO is pending
  block: avoid null pointer dereference on null disk
  fs: guard_bio_eod() needs to consider partitions
  xtensa/simdisk: fix compile error
  nvme: expose subsys attribute to sysfs
  nvme: create 'slaves' and 'holders' entries for hidden controllers
  block: create 'slaves' and 'holders' entries for hidden gendisks
  nvme: also expose the namespace identification sysfs files for mpath nodes
  nvme: implement multipath access to nvme subsystems
  nvme: track shared namespaces
  nvme: introduce a nvme_ns_ids structure
  nvme: track subsystems
  block, nvme: Introduce blk_mq_req_flags_t
  block, scsi: Make SCSI quiesce and resume work reliably
  block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
  ...
2017-11-14 15:32:19 -08:00
Heiko Carstens
049a2c2d48 s390: enable CPU alternatives unconditionally
Remove the CPU_ALTERNATIVES config option and enable the code
unconditionally. The config option was only added to avoid a conflict
with the named saved segment support. Since that code is gone there is
no reason to keep the CPU_ALTERNATIVES config option.

Just enable it unconditionally to also reduce the number of config
options and make it less likely that something breaks.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-14 22:08:12 +01:00
Heiko Carstens
a6de0a91d9 s390/nmi: remove unused code
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-14 22:08:08 +01:00
Heiko Carstens
2be1da8d4d s390/mm: remove unused code
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-14 22:08:05 +01:00
Heiko Carstens
78ca4fe3bb s390/spinlock: fix indentation
checkpatch:
    WARNING: Statements should start on a tabstop
    #9499: FILE: arch/s390/lib/spinlock.c:231:
    +                          return;

sparse:
arch/s390/lib/spinlock.c:81 arch_load_niai4()
    warn: inconsistent indenting

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-14 22:07:58 +01:00