Commit Graph

767226 Commits

Author SHA1 Message Date
Breno Leitao
b2f82565f2 powerpc: Wire up io_pgetevents
Wire up io_pgetevents system call on powerpc.

io_pgetevents is a new syscall to read asynchronous I/O events from the
completion queue.

Tested with libaio branch aio-poll[1] and the io_pgetevents test (#22) passed
on both ppc64 LE and BE modes.

[1] https://pagure.io/libaio/branch/aio-poll

CC: Christoph Hellwig <hch@lst.de>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-23 21:43:21 +10:00
Rob Herring
6b4154a655 arm64: dts: msm8916: fix Coresight ETF graph connections
The ETF input should be connected to the funnel output, and the ETF
output should be connected to the replicator input. The labels are wrong
and these got swapped:

Warning (graph_endpoint): /soc/funnel@821000/ports/port@8/endpoint: graph connection to node '/soc/etf@825000/ports/port@1/endpoint' is not bidirectional
Warning (graph_endpoint): /soc/replicator@824000/ports/port@2/endpoint: graph connection to node '/soc/etf@825000/ports/port@0/endpoint' is not bidirectional

Fixes: 7c10da3736 ("arm64: dts: qcom: Add msm8916 CoreSight components")
Cc: Ivan T. Ivanov <ivan.ivanov@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Andy Gross <andy.gross@linaro.org>
Cc: David Brown <david.brown@linaro.org>
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-06-23 00:00:05 -05:00
Srinivas Kandagatla
1ebb2709ba arm64: dts: apq8096-db820c: disable uart0 by default
Access to UART0 is disabled by bootloaders. By leaving it enabled by
default would reboot the board.
Disable this for now, this would alteast give a board which boots.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
2018-06-22 23:59:04 -05:00
Cong Wang
35b42da69e net_sched: remove a bogus warning in hfsc
In update_vf():

  cftree_remove(cl);
  update_cfmin(cl->cl_parent);

the cl_cfmin of cl->cl_parent is intentionally updated to 0
when that parent only has one child. And if this parent is
root qdisc, we could end up, in hfsc_schedule_watchdog(),
that we can't decide the next schedule time for qdisc watchdog.
But it seems safe that we can just skip it, as this watchdog is
not always scheduled anyway.

Thanks to Marco for testing all the cases, nothing is broken.

Reported-by: Marco Berizzi <pupilla@libero.it>
Tested-by: Marco Berizzi <pupilla@libero.it>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:58:46 +09:00
David S. Miller
2ca4eb8574 Merge branch 'dccp-fixes-around-rx_tstamp_last_feedback'
Eric Dumazet says:

====================
net: dccp: fixes around rx_tstamp_last_feedback

This patch series fix some issues with rx_tstamp_last_feedback.

- Switch to monotonic clock.
- Avoid potential overflows on fast hosts/networks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:46:44 +09:00
Eric Dumazet
0ce4e70ff0 net: dccp: switch rx_tstamp_last_feedback to monotonic clock
To compute delays, better not use time of the day which can
be changed by admins or malicious programs.

Also change ccid3_first_li() to use s64 type for delta variable
to avoid potential overflows.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:46:44 +09:00
Eric Dumazet
74174fe563 net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
On fast hosts or malicious bots, we trigger a DCCP_BUG() which
seems excessive.

syzbot reported :

BUG: delta (-6195) <= 0 at net/dccp/ccids/ccid3.c:628/ccid3_hc_rx_send_feedback()
CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.18.0-rc1+ #112
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 ccid3_hc_rx_send_feedback net/dccp/ccids/ccid3.c:628 [inline]
 ccid3_hc_rx_packet_recv.cold.16+0x38/0x71 net/dccp/ccids/ccid3.c:793
 ccid_hc_rx_packet_recv net/dccp/ccid.h:185 [inline]
 dccp_deliver_input_to_ccids+0xf0/0x280 net/dccp/input.c:180
 dccp_rcv_established+0x87/0xb0 net/dccp/input.c:378
 dccp_v4_do_rcv+0x153/0x180 net/dccp/ipv4.c:654
 sk_backlog_rcv include/net/sock.h:914 [inline]
 __sk_receive_skb+0x3ba/0xd80 net/core/sock.c:517
 dccp_v4_rcv+0x10f9/0x1f58 net/dccp/ipv4.c:875
 ip_local_deliver_finish+0x2eb/0xda0 net/ipv4/ip_input.c:215
 NF_HOOK include/linux/netfilter.h:287 [inline]
 ip_local_deliver+0x1e9/0x750 net/ipv4/ip_input.c:256
 dst_input include/net/dst.h:450 [inline]
 ip_rcv_finish+0x823/0x2220 net/ipv4/ip_input.c:396
 NF_HOOK include/linux/netfilter.h:287 [inline]
 ip_rcv+0xa18/0x1284 net/ipv4/ip_input.c:492
 __netif_receive_skb_core+0x2488/0x3680 net/core/dev.c:4628
 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4693
 process_backlog+0x219/0x760 net/core/dev.c:5373
 napi_poll net/core/dev.c:5771 [inline]
 net_rx_action+0x7da/0x1980 net/core/dev.c:5837
 __do_softirq+0x2e8/0xb17 kernel/softirq.c:284
 run_ksoftirqd+0x86/0x100 kernel/softirq.c:645
 smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
 kthread+0x345/0x410 kernel/kthread.c:240
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:46:43 +09:00
Casey Schaufler
7b4e88434c Smack: Mark inode instant in smack_task_to_inode
Smack: Mark inode instant in smack_task_to_inode

/proc clean-up in commit 1bbc55131e
resulted in smack_task_to_inode() being called before smack_d_instantiate.
This resulted in the smk_inode value being ignored, even while present
for files in /proc/self. Marking the inode as instant here fixes that.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
2018-06-23 10:45:56 +09:00
Geert Uytterhoeven
e020797b7d net: Remove depends on HAS_DMA in case of platform dependency
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.

Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.

This simplifies the dependencies, and allows to improve compile-testing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:44:30 +09:00
Geert Uytterhoeven
935c5e3eaf MAINTAINERS: Add file patterns for dsa device tree bindings
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:43:00 +09:00
Antoine Tenart
c2cd650b5b net: mscc: make sparse happy
This patch fixes a sparse warning about using an incorrect type in
argument 2 of ocelot_write_rix(), as an u32 was expected but a __be32
was given. The conversion to u32 is forced, which is safe as the value
will be written as-is in the hardware without any modification.

Fixes: 08d02364b1 ("net: mscc: fix the injection header")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:42:02 +09:00
Antoine Tenart
271f7ff5aa net: mvneta: fix the Rx desc DMA address in the Rx path
When using s/w buffer management, buffers are allocated and DMA mapped.
When doing so on an arm64 platform, an offset correction is applied on
the DMA address, before storing it in an Rx descriptor. The issue is
this DMA address is then used later in the Rx path without removing the
offset correction. Thus the DMA address is wrong, which can led to
various issues.

This patch fixes this by removing the offset correction from the DMA
address retrieved from the Rx descriptor before using it in the Rx path.

Fixes: 8d5047cf9c ("net: mvneta: Convert to be 64 bits compatible")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:40:15 +09:00
Tobin C. Harding
805f16a5f1 Documentation: e1000: Fix docs build error
Recent patch updated e1000 docs to rst format.  Docs build (`make
htmldocs`) is currently failing due to this file with error:

        (SEVERE/4) Unexpected section title.

This is because a section of the file is indented 2 spaces.  Build error
can be cleared by aligning the text with column 0.  While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.

Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.

Fixes commit (228046e761 Documentation: e1000: Update kernel documentation)

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:37:37 +09:00
Tobin C. Harding
3b0c3ebe2a Documentation: e100: Fix docs build error
Recent patch updated e100 docs to rst format.  Docs build (`make
htmldocs`) is currently failing due to this file with error:

	(SEVERE/4) Unexpected section title.

This is because a section of the file is indented 2 spaces.  Build error
can be cleared by aligning the text with column 0.  While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.

Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.

Fixes commit (85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc)

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:37:37 +09:00
Tobin C. Harding
3be40e5476 Documentation: e1000: Use correct heading adornment
Recently documentation file was converted to rst.  The document title
has the incorrect heading adornment.  From kernel docs:

	* Please stick to this order of heading adornments:

	  1. ``=`` with overline for document title::

	       ==============
	       Document title
	       ==============

Add  overline heading adornment to document title.

Fixes commit (228046e761 Documentation: e1000: Update kernel documentation)

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:37:37 +09:00
Tobin C. Harding
32e6996ca6 Documentation: e100: Use correct heading adornment
Recently documentation file was converted to rst.  The document title
has the incorrect heading adornment.  From kernel docs:

	* Please stick to this order of heading adornments:

	  1. ``=`` with overline for document title::

	       ==============
	       Document title
	       ==============

Add  overline heading adornment to document title.

Fixes commit (85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc)

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:37:37 +09:00
Hangbin Liu
6c6da92808 ipv6: mcast: fix unsolicited report interval after receiving querys
After recieving MLD querys, we update idev->mc_maxdelay with max_delay
from query header. This make the later unsolicited reports have the same
interval with mc_maxdelay, which means we may send unsolicited reports with
long interval time instead of default configured interval time.

Also as we will not call ipv6_mc_reset() after device up. This issue will
be there even after leave the group and join other groups.

Fixes: fc4eba58b4 ("ipv6: make unsolicited report intervals configurable for mld")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:27:45 +09:00
Jason Wang
b8f1f65882 vhost_net: validate sock before trying to put its fd
Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when
we meet errors during ubuf allocation, the code does not check for
NULL before calling sockfd_put(), this will lead NULL
dereferencing. Fixing by checking sock pointer before.

Fixes: bab632d69e ("vhost: vhost TX zero-copy support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:23:49 +09:00
Michel Dänzer
38e624a18f drm/amdgpu: GPU vs CPU page size fixes in amdgpu_vm_bo_split_mapping
start / last / max_entries are numbers of GPU pages, pfn / count are
numbers of CPU pages. Convert between them accordingly.

Fixes badness on systems with > 4K page size.

Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/106258
Reported-by: Matt Corallo <freedesktop@bluematt.me>
Tested-by: foxbat@ruin.net
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-22 14:57:17 -05:00
Lyude Paul
fe2a196529 drm/amdgpu: Count disabled CRTCs in commit tail earlier
This fixes a regression I accidentally reduced that was picked up by
kasan, where we were checking the CRTC atomic states after DRM's helpers
had already freed them. Example:

==================================================================
BUG: KASAN: use-after-free in amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
Read of size 1 at addr ffff8803a697b071 by task kworker/u16:0/7

CPU: 7 PID: 7 Comm: kworker/u16:0 Tainted: G           O      4.18.0-rc1Lyude-Upstream+ #1
Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.21 05/02/2018
Workqueue: events_unbound commit_work [drm_kms_helper]
Call Trace:
 dump_stack+0xc1/0x169
 ? dump_stack_print_info.cold.1+0x42/0x42
 ? kmsg_dump_rewind_nolock+0xd9/0xd9
 ? printk+0x9f/0xc5
 ? amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
 print_address_description+0x6c/0x23c
 ? amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
 kasan_report.cold.6+0x241/0x2fd
 amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
 ? commit_planes_to_stream.constprop.45+0x13b0/0x13b0 [amdgpu]
 ? cpu_load_update_active+0x290/0x290
 ? finish_task_switch+0x2bd/0x840
 ? __switch_to_asm+0x34/0x70
 ? read_word_at_a_time+0xe/0x20
 ? strscpy+0x14b/0x460
 ? drm_atomic_helper_wait_for_dependencies+0x47d/0x7e0 [drm_kms_helper]
 commit_tail+0x96/0xe0 [drm_kms_helper]
 process_one_work+0x88a/0x1360
 ? create_worker+0x540/0x540
 ? __sched_text_start+0x8/0x8
 ? move_queued_task+0x760/0x760
 ? call_rcu_sched+0x20/0x20
 ? vsnprintf+0xcda/0x1350
 ? wait_woken+0x1c0/0x1c0
 ? mutex_unlock+0x1d/0x40
 ? init_timer_key+0x190/0x230
 ? schedule+0xea/0x390
 ? __schedule+0x1ea0/0x1ea0
 ? need_to_create_worker+0xe4/0x210
 ? init_worker_pool+0x700/0x700
 ? try_to_del_timer_sync+0xbf/0x110
 ? del_timer+0x120/0x120
 ? __mutex_lock_slowpath+0x10/0x10
 worker_thread+0x196/0x11f0
 ? flush_rcu_work+0x50/0x50
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? __schedule+0x7d6/0x1ea0
 ? migrate_swap_stop+0x850/0x880
 ? __sched_text_start+0x8/0x8
 ? save_stack+0x8c/0xb0
 ? kasan_kmalloc+0xbf/0xe0
 ? kmem_cache_alloc_trace+0xe4/0x190
 ? kthread+0x98/0x390
 ? ret_from_fork+0x35/0x40
 ? ret_from_fork+0x35/0x40
 ? deactivate_slab.isra.67+0x3c4/0x5c0
 ? kthread+0x98/0x390
 ? kthread+0x98/0x390
 ? set_track+0x76/0x120
 ? schedule+0xea/0x390
 ? __schedule+0x1ea0/0x1ea0
 ? wait_woken+0x1c0/0x1c0
 ? kasan_unpoison_shadow+0x30/0x40
 ? parse_args.cold.15+0x17a/0x17a
 ? flush_rcu_work+0x50/0x50
 kthread+0x2d4/0x390
 ? kthread_create_worker_on_cpu+0xc0/0xc0
 ret_from_fork+0x35/0x40

Allocated by task 1124:
 kasan_kmalloc+0xbf/0xe0
 kmem_cache_alloc_trace+0xe4/0x190
 dm_crtc_duplicate_state+0x78/0x130 [amdgpu]
 drm_atomic_get_crtc_state+0x147/0x410 [drm]
 page_flip_common+0x57/0x230 [drm_kms_helper]
 drm_atomic_helper_page_flip+0xa6/0x110 [drm_kms_helper]
 drm_mode_page_flip_ioctl+0xc4b/0x10a0 [drm]
 drm_ioctl_kernel+0x1d4/0x260 [drm]
 drm_ioctl+0x433/0x920 [drm]
 amdgpu_drm_ioctl+0x11d/0x290 [amdgpu]
 do_vfs_ioctl+0x1a1/0x13d0
 ksys_ioctl+0x60/0x90
 __x64_sys_ioctl+0x6f/0xb0
 do_syscall_64+0x147/0x440
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 1124:
 __kasan_slab_free+0x12e/0x180
 kfree+0x92/0x1a0
 drm_atomic_state_default_clear+0x315/0xc40 [drm]
 __drm_atomic_state_free+0x35/0xd0 [drm]
 drm_atomic_helper_update_plane+0xac/0x350 [drm_kms_helper]
 __setplane_internal+0x2d6/0x840 [drm]
 drm_mode_cursor_universal+0x41e/0xbe0 [drm]
 drm_mode_cursor_common+0x49f/0x880 [drm]
 drm_mode_cursor_ioctl+0xd8/0x130 [drm]
 drm_ioctl_kernel+0x1d4/0x260 [drm]
 drm_ioctl+0x433/0x920 [drm]
 amdgpu_drm_ioctl+0x11d/0x290 [amdgpu]
 do_vfs_ioctl+0x1a1/0x13d0
 ksys_ioctl+0x60/0x90
 __x64_sys_ioctl+0x6f/0xb0
 do_syscall_64+0x147/0x440
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff8803a697b068
 which belongs to the cache kmalloc-1024 of size 1024
The buggy address is located 9 bytes inside of
 1024-byte region [ffff8803a697b068, ffff8803a697b468)
The buggy address belongs to the page:
page:ffffea000e9a5e00 count:1 mapcount:0 mapping:ffff88041e00efc0 index:0x0 compound_mapcount: 0
flags: 0x8000000000008100(slab|head)
raw: 8000000000008100 ffffea000ecbc208 ffff88041e000c70 ffff88041e00efc0
raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8803a697af00: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff8803a697af80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8803a697b000: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
                                                             ^
 ffff8803a697b080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8803a697b100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

So, we fix this by counting the number of CRTCs this atomic commit disabled
early on in the function before their atomic states have been freed, then use
that count later to do the appropriate number of RPM puts at the end of the
function.

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: stable@vger.kernel.org
Fixes: 97028037a3 ("drm/amdgpu: Grab/put runtime PM references in atomic_commit_tail()")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Michel Dänzer <michel@daenzer.net>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-22 14:55:25 -05:00
Suravee Suthikulpanit
964d978433 x86/CPU/AMD: Fix LLC ID bit-shift calculation
The current logic incorrectly calculates the LLC ID from the APIC ID.

Unless specified otherwise, the LLC ID should be calculated by removing
the Core and Thread ID bits from the least significant end of the APIC
ID. For more info, see "ApicId Enumeration Requirements" in any Fam17h
PPR document.

[ bp: Improve commit message. ]

Fixes: 68091ee7ac ("Calculate last level cache ID from number of sharing threads")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1528915390-30533-1-git-send-email-suravee.suthikulpanit@amd.com
2018-06-22 21:21:49 +02:00
Thomas Gleixner
7731b8bc94 Merge branch 'linus' into x86/urgent
Required to queue a dependent fix.
2018-06-22 21:20:35 +02:00
Arnd Bergmann
f2ccaa5904 dm raid: don't use 'const' in function return
A newly introduced function has 'const int' as the return type,
but as "make W=1" reports, that has no meaning:

drivers/md/dm-raid.c:510:18: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]

This changes the return type to plain 'int'.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 33e53f0685 ("dm raid: introduce extended superblock and new raid types to support takeover/reshaping")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Fixes: 552aa679f2 ("dm raid: use rs_is_raid*()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-06-22 14:51:12 -04:00
Bart Van Assche
2d0b2d64d3 dm zoned: avoid triggering reclaim from inside dmz_map()
This patch avoids that lockdep reports the following:

======================================================
WARNING: possible circular locking dependency detected
4.18.0-rc1 #62 Not tainted
------------------------------------------------------
kswapd0/84 is trying to acquire lock:
00000000c313516d (&xfs_nondir_ilock_class){++++}, at: xfs_free_eofblocks+0xa2/0x1e0

but task is already holding lock:
00000000591c83ae (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (fs_reclaim){+.+.}:
  kmem_cache_alloc+0x2c/0x2b0
  radix_tree_node_alloc.constprop.19+0x3d/0xc0
  __radix_tree_create+0x161/0x1c0
  __radix_tree_insert+0x45/0x210
  dmz_map+0x245/0x2d0 [dm_zoned]
  __map_bio+0x40/0x260
  __split_and_process_non_flush+0x116/0x220
  __split_and_process_bio+0x81/0x180
  __dm_make_request.isra.32+0x5a/0x100
  generic_make_request+0x36e/0x690
  submit_bio+0x6c/0x140
  mpage_readpages+0x19e/0x1f0
  read_pages+0x6d/0x1b0
  __do_page_cache_readahead+0x21b/0x2d0
  force_page_cache_readahead+0xc4/0x100
  generic_file_read_iter+0x7c6/0xd20
  __vfs_read+0x102/0x180
  vfs_read+0x9b/0x140
  ksys_read+0x55/0xc0
  do_syscall_64+0x5a/0x1f0
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #1 (&dmz->chunk_lock){+.+.}:
  dmz_map+0x133/0x2d0 [dm_zoned]
  __map_bio+0x40/0x260
  __split_and_process_non_flush+0x116/0x220
  __split_and_process_bio+0x81/0x180
  __dm_make_request.isra.32+0x5a/0x100
  generic_make_request+0x36e/0x690
  submit_bio+0x6c/0x140
  _xfs_buf_ioapply+0x31c/0x590
  xfs_buf_submit_wait+0x73/0x520
  xfs_buf_read_map+0x134/0x2f0
  xfs_trans_read_buf_map+0xc3/0x580
  xfs_read_agf+0xa5/0x1e0
  xfs_alloc_read_agf+0x59/0x2b0
  xfs_alloc_pagf_init+0x27/0x60
  xfs_bmap_longest_free_extent+0x43/0xb0
  xfs_bmap_btalloc_nullfb+0x7f/0xf0
  xfs_bmap_btalloc+0x428/0x7c0
  xfs_bmapi_write+0x598/0xcc0
  xfs_iomap_write_allocate+0x15a/0x330
  xfs_map_blocks+0x1cf/0x3f0
  xfs_do_writepage+0x15f/0x7b0
  write_cache_pages+0x1ca/0x540
  xfs_vm_writepages+0x65/0xa0
  do_writepages+0x48/0xf0
  __writeback_single_inode+0x58/0x730
  writeback_sb_inodes+0x249/0x5c0
  wb_writeback+0x11e/0x550
  wb_workfn+0xa3/0x670
  process_one_work+0x228/0x670
  worker_thread+0x3c/0x390
  kthread+0x11c/0x140
  ret_from_fork+0x3a/0x50

-> #0 (&xfs_nondir_ilock_class){++++}:
  down_read_nested+0x43/0x70
  xfs_free_eofblocks+0xa2/0x1e0
  xfs_fs_destroy_inode+0xac/0x270
  dispose_list+0x51/0x80
  prune_icache_sb+0x52/0x70
  super_cache_scan+0x127/0x1a0
  shrink_slab.part.47+0x1bd/0x590
  shrink_node+0x3b5/0x470
  balance_pgdat+0x158/0x3b0
  kswapd+0x1ba/0x600
  kthread+0x11c/0x140
  ret_from_fork+0x3a/0x50

other info that might help us debug this:

Chain exists of:
  &xfs_nondir_ilock_class --> &dmz->chunk_lock --> fs_reclaim

Possible unsafe locking scenario:

     CPU0                    CPU1
     ----                    ----
lock(fs_reclaim);
                             lock(&dmz->chunk_lock);
                             lock(fs_reclaim);
lock(&xfs_nondir_ilock_class);

*** DEADLOCK ***

3 locks held by kswapd0/84:
 #0: 00000000591c83ae (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
 #1: 000000000f8208f5 (shrinker_rwsem){++++}, at: shrink_slab.part.47+0x3f/0x590
 #2: 00000000cacefa54 (&type->s_umount_key#43){.+.+}, at: trylock_super+0x16/0x50

stack backtrace:
CPU: 7 PID: 84 Comm: kswapd0 Not tainted 4.18.0-rc1 #62
Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0 12/17/2015
Call Trace:
 dump_stack+0x85/0xcb
 print_circular_bug.isra.36+0x1ce/0x1db
 __lock_acquire+0x124e/0x1310
 lock_acquire+0x9f/0x1f0
 down_read_nested+0x43/0x70
 xfs_free_eofblocks+0xa2/0x1e0
 xfs_fs_destroy_inode+0xac/0x270
 dispose_list+0x51/0x80
 prune_icache_sb+0x52/0x70
 super_cache_scan+0x127/0x1a0
 shrink_slab.part.47+0x1bd/0x590
 shrink_node+0x3b5/0x470
 balance_pgdat+0x158/0x3b0
 kswapd+0x1ba/0x600
 kthread+0x11c/0x140
 ret_from_fork+0x3a/0x50

Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
Fixes: 4218a95546 ("dm zoned: use GFP_NOIO in I/O path")
Cc: <stable@vger.kernel.org>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-06-22 14:51:12 -04:00
Kees Cook
50a7d3ba7c dm writecache: use 2-factor allocator arguments
This adjusts the allocator calls to use the 2-factor argument style, as
already done treewide for better defense against allocator overflows.

Signed-off-by: Kees Cook <keescook@chromium.org>
[snitzer: tweaked code to leave assignment in a test alone]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-06-22 14:51:12 -04:00
Mike Snitzer
7ccdbf85d3 dm thin metadata: remove needless work from __commit_transaction
Commit 5a32083d03 ("dm: take care to copy the space map roots before
locking the superblock") properly removed the calls to dm_sm_root_size()
from __write_initial_superblock().  But the dm_sm_root_size() calls were
left dangling in __commit_transaction().

Fixes: 5a32083d03 ("dm: take care to copy the space map roots before locking the superblock")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-06-22 14:51:11 -04:00
Mike Snitzer
f21c601a2b dm: use bio_split() when splitting out the already processed bio
Use of bio_clone_bioset() is inefficient if there is no need to clone
the original bio's bio_vec array.  Best to use the bio_clone_fast()
variant.  Also, just using bio_advance() is only part of what is needed
to properly setup the clone -- it doesn't account for the various
bio_integrity() related work that also needs to be performed (see
bio_split).

Address both of these issues by switching from bio_clone_bioset() to
bio_split().

Fixes: 18a25da8 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.15+, requires removal of '&' before md->queue->bio_split
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-06-22 14:51:11 -04:00
Jan Kara
3ee7e8697d bdi: Fix another oops in wb_workfn()
syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was
WB_shutting_down after wb->bdi->dev became NULL. This indicates that
unregister_bdi() failed to call wb_shutdown() on one of wb objects.

The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
drops bdi's reference to wb structures before going through the list of
wbs again and calling wb_shutdown() on each of them. This way the loop
iterating through all wbs can easily miss a wb if that wb has already
passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
from cgwb_release_workfn() and as a result fully shutdown bdi although
wb_workfn() for this wb structure is still running. In fact there are
also other ways cgwb_bdi_unregister() can race with
cgwb_release_workfn() leading e.g. to use-after-free issues:

CPU1                            CPU2
                                cgwb_bdi_unregister()
                                  cgwb_kill(*slot);

cgwb_release()
  queue_work(cgwb_release_wq, &wb->release_work);
cgwb_release_workfn()
                                  wb = list_first_entry(&bdi->wb_list, ...)
                                  spin_unlock_irq(&cgwb_lock);
  wb_shutdown(wb);
  ...
  kfree_rcu(wb, rcu);
                                  wb_shutdown(wb); -> oops use-after-free

We solve these issues by synchronizing writeback structure shutdown from
cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
way we also no longer need synchronization using WB_shutting_down as the
mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
bdi_unregister().

Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-22 12:08:07 -06:00
Geert Uytterhoeven
0ae52ddf5b lightnvm: Remove depends on HAS_DMA in case of platform dependency
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.

Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.

This simplifies the dependencies, and allows to improve compile-testing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-22 12:07:11 -06:00
Will Deacon
784e0300fe rseq: Avoid infinite recursion when delivering SIGSEGV
When delivering a signal to a task that is using rseq, we call into
__rseq_handle_notify_resume() so that the registers pushed in the
sigframe are updated to reflect the state of the restartable sequence
(for example, ensuring that the signal returns to the abort handler if
necessary).

However, if the rseq management fails due to an unrecoverable fault when
accessing userspace or certain combinations of RSEQ_CS_* flags, then we
will attempt to deliver a SIGSEGV. This has the potential for infinite
recursion if the rseq code continuously fails on signal delivery.

Avoid this problem by using force_sigsegv() instead of force_sig(), which
is explicitly designed to reset the SEGV handler to SIG_DFL in the case
of a recursive fault. In doing so, remove rseq_signal_deliver() from the
internal rseq API and have an optional struct ksignal * parameter to
rseq_handle_notify_resume() instead.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: peterz@infradead.org
Cc: paulmck@linux.vnet.ibm.com
Cc: boqun.feng@gmail.com
Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com
2018-06-22 19:04:22 +02:00
Masahiro Yamada
3f6e698604 mtd: rawnand: denali_dt: set clk_x_rate to 200 MHz unconditionally
Since commit 1bb8866677 ("mtd: nand: denali: handle timing parameters
by setup_data_interface()"), denali_dt.c gets the clock rate from the
clock driver.  The driver expects the frequency of the bus interface
clock, whereas the clock driver of SOCFPGA provides the core clock.
Thus, the setup_data_interface() hook calculates timing parameters
based on a wrong frequency.

To make it work without relying on the clock driver, hard-code the clock
frequency, 200MHz.  This is fine for existing DT of UniPhier, and also
fixes the issue of SOCFPGA because both platforms use 200 MHz for the
bus interface clock.

Fixes: 1bb8866677 ("mtd: nand: denali: handle timing parameters by setup_data_interface()")
Cc: linux-stable <stable@vger.kernel.org> #4.14+
Reported-by: Philipp Rosenberger <p.rosenberger@linutronix.de>
Suggested-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-06-22 18:47:56 +02:00
Will Deacon
71c8fc0c96 arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance
When rewriting swapper using nG mappings, we must performance cache
maintenance around each page table access in order to avoid coherency
problems with the host's cacheable alias under KVM. To ensure correct
ordering of the maintenance with respect to Device memory accesses made
with the Stage-1 MMU disabled, DMBs need to be added between the
maintenance and the corresponding memory access.

This patch adds a missing DMB between writing a new page table entry and
performing a clean+invalidate on the same line.

Fixes: f992b4dfd5 ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings")
Cc: <stable@vger.kernel.org> # 4.16.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-22 17:23:40 +01:00
Will Deacon
b5b7dd647f arm64: kpti: Use early_param for kpti= command-line option
We inspect __kpti_forced early on as part of the cpufeature enable
callback which remaps the swapper page table using non-global entries.

Ensure that __kpti_forced has been updated to reflect the kpti=
command-line option before we start using it.

Fixes: ea1e3de85e ("arm64: entry: Add fake CPU feature for unmapping the kernel at EL0")
Cc: <stable@vger.kernel.org> # 4.16.x-
Reported-by: Wei Xu <xuwei5@hisilicon.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-22 17:23:26 +01:00
Geert Uytterhoeven
48e315618d MAINTAINERS: Add file patterns for x86 device tree bindings
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622100820.29616-1-geert@linux-m68k.org
2018-06-22 17:58:32 +02:00
Geert Uytterhoeven
abcbcb80cd time: Make sure jiffies_to_msecs() preserves non-zero time periods
For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of
1000, jiffies_to_msecs() never returns zero when passed a non-zero time
period.

However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or
1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero
for small non-zero time periods.  This may break code that relies on
receiving back a non-zero value.

jiffies_to_usecs() does not need such a fix: one jiffy can only be less
than one µs if HZ > 1000000, and such large values of HZ are already
rejected at build time, twice:

  - include/linux/jiffies.h does #error if HZ >= 12288,
  - kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC).

Broken since forever.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.org
2018-06-22 17:48:36 +02:00
Vitaly Kuznetsov
2ddc649810 KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number
KVM_CAP_HYPERV_TLBFLUSH collided with KVM_CAP_S390_PSW-BPB, its paragraph
number should now be 8.18.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-06-22 17:30:20 +02:00
Marc Orr
0447378a4a kvm: vmx: Nested VM-entry prereqs for event inj.
This patch extends the checks done prior to a nested VM entry.
Specifically, it extends the check_vmentry_prereqs function with checks
for fields relevant to the VM-entry event injection information, as
described in the Intel SDM, volume 3.

This patch is motivated by a syzkaller bug, where a bad VM-entry
interruption information field is generated in the VMCS02, which causes
the nested VM launch to fail. Then, KVM fails to resume L1.

While KVM should be improved to correctly resume L1 execution after a
failed nested launch, this change is justified because the existing code
to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is
sparse.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Marc Orr <marcorr@google.com>
[Removed comment whose parts were describing previous revisions and the
 rest was obvious from function/variable naming. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-06-22 16:46:26 +02:00
Jens Axboe
f9da9d0786 Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph:

"Various relatively small fixes, mostly to fix error handling of various
 sorts."

* 'nvme-4.18' of git://git.infradead.org/nvme:
  nvme-pci: limit max IO size and segments to avoid high order allocations
  nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
  nvme-fc: release io queues to allow fast fail
  nvmet: reset keep alive timer in controller enable
  nvme-rdma: don't override opts->queue_size
  nvme-rdma: Fix command completion race at error recovery
  nvme-rdma: fix possible free of a non-allocated async event buffer
  nvme-rdma: fix possible double free condition when failing to create a controller
2018-06-22 08:45:29 -06:00
Radim Krčmář
5f9077cbae KVM/arm fixes for 4.18, take #1
- Lazy FPSIMD switching fixes
 - Really disable compat ioctls on architectures that don't want it
 - Disable compat on arm64 (it was never implemented...)
 - Rely on architectural requirements for GICV on GICv3
 - Detect bad alignments in unmap_stage2_range
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlss7fgVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDjosQAKG9+NIpcKx15k6Rzw+WcpEwk3IY
 yx47MR9ndgfypQGrJ1bczbaP0xzMW/snErtN2iKGv/q8QdaKI0fj63XFkDMcfp+i
 RdsIrWyXkhATbFvChKvp2jUWPoDKAO+eUYpwKk60QqT/NEzo6JoLyRXcGwIruHoZ
 pzJEZXnmNUTwgAWnJlLGVCJ17lDVd39Gp9/z97MzOyjFCVpfJ61HsEWAtjmiCQr5
 opHcdJ5KBW7XOU4uRqM4QEIruO3Hq56L9VJJWfRf2gdnFQowyMhKNErvuuSBpeBF
 jHA+1nEWbxph97aEEYcnXDQt7wh/o7hZ7kGdGqmW03gJWZdMhXeivlsavXihkxIq
 s3rUGRFhkIJ2+RwsmhSchlkkR9RFTItAAoMsRPwxAGcLA/2CBIPjdUFqbIse8O/h
 mWvXVrYa6uVQQzxJgtbtiOFFVdBOSxr9uE4eq+8vKNkS+3UTs0vUgueB6QtERNtM
 L1uyLGfyZVGuRvvIo5YTGO5vmRQdsBuPi4YtkJpdtks3TDZBw2bsf38ybyvPc963
 0orsS065JiFbh/aCtcMUnD8nPkdfpWVEmSa470bsSNK0f7Wa311fawW+/wGGpGdh
 2Jz/CIm5a29bIE+aR9k9B4QJIvXrX6uCiobfEHjQGJMC1DOJPVFvQhMn2JMnFc/F
 LlulbKhiYbdhxAnS
 =nzFZ
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-for-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

KVM/arm fixes for 4.18, take #1

- Lazy FPSIMD switching fixes
- Really disable compat ioctls on architectures that don't want it
- Disable compat on arm64 (it was never implemented...)
- Rely on architectural requirements for GICV on GICv3
- Detect bad alignments in unmap_stage2_range
2018-06-22 14:56:19 +02:00
Zhenzhong Duan
0218c76626 x86/microcode/intel: Fix memleak in save_microcode_patch()
Free useless ucode_patch entry when it's replaced.

[ bp: Drop the memfree_patch() two-liner. ]

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Srinivas REDDY Eeda <srinivas.eeda@oracle.com>
Link: http://lkml.kernel.org/r/888102f0-fd22-459d-b090-a1bd8a00cb2b@default
2018-06-22 14:42:59 +02:00
Tony Luck
40c36e2741 x86/mce: Fix incorrect "Machine check from unknown source" message
Some injection testing resulted in the following console log:

  mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
  mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
  mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
  mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
  mce: [Hardware Error]: Run the above through 'mcelog --ascii'
  Kernel panic - not syncing: Machine check from unknown source

This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".

The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.

It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.

We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).

[ bp: Remove unneeded severity assignment. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.2
Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
2018-06-22 14:35:50 +02:00
Borislav Petkov
1f74c8a647 x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
mce_no_way_out() does a quick check during #MC to see whether some of
the MCEs logged would require the kernel to panic immediately. And it
passes a struct mce where MCi_STATUS gets written.

However, after having saved a valid status value, the next iteration
of the loop which goes over the MCA banks on the CPU, overwrites the
valid status value because we're using struct mce as storage instead of
a temporary variable.

Which leads to MCE records with an empty status value:

  mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000
  mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10}

In order to prevent the loss of the status register value, return
immediately when severity is a panic one so that we can panic
immediately with the first fatal MCE logged. This is also the intention
of this function and not to noodle over the banks while a fatal MCE is
already logged.

Tony: read the rest of the MCA bank to populate the struct mce fully.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
2018-06-22 14:35:50 +02:00
John Garry
bed9df97b3 irqdesc: Delete irq_desc_get_msi_desc()
Function irq_desc_get_msi_desc() is not referenced in the kernel (and does
not seem to have been referenced since e39758e0ea, 3 years ago), so
delete it.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <marc.zyngier@arm.com>
Cc: <will.deacon@arm.com>
Cc: <kstewart@linuxfoundation.org>
Cc: <julien.thierry@arm.com>
Cc: <andrew@lunn.ch>
Cc: <trivial@kernel.org>
Link: https://lkml.kernel.org/r/1529667333-92959-1-git-send-email-john.garry@huawei.com
2018-06-22 14:22:02 +02:00
Marc Zyngier
82f499c881 irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplug
Enabling LPIs was made a lot stricter recently, by checking that they are
disabled before enabling them. By doing so, the CPU hotplug case was missed
altogether, which leaves LPIs enabled on hotplug off (expecting the CPU to
eventually come back), and won't write a different value anyway on hotplug
on.

So skip that check if that particular case is detected

Fixes: 6eb486b66a ("irqchip/gic-v3: Ensure GICR_CTLR.EnableLPI=0 is observed before enabling")
Reported-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lkml.kernel.org/r/20180622095254.5906-8-marc.zyngier@arm.com
2018-06-22 14:22:02 +02:00
Marc Zyngier
205e065d91 irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collection
Similarily to the SYNC operation, it must be verified that the VPE
targetted by a VLPI is backed by a valid collection in the GIC driver data
structures.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-7-marc.zyngier@arm.com
2018-06-22 14:22:01 +02:00
Marc Zyngier
83559b47cd irqchip/gic-v3-its: Only emit SYNC if targetting a valid collection
It is possible, under obscure circumstances, to convince the ITS driver to
emit a SYNC operation that targets a collection that is not bound to any
redistributor (and the target_address field is zero) because the
corresponding CPU has not been seen yet (the system has been booted with
max_cpus="something small").

If the ITS is using the linear CPU number as the target, this is not a big
deal, as we just end-up issuing a SYNC to CPU0. But if the ITS requires the
physical address of the redistributor (with GITS_TYPER.PTA==1), we end-up
asking the ITS to write to the physical address zero, which is not exactly
a good idea (there has been report of the ITS locking up). This should of
course never happen, but hey, this is SW...

In order to avoid the above disaster, let's track which collections have
been actually initialized, and let's not generate a SYNC if the collection
hasn't been properly bound to a redistributor.  Take this opportunity to
spit our a warning, in the hope that someone may report the issue if it
arrises again.

Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-6-marc.zyngier@arm.com
2018-06-22 14:22:01 +02:00
Yang Yingliang
c1797b11a0 irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
On a NUMA system, if an ITS is local to an offline node, the ITS driver may
pick an offline CPU to bind the LPI.  In this case, pick an online CPU (and
the first one will do).

But on some systems, binding an LPI to non-local node CPU may cause
deadlock (see Cavium erratum 23144).  In this case, just fail the activate
and return an error code.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622095254.5906-5-marc.zyngier@arm.com
2018-06-22 14:22:01 +02:00
Marc Zyngier
cbaf45a6be irqchip/gic-v2m: Fix SPI release on error path
On failing to allocate the required SPIs, the actual number of interrupts
should be freed and not its log2 value.

Fixes: de337ee301 ("irqchip/gic-v2m: Add PCI Multi-MSI support")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-4-marc.zyngier@arm.com
2018-06-22 14:22:00 +02:00
Marc Zyngier
893fbfff97 irqchip/ls-scfg-msi: Fix MSI affinity handling
The ls-scfs-msi driver is not dealing with the effective affinity
as it should. Let's fix that, and make it clear that the effective
affinity is restricted to a single CPU. Also prevent the driver from
messing with the internals of the affinity setting infrastructure.

Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-3-marc.zyngier@arm.com
2018-06-22 14:22:00 +02:00
Marc Zyngier
72a8edc2d9 genirq/debugfs: Add missing IRQCHIP_SUPPORTS_LEVEL_MSI debug
Debug is missing the IRQCHIP_SUPPORTS_LEVEL_MSI debug entry, making debugfs
slightly less useful.

Take this opportunity to also add a missing comment in the definition of
IRQCHIP_SUPPORTS_LEVEL_MSI.

Fixes: 6988e0e0d2 ("genirq/msi: Limit level-triggered MSI to platform devices")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-2-marc.zyngier@arm.com
2018-06-22 14:22:00 +02:00