Commit Graph

618949 Commits

Author SHA1 Message Date
Nicolas Dichtel
63c43787d3 vti6: fix input path
Since commit 1625f45299, vti6 is broken, all input packets are dropped
(LINUX_MIB_XFRMINNOSTATES is incremented).

XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 is set by vti6_rcv() before calling
xfrm6_rcv()/xfrm6_rcv_spi(), thus we cannot set to NULL that value in
xfrm6_rcv_spi().

A new function xfrm6_rcv_tnl() that enables to pass a value to
xfrm6_rcv_spi() is added, so that xfrm6_rcv() is not touched (this function
is used in several handlers).

CC: Alexey Kodanev <alexey.kodanev@oracle.com>
Fixes: 1625f45299 ("net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2016-09-21 10:09:14 +02:00
Nikolay Aleksandrov
b5036cd4ed ipmr, ip6mr: return lastuse relative to now
When I introduced the lastuse member I made a subtle error because it was
returned as an absolute value but that is meaningless to user-space as it
doesn't allow to see how old exactly an entry is. Let's make it similar to
how the bridge returns such values and make it relative to "now" (jiffies).
This allows us to show the actual age of the entries and is much more
useful (e.g. user-space daemons can age out entries, iproute2 can display
the lastuse properly).

Fixes: 43b9e12740 ("net: ipmr/ip6mr: add support for keeping an entry age")
Reported-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:58:23 -04:00
David S. Miller
493d5f6db0 Merge branch 'r8152-phy-fixes'
Hayes Wang says:

====================
r8152: correct the flow of PHY

First, to enable the PHY as early as possible. Some settings may fail if the
PHY is power down.

Move the other PHY settings to hw_phy_cfg() to make sure the order is correct.

Finally, disable ALDPS and EEE before updating the PHY for RTL8153.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:53:53 -04:00
hayeswang
d768c61bc3 r8152: disable ALDPS and EEE before setting PHY
Disable ALDPS and EEE to avoid the possible failure when setting the PHY.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:53:47 -04:00
hayeswang
af0287ec10 r8152: remove r8153_enable_eee
Remove r8153_enable_eee().

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:53:47 -04:00
hayeswang
ef39df8eab r8152: move PHY settings to hw_phy_cfg
Move the PHY relative settings together to hw_phy_cfg().

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:53:47 -04:00
hayeswang
2dd436daac r8152: move enabling PHY
Move enabling PHY to init(), otherwise some other settings may fail.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:53:47 -04:00
hayeswang
e644953982 r8152: move some functions
Move the following functions forward.

	r8152_mmd_indirect()
	r8152_mmd_read()
	r8152_mmd_write()
	r8152_eee_en()
	r8152b_enable_eee()
	r8153_eee_en()
	r8153_enable_eee()
	r8152b_enable_fc()
	r8153_aldps_en()

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:53:46 -04:00
Hariprasad Shenai
9b86a8d19b cxgb4/cxgb4vf: Allocate more queues for 25G and 100G adapter
We were missing check for 25G and 100G while checking port speed,
which lead to less number of queues getting allocated for 25G & 100G
adapters and leading to low throughput. Adding the missing check for
both NIC and vNIC driver.

Also fixes port advertisement for 25G and 100G in ethtool output.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:48:09 -04:00
Russell Currey
b79331a5eb powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment
Commit 5958d19a14 checks for prefetchable m64 BARs by comparing the
addresses instead of using resource flags.  This broke SR-IOV as the m64
check in pnv_pci_ioda_fixup_iov_resources() fails.

The condition in pnv_pci_window_alignment() also changed to checking
only IORESOURCE_MEM_64 instead of both IORESOURCE_MEM_64 and
IORESOURCE_PREFETCH.

Revert these cases to the previous behaviour, adding a new helper function
to do so.  This is named pnv_pci_is_m64_flags() to make it clear this
function is only looking at resource flags and should not be relied on for
non-SRIOV resources.

Fixes: 5958d19a14 ("Fix incorrect PE reservation attempt on some 64-bit BARs")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-21 14:04:13 +10:00
David S. Miller
ceb16a9013 linux-can-fixes-for-4.8-20160919
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCgAGBQJX3/GcAAoJED07qiWsqSVq9PgH/jucrHh2H7IAnJi6ZasZmu1t
 +beOkzg954rJfz2cGulhpNBAl9xJYRhPuJ+Jw5JhDEPXVisgeRVm6lT+8YJaCsWL
 utTotzyRYI3HUrXIl6V2rjn1ms7TAlZsHrAGPj1ufJKEZg8/SF5C15NuY9YTlh4d
 JZT0V13Yxj0CV3d2jqZshFXrF+CiAsGyQF1aGA9Wo2xQeZIt2vY/1DfNtC9eXlXW
 +t7tBCgLka7cwTBC513BKjl9fGwzYO0eYU993N9+Ijir6HLM9N3igZwxjH1hEl/S
 ow6n5T1O89JEnghSN5Xq+2ThkkXR7zZwi+6l74eFICkIwHK/pekZ8dPr9Ct9ZZ0=
 =JcjG
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-4.8-20160919' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2016-09-19

this is a pull request of one patch for the upcoming linux-4.8 release.

The patch by Fabio Estevam fixes the pm handling in the flexcan driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 22:46:14 -04:00
Linus Torvalds
7d1e042314 - expand the arm64 vmalloc check to include skipping the module space too
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJX4cLAAAoJEIly9N/cbcAmJnIP/j2Impla2LlKHrZ3eRdy8wxB
 J786/g60VFR+Yu8LZ+q9VKD+SnYIHonqQf/7QoRAK02F1uq15TxQXJqqY3R+PAvS
 pW2wYuatwqSrfF5ywiB9ffDo3h7B6rq6SdwZ4voEOOylIeZgga83DrhhmnXx1mCU
 WUikCvvsuhfQctA3MUDB6T9P/oEB3t0m8BVhny6N0g1agD0992Dlq/CHi2DXsm6J
 fbrlbkRg2y1H3K4F9C+5RqwN/bjWxUAVqH1UoU4L/kJKSsCyYfmxT3nAo52D4/Z4
 erZ30wRmgGenuIR1MrBugxR0a2M6AoBJvB5V1s7/etCNYi7x8UUEMxJbxsUov493
 4y91cp1OOSWTXLqKdq/NWOepC1VBJiQy6jL2Oie/8SX9dKucSg2SvcHQ3tDUKjmR
 iYeF3JmPkDlWunSAv9GyIoeLg6aqLg1s1ksgQolQtJI5URn/QAFkBAD+dUDSDRZc
 KKqWkZHG7nOUvMw0jQ+PsqJf20RVlHnlIOHR11u5IRwkGBEL/dSrGQZFx8F2FerN
 51wJKAwAcFoG5Hod6gf7Hl2ZEROMkrNUkFvZ9CDtqU4vRtcgCrBAWFDsEhqw4C9A
 3lVPVzeWFfPIK85pPF0Fj/LPSRBGbTu1rHMEcflUJhZrkIhzGYdvbsC2QauDModv
 5+oSyh7iwVy09eKtaA5U
 =tEF0
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull usercopy hardening fix from Kees Cook:
 "Expand the arm64 vmalloc check to include skipping the module space
  too"

* tag 'usercopy-v4.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  mm: usercopy: Check for module addresses
2016-09-20 17:11:19 -07:00
Al Viro
e23d4159b1 fix fault_in_multipages_...() on architectures with no-op access_ok()
Switching iov_iter fault-in to multipages variants has exposed an old
bug in underlying fault_in_multipages_...(); they break if the range
passed to them wraps around.  Normally access_ok() done by callers will
prevent such (and it's a guaranteed EFAULT - ERR_PTR() values fall into
such a range and they should not point to any valid objects).

However, on architectures where userland and kernel live in different
MMU contexts (e.g. s390) access_ok() is a no-op and on those a range
with a wraparound can reach fault_in_multipages_...().

Since any wraparound means EFAULT there, the fix is trivial - turn
those

    while (uaddr <= end)
	    ...
into

    if (unlikely(uaddr > end))
	    return -EFAULT;
    do
	    ...
    while (uaddr <= end);

Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Cc: stable@vger.kernel.org # v3.5+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-20 16:44:28 -07:00
Laura Abbott
aa4f060111 mm: usercopy: Check for module addresses
While running a compile on arm64, I hit a memory exposure

usercopy: kernel memory exposure attempt detected from fffffc0000f3b1a8 (buffer_head) (1 bytes)
------------[ cut here ]------------
kernel BUG at mm/usercopy.c:75!
Internal error: Oops - BUG: 0 [#1] SMP
Modules linked in: ip6t_rpfilter ip6t_REJECT
nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp
llc ebtable_nat ip6table_security ip6table_raw ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
iptable_security iptable_raw iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
ebtable_filter ebtables ip6table_filter ip6_tables vfat fat xgene_edac
xgene_enet edac_core i2c_xgene_slimpro i2c_core at803x realtek xgene_dma
mdio_xgene gpio_dwapb gpio_xgene_sb xgene_rng mailbox_xgene_slimpro nfsd
auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sdhci_of_arasan
sdhci_pltfm sdhci mmc_core xhci_plat_hcd gpio_keys
CPU: 0 PID: 19744 Comm: updatedb Tainted: G        W 4.8.0-rc3-threadinfo+ #1
Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016
task: fffffe03df944c00 task.stack: fffffe00d128c000
PC is at __check_object_size+0x70/0x3f0
LR is at __check_object_size+0x70/0x3f0
...
[<fffffc00082b4280>] __check_object_size+0x70/0x3f0
[<fffffc00082cdc30>] filldir64+0x158/0x1a0
[<fffffc0000f327e8>] __fat_readdir+0x4a0/0x558 [fat]
[<fffffc0000f328d4>] fat_readdir+0x34/0x40 [fat]
[<fffffc00082cd8f8>] iterate_dir+0x190/0x1e0
[<fffffc00082cde58>] SyS_getdents64+0x88/0x120
[<fffffc0008082c70>] el0_svc_naked+0x24/0x28

fffffc0000f3b1a8 is a module address. Modules may have compiled in
strings which could get copied to userspace. In this instance, it
looks like "." which matches with a size of 1 byte. Extend the
is_vmalloc_addr check to be is_vmalloc_or_module_addr to cover
all possible cases.

Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2016-09-20 16:07:39 -07:00
Josh Poimboeuf
71f5443ebb x86/dumpstack: Fix show_stack() task pointer regression
With the following commit:

  e18bcccd1a ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder")

The task pointer argument to show_stack_log_lvl() in show_stack() was
inadvertently changed to 'current'.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: byungchul.park@lge.com
Cc: fweisbec@gmail.com
Cc: keescook@chromium.org
Cc: linux-tip-commits@vger.kernel.org
Cc: luto@amacapital.net
Cc: nilayvaish@gmail.com
Cc: rostedt@goodmis.org
Cc: tip-bot for Josh Poimboeuf <tipbot@zytor.com>
Fixes: e18bcccd1a ("x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder")
Link: http://lkml.kernel.org/r/20160920155340.yhewlx7vmgmov5fb@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 23:36:37 +02:00
Ingo Molnar
89f1c2c59c perf/core improvements and fixes:
User visible:
 
 - Support event group view with hierarchy mode in 'perf top' and 'perf report'
   (Namhyung Kim)
 
   e.g.:
 
   $ perf record -e '{cycles,instructions}' make
   $ perf report --hierarchy --stdio
   ...
   #               Overhead  Command / Shared Object / Symbol
   # ......................  ..................................
   ...
       25.74%  27.18%        sh
          19.96%  24.14%        libc-2.24.so
             9.55%  14.64%        [.] __strcmp_sse2
             1.54%   0.00%        [.] __tfind
             1.07%   1.13%        [.] _int_malloc
             0.95%   0.00%        [.] __strchr_sse2
             0.89%   1.39%        [.] __tsearch
             0.76%   0.00%        [.] strlen
 
 - Fix the dwarf regs table for x86_64, adding a missing % to the "%di"
   register, noticed with a failing 'perf test bpf' (Arnaldo Carvalho de Melo)
 
 - Fix handling of mmap parameters in the 'perf trace' beautifier in
   architectures that don't have the same mappings as x86_64 (Wang Nan)
 
 - Handle hugetbl mappings in older systems running new kernels (Wang Nan)
 
 - Resolve 'call' operands in 'annotate', that when using /proc/kcore
   were appearing just as hexadecimal addresses, to function names
   (Arnaldo Carvalho de Melo)
 
 - Fix width computation for srcline sort entry (Jiri Olsa)
 
 - Do not ignore call instruction with indirect target in 'annotate'
   (Ravi Bangoria)
 
 - Handle MADV_FREE in the madvise 'trace' beautifier (Wang Nan)
 
 - Fix build of 'perf trace' mman beautifier in !x86_64 (Wang Nan)
 
 Infrastructure:
 
 - Add infrastructure for PMU specific configuration, allowing to pass
   config variables directly to the kernel PMU driver, prefixing those
   variables with a '@', part of a larger series to support Coresight (Mathieu Poirier)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJX4Y1JAAoJENZQFvNTUqpAVTMP/3EQ6Riqleasg5m7UU7adyTZ
 h4Oaq8KT1ON8VmR8gf6czZjzCE2Oiy9d1iUupza5iqau0wLF5IKPLdf9E0P9JA6n
 KPXgnAwq/ydIPQTSLWRgcDRyk2A61VDBBib/9pKBGK2Rkam4d/Eh3eD4AawyVkHd
 hUAdVPNTx5m3Xv/keySn/fcHAYaHCSnI1hNa8OMnQHkdkElak3akRzrpVmGHLz8j
 NAzCutmlBSLFNMVx6Kxa+gR3c0GwMcIMtpq5iTtvtqgf76i6evjjuJeY8hQF4SqJ
 lWA3pzdhP9lxD81gxbZzP9Vq0G6mifc8q9iJE4pM13NXWhQ6sneI0s2ERlkXCRSS
 16Subg4yHp6DP6KRvmsS4kKCeJauHJ5wiWs6JtPPTiZv0V9h7MLKc7QMZIKjhKhO
 WVb2UUvSCOMgiu+2xLo1P/vq1DoJriM05Q90E+uqVrdy4c7ptiP5nq6vP6e98uD3
 SsosnsMjX9UiAqYAYLiRxulsb6F1xhNHHV5Iw15LKyzdCRu1UNLeuOJ14FenBU58
 S+gKD0MdVk2qCChh0klbGcleomNgt4hCnPgLfiAwKxj352toxpThkFQ/ZaBRhWKL
 12eid79tGz5BPxOHg0pyBX50fE1HCww95DpTQS1r2bqzh1ijecxwY5w/4PVq+ZIP
 al/4gotCYp9Wv90miLi8
 =4bBB
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-20160920' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- Support event group view with hierarchy mode in 'perf top' and 'perf report'
  (Namhyung Kim)

  e.g.:

  $ perf record -e '{cycles,instructions}' make
  $ perf report --hierarchy --stdio
  ...
  #               Overhead  Command / Shared Object / Symbol
  # ......................  ..................................
  ...
      25.74%  27.18%        sh
         19.96%  24.14%        libc-2.24.so
            9.55%  14.64%        [.] __strcmp_sse2
            1.54%   0.00%        [.] __tfind
            1.07%   1.13%        [.] _int_malloc
            0.95%   0.00%        [.] __strchr_sse2
            0.89%   1.39%        [.] __tsearch
            0.76%   0.00%        [.] strlen

- Fix the dwarf regs table for x86_64, adding a missing % to the "%di"
  register, noticed with a failing 'perf test bpf' (Arnaldo Carvalho de Melo)

- Fix handling of mmap parameters in the 'perf trace' beautifier in
  architectures that don't have the same mappings as x86_64 (Wang Nan)

- Handle hugetbl mappings in older systems running new kernels (Wang Nan)

- Resolve 'call' operands in 'annotate', that when using /proc/kcore
  were appearing just as hexadecimal addresses, to function names
  (Arnaldo Carvalho de Melo)

- Fix width computation for srcline sort entry (Jiri Olsa)

- Do not ignore call instruction with indirect target in 'annotate'
  (Ravi Bangoria)

- Handle MADV_FREE in the madvise 'trace' beautifier (Wang Nan)

- Fix build of 'perf trace' mman beautifier in !x86_64 (Wang Nan)

Infrastructure changes:

- Add infrastructure for PMU specific configuration, allowing to pass
  config variables directly to the kernel PMU driver, prefixing those
  variables with a '@', part of a larger series to support Coresight (Mathieu Poirier)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 23:32:02 +02:00
Paul Burton
e875bd66df irqchip/mips-gic: Fix local interrupts
Since the device hierarchy domain was added by commit c98c1822ee
("irqchip/mips-gic: Add device hierarchy domain"), GIC local interrupts
have been broken.

Users attempting to setup a per-cpu local IRQ, for example the GIC timer
clock events code in drivers/clocksource/mips-gic-timer.c, the
setup_percpu_irq function would refuse with -EINVAL because the GIC
irqchip driver never called irq_set_percpu_devid so the
IRQ_PER_CPU_DEVID flag was never set for the IRQ. This happens because
irq_set_percpu_devid was being called from the gic_irq_domain_map
function which is no longer called.

Doing only that runs into further problems because gic_dev_domain_alloc
set the struct irq_chip for all interrupts, local or shared, to
gic_level_irq_controller despite that only being suitable for shared
interrupts. The typical outcome of this is that gic_level_irq_controller
callback functions are called for local interrupts, and then hwirq
number calculations overflow & the driver ends up attempting to access
some invalid register with an address calculated from an invalid hwirq
number. Best case scenario is that this then leads to a bus error. This
is fixed by abstracting the setup of the hwirq & chip to a new function
gic_setup_dev_chip which is used by both the root GIC IRQ domain & the
device domain.

Finally, decoding local interrupts failed because gic_dev_domain_alloc
only called irq_domain_alloc_irqs_parent for shared interrupts. Local
ones were therefore never associated with hwirqs in the root GIC IRQ
domain and the virq in gic_handle_local_int would always be 0. This is
fixed by calling irq_domain_alloc_irqs_parent unconditionally & having
gic_irq_domain_alloc handle both local & shared interrupts, which is
easy due to the aforementioned abstraction of chip setup into
gic_setup_dev_chip.

This fixes use of the MIPS GIC timer for clock events, which has been
broken since c98c1822ee ("irqchip/mips-gic: Add device hierarchy
domain") but hadn't been noticed due to a silent fallback to the MIPS
coprocessor 0 count/compare clock events device.

Fixes: c98c1822ee ("irqchip/mips-gic: Add device hierarchy domain")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: stable@vger.kernel.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20160913165335.31389-1-paul.burton@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 23:20:02 +02:00
Jiri Olsa
df04abfd18 fs/proc/kcore.c: Add bounce buffer for ktext data
We hit hardened usercopy feature check for kernel text access by reading
kcore file:

  usercopy: kernel memory exposure attempt detected from ffffffff8179a01f (<kernel text>) (4065 bytes)
  kernel BUG at mm/usercopy.c:75!

Bypassing this check for kcore by adding bounce buffer for ktext data.

Reported-by: Steve Best <sbest@redhat.com>
Fixes: f5509cc18d ("mm: Hardened usercopy")
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-20 13:32:49 -07:00
Jiri Olsa
f5beeb1851 fs/proc/kcore.c: Make bounce buffer global for read
Next patch adds bounce buffer for ktext area, so it's
convenient to have single bounce buffer for both
vmalloc/module and ktext cases.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-20 13:32:49 -07:00
Jiri Olsa
3c028a0cb5 perf symbols: Do not open device files
The dso__read_binary_type_filename gets the dso's file name to open. We
need to check it for regular file before trying to open it, otherwise we
might get stuck with device file.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160920161245.GA8995@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20 16:20:21 -03:00
Namhyung Kim
e3b60bc93d perf hists: Factor out hists__reset_column_width()
The stdio and tui has same code to reset hpp format column width.
Factor it out as a new function.

Suggested-and-Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160920053025.13989-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20 16:13:37 -03:00
Namhyung Kim
5ff3e7a224 perf ui/tui: Reset output width for hierarchy
When --hierarchy option is used, each entry has its own hpp_list to show
the result.  But it missed to update width of each column.

Before:

  - 46.29% 48.12%        netctl-auto
     + 31.44% 29.25%        [kernel.vmlinux]
     + 8.52% 11.55%        libc-2.22.so
     + 5.19% 6.91%        bash
  + 10.75% 11.83%        wpa_cli
  + 8.25% 2.23%        swapper
  + 6.45% 5.40%        tr
  + 4.81% 8.09%        awk
  + 4.15% 2.85%        firefox
  + 3.86% 2.53%        sh

After:

  -  46.29%  48.12%        netctl-auto
      +  31.44%  29.25%        [kernel.vmlinux]
      +   8.52%  11.55%        libc-2.22.so
      +   5.19%   6.91%        bash
  +  10.75%  11.83%        wpa_cli
  +   8.25%   2.23%        swapper
  +   6.45%   5.40%        tr
  +   4.81%   8.09%        awk
  +   4.15%   2.85%        firefox
  +   3.86%   2.53%        sh

Committer note:

Full testing instructions:

1) Record with an event group:

  $ perf record -e '{cycles,instructions}' make -j4

2) Use report in hierarchy mode, to get a few expanded trees on
   the same screen, use --percent-limit:

  $ perf report --hierarchy --percent-limit 0.5

Samples: 103K of event 'anon group { cycles:u, instructions:u }',
Event count (approx.): 57317631725
         Overhead        Command / Shared Object / Symbol        ◆
-  58.89%  55.12%        cc1                                     ▒
   -  50.26%  48.10%        cc1                                  ▒
          3.61%   5.13%        [.] _cpp_lex_token                ▒
          2.58%   0.78%        [.] ht_lookup_with_hash           ▒
          1.31%   1.30%        [.] ggc_internal_alloc            ▒
          1.08%   2.25%        [.] get_combined_adhoc_loc        ▒
          1.01%   1.95%        [.] ira_init                      ▒
          0.96%   1.78%        [.] linemap_position_for_column   ▒
          0.65%   1.01%        [.] cpp_get_token_with_location   ▒
   -   7.52%   6.58%        libc-2.23.so                         ▒
          1.70%   1.78%        [.] _int_malloc                   ▒
          0.69%   0.75%        [.] _int_free                     ▒
          0.67%   0.42%        [.] malloc_consolidate            ▒
   -   0.58%   0.42%        ld-2.23.so                           ▒
                               no entry >= 0.50%                 ▒
   -   0.52%   0.03%        [kernel.vmlinux]                     ▒
                               no entry >= 0.50%                 ▒

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 1b2dbbf41a ("perf hists: Use own hpp_list for hierarchy mode")
Link: http://lkml.kernel.org/r/20160920053025.13989-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20 16:08:30 -03:00
Arnaldo Carvalho de Melo
5f62d4fd35 perf annotate: Resolve 'call' operands to function names
Before this patch the '_raw_spin_lock_irqsave' and 'update_rq_clock' operands
were appearing just as hexadecimal numbers:

  update_blocked_averages  /proc/kcore
       │       push   %r12
       │       push   %rbx
       │       and    $0xfffffffffffffff0,%rsp
       │       sub    $0x40,%rsp
       │       add    -0x662cac00(,%rdi,8),%rax
       │       mov    %rax,%rbx
       │       mov    %rax,%rdi
       │       mov    %rax,0x38(%rsp)
       │     → callq  _raw_spin_lock_irqsave
       │       mov    %rbx,%rdi
       │       mov    %rax,0x30(%rsp)
       │     → callq  update_rq_clock
       │       mov    0x8d0(%rbx),%rax
       │       lea    0x8d0(%rbx),%r11

To check that all is right one can always use the 'o' hotkey and see
the original objdump -dS output, that for this case is:

  update_blocked_averages  /proc/kcore
       │ffffffff990d5489:   push   %r12
       │ffffffff990d548b:   push   %rbx
       │ffffffff990d548c:   and    $0xfffffffffffffff0,%rsp
       │ffffffff990d5490:   sub    $0x40,%rsp
       │ffffffff990d5494:   add    -0x662cac00(,%rdi,8),%rax
       │ffffffff990d549c:   mov    %rax,%rbx
       │ffffffff990d549f:   mov    %rax,%rdi
       │ffffffff990d54a2:   mov    %rax,0x38(%rsp)
       │ffffffff990d54a7: → callq  0xffffffff997eb7a0
       │ffffffff990d54ac:   mov    %rbx,%rdi
       │ffffffff990d54af:   mov    %rax,0x30(%rsp)
       │ffffffff990d54b4: → callq  0xffffffff990c7720
       │ffffffff990d54b9:   mov    0x8d0(%rbx),%rax
       │ffffffff990d54c0:   lea    0x8d0(%rbx),%r11

Use the 'h' hotkey to see a list of available hotkeys.

More work needed to cover operands for other instructions, such as 'mov',
that can resolve variable names, etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xqgtw9mzmzcjgwkis9kiiv1p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20 12:28:30 -03:00
Arnaldo Carvalho de Melo
bff5c30613 perf annotate: Pass the symbol's map/dso to the instruction parsers
So that things like:

       → callq  0xffffffff993e3230

found while disassembling /proc/kcore can be beautified by later
patches, that will resolve that address to a function, looking it up in
/proc/kallsyms.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-p76myuke4j7gplg54amaklxk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20 12:28:29 -03:00
Ravi Bangoria
88a7fcf961 perf annotate: Do not ignore call instruction with indirect target
Do not ignore call instruction with indirect target when its already
identified as a call. This is an extension of commit e8ea156195 ("perf
annotate: Use raw form for register indirect call instructions") to
generalize annotation for all instructions with indirect calls.

This is needed for certain powerpc call instructions that use address in
a register (such as bctrl, btarl, ...).

Apart from that, when kcore is used to disassemble function, all call
instructions were ignored. This patch will fix it as a side effect by
not ignoring them. For example,

Before (with kcore):
       mov    %r13,%rdi
       callq  0xffffffff811a7e70
     ^ jmpq   64
       mov    %gs:0x7ef41a6e(%rip),%al

After (with kcore):
       mov    %r13,%rdi
     > callq  0xffffffff811a7e70
     ^ jmpq   64
       mov    %gs:0x7ef41a6e(%rip),%al

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
[Suggested about 'bctrl' instruction]
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Riyder <chris.ryder@arm.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1471611578-11255-5-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20 12:28:29 -03:00
Jiri Olsa
f666ac0dab perf hists: Fix width computation for srcline sort entry
Adding header size to width computation for srcline sort entry,
because it's possible to get empty data with ':0' which set width
of 2 which is lower than width needed to display column header.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1474290610-23241-62-git-send-email-jolsa@kernel.org
[ Added declaration to sort.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-20 12:28:28 -03:00
Ingo Molnar
2ab78a724b * Fix a boot crash reported by Mike Galbraith and Mike Krinkin. The
new EFI memory map reservation code didn't align reservations to
    EFI_PAGE_SIZE boundaries causing bogus regions to be inserted into
    the global EFI memory map - Matt Fleming
 -----BEGIN PGP SIGNATURE-----
 
 iQI2BAABCAAgBQJX4UshGRxtYXR0QGNvZGVibHVlcHJpbnQuY28udWsACgkQLzhZ
 wI0jPVX2RA/9HiT48K98R1fkiig/V3Wh8H8y7lQbwO+qA1WZ7q/D+DYau62qI3zM
 sLgTSoV2zzjBr+4JWuGIdbtpYBjhHci0HwEcOGp5EFSZmgJGOpc7VY2VdFaYyZ2X
 xhowGLXv2jgfKoAPt9MfgZX2Qh5Jt4CRlC+UcilUJp0wy1+uD4W7XmrLM1Rg7Ug2
 8XTL/BA5QxQa3wx8+/z6xV/ADlhzf3DQ7BwI3qkx5RHSXSuy1/E2XGTnyuDt/7Oo
 KmMwQGGrrv92uSjyq5vJ53DuNxXDJ7nGtlXTT2u0WtcZbzJXsXIaCZItbCA5rqW/
 wWGgsz5XSc406m1WNv8Zs8xx9NAEP/+KXQfftwuQHFqoylG4QqYuO060V5XOxQlh
 cdRuyVUgMONZAvTzEAUC+5SaN3c+WDCL8oEnrs/tdMxSzBL9Rj7yaHMW+ozVe5Nf
 GFxMnO0l5SG9+o8hSAkkn5RDfIDzSTC1W5imqWJQXhC2BXHEdmh4wVWIG4QgPM/K
 gqaTpX72Nu04lBjCXXpfKSK5Xqv67hZOzxUDkkjxtq1PtDbL1dPpzAF3s3GFdreI
 FfQgetg4jM4EPnWDQWqWVJxQUv+uCfpDV9g5y9kMinVb2OpyGjcZpNckGHKnqIXr
 hyntPYqyy793W6dPGEjTqfcVP751qisMjp6iIIqk3h+qr7HyJSF5nqs=
 =S9gJ
 -----END PGP SIGNATURE-----

Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/core

Pull EFI fix from Matt Fleming:

 * Fix a boot crash reported by Mike Galbraith and Mike Krinkin. The
   new EFI memory map reservation code didn't align reservations to
   EFI_PAGE_SIZE boundaries causing bogus regions to be inserted into
   the global EFI memory map (Matt Fleming)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 16:59:15 +02:00
Ingo Molnar
41a66072c3 Merge branch 'efi/urgent' into efi/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 16:58:59 +02:00
Ingo Molnar
7597cdc066 * Fix a boot hang on large memory machines (multiple terabyte) caused
by type conversion errors in the x86 pat code - Matt Fleming
 -----BEGIN PGP SIGNATURE-----
 
 iQI2BAABCAAgBQJX4UkgGRxtYXR0QGNvZGVibHVlcHJpbnQuY28udWsACgkQLzhZ
 wI0jPVWaoA/9EqgJqgXP5JSoV5EFzGA0uyE65J46hhmpqry0gnJ0Jv4lf9nWVShI
 iYm3VRW2bIjF6sNKw/Z9ZFzhXY96bn3axlPwoMbwsuCJoWHwIG0dYkI+N4oQGcaR
 7xX5hrnWeRPdU28wTZNyvqWBojXJcY2Cv3mg+/1/wByL7wABljcAxmr3a0Vk1j/R
 UsAdWKz0AFYMnKKhfwcmIwTRDFXlzmCVNP1luycFNjVXALW7vHOIav7Wzn7ZMCfv
 Nd6FYlabUm5Nq/o8o7QD/KniBj/0maurBq2jnbtV/Iim6OsMpN62AsXYY06QNtrp
 IlrDb4RIYJa+Oi4VDpjoEZicBlVuEBBJaziS3bPMUlswkO2S/v7LvddIqqEm/K1V
 rqLTpcx+O66LODs1GHz2hDd/T8I5lHsR8F9ggmboQj8kTze4kowpeN4kuTB91+oK
 4gybkXiB7olrjdWCxPe2UE/2c325nHsSmuObDmfJjTxCn1zsbJY233oXLLfaiMgn
 y4URazGH7m2kTAfjbwfJQKLMAz9AOItxaY4f2nIO6BM/ZaNM59CyGQUKU9muhecd
 p/e8qJtyP4narwtDFWm5XMUybS3W46F5jAAux1AGi83vhpfbVjnvajm5wzeonTq3
 pAS3NBWMuJ0xHQsq/zkI+S5bvzErsQKwV0p+tNhpKSb0n72tDaN+CYk=
 =qMBL
 -----END PGP SIGNATURE-----

Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/urgent

Pull EFI fixes from Matt Fleming:

 * Fix a boot hang on large memory machines (multiple terabyte) caused
   by type conversion errors in the x86 PAT code (Matt Fleming)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 16:56:56 +02:00
Matt Fleming
92dc33501b x86/efi: Round EFI memmap reservations to EFI_PAGE_SIZE
Mike Galbraith reported that his machine started rebooting during boot
after,

  commit 8e80632fb2 ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")

The ESRT table on his machine is 56 bytes and at no point in the
efi_arch_mem_reserve() call path is that size rounded up to
EFI_PAGE_SIZE, nor is the start address on an EFI_PAGE_SIZE boundary.

Since the EFI memory map only deals with whole pages, inserting an EFI
memory region with 56 bytes results in a new entry covering zero
pages, and completely screws up the calculations for the old regions
that were trimmed.

Round all sizes upwards, and start addresses downwards, to the nearest
EFI_PAGE_SIZE boundary.

Additionally, efi_memmap_insert() expects the mem::range::end value to
be one less than the end address for the region.

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Reported-by: Mike Krinkin <krinkin.m.u@gmail.com>
Tested-by: Mike Krinkin <krinkin.m.u@gmail.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-20 15:43:31 +01:00
Sebastian Andrzej Siewior
f1e1c9e5e3 perf/x86/intel/bts: Make sure debug store is valid
Since commit 4d4c474124 ("perf/x86/intel/bts: Fix BTS PMI detection")
my box goes boom on boot:

| .... node  #0, CPUs:      #1 #2 #3 #4 #5 #6 #7
| BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
| IP: [<ffffffff8100c463>] intel_bts_interrupt+0x43/0x130
| Call Trace:
|  <NMI> d [<ffffffff8100b341>] intel_pmu_handle_irq+0x51/0x4b0
|  [<ffffffff81004d47>] perf_event_nmi_handler+0x27/0x40

This happens because the code introduced in this commit dereferences the
debug store pointer unconditionally. The debug store is not guaranteed to
be available, so a NULL pointer check as on other places is required.

Fixes: 4d4c474124 ("perf/x86/intel/bts: Fix BTS PMI detection")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: vince@deater.net
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20160920131220.xg5pbdjtznszuyzb@breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 16:06:09 +02:00
Matt Fleming
1297667083 x86/efi: Only map RAM into EFI page tables if in mixed-mode
Waiman reported that booting with CONFIG_EFI_MIXED enabled on his
multi-terabyte HP machine results in boot crashes, because the EFI
region mapping functions loop forever while trying to map those
regions describing RAM.

While this patch doesn't fix the underlying hang, there's really no
reason to map EFI_CONVENTIONAL_MEMORY regions into the EFI page tables
when mixed-mode is not in use at runtime.

Reported-by: Waiman Long <waiman.long@hpe.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
CC: Theodore Ts'o <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Cc: <stable@vger.kernel.org> # v4.6+
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-20 14:53:04 +01:00
Matt Fleming
e535ec0899 x86/mm/pat: Prevent hang during boot when mapping pages
There's a mixture of signed 32-bit and unsigned 32-bit and 64-bit data
types used for keeping track of how many pages have been mapped.

This leads to hangs during boot when mapping large numbers of pages
(multiple terabytes, as reported by Waiman) because those values are
interpreted as being negative.

commit 742563777e ("x86/mm/pat: Avoid truncation when converting
cpa->numpages to address") fixed one of those bugs, but there is
another lurking in __change_page_attr_set_clr().

Additionally, the return value type for the populate_*() functions can
return negative values when a large number of pages have been mapped,
triggering the error paths even though no error occurred.

Consistently use 64-bit types on 64-bit platforms when counting pages.
Even in the signed case this gives us room for regions 8PiB
(pebibytes) in size whilst still allowing the usual negative value
error checking idiom.

Reported-by: Waiman Long <waiman.long@hpe.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
CC: Theodore Ts'o <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
2016-09-20 14:53:00 +01:00
Yuval Mintz
67a99b7061 qed: Fix stack corruption on probe
Commit fe56b9e6a8 ("qed: Add module with basic common support")
has introduced a stack corruption during probe, where filling a
local struct with data to be sent to management firmware is incorrectly
filled; The data is written outside of the struct and corrupts
the stack.

Changes from v1:
----------------
 - Correct the value written [Caught by David Laight]

Fixes: fe56b9e6a8 ("qed: Add module with basic common support")
Signed-off-by: Yuval Mintz <Yuval.Mintz@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 04:57:16 -04:00
Paul Gortmaker
0edfa83916 arm64: migrate exception table users off module.h and onto extable.h
These files were only including module.h for exception table
related functions.  We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-20 09:36:21 +01:00
Andrew Lunn
3ed6e498b9 MAINTAINERS: Add an entry for the core network DSA code
The core distributed switch architecture code currently does not have
a MAINTAINERS entry, which results in some contributions not landing
in the right peoples inbox.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 03:54:59 -04:00
Vincent Bernat
a435a07f91 net: ipv6: fallback to full lookup if table lookup is unsuitable
Commit 8c14586fc3 ("net: ipv6: Use passed in table for nexthop
lookups") introduced a regression: insertion of an IPv6 route in a table
not containing the appropriate connected route for the gateway but which
contained a non-connected route (like a default gateway) fails while it
was previously working:

    $ ip link add eth0 type dummy
    $ ip link set up dev eth0
    $ ip addr add 2001:db8::1/64 dev eth0
    $ ip route add ::/0 via 2001:db8::5 dev eth0 table 20
    $ ip route add 2001:db8:cafe::1/128 via 2001:db8::6 dev eth0 table 20
    RTNETLINK answers: No route to host
    $ ip -6 route show table 20
    default via 2001:db8::5 dev eth0  metric 1024  pref medium

After this patch, we get:

    $ ip route add 2001:db8:cafe::1/128 via 2001:db8::6 dev eth0 table 20
    $ ip -6 route show table 20
    2001:db8:cafe::1 via 2001:db8::6 dev eth0  metric 1024  pref medium
    default via 2001:db8::5 dev eth0  metric 1024  pref medium

Fixes: 8c14586fc3 ("net: ipv6: Use passed in table for nexthop lookups")
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-20 03:28:29 -04:00
Josh Poimboeuf
c8fe460982 x86/dumpstack: Remove dump_trace() and related callbacks
All previous users of dump_trace() have been converted to use the new
unwind interfaces, so we can remove it and the related
print_context_stack() and print_context_stack_bp() callback functions.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5b97da3572b40b5a4d8e185cf2429308d0987a13.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:34 +02:00
Josh Poimboeuf
e18bcccd1a x86/dumpstack: Convert show_trace_log_lvl() to use the new unwinder
Convert show_trace_log_lvl() to use the new unwinder.  dump_trace() has
been deprecated.

show_trace_log_lvl() is special compared to other users of the unwinder.
It's the only place where both reliable *and* unreliable addresses are
needed.  With frame pointers enabled, most callers of the unwinder don't
want to know about unreliable addresses.  But in this case, when we're
dumping the stack to the console because something presumably went
wrong, the unreliable addresses are useful:

- They show stale data on the stack which can provide useful clues.

- If something goes wrong with the unwinder, or if frame pointers are
  corrupt or missing, all the stack addresses still get shown.

So in order to show all addresses on the stack, and at the same time
figure out which addresses are reliable, we have to do the scanning and
the unwinding in parallel.

The scanning is done with the help of get_stack_info() to traverse the
stacks.  The unwinding is done separately by the new unwinder.

In theory we could simplify show_trace_log_lvl() by instead pushing some
of this logic into the unwind code.  But then we would need some kind of
"fake" frame logic in the unwinder which would add a lot of complexity
and wouldn't be worth it in order to support only one user.

Another benefit of this approach is that once we have a DWARF unwinder,
we should be able to just plug it in with minimal impact to this code.

Another change here is that callers of show_trace_log_lvl() don't need
to provide the 'bp' argument.  The unwinder already finds the relevant
frame pointer by unwinding until it reaches the first frame after the
provided stack pointer.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/703b5998604c712a1f801874b43f35d6dac52ede.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:34 +02:00
Josh Poimboeuf
ec2ad9ccf1 oprofile/x86: Convert x86_backtrace() to use the new unwinder
Convert oprofile's x86_backtrace() to use the new unwinder.
dump_trace() has been deprecated.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/412df8927705795e8ea60cffcf89a79e010713b1.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:34 +02:00
Josh Poimboeuf
49a612c6b0 x86/stacktrace: Convert save_stack_trace_*() to use the new unwinder
Convert save_stack_trace_*() to use the new unwinder.  dump_trace() has
been deprecated.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/815494c627d89887db0ce56ceffd58ad16ee6c21.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:33 +02:00
Josh Poimboeuf
35f4d9b325 perf/x86: Convert perf_callchain_kernel() to use the new unwinder
Convert perf_callchain_kernel() to use the new unwinder.  dump_trace()
has been deprecated.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a2df0c4f09b3d438e11b41681f10b0775a819a7f.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:33 +02:00
Josh Poimboeuf
7c7900f897 x86/unwind: Add new unwind interface and implementations
The x86 stack dump code is a bit of a mess.  dump_trace() uses
callbacks, and each user of it seems to have slightly different
requirements, so there are several slightly different callbacks floating
around.

Also there are some upcoming features which will need more changes to
the stack dump code, including the printing of stack pt_regs, reliable
stack detection for live patching, and a DWARF unwinder.  Each of those
features would at least need more callbacks and/or callback interfaces,
resulting in a much bigger mess than what we have today.

Before doing all that, we should try to clean things up and replace
dump_trace() with something cleaner and more flexible.

The new unwinder is a simple state machine which was heavily inspired by
a suggestion from Andy Lutomirski:

  https://lkml.kernel.org/r/CALCETrUbNTqaM2LRyXGRx=kVLRPeY5A3Pc6k4TtQxF320rUT=w@mail.gmail.com

It's also similar to the libunwind API:

  http://www.nongnu.org/libunwind/man/libunwind(3).html

Some if its advantages:

- Simplicity: no more callback sprawl and less code duplication.

- Flexibility: it allows the caller to stop and inspect the stack state
  at each step in the unwinding process.

- Modularity: the unwinder code, console stack dump code, and stack
  metadata analysis code are all better separated so that changing one
  of them shouldn't have much of an impact on any of the others.

Two implementations are added which conform to the new unwind interface:

- The frame pointer unwinder which is used for CONFIG_FRAME_POINTER=y.

- The "guess" unwinder which is used for CONFIG_FRAME_POINTER=n.  This
  isn't an "unwinder" per se.  All it does is scan the stack for kernel
  text addresses.  But with no frame pointers, guesses are better than
  nothing in most cases.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6dc2f909c47533d213d0505f0a113e64585bec82.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:33 +02:00
Ingo Molnar
b2c16e1efd Merge branch 'linus' into x86/asm, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:29:21 +02:00
Jan Beulich
c907420fda locking/rwsem, x86: Drop a bogus cc clobber
With the addition of uses of GCC's condition code outputs in commit:

  35ccfb7114 ("x86, asm: Use CC_SET()/CC_OUT() in <asm/rwsem.h>")

... there's now an overlap of outputs and clobbers in __down_write_trylock().

Such overlaps are generally getting tagged with an error (occasionally
even with an ICE). I can't really tell why plain GCC 6.2 doesn't detect
this (judging by the code it is meant to), while the slightly modified
one I use does. Since condition code clobbers are never necessary on x86
(other than perhaps for documentation purposes, which doesn't really
get done consistently), remove it altogether rather than inventing
something like CC_CLOBBER (to accompany CC_SET/CC_OUT).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/57E003CC0200007800110102@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-20 08:26:41 +02:00
David S. Miller
7675bb2b17 Merge branch 'mlx5-fixes'
Or Gerlitz says:

====================
mlx5 fixes to 4.8-rc6

This series series has a fix from Roi to memory corruption bug in
the bulk flow counters code and two late and hopefully last fixes
from me to the new eswitch offloads code.

Series done over net commit 37dd348 "bna: fix crash in bnad_get_strings()"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 22:10:25 -04:00
Or Gerlitz
6c419ba8e2 net/mlx5: E-Switch, Handle mode change failures
E-switch mode changes involve creating HW tables, potentially allocating
netdevices, etc, and things can fail. Add an attempt to rollback to the
existing mode when changing to the new mode fails. Only if rollback fails,
getting proper SRIOV functionality requires module unload or sriov
disablement/enablement.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 22:10:16 -04:00
Or Gerlitz
4eea37d7b9 net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code
When enablement of the SRIOV e-switch in certain mode (switchdev or legacy)
fails, we must set the mode to none. Otherwise, we'll run into double free
based crashes when further attempting to deal with the e-switch (such
as when disabling sriov or unloading the driver).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 22:10:15 -04:00
Roi Dayan
babd6134a5 net/mlx5: Fix flow counter bulk command out mailbox allocation
The FW command output length should be only the length of struct
mlx5_cmd_fc_bulk out field. Failing to do so will cause the memcpy
call which is invoked later in the driver to write over wrong memory
address and corrupt kernel memory which results in random crashes.

This bug was found using the kernel address sanitizer (kasan).

Fixes: a351a1b03b ('net/mlx5: Introduce bulk reading of flow counters')
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-19 22:10:15 -04:00
James Morse
727653d6ce irqchip/gicv3: Silence noisy DEBUG_PER_CPU_MAPS warning
gic_raise_softirq() walks the list of cpus using for_each_cpu(), it calls
gic_compute_target_list() which advances the iterator by the number of
CPUs in the cluster.

If gic_compute_target_list() reaches the last CPU it leaves the iterator
pointing at the last CPU. This means the next time round the for_each_cpu()
loop cpumask_next() will be called with an invalid CPU.

This triggers a warning when built with CONFIG_DEBUG_PER_CPU_MAPS:
[    3.077738] GICv3: CPU1: found redistributor 1 region 0:0x000000002f120000
[    3.077943] CPU1: Booted secondary processor [410fd0f0]
[    3.078542] ------------[ cut here ]------------
[    3.078746] WARNING: CPU: 1 PID: 0 at ../include/linux/cpumask.h:121 gic_raise_softirq+0x12c/0x170
[    3.078812] Modules linked in:
[    3.078869]
[    3.078930] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-rc5+ #5188
[    3.078994] Hardware name: Foundation-v8A (DT)
[    3.079059] task: ffff80087a1a0080 task.stack: ffff80087a19c000
[    3.079145] PC is at gic_raise_softirq+0x12c/0x170
[    3.079226] LR is at gic_raise_softirq+0xa4/0x170
[    3.079296] pc : [<ffff0000083ead24>] lr : [<ffff0000083eac9c>] pstate: 200001c9
[    3.081139] Call trace:
[    3.081202] Exception stack(0xffff80087a19fbe0 to 0xffff80087a19fd10)

[    3.082269] [<ffff0000083ead24>] gic_raise_softirq+0x12c/0x170
[    3.082354] [<ffff00000808e614>] smp_send_reschedule+0x34/0x40
[    3.082433] [<ffff0000080e80a0>] resched_curr+0x50/0x88
[    3.082512] [<ffff0000080e89d0>] check_preempt_curr+0x60/0xd0
[    3.082593] [<ffff0000080e8a60>] ttwu_do_wakeup+0x20/0xe8
[    3.082672] [<ffff0000080e8bb8>] ttwu_do_activate+0x90/0xc0
[    3.082753] [<ffff0000080ea9a4>] try_to_wake_up+0x224/0x370
[    3.082836] [<ffff0000080eabc8>] default_wake_function+0x10/0x18
[    3.082920] [<ffff000008103134>] __wake_up_common+0x5c/0xa0
[    3.083003] [<ffff0000081031f4>] __wake_up_locked+0x14/0x20
[    3.083086] [<ffff000008103f80>] complete+0x40/0x60
[    3.083168] [<ffff00000808df7c>] secondary_start_kernel+0x15c/0x1d0
[    3.083240] [<00000000808911a4>] 0x808911a4
[    3.113401] Detected PIPT I-cache on CPU2

Avoid updating the iterator if the next call to cpumask_next() would
cause the for_each_cpu() loop to exit.

There is no change to gic_raise_softirq()'s behaviour, (cpumask_next()s
eventual call to _find_next_bit() will return early as start >= nbits),
this patch just silences the warning.

Fixes: 021f653791 ("irqchip: gic-v3: Initial support for GICv3")
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1474306155-3303-1-git-send-email-james.morse@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-20 01:43:23 +02:00