Commit Graph

680545 Commits

Author SHA1 Message Date
Vivien Didelot
e289ef0ded net: dsa: mv88e6xxx: clarify SMI PHY functions
Marvell chips with an SMI PHY access in Global 2 registers handle both
Clause 22 and Clause 45 of IEEE 802.3.

The 88E6390 family has addition bits to target the internal or external
PHYs connected to the device, and a Setup function in addition to the
default (register) Access function.

Prefix the SMI PHY Command and Data registers macros, implement clear
helpers for Clause 22 and 44 Access functions, rename variable to match
the SMI and switch vocabulary (device and register addresses for Clause
22 and port and device class for Clause 45.)

Finally do not use complex macros but simple 16-bit mask to document the
registers organization.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:24:41 -04:00
Vivien Didelot
cd8da8bb0e net: dsa: mv88e6xxx: add irl_init_all op
Some Marvell chips have an Ingress Rate Limit unit. But the command
values slightly differs between models: 88E6352 use 3-bit for operations
while 88E6390 use different 2-bit operations.

This commit kills the IRL flags in favor of a new operation implementing
the "Init all resources to the initial state" operation.

This fixes the operation of 88E6390 family where 0x1000 means Read the
selected resource 0, register 0 on port 16, instead of init all.

A mv88e6xxx_irl_setup helper is added to wrap the operation call.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:24:41 -04:00
David S. Miller
cc07cb935a Merge branch 'net-next-stmmac-dwmac-sun8i-add-support-for-V3s'
Icenowy Zheng says:

====================
net-next: stmmac: dwmac-sun8i: add support for V3s

Allwinner V3s features an EMAC like the on in H3, but without external MII
interfaces, so being not able really to use RMII/RGMII.

And it has a different default value of syscon (0x38000 instead of 0x58000
on H3), which shows a problem that the EMAC clock freq should be 24MHz.
(Both H3 and V3s SoCs doesn't have extra xtal input for EPHY, and the
main xtal is 24MHz. The default value of H3 is set to 24MHz, but the V3s
default value is set to 25MHz).

First two patches are device tree binding patches, the third forces
the frequency to 24MHz and the fourth really add the V3s support.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:23:06 -04:00
Icenowy Zheng
57fde47db8 net-next: stmmac: dwmac-sun8i: add support for V3s EMAC
Allwinner V3s SoC has an Ethernet MAC and an internal PHY like the ones
in H3 SoC, however the MAC has no external *MII interfaces available at
GPIOs, thus only MII connection to internal PHY is supported.

Add this variant of EMAC to dwmac-sun8i driver.

The default value of the syscon EMAC-related register seems to have
changed from H3, but it seems to be a harmless change.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:23:05 -04:00
Icenowy Zheng
1450ba8a61 net-next: stmmac: dwmac-sun8i: force EPHY clock freq to 24MHz
The EPHY control part of the EMAC syscon register has a bit called
CLK_SEL. On the datasheet it says that if it's 0 the EPHY clock is 25MHz
and if it's 1 the clock is 24MHz.

However, according to the datasheets, no Allwinner SoC with EPHY has any
extra xtal input pins for the EPHY, and the system xtal is 24MHz.

That means the EPHY is not possible to get a 25MHz xtal input, and thus
the frequency can only be 24MHz.

It doesn't matter on H3 as the default value of H3 is 24MHz, however on
V3s the default value is wrongly set to 25MHz, which prevented the EPHY
from working properly.

Force the EPHY clock frequency to 24MHz.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:23:05 -04:00
Icenowy Zheng
2b5bdebd00 dt-bindings: syscon: Add DT bindings documentation for Allwinner V3s syscon
Allwinner V3s SoC has a syscon like the one in H3.

Add its compatible string.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:23:05 -04:00
Icenowy Zheng
e29602b03f dt-bindings: net-next: Add DT bindings documentation for Allwinner V3s EMAC
Allwinner V3s SoC has a Ethernet MAC like the one in Allwinner H3, but
have no external MII capability. That means that it can only use the
EPHY and cannot do Gbps transmission.

Add binding for it.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:23:04 -04:00
David S. Miller
708d32e4e5 Merge branch 'net-Introduction-of-the-tc-tests'
Lucas Bates says:

====================
net: Introduction of the tc tests

Apologies for sending this as one big patch. I've been sitting on this a little
too long, but it's ready and I wanted to get it out.

There are a limited number of tests to start - I plan to add more on a regular
basis.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:15:11 -04:00
Lucas Bates
76b903ee19 selftests: Introduce tc testsuite
Add the beginnings of a testsuite for tc functionality in the kernel.
These are a series of unit tests that use the tc executable and verify
the success of those commands by checking both the exit codes and the
output from tc's 'show' operation.

To run the tests:
  # cd tools/testing/selftests/tc-testing
  # sudo ./tdc.py

You can specify the tc executable to use with the -p argument on the command
line or editing the 'TC' variable in tdc_config.py. Refer to the README for
full details on how to run.

The initial complement of test cases are limited mostly to tc actions. Test
cases are most welcome; see the creating-testcases subdirectory for help
in creating them.

Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:15:10 -04:00
Sebastian Siewior
fe420d87bb net/core: remove explicit do_softirq() from busy_poll_stop()
Since commit 217f697436 ("net: busy-poll: allow preemption in
sk_busy_loop()") there is an explicit do_softirq() invocation after
local_bh_enable() has been invoked.
I don't understand why we need this because local_bh_enable() will
invoke do_softirq() once the softirq counter reached zero and we have
softirq-related work pending.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:09:33 -04:00
Serhey Popovych
bdaf32c3ce fib_rules: Resolve goto rules target on delete
We should avoid marking goto rules unresolved when their
target is actually reachable after rule deletion.

Consolder following sample scenario:

  # ip -4 ru sh
  0:      from all lookup local
  32000:  from all goto 32100
  32100:  from all lookup main
  32100:  from all lookup default
  32766:  from all lookup main
  32767:  from all lookup default

  # ip -4 ru del pref 32100 table main
  # ip -4 ru sh
  0:      from all lookup local
  32000:  from all goto 32100 [unresolved]
  32100:  from all lookup default
  32766:  from all lookup main
  32767:  from all lookup default

After removal of first rule with preference 32100 we
mark all goto rules as unreachable, even when rule with
same preference as removed one still present.

Check if next rule with same preference is available
and make all rules with goto action pointing to it.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:39:18 -04:00
David S. Miller
93dda1e0d6 Merge branch 'qed-RDMA-and-infrastructure-for-iWARP'
Yuval Mintz says:

====================
qed*: RDMA and infrastructure for iWARP

This series focuses on RDMA in general with emphasis on required changes
toward adding iWARP support. The vast majority of the changes introduced
are in qed/qede, with a couple of small changes to qedr
[mentioned below].

The infrastructure changes:
 - Patch #1 adds the ability to pass PBL memory externally for a newly
created chain.
 - Patches #4, #5 rename qede_roce.[ch] into qede_rdma.[ch] + change
prefixes from _roce_ to _rdma_, as the API between qede and qedr is
agnostic to the variant of the RDMA protocol used. These patches also
touch qedr [basically to align it with the renaming, nothing more].
 - Patch #7 replaces the current SPQ async mechanism into serving
registered callbacks [before adding iWARP which would add another client
in need of this sort of functionallity].

The non-infrastrucutre changes:
 - Patches #2, #3 contain DCB-related changes to better align RDMA with
configured DCB.
 - Patch #6 contains a minor [mostly theoretical fix] to release flow.

Changes from previous versions
------------------------------
 - V4: This is actually a repost of V3 due to some confusion regarding
   the sent cover-letter
 - V3: Add commit log message in #4 indicating change in header inclusion
 - V2: Add several inclusion into qede_rdma.h to have proper declarations
   of all variable types used in it
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:09 -04:00
Michal Kalderon
6c9e80ea57 qed: SPQ async callback registration
Whenever firmware indicates that there's an async indication it needs
to handle, there's a switch-case where the right functionality is called
based on function's personality and information.

Before iWARP is added [as yet another client], switch over the SPQ into
a callback-registered mechanism, allowing registration of the relevant
event-processing logic based on the function's personality. This allows
us to tidy the code by removing protocol-specifics from a common file.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:09 -04:00
Michal Kalderon
898fff120d qed: Wait for resources before FUNC_CLOSE
Driver needs to wait for all resources to return from FW before it can send
the FUNC_CLOSE ramrod.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:08 -04:00
Michal Kalderon
bbfcd1e8e1 qed*: Set rdma generic functions prefix
Rename the functions common to both iWARP and RoCE to have a prefix of
_rdma_ instead of _roce_.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:08 -04:00
Michal Kalderon
b262a06e64 qed*: qede_roce.[ch] -> qede_rdma.[ch]
Once we have iWARP support, the qede portion of the qedr<->qede would
serve all the RDMA protocols - so rename the file to be appropriate
to its function.

While we're at it, we're also moving a couple of inclusions to it into
.h files and adding includes to make sure it contains all type
definitions it requires.

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:07 -04:00
Mintz, Yuval
9331dad1bb qed: Disable RoCE dpm when DCBx change occurs
If DCBx update occurs while QPs are open, stop sending edpms until all
QPs are closed.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:07 -04:00
Mintz, Yuval
26462ad9c7 qed: RoCE EDPM to honor PFC
Configure device according to DCBx results so that EDPMs
made by RoCE would honor flow-control.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:07 -04:00
Mintz, Yuval
1a4a69751f qed: Chain support for external PBL
iWARP would require the chains to allocate/free their PBL memory
independently, so add the infrastructure to provide it externally.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 12:34:06 -04:00
Jiri Kosina
900a88ef34 Merge branch 'for-4.12/upstream-fixes' into for-linus 2017-06-20 10:52:46 +02:00
Petr Mladek
842c088464 livepatch: Fix stacking of patches with respect to RCU
rcu_read_(un)lock(), list_*_rcu(), and synchronize_rcu() are used for a secure
access and manipulation of the list of patches that modify the same function.
In particular, it is the variable func_stack that is accessible from the ftrace
handler via struct ftrace_ops and klp_ops.

Of course, it synchronizes also some states of the patch on the top of the
stack, e.g. func->transition in klp_ftrace_handler.

At the same time, this mechanism guards also the manipulation of
task->patch_state. It is modified according to the state of the transition and
the state of the process.

Now, all this works well as long as RCU works well. Sadly livepatching might
get into some corner cases when this is not true. For example, RCU is not
watching when rcu_read_lock() is taken in idle threads.  It is because they
might sleep and prevent reaching the grace period for too long.

There are ways how to make RCU watching even in idle threads, see
rcu_irq_enter(). But there is a small location inside RCU infrastructure when
even this does not work.

This small problematic location can be detected either before calling
rcu_irq_enter() by rcu_irq_enter_disabled() or later by rcu_is_watching().
Sadly, there is no safe way how to handle it.  Once we detect that RCU was not
watching, we might see inconsistent state of the function stack and the related
variables in klp_ftrace_handler(). Then we could do a wrong decision, use an
incompatible implementation of the function and break the consistency of the
system. We could warn but we could not avoid the damage.

Fortunately, ftrace has similar problems and they seem to be solved well there.
It uses a heavy weight implementation of some RCU operations. In particular, it
replaces:

  + rcu_read_lock() with preempt_disable_notrace()
  + rcu_read_unlock() with preempt_enable_notrace()
  + synchronize_rcu() with schedule_on_each_cpu(sync_work)

My understanding is that this is RCU implementation from a stone age. It meets
the core RCU requirements but it is rather ineffective. Especially, it does not
allow to batch or speed up the synchronize calls.

On the other hand, it is very trivial. It allows to safely trace and/or
livepatch even the RCU core infrastructure.  And the effectiveness is a not a
big issue because using ftrace or livepatches on productive systems is a rare
operation.  The safety is much more important than a negligible extra load.

Note that the alternative implementation follows the RCU principles. Therefore,
     we could and actually must use list_*_rcu() variants when manipulating the
     func_stack.  These functions allow to access the pointers in the right
     order and with the right barriers. But they do not use any other
     information that would be set only by rcu_read_lock().

Also note that there are actually two problems solved in ftrace:

First, it cares about the consistency of RCU read sections.  It is being solved
the way as described and used in this patch.

Second, ftrace needs to make sure that nobody is inside the dynamic trampoline
when it is being freed. For this, it also calls synchronize_rcu_tasks() in
preemptive kernel in ftrace_shutdown().

Livepatch has similar problem but it is solved by ftrace for free.
klp_ftrace_handler() is a good guy and never sleeps. In addition, it is
registered with FTRACE_OPS_FL_DYNAMIC. It causes that
unregister_ftrace_function() calls:

	* schedule_on_each_cpu(ftrace_sync) - always
	* synchronize_rcu_tasks() - in preemptive kernel

The effect is that nobody is neither inside the dynamic trampoline nor inside
the ftrace handler after unregister_ftrace_function() returns.

[jkosina@suse.cz: reformat changelog, fix comment]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-06-20 10:42:19 +02:00
Daniel Stone
53145c2e35 Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse"
Setting these bits causes libinput to fail to initialize the device;
setting BTN_TOUCH and BTN_TOOL_FINGER causes it to treat the mouse as a
touchpad, and it then refuses to continue when it discovers ABS_X is not
set.

This breaks all known Wayland compositors, as well as Xorg when the
libinput driver is being used.

This reverts commit f4b65b9563.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Cc: Che-Liang Chiou <clchiou@chromium.org>
Cc: Thierry Escande <thierry.escande@collabora.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-06-20 10:38:17 +02:00
Linus Torvalds
9705596d08 One build fix for an Amlogic clk driver and a handful of Allwinner clk driver
fixes for some DT bindings and a randconfig build error that all came in this
 merge window.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJZSF2rAAoJEK0CiJfG5JUlPJAP/R3Q+S4bfoyCKJtKfMGnKyB0
 ehXrOn6IawQT51iCDUxiHZCHrfpnENBWl2bZb5zFuOJ9bUrsmLsPwLItOqCxMBPJ
 szKqc00sf4sGPNGzF7UJnTqvpBowXuKMtdyqbYfPMGYJBRbDzmWbE+1UyGNGdoHd
 EmbNa0dTC6cJ+B4KBV+JSRkvEAYlGFBQj5vyp6xTVS81SB2sp/4DfuZyr16ItfW3
 ert4Gdic8hI8i3TEjia+5OkctvnQa7l0YY5rW6iR/WqIPeNrMmCI4QSlGTlLK8z5
 IS30M5lgnqoqGAsxcXVrDtxs1eOPaUEHYRke0UJ3ne4JN5hCak4lRQHvYNfpZanO
 YZF/rtXl7go5gCUrAkIilDNizYruprPT0jEGXoGwmgQb477dd5sF7LDf8M7TzcFB
 Uysze3nNVqB9N27waenHt200HWh+FwBTw0JE7a16EAjFo2vLMDsUl4Fjc3rIKSsy
 nBMNGjo3kLvM91wfyuOEoaiuO0EMkR3m7osYrGNxHaY+Jw0oXcpzd+A84tLbBzBC
 EDDD1o+rdnSmcgjbYZqUiq5U1BEiHjmhDh5RBtSij94tBLU2s50kV5dixpLtELdY
 DNm79fzbJu0IH1lUArG7fIegglgxNroxdc6RwfmLkjDX51fVxCiNDtBfdQfytuRc
 9U8x0o/cT/XLDZYf+HP4
 =6EkP
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "One build fix for an Amlogic clk driver and a handful of Allwinner clk
  driver fixes for some DT bindings and a randconfig build error that
  all came in this merge window"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
  clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
  dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
  clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition
  clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
  clk: meson: gxbb: fix build error without RESET_CONTROLLER
  clk: sunxi-ng: v3s: Fix usb otg device reset bit
  clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
2017-06-20 11:02:29 +08:00
Linus Torvalds
865be78022 NTB bug fixes to address the modinfo in ntb_perf, a couple of bugs in
the NTB transport QP calculations, skx doorbells,  and sleeping in
 ntb_async_tx_submit.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZSBrpAAoJEG5mS6x6i9IjCyIP/274O7rAk1voWlKAGDJzSqpq
 m2zE91egT8f6g0dc5INYVsiF9B2ZOIndWT53VrU88432Lk0dXFkc6AOQyra28gBr
 Tjw4x0Vwctk474x0D0GwDyqYCBm1ztilH4tJ7a4Txh4kBeQipFjk1jWIke0EjVF3
 vr87wSSZzgbPpinHHM9g2RcT5S9/GB4OXz8Gn1NcEGKYZa48TkpWZKMSPv+3slhw
 hx3J4SYBvZnD1xe9GzzuBt4spDG9HaXRfMdtT0+iOvm3AkWxdFBgg7Wh6c3MMzZI
 hAmZa5Lhmqae/Swe3Ckk1VGmSJR9Z1rY+LTgAWW+sHOF1DAEVClPRVBQqQBXpui+
 tK7XlYTluOsw1gzEhySgJzW5p790TV2ZMGSkV2oWOQpAIXa5NObaM5++Us7oLeuO
 qIdwggb/HXRh4Yl79fJEKXdd7zBCWd6R3u0kgeTpvmI9ZX9Y52JsBv0sTguzSfnq
 LiTuQC6Yp2y77Lm3qey/JmgKfbiVmT+WEGIpLE4PAUaN5K79Ga990LYzx33IT+PY
 X3wftYkezbXfz2hUkDDlptDgbJdGJJwY4yZkGrUNstCvOmgZ8PfC+tQ3qPjQspTY
 /EEQzqy93Zh1dZzxmQuVr4yNMrXDRfUUbB2PyVN5df9glJVUGNt/qfXKmiE/9iiw
 utugynR83QsVbIOqtsPI
 =BFIx
 -----END PGP SIGNATURE-----

Merge tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb

Pull NTB fixes from Jon Mason:
 "NTB bug fixes to address the modinfo in ntb_perf, a couple of bugs in
  the NTB transport QP calculations, skx doorbells, and sleeping in
  ntb_async_tx_submit"

* tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb:
  ntb: no sleep in ntb_async_tx_submit
  ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits
  ntb_transport: fix bug calculating num_qps_mw
  ntb_transport: fix qp count bug
  NTB: ntb_test: fix bug printing ntb_perf results
  ntb: Correct modinfo usage statement for ntb_perf
2017-06-20 10:57:06 +08:00
Xin Long
86fdb3448c sctp: ensure ep is not destroyed before doing the dump
Now before dumping a sock in sctp_diag, it only holds the sock while
the ep may be already destroyed. It can cause a use-after-free panic
when accessing ep->asocs.

This patch is to set sctp_sk(sk)->ep NULL in sctp_endpoint_destroy,
and check if this ep is already destroyed before dumping this ep.

Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdrver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19 15:13:43 -04:00
Allen Hubbe
88931ec3dc ntb: no sleep in ntb_async_tx_submit
Do not sleep in ntb_async_tx_submit, which could deadlock.
This reverts commit "8c874cc140d667f84ae4642bb5b5e0d6396d2ca4"

Fixes: 8c874cc140 ("NTB: Address out of DMA descriptor issue with NTB")
Reported-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Dave Jiang
5eb449e15d ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits
Fixing doorbell register length to 32bits per spec. On Skylake NTB, the
doorbell registers are 32bit write only registers. The source for the
doorbell is a 64bit register that shows the interrupt bits.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 783dfa6cc4 ("ntb: Adding Skylake Xeon NTB support")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Logan Gunthorpe
8e8496e0e9 ntb_transport: fix bug calculating num_qps_mw
A divide by zero error occurs if qp_count is less than mw_count because
num_qps_mw is calculated to be zero. The calculation appears to be
incorrect.

The requirement is for num_qps_mw to be set to qp_count / mw_count
with any remainder divided among the earlier mws.

For example, if mw_count is 5 and qp_count is 12 then mws 0 and 1
will have 3 qps per window and mws 2 through 4 will have 2 qps per window.
Thus, when mw_num < qp_count % mw_count, num_qps_mw is 1 higher
than when mw_num >= qp_count.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Logan Gunthorpe
cb827ee6cc ntb_transport: fix qp count bug
In cases where there are more mw's than spads/2-2, the mw count gets
reduced to match the limitation. ntb_transport also tries to ensure that
there are fewer qps than mws but uses the full mw count instead of
the reduced one. When this happens, the math in
'ntb_transport_setup_qp_mw' will get confused and result in a kernel
paging request bug.

This patch fixes the bug by reducing qp_count to the reduced mw count
instead of the full mw count.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Logan Gunthorpe
07b0b22b3e NTB: ntb_test: fix bug printing ntb_perf results
The code mistakenly prints the local perf results for the remote test
so the script reports identical results for both directions. Fix this
by ensuring we print the remote result.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: a9c59ef774 ("ntb_test: Add a selftest script for the NTB subsystem")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Gary R Hook
94fc795454 ntb: Correct modinfo usage statement for ntb_perf
The order parameters are powers of 2; adjust the usage information
to use correct mathematical representations.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Fixes: 8a7b6a778a ("ntb: ntb perf tool")
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Lin Yun Sheng
7fe5b91431 net/hns:bugfix of ethtool -t phy self_test
This patch fixes the phy loopback self_test failed issue. when
Marvell Phy Module is loaded, it will powerdown fiber when doing
phy loopback self test, which cause phy loopback self_test fail.

Signed-off-by: Lin Yun Sheng <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19 14:20:42 -04:00
Gao Feng
9745e362ad net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
The register_vlan_device would invoke free_netdev directly, when
register_vlan_dev failed. It would trigger the BUG_ON in free_netdev
if the dev was already registered. In this case, the netdev would be
freed in netdev_run_todo later.

So add one condition check now. Only when dev is not registered, then
free it directly.

The following is the part coredump when netdev_upper_dev_link failed
in register_vlan_dev. I removed the lines which are too long.

[  411.237457] ------------[ cut here ]------------
[  411.237458] kernel BUG at net/core/dev.c:7998!
[  411.237484] invalid opcode: 0000 [#1] SMP
[  411.237705]  [last unloaded: 8021q]
[  411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G            E   4.12.0-rc5+ #6
[  411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[  411.237764] task: ffff9cbeb6685580 task.stack: ffffa7d2807d8000
[  411.237782] RIP: 0010:free_netdev+0x116/0x120
[  411.237794] RSP: 0018:ffffa7d2807dbdb0 EFLAGS: 00010297
[  411.237808] RAX: 0000000000000002 RBX: ffff9cbeb6ba8fd8 RCX: 0000000000001878
[  411.237826] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000000
[  411.237844] RBP: ffffa7d2807dbdc8 R08: 0002986100029841 R09: 0002982100029801
[  411.237861] R10: 0004000100029980 R11: 0004000100029980 R12: ffff9cbeb6ba9000
[  411.238761] R13: ffff9cbeb6ba9060 R14: ffff9cbe60f1a000 R15: ffff9cbeb6ba9000
[  411.239518] FS:  00007fb690d81700(0000) GS:ffff9cbebb640000(0000) knlGS:0000000000000000
[  411.239949] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  411.240454] CR2: 00007f7115624000 CR3: 0000000077cdf000 CR4: 00000000003406e0
[  411.240936] Call Trace:
[  411.241462]  vlan_ioctl_handler+0x3f1/0x400 [8021q]
[  411.241910]  sock_ioctl+0x18b/0x2c0
[  411.242394]  do_vfs_ioctl+0xa1/0x5d0
[  411.242853]  ? sock_alloc_file+0xa6/0x130
[  411.243465]  SyS_ioctl+0x79/0x90
[  411.243900]  entry_SYSCALL_64_fastpath+0x1e/0xa9
[  411.244425] RIP: 0033:0x7fb69089a357
[  411.244863] RSP: 002b:00007ffcd04e0fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[  411.245445] RAX: ffffffffffffffda RBX: 00007ffcd04e2884 RCX: 00007fb69089a357
[  411.245903] RDX: 00007ffcd04e0fd0 RSI: 0000000000008983 RDI: 0000000000000003
[  411.246527] RBP: 00007ffcd04e0fd0 R08: 0000000000000000 R09: 1999999999999999
[  411.246976] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000004
[  411.247414] R13: 00007ffcd04e1128 R14: 00007ffcd04e2888 R15: 0000000000000001
[  411.249129] RIP: free_netdev+0x116/0x120 RSP: ffffa7d2807dbdb0

Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19 14:10:20 -04:00
Ivan Delalande
8917a777be tcp: md5: add TCP_MD5SIG_EXT socket option to set a key address prefix
Replace first padding in the tcp_md5sig structure with a new flag field
and address prefix length so it can be specified when configuring a new
key for TCP MD5 signature. The tcpm_flags field will only be used if the
socket option is TCP_MD5SIG_EXT to avoid breaking existing programs, and
tcpm_prefixlen only when the TCP_MD5SIG_FLAG_PREFIX flag is set.

Signed-off-by: Bob Gilligan <gilligan@arista.com>
Signed-off-by: Eric Mowat <mowat@arista.com>
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19 13:51:34 -04:00
Ivan Delalande
6797318e62 tcp: md5: add an address prefix for key lookup
This allows the keys used for TCP MD5 signature to be used for whole
range of addresses, specified with a prefix length, instead of only one
address as it currently is.

Signed-off-by: Bob Gilligan <gilligan@arista.com>
Signed-off-by: Eric Mowat <mowat@arista.com>
Signed-off-by: Ivan Delalande <colona@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19 13:50:55 -04:00
Feras Daoud
1170fbd8ff net/mlx5e: IPoIB, Add ioctl support to IPoIB device driver
Add ioctl support to IPoIB device driver. For now, this
ioctl will support timestamp get and set.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Feras Daoud
3844b07ee4 net/mlx5e: IPoIB, Add PTP support to IPoIB device driver
Enable PTP for IPoIB rdma_netdev and add the ability
to get the time stamping parameters using ethtool.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Erez Shitrit
4ec5cf781b net/mlx5e: IPoIB, Get more TX statistics
Add misses counters (bytes, packet, gso, xmit_more) in TX flow for ipoib
traffic.

Fixes: 58545449b7b ("net/mlx5e: IPoIB, Xmit flow")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Erez Shitrit
807c441597 net/mlx5e: IPoIB, Handle change_mtu
Add the ndo that supports change mtu for IPoIB.
The callback called from the ipoib ULP driver, that gives the ability to
change the SW and HW resources accordingly in the lower driver.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Erez Shitrit
c139dbfddd net/mlx5e: Use hard_mtu as part of the mlx5e_priv struct
The mtu extra space that kept for the HW is specific for each link type,
and it is different in mlx5e and mlx5i modules.
Now it is kept in the priv structures, set by the mlx5e/mlx5i driver
accordingly.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Erez Shitrit
b6dc510fac net/mlx5e: IPoIB, Change parameters default values
Add function that sets the default values for ipoib, setting/clearing
abilities that IPoIB doesn't support, like RQ size in this case.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Erez Shitrit
7ca42c8094 net/mlx5e: Add new profile function update_carrier
Updating the carrier involves specific HW setting, each profile should
use its own function for that.

Both IPoIB and VF representor don't need carrier update function, since
VF representor has only a logical link to VF and IPoIB manages its own
link via ib_core upper layer.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Erez Shitrit
076b0936e5 net/mlx5e: IPoIB, Add ethtool support
Add support for the following:
	"ethtool -S" (statistics).
	"ethtool -i" (driver info).
	"ethtool -g/G" (rings parameters).
	"ethtool -l/L" (channels parameters).
	"ethtool -c/C" (coalesce options).

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Feras Daoud
c66f2091c9 net/mlx5e: Prevent PFC call for non ethernet ports
Port flow control supported only for ethernet ports,
therefore, prevent any call if the port type differs from
MLX5_CAP_PORT_TYPE_ETH.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Saeed Mahameed
4301ba7b3e net/mlx5e: IPoIB, Move to a separate directory
IPoIB netdevice driver was only introduced in previous kernel release
and it is growing in terms of features and LOC, move it to a separate
directory.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-19 18:40:20 +03:00
Raju Rangoju
dec6b33163 cxgb4: notify uP to route ctrlq compl to rdma rspq
During the module initialisation there is a possible race
(basically race between uld and lld) where neither the uld
nor lld notifies the uP about where to route the ctrl queue
completions. LLD skips notifying uP as the rdma queues were
not created by then (will leave it to ULD to notify the uP).
As the ULD comes up, it also skips notifying the uP as the
flag FULL_INIT_DONE is not set yet (ULD assumes that the
interface is not up yet).

Consequently, this race between uld and lld leaves uP
unnotified about where to send the ctrl queue completions
to, leading to iwarp RI_RES WR failure.

Here is the race:

CPU 0                                   CPU1

- allocates nic rx queus
- t4_sge_alloc_ctrl_txq()
(if rdma rsp queues exists,
tell uP to route ctrl queue
compl to rdma rspq)
                                - acquires the mutex_lock
                                - allocates rdma response queues
                                - if FULL_INIT_DONE set,
                                  tell uP to route ctrl queue compl
                                  to rdma rspq
                                - relinquishes mutex_lock
- acquires the mutex_lock
- enable_rx()
- set FULL_INIT_DONE
- relinquishes mutex_lock

This patch fixes the above issue.

Fixes: e7519f9926f1('cxgb4: avoid enabling napi twice to the same queue')
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
CC: Stable <stable@vger.kernel.org> # 4.9+
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19 10:59:04 -04:00
Raju Rangoju
910603818c cxgb4: notify uP to route ctrlq compl to rdma rspq
During the module initialisation there is a possible race
(basically race between uld and lld) where neither the uld
nor lld notifies the uP about where to route the ctrl queue
completions. LLD skips notifying uP as the rdma queues were
not created by then (will leave it to ULD to notify the uP).
As the ULD comes up, it also skips notifying the uP as the
flag FULL_INIT_DONE is not set yet (ULD assumes that the
interface is not up yet).

Consequently, this race between uld and lld leaves uP
unnotified about where to send the ctrl queue completions
to, leading to iwarp RI_RES WR failure.

Here is the race:

CPU 0                                   CPU1

- allocates nic rx queus
- t4_sge_alloc_ctrl_txq()
(if rdma rsp queues exists,
tell uP to route ctrl queue
compl to rdma rspq)
                                - acquires the mutex_lock
                                - allocates rdma response queues
                                - if FULL_INIT_DONE set,
                                  tell uP to route ctrl queue compl
                                  to rdma rspq
                                - relinquishes mutex_lock
- acquires the mutex_lock
- enable_rx()
- set FULL_INIT_DONE
- relinquishes mutex_lock

This patch fixes the above issue.

Fixes: e7519f9926f1('cxgb4: avoid enabling napi twice to the same queue')
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
CC: Stable <stable@vger.kernel.org> # 4.9+
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19 10:51:45 -04:00
Ganesh Goudar
89ff67718c cxgb4: add new T6 pci device id's
Add 0x6082, 0x6083 and 0x6084 T6 device id's

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-19 10:37:05 -04:00
Linus Torvalds
41f1830f5a Linux 4.12-rc6 2017-06-19 22:19:37 +08:00
Hugh Dickins
1be7107fbe mm: larger stack guard gap, between vmas
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.

This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.

Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.

One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications.  For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).

Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.

Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-19 21:50:20 +08:00