Commit Graph

840596 Commits

Author SHA1 Message Date
Kevin 'ldir' Darbyshire-Bryant
24ec483cec net: sched: Introduce act_ctinfo action
ctinfo is a new tc filter action module.  It is designed to restore
information contained in firewall conntrack marks to other packet fields
and is typically used on packet ingress paths.  At present it has two
independent sub-functions or operating modes, DSCP restoration mode &
skb mark restoration mode.

The DSCP restore mode:

This mode copies DSCP values that have been placed in the firewall
conntrack mark back into the IPv4/v6 diffserv fields of relevant
packets.

The DSCP restoration is intended for use and has been found useful for
restoring ingress classifications based on egress classifications across
links that bleach or otherwise change DSCP, typically home ISP Internet
links.  Restoring DSCP on ingress on the WAN link allows qdiscs such as
but by no means limited to CAKE to shape inbound packets according to
policies that are easier to set & mark on egress.

Ingress classification is traditionally a challenging task since
iptables rules haven't yet run and tc filter/eBPF programs are pre-NAT
lookups, hence are unable to see internal IPv4 addresses as used on the
typical home masquerading gateway.  Thus marking the connection in some
manner on egress for later restoration of classification on ingress is
easier to implement.

Parameters related to DSCP restore mode:

dscpmask - a 32 bit mask of 6 contiguous bits and indicate bits of the
conntrack mark field contain the DSCP value to be restored.

statemask - a 32 bit mask of (usually) 1 bit length, outside the area
specified by dscpmask.  This represents a conditional operation flag
whereby the DSCP is only restored if the flag is set.  This is useful to
implement a 'one shot' iptables based classification where the
'complicated' iptables rules are only run once to classify the
connection on initial (egress) packet and subsequent packets are all
marked/restored with the same DSCP.  A mask of zero disables the
conditional behaviour ie. the conntrack mark DSCP bits are always
restored to the ip diffserv field (assuming the conntrack entry is found
& the skb is an ipv4/ipv6 type)

e.g. dscpmask 0xfc000000 statemask 0x01000000

|----0xFC----conntrack mark----000000---|
| Bits 31-26 | bit 25 | bit24 |~~~ Bit 0|
| DSCP       | unused | flag  |unused   |
|-----------------------0x01---000000---|
      |                   |
      |                   |
      ---|             Conditional flag
         v             only restore if set
|-ip diffserv-|
| 6 bits      |
|-------------|

The skb mark restore mode (cpmark):

This mode copies the firewall conntrack mark to the skb's mark field.
It is completely the functional equivalent of the existing act_connmark
action with the additional feature of being able to apply a mask to the
restored value.

Parameters related to skb mark restore mode:

mask - a 32 bit mask applied to the firewall conntrack mark to mask out
bits unwanted for restoration.  This can be useful where the conntrack
mark is being used for different purposes by different applications.  If
not specified and by default the whole mark field is copied (i.e.
default mask of 0xffffffff)

e.g. mask 0x00ffffff to mask out the top 8 bits being used by the
aforementioned DSCP restore mode.

|----0x00----conntrack mark----ffffff---|
| Bits 31-24 |                          |
| DSCP & flag|      some value here     |
|---------------------------------------|
			|
			|
			v
|------------skb mark-------------------|
|            |                          |
|  zeroed    |                          |
|---------------------------------------|

Overall parameters:

zone - conntrack zone

control - action related control (reclassify | pipe | drop | continue |
ok | goto chain <CHAIN_INDEX>)

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 21:43:54 -07:00
Heiner Kallweit
a6851c613f r8169: remove 1000/Half from supported modes
MAC on the GBit versions supports 1000/Full only, however the PHY
partially claims to support 1000/Half. So let's explicitly remove
this mode.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 21:42:48 -07:00
Joergen Andreasen
2c1d029a01 net: mscc: ocelot: Implement port policers via tc command
Hardware offload of matchall classifier and police action are now
supported via the tc command.
Supported police parameters are: rate and burst.

Example:

Add:
tc qdisc add dev eth3 handle ffff: ingress
tc filter add dev eth3 parent ffff: prio 1 handle 2	\
	matchall skip_sw				\
	action police rate 100Mbit burst 10000

Show:
tc -s -d qdisc show dev eth3
tc -s -d filter show dev eth3 ingress

Delete:
tc filter del dev eth3 parent ffff: prio 1
tc qdisc del dev eth3 handle ffff: ingress

Signed-off-by: Joergen Andreasen <joergen.andreasen@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 21:37:49 -07:00
Andrii Nakryiko
399dc65e9c libbpf: reduce unnecessary line wrapping
There are a bunch of lines of code or comments that are unnecessary
wrapped into multi-lines. Fix that without violating any code
guidelines.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:35 +02:00
Andrii Nakryiko
76e1022b96 libbpf: typo and formatting fixes
A bunch of typo and formatting fixes.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:35 +02:00
Andrii Nakryiko
7e8c328c4e libbpf: simplify two pieces of logic
Extra check for type is unnecessary in first case.

Extra zeroing is unnecessary, as snprintf guarantees that it will
zero-terminate string.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:35 +02:00
Andrii Nakryiko
fba01a0689 libbpf: use negative fd to specify missing BTF
0 is a valid FD, so it's better to initialize it to -1, as is done in
other places. Also, technically, BTF type ID 0 is valid (it's a VOID
type), so it's more reliable to check btf_fd, instead of
btf_key_type_id, to determine if there is any BTF associated with a map.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:35 +02:00
Andrii Nakryiko
f102154d31 libbpf: fix error code returned on corrupted ELF
All of libbpf errors are negative, except this one. Fix it.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:35 +02:00
Andrii Nakryiko
c51829bb6e libbpf: check map name retrieved from ELF
Validate there was no error retrieving symbol name corresponding to
a BPF map.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:35 +02:00
Andrii Nakryiko
12ef5634a8 libbpf: simplify endianness check
Rewrite endianness check to use "more canonical" way, using
compiler-defined macros, similar to few other places in libbpf. It also
is more obvious and shorter.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:35 +02:00
Andrii Nakryiko
be5c5d4e9d libbpf: preserve errno before calling into user callback
pr_warning ultimately may call into user-provided callback function,
which can clobber errno value, so we need to save it before that.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:34 +02:00
Andrii Nakryiko
8ca990ce0d libbpf: fix detection of corrupted BPF instructions section
Ensure that size of a section w/ BPF instruction is exactly a multiple
of BPF instruction size.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-30 01:23:34 +02:00
David S. Miller
7da33a8f87 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-05-29

This series contains updates to ice driver only.

Bruce cleans up white space issues and fixes complaints about using
bitop assignments using operands of different sizes.

Anirudh cleans up code that is no longer needed now that the firmware
supports the functionality.  Adds support for ethtool selftestto the ice
driver, which includes testing link, interrupts, eeprom, registers and
packet loopback.  Also, cleaned up duplicate code.

Tony implements support for toggling receive VLAN filter via ethtool.

Brett bumps up the minimum receive descriptor count per queue to resolve
dropped packets.  Refactored the interrupt tracking for the ice driver
to resolve issues seen with the co-existence of features and SR-IOV, so
instead of having a hardware IRQ tracker and a software IRQ tracker,
simply use one tracker.  Also adds a helper function to trigger software
interrupts.

Mitch changes how Malicious Driver Detection (MDD) events are handled,
to ensure all VFs checked for MDD events and just log the event instead
of disabling the VF, which was preventing proper release of resources if
the VF is rebooted or the VF driver reloaded.

Dave cleans up a redundant call to register LLDP MIB change events.

Dan adds support to retrieve the current setting of firmware logging
from the hardware to properly initialize the hardware structure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:51:23 -07:00
Linus Torvalds
bec7550cca The Sphinx 2.0 release contained a few incompatible API changes that broke
our extensions and, thus, the documentation build in general.  Who knew
 that those deprecation warnings it was outputting actually meant we should
 change something?  This set of fixes makes the build work again with
 Sphinx 2.0 and eliminates the warnings for 1.8.  As part of that, we also
 need a few fixes to the docs for places where the new Sphinx is more
 strict.
 
 It is a bit late in the cycle for this kind of change, but it does fix
 problems that people are experiencing now.
 
 There has been some talk of raising the minimum version of Sphinx we
 support.  I don't want to do that abruptly, though, so these changes add
 some glue to continue to support versions back to 1.3.  We will be adding
 some infrastructure soon to nudge users of old versions forward, with the
 idea of maybe increasing our minimum version (and removing this glue)
 sometime in the future.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAlzsYw8PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YGuwH+gOk2lGPpI7tsIsctUr3TTviYbu9EywGINUA
 c2pPQBHbC13F/QKkb/WArnIhM7eRMWxNy1IlrzqTTLRcAteJihoRNldLVpImxwHJ
 DaULovUHRO9uNuVORGV4RLOBUUkrguoz4X7OZooT3tluRZcL58C3Qn8MbZYiZYf1
 ZJKHVJxfSedyHYTK1H5qB9uAAvdoNGLjvK8pzA9S7HgHM9viuOlY2GTHn4uvlmI+
 qgnbVRNeEb8IRFztvKy2RyMAisx2o5v/lfpKVM+OC9EJgkpw+mw55YUdfDeNavJg
 2d3xv8zgS1TP+qAmba2uVwiRDNChTKxz72nbaqZZGOt16StRiyY=
 =hrKI
 -----END PGP SIGNATURE-----

Merge tag 'docs-5.2-fixes2' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
 "The Sphinx 2.0 release contained a few incompatible API changes that
  broke our extensions and, thus, the documentation build in general.
  Who knew that those deprecation warnings it was outputting actually
  meant we should change something? This set of fixes makes the build
  work again with Sphinx 2.0 and eliminates the warnings for 1.8. As
  part of that, we also need a few fixes to the docs for places where
  the new Sphinx is more strict.

  It is a bit late in the cycle for this kind of change, but it does fix
  problems that people are experiencing now.

  There has been some talk of raising the minimum version of Sphinx we
  support. I don't want to do that abruptly, though, so these changes
  add some glue to continue to support versions back to 1.3. We will be
  adding some infrastructure soon to nudge users of old versions
  forward, with the idea of maybe increasing our minimum version (and
  removing this glue) sometime in the future"

* tag 'docs-5.2-fixes2' of git://git.lwn.net/linux:
  drm/i915: Maintain consistent documentation subsection ordering
  scripts/sphinx-pre-install: make it handle Sphinx versions
  docs: Fix conf.py for Sphinx 2.0
  docs: fix multiple doc build warnings in enumeration.rst
  lib/list_sort: fix kerneldoc build error
  docs: fix numaperf.rst and add it to the doc tree
  doc: Cope with the deprecation of AutoReporter
  doc: Cope with Sphinx logging deprecations
2019-05-29 14:36:41 -07:00
David S. Miller
58e8b37069 Merge branch 'net-phy-dp83867-add-some-fixes'
Max Uvarov says:

====================
net: phy: dp83867: add some fixes

v3: use phy_modify_mmd()
v2: fix minor comments by Heiner Kallweit and Florian Fainelli
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:28:48 -07:00
Max Uvarov
2b89264925 net: phy: dp83867: Set up RGMII TX delay
PHY_INTERFACE_MODE_RGMII_RXID is less then TXID
so code to set tx delay is never called.

Fixes: 2a10154abc ("net: phy: dp83867: Add TI dp83867 phy")
Signed-off-by: Max Uvarov <muvarov@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:28:48 -07:00
Max Uvarov
c8081fc397 net: phy: dp83867: do not call config_init twice
Phy state machine calls _config_init just after
reset.

Signed-off-by: Max Uvarov <muvarov@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:28:48 -07:00
Max Uvarov
1a97a477e6 net: phy: dp83867: increase SGMII autoneg timer duration
After reset SGMII Autoneg timer is set to 2us (bits 6 and 5 are 01).
That is not enough to finalize autonegatiation on some devices.
Increase this timer duration to maximum supported 16ms.

Signed-off-by: Max Uvarov <muvarov@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:28:48 -07:00
Max Uvarov
333061b924 net: phy: dp83867: fix speed 10 in sgmii mode
For supporting 10Mps speed in SGMII mode DP83867_10M_SGMII_RATE_ADAPT bit
of DP83867_10M_SGMII_CFG register has to be cleared by software.
That does not affect speeds 100 and 1000 so can be done on init.

Signed-off-by: Max Uvarov <muvarov@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:28:48 -07:00
Russell King
3d3ced2ec5 net: phy: marvell10g: report if the PHY fails to boot firmware
Some boards do not have the PHY firmware programmed in the 3310's flash,
which leads to the PHY not working as expected.  Warn the user when the
PHY fails to boot the firmware and refuse to initialise.

Fixes: 20b2af32ff ("net: phy: add Marvell Alaska X 88X3310 10Gigabit PHY support")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:25:10 -07:00
Russell King
c678726305 net: phylink: ensure consistent phy interface mode
Ensure that we supply the same phy interface mode to mac_link_down() as
we did for the corresponding mac_link_up() call.  This ensures that MAC
drivers that use the phy interface mode in these methods can depend on
mac_link_down() always corresponding to a mac_link_up() call for the
same interface mode.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:20:24 -07:00
YueHaibing
a3e2f6ad89 net: stmmac: Fix build error without CONFIG_INET
Fix gcc build error while CONFIG_INET is not set

drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.o: In function `__stmmac_test_loopback':
stmmac_selftests.c:(.text+0x8ec): undefined reference to `ip_send_check'
stmmac_selftests.c:(.text+0xacc): undefined reference to `udp4_hwcsum'

Add CONFIG_INET dependency to fix this.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 091810dbde ("net: stmmac: Introduce selftests support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 14:18:24 -07:00
Herbert Xu
279758f800 rhashtable: Add rht_ptr_rcu and improve rht_ptr
This patch moves common code between rht_ptr and rht_ptr_exclusive
into __rht_ptr.  It also adds a new helper rht_ptr_rcu exclusively
for the RCU case.  This way rht_ptr becomes a lock-only construct
so we can use the lighter rcu_dereference_protected primitive.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 13:27:08 -07:00
Jisheng Zhang
af64935213 net: stmmac: use dev_info() before netdev is registered
Before the netdev is registered, calling netdev_info() will emit
something as "(unnamed net device) (uninitialized)", looks confusing.

Before this patch:
[    3.155028] stmmaceth f7b60000.ethernet (unnamed net_device) (uninitialized): device MAC address 52:1a:55:18:9e:9d

After this patch:
[    3.155028] stmmaceth f7b60000.ethernet: device MAC address 52:1a:55:18:9e:9d

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 13:26:18 -07:00
Colin Ian King
1b3855aba8 qed: fix spelling mistake "inculde" -> "include"
There is a spelling mistake in a DP_INFO message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 13:25:43 -07:00
Yoshihiro Shimoda
315ca92dd8 net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs
The sh_eth_close() resets the MAC and then calls phy_stop()
so that mdio read access result is incorrect without any error
according to kernel trace like below:

ifconfig-216   [003] .n..   109.133124: mdio_access: ee700000.ethernet-ffffffff read  phy:0x01 reg:0x00 val:0xffff

According to the hardware manual, the RMII mode should be set to 1
before operation the Ethernet MAC. However, the previous code was not
set to 1 after the driver issued the soft_reset in sh_eth_dev_exit()
so that the mdio read access result seemed incorrect. To fix the issue,
this patch adds a condition and set the RMII mode register in
sh_eth_dev_exit() for R-Car Gen2 and RZ/A1 SoCs.

Note that when I have tried to move the sh_eth_dev_exit() calling
after phy_stop() on sh_eth_close(), but it gets worse (kernel panic
happened and it seems that a register is accessed while the clock is
off).

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 13:24:54 -07:00
Linus Torvalds
2b28601d62 linux-kselftest-5.2-rc3
This Kselftest update for Linux 5.2-rc3 consists of
 
 - Alexandre Belloni's fixes to rtc regressions introduced in kselftest
   Makefile test run output refactoring work from Kees Cook.
 
 - ftrace test checkbashisms fixes from Masami Hiramatsu
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAlzumDsACgkQCwJExA0N
 QxymNQ/8DwDmpfkLCBpedHIy9kxY7PiJGNwQYkVMPBxAt/qY7c0BS46TnKxc6ndH
 Ra/7Shm7nTbit9Upcap5s4Gtbjf62PW0FVVAZgMZe+2Raq80QV8ANDHOFz8KF7R1
 iaDWyW8a5cP56fuzSMmJlpHPWw1G3tYAvjJwqKuTPegYrHpkKdeGpEVm4gyIoS3t
 k7z4RuGwouRh3e9Bfks+Q1TZQCH8RbA5W1riA4XM9/iXi5wqp0Ib27sMYbH1eA4s
 Gqg4lLFHfC2Pdo2ASOlC9aQAHgE23pCQFFFgXaY2wl00CgThXhhZ1JiJmdF4PryW
 z59+3+Ncyr9di7ZL6H+Po/vin1b0902FDx0K4pOqSwr1yQ+54tmSfjylbQRpdnA2
 GpswgqFuqCF5SpOTikjl3oiNSi2BStHaLmkEV8XU8LkTe4Px4TSTZ/3JVgJ6PBAL
 55mMgD41W1OIQtK6LqLfqfRz7Uj5xdfVcobAigE9IEYLBB/PBnwo/xyW8pSxXoDO
 80WiVUf9XHxSU1Mdmb516540thxvS4+TK8CPLSCZiqLUDY1facsNH/KYzkBU7Upt
 eJdQxpCI55ORQCRQbfcbuIOrddZFWAjZ5PBieQvYv4W7O1wEcK3hTPl75aff5bIS
 UlAt7aW8iLwU+xQ4byMtfvUdzVh2WJv7/jNvjcQCipXXD1PoyAg=
 =zYD6
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

 - Alexandre Belloni's fixes to rtc regressions introduced in kselftest
   Makefile test run output refactoring work from Kees Cook.

 - ftrace test checkbashisms fixes from Masami Hiramatsu

* tag 'linux-kselftest-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: rtc: rtctest: specify timeouts
  selftests/harness: Allow test to configure timeout
  selftests/ftrace: Add checkbashisms meta-testcase
  selftests/ftrace: Make a script checkbashisms clean
2019-05-29 13:20:02 -07:00
Linus Torvalds
9e82b4a91d This fixes a memory leak from the error path in the event filter logic.
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXO303xQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qlb7AP4179SWPy9RFxvxyjTmpPFPL9oR0Q26
 sOTyIBN99MUTsgEA0FNWz7/FWtFDa1wbh0tEVreaTQlKEeoIYF96dkN0iwE=
 =BJgg
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "This fixes a memory leak from the error path in the event filter
  logic"

* tag 'trace-v5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Avoid memory leak in predicate_parse()
2019-05-29 11:26:40 -07:00
Quentin Monnet
501b125a29 libbpf: prevent overwriting of log_level in bpf_object__load_progs()
There are two functions in libbpf that support passing a log_level
parameter for the verifier for loading programs:
bpf_object__load_xattr() and bpf_prog_load_xattr(). Both accept an
attribute object containing the log_level, and apply it to the programs
to load.

It turns out that to effectively load the programs, the latter function
eventually relies on the former. This was not taken into account when
adding support for log_level in bpf_object__load_xattr(), and the
log_level passed to bpf_prog_load_xattr() later gets overwritten with a
zero value, thus disabling verifier logs for the program in all cases:

bpf_prog_load_xattr()             // prog->log_level = attr1->log_level;
-> bpf_object__load()             // attr2->log_level = 0;
   -> bpf_object__load_xattr()    // <pass prog and attr2>
      -> bpf_object__load_progs() // prog->log_level = attr2->log_level;

Fix this by OR-ing the log_level in bpf_object__load_progs(), instead of
overwriting it.

v2: Fix commit log description (confusion on function names in v1).

Fixes: 60276f9849 ("libbpf: add bpf_object__load_xattr() API function to pass log_level")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 19:36:55 +02:00
Stanislav Fomichev
e672db03ab bpf: tracing: properly use bpf_prog_array api
Now that we don't have __rcu markers on the bpf_prog_array helpers,
let's use proper rcu_dereference_protected to obtain array pointer
under mutex.

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 15:17:35 +02:00
Stanislav Fomichev
dbcc1ba26e bpf: cgroup: properly use bpf_prog_array api
Now that we don't have __rcu markers on the bpf_prog_array helpers,
let's use proper rcu_dereference_protected to obtain array pointer
under mutex.

We also don't need __rcu annotations on cgroup_bpf.inactive since
it's not read/updated concurrently.

v4:
* drop cgroup_rcu_xyz wrappers and use rcu APIs directly; presumably
  should be more clear to understand which mutex/refcount protects
  each particular place

v3:
* amend cgroup_rcu_dereference to include percpu_ref_is_dying;
  cgroup_bpf is now reference counted and we don't hold cgroup_mutex
  anymore in cgroup_bpf_release

v2:
* replace xchg with rcu_swap_protected

Cc: Roman Gushchin <guro@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 15:17:35 +02:00
Stanislav Fomichev
02205d2ed6 bpf: media: properly use bpf_prog_array api
Now that we don't have __rcu markers on the bpf_prog_array helpers,
let's use proper rcu_dereference_protected to obtain array pointer
under mutex.

Cc: linux-media@vger.kernel.org
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Sean Young <sean@mess.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 15:17:35 +02:00
Stanislav Fomichev
54e9c9d4b5 bpf: remove __rcu annotations from bpf_prog_array
Drop __rcu annotations and rcu read sections from bpf_prog_array
helper functions. They are not needed since all existing callers
call those helpers from the rcu update side while holding a mutex.
This guarantees that use-after-free could not happen.

In the next patches I'll fix the callers with missing
rcu_dereference_protected to make sparse/lockdep happy, the proper
way to use these helpers is:

	struct bpf_prog_array __rcu *progs = ...;
	struct bpf_prog_array *p;

	mutex_lock(&mtx);
	p = rcu_dereference_protected(progs, lockdep_is_held(&mtx));
	bpf_prog_array_length(p);
	bpf_prog_array_copy_to_user(p, ...);
	bpf_prog_array_delete_safe(p, ...);
	bpf_prog_array_copy_info(p, ...);
	bpf_prog_array_copy(p, ...);
	bpf_prog_array_free(p);
	mutex_unlock(&mtx);

No functional changes! rcu_dereference_protected with lockdep_is_held
should catch any cases where we update prog array without a mutex
(I've looked at existing call sites and I think we hold a mutex
everywhere).

Motivation is to fix sparse warnings:
kernel/bpf/core.c:1803:9: warning: incorrect type in argument 1 (different address spaces)
kernel/bpf/core.c:1803:9:    expected struct callback_head *head
kernel/bpf/core.c:1803:9:    got struct callback_head [noderef] <asn:4> *
kernel/bpf/core.c:1877:44: warning: incorrect type in initializer (different address spaces)
kernel/bpf/core.c:1877:44:    expected struct bpf_prog_array_item *item
kernel/bpf/core.c:1877:44:    got struct bpf_prog_array_item [noderef] <asn:4> *
kernel/bpf/core.c:1901:26: warning: incorrect type in assignment (different address spaces)
kernel/bpf/core.c:1901:26:    expected struct bpf_prog_array_item *existing
kernel/bpf/core.c:1901:26:    got struct bpf_prog_array_item [noderef] <asn:4> *
kernel/bpf/core.c:1935:26: warning: incorrect type in assignment (different address spaces)
kernel/bpf/core.c:1935:26:    expected struct bpf_prog_array_item *[assigned] existing
kernel/bpf/core.c:1935:26:    got struct bpf_prog_array_item [noderef] <asn:4> *

v2:
* remove comment about potential race; that can't happen
  because all callers are in rcu-update section

Cc: Roman Gushchin <guro@fb.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 15:17:35 +02:00
Alan Maguire
fe937ea12e selftests/bpf: fix compilation error for flow_dissector.c
When building the tools/testing/selftest/bpf subdirectory,
(running both a local directory "make" and a
"make -C tools/testing/selftests/bpf") I keep hitting the
following compilation error:

prog_tests/flow_dissector.c: In function ‘create_tap’:
prog_tests/flow_dissector.c:150:38: error: ‘IFF_NAPI’ undeclared (first
use in this function)
   .ifr_flags = IFF_TAP | IFF_NO_PI | IFF_NAPI | IFF_NAPI_FRAGS,
                                      ^
prog_tests/flow_dissector.c:150:38: note: each undeclared identifier is
reported only once for each function it appears in
prog_tests/flow_dissector.c:150:49: error: ‘IFF_NAPI_FRAGS’ undeclared

Adding include/uapi/linux/if_tun.h to tools/include/uapi/linux
resolves the problem and ensures the compilation of the file
does not depend on having up-to-date kernel headers locally.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-05-29 15:15:07 +02:00
Sami Tolvanen
1e29ab3186 arm64: use the correct function type for __arm64_sys_ni_syscall
Calling sys_ni_syscall through a syscall_fn_t pointer trips indirect
call Control-Flow Integrity checking due to a function type
mismatch. Use SYSCALL_DEFINE0 for __arm64_sys_ni_syscall instead and
remove the now unnecessary casts.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-29 13:46:00 +01:00
Sami Tolvanen
0e358bd7b7 arm64: use the correct function type in SYSCALL_DEFINE0
Although a syscall defined using SYSCALL_DEFINE0 doesn't accept
parameters, use the correct function type to avoid indirect call
type mismatches with Control-Flow Integrity checking.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-29 13:45:59 +01:00
Sami Tolvanen
8ef8f368ce arm64: fix syscall_fn_t type
Syscall wrappers in <asm/syscall_wrapper.h> use const struct pt_regs *
as the argument type. Use const in syscall_fn_t as well to fix indirect
call type mismatches with Control-Flow Integrity checking.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-29 13:45:59 +01:00
Geert Uytterhoeven
6954158a16 ALSA: fireface: Use ULL suffixes for 64-bit constants
With gcc 4.1:

    sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_switch_fetching_mode’:
    sound/firewire/fireface/ff-protocol-latter.c:97: warning: integer constant is too large for ‘long’ type
    sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_begin_session’:
    sound/firewire/fireface/ff-protocol-latter.c:170: warning: integer constant is too large for ‘long’ type
    sound/firewire/fireface/ff-protocol-latter.c:197: warning: integer constant is too large for ‘long’ type
    sound/firewire/fireface/ff-protocol-latter.c:205: warning: integer constant is too large for ‘long’ type
    sound/firewire/fireface/ff-protocol-latter.c: In function ‘latter_finish_session’:
    sound/firewire/fireface/ff-protocol-latter.c:214: warning: integer constant is too large for ‘long’ type

Fix this by adding the missing "ULL" suffixes.
Add the same suffix to the last constant, to maintain consistency.

Fixes: fd1cc9de64 ("ALSA: fireface: add support for Fireface UCX")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-05-29 12:06:48 +02:00
Eric W. Biederman
d76cac67db signal/arm64: Use force_sig not force_sig_fault for SIGKILL
I don't think this is userspace visible but SIGKILL does not have
any si_codes that use the fault member of the siginfo union.  Correct
this the simple way and call force_sig instead of force_sig_fault when
the signal is SIGKILL.

The two know places where synchronous SIGKILL are generated are
do_bad_area and fpsimd_save.  The call paths to force_sig_fault are:
do_bad_area
  arm64_force_sig_fault
    force_sig_fault
force_signal_inject
  arm64_notify_die
    arm64_force_sig_fault
       force_sig_fault

Which means correcting this in arm64_force_sig_fault is enough
to ensure the arm64 code is not misusing the generic code, which
could lead to maintenance problems later.

Cc: stable@vger.kernel.org
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: af40ff687b ("arm64: signal: Ensure si_code is valid for all fault signals")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-29 11:05:25 +01:00
Brett Creeley
e89e899f3e ice: Add a helper to trigger software interrupt
Add a new function ice_trigger_sw_intr to trigger interrupts.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 03:00:53 -07:00
Md Fahad Iqbal Polash
3a9e32bb06 ice: Configure RSS LUT key only if RSS is enabled
Call ice_vsi_cfg_rss_lut_key only if RSS is enabled.

Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:59:08 -07:00
Dan Nowlin
11fe1b3a38 ice: Add ice_get_fw_log_cfg to init FW logging
In order to initialize the current status of the FW logging,
this patch adds ice_get_fw_log_cfg. The function retrieves
the current setting of the FW logging from HW and updates the
ice_hw structure accordingly.

Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:57:27 -07:00
Anirudh Venkataramanan
1eb11036a3 ice: Minor cleanup in ice_switch.h
Remove duplicate define for ICE_INVAL_Q_HANDLE. Move defines to the
top of the file.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:55:34 -07:00
Dave Ertman
91aed40da3 ice: Remove redundant and premature event config
In the path for re-enabling FW LLDP engine, there is
a call to register for LLDP MIB change events.  This
call is redundant, in that the call to ice_pf_dcb_cfg
will already register the driver for these events.  Also,
the call as it stands now is too early in the flow before
before DCB is configured.

Remove the redundant call.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:53:57 -07:00
Mitch Williams
4cc82aaa74 ice: Change message level
Change the message level of the MTU change log message from debug to
info.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:52:23 -07:00
Mitch Williams
23c0112246 ice: Check all VFs for MDD activity, don't disable
Don't use the mdd_detected variable as an exit condition for this loop;
the first VF to NOT have an MDD event will cause the loop to terminate.

Instead just look at all of the VFs, but don't disable them. This
prevents proper release of resources if the VFs are rebooted or the VF
driver reloaded. Instead, just log a message and call out repeat
offenders.

To make it clear what we are doing, use a differently-named variable in
the loop.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:50:46 -07:00
Brett Creeley
cbe66bfee6 ice: Refactor interrupt tracking
Currently we have two MSI-x (IRQ) trackers, one for OS requested MSI-x
entries (sw_irq_tracker) and one for hardware MSI-x vectors
(hw_irq_tracker). Generally the sw_irq_tracker has less entries than the
hw_irq_tracker because the hw_irq_tracker has entries equal to the max
allowed MSI-x per PF and the sw_irq_tracker is mainly the minimum (non
SR-IOV portion of the vectors, kernel granted IRQs). All of the non
SR-IOV portions of the driver (i.e. LAN queues, RDMA queues, OICR, etc.)
take at least one of each type of tracker resource. SR-IOV only grabs
entries from the hw_irq_tracker. There are a few issues with this approach
that can be seen when doing any kind of device reconfiguration (i.e.
ethtool -L, SR-IOV, etc.). One of them being, any time the driver creates
an ice_q_vector and associates it to a LAN queue pair it will grab and
use one entry from the hw_irq_tracker and one from the sw_irq_tracker.
If the indices on these does not match it will cause a Tx timeout, which
will cause a reset and then the indices will match up again and traffic
will resume. The mismatched indices come from the trackers not being the
same size and/or the search_hint in the two trackers not being equal.
Another reason for the refactor is the co-existence of features with
SR-IOV. If SR-IOV is enabled and the interrupts are taken from the end
of the sw_irq_tracker then other features can no longer use this space
because the hardware has now given the remaining interrupts to SR-IOV.

This patch reworks how we track MSI-x vectors by removing the
hw_irq_tracker completely and instead MSI-x resources needed for SR-IOV
are determined all at once instead of per VF. This can be done because
when creating VFs we know how many are wanted and how many MSI-x vectors
each VF needs. This also allows us to start using MSI-x resources from
the end of the PF's allowed MSI-x vectors so we are less likely to use
entries needed for other features (i.e. RDMA, L2 Offload, etc).

This patch also reworks the ice_res_tracker structure by removing the
search_hint and adding a new member - "end". Instead of having a
search_hint we will always search from 0. The new member, "end", will be
used to manipulate the end of the ice_res_tracker (specifically
sw_irq_tracker) during runtime based on MSI-x vectors needed by SR-IOV.
In the normal case, the end of ice_res_tracker will be equal to the
ice_res_tracker's num_entries.

The sriov_base_vector member was added to the PF structure. It is used
to represent the starting MSI-x index of all the needed MSI-x vectors
for all SR-IOV VFs. Depending on how many MSI-x are needed, SR-IOV may
have to take resources from the sw_irq_tracker. This is done by setting
the sw_irq_tracker->end equal to the pf->sriov_base_vector. When all
SR-IOV VFs are removed then the sw_irq_tracker->end is reset back to
sw_irq_tracker->num_entries. The sriov_base_vector, along with the VF's
number of MSI-x (pf->num_vf_msix), vf_id, and the base MSI-x index on
the PF (pf->hw.func_caps.common_cap.msix_vector_first_id), is used to
calculate the first HW absolute MSI-x index for each VF, which is used
to write to the VPINT_ALLOC[_PCI] and GLINT_VECT2FUNC registers to
program the VFs MSI-x PCI configuration bits. Also, the sriov_base_vector
is used along with VF's num_vf_msix, vf_id, and q_vector->v_idx to
determine the MSI-x register index (used for writing to GLINT_DYN_CTL)
within the PF's space.

Interrupt changes removed any references to hw_base_vector, hw_oicr_idx,
and hw_irq_tracker. Only sw_base_vector, sw_oicr_idx, and sw_irq_tracker
variables remain. Change all of these by removing the "sw_" prefix to
help avoid confusion with these variables and their use.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:48:49 -07:00
Anirudh Venkataramanan
0e674aeb0b ice: Add handler for ethtool selftest
This patch adds a handler for ethtool selftest. Selftest includes
testing link, interrupts, eeprom, registers and packet loopback.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:44:12 -07:00
Brett Creeley
4b6f3ecabf ice: Don't call ice_cfg_itr() for SR-IOV
ice_cfg_itr() sets the ITR granularity and default ITR values for the
PF's interrupt vectors. For VF's this will be done in the AVF driver
flow. Fix this by not calling ice_cfg_itr() for SR-IOV.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:40:30 -07:00
Brett Creeley
1aec6e1b08 ice: Set minimum default Rx descriptor count to 512
Currently we set the default number of Rx descriptors per
queue to the system's page size divided by the number of bytes per
descriptor. For 4K page size systems this is resulting in 128 Rx
descriptors per queue. This is causing more dropped packets than desired
in the default configuration. Fix this by setting the minimum default
Rx descriptor count per queue to 512.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:38:50 -07:00