Commit Graph

2480 Commits

Author SHA1 Message Date
Florian Fainelli
7cb6a2a2c7 net: systemport: Protect stop from timeout
A timing hazard exists when the network interface is stopped that
allows a watchdog timeout to be processed by a separate core in
parallel. This creates the potential for the timeout handler to
wake the queues while the driver is shutting down, or access
registers after their clocks have been removed.

The more common case is that the watchdog timeout will produce a
warning message which doesn't lead to a crash. The chances of this
are greatly increased by the fact that bcm_sysport_netif_stop stops
the transmit queues which can easily precipitate a watchdog time-
out because of stale trans_start data in the queues.

This commit corrects the behavior by ensuring that the watchdog
timeout is disabled before enterring bcm_sysport_netif_stop. There
are currently only two users of the bcm_sysport_netif_stop function:
close and suspend.

The close case already handles the issue by exiting the RUNNING
state before invoking the driver close service.

The suspend case now performs the netif_device_detach to exit the
PRESENT state before the call to bcm_sysport_netif_stop rather than
after it.

These behaviors prevent any future scheduling of the driver timeout
service during the window. The netif_tx_stop_all_queues function
in bcm_sysport_netif_stop is replaced with netif_tx_disable to ensure
synchronization with any transmit or timeout threads that may
already be executing on other cores.

For symmetry, the netif_device_attach call upon resume is moved to
after the call to bcm_sysport_netif_start. Since it wakes the transmit
queues it is not necessary to invoke netif_tx_start_all_queues from
bcm_sysport_netif_start so it is moved into the driver open service.

Fixes: 40755a0fce ("net: systemport: add suspend and resume support")
Fixes: 80105befdb ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-03 00:03:40 -07:00
Doug Berger
09e805d257 net: bcmgenet: protect stop from timeout
A timing hazard exists when the network interface is stopped that
allows a watchdog timeout to be processed by a separate core in
parallel. This creates the potential for the timeout handler to
wake the queues while the driver is shutting down, or access
registers after their clocks have been removed.

The more common case is that the watchdog timeout will produce a
warning message which doesn't lead to a crash. The chances of this
are greatly increased by the fact that bcmgenet_netif_stop stops
the transmit queues which can easily precipitate a watchdog time-
out because of stale trans_start data in the queues.

This commit corrects the behavior by ensuring that the watchdog
timeout is disabled before enterring bcmgenet_netif_stop. There
are currently only two users of the bcmgenet_netif_stop function:
close and suspend.

The close case already handles the issue by exiting the RUNNING
state before invoking the driver close service.

The suspend case now performs the netif_device_detach to exit the
PRESENT state before the call to bcmgenet_netif_stop rather than
after it.

These behaviors prevent any future scheduling of the driver timeout
service during the window. The netif_tx_stop_all_queues function
in bcmgenet_netif_stop is replaced with netif_tx_disable to ensure
synchronization with any transmit or timeout threads that may
already be executing on other cores.

For symmetry, the netif_device_attach call upon resume is moved to
after the call to bcmgenet_netif_start. Since it wakes the transmit
queues it is not necessary to invoke netif_tx_start_all_queues from
bcmgenet_netif_start so it is moved into the driver open service.

Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-03 00:03:39 -07:00
Linus Torvalds
4904008165 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "What better way to start off a weekend than with some networking bug
  fixes:

  1) net namespace leak in dump filtering code of ipv4 and ipv6, fixed
     by David Ahern and Bjørn Mork.

  2) Handle bad checksums from hardware when using CHECKSUM_COMPLETE
     properly in UDP, from Sean Tranchetti.

  3) Remove TCA_OPTIONS from policy validation, it turns out we don't
     consistently use nested attributes for this across all packet
     schedulers. From David Ahern.

  4) Fix SKB corruption in cadence driver, from Tristram Ha.

  5) Fix broken WoL handling in r8169 driver, from Heiner Kallweit.

  6) Fix OOPS in pneigh_dump_table(), from Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits)
  net/neigh: fix NULL deref in pneigh_dump_table()
  net: allow traceroute with a specified interface in a vrf
  bridge: do not add port to router list when receives query with source 0.0.0.0
  net/smc: fix smc_buf_unuse to use the lgr pointer
  ipv6/ndisc: Preserve IPv6 control buffer if protocol error handlers are called
  net/{ipv4,ipv6}: Do not put target net if input nsid is invalid
  lan743x: Remove SPI dependency from Microchip group.
  drivers: net: remove <net/busy_poll.h> inclusion when not needed
  net: phy: genphy_10g_driver: Avoid NULL pointer dereference
  r8169: fix broken Wake-on-LAN from S5 (poweroff)
  octeontx2-af: Use GFP_ATOMIC under spin lock
  net: ethernet: cadence: fix socket buffer corruption problem
  net/ipv6: Allow onlink routes to have a device mismatch if it is the default route
  net: sched: Remove TCA_OPTIONS from policy
  ice: Poll for link status change
  ice: Allocate VF interrupts and set queue map
  ice: Introduce ice_dev_onetime_setup
  net: hns3: Fix for warning uninitialized symbol hw_err_lst3
  octeontx2-af: Copy the right amount of memory
  net: udp: fix handling of CHECKSUM_COMPLETE packets
  ...
2018-10-26 19:25:07 -07:00
Linus Torvalds
b27186abb3 Devicetree updates for 4.20:
- Sync dtc with upstream version v1.4.7-14-gc86da84d30e4
 
 - Work to get rid of direct accesses to struct device_node name and
   type pointers in preparation for removing them. New helpers for
   parsing DT cpu nodes and conversions to use the helpers. printk
   conversions to %pOFn for printing DT node names. Most went thru
   subystem trees, so this is the remainder.
 
 - Fixes to DT child node lookups to actually be restricted to child
   nodes instead of treewide.
 
 - Refactoring of dtb targets out of arch code. This makes the support
   more uniform and enables building all dtbs on c6x, microblaze, and
   powerpc.
 
 - Various DT binding updates for Renesas r8a7744 SoC
 
 - Vendor prefixes for Facebook, OLPC
 
 - Restructuring of some ARM binding docs moving some peripheral bindings
   out of board/SoC binding files
 
 - New "secure-chosen" binding for secure world settings on ARM
 
 - Dual licensing of 2 DT IRQ binding headers
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAlvTKWYQHHJvYmhAa2Vy
 bmVsLm9yZwAKCRD6+121jbxhw8J5EACMAnrTxWQmXfQXOZEVxztcFavH6LP8mh2e
 7FZIZ38jzHXXvl81tAg1nBhzFUU/qtvqW8NDCZ9OBxKvp6PFDNhWu241ZodSB1Kw
 MZWy2A9QC+qbHYCC+SB5gOT0+Py3v7LNCBa5/TxhbFd35THJM8X0FP7gmcCGX593
 9Ml1rqawT4mK5XmCpczT0cXxyC4TgVtpfDWZH2KgJTR/kwXVQlOQOGZ8a1y/wrt7
 8TLIe7Qy4SFRzjhwbSta1PUehyYfe4uTSsXIJ84kMvNMxinLXQtvd7t9TfsK8p/R
 WjYUneJskVjtxVrMQfdV4MxyFL1YEt2mYcr0PMKIWxMCgGDAZsHPoUZmjyh/PrCI
 uiZtEHn3fXpUZAV/xEHHNirJxYyQfHGiksAT+lPrUXYYLCcZ3ZmqiTEYhGoQAfH5
 CQPMuxA6yXxp6bov6zJwZSTZtkXciju8aQRhUhlxIfHTqezmGYeql/bnWd+InNuR
 upANLZBh6D2jTWzDyobconkCCLlVkSqDoqOx725mMl6hIcdH9d2jVX7hwRf077VI
 5i3CyPSJOkSOLSdB8bAPYfBoaDtH2bthxieUrkkSbIjbwHO1H6a2lxPeG/zah0a3
 ePMGhi7J84UM4VpJEi000cP+bhPumJtJrG7zxP7ldXdfAF436sQ6KRptlcpLpj5i
 IwMhUQNH+g==
 =335v
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull Devicetree updates from Rob Herring:
 "A bit bigger than normal as I've been busy this cycle.

  There's a few things with dependencies and a few things subsystem
  maintainers didn't pick up, so I'm taking them thru my tree.

  The fixes from Johan didn't get into linux-next, but they've been
  waiting for some time now and they are what's left of what subsystem
  maintainers didn't pick up.

  Summary:

   - Sync dtc with upstream version v1.4.7-14-gc86da84d30e4

   - Work to get rid of direct accesses to struct device_node name and
     type pointers in preparation for removing them. New helpers for
     parsing DT cpu nodes and conversions to use the helpers. printk
     conversions to %pOFn for printing DT node names. Most went thru
     subystem trees, so this is the remainder.

   - Fixes to DT child node lookups to actually be restricted to child
     nodes instead of treewide.

   - Refactoring of dtb targets out of arch code. This makes the support
     more uniform and enables building all dtbs on c6x, microblaze, and
     powerpc.

   - Various DT binding updates for Renesas r8a7744 SoC

   - Vendor prefixes for Facebook, OLPC

   - Restructuring of some ARM binding docs moving some peripheral
     bindings out of board/SoC binding files

   - New "secure-chosen" binding for secure world settings on ARM

   - Dual licensing of 2 DT IRQ binding headers"

* tag 'devicetree-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (78 commits)
  ARM: dt: relicense two DT binding IRQ headers
  power: supply: twl4030-charger: fix OF sibling-node lookup
  NFC: nfcmrvl_uart: fix OF child-node lookup
  net: stmmac: dwmac-sun8i: fix OF child-node lookup
  net: bcmgenet: fix OF child-node lookup
  drm/msm: fix OF child-node lookup
  drm/mediatek: fix OF sibling-node lookup
  of: Add missing exports of node name compare functions
  dt-bindings: Add OLPC vendor prefix
  dt-bindings: misc: bk4: Add device tree binding for Liebherr's BK4 SPI bus
  dt-bindings: thermal: samsung: Add SPDX license identifier
  dt-bindings: clock: samsung: Add SPDX license identifiers
  dt-bindings: timer: ostm: Add R7S9210 support
  dt-bindings: phy: rcar-gen2: Add r8a7744 support
  dt-bindings: can: rcar_can: Add r8a7744 support
  dt-bindings: timer: renesas, cmt: Document r8a7744 CMT support
  dt-bindings: watchdog: renesas-wdt: Document r8a7744 support
  dt-bindings: thermal: rcar: Add device tree support for r8a7744
  Documentation: dt: Add binding for /secure-chosen/stdout-path
  dt-bindings: arm: zte: Move sysctrl bindings to their own doc
  ...
2018-10-26 12:09:58 -07:00
Eric Dumazet
55469bc6b5 drivers: net: remove <net/busy_poll.h> inclusion when not needed
Drivers using generic NAPI interface no longer need to include
<net/busy_poll.h>, since busy polling was moved to core networking
stack long ago.

See commit 79e7fff47b ("net: remove support for per driver
ndo_busy_poll()") for reference.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-25 16:20:02 -07:00
Linus Torvalds
bd6bf7c104 pci-v4.20-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlvPV7IUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vyaUg//WnCaRIu2oKOp8c/bplZJDW5eT10d
 oYAN9qeyptU9RYrg4KBNbZL9UKGFTk3AoN5AUjrk8njxc/dY2ra/79esOvZyyYQy
 qLXBvrXKg3yZnlNlnyBneGSnUVwv/kl2hZS+kmYby2YOa8AH/mhU0FIFvsnfRK2I
 XvwABFm2ZYvXCqh3e5HXaHhOsR88NQ9In0AXVC7zHGqv1r/bMVn2YzPZHL/zzMrF
 mS79tdBTH+shSvchH9zvfgIs+UEKvvjEJsG2liwMkcQaV41i5dZjSKTdJ3EaD/Y2
 BreLxXRnRYGUkBqfcon16Yx+P6VCefDRLa+RhwYO3dxFF2N4ZpblbkIdBATwKLjL
 npiGc6R8yFjTmZU0/7olMyMCm7igIBmDvWPcsKEE8R4PezwoQv6YKHBMwEaflIbl
 Rv4IUqjJzmQPaA0KkRoAVgAKHxldaNqno/6G1FR2gwz+fr68p5WSYFlQ3axhvTjc
 bBMJpB/fbp9WmpGJieTt6iMOI6V1pnCVjibM5ZON59WCFfytHGGpbYW05gtZEod4
 d/3yRuU53JRSj3jQAQuF1B6qYhyxvv5YEtAQqIFeHaPZ67nL6agw09hE+TlXjWbE
 rTQRShflQ+ydnzIfKicFgy6/53D5hq7iH2l7HwJVXbXRQ104T5DB/XHUUTr+UWQn
 /Nkhov32/n6GjxQ=
 =58I4
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

 - Fix ASPM link_state teardown on removal (Lukas Wunner)

 - Fix misleading _OSC ASPM message (Sinan Kaya)

 - Make _OSC optional for PCI (Sinan Kaya)

 - Don't initialize ASPM link state when ACPI_FADT_NO_ASPM is set
   (Patrick Talbert)

 - Remove x86 and arm64 node-local allocation for host bridge structures
   (Punit Agrawal)

 - Pay attention to device-specific _PXM node values (Jonathan Cameron)

 - Support new Immediate Readiness bit (Felipe Balbi)

 - Differentiate between pciehp surprise and safe removal (Lukas Wunner)

 - Remove unnecessary pciehp includes (Lukas Wunner)

 - Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner)

 - Tolerate PCIe Slot Presence Detect being hardwired to zero to
   workaround broken hardware, e.g., the Wilocity switch/wireless device
   (Lukas Wunner)

 - Unify pciehp controller & slot structs (Lukas Wunner)

 - Constify hotplug_slot_ops (Lukas Wunner)

 - Drop hotplug_slot_info (Lukas Wunner)

 - Embed hotplug_slot struct into users instead of allocating it
   separately (Lukas Wunner)

 - Initialize PCIe port service drivers directly instead of relying on
   initcall ordering (Keith Busch)

 - Restore PCI config state after a slot reset (Keith Busch)

 - Save/restore DPC config state along with other PCI config state
   (Keith Busch)

 - Reference count devices during AER handling to avoid race issue with
   concurrent hot removal (Keith Busch)

 - If an Upstream Port reports ERR_FATAL, don't try to read the Port's
   config space because it is probably unreachable (Keith Busch)

 - During error handling, use slot-specific reset instead of secondary
   bus reset to avoid link up/down issues on hotplug ports (Keith Busch)

 - Restore previous AER/DPC handling that does not remove and
   re-enumerate devices on ERR_FATAL (Keith Busch)

 - Notify all drivers that may be affected by error recovery resets
   (Keith Busch)

 - Always generate error recovery uevents, even if a driver doesn't have
   error callbacks (Keith Busch)

 - Make PCIe link active reporting detection generic (Keith Busch)

 - Support D3cold in PCIe hierarchies during system sleep and runtime,
   including hotplug and Thunderbolt ports (Mika Westerberg)

 - Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots
   are empty or occupied (Jon Derrick)

 - Remove duplicated include from pci/pcie/err.c and unused variable
   from cpqphp (YueHaibing)

 - Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza
   Pawandeep)

 - Uninline PCI bus accessors for better ftracing (Keith Busch)

 - Remove unused AER Root Port .error_resume method (Keith Busch)

 - Use kfifo in AER instead of a local version (Keith Busch)

 - Use threaded IRQ in AER bottom half (Keith Busch)

 - Use managed resources in AER core (Keith Busch)

 - Reuse pcie_port_find_device() for AER injection (Keith Busch)

 - Abstract AER interrupt handling to disconnect error injection (Keith
   Busch)

 - Refactor AER injection callbacks to simplify future improvments
   (Keith Busch)

 - Remove unused Netronome NFP32xx Device IDs (Jakub Kicinski)

 - Use bitmap_zalloc() for dma_alias_mask (Andy Shevchenko)

 - Add switch fall-through annotations (Gustavo A. R. Silva)

 - Remove unused Switchtec quirk variable (Joshua Abraham)

 - Fix pci.c kernel-doc warning (Randy Dunlap)

 - Remove trivial PCI wrappers for DMA APIs (Christoph Hellwig)

 - Add Intel GPU device IDs to spurious interrupt quirk (Bin Meng)

 - Run Switchtec DMA aliasing quirk only on NTB endpoints to avoid
   useless dmesg errors (Logan Gunthorpe)

 - Update Switchtec NTB documentation (Wesley Yung)

 - Remove redundant "default n" from Kconfig (Bartlomiej Zolnierkiewicz)

 - Avoid panic when drivers enable MSI/MSI-X twice (Tonghao Zhang)

 - Add PCI support for peer-to-peer DMA (Logan Gunthorpe)

 - Add sysfs group for PCI peer-to-peer memory statistics (Logan
   Gunthorpe)

 - Add PCI peer-to-peer DMA scatterlist mapping interface (Logan
   Gunthorpe)

 - Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan
   Gunthorpe)

 - Add PCI peer-to-peer DMA driver writer's documentation (Logan
   Gunthorpe)

 - Add block layer flag to indicate driver support for PCI peer-to-peer
   DMA (Logan Gunthorpe)

 - Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P
   memory (Logan Gunthorpe)

 - Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan
   Gunthorpe)

 - Add nvme-pci support for PCI peer-to-peer memory in requests (Logan
   Gunthorpe)

 - Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise,
   Christoph Hellwig, Logan Gunthorpe)

 - Cache VF config space size to optimize enumeration of many VFs
   (KarimAllah Ahmed)

 - Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas)

 - Fix VMD AERSID quirk Device ID matching (Jon Derrick)

 - Fix Cadence PHY handling during probe (Alan Douglas)

 - Signal Cadence Endpoint interrupts via AXI region 0 instead of last
   region (Alan Douglas)

 - Write Cadence Endpoint MSI interrupts with 32 bits of data (Alan
   Douglas)

 - Remove redundant controller tests for "device_type == pci" (Rob
   Herring)

 - Document R-Car E3 (R8A77990) bindings (Tho Vu)

 - Add device tree support for R-Car r8a7744 (Biju Das)

 - Drop unused mvebu PCIe capability code (Thomas Petazzoni)

 - Add shared PCI bridge emulation code (Thomas Petazzoni)

 - Convert mvebu to use shared PCI bridge emulation (Thomas Petazzoni)

 - Add aardvark Root Port emulation (Thomas Petazzoni)

 - Support 100MHz/200MHz refclocks for i.MX6 (Lucas Stach)

 - Add initial power management for i.MX7 (Leonard Crestez)

 - Add PME_Turn_Off support for i.MX7 (Leonard Crestez)

 - Fix qcom runtime power management error handling (Bjorn Andersson)

 - Update TI dra7xx unaligned access errata workaround for host mode as
   well as endpoint mode (Vignesh R)

 - Fix kirin section mismatch warning (Nathan Chancellor)

 - Remove iproc PAXC slot check to allow VF support (Jitendra Bhivare)

 - Quirk Keystone K2G to limit MRRS to 256 (Kishon Vijay Abraham I)

 - Update Keystone to use MRRS quirk for host bridge instead of open
   coding (Kishon Vijay Abraham I)

 - Refactor Keystone link establishment (Kishon Vijay Abraham I)

 - Simplify and speed up Keystone link training (Kishon Vijay Abraham I)

 - Remove unused Keystone host_init argument (Kishon Vijay Abraham I)

 - Merge Keystone driver files into one (Kishon Vijay Abraham I)

 - Remove redundant Keystone platform_set_drvdata() (Kishon Vijay
   Abraham I)

 - Rename Keystone functions for uniformity (Kishon Vijay Abraham I)

 - Add Keystone device control module DT binding (Kishon Vijay Abraham
   I)

 - Use SYSCON API to get Keystone control module device IDs (Kishon
   Vijay Abraham I)

 - Clean up Keystone PHY handling (Kishon Vijay Abraham I)

 - Use runtime PM APIs to enable Keystone clock (Kishon Vijay Abraham I)

 - Clean up Keystone config space access checks (Kishon Vijay Abraham I)

 - Get Keystone outbound window count from DT (Kishon Vijay Abraham I)

 - Clean up Keystone outbound window configuration (Kishon Vijay Abraham
   I)

 - Clean up Keystone DBI setup (Kishon Vijay Abraham I)

 - Clean up Keystone ks_pcie_link_up() (Kishon Vijay Abraham I)

 - Fix Keystone IRQ status checking (Kishon Vijay Abraham I)

 - Add debug messages for all Keystone errors (Kishon Vijay Abraham I)

 - Clean up Keystone includes and macros (Kishon Vijay Abraham I)

 - Fix Mediatek unchecked return value from devm_pci_remap_iospace()
   (Gustavo A. R. Silva)

 - Fix Mediatek endpoint/port matching logic (Honghui Zhang)

 - Change Mediatek Root Port Class Code to PCI_CLASS_BRIDGE_PCI (Honghui
   Zhang)

 - Remove redundant Mediatek PM domain check (Honghui Zhang)

 - Convert Mediatek to pci_host_probe() (Honghui Zhang)

 - Fix Mediatek MSI enablement (Honghui Zhang)

 - Add Mediatek system PM support for MT2712 and MT7622 (Honghui Zhang)

 - Add Mediatek loadable module support (Honghui Zhang)

 - Detach VMD resources after stopping root bus to prevent orphan
   resources (Jon Derrick)

 - Convert pcitest build process to that used by other tools (iio, perf,
   etc) (Gustavo Pimentel)

* tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits)
  PCI/AER: Refactor error injection fallbacks
  PCI/AER: Abstract AER interrupt handling
  PCI/AER: Reuse existing pcie_port_find_device() interface
  PCI/AER: Use managed resource allocations
  PCI: pcie: Remove redundant 'default n' from Kconfig
  PCI: aardvark: Implement emulated root PCI bridge config space
  PCI: mvebu: Convert to PCI emulated bridge config space
  PCI: mvebu: Drop unused PCI express capability code
  PCI: Introduce PCI bridge emulated config space common logic
  PCI: vmd: Detach resources after stopping root bus
  nvmet: Optionally use PCI P2P memory
  nvmet: Introduce helper functions to allocate and free request SGLs
  nvme-pci: Add support for P2P memory in requests
  nvme-pci: Use PCI p2pmem subsystem to manage the CMB
  IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()
  block: Add PCI P2P flag for request queue
  PCI/P2PDMA: Add P2P DMA driver writer's documentation
  docs-rst: Add a new directory for PCI documentation
  PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers
  PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset
  ...
2018-10-25 06:50:48 -07:00
Johan Hovold
d397dbe606 net: bcmgenet: fix OF child-node lookup
Use the new of_get_compatible_child() helper to lookup the mdio child
node instead of using of_find_compatible_node(), which searches the
entire tree from a given start node and thus can return an unrelated
(i.e. non-child) node.

This also addresses a potential use-after-free (e.g. after probe
deferral) as the tree-wide helper drops a reference to its first
argument (i.e. the node of the device being probed).

Fixes: aa09677cba ("net: bcmgenet: add MDIO routines")
Cc: stable <stable@vger.kernel.org>     # 3.15
Cc: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-10-23 13:28:52 -05:00
David S. Miller
2e2d6f0342 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump)) in 'net-next'.  Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19 11:03:06 -07:00
Dan Carpenter
35b842f25b bnxt_en: Copy and paste bug in extended tx_stats
The struct type was copied from the line before but it should be "tx"
instead of "rx".  I have reviewed the code and I can't immediately see
that this bug causes a runtime issue.

Fixes: 36e53349b6 ("bnxt_en: Add additional extended port statistics.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18 15:59:10 -07:00
Michael Chan
1ab968d2f1 bnxt_en: Add PCI ID for BCM57508 device.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:33 -07:00
Michael Chan
0fcec9854a bnxt_en: Add new NAPI poll function for 57500 chips.
Add a new poll function that polls for NQ events.  If the NQ event is
a CQ notification, we locate the CP ring from the cq_handle and call
__bnxt_poll_work() to handle RX/TX events on the CP ring.

Add a new has_more_work field in struct bnxt_cp_ring_info to indicate
budget has been reached.  __bnxt_poll_cqs_done() is called to update or
ARM the CP rings if budget has not been reached or not.  If budget
has been reached, the next bnxt_poll_p5() call will continue to poll
from the CQ rings directly.  Otherwise, the NQ will be ARMed for the
next IRQ.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:33 -07:00
Michael Chan
3675b92fa7 bnxt_en: Refactor bnxt_poll_work().
Separate the CP ring polling logic in bnxt_poll_work() into 2 separate
functions __bnxt_poll_work() and __bnxt_poll_work_done().  Since the logic
is separated, we need to add tx_pkts and events fields to struct bnxt_napi
to keep track of the events to handle between the 2 functions.  We also
add had_work_done field to struct bnxt_cp_ring_info to indicate whether
some work was performed on the CP ring.

This is needed to better support the 57500 chips.  We need to poll up to
2 separate CP rings before we update or ARM the CP rings on the 57500 chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:33 -07:00
Michael Chan
58590c8d90 bnxt_en: Add coalescing setup for 57500 chips.
On legacy chips, the CP ring may be shared between RX and TX and so only
setup the RX coalescing parameters in such a case.  On 57500 chips, we
always have a dedicated CP ring for TX so we can always set up the
TX coalescing parameters in bnxt_hwrm_set_coal().

Also, the min_timer coalescing parameter applies to the NQ on the new
chips and a separate firmware call needs to be made to set it up.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:33 -07:00
Michael Chan
e44758b78a bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.
In the RX code path, we current use the bnxt_napi struct pointer to
identify the associated RX/CP rings.  Change it to use the struct
bnxt_cp_ring_info pointer instead since there are now up to 2
CP rings per MSIX.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:33 -07:00
Michael Chan
7b3af4f75b bnxt_en: Add RSS support for 57500 chips.
RSS context allocation and RSS indirection table setup are very different
on the new chip.  Refactor bnxt_setup_vnic() to call 2 different functions
to set up RSS for the vnic based on chip type.  On the new chip, the
number of RSS contexts and the indirection table size depends on the
number of RX rings.  Each indirection table entry is also different
on the new chip since ring groups are no longer used.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
44c6f72a4c bnxt_en: Increase RSS context array count and skip ring groups on 57500 chips.
On the new 57500 chips, we need to allocate one RSS context for every
64 RX rings.  In previous chips, only one RSS context per vnic is
required regardless of the number of RX rings.  So increase the max
RSS context array count to 8.

Hardware ring groups are not used on the new chips.  Note that the
software ring group structure is still maintained in the driver to
keep track of the rings associated with the vnic.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
3e08b1841b bnxt_en: Allocate/Free CP rings for 57500 series chips.
On the new 57500 chips, we allocate/free one CP ring for each RX ring or
TX ring separately.  Using separate CP rings for RX/TX is an improvement
as TX events will no longer be stuck behind RX events.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
23aefdd761 bnxt_en: Modify bnxt_ring_alloc_send_msg() to support 57500 chips.
Firmware ring allocation semantics are slightly different for most
ring types on 57500 chips.  Allocation/deallocation for NQ rings are
also added for the new chips.

A CP ring handle is also added so that from the NQ interrupt event,
we can locate the CP ring.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
2c61d2117e bnxt_en: Add helper functions to get firmware CP ring ID.
On the new 57500 chips, getting the associated CP ring ID associated with
an RX ring or TX ring is different than before.  On the legacy chips,
we find the associated ring group and look up the CP ring ID.  On the
57500 chips, each RX ring and TX ring has a dedicated CP ring even if
they share the MSIX.  Use these helper functions at appropriate places
to get the CP ring ID.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
50e3ab7836 bnxt_en: Allocate completion ring structures for 57500 series chips.
On 57500 chips, the original bnxt_cp_ring_info struct now refers to the
NQ.  bp->cp_nr_rings refer to the number of NQs on 57500 chips.  There
are now 2 pointers for the CP rings associated with RX and TX rings.
Modify bnxt_alloc_cp_rings() and bnxt_free_cp_rings() accordingly.

With multiple CP rings per NAPI, we need to add a pointer in
bnxt_cp_ring_info struct to point back to the bnxt_napi struct.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
41e8d79837 bnxt_en: Modify the ring reservation functions for 57500 series chips.
The ring reservation functions have to be modified for P5 chips in the
following ways:

- bnxt_cp_ring_info structs map to internal NQs as well as CP rings.
- Ring groups are not used.
- 1 CP ring must be available for each RX or TX ring.
- number of RSS contexts to reserve is multiples of 64 RX rings.
- RFS currently not supported.

Also, RX AGG rings are only used for jumbo frames, so we need to
unconditionally call bnxt_reserve_rings() in __bnxt_open_nic()
to see if we need to reserve AGG rings in case MTU has changed.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
9c1fabdf42 bnxt_en: Adjust MSIX and ring groups for 57500 series chips.
Store the maximum MSIX capability in PCIe config. space earlier.  When
we call firmware to query capability, we need to compare the PCIe
MSIX max count with the firmware count and use the smaller one as
the MSIX count for 57500 (P5) chips.

The new chips don't use ring groups.  But previous chips do and
the existing logic limits the available rings based on resource
calculations including ring groups.  Setting the max ring groups to
the max rx rings will work on the new chips without changing the
existing logic.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
697197e5a1 bnxt_en: Re-structure doorbells.
The 57500 series chips have a new 64-bit doorbell format.  Use a new
bnxt_db_info structure to unify the new and the old 32-bit doorbells.
Add a new bnxt_set_db() function to set up the doorbell addreses and
doorbell keys ahead of time.  Modify and introduce new doorbell
helpers to help abstract and unify the old and new doorbells.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
e38287b72e bnxt_en: Add 57500 new chip ID and basic structures.
57500 series is a new chip class (P5) that requires some driver changes
in the next several patches.  This adds basic chip ID, doorbells, and
the notification queue (NQ) structures.  Each MSIX is associated with an
NQ instead of a CP ring in legacy chips.  Each NQ has up to 2 associated
CP rings for RX and TX.  The same bnxt_cp_ring_info struct will be used
for the NQ.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
1b9394e5a2 bnxt_en: Configure context memory on new devices.
Call firmware to configure the DMA addresses of all context memory
pages on new devices requiring context memory.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
98f04cf0f1 bnxt_en: Check context memory requirements from firmware.
New device requires host context memory as a backing store.  Call
firmware to check for context memory requirements and store the
parameters.  Allocate host pages accordingly.

We also need to move the call bnxt_hwrm_queue_qportcfg() earlier
so that all the supported hardware queues and the IDs are known
before checking and allocating context memory.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:32 -07:00
Michael Chan
66cca20abc bnxt_en: Add new flags to setup new page table PTE bits on newer devices.
Newer chips require the PTU_PTE_VALID bit to be set for every page
table entry for context memory and rings.  Additional bits are also
required for page table entries for all rings.  Add a flags field to
bnxt_ring_mem_info struct to specify these additional bits to be used
when setting up the pages tables as needed.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Michael Chan
6fe1988685 bnxt_en: Refactor bnxt_ring_struct.
Move the DMA page table and vmem fields in bnxt_ring_struct to a new
bnxt_ring_mem_info struct.  This will allow context memory management
for a new device to re-use some of the existing infrastructure.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Michael Chan
74706afa71 bnxt_en: Update interrupt coalescing logic.
New firmware spec. allows interrupt coalescing parameters, such as
maximums, timer units, supported features to be queried.  Update
the driver to make use of the new call to query these parameters
and provide the legacy defaults if the call is not available.

Replace the hard-coded values with these parameters.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Michael Chan
1dfddc41ae bnxt_en: Add maximum extended request length fw message support.
Support the max_ext_req_len field from the HWRM_VER_GET_RESPONSE.
If this field is valid and greater than the mailbox size, use the
short command format to send firmware messages greater than the
mailbox size.  Newer devices use this method to send larger messages
to the firmware.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Michael Chan
36e53349b6 bnxt_en: Add additional extended port statistics.
Latest firmware spec. has some additional rx extended port stats and new
tx extended port stats added.  We now need to check the size of the
returned rx and tx extended stats and determine how many counters are
valid.  New counters added include CoS byte and packet counts for rx
and tx.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Michael Chan
31d357c069 bnxt_en: Update firmware interface spec. to 1.10.0.3.
Among the new changes are trusted VF support, 200Gbps support, and new
API to dump ring information on the new chips.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:44:31 -07:00
Florian Fainelli
64bd9c8135 net: bcmgenet: Poll internal PHY for GENETv5
On GENETv5, there is a hardware issue which prevents the GENET hardware
from generating a link UP interrupt when the link is operating at
10Mbits/sec. Since we do not have any way to configure the link
detection logic, fallback to polling in that case.

Fixes: 421380856d ("net: bcmgenet: add support for the GENETv5 hardware")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 22:10:21 -07:00
Gustavo A. R. Silva
5fc7c12ffa bnxt_en: Remove unnecessary unsigned integer comparison and initialize variable
There is no need to compare *val.vu32* with < 0 because
such variable is of type u32 (32 bits, unsigned), making it
impossible to hold a negative value. Fix this by removing
such comparison.

Also, initialize variable *max_val* to -1, just in case
it is not initialized to either BNXT_MSIX_VEC_MAX or
BNXT_MSIX_VEC_MIN_MAX before using it in a comparison
with val.vu32 at line 159:

	if (val.vu32 > max_val)

Addresses-Coverity-ID: 1473915 ("Unsigned compared against 0")
Addresses-Coverity-ID: 1473920 ("Uninitialized scalar variable")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-07 20:36:40 -07:00
David S. Miller
72438f8cef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-06 14:43:42 -07:00
Vasundhara Volam
c78fe05887 bnxt_en: get the reduced max_irqs by the ones used by RDMA
When getting the max rings supported, get the reduced max_irqs
by the ones used by RDMA.

If the number MSIX is the limiting factor, this bug may cause the
max ring count to be higher than it should be when RDMA driver is
loaded and may result in ring allocation failures.

Fixes: 30f529473e ("bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 21:41:16 -07:00
Venkat Duvvuru
a2bf74f4e1 bnxt_en: free hwrm resources, if driver probe fails.
When the driver probe fails, all the resources that were allocated prior
to the failure must be freed. However, hwrm dma response memory is not
getting freed.

This patch fixes the problem described above.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 21:41:16 -07:00
Vasundhara Volam
5db0e0969a bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
In HWRM_QUEUE_COS2BW_CFG request, enables field should have the bits
set only for the queue ids which are having the valid parameters.

This causes firmware to return error when the TC to hardware CoS queue
mapping is not 1:1 during DCBNL ETS setup.

Fixes: 2e8ef77ee0 ("bnxt_en: Add TC to hardware QoS queue mapping logic.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 21:41:16 -07:00
Michael Chan
dbe80d446c bnxt_en: Fix VNIC reservations on the PF.
The enables bit for VNIC was set wrong when calling the HWRM_FUNC_CFG
firmware call to reserve VNICs.  This has the effect that the firmware
will keep a large number of VNICs for the PF, and having very few for
VFs.  DPDK driver running on the VFs, which requires more VNICs, may not
work properly as a result.

Fixes: 674f50a5b0 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 21:41:16 -07:00
Vasundhara Volam
2dc0865e9a bnxt_en: Add a driver specific gre_ver_check devlink parameter.
This patch adds following driver-specific permanent mode boolean
parameter.

gre_ver_check - Generic Routing Encapsulation(GRE) version check
will be enabled in the device. If disabled, device skips version
checking for GRE packets.

Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 13:49:43 -07:00
Vasundhara Volam
f399e84978 bnxt_en: Use msix_vec_per_pf_max and msix_vec_per_pf_min devlink params.
This patch adds support for following generic permanent mode
devlink parameters. They can be modified using devlink param
commands.

msix_vec_per_pf_max - This param sets the number of MSIX vectors
that the device requests from the host on driver initialization.
This value is set in the device which limits MSIX vectors per PF.

msix_vec_per_pf_min - This param sets the number of minimal MSIX
vectors required for the device initialization. Value 0 indicates
a default value is selected. This value is set in the device which
limits MSIX vectors per PF.

Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 13:49:43 -07:00
Vasundhara Volam
3a1d52a54a bnxt_en: return proper error when FW returns HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED
Return proper error code when Firmware returns
HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED for HWRM_NVM_GET/SET_VARIABLE
commands.

Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 13:49:43 -07:00
Vasundhara Volam
7d85923487 bnxt_en: Use ignore_ari devlink parameter
This patch adds support for ignore_ari generic permanent mode
devlink parameter. This parameter is disabled by default. It can be
enabled using devlink param commands.

ignore_ari - If enabled, device ignores ARI(Alternate Routing ID)
capability, even when platforms has the support and creates same number
of partitions when platform does not support ARI capability.

Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 13:49:43 -07:00
David S. Miller
9e50727f0e mlx5-updates-2018-10-03
mlx5 core driver and ethernet netdev updates, please note there is a small
 devlink releated update to allow extack argument to eswitch operations.
 
 From Eli Britstein,
 1) devlink: Add extack argument to the eswitch related operations
 2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
 3) net/mlx5e: Add extack messages for TC offload failures
 
 From Eran Ben Elisha,
 4) mlx5e: Add counter for aRFS rule insertion failures
 
 From Feras Daoud
 5) Fast teardown support for mlx5 device
 This change introduces the enhanced version of the "Force teardown" that
 allows SW to perform teardown in a faster way without the need to reclaim
 all the FW pages.
 Fast teardown provides the following advantages:
     1- Fix a FW race condition that could cause command timeout
     2- Avoid moving to polling mode
     3- Close the vport to prevent PCI ACK to be sent without been scatter
     to memory
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbtU45AAoJEEg/ir3gV/o+/C4H/RHA4KImrb476EdB3VNYMqAN
 dgXb+bmh6sZP+jHWqQ4c3aVeh6/T8qm4gwiSn2nVTtHEnxtCdIYljzDC1Nswczeg
 pSjD1eOP7M1LpAOmBb8xdnJcX7yM7r1bTklnp2sN853WShbsDRYgZBHsBwTzx25U
 ZdzL4QTLuohlG/aLrbGXMntIy45ya2fVQrnK54s18nFlgsdFjEs0mi0xaUKNBC6+
 P8CTohHAxuuxmL5b+6MIYLZCdgd8cLNQFdtqbckEVw7SvcRTxfraRlyqJ0YOgTGB
 TdSWnqZz2JYH29wSFbpFG8qX6GCv8FoiZ+fKzldbolHk442rrktHv3+Y7qQuZVs=
 =NVks
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2018-10-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2018-10-03

mlx5 core driver and ethernet netdev updates, please note there is a small
devlink releated update to allow extack argument to eswitch operations.

From Eli Britstein,
1) devlink: Add extack argument to the eswitch related operations
2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
3) net/mlx5e: Add extack messages for TC offload failures

From Eran Ben Elisha,
4) mlx5e: Add counter for aRFS rule insertion failures

From Feras Daoud
5) Fast teardown support for mlx5 device
This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the FW pages.
Fast teardown provides the following advantages:
    1- Fix a FW race condition that could cause command timeout
    2- Avoid moving to polling mode
    3- Close the vport to prevent PCI ACK to be sent without been scatter
    to memory
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-04 09:48:37 -07:00
David S. Miller
6f41617bf2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 21:00:17 -07:00
Eli Britstein
db7ff19e7b devlink: Add extack for eswitch operations
Add extack argument to the eswitch related operations.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-03 16:17:58 -07:00
Florian Fainelli
45ec318578 net: systemport: Fix wake-up interrupt race during resume
The AON_PM_L2 is normally used to trigger and identify the source of a
wake-up event. Since the RX_SYS clock is no longer turned off, we also
have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and
that interrupt remains active up until the magic packet detector is
disabled which happens much later during the driver resumption.

The race happens if we have a CPU that is entering the SYSTEMPORT
INTRL2_0 handler during resume, and another CPU has managed to clear the
wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we
have the first CPU stuck in the interrupt handler with an interrupt
cause that has been cleared under its feet, and so we keep returning
IRQ_NONE and we never make any progress.

This was not a problem before because we would always turn off the
RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned
off as well, thus not latching the interrupt.

The fix is to make sure we do not enable either the MPD or
BRCM_TAG_MATCH interrupts since those are redundant with what the
AON_PM_L2 interrupt controller already processes and they would cause
such a race to occur.

Fixes: bb9051a2b2 ("net: systemport: Add support for WAKE_FILTER")
Fixes: 83e82f4c70 ("net: systemport: add Wake-on-LAN support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-02 17:34:47 -07:00
Oza Pawandeep
62b36c3ea6 PCI/AER: Remove pci_cleanup_aer_uncorrect_error_status() calls
After bfcb79fca1 ("PCI/ERR: Run error recovery callbacks for all affected
devices"), AER errors are always cleared by the PCI core and drivers don't
need to do it themselves.

Remove calls to pci_cleanup_aer_uncorrect_error_status() from device
driver error recovery functions.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove PCI core changes, remove unused variables]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-02 16:04:40 -05:00
Florian Fainelli
a5d78ce793 net: systemport: Add software counters to track reallocations
When inserting the TSB, keep track of how many times we had to do it and
if there was a failure in doing so, this helps profile the driver for
possibly incorrect headroom settings.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:53 -07:00
Florian Fainelli
aa6ca0ec71 net: systemport: Be drop monitor friendly while re-allocating headroom
During bcm_sysport_insert_tsb() make sure we differentiate a SKB
headroom re-allocation failure from the normal swap and replace path.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-01 23:11:52 -07:00