linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-01-24 16:29:55 +07:00

Author	SHA1	Message	Date
Or Gerlitz	f24f790f8e	net/mlx4_core: Load the Eth driver first When running in SRIOV mode, VM that is assigned with a non-provisioned Ethernet VFs get themselves a random mac when the Eth driver starts. In this case, if the IB driver startup code that deals with RoCE runs first, it will use a zero mac as the source mac for the Para-Virtual CM MADs which is buggy. To handle that, we change the order of loading. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-05 15:48:22 -04:00
Amir Vadai	e1a5ddc506	net/mlx4_core: Defer VF initialization till PF is fully initialized Fix in commit [1] is not sufficient since a deferred VF initialization could happen after pci_enable_sriov() is finished, but before the PF is fully initialized. Need to prevent VFs from initializing till the PF is fully ready and comm channel is operational. [1] - `9798935` "net/mlx4_core: mlx4_init_slave() shouldn't access comm channel before PF is ready" CC: Stuart Hayes <Stuart_Hayes@Dell.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-14 13:24:42 -04:00
Wei Yang	befdf8978a	net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one() pci_match_id() just match the static pci_device_id, which may return NULL if someone binds the driver to a device manually using /sys/bus/pci/drivers/.../new_id. This patch wrap up a helper function __mlx4_remove_one() which does the tear down function but preserve the drv_data. Functions like mlx4_pci_err_detected() and mlx4_restart_one() will call this one with out releasing drvdata. Fixes: `97a5221` "net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset". CC: Bjorn Helgaas <bhelgaas@google.com> CC: Amir Vadai <amirv@mellanox.com> CC: Jack Morgenstein <jackm@dev.mellanox.co.il> CC: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-13 23:12:15 -04:00
David S. Miller	64c27237a0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/marvell/mvneta.c The mvneta.c conflict is a case of overlapping changes, a conversion to devm_ioremap_resource() vs. a conversion to netdev_alloc_pcpu_stats. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-29 18:48:54 -04:00
Wei Yang	97a5221f56	net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset The second parameter of __mlx4_init_one() is used to identify whether the pci_dev is a PF or VF. Currently, when it is invoked in mlx4_pci_slot_reset() this information is missed. This patch match the pci_dev with mlx4_pci_table and passes the pci_device_id.driver_data to __mlx4_init_one() in mlx4_pci_slot_reset(). Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-27 15:35:33 -04:00
Matan Barak	dd41cc3bb9	net/mlx4: Adapt num_vfs/probed_vf params for single port VF A new syntax is added for the module parameters num_vfs and probe_vf. num_vfs=p1,p2,p1+p2 probe_bf=p1,p2,p1+p2 Where p1(2) is the number of VFs on / probed VFs for physical port1(2) and p1+p2 is the number of dual port VFs. Single port VFs are currently supported only when the link type for both ports of the device is Ethernet. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-20 16:18:30 -04:00
Matan Barak	449fc48866	net/mlx4: Adapt code for N-Port VF Adds support for N-Port VFs, this includes: 1. Adding support in the wrapped FW command In wrapped commands, we need to verify and convert the slave's port into the real physical port. Furthermore, when sending the response back to the slave, a reverse conversion should be made. 2. Adjusting sqpn for QP1 para-virtualization The slave assumes that sqpn is used for QP1 communication. If the slave is assigned to a port != (first port), we need to adjust the sqpn that will direct its QP1 packets into the correct endpoint. 3. Adjusting gid[5] to modify the port for raw ethernet In B0 steering, gid[5] contains the port. It needs to be adjusted into the physical port. 4. Adjusting number of ports in the query / ports caps in the FW commands When a slave queries the hardware, it needs to view only the physical ports it's assigned to. 5. Adjusting the sched_qp according to the port number The QP port is encoded in the sched_qp, thus in modify_qp we need to encode the correct port in sched_qp. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-20 16:18:30 -04:00
Matan Barak	1ab95d37bc	net/mlx4: Add data structures to support N-Ports per VF Adds the required data structures to support VFs with N (1 or 2) ports instead of always using the number of physical ports. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-20 16:18:29 -04:00
David S. Miller	85dcce7a73	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/r8152.c drivers/net/xen-netback/netback.c Both the r8152 and netback conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-14 22:31:55 -04:00
Or Gerlitz	7855bff42e	net/mlx4_core: Load the IB driver when the device supports IBoE When checking what protocol drivers to load, the IB driver should be requested also over Ethernet ports, if the device supports IBoE (RoCE). Fixes: `b046ffe` 'net/mlx4_core: Load higher level modules according to ports type' Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 16:12:42 -04:00
Jack Morgenstein	b6ffaeffae	mlx4: In RoCE allow guests to have multiple GIDS The GIDs are statically distributed, as follows: PF: gets 16 GIDs VFs: Remaining GIDS are divided evenly between VFs activated by the driver. If the division is not even, lower-numbered VFs get an extra GID. For an IB interface, the number of gids per guest remains as before: one gid per guest. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-12 15:57:14 -04:00
Amir Vadai	97989356af	net/mlx4_core: mlx4_init_slave() shouldn't access comm channel before PF is ready Currently, the PF call to pci_enable_sriov from the PF probe function stalls for 10 seconds times the number of VFs probed on the host. This happens because the way for such VFs to determine of the PF initialization finished, is by attempting to issue reset on the comm-channel and get timeout (after 10s). The PF probe function is called from a kenernel workqueue, and therefore during that time, rcu lock is being held and kernel's workqueue is stalled. This blocks other processes that try to use the workqueue or rcu lock. For example, interface renaming which is calling rcu_synchronize is blocked, and timedout by systemd. Changed mlx4_init_slave() to allow VF probed on the host to immediatly detect that the PF is not ready, and return EPROBE_DEFER instantly. Only when the PF finishes the initialization, allow such VFs to access the comm channel. This issue and fix are relevant only for probed VFs on the hypervisor, there is no way to pass this information to a VM until comm channel is ready, so in a VM, if PF is not ready, the first command will be timedout after 10 seconds and return EPROBE_DEFER. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-06 17:09:04 -05:00
Gavin Shan	367d56f7b4	net/mlx4: Support shutdown() interface In kexec scenario, we failed to load the mlx4 driver in the second kernel because the ownership bit was hold by the first kernel without release correctly. The patch adds shutdown() interface so that the ownership can be released correctly in the first kernel. It also helps avoiding EEH error happened during boot stage of the second kernel because of undesired traffic, which can't be handled by hardware during that stage on Power platform. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Tested-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-05 20:40:24 -05:00
Ido Shamay	bb2146bc88	net/mlx4: Fix limiting number of IRQ's instead of RSS queues This fix a performance bug introduced by commit `90b1ebe` "mlx4: set maximal number of default RSS queues", which limits the numbers of IRQs opened by core module. The limit should be on the number of queues in the indirection table - rx_rings, and not on the number of IRQ's. Also, limiting on mlx4_core initialization instead of in mlx4_en, prevented using "ethtool -L" to utilize all the CPU's, when performance mode is prefered, since limiting this number to 8 reduces overall packet rate by 15%-50% in multiple TCP streams applications. For example, after running ethtool -L <ethx> rx 16 Packet rate Before the fix 897799 After the fix `1142070` Results were obtained using netperf: S=200 ; ( for i in $(seq 1 $S) ; do ( \ netperf -H 11.7.13.55 -t TCP_RR -l 30 &) ; \ wait ; done \| grep "1 1" \| awk '{SUM+=$6} END {print SUM}' ) CC: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-24 18:38:14 -05:00
Alexander Gordeev	66e2f9c1de	mlx4: Use pci_enable_msix_range() instead of pci_enable_msix() As result of deprecation of MSI-X/MSI enablement functions pci_enable_msix() and pci_enable_msi_block() all drivers using these two interfaces need to be updated to use the new pci_enable_msi_range() and pci_enable_msix_range() interfaces. Signed-off-by: Alexander Gordeev <agordeev@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Amir Vadai <amirv@mellanox.com> Cc: netdev@vger.kernel.org Cc: linux-pci@vger.kernel.org Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-18 15:33:32 -05:00
Eyal Perry	b912b2f8fc	net/mlx4_core: Warn if device doesn't have enough PCI bandwidth Check if the device get enough bandwidth from the entire PCI chain to satisfy its capabilities. This patch determines the PCIe device's bandwidth capabilities by reading its PCIe Link Capabilities registers and then call the pcie_get_minimum_link function to ensure that the adapter is hooked into a slot which is capable of providing the necessary bandwidth capabilities. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-05 20:37:05 -05:00
Or Gerlitz	7ffdf726cf	net/mlx4_core: Add basic support for TCP/IP offloads under tunneling Add the low-level device commands and definitions used for TCP/IP HW offloads of tunneled/vxlan traffic which are supported by the ConnectX3-pro NIC. This is done through the following elements: - read tunneling device caps in QUERY_DEV_CAP - add helper function to do SET_PORT for tunneling - add DMFS VXLAN steering rule definitions - add CQE and WQE checksum offload field definitions Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-31 14:31:43 -05:00
Eyal Perry	be902ab122	net/mlx4_core: Set CQE/EQE size to 64B by default To achieve out of the box performance default is to use 64 byte CQE/EQE. In tests that we conduct in our labs, we achieved a performance improvement of twice the message rate. For older VF/libmlx4 support, enable_64b_cqe_eqe must be set to 0 (disabled). Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:04:44 -05:00
Hadar Hen Zion	8e1a28e8e6	net/mlx4_core: Expose physical port id as PF/VF capability Add the infrastructure needed to support ndo_get_phys_port_id which allows users to identify to which physical port a net-device is connected to by reading a unique port id. This will work for VFs and PFs. The driver uses a new device capability - phys_port_id, The PF driver reads the port phys_port_id from Firmware and stores it. The VF driver reads the port phys_port_id from the PF using QUERY_FUNC_CAP command. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-19 19:04:43 -05:00
Jack Morgenstein	7c6d74d23a	mlx4_core: Roll back round robin bitmap allocation commit for CQs, SRQs, and MPTs Commit `f4ec9e9` "mlx4_core: Change bitmap allocator to work in round-robin fashion" introduced round-robin allocation (via bitmap) for all resources which allocate via a bitmap. Round robin allocation is desirable for mcgs, counters, pd's, UARs, and xrcds. These are simply numbers, with no involvement of ICM memory mapping. Round robin is required for QPs, since we had a problem with immediate reuse of a 24-bit QP number (commit `f4ec9e9`). However, for other resources which use the bitmap allocator and involve mapping ICM memory -- MPTs, CQs, SRQs -- round-robin is not desirable. What happens in these cases is the following: ICM memory is allocated and mapped in chunks of 256K. Since the resource allocation index goes up monotonically, the allocator will eventually require mapping a new chunk. Now, chunks are also unmapped when their reference count goes back to zero. Thus, if a single app is running and starts/exits frequently we will have the following situation: When the app starts, a new chunk must be allocated and mapped. When the app exits, the chunk reference count goes back to zero, and the chunk is unmapped and freed. Therefore, the app must pay the cost of allocation and mapping of ICM memory each time it runs (although the price is paid only when allocating the initial entry in the new chunk). For apps which allocate MPTs/SRQs/CQs and which operate as described above, this presented a performance problem. We therefore roll back the round-robin allocator modification for MPTs, CQs, SRQs. Reported-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-09 21:12:13 -05:00
Wei Yang	1b85ee09aa	net/mlx4_core: destroy workqueue when driver fails to register When driver registration fails, we need to clean up the resources allocated before. mlx4_core missed destroying the workqueue allocated. This patch destroys the workqueue when registration fails. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-03 11:55:44 -05:00
Eugenia Emantayev	6e7136ed77	net/mlx4_core: ICM pages are allocated on device NUMA node This is done to optimize FW/HW access to host memory. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-07 19:22:48 -05:00
Jack Morgenstein	5a0d0a6161	mlx4: Structures and init/teardown for VF resource quotas This is step #1 for implementing SRIOV resource quotas for VFs. Quotas are implemented per resource type for VFs and the PF, to prevent any entity from simply grabbing all the resources for itself and leaving the other entities unable to obtain such resources. Resources which are allocated using quotas: QPs, CQs, SRQs, MPTs, MTTs, MAC, VLAN, and Counters. The quota system works as follows: Each entity (VF or PF) is given a max number of a given resource (its quota), and a guaranteed minimum number for each resource (starvation prevention). For QPs, CQs, SRQs, MPTs and MTTs: 50% of the available quantity for the resource is divided equally among the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool. The other 50% is the "free pool", allocated on a first-come-first-serve basis. For each VF/PF, resources are first allocated from its "guaranteed-minimum" pool. When that pool is exhausted, the driver attempts to allocate from the resource "free-pool". The quota (i.e., max) for the VFs and the PF is: The free-pool amount (50% of the real max) + the guaranteed minimum For MACs: Guarantee 2 MACs per VF/PF per port. As a result, since we have only 128 MACs per port, reduce the allowable number of VFs from 64 to 63. Any remaining MACs are put into a free pool. For VLANs: For the PF, the per-port quota is 128 and guarantee is 64 (to allow the PF to register at least a VLAN per VF in VST mode). For the VFs, the per-port quota is 64 and the guarantee is 0. We assume that VGT VFs are trusted not to abuse the VLAN resource. For Counters: For all functions (PF and VFs), the quota is 128 and the guarantee is 0. In this patch, we define the needed structures, which are added to the resource-tracker struct. In addition, we do initialization for the resource quota, and adjust the query_device response to use quotas rather than resource maxima. As part of the implementation, we introduce a new field in mlx4_dev: quotas. This field holds the resource quotas used to report maxima to the upper layers (ib_core, via query_device). The HCA maxima of these values are passed to the VFs (via QUERY_HCA) so that they may continue to use these in handling QPs, CQs, SRQs and MPTs. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-11-04 16:19:07 -05:00
Eyal Perry	b046ffe54d	net/mlx4_core: Load higher level modules according to ports type Mellanox ConnectX architecture is: mlx4_core is the lower level PCI driver which register on the PCI id, and protocol specific drivers are depended on it: mlx4_en - for Ethernet and mlx4_ib for Infiniband. NIC could have multiple ports which can change their type dynamically. We use the request_module() call to load the relevant protocol driver when needed: on loading time or at port type change event. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-17 15:10:50 -04:00
David S. Miller	0e76a3a587	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge net into net-next to setup some infrastructure Eric Dumazet needs for usbnet changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-03 21:36:46 -07:00
Jack Morgenstein	b30513202c	net/mlx4_core: VFs must ignore the enable_64b_cqe_eqe module param Slaves get the 64B CQE/EQE state from QUERY_HCA, not from the module parameter. If the parameter is set to zero, the slave outputs an incorrect/irrelevant warning message that 64B CQEs/EQEs are supported but not enabled (even if the hypervisor has enabled 64B CQEs/EQEs). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-01 15:09:36 -07:00
Yevgeny Petrilin	fe6f700d6c	net/mlx4_core: Respond to operation request by firmware This commit adds new firmware command and new firmware event. The firmware raises the MLX4_EVENT_TYPE_OP_REQUIRED event in order to signal the driver it needs to perform an administrative operation throughout the MLX4_CMD_GET_OP_REQ command. At the moment the supported operation is adding/removing multicast entries which are used by the firmware for handling NCSI traffic in B0 steering mode. Also, had to swap the order of mlx4_init_mcg_table() and mlx4_init_eq_table() to make sure that driver will get events only after resources are initialized to handle it. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-29 01:12:40 -07:00
Linus Torvalds	496322bc91	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "This is a re-do of the net-next pull request for the current merge window. The only difference from the one I made the other day is that this has Eliezer's interface renames and the timeout handling changes made based upon your feedback, as well as a few bug fixes that have trickeled in. Highlights: 1) Low latency device polling, eliminating the cost of interrupt handling and context switches. Allows direct polling of a network device from socket operations, such as recvmsg() and poll(). Currently ixgbe, mlx4, and bnx2x support this feature. Full high level description, performance numbers, and design in commit `0a4db187a9` ("Merge branch 'll_poll'") From Eliezer Tamir. 2) With the routing cache removed, ip_check_mc_rcu() gets exercised more than ever before in the case where we have lots of multicast addresses. Use a hash table instead of a simple linked list, from Eric Dumazet. 3) Add driver for Atheros CQA98xx 802.11ac wireless devices, from Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski, Marek Puzyniak, Michal Kazior, and Sujith Manoharan. 4) Support reporting the TUN device persist flag to userspace, from Pavel Emelyanov. 5) Allow controlling network device VF link state using netlink, from Rony Efraim. 6) Support GRE tunneling in openvswitch, from Pravin B Shelar. 7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from Daniel Borkmann and Eric Dumazet. 8) Allow controlling of TCP quickack behavior on a per-route basis, from Cong Wang. 9) Several bug fixes and improvements to vxlan from Stephen Hemminger, Pravin B Shelar, and Mike Rapoport. In particular, support receiving on multiple UDP ports. 10) Major cleanups, particular in the area of debugging and cookie lifetime handline, to the SCTP protocol code. From Daniel Borkmann. 11) Allow packets to cross network namespaces when traversing tunnel devices. From Nicolas Dichtel. 12) Allow monitoring netlink traffic via AF_PACKET sockets, in a manner akin to how we monitor real network traffic via ptype_all. From Daniel Borkmann. 13) Several bug fixes and improvements for the new alx device driver, from Johannes Berg. 14) Fix scalability issues in the netem packet scheduler's time queue, by using an rbtree. From Eric Dumazet. 15) Several bug fixes in TCP loss recovery handling, from Yuchung Cheng. 16) Add support for GSO segmentation of MPLS packets, from Simon Horman. 17) Make network notifiers have a real data type for the opaque pointer that's passed into them. Use this to properly handle network device flag changes in arp_netdev_event(). From Jiri Pirko and Timo Teräs. 18) Convert several drivers over to module_pci_driver(), from Peter Huewe. 19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a O(1) calculation instead. From Eric Dumazet. 20) Support setting of explicit tunnel peer addresses in ipv6, just like ipv4. From Nicolas Dichtel. 21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet. 22) Prevent a single high rate flow from overruning an individual cpu during RX packet processing via selective flow shedding. From Willem de Bruijn. 23) Don't use spinlocks in TCP md5 signing fast paths, from Eric Dumazet. 24) Don't just drop GSO packets which are above the TBF scheduler's burst limit, chop them up so they are in-bounds instead. Also from Eric Dumazet. 25) VLAN offloads are missed when configured on top of a bridge, fix from Vlad Yasevich. 26) Support IPV6 in ping sockets. From Lorenzo Colitti. 27) Receive flow steering targets should be updated at poll() time too, from David Majnemer. 28) Fix several corner case regressions in PMTU/redirect handling due to the routing cache removal, from Timo Teräs. 29) We have to be mindful of ipv4 mapped ipv6 sockets in upd_v6_push_pending_frames(). From Hannes Frederic Sowa. 30) Fix L2TP sequence number handling bugs, from James Chapman." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits) drivers/net: caif: fix wrong rtnl_is_locked() usage drivers/net: enic: release rtnl_lock on error-path vhost-net: fix use-after-free in vhost_net_flush net: mv643xx_eth: do not use port number as platform device id net: sctp: confirm route during forward progress virtio_net: fix race in RX VQ processing virtio: support unlocked queue poll net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit Documentation: Fix references to defunct linux-net@vger.kernel.org net/fs: change busy poll time accounting net: rename low latency sockets functions to busy poll bridge: fix some kernel warning in multicast timer sfc: Fix memory leak when discarding scattered packets sit: fix tunnel update via netlink dt:net:stmmac: Add dt specific phy reset callback support. dt:net:stmmac: Add support to dwmac version 3.610 and 3.710 dt:net:stmmac: Allocate platform data only if its NULL. net:stmmac: fix memleak in the open method ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available net: ipv6: fix wrong ping_v6_sendmsg return value ...	2013-07-09 18:24:39 -07:00
Linus Torvalds	80cc38b163	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "The usual stuff from trivial tree" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) treewide: relase -> release Documentation/cgroups/memory.txt: fix stat file documentation sysctl/net.txt: delete reference to obsolete 2.4.x kernel spinlock_api_smp.h: fix preprocessor comments treewide: Fix typo in printk doc: device tree: clarify stuff in usage-model.txt. open firmware: "/aliasas" -> "/aliases" md: bcache: Fixed a typo with the word 'arithmetic' irq/generic-chip: fix a few kernel-doc entries frv: Convert use of typedef ctl_table to struct ctl_table sgi: xpc: Convert use of typedef ctl_table to struct ctl_table doc: clk: Fix incorrect wording Documentation/arm/IXP4xx fix a typo Documentation/networking/ieee802154 fix a typo Documentation/DocBook/media/v4l fix a typo Documentation/video4linux/si476x.txt fix a typo Documentation/virtual/kvm/api.txt fix a typo Documentation/early-userspace/README fix a typo Documentation/video4linux/soc-camera.txt fix a typo lguest: fix CONFIG_PAE -> CONFIG_x86_PAE in comment ...	2013-07-04 11:40:58 -07:00
David S. Miller	0c1072ae02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/freescale/fec_main.c drivers/net/ethernet/renesas/sh_eth.c net/ipv4/gre.c The GRE conflict is between a bug fix (kfree_skb --> kfree_skb_list) and the splitting of the gre.c code into seperate files. The FEC conflict was two sets of changes adding ethtool support code in an "!CONFIG_M5272" CPP protected block. Finally the sh_eth.c conflict was between one commit add bits set in the .eesr_err_check mask whilst another commit removed the .tx_error_check member and assignments. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-03 14:55:13 -07:00
Jack Morgenstein	30e514a717	net/mlx4_core: Fail device init if num_vfs is negative Should not allow negative num_vfs Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.com> Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Dotan Barak	618fad954b	net/mlx4_core: Replace sscanf() with kstrtoint() It is not safe to use sscanf. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.com> Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-25 16:29:39 -07:00
Amir Vadai	f9bd2d7f6d	net/mlx_en: Timestamping is not supported in slave mode Old hypervisors don't mask out timestamp capability for slave. Till slave support will be added, need to disable capability by slave. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-24 00:02:58 -07:00
Masanari Iida	278cee0515	treewide: Fix typo in printk Correct spelling typo in printk within various drivers. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-06-18 13:48:45 +02:00
Jack Morgenstein	5efe5355f2	net/mlx4_core: Return -EPROBE_DEFER when a VF is probed before PF is sufficiently initialized In the PF initialization, SRIOV is enabled before the PF is fully initialized. This allows the kernel to probe the newly-exposed VFs before the PF is ready to handle them (nested probes). Have the probe method return the -EPROBE_DEFER value in this situation (instead of the VF probe method retrying its initialization in a loop, and returning -EIO on failure). When -EPROBE_DEFER is returned by the VF probe method, the kernel itself will retry the probe after a suitable delay. Based upon a suggestion by Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-04 12:58:24 -07:00
Amir Vadai	ec693d4701	net/mlx4_en: Add HW timestamping (TS) support The patch allows to enable/disable HW timestamping for incoming and/or outgoing packets. It adds and initializes all structs and callbacks needed by kernel TS API. To enable/disable HW timestamping appropriate ioctl should be used. Currently HWTSTAMP_FILTER_ALL/NONE and HWTSAMP_TX_ON/OFF only are supported. When enabling TS on receive flow - VLAN stripping will be disabled. Also were made all relevant changes in RX/TX flows to consider TS request and plant HW timestamps into relevant structures. mlx4_ib was fixed to compile with new mlx4_cq_alloc() signature. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-24 16:30:14 -04:00
Eugenia Emantayev	ddd8a6c12d	net/mlx4_core: Read HCA frequency and map internal clock Read HCA frequency, read PCI clock bar and offset, map internal clock to PCI bar. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-24 16:30:13 -04:00
Jack Morgenstein	e7dbeba856	net/mlx4_core: Fix endianness bug in set_param_l The set_param_l function assumes casting a u64 pointer to a u32 pointer allows to access the lower 32bits, but it results in writing the upper 32 bits on big endian systems. The fixed function reads the upper 32 bits of the 64 argument, and or's them with the 32 bits of the 32-bit value passed to the function. Since this is now a "read-modify-write" operation, we got many "unintialized variable" warnings which needed to be fixed as well. Reported-by: Alexander Schmidt <alexschm@de.ibm.com>. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 15:52:03 -05:00
Linus Torvalds	70a3a06d01	Main batch of InfiniBand/RDMA changes for 3.9: - SRP error handling fixes from Bart Van Assche - Implementation of memory windows for mlx4 from Shani Michaeli - Lots of cxgb4 HW driver fixes from Vipul Pandya - Make iSER work for virtual functions, other fixes from Or Gerlitz - Fix for bug in qib HW driver from Mike Marciniszyn - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman - Various cleanups and warning fixes from Julia Lawall, Paul Bolle, Wei Yongjun -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJRLPNoAAoJEENa44ZhAt0h0bMP/1xqlgDR3DGdoUoV4yTd6O9G Uccdb7og5o5tedVo+8xZz01y4at99P8FWW4hJ4os6k8n6yKoBVwo7qjN+BOR+JG5 q8+O+ynUSIg4tGrb5sXcMnKXAXbw/vkftMWYNA41cbrM24DTYzB/2SLpvhwbFoTT tdQc2tgz5QaDqzWbagyCR4+k/IgO+Llrz/RvIdtz4dsTnTDogN7QCoSffX8n/Lpb DxtyXK4sdl3DAtd3CsIdsB/TSMb3RkHLCoSvmrWlLnqMdsbRxVnCVfBm4BOghW3J Y2K3joRoCjjIZSRNs/i0FMFkT/jbCXg1oXg9ek/a6YFNcgyk7z8iGyXrRY7fOnno 8U2SfxJ69YpVYeJr+DSjaeHcmjsaYU7NN7JPxzvPKcJOIsxQJ/euJDXAXau3lEQY o9/p4JsGty0WHi1NanyygvghvBAoP1C5/59Sl4bHH5gckPyJT1kinPSCTT76YXGS WkSHg2mDhiJHy7Pnuy85iZldPoy2/5z09/I4aGMeL+8kUZbD4iFqzXIJU0HTsAim EONoRXDhIcN5DNVSVH1ig6nJ2a7Vhov4Z0r/vB8P4KhslBcqFwf2leC0eCoe5mNt SzcKhqosZDXoL8AwzpntzGIOid8pWmHbUx/PgIcoVXPjtl0h2ULNIFoYYyMZ3cyU AyN2tSiUZVddTV1/aKGL =RAQw -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband update from Roland Dreier: "Main batch of InfiniBand/RDMA changes for 3.9: - SRP error handling fixes from Bart Van Assche - Implementation of memory windows for mlx4 from Shani Michaeli - Lots of cxgb4 HW driver fixes from Vipul Pandya - Make iSER work for virtual functions, other fixes from Or Gerlitz - Fix for bug in qib HW driver from Mike Marciniszyn - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman - Various cleanups and warning fixes from Julia Lawall, Paul Bolle, Wei Yongjun" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (41 commits) IB/mlx4: Advertise MW support IB/mlx4: Support memory window binding mlx4: Implement memory windows allocation and deallocation mlx4_core: Enable memory windows in {INIT, QUERY}_HCA mlx4_core: Disable memory windows for virtual functions IPoIB: Free ipoib neigh on path record failure so path rec queries are retried IB/srp: Fail I/O requests if the transport is offline IB/srp: Avoid endless SCSI error handling loop IB/srp: Avoid sending a task management function needlessly IB/srp: Track connection state properly IB/mlx4: Remove redundant NULL check before kfree IB/mlx4: Fix compiler warning about uninitialized 'vlan' variable IB/mlx4: Convert is_xxx variables in build_mlx_header() to bool IB/iser: Enable iser when FMRs are not supported IB/iser: Avoid error prints on EAGAIN registration failures IB/iser: Use proper define for the commands per LUN value advertised to SCSI ML IB/uverbs: Implement memory windows support in uverbs IB/core: Add "type 2" memory windows support mlx4_core: Propagate MR deregistration failures to caller mlx4_core: Rename MPT-related functions to have mpt_ prefix ...	2013-02-26 11:41:08 -08:00
Shani Michaeli	e448834e35	mlx4_core: Enable memory windows in {INIT, QUERY}_HCA Add memory windows-related code to INIT_HCA and QUERY_HCA. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-25 10:44:31 -08:00
David S. Miller	fd5023111c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Synchronize with 'net' in order to sort out some l2tp, wireless, and ipv6 GRE fixes that will be built on top of in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-08 18:02:14 -05:00
Yan Burman	16a10ffd20	net/mlx4: Move Ethernet related functionality from mlx4_core to mlx4_en Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en. Also convert the new functions to deal with MACs in form of char array instead of u64. Actual functions moved: mlx4_replace_mac mlx4_get_eth_qp mlx4_put_eth_qp To conduct this change, some functionality had to be exported from the core, the following functions were added: mlx4_get_base_qp __mlx4_replace_mac (low level function for CX1/A0 compatibility) Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:12 -05:00
Linus Torvalds	bb5204c2eb	IB regression fixes for 3.8: - Fix mlx4 VFs not working on old guests because of 64B CQE changes - Fix ill-considered sparse fix for qib - Fix IPoIB crash due to skb double destruct introduced in 3.8-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJREutcAAoJEENa44ZhAt0hjFwP/3SCQr/eboXyJV9GBlmbU9y2 X7t6JS1n9R5KxBj54XBL8ZA7qcaw7rQj8VgPC4qlMWTR1/fXHsrtiRtQU1VMBcBP Eh50iE5BEq4kpK93IYZZei+I7J/0O1Hpj1JwUuvGr7/hQltMWXItuPTlO4Ror5Kk Vvu/waQh9DDp1uQRSPbSqAhEx7cGbl27UT7BLPqszVla59GA8UOUcfit8I9CyTmk ySP2zrDC7JtPoOPYy6w32K4NSjp3KTR4EHWX0G3t/b0LvwEHARwQZ3RI4ZjNMqLl mtKfqaYjqCeSlaT6MAODlN0aTp38GFAU0RaGePL5GurxQExwGnVZfTRUJDkNGTGO vPDq6+L6XwPHgYTs1knafs3OT24nwv/vzZ/SLV7gcssbxdL8Cru16E4CO3Vpryrl 5B0w2+ld+L1lw/m4rSuqzQYpS6NpW35ATKzMhQNwk9cLCLNCOqv247WDvhBZDnpV lhLQ+RGs6DK7CQQ8w4rYLFBVk+1kPlZYILV0Rjni6vv7w9S/byVrshqE8eIkQwqE BEl0gMc83VZj5WH5s5MJEx+T5H2lZ80rIDKuamSz7wEduXWWENEqj5k7mBHa66Sn 0aHcrXDe26Cj1TUCGbrgeFPMlucVAK+fSjvEzZrzQwxLspnKXlFw5v0DvqmTqBok hO0iE4ajfXl9RfIC7KrK =bmKU -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull IB regression fixes from Roland Dreier: - Fix mlx4 VFs not working on old guests because of 64B CQE changes - Fix ill-considered sparse fix for qib - Fix IPoIB crash due to skb double destruct introduced in 3.8-rc1 * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/qib: Fix for broken sparse warning fix mlx4_core: Fix advertisement of wrong PF context behaviour IPoIB: Fix crash due to skb double destruct	2013-02-08 12:15:14 +11:00
Or Gerlitz	f97b4b5d46	mlx4_core: Fix advertisement of wrong PF context behaviour Commit `08ff32352d` ("mlx4: 64-byte CQE/EQE support") introduced a regression where older guest VF drivers failed to load even when 64-byte EQEs/CQEs are disabled, since the PF wrongly advertises the new context behaviour anyway. The failure looks like: mlx4_core 0000:00:07.0: Unknown pf context behaviour mlx4_core 0000:00:07.0: Failed to obtain slave caps mlx4_core: probe of 0000:00:07.0 failed with error -38 Fix this by basing this advertisement on dev->caps.flags, which is the operational capabilities used by the QUERY_FUNC_CAP command wrapper (dev_cap->flags holds the firmware capabilities). Reported-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-05 09:35:41 -08:00
Joe Perches	b2adaca92c	ethernet: Remove unnecessary alloc/OOM messages, alloc cleanups alloc failures already get standardized OOM messages and a dump_stack. Convert kzalloc's with multiplies to kcalloc. Convert kmalloc's with multiplies to kmalloc_array. Fix a few whitespace defects. Convert a constant 6 to ETH_ALEN. Use parentheses around sizeof. Convert vmalloc/memset to vzalloc. Remove now unused size variables. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-04 13:22:33 -05:00
Hadar Hen Zion	23537b732f	net/mlx4_core: Use firmware driven flow steering hash mode The Firmware dynamically changes flow steering hash configuration from covering L2 only to "full" L2/L3/L4 mode needed. The dynamic change allows the driver to set hard coded hash configuration which is changed by the firmware from L2 to L2/L3/L4 when attaching the first L3/L4 flow steering rule and back to L2 when there are no more such rules. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:47 -05:00
David S. Miller	f1e7b73acc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Bring in the 'net' tree so that we can get some ipv4/ipv6 bug fixes that some net-next work will build upon. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-29 15:32:13 -05:00
Jack Morgenstein	f356fcbe12	net/mlx4_core: Return proper error code when __mlx4_add_one fails Returning 0 (success) when in fact we are aborting the load, leads to kernel panic when unloading the module. Fix that by returning the actual error code. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-28 00:13:57 -05:00
Or Gerlitz	ca4c7b35f7	net/mlx4_core: Set number of msix vectors under SRIOV mode to firmware defaults The lines if (mlx4_is_mfunc(dev)) { nreq = 2; } else { which hard code the number of requested msi-x vectors under multi-function mode to two can be removed completely, since the firmware sets num_eqs and reserved_eqs appropriately Thus, the code line: nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs, nreq); is by itself sufficient and correct for all cases. Currently, for mfunc mode num_eqs = 32 and reserved_eqs = 28, hence four vectors will be enabled. This triples (one vector is used for the async events and commands EQ) the horse power provided for processing of incoming packets on netdev RSS scheme, IO initiators/targets commands processing flows, etc. Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-18 14:25:28 -05:00
Jack Morgenstein	3c439b5586	mlx4_core: Allow choosing flow steering mode Device managed flow steering will be enabled only under administrator directive provided through setting the existing module parameter log_num_mgm_entry_size to -1 (if the device actually supports flow steering). If flow steering isn't requested or not available, the driver will use the value of log_num_mgm_entry_size and B0 steering. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 11:47:22 -08:00

1 2 3

115 Commits