linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Hadar Hen Zion	e802f8e4c5	net/mlx4: Prepare VLAN macros for 802.1ad Hardware accelerated support To add Hardware accelerated support in 802.1ad vlan, replace Current VLAN macros to CVLAN. Replace: MLX4_WQE_CTRL_INS_VLAN MLX4_CQE_VLAN_PRESENT_MASK With: MLX4_WQE_CTRL_INS_CVLAN MLX4_CQE_CVLAN_PRESENT_MASK Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 15:00:37 -07:00
Hadar Hen Zion	7c509a48ff	net/mlx4_en: Prepare ethtool private flags to support more flags Currently we support only one ethtool private flag. Prepare mlx4_en_set_priv_flags function to support more than one private flag. Will be used in the next patch to support hardware accelerated 802.1ad vlan. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 15:00:36 -07:00
Hadar Hen Zion	77fc29c4bb	net/mlx4_core: Preparations for 802.1ad VLAN support mlx4_core preparation to support hardware accelerated 802.1ad VLAN device. To allow 802.1ad accelerated device, "packet has vlan" (phv) Firmware capability should be available. Firmware without the phv capability won't behave properly and can't support 802.1ad device acceleration. The driver checks the Firmware capability and sets the phv bit accordingly in SET_PORT command. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 15:00:36 -07:00
Achiad Shochat	a741749f21	net/mlx5e: Input IPSEC.SPI into the RX RSS hash function In addition to the source/destination IP which are already hashed. Only for unicast traffic for now. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 00:29:17 -07:00
Achiad Shochat	5a6f8aef16	net/mlx5e: Cosmetics: use BIT() instead of "1 <<", and others No logical change in this commit. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 00:29:17 -07:00
Achiad Shochat	88a85f99e5	net/mlx5e: TX latency optimization to save DMA reads A regular TX WQE execution involves two or more DMA reads - one to fetch the WQE, and another one per WQE gather entry. These DMA reads obviously increase the TX latency. There are two mlx5 mechanisms to bypass these DMA reads: 1) Inline WQE 2) Blue Flame (BF) An inline WQE contains a whole packet, thus saves the DMA read/s of the regular WQE gather entry/s. Inline WQE support was already added in the previous commit. A BF WQE is written directly to the device I/O mapped memory, thus enables saving the DMA read that fetches the WQE. The BF WQE I/O write must be in cache line granularity, thus uses the CPU write combining mechanism. A BF WQE I/O write acts also as a TX doorbell for notifying the device of new TX WQEs. A BF WQE is written to the same I/O mapped address as the regular TX doorbell, thus this address is being mapped twice - once by ioremap() and once by io_mapping_map_wc(). While both mechanisms reduce the TX latency, they both consume more CPU cycles than a regular WQE: - A BF WQE must still be written to host memory, in addition to being written directly to the device I/O mapped memory. - An inline WQE involves copying the SKB data into it. To handle this tradeoff, we introduce here a heuristic algorithm that strives to avoid using these two mechanisms in case the TX queue is being back-pressured by the device, and limit their usage rate otherwise. An inline WQE will always be "Blue Flamed" (written directly to the device I/O mapped memory) while a BF WQE may not be inlined (may contain gather entries). Preliminary testing using netperf UDP_RR shows that the latency goes down from 17.5us to 16.9us, while the message rate (tested with pktgen) stays the same. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 00:29:17 -07:00
Achiad Shochat	58d522912a	net/mlx5e: Support TX packet copy into WQE AKA inline WQE. A TX latency optimization to save data gather DMA reads. Controlled by ETHTOOL_TX_COPYBREAK. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 00:29:17 -07:00
Saeed Mahameed	311c7c71c9	net/mlx5e: Allocate DMA coherent memory on reader NUMA node By affinity hints and XPS, each mlx5e channel is assigned a CPU core. Channel DMA coherent memory that is written by the NIC and read by SW (e.g CQ buffer) is allocated on the NUMA node of the CPU core assigned for the channel. Channel DMA coherent memory that is written by SW and read by the NIC (e.g SQ/RQ buffer) is allocated on the NUMA node of the NIC. Doorbell record (written by SW and read by the NIC) is an exception since it is accessed by SW more frequently. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 00:29:17 -07:00
Saeed Mahameed	2be6967cdb	net/mlx5e: Support ETH_RSS_HASH_XOR The ConnectX-4 HW implements inverted XOR8. To make it act as XOR we re-order the HW RSS indirection table. Set XOR to be the default RSS hash function and add ethtool API to control it. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-27 00:29:16 -07:00
Ido Shamay	62e4c9b4fd	net/mlx4_en: Remove BUG_ON assert when checking if ring is full In mlx4_en_is_ring_empty we check if ring surpassed its size. Since the prod and cons indicators are u32, there might be a state where prod wrapped around and cons, making this assert false, although no actual bug exists (other code segment can cope with this state). Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-26 16:29:26 -07:00
Jack Morgenstein	9f5b031770	net/mlx4_core: Relieve cpu load average on the port sending flow When a port is not attached, the FW requires a longer than usual time to execute the SENSE_PORT command. In the command flow, the wait_for_completion_timeout call used in mlx4_cmd_wait puts the kernel thread into the uninterruptible state during this time. This, in turn, due to the computation method, causes the CPU load average to increase. Fix this by using wait_for_completion_interruptible_timeout() for the SENSE_PORT command, which puts the thread in the interruptible state. In this state, the thread does not contribute to the CPU load average. Treat the interrupted case as if the SENSE_PORT command returned port_type = NONE. Fix suggested by Gideon Naim <gideonn@mellanox.com> and Bart Van Assche <bart.vanassche@sandisk.com>. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-26 16:29:25 -07:00
Jack Morgenstein	1c1bf34951	net/mlx4_core: Fix wrong index in propagating port change event to VFs The port-change event processing in procedure mlx4_eq_int() uses "slave" as the vf_oper array index. Since the value of "slave" is the PF function index, the result is that the PF link state is used for deciding to propagate the event for all the VFs. The VF link state should be used, so the VF function index should be used here. Fixes: `948e306d7d` ('net/mlx4: Add VF link state support') Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-26 16:29:25 -07:00
Or Gerlitz	178d23e3cd	net/mlx4_core: Use sink counter for the VF default as fallback Some old PF drivers don't let VFs allocate counters, in that case, use the sink counter so the VF can load and operate properly. Fixes: `6de5f7f6a1` ('net/mlx4_core: Allocate default counter per port') Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-26 16:29:25 -07:00
Carol Soto	0beb44b065	net/mlx4_core: Add extra check for total vfs for SRIOV Add extra check for total vfs for SRIOV to check if that value is bigger than total vfs in pci SRIOV capabalities. Fix a check and print of the number of maximum vfs that hw can handle. Fix a check and print of the number of maximum vfs per port that driver can handle. Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 15:19:54 -07:00
Eric Dumazet	0a6d424569	mlx4: TCP/UDP packets have L4 hash Mellanox driver has the knowledge if rxhash is a L4 hash, if it receives a non fragmented TCP or UDP frame and NETIF_F_RXCSUM is enabled on netdev. ip_summed value is CHECKSUM_UNNECESSARY in this case. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Ido Shamay <idos@mellanox.com> Acked-by: Ido Shamay <idos@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-08 13:45:22 -07:00
Or Gerlitz	7254acffee	mlx4: Disable HA for SRIOV PF RoCE devices When in HA mode, the driver exposes an IB (RoCE) device instance with only one port. Under SRIOV, the existing implementation doesn't go well with the PF RoCE driver's role of Special QPs Para-Virtualization, etc. As such, disable HA for the mlx4 PF RoCE device in SRIOV mode. Fixes: `a575009030` ('IB/mlx4: Add port aggregation support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-25 02:06:29 -07:00
Ido Shamay	79a258526c	net/mlx4_en: Fix wrong csum complete report when rxvlan offload is disabled The check_csum() function relied on hwtstamp_rx_filter to know if rxvlan offload is disabled. This is wrong since rxvlan offload can be switched on/off regardless of hwtstamp_rx_filter. Also moved check_csum to query CQE information to identify VLAN packets and removed the check of IP packets, since it has been validated before. Fixes: `f8c6455bb0` ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE') Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-25 02:06:28 -07:00
Ido Shamay	488a9b48e3	net/mlx4_en: Wake TX queues only when there's enough room Indication of a single completed packet, marked by txbbs_skipped being bigger then zero, in not enough in order to wake up a stopped TX queue. The completed packet may contain a single TXBB, while next packet to be sent (after the wake up) may have multiple TXBBs (LSO/TSO packets for example), causing overflow in queue followed by WQE corruption and TX queue timeout. Instead, wake the stopped queue only when there's enough room for the worst case (maximum sized WQE) packet that we should need to handle after the queue is opened again. Also created an helper routine - mlx4_en_is_tx_ring_full, which checks if the current TX ring is full or not. It provides better code readability and removes code duplication. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-25 02:06:27 -07:00
Eran Ben Elisha	0eb08514fd	net/mlx4_en: Release TX QP when destroying TX ring TX ring QP wasn't released at mlx4_en_destroy_tx_ring. Instead, the code used the deprecated base_tx_qpn field. Move TX QP release to mlx4_en_destroy_tx_ring and remove the base_tx_qpn field. Fixes: `ddae0349fd` ('net/mlx4: Change QP allocation scheme') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-25 02:06:26 -07:00
Linus Torvalds	e0456717e4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) Add TX fast path in mac80211, from Johannes Berg. 2) Add TSO/GRO support to ibmveth, from Thomas Falcon 3) Move away from cached routes in ipv6, just like ipv4, from Martin KaFai Lau. 4) Lots of new rhashtable tests, from Thomas Graf. 5) Run ingress qdisc lockless, from Alexei Starovoitov. 6) Allow servers to fetch TCP packet headers for SYN packets of new connections, for fingerprinting. From Eric Dumazet. 7) Add mode parameter to pktgen, for testing receive. From Alexei Starovoitov. 8) Cache access optimizations via simplifications of build_skb(), from Alexander Duyck. 9) Move page frag allocator under mm/, also from Alexander. 10) Add xmit_more support to hv_netvsc, from KY Srinivasan. 11) Add a counter guard in case we try to perform endless reclassify loops in the packet scheduler. 12) Extern flow dissector to be programmable and use it in new "Flower" classifier. From Jiri Pirko. 13) AF_PACKET fanout rollover fixes, performance improvements, and new statistics. From Willem de Bruijn. 14) Add netdev driver for GENEVE tunnels, from John W Linville. 15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso. 16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet. 17) Add an ECN retry fallback for the initial TCP handshake, from Daniel Borkmann. 18) Add tail call support to BPF, from Alexei Starovoitov. 19) Add several pktgen helper scripts, from Jesper Dangaard Brouer. 20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa. 21) Favor even port numbers for allocation to connect() requests, and odd port numbers for bind(0), in an effort to help avoid ip_local_port_range exhaustion. From Eric Dumazet. 22) Add Cavium ThunderX driver, from Sunil Goutham. 23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata, from Alexei Starovoitov. 24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai. 25) Double TCP Small Queues default to 256K to accomodate situations like the XEN driver and wireless aggregation. From Wei Liu. 26) Add more entropy inputs to flow dissector, from Tom Herbert. 27) Add CDG congestion control algorithm to TCP, from Kenneth Klette Jonassen. 28) Convert ipset over to RCU locking, from Jozsef Kadlecsik. 29) Track and act upon link status of ipv4 route nexthops, from Andy Gospodarek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits) bridge: vlan: flush the dynamically learned entries on port vlan delete bridge: multicast: add a comment to br_port_state_selection about blocking state net: inet_diag: export IPV6_V6ONLY sockopt stmmac: troubleshoot unexpected bits in des0 & des1 net: ipv4 sysctl option to ignore routes when nexthop link is down net: track link-status of ipv4 nexthops net: switchdev: ignore unsupported bridge flags net: Cavium: Fix MAC address setting in shutdown state drivers: net: xgene: fix for ACPI support without ACPI ip: report the original address of ICMP messages net/mlx5e: Prefetch skb data on RX net/mlx5e: Pop cq outside mlx5e_get_cqe net/mlx5e: Remove mlx5e_cq.sqrq back-pointer net/mlx5e: Remove extra spaces net/mlx5e: Avoid TX CQE generation if more xmit packets expected net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq() net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device ...	2015-06-24 16:49:49 -07:00
David S. Miller	3a07bd6fea	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/mellanox/mlx4/main.c net/packet/af_packet.c Both conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 02:58:51 -07:00
Saeed Mahameed	99611ba127	net/mlx5e: Prefetch skb data on RX Prefetch the 1st cache line used by the buffer pointed by the skb linear data. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:41 -07:00
Achiad Shochat	a1f5a1a87a	net/mlx5e: Pop cq outside mlx5e_get_cqe Separate between mlx5e_get_cqe() and mlx5_cqwq_pop(), this helps for better code readability and better CQ buffer management. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:41 -07:00
Achiad Shochat	e33910548a	net/mlx5e: Remove mlx5e_cq.sqrq back-pointer Use container_of() instead. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:39 -07:00
Achiad Shochat	8ca56ce39d	net/mlx5e: Remove extra spaces Coding Style fix, remove extra spaces. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:39 -07:00
Achiad Shochat	059ba072eb	net/mlx5e: Avoid TX CQE generation if more xmit packets expected In order to save PCI BW consumed by TX CQEs and to reduce the amount of CPU cache misses caused by TX CQE reading, we request TX CQE generation only when skb->xmit_more=0. As a consequence of the above, a single TX CQE may now indicate the transmission completion of multiple TX SKBs. This also handles a problem introduced in commit b1b8105ebf41 "net/mlx5e: Support NETIF_F_SG" where we didn't ask for NOP completions while the driver didn't have the proper code to handle this case. Fixes: b1b8105ebf41 ('net/mlx5e: Support NETIF_F_SG') Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:37 -07:00
Achiad Shochat	9fc5930625	net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion NOP completion SKBs are always NULL. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:37 -07:00
Achiad Shochat	ef583d037d	net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq() It is already assigned at mlx5e_build_rq_param() Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:36 -07:00
Saeed Mahameed	fb6c6f2529	net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them Instead of counting number of gso fragments, we can use skb_shinfo(skb)->gso_segs. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:35 -07:00
Saeed Mahameed	03289b88e3	net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues To save per-packet calculations, we use the following static mappings: 1) priv {channel, tc} to netdev txq (used @mlx5e_selec_queue()) 2) netdev txq to priv sq (used @mlx5e_xmit()) Thanks to these static mappings, no more need for a separate implementation of ndo_start_xmit when multiple TCs are configured. We believe the performance improvement of such separation would be negligible, if any. The previous way of dynamically calculating the above mappings required allocating more TX queues than actually used (@alloc_etherdev_mqs()), which is now no longer needed. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:34 -07:00
Eran Ben Elisha	f1a3badb0b	net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device Under SRIOV, the port rx/tx bytes/packets statistics should by read from the HW instead of using the PF netdevice SW accounting. This is needed in order to get the full port statistics and not just the PF own ones Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:33 -07:00
Eran Ben Elisha	9a2abf5a80	net/mlx4_en: Fix off-by-four in ethtool NUM_ALL_STATS was not updated with the new four entries, instead NUM_FLOW_STATS was updated, fix it. that caused off-by-four for all counters below pf__. Fixes: `b42de4d012` ('net/mlx4_en: Show PF own statistics via ethtool') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-24 00:42:32 -07:00
Linus Torvalds	f9d1b5a31a	Changes for 4.2 - A large cleanup of how device capabilities are checked for various features - Additional cleanups in the MAD processing - Update to the srp driver - Creation and use of centralized log message helpers - Add const to a number of args to calls and clean up call chain - Add support for extended cq create verb - Add support for timestamps on cq completion - Add support for processing OPA MAD packets -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVeyzqAAoJELgmozMOVy/di3wP/jml4F9crvQn7UBJjGm/rgcI wzZ2GZTqxQE8dn+W6gQsdKOzy0Ibxx5UYGp9ruInuxAcVh9t1PcylanasaiGMEtY mrGRFjipJ9jYa+yDQDTQi8EFMClZuMSvtRLKjzYITudCXQck37V+F5YlP6VphjX7 JeiM4a+4rD0ukk5PKGvUw51sP1eawKtEdUvnqcOEI2tJgQmzJBP4mXrhVtS/0wSc Pi8TRN5QKi3Drom/tK9QQ/ncoYngi4BKLfszCeU373HJq6qXqsxBYvs3jX6MPzfv Aooj272JxBgCYxkmEfECezDpmi3PbWDJjXj/xCLjfhjISDtHHHVLGVMODZpwUEsL 2wBgwlzdajVopSbSLvsjQNtQw25s7sDWpu+TFKbS0u+W2d0ZOyipM1Xeje+OtDHQ clhwvDhgSfeI/bJ1YdtNLbvINrwsfZD213zD+WH21A/9weAVr3hEfTuSaNFiTiRn 5yywP36TM0wH90KhiWoLrztcHvoE5p7kGuqzv04MRjrMMNHEJK2/IhWvT97Ewngu vWrZl7QRzXYcGspCOp2aJW9Wr2rhGRrv28TF+thpNrIJOB2JM4q4koCKZCcI0s2D E6pY2YQSzvrA/ZSfcWIg4yhugcycIJkOf7ur2N/U43cwGXtaCzPWVnKMApmdnVOO ZEMwD3OZ1OGcCHLhRL8Y =yISf -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: - a large cleanup of how device capabilities are checked for various features - additional cleanups in the MAD processing - update to the srp driver - creation and use of centralized log message helpers - add const to a number of args to calls and clean up call chain - add support for extended cq create verb - add support for timestamps on cq completion - add support for processing OPA MAD packets * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (92 commits) IB/mad: Add final OPA MAD processing IB/mad: Add partial Intel OPA MAD support IB/mad: Add partial Intel OPA MAD support IB/core: Add OPA MAD core capability flag IB/mad: Add support for additional MAD info to/from drivers IB/mad: Convert allocations from kmem_cache to kzalloc IB/core: Add ability for drivers to report an alternate MAD size. IB/mad: Support alternate Base Versions when creating MADs IB/mad: Create a generic helper for DR forwarding checks IB/mad: Create a generic helper for DR SMP Recv processing IB/mad: Create a generic helper for DR SMP Send processing IB/mad: Split IB SMI handling from MAD Recv handler IB/mad cleanup: Generalize processing of MAD data IB/mad cleanup: Clean up function params -- find_mad_agent IB/mlx4: Add support for CQ time-stamping IB/mlx4: Add mmap call to map the hardware clock IB/core: Pass hardware specific data in query_device IB/core: Add timestamp_mask and hca_core_clock to query_device IB/core: Extend ib_uverbs_create_cq IB/core: Add CQ creation time-stamping flag ...	2015-06-23 15:53:26 -07:00
Paul Gortmaker	138b15ed87	drivers/net: remove all references to obsolete Ethernet-HOWTO This howto made sense in the 1990s when users had to manually configure ISA cards with jumpers or vendor utilities, but with the implementation of PCI it became increasingly less and less relevant, to the point where it has been well over a decade since I last updated it. And there is no value in anyone else taking over updating it either. However the references to it continue to spread as boiler plate text from one Kconfig file into the next. We are not doing end users any favours by pointing them at this old document, so lets kill it with fire, once and for all, to hopefully stop any further spread. No code is changed in this commit, just Kconfig help text. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-23 06:50:35 -07:00
Eran Ben Elisha	62a890557f	net/mlx4_en: Support ndo_get_vf_stats Implement the ndo to gather VF statistics through the PF. All counters related to this VF are stored in a per slave list, run over the slave's list and collect all statistics. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:03 -07:00
Eran Ben Elisha	b42de4d012	net/mlx4_en: Show PF own statistics via ethtool Allow the user to observe the PF own statistics using ethtool with pf_ prefixed counter names. Those counters are the PF statistics out of the overall port statistics. Every PF QP is attached to a counter and the summary of those counters is the PF statistics. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:02 -07:00
Eran Ben Elisha	9616982f3f	net/mlx4_core: Add helper to query counters This is an infrastructure step for querying VF and PF counters. This code was in the IB driver, move it to the mlx4 core driver so it will be accessible for more use cases. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:02 -07:00
Eran Ben Elisha	6de5f7f6a1	net/mlx4_core: Allocate default counter per port Default counter per port will be allocated at the mlx4 core driver load. Every QP opened by the Ethernet driver will be attached to the port's default counter. This is an infrastructure step to collect VF statistics from the PF. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:02 -07:00
Eran Ben Elisha	68230242cd	net/mlx4_core: Add port attribute when tracking counters Counter will get its port attribute within the resource tracker when the first QP attached to it is modified to RTR. If a QP is counter-less, an attempt to create a new counter with assigned port will be made. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:02 -07:00
Eran Ben Elisha	9de92c60be	net/mlx4_core: Adjust counter grant policy in the resource tracker Each physical function has a guarantee of two counters per port, one for a default counter and one for the IB driver. Each virtual function has a guarantee of one counter per port. All other counters are free and can be obtained on demand. This is a preparation step for supporting a get_vf_stats ndo call, so we can promise a counter for every VF in order to collect their statistics from the PF context. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:01 -07:00
Eran Ben Elisha	2632d18d3a	net/mlx4_core: Remove counters table allocation from VF flow Since virtual functions get their counters indices allocation from the PF, allocate counters indices bitmap only in case the function isn't virtual. Also, check that the device has counters to allocate before creating the indices bitmap table. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:01 -07:00
Eran Ben Elisha	47d8417f59	net/mlx4_core: Add sink counter Reserve the last valid counter index for "sink" counter, when a new counter cannot be allocated, the driver will use this counter. In order to avoid allocating this counter on any other flow, fix the indices bitmap allocation range, and reserve the sink counter index. Add macro for the sink counter index and replace all appearences of the index with the macro. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:01 -07:00
Eran Ben Elisha	b72ca7e96a	net/mlx4_core: Reset counters data when freed Add resetting the counter data to the free counter flow, so the counter's data won't be accessible anymore if querying the counter. Also, on next counter allocation (to another VM for example), it will be fresh and clear. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:01 -07:00
Eran Ben Elisha	efa6bc91cb	net/mlx4_core: Check before cleaning counters bitmap If counters are not supported by the device. The indices bitmap table is not allocated during initialization. Add the symmetrical check before cleaning the counters bitmap table or freeing a counter. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 17:23:01 -07:00
Or Gerlitz	ac0a72a3e6	net/mlx4_core: Disable Granular QoS per VF under IB/Eth VPI configuration Due to firmware bug, under VPI configuration when port1 = IB and port2 = Eth, Granular QoS per VF isn't working properly. More over, the whole QP0/QP1 Para-Virtualization in the mlx4 IB driver is broken on that config. Hence, we must disable Granular QoS per VF under that configuration till a fix is introduced. Once that happens, a new device capability will be used to mark the feature support on that specific configuration. Reported-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-15 16:42:57 -07:00
Matan Barak	52033cfb5a	IB/mlx4: Add mmap call to map the hardware clock In order to read the HCA's cycle counter efficiently in user space, we need to map the HCA's register. This is done through mmap call. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-12 14:49:10 -04:00
Achiad Shochat	3191e05fea	net/mlx5e: Add transport domain to the ethernet TIRs/TISs Allocate and use transport domain by the Ethernet driver code. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:55:26 -07:00
Achiad Shochat	56508b5013	net/mlx5_core: Add transport domain alloc/dealloc support Each transport object, namely TIR and TIS, must have a transport domain number (TDN) identifier. The driver wrongly assumed that it is OK to use TDN=0 without explicit TDN allocation from the device. The TDN will also be used for isolating different processes once user mode Ethernet will be supported. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:55:26 -07:00
Saeed Mahameed	12be4b2190	net/mlx5e: Support NETIF_F_SG When NETIF_F_SG is set, each send WQE may have a different size since each skb can have different number of fragments as of LSO header etc. This implies that a given WQE may wrap around the send queue, i.e begin at its end and continue at its start. While it is legal by the device spec, we preferred a solution that avoids it - when building of current WQE is done, if the next WQE may wrap around the send queue, fill the send queue with NOPs WQEs till its end, so that the next WQE will begin at send queue start. NOP WQE for itself cannot wrap around the send queue since it is of minimal size - 64 bytes, and all send WQEs are a multiple of that size. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:55:25 -07:00
Gal Pressman	796a27ec2d	net/mlx5e: Enforce max flow-tables level >= 3 The Ethernet driver requires at least 3 flow table levels to operate, enforce that. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:55:25 -07:00
Saeed Mahameed	cd58c714ac	net/mlx5e: Disable client vlan TX acceleration We need to resolve a HW configuration issue for enabling HW CVLAN insertion. Meanwhile, no need to implement the VLAN insertion in the driver, rather use the generic kernel VLAN insertion method. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:55:25 -07:00
Saeed Mahameed	fc11fbf9a7	net/mlx5e: Add HW cacheline start padding Enable HW cacheline start padding and align RX WQE size to cacheline while considering HW start padding. Also, fix dma_unmap call to use the correct SKB data buffer size. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:55:25 -07:00
Saeed Mahameed	facc9699f0	net/mlx5e: Fix HW MTU settings Previously we configured HW MTU to be netdev->mtu, actually we need to configure netdev->mtu + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN). Also, query MTU can not fail, hence make the relevant helper a void functionm, add mlx5e_set_dev_port_mtu, helper function to handle MTU setting. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:55:25 -07:00
Dan Carpenter	7ec0bb227a	net/mlx5_core: fix an error code We return success if mlx5e_alloc_sq_db() fails but we should return an error code. Fixes: `f62b8bb8f2` ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:44:41 -07:00
Fabian Frederick	e55898a7ed	net/mlx4_core: use swap() in mlx4_make_profile() Use kernel.h macro definition. Thanks to Julia Lawall for Coccinelle scripting support. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:19:41 -07:00
Fabian Frederick	2df1cafa67	net/mlx4: use swap() in mlx4_init_qp_table() Use kernel.h macro definition. Thanks to Julia Lawall for Coccinelle scripting support. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-11 15:19:41 -07:00
Majd Dibbiny	7cf7fa529d	net/mlx5_core: Fix static checker warnings around system guid query flow Fix static checker warnings in the flow of system guid query. Fixes: `707c4602cd` ('net/mlx5_core: Add new query HCA vport commands') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-07 20:11:17 -07:00
Haggai Abramonvsky	4aa17b2879	mlx5: Enable mutual support for IB and Ethernet Ethernet functionality is only available when working in ISSI > 0 mode. Previously, the IB driver wasn't ready to work on that mode, and hence building both the IB driver and the Ethernet functionality in the core driver were disallowed by Kconfigs. Now, once we have all the pre-steps in place, we can remove this limitation. The last steps in the IB driver for getting that setup to work are: create dummy SRQ for the driver's use (until now we could use XRC_SRQ as SRQ and XRC_SRQ, after moving to ISSI > 0, we separate XRC SRQs from basic SRQs) and adapt the create QP function to be compatible with ISSI > 0. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:02 -07:00
Majd Dibbiny	a124d13ef5	net/mlx5_core: Add more query port helpers Add the following helpers: 1. mlx5_query_port_proto_oper -- queries the port speed port mask 2. mlx5_query_port_link_width_oper - queries the port link with bitmask 3. mlx5_query_port_vl_hw_cap - queries the Virtual Lanes supported on this port These helpers will be used from the IB driver when working in ISSI > 0 mode. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:02 -07:00
Majd Dibbiny	a05bdefa40	net/mlx5_core: Use port number when querying port ptys Until now, mlx5_query_port_ptys always queried port number one. Added new argument in the function's prototype so we can also query the second port. This will be needed when thr helper will be invoked from the IB driver on non FPP (Function-Per-Port) devices. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:01 -07:00
Majd Dibbiny	e760152d08	net/mlx5_core: Use port number in the query port mtu helpers Extend the function prototypes for max and operational mtu to take the local port number. In the Ethernet driver is this hard coded to one, since ConnectX4 Ethernet devices are always function-per-port. The IB driver also serves older devices (ConnectIB) which isn't such, and hence the part can vary. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:01 -07:00
Majd Dibbiny	211e6c80e5	net/mlx5_core: Get vendor-id using the query adapter command Add two wrapper functions to the query adapter command: 1. mlx5_query_board_id -- replaces the old mlx5_cmd_query_adapter. 2. mlx5_core_query_vendor_id -- retrieves the vendor_id from the query_adapter command. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:01 -07:00
Majd Dibbiny	707c4602cd	net/mlx5_core: Add new query HCA vport commands Added the implementation for the following commands: 1. QUERY_HCA_VPORT_GID 2. QUERY_HCA_VPORT_PKEY 3. QUERY_HCA_VPORT_CONTEXT They will be needed when we move to work with ISSI > 0 in the IB driver too. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:01 -07:00
Majd Dibbiny	d18a9470f8	net/mlx5_core: Make the vport helpers available for the IB driver too Move the vport header file to be under include/linux/mlx5, such that the mlx5 IB can use it as well. Also add nic_ prefix to the vport NIC commands to differeniate between HCA vport commands and NIC vport commands. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:01 -07:00
Haggai Abramonvsky	e74a1db033	net/mlx5_core: Check the return bitmask when querying ISSI The determination of the supported ISSI versions should be conditioned on the returned mask, and not only on the return status of the query ISSI command, fix that. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:01 -07:00
Haggai Abramonvsky	01949d0109	net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0 When working in ISSI > 0 mode, the model exposed by the device for XRCs and SRQs is different. XRCs use XRC SRQs and plain SRQs are based on RPM (Receive Memory Pool). Add helper functions to create, modify, query, and arm XRC SRQs and RMPs. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:01 -07:00
Haggai Abramonvsky	7db22ffb5b	net/mlx5_core: Apply proper name convention to helpers Some core helper functions were named with mlx5_ only prefix, fix that to mlx5_core_ so we're aligned with the overall scheme used for core services. Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:00 -07:00
Amir Vadai	5e24851ec5	net/mlx5_en: Add missing check for memory allocation failure The patch `afb736e933`: "net/mlx5: Ethernet resource handling files" from May 28, 2015, leads to the following static checker warning: drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c:726 mlx5e_create_main_flow_table() error: potential null dereference 'g'. (kcalloc returns null) Fixes: `afb736e933` ("net/mlx5: Ethernet resource handling files") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-04 16:41:00 -07:00
Carol Soto	613d8c188f	net/mlx4_core: fix typo in mlx4_set_vf_mac fix typo in mlx4_set_vf_mac Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-03 20:12:58 -07:00
Carol Soto	ed3d2276ef	net/mlx4_core: need to call close fw if alloc icm is called twice If mlx4_enable_sriov is called by adapter without this feature MLX4_DEV_CAP_FLAG2_SYS_EQS then during this path the function alloc icm is called twice without freeing the structures from the first time. Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-03 20:12:58 -07:00
Carol L Soto	5114a04e6c	net/mlx4_core: double free of dev_vfs If user loads mlx4_core with num_vfs greater than supported then variable dev->dev_vfs is freed 2 times after unloading the driver. Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-03 20:12:58 -07:00
Or Gerlitz	db9777e376	net/mlx4_core: Fix build failure introduced by the EQ pool changes When CONFIG_RFS_ACCEL or SMP aren't set, we fail to build, fix it. Also, avoid build warning as of unused function on that setup. Fixes: `c66fa19c40` ('net/mlx4: Add EQ pool') Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-03 19:34:51 -07:00
Ira Weiny	a97e2d86a9	IB/core cleanup: Add const on args - device->process_mad The process_mad device function declares some parameters as "in". Make those parameters const and adjust the call tree under process_mad in the various drivers accordingly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-06-02 09:33:13 -04:00
David S. Miller	dda922c831	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/phy/amd-xgbe-phy.c drivers/net/wireless/iwlwifi/Kconfig include/net/mac80211.h iwlwifi/Kconfig and mac80211.h were both trivial overlapping changes. The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and the bug fix that happened on the 'net' side is already integrated into the rest of the amd-xgbe driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-06-01 22:51:30 -07:00
Matan Barak	6d90aa5cf1	net/mlx4_core: Make sure there are no pending async events when freeing CQ When freeing a CQ, we need to make sure there are no asynchronous events (on the ASYNC EQ) that could relate to this CQ before freeing it. This is done by introducing synchronize_irq. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Ido Shamay	de1618034a	net/mlx4_core: Move affinity hints to mlx4_core ownership Now that EQs management is in the sole responsibility of mlx4_core, the IRQ affinity hints configuration should be in its hands as well. request_irq is called only once by the first consumer (maybe mlx4_ib), so mlx4_en passes the affinity mask too late. We also need to request vectors according to the cores we want to run on. mlx4_core distribution of IRQs to cores is straight forward, EQ(i)->IRQ will set affinity hint to core i. Consumers need to request EQ vectors, according to their cores considerations (NUMA). Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Matan Barak	c66fa19c40	net/mlx4: Add EQ pool Previously, mlx4_en allocated EQs and used them exclusively. This affected RoCE performance, as applications which are events sensitive were limited to use only the legacy EQs. Change that by introducing an EQ pool. This pool is managed by mlx4_core. EQs are assigned to ports (when there are limited number of EQs, multiple ports could be assigned to the same EQs). An exception to this rule is the ASYNC EQ which handles various events. Legacy EQs are completely removed as all EQs could be shared. When a consumer (mlx4_ib/mlx4_en) requests an EQ, it asks for EQ serving on a specific port. The core driver calculates which EQ should be assigned to that request. Because IRQs are shared between IB and Ethernet modules, their names only include the PCI device BDF address. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Matan Barak	48564135cb	net/mlx4_core: Demote simple multicast and broadcast flow steering rules In SRIOV, when simple (i.e - Ethernet L2 only) flow steering rules are created, always create them at MLX4_DOMAIN_NIC priority (instead of the real priority the function created them at). This is done in order to let multiple functions add broadcast/multicast rules without affecting other functions, which is necessary for DPDK in SRIOV. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 23:35:34 -07:00
Amir Vadai	f62b8bb8f2	net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality This is the Ethernet part of the driver for the Mellanox ConnectX(R)-4 Single/Dual-Port Adapter supporting 100Gb/s with VPI. The driver extends the existing mlx5 driver with Ethernet functionality. This patch contains the driver entry points but does not include transmit and receive (see the previous patch in the series) routines. It also adds the option MLX5_CORE_EN to Kconfig to enable/disable the Ethernet functionality. Currently, Kconfig is programmed to make Ethernet and Infiniband functionality mutally exclusive. Also changed MLX5_INFINIBAND to be depandant on MLX5_CORE instead of selecting it, since MLX5_CORE could be selected without MLX5_INFINIBAND being selected. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:51 -07:00
Amir Vadai	afb736e933	net/mlx5: Ethernet resource handling files This patch contains the resource handling files: - flow_table.c: This file contains the code to handle the low level API to configure hardware flow table. It is separated from the flow_table_en.c, because it will be used in the future by Raw Ethernet QP in mlx5_ib too. - en_flow_table.[ch]: Ethernet flow steering handling. The flow table object contain a mapping between flow specs and TIRs. This mechanism will be used also to configure e-switch in the future, when SR-IOV support will be added. - transobj.[ch] - Low level functions to create/modify/destroy the transport objects: RQ/SQ/TIR/TIS - vport.[ch] - Handle attributes of a virtual port (vPort) in the embedded switch. Currently this switch is a passthrough, until SR-IOV support will be added. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:39 -07:00
Amir Vadai	e586b3b0ba	net/mlx5: Ethernet Datapath files en_[rt]x.c contains the data path related code specific to tx or rx. en_txrx.c contains data path code which is common for both the rx and tx, this is mainly napi related code. Below are the objects that are being used by the hardware and the driver in the data path: Channel - one channel per IRQ. Every channel object contains: RQ - describes the rx queue TIR - One TIR (Transport Interface Receive) object per flow type. TIR contains attributes for a type of rx flow (e.g IPv4, IPv6 etc). A flow is defined in the Flow Table. Currently TIR describes the RSS hash parameters if exists and LRO attributes. SQ - describes the a tx queue. There is one SQ (Send Queue) per TC (traffic class). TIS - There is one TIS (Transport Interface Send) per TC. It describes the TC and may later be extended to describe more transport properties. Both RQ and SQ inherit from the object WQ (work queue). This common code to describe the layout of CQE's WQE's in memory is in the files wq.[cj] For every channel there is one NAPI context that is used for RX and for TX. Driver is using netdev_alloc_skb() to allocate skb's. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:20 -07:00
Saeed Mahameed	e725440e75	net/mlx5_core: Set/Query port MTU commands Introduce set/Query low level functions to access MTU in hardware. To be used by the netdev. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:24:08 -07:00
Rana Shahout	90b3e38d04	net/mlx5_core: Modify CQ moderation parameters Introduce mlx5_core_modify_cq_moderation() to be used by the netdev, to set hardware coalescing. Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:59 -07:00
Rana Shahout	4c916a7980	net/mlx5_core: Implement get/set port status Implemet get/set port status low level functions to be exposed by the netdev. Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:46 -07:00
Saeed Mahameed	adb0c9545b	net/mlx5_core: Implement access functions of ptys register fields Those registers will be used by the ethtool to set/get settings. Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:31 -07:00
Saeed Mahameed	938fe83c8d	net/mlx5_core: New device capabilities handling - Query all supported types of dev caps on driver load. - Store the Cap data outbox per cap type into driver private data. - Introduce new Macros to access/dump stored caps (using the auto generated data types). - Obsolete SW representation of dev caps (no need for SW copy for each cap). - Modify IB driver to use new macros for checking caps. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:22 -07:00
Saeed Mahameed	e281682bf2	net/mlx5_core: HW data structs/types definitions cleanup mlx5_ifc.h was heavily modified here since it is now generated by a script from the device specification (PRM rev 0.25). This specification is backward compatible to existing hardware. Some structures/fields were added here in order to enable the Ethernet functionality of the driver. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:23:11 -07:00
Saeed Mahameed	db058a186f	net/mlx5_core: Set irq affinity hints Preparation for upcoming ethernet driver. - Move msix array from eq_table struct to priv since its not related to eq_table - Intorduce irq_info struct to hold all irq information - Move name from mlx5_eq to irq_info struct since it is irq property. - Set IRQ affinity hints Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:22:48 -07:00
Amir Vadai	64ffaa2159	net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory As David Daney pointed in mlx4_core driver [1], mlx5_core is also misusing the DMA-API. This patch is removing the code that vmap() memory allocated by dma_alloc_coherent(). After this patch, users of this drivers might fail allocating resources on memory fragmeneted systems. This will be fixed later on. [1] - https://patchwork.ozlabs.org/patch/458531/ CC: David Daney <david.daney@cavium.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-30 18:22:37 -07:00
Linus Torvalds	6e49ba1bb1	NOW WITH TESTING! Two fixes which got lost in my recent distraction. One is a weird cpumask function which needed to be rewritten, the other is a module bug which is cc:stable. Thanks, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVaBENAAoJENkgDmzRrbjxxL4QAJMFwo21VN8rwIsEJ2P/Yh4u YXxJtnbrSPZtyad8J4G6FGOOfM7ImkkADhGJE8MN05goIFmeORWduAiozBtZBfo3 OVpeo0HIGTEMXq/QCxSQsDhP9MSeWV592vjhlqQJ2KhU9Gpstc/Ub9ArVWuY3FD3 CFN6ciw+5DIhoc6jMI2P9XX7jpR4VOBu320j+3lQ1QZ1aEZIaPefWH+VYuIZXirq E6N4yKgTahKb1Clr0DS6EB2Z5g+upNzFf4WBHaChP5EklwatZkHAOvzfSLWcbShI ochGV5LBPcn7ruqOD5mR4LGkxfQSYPCKCKihmenD/EVoO/dshKOQREfsqRXNsh5X xk4yx/VCy68ubIjx7FIDL18qDvJrX82+Z2bYZbENvKrVinaQ7MWB+CokK0fNW0ai ZMP5s32vSUZMMIIE7+fS4n3BLUxOpLZC8S0wIac19jNKzCHVTuhnUolCHk11zQLk IIDHEJwzvWtPjKOyUyd7HG0bYeczwf8DZgHg+xom9BNbHbK3Jk5d1Sibjgf8eGg+ O36XR8FYYvqHwqqrPKSSaWoLj578/IWyHZg/V4tQ2HWi189BVHk6Iw2knftsvvPw pBu2AdbRSLLD+X/pwrdmm+xgytjUIr1X/Qnwj/eE5MvB/vaVVwV0OjapU/Z6S+dL JrZGvbWcviyjpvGD+vG1 =wuP+ -----END PGP SIGNATURE----- Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull fixes for cpumask and modules from Rusty Russell: " NOW WITH TESTING! Two fixes which got lost in my recent distraction. One is a weird cpumask function which needed to be rewritten, the other is a module bug which is cc:stable" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: cpumask_set_cpu_local_first => cpumask_local_spread, lament module: Call module notifier on failure after complete_formation()	2015-05-29 11:24:28 -07:00
Rusty Russell	f36963c9d3	cpumask_set_cpu_local_first => cpumask_local_spread, lament `da91309e0a` (cpumask: Utility function to set n'th cpu...) created a genuinely weird function. I never saw it before, it went through DaveM. (He only does this to make us other maintainers feel better about our own mistakes.) cpumask_set_cpu_local_first's purpose is say "I need to spread things across N online cpus, choose the ones on this numa node first"; you call it in a loop. It can fail. One of the two callers ignores this, the other aborts and fails the device open. It can fail in two ways: allocating the off-stack cpumask, or through a convoluted codepath which AFAICT can only occur if cpu_online_mask changes. Which shouldn't happen, because if cpu_online_mask can change while you call this, it could return a now-offline cpu anyway. It contains a nonsensical test "!cpumask_of_node(numa_node)". This was drawn to my attention by Geert, who said this causes a warning on Sparc. It sets a single bit in a cpumask instead of returning a cpu number, because that's what the callers want. It could be made more efficient by passing the previous cpu rather than an index, but that would be more invasive to the callers. Fixes: `da91309e0a` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (then rebased) Tested-by: Amir Vadai <amirv@mellanox.com> Acked-by: Amir Vadai <amirv@mellanox.com> Acked-by: David S. Miller <davem@davemloft.net>	2015-05-28 11:05:20 +09:30
Benjamin Poirier	f4ecf29fd7	mlx4_core: Fix fallback from MSI-X to INTx The test in mlx4_load_one() to remove MLX4_FLAG_MSI_X expects mlx4_NOP() to fail with -EBUSY. It is also necessary to avoid the reset since the device is not fully reinitialized before calling mlx4_start_hca() a second time. Note that this will also affect mlx4_test_interrupts(), the only other user of MLX4_CMD_NOP. Fixes: `f5aef5a` ("net/mlx4_core: Activate reset flow upon fatal command cases") Signed-off-by: Benjamin Poirier <bpoirier@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-27 13:08:03 -04:00
Or Gerlitz	be9b9eca25	net/mlx4_core: Enable single ported IB VFs Remove the limitation that disallows configuring single ported VFs in the presence of IB ports, after addressing the issues that prevented that to work. SMI (QP0) requests/responses are still not supported for single ported IB VFs. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-24 23:05:10 -04:00
Or Gerlitz	e5dfbf9a79	net/mlx4_core: Adjust the schedule queue port in reset-to-init too It's legal for drivers to provide the QP port through the QPC schedule-queue field on the reset-to-init QP state change. Add adjusting of the schedule queue port in the SRIOV wrapper for that operation too. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-24 23:05:10 -04:00
Or Gerlitz	f40e99e927	net/mlx4_core: Adjust the schedule queue port for single ported IB VFs Some VF drivers flow set the schedule queue in the QP context but without setting none of OPTPAR_SCHED_QUEUE or OPTPAR_PRIMARY_ADDR_PATH. To allow for such non-modified drivers to function as single ported IB VFs, we must adjust the schedule queue port whenever being set, e.g as currently done for single ported Eth VFs. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-24 23:05:09 -04:00
Or Gerlitz	74d4943fbb	net/mlx4_core: Modify port values when generting EQEs for VFs As part of enabling single ported VFs over IB ports we need to handle some of the flows for generting EQ events for VFs which don't come into play under Eth ports. This mainly includes port management events derived from changes of the phyiscal port (lid change, client re-register, down/up, etc), VF pkey table changes and VF guid changes initiated by the IB driver. (1) make sure that events are generated only for VFs sitting on the relevant physical port (under the ALL_SLAVES flow). (2) before generating the event, convert from physical (one or two) to VF port (always equals one). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-24 23:05:09 -04:00
Or Gerlitz	7c35ef4525	net/mlx4_core: Enhance the MAD_IFC wrapper to convert VF port to physical Single port VFs always provide port = 1 (even if the actual physical port used is port 2). As such, we need to convert the port provided by the VF to the physical port before calling into the firmware. It turns out that the Linux mlx4 VF RoCE driver maintains a copy of the GID table and hence this change became critical only for single ported IB VFs, but it could be needed for other RoCE VF drivers too. Fixes: `449fc48866` ('net/mlx4: Adapt code for N-Port VF') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-24 23:05:09 -04:00
Bjorn Helgaas	c1c52db16e	net/mlx4: Avoid 'may be used uninitialized' warnings With a cross-compiler based on gcc-4.9, I see warnings like the following: drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_SW2HW_CQ_wrapper': drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3048:10: error: 'cq' may be used uninitialized in this function [-Werror=maybe-uninitialized] cq->mtt = mtt; I think the warning is spurious because we only use cq when cq_res_start_move_to() returns zero, and it always initializes *cq in that case. The srq case is similar. But maybe gcc isn't smart enough to figure that out. Initialize cq and srq explicitly to avoid the warnings. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-14 22:28:48 -04:00
Yishai Hadas	2d3c739739	net/mlx4_core: Work properly with EQ numbers > 256 in SRIOV The Firmware uses dynamic EQs allocation based on number of VFs and max EQs that can be allocated. As a result, VF can have EQ numbers that are larger than 256. According to the firmware spec, the max value is limited to be 1024 (10 bits), adapt the relevant code accordingly. This bug was impossible to hit prior to commit `7ae0e400cd` ("net/mlx4_core: Flexible (asymmetric) allocation of EQs and MSI-X vectors for PF/VFs") which actually enables large number of EQs for VFs. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-05 19:39:12 -04:00
Eran Ben Elisha	cae0633872	net/mlx4_en: Fix off-by-one in counters manipulation This caused the en_stats_adder helper to accumulate a field which is not related to the counter, fix that. Fixes: `a3333b35da` ('net/mlx4_en: Moderate ethtool callback to show [..]') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-05 19:39:12 -04:00
Ido Shamay	07841f9d94	net/mlx4_en: Schedule napi when RX buffers allocation fails When system is out of memory, refilling of RX buffers fails while the driver continue to pass the received packets to the kernel stack. At some point, when all RX buffers deplete, driver may fall into a sleep, and not recover when memory for new RX buffers is once again availible. This is because hardware does not have valid descriptors, so no interrupt will be generated for the driver to return to work in napi context. Fix it by schedule the napi poll function from stats_task delayed workqueue, as long as the allocations fail. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-30 16:47:50 -04:00
David Ahern	17d5ceb6e4	net/mlx4_core: Fix unaligned accesses Addresses the following kernel logs seen during boot: Kernel unaligned access at TPC[100ee150] mlx4_QUERY_HCA+0x80/0x248 [mlx4_core] Kernel unaligned access at TPC[100f071c] mlx4_QUERY_ADAPTER+0x100/0x12c [mlx4_core] Kernel unaligned access at TPC[100f071c] mlx4_QUERY_ADAPTER+0x100/0x12c [mlx4_core] Kernel unaligned access at TPC[100f071c] mlx4_QUERY_ADAPTER+0x100/0x12c [mlx4_core] Kernel unaligned access at TPC[100f071c] mlx4_QUERY_ADAPTER+0x100/0x12c [mlx4_core] Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-30 16:26:30 -04:00
Benjamin Poirier	f94813f3c1	mlx4_en: Use correct loop cursor in error path. Signed-off-by: Benjamin Poirier <bpoirier@suse.de> Fixes: `9e311e7` ("net/mlx4_en: Use affinity hint") Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-30 16:25:14 -04:00
Benjamin Poirier	42eab005a5	mlx4: Fix tx ring affinity_mask creation By default, the number of tx queues is limited by the number of online cpus in mlx4_en_get_profile(). However, this limit no longer holds after the ethtool .set_channels method has been called. In that situation, the driver may access invalid bits of certain cpumask variables when queue_index >= nr_cpu_ids. Signed-off-by: Benjamin Poirier <bpoirier@suse.de> Acked-by: Ido Shamay <idos@mellanox.com> Fixes: `d03a68f` ("net/mlx4_en: Configure the XPS queue mapping on driver load") Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-29 15:16:57 -04:00
Linus Torvalds	2decb2682f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) mlx4 doesn't check fully for supported valid RSS hash function, fix from Amir Vadai 2) Off by one in ibmveth_change_mtu(), from David Gibson 3) Prevent altera chip from reporting false error interrupts in some circumstances, from Chee Nouk Phoon 4) Get rid of that stupid endless loop trying to allocate a FIN packet in TCP, and in the process kill deadlocks. From Eric Dumazet 5) Fix get_rps_cpus() crash due to wrong invalid-cpu value, also from Eric Dumazet 6) Fix two bugs in async rhashtable resizing, from Thomas Graf 7) Fix topology server listener socket namespace bug in TIPC, from Ying Xue 8) Add some missing HAS_DMA kconfig dependencies, from Geert Uytterhoeven 9) bgmac driver intends to force re-polling but does so by returning the wrong value from it's ->poll() handler. Fix from Rafał Miłecki 10) When the creater of an rhashtable configures a max size for it, don't bark in the logs and drop insertions when that is exceeded. Fix from Johannes Berg 11) Recover from out of order packets in ppp mppe properly, from Sylvain Rochet * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) bnx2x: really disable TPA if 'disable_tpa' option is set net:treewide: Fix typo in drivers/net net/mlx4_en: Prevent setting invalid RSS hash function mdio-mux-gpio: use new gpiod_get_array and gpiod_put_array functions netfilter; Add some missing default cases to switch statements in nft_reject. ppp: mppe: discard late packet in stateless mode ppp: mppe: sanity error path rework net/bonding: Make DRV macros private net: rfs: fix crash in get_rps_cpus() altera tse: add support for fixed-links. pxa168: fix double deallocation of managed resources net: fix crash in build_skb() net: eth: altera: Resolve false errors from MSGDMA to TSE ehea: Fix memory hook reference counting crashes net/tg3: Release IRQs on permanent error net: mdio-gpio: support access that may sleep inet: fix possible panic in reqsk_queue_unlink() rhashtable: don't attempt to grow when at max_size bgmac: fix requests for extra polling calls from NAPI tcp: avoid looping in tcp_send_fin() ...	2015-04-27 14:05:19 -07:00
Amir Vadai	b37069090b	net/mlx4_en: Prevent setting invalid RSS hash function mlx4_en_check_rxfh_func() was checking for hardware support before setting a known RSS hash function, but didn't do any check before setting unknown RSS hash function. Need to make it fail on such values. In this occasion, moved the actual setting of the new value from the check function into mlx4_en_set_rxfh(). Fixes: `947cbb0` ("net/mlx4_en: Support for configurable RSS hash function") Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-27 13:36:48 -04:00
Linus Torvalds	7c034dfd58	InfiniBand/RDMA updates for 4.1: - IPoIB fixes from Doug Ledford and Erez Shitrit - iSER updates from Sagi Grimberg - mlx4 GUID handling changes from Yishai Hadas - other misc fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJVN9SzAAoJEENa44ZhAt0hWq4QAJRFrwoe9ubTextSHeTU0FkY CydiQtGWrhyAHTX/KtdB1Uv9FzGHc6gqkAOXImouacYTM9ffypMF6Oj4xIYIMQtz MvNlNm07KOtQYlubiaZWcP5BjdLfMZjQxb03/9smygLTBjm80dAEt5X1znx7YrqI ZfE+ibPdvRqVEvFZKfT2U0kGU6oEVKrbJEiUCoJPwwcghDZQl18YmGOxt5qdI2uO V+71ozwozT8utSIl7S2YTJZBdkJ7tLrqrX2D/D2jUAmh1rqHIDrsXXiZ44UJj82i oXuwqmHXfq1LfuC9kxCX5JJpGeLE7E3OoxM1zIev31710zPA0v57rNKKweCi2Tj6 Z36B0SIRV4ipWr/sBhVDr1Ffc/uap3DOIEU9Z+t8rwhELCEVuxmNaNb0K1e5nPiy YOQYp/ctC0NslM4mqQJLhGMVl6H8PjodbM1whnYZLsF1+8clNvdtLYzy/cA5fGbO tngUGXu0YZGdwvfuQhi5FB45XLaErJaPcMH0QRI5G0JgtjvbzXiMlqWtekTUBi7W DJNQlVRI4S1RYRBYkq709ymXiWwTeh3rhH+ZJpM+aY8b0NR/lx+dNyesNG+7GBJH y5UOOUck0w+JbQzZo264I6a5e8pXq3kMi3BH8pF4Jbo5WvxSF6uriXb6Q1JzfH20 Jn0J6W9ghCSfrhMI1zgQ =v1jB -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA updates from Roland Dreier: - IPoIB fixes from Doug Ledford and Erez Shitrit - iSER updates from Sagi Grimberg - mlx4 GUID handling changes from Yishai Hadas - other misc fixes * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (51 commits) mlx5: wrong page mask if CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for 32Bit architectures IB/iser: Rewrite bounce buffer code path IB/iser: Bump version to 1.6 IB/iser: Remove code duplication for a single DMA entry IB/iser: Pass struct iser_mem_reg to iser_fast_reg_mr and iser_reg_sig_mr IB/iser: Modify struct iser_mem_reg members IB/iser: Make fastreg pool cache friendly IB/iser: Move PI context alloc/free to routines IB/iser: Move fastreg descriptor pool get/put to helper functions IB/iser: Merge build page-vec into register page-vec IB/iser: Get rid of struct iser_rdma_regd IB/iser: Remove redundant assignments in iser_reg_page_vec IB/iser: Move memory reg/dereg routines to iser_memory.c IB/iser: Don't pass ib_device to fall_to_bounce_buff routine IB/iser: Remove a redundant struct iser_data_buf IB/iser: Remove redundant cmd_data_len calculation IB/iser: Fix wrong calculation of protection buffer length IB/iser: Handle fastreg/local_inv completion errors IB/iser: Fix unload during ep_poll wrong dereference ib_srpt: convert printk's to pr_* functions ...	2015-04-22 11:50:05 -07:00
Eran Ben Elisha	fab9adfb71	net/mlx4_core: Fix reading HCA max message size in mlx4_QUERY_DEV_CAP Currently we parse max_msg_sz from the wrong offset in QUERY_DEV_CAP, fix to use the right offset. Fixes: `0b131561a7` ('net/mlx4_en: Add Flow control statistics [..]') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-21 17:36:08 -04:00
Doug Ledford	c1c2fef6cf	Merge branches 'cve-fixup', 'ipoib', 'iser', 'misc-4.1', 'or-mlx4' and 'srp' into for-4.1	2015-04-15 16:24:49 -04:00
Honggang LI	59d2d18cc4	mlx5: wrong page mask if CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for 32Bit architectures If CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for x86 systems and physical memory is more than 4GB, dma_map_page may return a valid memory address which greater than 0xffffffff. As a result, the mlx5 device page allocator RB tree will be initialized with valid addresses greater than 0xfffffff. However, (addr & PAGE_MASK) set the high four bytes to zeros. So, it's impossible for the function, free_4k, to release the pages whose addresses greater than 4GB. Memory leaks. And mlx5_ib module can't release the pages when user try to remove the module, as a result, system hang. [root@rdma05 root]# dmesg \| grep addr \| head addr = 3fe384000 addr & PAGE_MASK = fe384000 [root@rdma05 root]# rmmod mlx5_ib <---- hang on ---------------------- cosnole log ----------------- mlx5_ib 0000:04:00.0: irq 138 for MSI/MSI-X alloc irq_desc for 139 on node -1 alloc kstat_irqs on node -1 mlx5_ib 0000:04:00.0: irq 139 for MSI/MSI-X 0000:04:00.0:free_4k:221:(pid 1519): page not found 0000:04:00.0:free_4k:221:(pid 1519): page not found 0000:04:00.0:free_4k:221:(pid 1519): page not found 0000:04:00.0:free_4k:221:(pid 1519): page not found ---------------------- cosnole log ----------------- Fixes: `bf0bf77f65` ('mlx5: Support communicating arbitrary host page size to firmware') Signed-off-by: Honggang Li <honli@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-04-15 16:20:51 -04:00
Yishai Hadas	e9a7ff3a71	net/mlx4_core: Return the admin alias GUID upon host view request Return the admin alias GUID value upon a GET request via HOST. We do this so that the GUID value requested by the admin is returned even if the SM has not yet approved this GUID (e.g. the SM is down). Note that this does not create a problem, since the virtual port will remain down until the SM does ACK the requested GUID value. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-04-15 15:51:50 -04:00
Yishai Hadas	a0667a83af	net/mlx4_core: Raise slave shutdown event upon FLR There might be cases that PF doesn't get a "reset" command upon slave down (e.g. virsh destroy). In these cases, however, an FLR event is issued. Therefore, when the PF receives an FLR event for a slave, it should also generate a shutdown event on the PF for that slave, to let the PF upper layers (mlx4_ib, eth) perform any required cleanup/actions associated with slave shutdown. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-04-15 15:51:50 -04:00
Yishai Hadas	fb517a4f03	net/mlx4_core: Set initial admin GUIDs for VFs To have out of the box experience, the PF generates random GUIDs who serve as the initial admin values. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-04-15 15:51:50 -04:00
Yishai Hadas	773af94e4e	net/mlx4_core: Manage alias GUID per VF Manages alias GUIDs per VF per port in the core layer. This is a pre-step for managing alias GUIDs in a mode that the admin GUID is returned via ib_query_gid() regardless of whether the SM has approved it or not. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2015-04-15 15:51:50 -04:00
Alexander Duyck	12b3375f39	mlx4/mlx5: Use dma_wmb/rmb where appropriate This patch should help to improve the performance of the mlx4 and mlx5 on a number of architectures. For example, on x86 the dma_wmb/rmb equates out to a barrer() call as the architecture is already strong ordered, and on PowerPC the call works out to a lwsync which is significantly less expensive than the sync call that was being used for wmb. I placed the new barriers between any spots that seemed to be trying to order memory/memory reads or writes, if there are any spots that involved MMIO I left the existing wmb in place as the new barriers cannot order transactions between coherent and non-coherent memories. v2: Reduced the replacments to just the spots where I could clearly identify the usage pattern. Cc: Amir Vadai <amirv@mellanox.com> Cc: Ido Shamay <idos@mellanox.com> Cc: Eli Cohen <eli@mellanox.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-09 14:25:25 -04:00
David S. Miller	c85d6975ef	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/mellanox/mlx4/cmd.c net/core/fib_rules.c net/ipv4/fib_frontend.c The fib_rules.c and fib_frontend.c conflicts were locking adjustments in 'net' overlapping addition and removal of code in 'net-next'. The mlx4 conflict was a bug fix in 'net' happening in the same place a constant was being replaced with a more suitable macro. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-06 22:34:15 -04:00
Jack Morgenstein	fde913e254	net/mlx4_core: Fix error message deprecation for ConnectX-2 cards Commit `1daa4303b4` ("net/mlx4_core: Deprecate error message at ConnectX-2 cards startup to debug") did the deprecation only for port 1 of the card. Need to deprecate for port 2 as well. Fixes: `1daa4303b4` ("net/mlx4_core: Deprecate error message at ConnectX-2 cards startup to debug") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-06 17:32:27 -04:00
Saeed Mahameed	64613d9499	net/mlx5_core: Extend struct mlx5_interface to support multiple protocols Preparation for ethernet driver. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:43 -04:00
Saeed Mahameed	233d05d28a	net/mlx5_core: Move completion eqs from mlx5_ib to mlx5_core Preparation for ethernet driver. These functions will be used in drivers other than mlx5_ib. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:42 -04:00
Achiad Shochat	4ae6c18c59	net/mlx5_core: Update module info macros for ConnectX4 Support Declare the support of new ConnectX4 HCA in module info Pump up the module version Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:42 -04:00
Saeed Mahameed	302bdf68fc	net/mlx5_core: Fix Mellanox copyright note Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:42 -04:00
Achiad Shochat	4cbdd27c9c	net/mlx5_core: Fix a bug in alloc_token In alloc_token(), the token '1' would be allocated twice consecutively. Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:42 -04:00
Ira Gusinsky	21db507439	net/mlx5_core: Avoid usage command work entry after writing command doorbell Avoid usage of command work entry in cmd_work_handler since it can be released by mlx5_cmd_invoke before the work handler returns to running. Signed-off-by: Ira Gusinsky <irenag@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:41 -04:00
Eli Cohen	05e4ecd1dc	net/mlx5_core: Avoid copying outbox in aysnc command completion Avoid copying to the output buffer in cmd_exec since this is done after the command is completed. Failure to do this may cause cases where the callback handler is called before the copy done by cmd_exec which then overwrites it. Reported-by: Tamer Hleihel <tamerh@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:41 -04:00
Eli Cohen	64599cca51	net/mlx5_core: Use coherent memory for command interface page Use coherent memory for the commands descriptor page. Take measures to make sure the page is aligned to MLX5_ADAPTER_PAGE_SIZE as required by the hardware. Reported-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:41 -04:00
Achiad Shochat	60722c2ba0	net/mlx5_core: Use the right inbox struct in destroy mkey command struct mlx5_query_mkey_mbox_in rather than mlx5_destroy_mkey_mbox_in Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:41 -04:00
Saeed Mahameed	b812b5441e	net/mlx5_core: Clear doorbell record inside mlx5_db_alloc() Do it in one place instead of every where the function is invoked Signed-off-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:41 -04:00
Eli Cohen	9ef9baa2ac	net/mlx5_core: Avoid setting DC requestor/responder resources PRM does not support setting these values so avoid setting them. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:41 -04:00
Eli Cohen	ad1891062a	net/mlx5_core: Allocate firmware pages from device's NUMA node Allocate firmware pages from the NUMA node which is close to the device. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:33:40 -04:00
Muhammad Mahajna	78500b8c03	net/mlx4_en: Add RX-ALL support Enabled when the device supports KEEP FCS and IGNORE FCS. When the flag is set, pass all received frames up the stack, even ones with invalid FCS, controlled by ethtool. Signed-off-by: Muhammad Mahajna <muhammadm@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:04 -04:00
Muhammad Mahajna	f0df35037a	net/mlx4_en: Add RX-FCS support Enabled when device supports KEEP FCS. When the flag is set, Ethernet FCS is appended to the end of the frame, controlled by ethtool. Signed-off-by: Muhammad Mahajna <muhammadm@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:04 -04:00
Ido Shamay	51af33cfed	net/mlx4_en: Add interface identify support Add support for the interface ethtool identify feature. Make the physical port LED to blink with green and yellow colors. The device handles the LED blink by itself (synchrous use of set_phys_id), by returning 0 to ETHTOOL_ID_ACTIVE command. Signed-off-by: Eyal Grossman <eyalgr@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:03 -04:00
Ido Shamay	a130b59057	net/mlx4: Add SET_PORT opcode modifiers enumeration The calls to SET_PORT used hard-code numbers, when supplying command's opcode modifiers, fix that to use well defined constants. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:03 -04:00
Ido Shamay	38438f7c7e	net/mlx4: Set enhanced QoS support by default when ETS supported If HCA supports ETS QoS feature, set enhanced QoS bit in init_hca as default. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:03 -04:00
Ido Shamay	3742cc6551	net/mlx4: Warn users of depracated QoS Firmware A new capability bit was introduced in the past to to differ devices using the QoS ETS feature. The old was deprecated since then. If driver sees device which set only the old capabilty, it will print warning to user suggesting to upgrade the FW. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:03 -04:00
Ido Shamay	cda373f484	net/mlx4_en: Enable TX rate limit per VF Support granular QoS per VF, by implementing the ndo_set_vf_rate. Enforce a rate limit per VF when called, and enabled only for VFs in VST mode with user priority supported by the device. We don't enforce VFs to be in VST mode at the moment of configuration, but rather save the given rate limit and enforce it when the VF is moved to VST with user priority which is supported (currently 0). VST<->VGT or VST qos value state changes are disallowed when a rate limit is configured. Minimum BW share is not supported yet. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:03 -04:00
Ido Shamay	08068cd568	net/mlx4: Added qos_vport QP configuration in VST mode Granular QoS per VF feature introduce a new QP field, qos_vport. PF administrator can connect VF QPs to a certain QoS Vport, to inherit its proporties. Connecting QPs to the default QoS Vport (defined as 0) is always allowed, even when there are no allocated VPPs. At this point, only the default vport is connected to QPs. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:03 -04:00
Ido Shamay	666672d480	net/mlx4: Allocate VPPs for each port on PF init Initialization of granular Qos per VF mechanism. Query the port availible VPPs and allocates those on all supported priorities in an equal share. Allocation is done only in SRIOV mode, when the feature is supported by the device and port type is Ethernet. Allocation currently is done only on the default priority 0. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:02 -04:00
Ido Shamay	d019fcb224	net/mlx4: Query device for QoS per VF support Checks in QUERY_DEV_CAP if the granular QoS per VF feature is supported by the device. Disabled for guests. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:02 -04:00
Ido Shamay	1c29146d38	net/mlx4: Add mlx4_SET_VPORT_QOS implementation Add the SET_VPORT_QOS device command, which is ntended for virtual granular QoS configuration per VF in SRIOV mode. The SET_VPORT_QOS command sets and queries QoS parameters of a VPort. Each priority allowed for a VPort is assigned with a share of the BW, and a BW limitation. QoS parameters can be modified at any time, but must be initialized before any QP is associated with the VPort. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:02 -04:00
Ido Shamay	7e95bb99a8	net/mlx4: Add mlx4_ALLOCATE_VPP implementation Implements device ALLOCATE_VPP command, to be used for granular QoS configuration of VFs by the PF device. Defines and queries the amount of VPPs assigned to each port, and the amount of VPPs assigned to each priority of each port. Once the total VPPs are split between the priorities of a port, they may be assigned with a share of the BW or a rate limit. Split into two functions (get/set) whoch are supplied with mlx4_alloc_vpp_context and physical port number. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:02 -04:00
Ido Shamay	12a889c057	net/mlx4: New file for QoS related firmware commands Create two new files fw_qos.h and fw_qos.c in mlx4_core module. It gathers all relevant QoS firmware related commands etc, thus improving encapsulation of the mlx4_core module. For now it contains the QoS existing commands: mlx4_SET_PORT_SCHEDULER and mlx4_SET_PORT_PRIO2TC. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:02 -04:00
Ido Shamay	4abccb6157	net/mlx4: Aesthetic code changes in multi_func_init Previous vf_oper and vf_admin code created very long lines, making it hard to read the code. Added relevant in-struct pointers to reduce code complexity and avoid code lines spread over 80 lines. Same logic is preserved. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:25:01 -04:00
Ido Shamay	fccea6436a	net/mlx4: Make mlx4_is_eth visible inline funcion Currently implemented as static function in resource_tracker.c -- this change will allow other files in mlx4_core to use it as well. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:24:51 -04:00
Ido Shamay	241a08c3a7	net/mlx4_en: Change loopback only upon feature change Currently any change of netdev features results in a call to mlx4_en_update_loopback_state(). Those calls are unnecessary, and should be called only upon loopback feature change. Also moved some of the logic into mlx4_en_update_loopback_state(). Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:24:51 -04:00
Ido Shamay	802f42a8d9	net/mlx4: Add RSS support for fragmented IP datagrams Enable RSS support for fragmented IP packets, when device supports it. Until now, fragmented IP packets were directed only to the default_qpn. Since IP fragments (datagram) have no upper protocols (L3 IP packets), hash is performed on 3-tuple - dst MAC, source IP and dest IP. The HW makes sure that this holds for the 1st fragment too, so all fragments go to the same QP. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:24:50 -04:00
David S. Miller	9f0d34bc34	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/asix_common.c drivers/net/usb/sr9800.c drivers/net/usb/usbnet.c include/linux/usb/usbnet.h net/ipv4/tcp_ipv4.c net/ipv6/tcp_ipv6.c The TCP conflicts were overlapping changes. In 'net' we added a READ_ONCE() to the socket cached RX route read, whilst in 'net-next' Eric Dumazet touched the surrounding code dealing with how mini sockets are handled. With USB, it's a case of the same bug fix first going into net-next and then I cherry picked it back into net. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-02 16:16:53 -04:00
Richard Cochran	f75419c81e	ptp: mlx4: use helpers for converting ns to timespec. This patch changes the driver to use ns_to_timespec64() instead of open coding the same logic. Compile tested only. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 17:19:19 -04:00
Eran Ben Elisha	a3333b35da	net/mlx4_en: Moderate ethtool callback to show more statistics More packet statistics are now calculated and visible to the user via ethtool: - RX packet errors statistics. - TX/RX drops are now calculated. - TX multicast and broadcast statistics. - RX/TX per priority bytes statistics. - RX/TX no vlan packets and bytes statistics. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:36:51 -04:00
Matan Barak	0b131561a7	net/mlx4_en: Add Flow control statistics display via ethtool Flow control per priority and Global pause counters are now visible via ethtool. The counters shows statistics regarding pauses in the device. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:36:51 -04:00
Eran Ben Elisha	3da8a36cc5	net/mlx4_en: Protect access to the statistics bitmap This will allow parallel access to the statistics bitmap. A pre-step for adding PFC counters, where the statistics bitmap can be dynamically changed when modifying the PFC setting. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:36:50 -04:00
Eran Ben Elisha	6fcd27354b	net/mlx4_en: Support general selective view of ethtool statistics The driver uses a bitmask to indicate which statistics should be displayed to the user in ethtool. The bitmask is u64, therefore we are limited for a selective view of up to 64 statistics. Extend the bitmap in order to show more than 64 statistics. In addition, add packet statistics to the ethtool display for PF. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:36:50 -04:00
Eran Ben Elisha	ffa88f37ff	net/mlx4_en: Move statistics bitmap setting to the Ethernet driver The statistics bitmap belongs to the Ethernet driver, move it there. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:36:50 -04:00
Eran Ben Elisha	b4b6e842fc	net/mlx4_en: Create new header file for all statistics info Add mlx4_stats.h file and move there all statistics structs and marcos. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:36:50 -04:00
Eran Ben Elisha	66f24a7ebf	net/mlx4_en: Fix port counters statistics bitmask Two counters (rx_chksum_complete and tx_chksum_offload) are not displayed under SRIOV for the PF via ethtool because their bit mask is off, fix that. Fixes: `f8c6455bb` ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE') Fixes: `9fab426de` ('mlx4: add a new xmit_more counter') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 16:36:50 -04:00
Richard Cochran	e394b80545	ptp: mlx4: convert to the 64 bit get/set time methods. This driver's clock is implemented using a timecounter, and so with this patch the driver is ready for the year 2038. Compile tested only. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-31 12:01:18 -04:00
Toshiaki Makita	8cb65d0008	net: Move check for multiple vlans to drivers To allow drivers to handle the features check for multiple tags, move the check to ndo_features_check(). As no drivers currently handle multiple tagged TSO, introduce dflt_features_check() and call it if the driver does not have ndo_features_check(). Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 13:33:22 -07:00
Jack Morgenstein	bffb023ad2	net/mlx4_core: Fix GEN_EQE accessing uninitialixed mutex We occasionally see in procedure mlx4_GEN_EQE that the driver tries to grab an uninitialized mutex. This can occur in only one of two ways: 1. We are trying to generate an async event on an uninitialized slave. 2. We are trying to generate an async event on an illegal slave number ( < 0 or > persist->num_vfs) or an inactive slave. To deal with #1: move the mutex initialization from specific slave init sequence in procedure mlx_master_do_cmd to mlx4_multi_func_init() (so that the mutex is always initialized for all slaves). To deal with #2: check in procedure mlx4_GEN_EQE that the slave number provided is in the proper range and that the slave is active. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:22:52 -04:00
Ido Shamay	e5eda89d97	net/mlx4_en: Call register_netdevice in the proper location Netdevice registration should be performed a the end of the driver initialization flow. If we don't do that, after calling register_netdevice, device callbacks may be issued by higher layers of the stack before final configuration of the device is done. For example (VXLAN configuration race), mlx4_SET_PORT_VXLAN was issued after the register_netdev command. System network scripts may configure the interface (UP) right after the registration, which also attach unicast VXLAN steering rule, before mlx4_SET_PORT_VXLAN was called, causing the firmware to fail the rule attachment. Fixes: `837052d0cc` ("net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling") Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:22:52 -04:00
David S. Miller	0fa74a4be4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/emulex/benet/be_main.c net/core/sysctl_net_core.c net/ipv4/inet_diag.c The be_main.c conflict resolution was really tricky. The conflict hunks generated by GIT were very unhelpful, to say the least. It split functions in half and moved them around, when the real actual conflict only existed solely inside of one function, that being be_map_pci_bars(). So instead, to resolve this, I checked out be_main.c from the top of net-next, then I applied the be_main.c changes from 'net' since the last time I merged. And this worked beautifully. The inet_diag.c and sysctl_net_core.c conflicts were simple overlapping changes, and were easily to resolve. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-20 18:51:09 -04:00
Wu Fengguang	de1cf8a7d7	net/mlx4_en: mlx4_en_set_tx_maxrate() can be static Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 23:21:41 -04:00
Eran Ben Elisha	39de961a4a	net/mlx4_en: Set statistics bitmap at port init Port statistics bitmap will now be initialized at port init. Even before starting the port, statistics are visible to the user and must be properly masked. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 15:17:11 -04:00
Eran Ben Elisha	a16f356570	net/mlx4_en: Fix off-by-one in ethtool statistics display NUM_PORT_STATS was 9 instead of 10, which caused off-by-one bug when displaying the statistics starting from tx_chksum_offload in ethtool. Fixes: `f8c6455bb0` ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 15:17:11 -04:00
Or Gerlitz	c10e4fc6c4	net/mlx4_en: Add tx queue maxrate support Add ndo_set_tx_maxrate support. To support per tx queue maxrate limit, we use the update-qp firmware command to do run-time rate setting for the qp that serves this tx ring. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 14:55:19 -04:00
Or Gerlitz	fc31e2560a	net/mlx4_core: Add basic support for QP max-rate limiting Add the low-level device commands and definitions used for QP max-rate limiting. This is done through the following elements: - read rate-limit device caps in QUERY_DEV_CAP: number of different rates and the min/max rates in Kbs/Mbs/Gbs units - enhance the QP context struct to contain rate limit units and value - allow to do run time rate-limit setting to QPs through the update-qp firmware command - QP rate-limiting is disallowed for VFs Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-18 14:55:19 -04:00
Julia Lawall	6b9f53bc10	net/mlx5_core: don't export static symbol The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ type T; identifier f; @@ static T f (...) { ... } @@ identifier r.f; declarer name EXPORT_SYMBOL; @@ -EXPORT_SYMBOL(f); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 00:03:34 -04:00
Joe Perches	dbedd44e98	ethernet: codespell comment spelling fixes To test a checkpatch spelling patch, I ran codespell against drivers/net/ethernet/. $ git ls-files drivers/net/ethernet/ \| \ while read file ; do \ codespell -w $file; \ done I removed a false positive in e1000_hw.h Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-08 22:54:22 -04:00
Shani Michaeli	708b869bf5	net/mlx4_en: Add QCN parameters and statistics handling Implement the IEEE DCB handlers for set/get QCN parameters and statistics reading per TC. Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 21:50:02 -05:00
Shani Michaeli	d237baa1cb	net/mlx4_core: Add basic elements for QCN Add device capability, firmware command opcode and etc prior elements needed for QCN suppprt. Disable SRIOV VF view/access for QCN is disabled. While here, remove a redundant offset definition into the QUERY_DEV_CAP mailbox. Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-06 21:50:02 -05:00
David S. Miller	71a83a6db6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/rocker/rocker.c The rocker commit was two overlapping changes, one to rename the ->vport member to ->pport, and another making the bitmask expression use '1ULL' instead of plain '1'. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 21:16:48 -05:00
Joe Perches	c7bf716940	ethernet: Use eth_<foo>_addr instead of memset Use the built-in function instead of memset. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-03 17:01:36 -05:00
Ido Shamay	1037ebbbd2	net/mlx4_en: Disbale GRO for incoming loopback/selftest packets Packets which are sent from the selftest (ethtool) flow, should not be passed to GRO stack but rather dropped by the driver after validation. To achieve that, we disable GRO for the duration of the selftest. Fixes: `dd65beac48` ("net/mlx4_en: Extend usage of napi_gro_frags") Reported-by: Carol Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 15:27:19 -05:00
Or Gerlitz	f5956fafb0	net/mlx4_core: Fix wrong mask and error flow for the update-qp command The bit mask for currently supported driver features (MLX4_UPDATE_QP_SUPPORTED_ATTRS) of the update-qp command was defined twice (using enum value and pre-processor define directive) and wrong. The return value of the call to mlx4_update_qp() from within the SRIOV resource-tracker was wrongly voided down. Fix both issues. issue: none Fixes: `09e05c3f78` ('net/mlx4: Set vlan stripping policy by the right command') Fixes: `ce8d9e0d67` ('net/mlx4_core: Add UPDATE_QP SRIOV wrapper support') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-02 15:27:19 -05:00
Eli Cohen	de61390cb3	net/mlx5_core: Fix configuration of log_uar_page_sz The current code failed to configure the page size for architectures with page size different than 4K - PPC for example. Signed-off-by: Carol L Soto <clsoto@us.ibm.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-11 19:42:23 -08:00
Markus Elfring	1d966d03a3	net: Mellanox: Delete unnecessary checks before the function call "vunmap" The vunmap() function performs also input parameter validation. Thus the test around the call is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-09 14:10:05 -08:00
David S. Miller	6e03f896b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/vxlan.c drivers/vhost/net.c include/linux/if_vlan.h net/core/dev.c The net/core/dev.c conflict was the overlap of one commit marking an existing function static whilst another was adding a new function. In the include/linux/if_vlan.h case, the type used for a local variable was changed in 'net', whereas the function got rewritten to fix a stacked vlan bug in 'net-next'. In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next' overlapped with an endainness fix for VHOST 1.0 in 'net'. In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter in 'net-next' whereas in 'net' there was a bug fix to pass in the correct network namespace pointer in calls to this function. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 14:33:28 -08:00
Ido Shamay	cfb53f36a5	net/mlx4_en: Notify TX Vlan offload change Notify users when TX vlan offload feature changed with ethtool. Relevant command - ethtool -K <eth> txvlan on/off. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:17:46 -08:00
Ido Shamay	e8e7f018f1	net/mlx4_en: Adjust RX frag strides to frag sizes This patch improves memory utilization and therefore the packets rate for special MTU's. Instead of setting the frag_stride to the maximal hard coded frag_size, use the actual frag_size that is set according to the MTU, when setting the stride of the last frag. So, for example, for MTU 1600, where the frag_size of the 2nd frag is 86, the frag_size is set to 128 instead of 4096. See below: Before: frag:0 - size:1536 prefix:0 stride:1536 frag:1 - size:86 prefix:1536 stride:4096 frag 0 allocator: - size:32768 frags:21 frag 1 allocator: - size:32768 frags:8 After: frag:0 - size:1536 prefix:0 stride:1536 frag:1 - size:86 prefix:1536 stride:128 frag 0 allocator: - size:32768 frags:21 frag 1 allocator: - size:32768 frags:256 Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:17:45 -08:00
Ido Shamay	b110d2ce49	net/mlx4_en: Print page allocator information After Initialization of page_alloc, print actual allocated page size and number of frags it contains. prints is done only when drv message level is set on the interface. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:17:45 -08:00
Or Gerlitz	1c755cc5be	net/mlx5_core: Move to use hex PCI device IDs Align the IDs in the code with the modinfo, lspci -n, etc tools outputs. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:17:45 -08:00
Or Gerlitz	0fab541ac2	net/mlx4_core: Fix misleading debug print on CQE stride support We do support cache line sizes of 32 and 64 bytes without activating the CQE stride feature. Fix a misleading print saying that these cache line sizes aren't supported. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:17:45 -08:00
Maor Gottlieb	6af0a52f65	net/mlx4: mlx4_config_dev_retrieval() - Initialize struct config_dev before using Add Initialization to struct config_dev before filling and using it. Fix to warning: warning: config_dev.rx_checksum_val may be used uninitialized in this function Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:17:45 -08:00
Maor Gottlieb	b332068ce4	net/mlx4_core: Fix mpt_entry initialization in mlx4_mr_rereg_mem_write() a) Previously, mlx4_mr_rereg_write filled the MPT's start and length with the old MPT's values. Fixing the initialization to take the new start and length. b) In addition access flags in mpt_status were initialized instead of status due to bad boolean operation. Fixing the operation. c) Initialization of pd_slave caused a protection error. Fix - removing this initialization. d) In resource_tracker.c: Fixing vf encoding to be one-based. Fixes: `e630664c` ('mlx4_core: Add helper functions to support MR re-registration') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:17:45 -08:00
Moni Shoua	5da0354726	net/mlx4_en: Port aggregation configuration Capture NETDEV events generated by the bonding driver and based on that make decisions of how to configure port aggregation in the mlx4 core driver. This includes setting the V2P port table and re-creating the interested interfaces in bonded/non-bonded mode. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:25 -08:00
Moni Shoua	53f33ae295	net/mlx4_core: Port aggregation upper layer interface Supply interface functions to bond and unbond ports of a mlx4 internal interfaces. Example for such an interface is the one registered by the mlx4 IB driver under RoCE. There are 1. Functions to go in/out to/from bonded mode 2. Function to remap virtual ports to physical ports The bond_mutex prevents simultaneous access to data that keep status of the device in bonded mode. The upper mlx4 interface marks to the mlx4 core module that they want to be subject for such bonding by setting the MLX4_INTFF_BONDING flag. Interface which goes to/from bonded mode is re-created. The mlx4 Ethernet driver does not set this flag when registering the interface, the IB driver does. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:24 -08:00
Moni Shoua	59e14e3250	net/mlx4_core: Port aggregation low level interface Implement the hardware interface required for port aggregation. 1. Disable RX port check on receive - don't perform a validity check that matches to QP's port and the port where the packet is received. 2. Virtual to physical port remap - configure virtual to physical port mapping. Port remap capability for virtual functions. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-04 16:14:24 -08:00
Jack Morgenstein	5a2e87b168	net/mlx4_core: Fix kernel Oops (mem corruption) when working with more than 80 VFs Commit `de966c5928` (net/mlx4_core: Support more than 64 VFs) was meant to allow up to 126 VFs. However, due to leaving MLX4_MFUNC_MAX too low, using more than 80 VFs resulted in memory corruptions (and Oopses) when more than 80 VFs were requested. In addition, the number of slaves was left too high. This commit fixes these issues. Fixes: `de966c5928` ("net/mlx4_core: Support more than 64 VFs") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-02 19:38:04 -08:00
Majd Dibbiny	6d6e996c20	net/mlx4_core: Update the HCA core clock frequency after INIT_PORT The firmware might change the hca core clock frequency after the driver issues the INIT_PORT command. Therefore we need to query the new value again and save in to the cached dev caps. Fixes: `ddd8a6c1` ('net/mlx4_core: Read HCA frequency and map internal clock') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:58 -08:00
Or Gerlitz	cb2147a925	net/mlx4_core: Fix device capabilities dumping We are dumping device capabilities which are supported both by the firmware and the driver. Align the array that holds the capability strings with this practice. Reported-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:58 -08:00
Matan Barak	19ab574f62	net/mlx4: Fix memory corruption in mlx4_MAD_IFC_wrapper Fix a memory corruption at mlx4_MAD_IFC_wrapper. A table of size dev->caps.pkey_table_len[port]sizeof(table) was allocated, but get_full_pkey_table() assumes that the number of entries in the table is a multiplication of 32 (which isn't always correct). Fixes: `0a9a018` ('mlx4: MAD_IFC paravirtualization') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:58 -08:00
Saeed Mahameed	5a228c03d8	net/mlx4_en: Use ethtool cmd->autoneg as a hint for ethtool set settings Use cmd->autoneg as a user hint to decide what to set in ethtool set settings callback. When cmd->autoneg == AUTONEG_ENABLE set according to ethtool->advertise otherwise, set according to ethtool->speed. Usage: - ethtool -s eth<x> speed 56000 autoneg off - ethtool -s eth<x> advertise 0x800000 autoneg on While we're here: - Move proto_admin masking outcome check to be adjacent to the operation. - Move en_warn("port reset..") print to "port reset" block. Fixes: `312df74` ("net/mlx4_en: mlx4_en_set_settings() always fails when autoneg is set") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:58 -08:00
Jack Morgenstein	ef0223e693	net/mlx4_core: Remove duplicate code line from procedure mlx4_bf_alloc mlx4_bf_alloc had an unnecessary/duplicate code line. Did no harm, but not good practice. Reported by the Mellanox Beijing team. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:57 -08:00
Jack Morgenstein	dc7d500451	net/mlx4_core: Fix struct mlx4_vhcr_cmd to make implicit padding explicit Struct mlx4_vhcr was implicitly padded by the gcc compiler on 64-bit architectures. This commit makes that padding explicit, to prevent issues with changing compilers and with incompatibilities between 32-bit architecture implicit padding and 64-bit architecture implicit padding. This structure is used in virtualization for communication between the Host and its Guests. The explicit padding allows 64-bit Hosts (old and new) to continue to interoperate with 64-bit Guests (old and new). However, without this fix, 64-bit Hosts could not interoperate with 32-bit Guests (since these did not insert the padding dword). With this fix, 32-bit Guests will be able to interoperate with 64-bit Hosts (since the structure offsets will be identical on both). Reported-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:57 -08:00
Jack Morgenstein	30a5da5b33	net/mlx4_core: Fix HW2SW_EQ to conform to the firmware spec The driver incorrectly assigned an out-mailbox to this command, and used an opcode modifier = 0, which is a reserved value (it should use opcode modifier = 1). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:57 -08:00
Jack Morgenstein	5a03108689	net/mlx4_core: Adjust command timeouts to conform to the firmware spec The firmware spec states that the timeout for all commands should be 60 seconds. In the past, the spec indicated that there were several classes of timeout (short, medium, and long). The driver has these different timeout classes. We leave the class differentiation in the driver as-is (to protect against any future spec changes), but set the timeout for all classes to be 60 seconds. In addition, we fix a few commands which had hard-coded numeric timeouts specified. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:57 -08:00
Jack Morgenstein	772103e6b1	net/mlx4_core: Fix mem leak in SRIOV mlx4_init_one error flow Structs allocated for the resource tracker must be freed in the error flow. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:57 -08:00
Jack Morgenstein	f0ce061508	net/mlx4_core: Add reserved lkey for VFs to QUERY_FUNC_CAP The reserved lKey is different for each VF. A base lkey value is returned in QUERY_DEV_CAP at offset 0x98. The reserved L_key value for a VF is: VF_lkey = base_lkey + (VF_number << 8). This VF L_key value should be returned in QUERY_FUNC_CAP (opcode-modifier = 0) at offset 0x48. To indicate that the lkey value at offset 0x48 is valid, the Hypervisor sets a flag bit in dword 0x0, offset 27 in the QUERY_FUNC_CAP wrapper function. When the VF calls QUERY_FUNC_CAP, it should check if this flag bit is set. If it is set, the VF should take the reserved lkey value at offset 0x48. If the bit is not set, the VF should not use a reserved lkey (i.e., should set its reserved lkey value to 0). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:57 -08:00
Jack Morgenstein	be6a6b43b5	net/mlx4_core: Add bad-cable event support If the firmware can detect a bad cable, allow it to generate an event, and print the problem in the log. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 17:12:57 -08:00
David S. Miller	95f873f2ff	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: arch/arm/boot/dts/imx6sx-sdb.dts net/sched/cls_bpf.c Two simple sets of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 16:59:56 -08:00
Yishai Hadas	0cd9302734	net/mlx4_core: Reset flow activation upon SRIOV fatal command cases When SRIOV commands are executed over the comm-channel and get a fatal error (e.g. timeout, closing command failure) the VF enters into error state and reset flow is activated. To be able to recognize whether the failure was on a closing command, the operational code for the given VHCR command is used. Once the device entered into an error state we prevent redundant error messages from being printed. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:15 -08:00
Yishai Hadas	55ad359225	net/mlx4_core: Enable device recovery flow with SRIOV In SRIOV, both the PF and the VF may attempt device recovery whenever they assume that the device is not functioning. When the PF driver resets the device, the VF should detect this and attempt to reinitialize itself. The VF must be able to reset itself under all circumstances, even if the PF is not responsive. The VF shall reset itself in the following cases: 1. Commands are not processed within reasonable time over the communication channel. This is done considering device state and the correct return code based on the command as was done in the native mode, done in the next patch. 2. The VF driver receives an internal error event reported by the PF on the communication channel. This occurs when the PF driver resets the device or when VF is out of sync with the PF. Add 'VF reset' capability, which allows the VF to reinitialize itself even when the PF is not responsive. As PF and VF may run their reset flow simulantanisly, there are several cases that are handled: - Prevent freeing VF resources upon FLR, when PF is in its unloading stage. - Prevent PF getting VF commands before it has finished initializing its resources. - Upon VF startup, check that comm-channel is online before sending commands to the PF and getting timed-out. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:14 -08:00
Yishai Hadas	2ba5fbd62b	net/mlx4_core: Handle AER flow properly Fix AER callbacks to work properly, it includes: - Refractoring AER to be aligned with Reset flow support. - Sync with concurrent catas flow. In addition, fix the shutdown PCI callback to sync with concurrent catas flow. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:14 -08:00
Yishai Hadas	c69453e294	net/mlx4_core: Manage interface state for Reset flow cases We need to manage interface state to sync between reset flow and some other relative cases such as remove_one. This has to be done to prevent certain races. For example in case software stack is down as a result of unload call, the remove_one should skip the unload phase. Implement the remove_one case, handling AER and other cases comes next. The interface can be up/down, upon remove_one, the state will include an extra bit indicating that the device is cleaned-up, forcing other tasks to finish before the final cleanup. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:14 -08:00
Yishai Hadas	f5aef5aa35	net/mlx4_core: Activate reset flow upon fatal command cases We activate reset flow upon command fatal errors, when the device enters an erroneous state, and must be reset. The cases below are assumed to be fatal: FW command timed-out, an error from FW on closing commands, pci is offline when posting/pending a command. In those cases we place the device into an error state: chip is reset, pending commands are awakened and completed immediately. Subsequent commands will return immediately. The return code in the above cases will depend on the command. Commands which free and close resources will return success (because the chip was reset, so callers may safely free their kernel resources). Other commands will return -EIO. Since the device's state was marked as error, the catas poller will detect this and restart the device's software stack (as is done when a FW internal error is directly detected). The device state is protected by a persistent mutex lives on its mlx4_dev, as such no need any more for the hcr_mutex which is removed. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:14 -08:00
Yishai Hadas	f6bc11e426	net/mlx4_core: Enhance the catas flow to support device reset This includes: - resetting the chip when a fatal error is detected (the current code does not do this). - exposing the ability to enter error state from outside the catas code by calling its functionality. (E.g. FW Command timeout, AER error). - managing a persistent device state. This is needed to sync between reset flow cases. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:14 -08:00
Yishai Hadas	ad9a0bf08f	net/mlx4_core: Refactor the catas flow to work per device Using a WQ per device instead of a single global WQ, this allows independent reset handling per device even when SRIOV is used. This comes as a pre-patch for supporting chip reset for both native and SRIOV. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:14 -08:00
Yishai Hadas	dd0eefe3ab	net/mlx4_core: Set device configuration data to be persistent across reset When an HCA enters an internal error state, this is detected by the driver. The driver then should reset the HCA and restart the software stack. Keep ports information and some SRIOV configuration in a persistent area to have it valid across reset. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:13 -08:00
Yishai Hadas	872bf2fb69	net/mlx4_core: Maintain a persistent memory for mlx4 device Maintain a persistent memory that should survive reset flow/PCI error. This comes as a preparation for coming series to support above flows. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-25 14:43:13 -08:00
Or Gerlitz	5eff6dadb9	net/mlx4: Don't disable vxlan offloads under DMFS-A0 optimized steering Except for VXLAN steering rules, all offloads should work as they were under plain DMFS mode. Fix that by enabling all the offloads under DMFS-A0 mode, except for VXLAN steering rules. Fixes: `d57febe1a4` "net/mlx4: Add A0 hybrid steering" Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-15 19:35:30 -05:00
Jiri Pirko	df8a39defa	net: rename vlan_tx_* helpers since "tx" is misleading there The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:51:08 -05:00
Arnd Bergmann	065bd8c28b	mlx5: avoid build warnings on 32-bit The mlx5 driver passes a string pointer in through a 'u64' variable, which on 32-bit machines causes a build warning: drivers/net/ethernet/mellanox/mlx5/core/debugfs.c: In function 'qp_read_field': drivers/net/ethernet/mellanox/mlx5/core/debugfs.c:303:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] The code is in fact safe, so we can shut up the warning by adding extra type casts. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-13 17:08:20 -05:00
David S. Miller	44d84d7272	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2015-01-06 22:29:20 -05:00
Richard Cochran	d9f393734a	mlx4: include clocksource.h again This driver uses the function, clocksource_khz2mult, and so it really must include clocksource.h. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 16:47:36 -05:00
Jack Morgenstein	d0d012509f	net/mlx4_core: Fix error flow in mlx4_init_hca() We shouldn't call UNMAP_FA here, this is done in mlx4_load_one. If mlx4_query_func fails, we need to invoke CLOSE_HCA for both native and master. Fixes: `a0eacca948` ('net/mlx4_core: Refactor mlx4_load_one') Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:41:29 -05:00
Maor Gottlieb	a51e0df4c1	net/mlx4_core: Correcly update the mtt's offset in the MR re-reg flow Previously, mlx4_mt_rereg_write filled the MPT's entity_size with the old MTT's page shift, which could result in using an incorrect offset. Fix the initialization to be after we calculate the new MTT offset. In addition, assign mtt order to -1 after calling mlx4_mtt_cleanup. This is necessary in order to mark the MTT as invalid and avoid freeing it later. Fixes: `e630664` ('mlx4_core: Add helper functions to support MR re-registration') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-02 15:41:28 -05:00
Richard Cochran	2eebdde652	timecounter: keep track of accumulated fractional nanoseconds The current timecounter implementation will drop a variable amount of resolution, depending on the magnitude of the time delta. In other words, reading the clock too often or too close to a time stamp conversion will introduce errors into the time values. This patch fixes the issue by introducing a fractional nanosecond field that accumulates the low order bits. Reported-by: Janusz Użycki <j.uzycki@elproma.com.pl> Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-30 18:29:27 -05:00
Richard Cochran	ce51ff0937	net: mlx4: convert to timecounter adjtime. This patch changes the driver to use the new and improved method for adjusting the offset of a timecounter. Compile tested only. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-30 18:29:27 -05:00
Jesse Gross	5f35227ea3	net: Generalize ndo_gso_check to ndo_features_check GSO isn't the only offload feature with restrictions that potentially can't be expressed with the current features mechanism. Checksum is another although it's a general issue that could in theory apply to anything. Even if it may be possible to implement these restrictions in other ways, it can result in duplicate code or inefficient per-packet behavior. This generalizes ndo_gso_check so that drivers can remove any features that don't make sense for a given packet, similar to netif_skb_features(). It also converts existing driver restrictions to the new format, completing the work that was done to support tunnel protocols since the issues apply to checksums as well. By actually removing features from the set that are used to do offloading, it solves another problem with the existing interface. In these cases, GSO would run with the original set of features and not do anything because it appears that segmentation is not required. CC: Tom Herbert <therbert@google.com> CC: Joe Stringer <joestringer@nicira.com> CC: Eric Dumazet <edumazet@google.com> CC: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Tom Herbert <therbert@google.com> Fixes: `04ffcb255f` ("net: Add ndo_gso_check") Tested-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-26 17:20:56 -05:00
Amir Vadai	492f5add4b	net/mlx4_en: Doorbell is byteswapped in Little Endian archs iowrite32() will byteswap it's argument on big endian archs. iowrite32be() will byteswap on little endian archs. Since we don't want to do this unnecessary byteswap on the fast path, doorbell is stored in the NIC's native endianness. Using the right iowrite() according to the arch endianness. CC: Wei Yang <weiyang@linux.vnet.ibm.com> CC: David Laight <david.laight@aculab.com> Fixes: `6a4e812` ("net/mlx4_en: Avoid calling bswap in tx fast path") Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-22 16:33:10 -05:00
Linus Torvalds	4c929feed7	Main batch of InfiniBand/RDMA changes for 3.19: - On-demand paging support in core midlayer and mlx5 driver. This lets userspace create non-pinned memory regions and have the adapter HW trigger page faults. - iSER and IPoIB updates and fixes. - Low-level HW driver updates for cxgb4, mlx4 and ocrdma. - Other miscellaneous fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJUk2pBAAoJEENa44ZhAt0hL18P/jdCGbWVXOJh25KvjzmKIfUV T3Bdixz5h/Xj2iU6ShsXLSZa8vkPXsiO5v3MIQcR5MuPn88vrxemTy/OmBjefJeL qKGnWfy9O8KxMqhYZAokTTIyl5ygtSITbJsCE5W0KHgRBgBtexbrHeFBcWsT3AZ5 piGyRP4XWc2LtfjrFWdUUjRELz9m74L93uILy0P8lS58k3M8YIOvkjqVmGj5Ya3U /hadgk1HYWfxjw+z3v0keaP1IoqHpJferH+UyjCj8UsIB9swXabE8ap/SFrQPIpe p+Zwyi25292mavqEfm/neUmvn34xLF8c00XB6UKxr42Q9yd1mDxnO+ZxWpxW5klQ tKEZeySDbB/WplGrumCeNXPonFvdBpGOTguP3z5o0bcgj1UJ+yVk8KjNBlwSWhQw Mkh/Rb6gSJzeidB3pnQV3TKVkvcFr+Li6DgbG6a77f0W7ggQC2UaeTwEPY5FlMtK n2jQddhnXYsQXeOEpDcISbpAnCIx+qjDIRv7jYTajw0hg8A669ytcI/gi4b9qJeU l3epZDszbCkRwPACzOXCRfeZRiz1H6/USI+Vn/yIQZBlHEd7TcK6ph+KDO/btX+D PWKrirIgzorJsIsDyD4WBXHfJnNS1Imfoxl5s7/8kkwrIkwY+lGpU0zM1bu7cS8W c32iGI9+dgHSPTZt3RdL =MCO9 -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband updates from Roland Dreier: "Main batch of InfiniBand/RDMA changes for 3.19: - On-demand paging support in core midlayer and mlx5 driver. This lets userspace create non-pinned memory regions and have the adapter HW trigger page faults. - iSER and IPoIB updates and fixes. - Low-level HW driver updates for cxgb4, mlx4 and ocrdma. - Other miscellaneous fixes" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (56 commits) IB/mlx5: Implement on demand paging by adding support for MMU notifiers IB/mlx5: Add support for RDMA read/write responder page faults IB/mlx5: Handle page faults IB/mlx5: Page faults handling infrastructure IB/mlx5: Add mlx5_ib_update_mtt to update page tables after creation IB/mlx5: Changes in memory region creation to support on-demand paging IB/mlx5: Implement the ODP capability query verb mlx5_core: Add support for page faults events and low level handling mlx5_core: Re-add MLX5_DEV_CAP_FLAG_ON_DMND_PG flag IB/srp: Allow newline separator for connection string IB/core: Implement support for MMU notifiers regarding on demand paging regions IB/core: Add support for on demand paging regions IB/core: Add flags for on demand paging support IB/core: Add support for extended query device caps IB/mlx5: Add function to read WQE from user-space IB/core: Add umem function to read data from user-space IB/core: Replace ib_umem's offset field with a full address IB/mlx5: Enhance UMR support to allow partial page table update IB/mlx5: Remove per-MR pas and dma pointers RDMA/ocrdma: Always resolve destination mac from GRH for UD QPs ...	2014-12-18 20:10:44 -08:00
Ido Shamay	c3f2511fea	net/mlx4: Cache line CQE/EQE stride fixes This commit contains 2 fixes for the 128B CQE/EQE stride feaure. Wei found that mlx4_QUERY_HCA function marked the wrong capability in flags (64B CQE/EQE), when CQE/EQE stride feature was enabled. Also added small fix in initial CQE ownership bit assignment, when CQE is size is not default 32B. Fixes: `77507aa24` (net/mlx4: Enable CQE/EQE stride support) Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-16 15:23:53 -05:00
Roland Dreier	a7cfef21e3	Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'mlx4', 'ocrdma', 'odp' and 'srp' into for-next	2014-12-15 18:19:20 -08:00
Haggai Eran	e420f0c0f3	mlx5_core: Add support for page faults events and low level handling * Add a handler function pointer in the mlx5_core_qp struct for page fault events. Handle page fault events by calling the handler function, if not NULL. * Add on-demand paging capability query command. * Export command for resuming QPs after page faults. * Add various constants related to paging support. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:18:59 -08:00
Yuval Shaia	0b9976577c	mlx4_core: Check for DPDP violation only when DPDP is not supported Move check for DPDP out of the loop to make the code more readable. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-12-15 18:12:29 -08:00
Or Gerlitz	c78e25edbf	net/mlx4_core: Avoid double dumping of the PF device capabilities To support asymmetric EQ allocations, we should query the device capabilities prior to enabling SRIOV. As a side effect of adding that, we are dumping the PF device capabilities twice. Avoid that by moving the printing into a helper function which is called once. Fixes: `7ae0e400cd` ('net/mlx4_core: Flexible (asymmetric) allocation of EQs and MSI-X vectors for PF/VFs') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-15 11:34:54 -05:00
Matan Barak	da315679e8	net/mlx4_core: Fixed memory leak and incorrect refcount in mlx4_load_one The current mlx4_load_one has a memory leak as it always allocates dev_cap, but frees it only on error. In addition, even if VFs exist when mlx4_load_one is called, we still need to notify probed VFs that we're loading (by incrementing pf_loading). Fixes: `a0eacca948` ('net/mlx4_core: Refactor mlx4_load_one') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-15 11:34:53 -05:00
Matan Barak	7d077cd34e	net/mlx4: Add support for A0 steering Add the required firmware commands for A0 steering and a way to enable that. The firmware support focuses on INIT_HCA, QUERY_HCA, QUERY_PORT, QUERY_DEV_CAP and QUERY_FUNC_CAP commands. Those commands are used to configure and query the device. The different A0 DMFS (steering) modes are: Static - optimized performance, but flow steering rules are limited. This mode should be choosed explicitly by the user in order to be used. Dynamic - this mode should be explicitly choosed by the user. In this mode, the FW works in optimized steering mode as long as it can and afterwards automatically drops to classic (full) DMFS. Disable - this mode should be explicitly choosed by the user. The user instructs the system not to use optimized steering, even if the FW supports Dynamic A0 DMFS (and thus will be able to use optimized steering in Default A0 DMFS mode). Default - this mode is implicitly choosed. In this mode, if the FW supports Dynamic A0 DMFS, it'll work in this mode. Otherwise, it'll work at Disable A0 DMFS mode. Under SRIOV configuration, when the A0 steering mode is enabled, older guest VF drivers who aren't using the RX QP allocation flag (MLX4_RESERVE_A0_QP) will get a QP from the general range and fail when attempting to register a steering rule. To avoid that, the PF context behaviour is changed once on A0 static mode, to require support for the allocation flag in VF drivers too. In order to enable A0 steering, we use log_num_mgm_entry_size param. If the value of the parameter is not positive, we treat the absolute value of log_num_mgm_entry_size as a bit field. Setting bit 2 of this bit field enables static A0 steering. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:36 -05:00
Matan Barak	431df8c7e9	net/mlx4: Refactor QUERY_PORT Currently QUERY_PORT is done as a part of QUERY_DEV_CAP firmware command. Since we would like to use it without querying all device capabilities, extract this part to be a function of its own. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:35 -05:00
Matan Barak	579d059bd2	net/mlx4_core: Add explicit error message when rule doesn't meet configuration When a given flow steering rule is invalid in respect to the current steering configuration, print the correct error message to the system log. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:35 -05:00
Matan Barak	d57febe1a4	net/mlx4: Add A0 hybrid steering A0 hybrid steering is a form of high performance flow steering. By using this mode, mlx4 cards use a fast limited table based steering, in order to enable fast steering of unicast packets to a QP. In order to implement A0 hybrid steering we allocate resources from different zones: (1) General range (2) Special MAC-assigned QPs [RSS, Raw-Ethernet] each has its own region. When we create a rss QP or a raw ethernet (A0 steerable and BF ready) QP, we try hard to allocate the QP from range (2). Otherwise, we try hard not to allocate from this range. However, when the system is pushed to its limits and one needs every resource, the allocator uses every region it can. Meaning, when we run out of raw-eth qps, the allocator allocates from the general range (and the special-A0 area is no longer active). If we run out of RSS qps, the mechanism tries to allocate from the raw-eth QP zone. If that is also exhausted, the allocator will allocate from the general range (and the A0 region is no longer active). Note that if a raw-eth qp is allocated from the general range, it attempts to allocate the range such that bits 6 and 7 (blueflame bits) in the QP number are not set. When the feature is used in SRIOV, the VF has to notify the PF what kind of QP attributes it needs. In order to do that, along with the "Eth QP blueflame" bit, we reserve a new "A0 steerable QP". According to the combination of these bits, the PF tries to allocate a suitable QP. In order to maintain backward compatibility (with older PFs), the PF notifies which QP attributes it supports via QUERY_FUNC_CAP command. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:35 -05:00
Matan Barak	7a89399ffa	net/mlx4: Add mlx4_bitmap zone allocator The zone allocator is a mechanism which manages a few mlx4_bitmaps. When allocating a resource, the user indicates the desired zone of which this resource will be allocated from. If possible, the resource will be allocated from this zone. Otherwise, the resource will be allocated from a less-than, equal-to, higher-than priority zone, according to the desired zone's properties with that respective allocation order. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:35 -05:00
Dotan Barak	ab256e5ad0	net/mlx4: Add a check if there are too many reserved QPs The number of reserved QPs is affected both from the firmware and from the driver's requirements. This patch adds a check that validates that this number is indeed feasable. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:35 -05:00
Eugenia Emantayev	ddae0349fd	net/mlx4: Change QP allocation scheme When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset. The current Ethernet driver code reserves a Tx QP range with 256b alignment. This is wrong because if there are more than 64 Tx QPs in use, QPNs >= base + 65 will have bits 6/7 set. This problem is not specific for the Ethernet driver, any entity that tries to reserve more than 64 BF-enabled QPs should fail. Also, using ranges is not necessary here and is wasteful. The new mechanism introduced here will support reservation for "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs (when hypervisors support WC in VMs). The flow we use is: 1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation, and request "BF enabled QPs" if BF is supported for the function 2. In the ALLOC_RES FW command, change param1 to: a. param1[23:0] - number of QPs b. param1[31-24] - flags controlling QPs reservation Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits 6 and 7 unset in order to be used in Ethernet. Bits 24-30 of the flags are currently reserved. When a function tries to allocate a QP, it states the required attributes for this QP. Those attributes are considered "best-effort". If an attribute, such as Ethernet BF enabled QP, is a must-have attribute, the function has to check that attribute is supported before trying to do the allocation. In a lower layer of the code, mlx4_qp_reserve_range masks out the bits which are unsupported. If SRIOV is used, the PF validates those attributes and masks out unsupported attributes as well. In order to notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's mailbox is filled by the PF, which notifies which QP allocation attributes it supports. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:35 -05:00
Matan Barak	3dca0f42c7	net/mlx4_core: Use tasklet for user-space CQ completion events Previously, we've fired all our completion callbacks straight from our ISR. Some of those callbacks were lightweight (for example, mlx4_en's and IPoIB napi callbacks), but some of them did more work (for example, the user-space RDMA stack uverbs' completion handler). Besides that, doing more than the minimal work in ISR is generally considered wrong, it could even lead to a hard lockup of the system. Since when a lot of completion events are generated by the hardware, the loop over those events could be so long, that we'll get into a hard lockup by the system watchdog. In order to avoid that, add a new way of invoking completion events callbacks. In the interrupt itself, we add the CQs which receive completion event to a per-EQ list and schedule a tasklet. In the tasklet context we loop over all the CQs in the list and invoke the user callback. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:34 -05:00
Or Gerlitz	383677da43	net/mlx4_core: Mask out host side virtualization features for guests When VFs (guests in this context) issue the QUERY_DEV_CAP command, they need not be told that host side virtualization features such as VST, FSM (MAC anti-spoofing) and running > 80 VFs are supported by the device. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:34 -05:00
Or Gerlitz	c58942f252	net/mlx4_en: Set csum level for encapsulated packets This was dropped by mistake for the napi_gro_frags flow, fix that. Fixes: `dd65beac48` ('net/mlx4_en: Extend usage of napi_gro_frags') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-11 14:47:34 -05:00
Eyal Perry	947cbb0ac2	net/mlx4_en: Support for configurable RSS hash function The ConnectX HW is capable of using one of the following hash functions: Toeplitz and an XOR hash function. This patch extends the implementation of the mlx4_en driver set/get_rxfh callbacks to support getting and setting the RSS hash function used by the device. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 21:07:10 -05:00
Eyal Perry	892311f66f	ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 21:07:10 -05:00
Eli Cohen	28c167fa8f	net/mlx5_core: Add more supported devices Add ConnectX-4LX to the list of supported devices as well as their virtual functions. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:45:55 -05:00
Majd Dibbiny	6b60d5e221	net/mlx5_core: Clear outbox of dealloc uar The outbox should be cleared before executing the command. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:45:55 -05:00
Eli Cohen	ab62924ec2	net/mlx5_core: Print resource number on QP/SRQ async events Useful for debugging purposes. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:45:54 -05:00
Eli Cohen	2d446d18aa	net/mlx5_core: Fix command queue size enforcement Command queue descriptor page size is 4KB and not the page size used by the kernel. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:45:54 -05:00
Eli Cohen	3a9e161a59	net/mlx5_core: Fix min vectors value in mlx5_enable_msix mlx5 requires at least one interrupt vector for completions so fix the minvec argument to pci_enable_msix_range() accordingly. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:45:54 -05:00
Eli Cohen	f66f049fb7	net/mlx5_core: Request the mlx5 IB module on driver load Call request module on mlx5_ib so it will be available for applications requiring it, such as installers that require boot over IB. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-08 20:45:54 -05:00
Jiri Pirko	02637fce3e	net: rename netdev_phys_port_id to more generic name So this can be reused for identification of other "items" as well. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-12-02 20:01:19 -08:00
David S. Miller	60b7379dc5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-11-29 20:47:48 -08:00
Jack Morgenstein	2d5c57d7fb	net/mlx4_core: Limit count field to 24 bits in qp_alloc_res Some VF drivers use the upper byte of "param1" (the qp count field) in mlx4_qp_reserve_range() to pass flags which are used to optimize the range allocation. Under the current code, if any of these flags are set, the 32-bit count field yields a count greater than 2^24, which is out of range, and this VF fails. As these flags represent a "best-effort" allocation hint anyway, they may safely be ignored. Therefore, the PF driver may simply mask out the bits. Fixes: `c82e9aa0a8` "mlx4_core: resource tracking for HCA resources used by guests" Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-26 12:04:49 -05:00
Eric Dumazet	bd635c354d	mlx4: fix mlx4_en_set_rxfh() mlx4_en_set_rxfh() can crash if no RSS indir table is provided. While we are at it, allow RSS key to be changed with ethtool -X Tested: myhost:~# cat /proc/sys/net/core/netdev_rss_key b6:89:91:f3:b2:c3:c2:90:11:e8:ce:45:e8:a9:9d:1c:f2:f6:d4:53:61:8b:26:3a:b3:9a:57:97:c3:b6:79:4d:2e:d9:66:5c:72:ed:b6:8e:c5:5d:4d:8c:22:67:30🆎8a:6e:c3:6a myhost:~# ethtool -x eth0 RX flow hash indirection table for eth0 with 8 RX ring(s): 0: 0 1 2 3 4 5 6 7 RSS hash key: b6:89:91:f3:b2:c3:c2:90:11:e8:ce:45:e8:a9:9d:1c:f2:f6:d4:53:61:8b:26:3a:b3:9a:57:97:c3:b6:79:4d:2e:d9:66:5c:72:ed:b6:8e myhost:~# ethtool -X eth0 hkey \ 03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f myhost:~# ethtool -x eth0 RX flow hash indirection table for eth0 with 8 RX ring(s): 0: 0 1 2 3 4 5 6 7 RSS hash key: 03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f Reported-by: Ben Hutchings <ben@decadent.org.uk> Fixes: `b9d1ab7eb4` ("mlx4: use netdev_rss_key_fill() helper") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-23 13:49:12 -05:00
David S. Miller	1459143386	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ieee802154/fakehard.c A bug fix went into 'net' for ieee802154/fakehard.c, which is removed in 'net-next'. Add build fix into the merge from Stephen Rothwell in openvswitch, the logging macros take a new initial 'log' argument, a new call was added in 'net' so when we merge that in here we have to explicitly add the new 'log' arg to it else the build fails. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 22:28:24 -05:00
Saeed Mahameed	312df74c71	net/mlx4_en: mlx4_en_set_settings() always fails when autoneg is set Fix ethtool set settings to not check AUTONEG_ENABLE mlx4_en_set_settings should not check if cmd->autoneg == AUTONEG_ENABLE, cmd->autoneg can be enabled by default and this check will fail other settings requests. mlx4_en driver doesn't support changing autoneg value, but shouldn't fail the request in case cmd->autoneg was set. Fixes: `d48b3ab` ("net/mlx4_en: Use PTYS register to set ethtool settings (Speed)") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 15:06:30 -05:00
Al Viro	914efb02be	mlx4: don't duplicate kvfree() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:58:18 -05:00
Al Viro	479163f460	mlx5: don't duplicate kvfree() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-21 14:58:18 -05:00
Or Gerlitz	9737c6ab7a	net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too This is currently missing, which results in a crash when one attempts to set VXLAN tunnel over the mlx4_en when acting as PF. [ 2408.785472] BUG: unable to handle kernel NULL pointer dereference at (null) [...] [ 2408.994104] Call Trace: [ 2408.996584] [<ffffffffa021f7f5>] ? vxlan_get_rx_port+0xd6/0x103 [vxlan] [ 2409.003316] [<ffffffffa021f71f>] ? vxlan_lowerdev_event+0xf2/0xf2 [vxlan] [ 2409.010225] [<ffffffffa0630358>] mlx4_en_start_port+0x862/0x96a [mlx4_en] [ 2409.017132] [<ffffffffa063070f>] mlx4_en_open+0x17f/0x1b8 [mlx4_en] While here, make sure to invoke vxlan_get_rx_port() only when VXLAN offloads are actually enabled and not when they are only supported. Reported-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-19 15:11:09 -05:00
Eric Dumazet	b9d1ab7eb4	mlx4: use netdev_rss_key_fill() helper Use of well known RSS key increases attack surface. Switch to a random one, using generic helper so that all ports share a common key. Also provide ethtool -x support to fetch RSS key Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-16 15:59:13 -05:00
Joe Stringer	956bdab2e4	net/mlx4_en: Implement ndo_gso_check() Use vxlan_gso_check() to advertise offload support for this NIC. Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-14 17:12:48 -05:00
David S. Miller	076ce44825	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/chelsio/cxgb4vf/sge.c drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c sge.c was overlapping two changes, one to use the new __dev_alloc_page() in net-next, and one to use s->fl_pg_order in net. ixgbe_phy.c was a set of overlapping whitespace changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-14 01:01:12 -05:00
Matan Barak	de966c5928	net/mlx4_core: Support more than 64 VFs We now allow up to 126 VFs. Note though that certain firmware versions only allow up to 80 VFs. Moreover, old HCAs only support 64 VFs. In these cases, we limit the maximum number of VFs to 64. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:16:22 -05:00
Matan Barak	7ae0e400cd	net/mlx4_core: Flexible (asymmetric) allocation of EQs and MSI-X vectors for PF/VFs Previously, the driver queried the firmware in order to get the number of supported EQs. Under SRIOV, since this was done before the driver notified the firmware how many VFs it actually needs, the firmware had to take into account a worst case scenario and always allocated four EQs per VF, where one was used for events while the others were used for completions. Now, when the firmware supports the asymmetric allocation scheme, denoted by exposing num_sys_eqs > 0 (--> MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the QUERY_FUNC command to query the firmware before enabling SRIOV. Thus we can get more EQs and MSI-X vectors per function. Moreover, when running in the new firmware/driver mode, the limitation that the number of EQs should be a power of two is lifted. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:16:21 -05:00
Matan Barak	e8c4265bea	net/mlx4_core: Add QUERY_FUNC firmware command QUERY_FUNC firmware command could be used in order to query the number of EQs, reserved EQs, etc for a specific function. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:16:19 -05:00
Matan Barak	a0eacca948	net/mlx4_core: Refactor mlx4_load_one Refactor mlx4_load_one, as a preparation step for a new and more complicated load function. The goal is to support both newer firmware that required init_hca to be done before enable_sriov and legacy firmwares that requires things to be done the other way around. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:16:18 -05:00
Matan Barak	ffc39f6d6f	net/mlx4_core: Refactor mlx4_cmd_init and mlx4_cmd_cleanup Refactoring mlx4_cmd_init and mlx4_cmd_cleanup such that partial init and cleanup are possible. After this refactoring, calling mlx4_cmd_init several times is safe. This is necessary in the VF init flow when mlx4_init_hca returns -EACCESS, we need to issue cleanup and re-attempt to call it with the slave flag. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:16:17 -05:00
Matan Barak	225c6c8c6b	net/mlx4_core: Use correct variable type for mlx4_slave_cap We've used an incorrect type for the loop counter and the mlx4_QUERY_FUNC_CAP function. The current input modifier is either a port or a boolean. Since the number of ports is always a positive value < 255, we should use u8 instead of an integer with casting. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:16:16 -05:00
Matan Barak	7c68dd435b	net/mlx4_core: Fix wrong reading of reserved_eqs We mistakenly read the reserved_eqs field as a standard numeric value rather than a log2 value. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-13 15:16:15 -05:00
Or Gerlitz	f4a1edd561	net/mlx4_en: Advertize encapsulation offloads features only when VXLAN tunnel is set Currenly we only support Large-Send and TX checksum offloads for encapsulated traffic of type VXLAN. We must make sure to advertize these offloads up to the stack only when VXLAN tunnel is set. Failing to do so, would mislead the the networking stack to assume that the driver can offload the internal TX checksum for GRE packets and other buggy schemes. Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 13:24:45 -05:00
Shani Michaeli	f8c6455bb0	net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE When processing received traffic, pass CHECKSUM_COMPLETE status to the stack, with calculated checksum for non TCP/UDP packets (such as GRE or ICMP). Although the stack expects checksum which doesn't include the pseudo header, the HW adds it. To address that, we are subtracting the pseudo header checksum from the checksum value provided by the HW. In the IPv6 case, we also compute/add the IP header checksum which is not added by the HW for such packets. Cc: Jerry Chu <hkchu@google.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 13:20:02 -05:00
Shani Michaeli	dd65beac48	net/mlx4_en: Extend usage of napi_gro_frags We can call napi_gro_frags for all the received traffic regardless of the checksum status. Specifically, received packets whose status is CHECKSUM_NONE (and soon to be added CHECKSUM_COMPLETE) are eligible for napi_gro_frags as well. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-11 13:20:02 -05:00
Eric Dumazet	2e1af7d74f	mlx4: restore conditional call to napi_complete_done() After commit `1a28817282` ("mlx4: use napi_complete_done()") we ended up calling napi_complete_done() in the case NAPI poll consumed all its budget. This added extra interrupt pressure, this patch restores proper behavior. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `1a28817282` ("mlx4: use napi_complete_done()") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 21:09:03 -05:00
Eric Dumazet	1a28817282	mlx4: use napi_complete_done() To enable gro_flush_timeout, a driver has to use napi_complete_done() instead of napi_complete(). Tested: Ran 200 netperf TCP_STREAM from A to B (10Gbe mlx4 link, 8 RX queues) Without this feature, we send back about 305,000 ACK per second. GRO aggregation ratio is low (811/305 = 2.65 segments per GRO packet) Setting a timer of 2000 nsec is enough to increase GRO packet sizes and reduce number of ACK packets. (811/19.2 = 42) Receiver performs less calls to upper stacks, less wakes up. This also reduces cpu usage on the sender, as it receives less ACK packets. Note that reducing number of wakes up increases cpu efficiency, but can decrease QPS, as applications wont have the chance to warmup cpu caches doing a partial read of RPC requests/answers if they fit in one skb. B:~# sar -n DEV 1 10 \| grep eth0 \| tail -1 Average: eth0 811269.80 305732.30 1199462.57 19705.72 0.00 0.00 0.50 B:~# echo 2000 >/sys/class/net/eth0/gro_flush_timeout B:~# sar -n DEV 1 10 \| grep eth0 \| tail -1 Average: eth0 811577.30 19230.80 1199916.51 1239.80 0.00 0.00 0.50 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-10 12:05:59 -05:00
David S. Miller	4e84b496fd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-11-06 22:01:18 -05:00
Eli Cohen	364d1798ef	net/mlx5_core: Fix race on driver load When events arrive at driver load, the event handler gets called even before the spinlock and list are initialized. Fix this by moving the initialization before EQs creation. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 16:40:36 -05:00
Eli Cohen	a158906dd7	net/mlx5_core: Fix race in create EQ After the EQ is created, it can possibly generate interrupts and the interrupt handler is referencing eq->dev. It is therefore required to set eq->dev before calling request_irq() so if an event is generated before request_irq() returns, we will have a valid eq->dev field. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-06 16:40:35 -05:00
Matan Barak	d475c95b4b	net/mlx4_core: Add retrieval of CONFIG_DEV parameters Add code to issue CONFIG_DEV "get" firmware command. This command is used in order to obtain certain parameters used for supporting various RX checksumming options and vxlan UDP port. The GET operation is allowed for VFs too. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 12:28:14 -05:00
Ido Shamay	1ab25f86c4	net/mlx4_en: Add __GFP_COLD gfp flags in alloc_pages Needed in order to get cache cold pages (L3 flushed) for HW scatter. Otherwise memory may flush those entries when the packet comes from PCI, causing back pressure resulting in BW decrease. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 12:28:13 -05:00
Ido Shamay	5f6e980080	net/mlx4_en: Remove RX buffers alignment to IP_ALIGN When IP_ALIGN has a non zero value, hardware will write to a non aligned address. The only reader from this address is when copying the header from the first frag into the linear buffer (further access to the IP address will be from the linear buffer, in which the headers are aligned). Since the penalty of non align access by the hardware is greater than the software memcpy, changing the frag_align to always be 0. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 12:28:13 -05:00
Amir Vadai	0a98455666	net/mlx4_core: Protect port type setting by mutex We need to protect set_port_type() for concurrency, as the sysfs code could call it from mutliple contexts in parallel. The port_mutex is not enough because we need to protect from concurrent modification of 'info' and stopping of the port sensing work. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 12:28:13 -05:00
Saeed Mahameed	6e80669998	net/mlx4_core: Prevent VF from changing port configuration Added wrapper to the ACCESS_REG command for handling guest HW registers access, preventing write operations, but do allow reads. This will prevent SRIOV guests to change port PTYS configuration, such as speed/advertised link modes. Fixes: `adbc7ac5c1` ('net/mlx4_core: Introduce ACCESS_REG CMD [...]') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-03 12:28:13 -05:00
David S. Miller	55b42b5ca2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/phy/marvell.c Simple overlapping changes in drivers/net/phy/marvell.c Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-01 14:53:27 -04:00
Or Gerlitz	571e1b2c7a	mlx4: Avoid leaking steering rules on flow creation error flow If mlx4_ib_create_flow() attempts to create > 1 rules with the firmware, and one of these registrations fail, we leaked the already created flow rules. One example of the leak is when the registration of the VXLAN ghost steering rule fails, we didn't unregister the original rule requested by the user, introduced in commit `d2fce8a906` "mlx4: Set user-space raw Ethernet QPs to properly handle VXLAN traffic". While here, add dump of the VXLAN portion of steering rules so it can actually be seen when flow creation fails. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:48:58 -04:00
Or Gerlitz	a4f2dacbf2	net/mlx4_en: Don't attempt to TX offload the outer UDP checksum for VXLAN For VXLAN/NVGRE encapsulation, the current HW doesn't support offloading both the outer UDP TX checksum and the inner TCP/UDP TX checksum. The driver doesn't advertize SKB_GSO_UDP_TUNNEL_CSUM, however we are wrongly telling the HW to offload the outer UDP checksum for encapsulated packets, fix that. Fixes: `837052d0cc` ('net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 19:48:58 -04:00
Eric Dumazet	477b35b44f	mlx4: use napi_schedule_irqoff() mlx4_en_rx_irq() and mlx4_en_tx_irq() run from hard interrupt context. They can use napi_schedule_irqoff() instead of napi_schedule() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-30 16:50:47 -04:00
Amir Vadai	d5ec899adb	net/mlx4_en: Report actual number of rings in indirection table Hardware requires the number of rings in indirection table to be a power of 2. When setting number of channels to a non power of 2 number, indirection table is using only the closest power of 2 rings. Report this number in 'ethtool -x' and not the total number of rx rings. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:01 -04:00
Eugenia Emantayev	207af6c507	net/mlx4_en: Move spinlocks and work initalizations to beginning of init_netdev Upon failures, destroy_netdev is called, and spinlocks/works must be initialized before calling it. Otherwise kernel panic may occur. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:01 -04:00
Ido Shamay	f4a3675158	net/mlx4_en: Call napi_synchronize on stop_port This is instead of calling the actual implementation of napi_synchronize, for better encapsulation. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:01 -04:00
Jack Morgenstein	c2a3d4b4ca	net/mlx4_en: Cleanups suggested by clang static checker clang flagged the following. All are actually cosmetic cleanups, not really bugs: drivers/net/ethernet/mellanox/mlx4/en_main.c:233:3: warning: Value stored to 'err' is never read err = -ENOMEM; ^ ~~~~~~~ drivers/net/ethernet/mellanox/mlx4/en_main.c:293:3: warning: Value stored to 'err' is never read err = -ENOMEM; drivers/net/ethernet/mellanox/mlx4/en_netdev.c:648:16: warning: Assigned value is garbage or undefined entry->reg_id = reg_id; ^ ~~~~~~ drivers/net/ethernet/mellanox/mlx4/en_netdev.c:659:2: warning: Function call argument is an uninitialized value mlx4_en_uc_steer_release(priv, priv->dev->dev_addr, *qpn, reg_id); (NOTE: reg_id is only used in the device-managed flow steering path, in which is it always initialized. This is not a bug. Cleanup here is therefore cosmetic only). drivers/net/ethernet/mellanox/mlx4/en_rx.c:122:3: warning: Value stored to 'frag_info' is never read frag_info = &priv->frag_info[i]; ^ ~~~~~~~~~~~~~~~~~~~ Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:01 -04:00
Saeed Mahameed	537f6f951e	net/mlx4_en: Add ethtool support for [rx\|tx]vlan offload set to OFF/ON Move mlx4_en_reset_config to en_netdev.c as it now serves more general purpose. Add support for turning OFF/ON the rx/tx vlan offlad. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:01 -04:00
Saeed Mahameed	7787fa661b	net/mlx4_en: Add support for setting rxvlan offload OFF/ON Rename mlx4_en_timestamp_config to mlx4_en_reset_config and extend it to support choosing RX vlan offload configuration. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:01 -04:00
Saeed Mahameed	d48b3ab4c0	net/mlx4_en: Use PTYS register to set ethtool settings (Speed) Added Support to set speed or advertised link modes via ethtool: ethtool -s <ifname> [speed <speed>] [advertise <link modes>] Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:00 -04:00
Saeed Mahameed	2c76267943	net/mlx4_en: Use PTYS register to query ethtool settings - If dev cap MLX4_DEV_CAP_FLAG2_ETH_PROT_CTRL is ON, query PTYS register to fill ethtool settings. else use default values. - Use autoneg port cap and dev backplane autoneg cap to reprort autoneg interface capbilities. - Fix typo in mlx4_en_port_state struct field (transciver to transceiver). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:00 -04:00
Saeed Mahameed	dcf972a334	ethtool, net/mlx4_en: Add 100M, 20G, 56G speeds ethtool reporting support Added 100M, 20G and 56G ethtool speed reporting support. Update mlx4_en_test_speed self test with the new speeds. Defined new link speeds in include/uapi/linux/ethtool.h: +#define SPEED_20000 20000 +#define SPEED_40000 40000 +#define SPEED_56000 56000 Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:00 -04:00
Saeed Mahameed	a53e3e8c1d	net/mlx4_core: Add ethernet backplane autoneg device capability Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:00 -04:00
Saeed Mahameed	adbc7ac5c1	net/mlx4_core: Introduce ACCESS_REG CMD and eth_prot_ctrl dev cap Adding ACCESS REG mlx4 command and use it to implement Query method for PTYS (Port Type and Speed Register). Query and store eth_prot_ctrl dev cap. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:00 -04:00
Saeed Mahameed	7202da8b7f	ethtool, net/mlx4_en: Cable info, get_module_info/eeprom ethtool support Added support for get_module_info/get_module_eeprom ethtool support for cable info reading. Added new cable types enum in include/uapi/linux/ethtool.h for ethtool use. +#define ETH_MODULE_SFF_8636 0x3 +#define ETH_MODULE_SFF_8636_LEN 256 +#define ETH_MODULE_SFF_8436 0x4 +#define ETH_MODULE_SFF_8436_LEN 256 Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:18:00 -04:00
Saeed Mahameed	32a173c7f9	net/mlx4_core: Introduce mlx4_get_module_info for cable module info reading Added new MAD_IFC command to read cable module info with attribute id (0xFF60). Update include/linux/mlx4/device.h with function declaration (mlx4_get_module_info) and the needed defines/enums for future use. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-28 17:17:59 -04:00
Eli Cohen	bf1bac5b78	net/mlx4_core: Call synchronize_irq() before freeing EQ buffer After moving the EQ ownership to software effectively destroying it, call synchronize_irq() to ensure that any handler routines running on other CPU cores finish execution. Only then free the EQ buffer. The same thing is done when we destroy a CQ which is one of the sources generating interrupts. In the case of CQ we want to avoid completion handlers on a CQ that was destroyed. In the case we do the same to avoid receiving asynchronous events after the EQ has been destroyed and its buffers freed. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-26 22:46:04 -04:00
Eli Cohen	96e4be06cb	net/mlx5_core: Call synchronize_irq() before freeing EQ buffer After destroying the EQ, the object responsible for generating interrupts, call synchronize_irq() to ensure that any handler routines running on other CPU cores finish execution. Only then free the EQ buffer. This patch solves a very rare case when we get panic on driver unload. The same thing is done when we destroy a CQ which is one of the sources generating interrupts. In the case of CQ we want to avoid completion handlers on a CQ that was destroyed. In the case we do the same to avoid receiving asynchronous events after the EQ has been destroyed and its buffers freed. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-26 22:46:03 -04:00
Eric Dumazet	98226208c8	mlx4: fix race accessing page->_count This is illegal to use atomic_set(&page->_count, ...) even if we 'own' the page. Other entities in the kernel need to use get_page_unless_zero() to get a reference to the page before testing page properties, so we could loose a refcount increment. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-10 15:37:28 -04:00
Linus Torvalds	35a9ad8af0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Most notable changes in here: 1) By far the biggest accomplishment, thanks to a large range of contributors, is the addition of multi-send for transmit. This is the result of discussions back in Chicago, and the hard work of several individuals. Now, when the ->ndo_start_xmit() method of a driver sees skb->xmit_more as true, it can choose to defer the doorbell telling the driver to start processing the new TX queue entires. skb->xmit_more means that the generic networking is guaranteed to call the driver immediately with another SKB to send. There is logic added to the qdisc layer to dequeue multiple packets at a time, and the handling mis-predicted offloads in software is now done with no locks held. Finally, pktgen is extended to have a "burst" parameter that can be used to test a multi-send implementation. Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4, virtio_net Adding support is almost trivial, so export more drivers to support this optimization soon. I want to thank, in no particular or implied order, Jesper Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann, David Tat, Hannes Frederic Sowa, and Rusty Russell. 2) PTP and timestamping support in bnx2x, from Michal Kalderon. 3) Allow adjusting the rx_copybreak threshold for a driver via ethtool, and add rx_copybreak support to enic driver. From Govindarajulu Varadarajan. 4) Significant enhancements to the generic PHY layer and the bcm7xxx driver in particular (EEE support, auto power down, etc.) from Florian Fainelli. 5) Allow raw buffers to be used for flow dissection, allowing drivers to determine the optimal "linear pull" size for devices that DMA into pools of pages. The objective is to get exactly the necessary amount of headers into the linear SKB area pre-pulled, but no more. The new interface drivers use is eth_get_headlen(). From WANG Cong, with driver conversions (several had their own by-hand duplicated implementations) by Alexander Duyck and Eric Dumazet. 6) Support checksumming more smoothly and efficiently for encapsulations, and add "foo over UDP" facility. From Tom Herbert. 7) Add Broadcom SF2 switch driver to DSA layer, from Florian Fainelli. 8) eBPF now can load programs via a system call and has an extensive testsuite. Alexei Starovoitov and Daniel Borkmann. 9) Major overhaul of the packet scheduler to use RCU in several major areas such as the classifiers and rate estimators. From John Fastabend. 10) Add driver for Intel FM10000 Ethernet Switch, from Alexander Duyck. 11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric Dumazet. 12) Add Datacenter TCP congestion control algorithm support, From Florian Westphal. 13) Reorganize sk_buff so that __copy_skb_header() is significantly faster. From Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits) netlabel: directly return netlbl_unlabel_genl_init() net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers net: description of dma_cookie cause make xmldocs warning cxgb4: clean up a type issue cxgb4: potential shift wrapping bug i40e: skb->xmit_more support net: fs_enet: Add NAPI TX net: fs_enet: Remove non NAPI RX r8169:add support for RTL8168EP net_sched: copy exts->type in tcf_exts_change() wimax: convert printk to pr_foo() af_unix: remove 0 assignment on static ipv6: Do not warn for informational ICMP messages, regardless of type. Update Intel Ethernet Driver maintainers list bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING tipc: fix bug in multicast congestion handling net: better IFF_XMIT_DST_RELEASE support net/mlx4_en: remove NETDEV_TX_BUSY 3c59x: fix bad split of cpu_to_le32(pci_map_single()) net: bcmgenet: fix Tx ring priority programming ...	2014-10-08 21:40:54 -04:00
Eric Dumazet	535114539b	net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers Add two helpers so that drivers do not have to care of BQL being available or not. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jim Davis <jim.epost@gmail.com> Fixes: `29d40c9032` ("net/mlx4_en: Use prefetch in tx path") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-08 16:08:04 -04:00
Linus Torvalds	28596c9722	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull "trivial tree" updates from Jiri Kosina: "Usual pile from trivial tree everyone is so eagerly waiting for" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Remove MN10300_PROC_MN2WS0038 mei: fix comments treewide: Fix typos in Kconfig kprobes: update jprobe_example.c for do_fork() change Documentation: change "&" to "and" in Documentation/applying-patches.txt Documentation: remove obsolete pcmcia-cs from Changes Documentation: update links in Changes Documentation: Docbook: Fix generated DocBook/kernel-api.xml score: Remove GENERIC_HAS_IOMAP gpio: fix 'CONFIG_GPIO_IRQCHIP' comments tty: doc: Fix grammar in serial/tty dma-debug: modify check_for_stack output treewide: fix errors in printk genirq: fix reference in devm_request_threaded_irq comment treewide: fix synchronize_rcu() in comments checkstack.pl: port to AArch64 doc: queue-sysfs: minor fixes init/do_mounts: better syntax description MIPS: fix comment spelling powerpc/simpleboot: fix comment ...	2014-10-07 21:16:26 -04:00
Eric Dumazet	fe971b95c2	net/mlx4_en: remove NETDEV_TX_BUSY Drivers should avoid NETDEV_TX_BUSY as much as possible. They should stop the tx queue before qdisc even tries to push another packet, to avoid requeues. For a driver supporting skb->xmit_more, this is likely to be a prereq anyway, otherwise we could have a tx deadlock : We need to force a doorbell if TX ring is full. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-07 13:20:39 -04:00

... 4 5 6 7 8 ...

1262 Commits