Commit Graph

92 Commits

Author SHA1 Message Date
Ido Shamay
77507aa249 net/mlx4_core: Enable CQE/EQE stride support
This feature is intended for archs having cache line larger then 64B.

Since our CQE/EQEs are generally 64B in those systems, HW will write
twice to the same cache line consecutively, causing pipe locks due to
he hazard prevention mechanism. For elements in a cyclic buffer, writes
are consecutive, so entries smaller than a cache line should be
avoided, especially if they are written at a high rate.

Reduce consecutive writes to same cache line in CQs/EQs, by allowing the
driver to increase the distance between entries so that each will reside
in a different cache line. Until the introduction of this feature, there
were two types of CQE/EQE:

1. 32B stride and context in the [0-31] segment
2. 64B stride and context in the [32-63] segment

This feature introduces two additional types:

3. 128B stride and context in the [0-31] segment (128B cache line)
4. 256B stride and context in the [0-31] segment (256B cache line)

Modify the mlx4_core driver to query the device for the CQE/EQE cache
line stride capability and to enable that capability when the host
cache line size is larger than 64 bytes (supported cache lines are
128B and 256B).

The mlx4 IB driver and libmlx4 need not be aware of this change. The PF
context behaviour is changed to require this change in VF drivers
running on such archs.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-19 17:30:10 -04:00
Linus Torvalds
e3b1fd56f1 Main set of InfiniBand/RDMA updates for 3.17 merge window:
- MR reregistration support
  - MAD support for RMPP in userspace
  - iSER and SRP initiator updates
  - ocrdma hardware driver updates
  - other fixes...
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJT7N2GAAoJEENa44ZhAt0hUiIQAKBqYIjpB3QY6Z/B19mxDxku
 I81B3OkirumbAaCLoLckvq4gnwQ+BAD+YUmXgP08TCIgrABZAYYw+WvMEY9WNyQB
 x3Pv+BzX+wKKNaQkSnB9JVdku+BSI76eW0YYrIX0F0x1o0Jq9JpSkia91KmRvqmX
 YSFy2R7BjEZ4lfo/uscydHT26Q6EdT3od4iv48K8qq5rKdjtyYNgD/75DLCN599Z
 uI1f98e6Tl+7nHWaioQB61zlYPkNPLAnZtMrY2j4tarTwYwX1KhF3eV7z39L1l81
 nhMIXr+qBXtYuZw3I9rKqw3VVCCqB9e6E8FIA5K/d2jWqO+0TqIMUYOuZXuCezWg
 o+uGgwbDOweBqwrKRmiR2M0lk2I1Z16jBxYuaUBbLImG0/NPtuUB22t6RbPAojTa
 EjDkb9XBA7uFUMrYYnou+HxEzmJUYkin6wgGxtklYEKUqvh8G9ccGt6httWrCSrV
 mpjwJv+S4LdFM49wdP993lYthpCoZ42yxEzZ7zJ/KTt17/Wb5F4RtHIROGVFkHdT
 mo8RUz5DGDznfNQgn0m/jC3woFnMNpLOduI+CivhrwrWCwMpSUxIjqWu3263pIJ7
 +H0kNOKDAbp6+F27j+AVznlyXnaEtYyM8EZnysG1Hkz24gCSWBYo5Ep8eSH8LgY4
 9VPc75KVG6uddx5mhg5h
 =wOm0
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband/rdma updates from Roland Dreier:
 "Main set of InfiniBand/RDMA updates for 3.17 merge window:

   - MR reregistration support
   - MAD support for RMPP in userspace
   - iSER and SRP initiator updates
   - ocrdma hardware driver updates
   - other fixes..."

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (52 commits)
  IB/srp: Fix return value check in srp_init_module()
  RDMA/ocrdma: report asic-id in query device
  RDMA/ocrdma: Update sli data structure for endianness
  RDMA/ocrdma: Obtain SL from device structure
  RDMA/uapi: Include socket.h in rdma_user_cm.h
  IB/srpt: Handle GID change events
  IB/mlx5: Use ARRAY_SIZE instead of sizeof/sizeof[0]
  IB/mlx4: Use ARRAY_SIZE instead of sizeof/sizeof[0]
  RDMA/amso1100: Check for integer overflow in c2_alloc_cq_buf()
  IPoIB: Remove unnecessary test for NULL before debugfs_remove()
  IB/mad: Add user space RMPP support
  IB/mad: add new ioctl to ABI to support new registration options
  IB/mad: Add dev_notice messages for various umad/mad registration failures
  IB/mad: Update module to [pr|dev]_* style print messages
  IB/ipoib: Avoid multicast join attempts with invalid P_key
  IB/umad: Update module to [pr|dev]_* style print messages
  IB/ipoib: Avoid flushing the workqueue from worker context
  IB/ipoib: Use P_Key change event instead of P_Key polling mechanism
  IB/ipath: Add P_Key change event support
  mlx4_core: Add support for secure-host and SMP firewall
  ...
2014-08-14 11:09:05 -06:00
Roland Dreier
d087f6ad72 Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'iwcm', 'mad', 'misc', 'mlx4', 'mlx5', 'ocrdma' and 'srp' into for-next 2014-08-14 08:58:04 -07:00
Jack Morgenstein
114840c3d2 mlx4_core: Add support for secure-host and SMP firewall
Secure-host is the general term for the capability of a device
to protect itself and the subnet from malicious host software.

This is achieved by:
1. Not allowing un-trusted entities to access device configuration
   registers, directly (through pci_cr or pci_conf) and indirectly
   (through MADs).

2. Hiding M_Key from untrusted entities.

3. Preventing the modification of GUID0 by un-trusted entities

4. Not allowing drivers on untrusted hosts to receive nor to transmit
   packets over QP0 (SMP Firewall).

The secure-host capability depends on firmware handling all QP0
packets, and not passing these packets up to the driver. Any information
required by the driver for proper operation (e.g., SM lid) is passed
via events generated by the firmware while processing QP0 MADs.

Driver support mainly requires using the MAD_DEMUX FW command at startup,
where the feature is enabled/disabled through a procedure described in
the Mellanox HCA tools package.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>

[ Fix error path in mlx4_setup_hca to go to err_mcg_table_free. - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-08-05 07:40:22 -07:00
Matan Barak
e630664c83 mlx4_core: Add helper functions to support MR re-registration
Add few helper functions to support a mechanism of getting an MPT,
modifying it and updating the HCA with the modified object.

The code takes 2 paths, one for directly changing the MPT (and
sometimes its related MTTs) and another one which queries the MPT and
updates the HCA via fw command SW2HW_MPT. The first path is used in
native mode; the second path is slower and is used only in SRIOV.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-08-01 15:11:13 -07:00
Eugenia Emantayev
523ece889e net/mlx4_en: Fix set port ratelimit for 40GE
In 40GE we can't use the default bw units for set ratelimit (100 Mbps)
since the max is 255*100 Mbps = 25 Gbps (not suited for 40GE), thus we need 1 Gbps units.
But for 10GE 1 Gbps units might be too bruit so we use the following solution.

For user set ratelimit <= 25 Gbps:
        use 100 Mbps units * user_ratelimit (* 10).

For user set ratelimit > 25 Gbps:
        use 1 Gbps units * user_ratelimit.

For user set unlimited ratelimit (0 Gbps):
        use 1 Gbps units * MAX_RATELIMIT_DEFAULT (57)

Note: any value > 58 will damage the FW ratelimit computation, so we allow
      a max and any higher value will be pulled down to 57.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 19:58:44 -07:00
Linus Torvalds
f9da455b93 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
2014-06-12 14:27:40 -07:00
Linus Torvalds
1d21b1bf53 Main batch of InfiniBand/RDMA changes for 3.16:
- Add iWARP port mapper to avoid conflicts between RDMA and normal
    stack TCP connections.
 
  - Fixes for i386 / x86-64 structure padding differences (ABI
    compatibility for 32-on-64) from Yann Droneaud.
 
  - A pile of SRP initiator fixes from Bart Van Assche.
 
  - Fixes for a writeback / memory allocation deadlock with NFS over
    IPoIB connected mode from Jiri Kosina.
 
  - The usual fixes and cleanups to mlx4, mlx5, cxgb4 and other
    low-level drivers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJTlzyEAAoJEENa44ZhAt0h9yoP/1UeXlejOpCJyiNdtJZ+ilcU
 cb0PEzsjzqACyDqcoQ0EpQM3/3emccVIC3uUXK12mzlTIXOFYTeRLays/TbxZDLt
 FK5D/NrMmmJmciPt1ZRgUX82kFFRGScEfpkXYs7jxtRaNT7CW5KwSNQr6aFXskUz
 1gpdK1ARCN5rWcGl2HJx5o9C4c/Fa/Vov8lOsAkUZXD1SuPNT/fFN0u1pRzU68g0
 k3oj81XnZq5ejOBQKXEHImcmjXwaJ2yjmzxhSsKebqDWDdXuS/F9e4taKneHTZmr
 AdwJaLLJPWmAGi/vYYhkuLKpzIDpzMCqwr39lEabmjWvznYOlnjfVUXwUTE2nwNC
 DIXuHOLFrSvF2cNxh8ZeEYKS8AV+PjAOahPC5whkWkY256Q67uB7cy9ilWAK+7xS
 QcQ5Inr6iXvxIGYA4hNwUo8aK0NuKFwhkVVFEbkPaurbQZPqiKwyVE3w2FOws/Qp
 0kLLCVvpRQYjKzkxyof2tb1AcNuVNKXHrYk6RaBDJ9mjxHbhvY4OSt4CBxAAXBu6
 zoedUydN1Nz1UgAB1jDsBdyE2QQnXockA1+JJKNq6gM5Dz0DUdAylzQ2NqY9tnYz
 RTzihEPYIiQUkV3B8ErbqsuO6z7M830AXO5AR6bLZn1zgJ0cbMLBaKLA8LRufJI/
 qxNVwL32Uv1PjKZ+yX1x
 =Wcdc
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull main InfiniBand/RDMA updates from Roland Dreier:

 - add iWARP port mapper to avoid conflicts between RDMA and normal
   stack TCP connections.

 - fixes for i386 / x86-64 structure padding differences (ABI
   compatibility for 32-on-64) from Yann Droneaud.

 - a pile of SRP initiator fixes from Bart Van Assche.

 - fixes for a writeback / memory allocation deadlock with NFS over
   IPoIB connected mode from Jiri Kosina.

 - the usual fixes and cleanups to mlx4, mlx5, cxgb4 and other low-level
   drivers.

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (61 commits)
  RDMA/cxgb4: Add support for iWARP Port Mapper user space service
  RDMA/nes: Add support for iWARP Port Mapper user space service
  RDMA/core: Add support for iWARP Port Mapper user space service
  IB/mlx4: Fix gfp passing in create_qp_common()
  IB/umad: Fix use-after-free on close
  IB/core: Fix kobject leak on device register error flow
  RDMA/cxgb4: add missing padding at end of struct c4iw_alloc_ucontext_resp
  mlx4_core: Fix GFP flags parameters to be gfp_t
  IB/core: Fix port kobject deletion during error flow
  IB/core: Remove unneeded kobject_get/put calls
  IB/core: Fix sparse warnings about redeclared functions
  IB/mad: Fix sparse warning about gfp_t use
  IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO
  IB: Add a QP creation flag to use GFP_NOIO allocations
  IB: Return error for unsupported QP creation flags
  IB: Allow build of hw/ and ulp/ subdirectories independently
  mlx4_core: Move handling of MLX4_QP_ST_MLX to proper switch statement
  RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_resp
  IB/srp: Avoid problems if a header uses pr_fmt
  IB/umad: Fix error handling
  ...
2014-06-10 10:41:33 -07:00
Roland Dreier
eeaddf3670 Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4', 'mlx5', 'noio', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next 2014-06-10 10:12:14 -07:00
David S. Miller
c99f7abf0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	include/net/inetpeer.h
	net/ipv6/output_core.c

Changes in net were fixing bugs in code removed in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-03 23:32:12 -07:00
Jiri Kosina
40f2287bd5 IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO
Modify the various routines used to allocate memory resources which
serve QPs in mlx4 to get an input GFP directive.  Have the Ethernet
driver to use GFP_KERNEL in it's QP allocations as done prior to this
commit, and the IB driver to use GFP_NOIO when the IB verbs
IB_QP_CREATE_USE_GFP_NOIO QP creation flag is provided.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-06-02 14:58:11 -07:00
Jack Morgenstein
111c6094bd net/mlx4_core: Reset RoCE VF gids when guest driver goes down
Reset the GIDs assigned to a VF in the port RoCE GID table when
that guest goes down (either crashes or goes down cleanly).

As part of this fix, we refactor the RoCE gid table driver copy,
moving it to the mlx4_port_info structure (together with the MAC
and VLAN tables).

As with the MAC and VLAN tables, we now use a mutex per port
for the GID table so that modifying the driver copy and
modifying the firmware copy of a port GID table becomes an
atomic operation (thus avoiding driver-copy/FW-copy mismatches).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-30 16:57:52 -07:00
Jack Morgenstein
99ec41d0a4 mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs
This commit adds the infrastructure for enabling selected VFs to
operate SMI (QP0) MADs without restriction.

Additionally, for these enabled VFs, their QP0 proxy and tunnel QPs
are MLX QPs.  As such, they operate over VL15.  Therefore, they are
not affected by "credit" problems or changes in the VLArb table (which
may shut down VL0).

Non-enabled VFs may only create UD proxy QP0 qps (which are forced by
the hypervisor to send packets using the q-key it assigns and places
in the qp-context).  Thus, non-enabled VFs will not pose a security
risk.  The hypervisor discards any privileged MADs it receives from
these non-enabled VFs.

By default, all VFs are NOT enabled, and must explicitly be enabled
by the administrator.

The sysfs interface which operates the VF enablement infrastructure
is provided in the next commit.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-05-29 21:13:09 -07:00
David S. Miller
54e5c4def0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/bonding/bond_alb.c
	drivers/net/ethernet/altera/altera_msgdma.c
	drivers/net/ethernet/altera/altera_sgdma.c
	net/ipv6/xfrm6_output.c

Several cases of overlapping changes.

The xfrm6_output.c has a bug fix which overlaps the renaming
of skb->local_df to skb->ignore_df.

In the Altera TSE driver cases, the register access cleanups
in net-next overlapped with bug fixes done in net.

Similarly a bug fix to send ALB packets in the bonding driver using
the right source address overlaps with cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24 00:32:30 -04:00
Matan Barak
ce8d9e0d67 net/mlx4_core: Add UPDATE_QP SRIOV wrapper support
This patch adds UPDATE_QP SRIOV wrapper support.

The mechanism is a general one, but currently only source MAC
index changes are allowed for VFs.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 15:12:45 -04:00
Joe Perches
1a91de2883 mellanox: Logging message cleanups
Use a more current logging style.

o Coalesce formats
o Add missing spaces for coalesced formats
o Align arguments for modified formats
o Add missing newlines for some logging messages
o Use DRV_NAME as part of format instead of %s, DRV_NAME to
  reduce overall text.
o Use ..., ##__VA_ARGS__ instead of args... in macros
o Correct a few format typos
o Use a single line message where appropriate

Signed-off-by: Joe Perches <joe@perches.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-08 23:42:02 -04:00
Wei Yang
befdf8978a net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()
pci_match_id() just match the static pci_device_id, which may return NULL if
someone binds the driver to a device manually using
/sys/bus/pci/drivers/.../new_id.

This patch wrap up a helper function __mlx4_remove_one() which does the tear
down function but preserve the drv_data. Functions like
mlx4_pci_err_detected() and mlx4_restart_one() will call this one with out
releasing drvdata.

Fixes: 97a5221 "net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset".

CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Amir Vadai <amirv@mellanox.com>
CC: Jack Morgenstein <jackm@dev.mellanox.co.il>
CC: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-13 23:12:15 -04:00
Or Gerlitz
b74757944d net/mlx4: USe one wrapper that returns -EPERM
When a VF issues a firmware command which is disallowed for them, the PF
rerturns -EPERM from that command wrapper. Move to use one such wrapper
instance, instead of repeating the same code on such commands.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:29:35 -04:00
Matan Barak
449fc48866 net/mlx4: Adapt code for N-Port VF
Adds support for N-Port VFs, this includes:
1. Adding support in the wrapped FW command
	In wrapped commands, we need to verify and convert
	the slave's port into the real physical port.
	Furthermore, when sending the response back to the slave,
	a reverse conversion should be made.
2. Adjusting sqpn for QP1 para-virtualization
	The slave assumes that sqpn is used for QP1 communication.
	If the slave is assigned to a port != (first port), we need
	to adjust the sqpn that will direct its QP1 packets into the
	correct endpoint.
3. Adjusting gid[5] to modify the port for raw ethernet
	In B0 steering, gid[5] contains the port. It needs
	to be adjusted into the physical port.
4. Adjusting number of ports in the query / ports caps in the FW commands
	When a slave queries the hardware, it needs to view only
	the physical ports it's assigned to.
5. Adjusting the sched_qp according to the port number
	The QP port is encoded in the sched_qp, thus in modify_qp we need
	to encode the correct port in sched_qp.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20 16:18:30 -04:00
Matan Barak
f74462acf8 net/mlx4: Add utils for N-Port VFs
This patch adds the following utils:
1. Convert slave_id -> VF
2. Get the active ports by slave_id
3. Convert slave's port to real port
4. Get the slave's port from real port
5. Get all slaves that uses the i'th real port
6. Get all slaves that uses the i'th real port exclusively

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20 16:18:29 -04:00
Jack Morgenstein
b6ffaeffae mlx4: In RoCE allow guests to have multiple GIDS
The GIDs are statically distributed, as follows:
PF: gets 16 GIDs
VFs:  Remaining GIDS are divided evenly between VFs activated by the driver.
      If the division is not even, lower-numbered VFs get an extra GID.

For an IB interface, the number of gids per guest remains as before: one gid per guest.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:57:14 -04:00
Jack Morgenstein
6ee51a4e86 mlx4: Adjust QP1 multiplexing for RoCE/SRIOV
This requires the following modifications:
1. Fix build_mlx4_header to properly fill in the ETH fields
2. Adjust mux and demux QP1 flow to support RoCE.

This commit still assumes only one GID per slave for RoCE.
The commit enabling multiple GIDs is a subsequent commit, and
is done separately because of its complexity.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:57:12 -04:00
Amir Vadai
169a1d85d0 net,IB/mlx: Bump all Mellanox driver versions
Bump all Mellanox driver versions.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25 17:34:44 -05:00
Linus Torvalds
4ba9920e5e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) BPF debugger and asm tool by Daniel Borkmann.

 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.

 3) Correct reciprocal_divide and update users, from Hannes Frederic
    Sowa and Daniel Borkmann.

 4) Currently we only have a "set" operation for the hw timestamp socket
    ioctl, add a "get" operation to match.  From Ben Hutchings.

 5) Add better trace events for debugging driver datapath problems, also
    from Ben Hutchings.

 6) Implement auto corking in TCP, from Eric Dumazet.  Basically, if we
    have a small send and a previous packet is already in the qdisc or
    device queue, defer until TX completion or we get more data.

 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.

 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
    Borkmann.

 9) Share IP header compression code between Bluetooth and IEEE802154
    layers, from Jukka Rissanen.

10) Fix ipv6 router reachability probing, from Jiri Benc.

11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.

12) Support tunneling in GRO layer, from Jerry Chu.

13) Allow bonding to be configured fully using netlink, from Scott
    Feldman.

14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
    already get the TCI.  From Atzm Watanabe.

15) New "Heavy Hitter" qdisc, from Terry Lam.

16) Significantly improve the IPSEC support in pktgen, from Fan Du.

17) Allow ipv4 tunnels to cache routes, just like sockets.  From Tom
    Herbert.

18) Add Proportional Integral Enhanced packet scheduler, from Vijay
    Subramanian.

19) Allow openvswitch to mmap'd netlink, from Thomas Graf.

20) Key TCP metrics blobs also by source address, not just destination
    address.  From Christoph Paasch.

21) Support 10G in generic phylib.  From Andy Fleming.

22) Try to short-circuit GRO flow compares using device provided RX
    hash, if provided.  From Tom Herbert.

The wireless and netfilter folks have been busy little bees too.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
  net/cxgb4: Fix referencing freed adapter
  ipv6: reallocate addrconf router for ipv6 address when lo device up
  fib_frontend: fix possible NULL pointer dereference
  rtnetlink: remove IFLA_BOND_SLAVE definition
  rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
  qlcnic: update version to 5.3.55
  qlcnic: Enhance logic to calculate msix vectors.
  qlcnic: Refactor interrupt coalescing code for all adapters.
  qlcnic: Update poll controller code path
  qlcnic: Interrupt code cleanup
  qlcnic: Enhance Tx timeout debugging.
  qlcnic: Use bool for rx_mac_learn.
  bonding: fix u64 division
  rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
  sfc: Use the correct maximum TX DMA ring size for SFC9100
  Add Shradha Shah as the sfc driver maintainer.
  net/vxlan: Share RX skb de-marking and checksum checks with ovs
  tulip: cleanup by using ARRAY_SIZE()
  ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
  net/cxgb4: Don't retrieve stats during recovery
  ...
2014-01-25 11:17:34 -08:00
Matan Barak
4de6580360 mlx4_core: Add support for steerable IB UD QPs
This patch adds support for allocating IB UD QPs that we can steer
traffic from.  We introduce a new firmware command FLOW_STEERING_IB_UC_QP_RANGE
and a capability bit.

This command isn't supported for VFs.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-14 14:06:50 -08:00
Jack Morgenstein
7c6d74d23a mlx4_core: Roll back round robin bitmap allocation commit for CQs, SRQs, and MPTs
Commit f4ec9e9 "mlx4_core: Change bitmap allocator to work in round-robin fashion"
introduced round-robin allocation (via bitmap) for all resources which allocate
via a bitmap.

Round robin allocation is desirable for mcgs, counters, pd's, UARs, and xrcds.
These are simply numbers, with no involvement of ICM memory mapping.

Round robin is required for QPs, since we had a problem with immediate
reuse of a 24-bit QP number (commit f4ec9e9).

However, for other resources which use the bitmap allocator and involve
mapping ICM memory -- MPTs, CQs, SRQs -- round-robin is not desirable.

What happens in these cases is the following:

ICM memory is allocated and mapped in chunks of 256K.

Since the resource allocation index goes up monotonically, the allocator
will eventually require mapping a new chunk. Now, chunks are also unmapped
when their reference count goes back to zero.  Thus, if a single app is
running and starts/exits frequently we will have the following situation:

When the app starts, a new chunk must be allocated and mapped.

When the app exits, the chunk reference count goes back to zero, and the
chunk is unmapped and freed. Therefore, the app must pay the cost of allocation
and mapping of ICM memory each time it runs (although the price is paid only when
allocating the initial entry in the new chunk).

For apps which allocate MPTs/SRQs/CQs and which operate as described above,
this presented a performance problem.

We therefore roll back the round-robin allocator modification for MPTs, CQs, SRQs.

Reported-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09 21:12:13 -05:00
Jack Morgenstein
146f3ef4a1 net/mlx4_core: Implement resource quota enforcement
Implements resource quota grant decision when resources are requested,
for the following resources:  QPs, CQs, SRQs, MPTs, MTTs, vlans, MACs,
and Counters.

When granting a resource, the quota system increases the allocated-count
for that slave.

When the slave later frees the resource, its allocated-count is reduced.

A spinlock is used to protect the integrity of each resource's free-pool counter.
(One slave may be in the process of being granted a resource while another
slave has crashed, initiating cleanup of that slave's resource quotas).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:08 -05:00
Jack Morgenstein
5a0d0a6161 mlx4: Structures and init/teardown for VF resource quotas
This is step #1 for implementing SRIOV resource quotas for VFs.

Quotas are implemented per resource type for VFs and the PF, to prevent
any entity from simply grabbing all the resources for itself and leaving
the other entities unable to obtain such resources.

Resources which are allocated using quotas:  QPs, CQs, SRQs, MPTs, MTTs, MAC,
                                             VLAN, and Counters.

The quota system works as follows:
Each entity (VF or PF) is given a max number of a given resource (its quota),
and a guaranteed minimum number for each resource (starvation prevention).

For QPs, CQs, SRQs, MPTs and MTTs:
50% of the available quantity for the resource is divided equally among
the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module
parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool.
The other 50% is the "free pool", allocated on a first-come-first-serve basis.
For each VF/PF, resources are first allocated from its "guaranteed-minimum"
pool. When that pool is exhausted, the driver attempts to allocate from
the resource "free-pool".

The quota (i.e., max) for the VFs and the PF is:
  The free-pool amount (50% of the real max) + the guaranteed minimum

For MACs:
  Guarantee 2 MACs per VF/PF per port. As a result, since we have only
  128 MACs per port, reduce the allowable number of VFs from 64 to 63.
  Any remaining MACs are put into a free pool.

For VLANs:
  For the PF, the per-port quota is 128 and guarantee is 64
     (to allow the PF to register at least a VLAN per VF in VST mode).
  For the VFs, the per-port quota is 64 and the guarantee is 0.
      We assume that VGT VFs are trusted not to abuse the VLAN resource.

For Counters:
  For all functions (PF and VFs), the quota is 128 and the guarantee is 0.

In this patch, we define the needed structures, which are added to the
resource-tracker struct.  In addition, we do initialization
for the resource quota, and adjust the query_device response to use quotas
rather than resource maxima.

As part of the implementation, we introduce a new field in
mlx4_dev: quotas.  This field holds the resource quotas used
to report maxima to the upper layers (ib_core, via query_device).

The HCA maxima of these values are passed to the VFs (via
QUERY_HCA) so that they may continue to use these in handling
QPs, CQs, SRQs and MPTs.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:07 -05:00
Jack Morgenstein
2c957ff27d net/mlx4_core: Don't fail reg/unreg vlan for older guests
In upstream kernels under SRIOV, the vlan register/unregister calls
were NOPs (doing nothing and returning OK). We detect these old
calls from guests (via the comm channel), since previously the
port number in mlx4_register_vlan was passed (improperly) in the
out_param. This has been corrected so that the port number is now
passed in bits 8..15 of the in_modifier field.

For old calls, these bits will be zero, so if the passed port
number is zero, we can still look at the out_param field to see
if it contains a valid port number. If yes, the VM is running
an old driver.

Since for old drivers, the register/unregister_vlan wrappers were
NOPs, we continue this policy -- the reason being that upstream
had an additional bug in eth driver running on guests (where
procedure mlx4_en_vlan_rx_kill_vid() had the following code:

if (!mlx4_find_cached_vlan(mdev->dev, priv->port, vid, &idx))
        mlx4_unregister_vlan(mdev->dev, priv->port, idx);
else
        en_err(priv, "could not find vid %d in cache\n", vid);

On a VM, mlx4_find_cached_vlan() will always fail, since the
vlan cache is located on the Hypervisor; on guests it is empty.

Therefore, if we allow upstream guests to register vlans, we will
have vlan leakage since the unregister will never be performed.
Leaving vlan reg/unreg for old guest drivers as a NOP is not a
feature regression, since in upstream the register/unregister
vlan wrapper is a NOP.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:07 -05:00
Jack Morgenstein
2009d0059c net/mlx4_en: Use vlan id instead of vlan index for unregistration
Use of vlan_index created problems unregistering vlans on guests.

In addition, tools delete vlan by tag, not by index, lets follow that.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-04 16:19:07 -05:00
Yevgeny Petrilin
fe6f700d6c net/mlx4_core: Respond to operation request by firmware
This commit adds new firmware command and new firmware event.  The firmware
raises the MLX4_EVENT_TYPE_OP_REQUIRED event in order to signal the driver it
needs to perform an administrative operation throughout the MLX4_CMD_GET_OP_REQ
command. At the moment the supported operation is adding/removing multicast
entries which are used by the firmware for handling NCSI traffic in B0
steering mode.

Also, had to swap the order of mlx4_init_mcg_table() and
mlx4_init_eq_table() to make sure that driver will get events only after
resources are initialized to handle it.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-29 01:12:40 -07:00
Rony Efraim
0a6eac2458 net/mlx4_core: Add HW enforcement to VF link state
When the firmware supports the UPDATE_QP command, if the VF link is disabled,
block all QPs opened by the VF, by programming the UPDATE_QP command to drop
all RX & TX traffic to/from these QPs. Operates only in VST mode.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 13:10:57 -07:00
Jack Morgenstein
b01978cacf net/mlx4_core: Dynamic VST to VST vlan/qos changes
Within VST mode, enable modifying the vlan and/or qos
for a VF without requiring unbind/rebind.

This requires firmware which supports the UPDATE_QP command.
(If the command is not available, we fall back to requiring
unbind/bind to activate these changes).

To avoid race conditions with modify-qp on QPs that are affected
by update-qp, this operation is performed on the comm_wq.

If the update operation succeeds for all the necessary QPs, a
vlan_unregister is performed for the abandoned vlan id.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-07-01 13:10:22 -07:00
Rony Efraim
948e306d7d net/mlx4: Add VF link state support
Add support to change the link state of VF (vPort)

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 17:51:04 -07:00
Linus Torvalds
e0fd9affeb InfiniBand/RDMA changes for the 3.10 merge window:
- XRC transport fixes
  - Fix DHCP on IPoIB
  - mlx4 preparations for flow steering
  - iSER fixes
  - miscellaneous other fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJRisChAAoJEENa44ZhAt0hHLoP/iX5MxtHJ3X1u5KcARX/7nci
 CCH/VnD172re/KavCCg7zkbZQpS4jHCtW/CzLUCSPqBGOaj78HFTUkB3ragvUW7m
 ndERwplF8DP/i0x7Kk7Wau2a4RdlH0lqwucOjqTyQnbIdxknkz6w3Jcsb9Ic2lzx
 up0T0HaHtTxdVF6lXOB5QOIpUGg3l0Yu4euX2BA61WDoZj+VIYhgeWZewq0iV0D1
 rLtarJ+Or7mdwu2rNcDHgrD0lhF4SCBd3rx4lbc4F68Cr8JUz0Xe7liPLNskeLhW
 f3NEm3gmkYp9YI1otGsA0X/CyV6wnRk4mT8JMlOb2WNzeq2V13Z54/9ZyF5/gFD2
 JgzkQB9Ibf7EmTwXWd6+0+FA40Q6dNvnRnhddRM255dvDVw7nxUr2UzYH4Re/Z9K
 rNFjkvix2YUwEmoPjitWocz2kj2reDMqjtiVDmdGy1YbtnicH5GtkQsWkoPg8ON9
 m5jORUdzydTD+yBJwTiFP1EuFoG3TdfoZ7zHMJwWy/u8i308xD6WPGms9MTdjh8j
 7gjz2TCKr+vpuVRh/p6esCPPOTSsSeWDeowy7Sgpdf3qoqAImXsWXVrl2kXLhtyl
 1VIgHU3ztm7oqwmy0gQ/zVCo4CLLdsif2zmEIDpxJPnWaSq+D9LJdyLTcfdBjhQ8
 9SjUafe4msT1pIjNb7ND
 =1ojD
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull InfiniBand/RDMA changes from Roland Dreier:
 - XRC transport fixes
 - Fix DHCP on IPoIB
 - mlx4 preparations for flow steering
 - iSER fixes
 - miscellaneous other fixes

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (23 commits)
  IB/iser: Add support for iser CM REQ additional info
  IB/iser: Return error to upper layers on EAGAIN registration failures
  IB/iser: Move informational messages from error to info level
  IB/iser: Add module version
  mlx4_core: Expose a few helpers to fill DMFS HW strucutures
  mlx4_core: Directly expose fields of DMFS HW rule control segment
  mlx4_core: Change a few DMFS fields names to match firmare spec
  mlx4: Match DMFS promiscuous field names to firmware spec
  mlx4_core: Move DMFS HW structs to common header file
  IB/mlx4: Set link type for RAW PACKET QPs in the QP context
  IB/mlx4: Disable VLAN stripping for RAW PACKET QPs
  mlx4_core: Reduce warning message for SRQ_LIMIT event to debug level
  RDMA/iwcm: Don't touch cmid after dropping reference
  IB/qib: Correct qib_verbs_register_sysfs() error handling
  IB/ipath: Correct ipath_verbs_register_sysfs() error handling
  RDMA/cxgb4: Fix SQ allocation when on-chip SQ is disabled
  SRPT: Fix odd use of WARN_ON()
  IPoIB: Fix ipoib_hard_header() return value
  RDMA: Rename random32() to prandom_u32()
  RDMA/cxgb3: Fix uninitialized variable
  ...
2013-05-08 15:29:48 -07:00
Rony Efraim
3f7fb021d0 net/mlx4: Add set VF default vlan ID and priority support
Add support to ndo_set_vf_vlan in the driver. Once this call is used the vport
is considered to be in VST mode. In this mode, the PPF driver configures
Ethernet QPs created by this VF to use this vlan id and priority. Currently
RoCE isn't supported on that mode.

The special values of VID=4095 or VID=0,UP=0 are considered as VGT.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-26 23:29:13 -04:00
Rony Efraim
0eb62b93cb net/mlx4: Add structures to keep VF Ethernet ports information
This patch add struct mlx4_vport_state where all the parameters related
to management of VFs port (virtual ports of the NIC eswitch) are kept.

The driver keeps an administrative and operational copy of the settings.
The current administrative copy becomes operational on the event of probing
a VF either on a VM or on the host.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-26 23:29:13 -04:00
Hadar Hen Zion
3cd0e1789a mlx4_core: Move DMFS HW structs to common header file
Move flow steering HW structures to be on the public mlx4 include
directory, as a pre-step for the mlx4 IB driver to use them too.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-24 17:51:28 -07:00
Eugenia Emantayev
ddd8a6c12d net/mlx4_core: Read HCA frequency and map internal clock
Read HCA frequency, read PCI clock bar and offset, map internal clock to
PCI bar.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-24 16:30:13 -04:00
Hadar Hen Zion
fd91c49fb0 net/mlx4_core: Add helper function to translate B0 steering rules to DMFS
A pre-step for supporting guests that use B0 steering over a hypervisor
that runs in DMFS (device managed flow steering mode). Add helper function
which allows to translate L2 attachments / detachments provided in B0 mode
to DMFS rules.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-11 16:12:40 -04:00
Jack Morgenstein
e7dbeba856 net/mlx4_core: Fix endianness bug in set_param_l
The set_param_l function assumes casting a u64 pointer to a u32 pointer
allows to access the lower 32bits, but it results in writing the upper
32 bits on big endian systems.

The fixed function reads the upper 32 bits of the 64 argument, and or's
them with the 32 bits of the 32-bit value passed to the function.

Since this is now a "read-modify-write" operation, we got many
"unintialized variable" warnings which needed to be fixed as well.

Reported-by: Alexander Schmidt <alexschm@de.ibm.com>.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-07 15:52:03 -05:00
Linus Torvalds
70a3a06d01 Main batch of InfiniBand/RDMA changes for 3.9:
- SRP error handling fixes from Bart Van Assche
  - Implementation of memory windows for mlx4 from Shani Michaeli
  - Lots of cxgb4 HW driver fixes from Vipul Pandya
  - Make iSER work for virtual functions, other fixes from Or Gerlitz
  - Fix for bug in qib HW driver from Mike Marciniszyn
  - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman
  - Various cleanups and warning fixes from Julia Lawall, Paul Bolle, Wei Yongjun
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJRLPNoAAoJEENa44ZhAt0h0bMP/1xqlgDR3DGdoUoV4yTd6O9G
 Uccdb7og5o5tedVo+8xZz01y4at99P8FWW4hJ4os6k8n6yKoBVwo7qjN+BOR+JG5
 q8+O+ynUSIg4tGrb5sXcMnKXAXbw/vkftMWYNA41cbrM24DTYzB/2SLpvhwbFoTT
 tdQc2tgz5QaDqzWbagyCR4+k/IgO+Llrz/RvIdtz4dsTnTDogN7QCoSffX8n/Lpb
 DxtyXK4sdl3DAtd3CsIdsB/TSMb3RkHLCoSvmrWlLnqMdsbRxVnCVfBm4BOghW3J
 Y2K3joRoCjjIZSRNs/i0FMFkT/jbCXg1oXg9ek/a6YFNcgyk7z8iGyXrRY7fOnno
 8U2SfxJ69YpVYeJr+DSjaeHcmjsaYU7NN7JPxzvPKcJOIsxQJ/euJDXAXau3lEQY
 o9/p4JsGty0WHi1NanyygvghvBAoP1C5/59Sl4bHH5gckPyJT1kinPSCTT76YXGS
 WkSHg2mDhiJHy7Pnuy85iZldPoy2/5z09/I4aGMeL+8kUZbD4iFqzXIJU0HTsAim
 EONoRXDhIcN5DNVSVH1ig6nJ2a7Vhov4Z0r/vB8P4KhslBcqFwf2leC0eCoe5mNt
 SzcKhqosZDXoL8AwzpntzGIOid8pWmHbUx/PgIcoVXPjtl0h2ULNIFoYYyMZ3cyU
 AyN2tSiUZVddTV1/aKGL
 =RAQw
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband update from Roland Dreier:
 "Main batch of InfiniBand/RDMA changes for 3.9:

   - SRP error handling fixes from Bart Van Assche

   - Implementation of memory windows for mlx4 from Shani Michaeli

   - Lots of cxgb4 HW driver fixes from Vipul Pandya

   - Make iSER work for virtual functions, other fixes from Or Gerlitz

   - Fix for bug in qib HW driver from Mike Marciniszyn

   - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman

   - Various cleanups and warning fixes from Julia Lawall, Paul Bolle,
     Wei Yongjun"

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (41 commits)
  IB/mlx4: Advertise MW support
  IB/mlx4: Support memory window binding
  mlx4: Implement memory windows allocation and deallocation
  mlx4_core: Enable memory windows in {INIT, QUERY}_HCA
  mlx4_core: Disable memory windows for virtual functions
  IPoIB: Free ipoib neigh on path record failure so path rec queries are retried
  IB/srp: Fail I/O requests if the transport is offline
  IB/srp: Avoid endless SCSI error handling loop
  IB/srp: Avoid sending a task management function needlessly
  IB/srp: Track connection state properly
  IB/mlx4: Remove redundant NULL check before kfree
  IB/mlx4: Fix compiler warning about uninitialized 'vlan' variable
  IB/mlx4: Convert is_xxx variables in build_mlx_header() to bool
  IB/iser: Enable iser when FMRs are not supported
  IB/iser: Avoid error prints on EAGAIN registration failures
  IB/iser: Use proper define for the commands per LUN value advertised to SCSI ML
  IB/uverbs: Implement memory windows support in uverbs
  IB/core: Add "type 2" memory windows support
  mlx4_core: Propagate MR deregistration failures to caller
  mlx4_core: Rename MPT-related functions to have mpt_ prefix
  ...
2013-02-26 11:41:08 -08:00
Shani Michaeli
e448834e35 mlx4_core: Enable memory windows in {INIT, QUERY}_HCA
Add memory windows-related code to INIT_HCA and QUERY_HCA.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-25 10:44:31 -08:00
Shani Michaeli
cc1ade94ee mlx4_core: Disable memory windows for virtual functions
Do not enable memory windows allocation for virtual functions.

In addition, add a few safety checks, such as:

* Verifying the PD of a new MPT matches the VF.
* Making sure binding memory window isn't enabled for FMRs, and
  that new memory windows are not FMR themselves.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-25 10:44:31 -08:00
Shani Michaeli
b20e519a81 mlx4_core: Rename MPT-related functions to have mpt_ prefix
The MPT - Memory Protection Table - is used by both memory windows and
memory regions.  Hence, all MPT references are relevant for both types
of memory objects.  Rename the relevant functions to start with mpt_
instead of the current mr_ prefix.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-21 11:37:23 -08:00
Yan Burman
16a10ffd20 net/mlx4: Move Ethernet related functionality from mlx4_core to mlx4_en
Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en.
Also convert the new functions to deal with MACs in form of char array instead of u64.

Actual functions moved:
mlx4_replace_mac
mlx4_get_eth_qp
mlx4_put_eth_qp

To conduct this change, some functionality had to be exported from the core,
the following functions were added:
mlx4_get_base_qp
__mlx4_replace_mac (low level function for CX1/A0 compatibility)

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:26:12 -05:00
Hadar Hen Zion
23537b732f net/mlx4_core: Use firmware driven flow steering hash mode
The Firmware dynamically changes flow steering hash configuration from covering
L2 only to "full" L2/L3/L4 mode needed.  The dynamic change allows the driver
to set hard coded hash configuration which is changed by the firmware from L2
to L2/L3/L4 when attaching the first L3/L4 flow steering rule and back to L2
when there are no more such rules.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31 12:48:47 -05:00
Hadar Hen Zion
015465f851 net/mlx4_core: Directly expose fields of HW flow steering rule control segment
Some of the fields for struct mlx4_net_trans_rule_hw_ctrl were packed into u32
and accessed through bit field operations. Expose and access them directly as
u8.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31 12:48:46 -05:00
Jack Morgenstein
3c439b5586 mlx4_core: Allow choosing flow steering mode
Device managed flow steering will be enabled only under administrator
directive provided through setting the existing module parameter
log_num_mgm_entry_size to -1 (if the device actually supports flow
steering).  If flow steering isn't requested or not available, the
driver will use the value of log_num_mgm_entry_size and B0 steering.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-12-19 11:47:22 -08:00
Roland Dreier
ca3e57a599 mlx4_core: Clean up enabling of SENSE_PORT for older (ConnectX-1/-2) HCAs
Instead of having a hard-coded "PCI device ID != 0x1003" (which
obviously breaks as newer devices with ID != 0x1003 become available),
instead let's set a flag in our PCI device table for the older devices
where we're supposed to force using SENSE_PORT.  This also avoids
enabling SENSE_PORT for virtual functions by mistake.

Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-10-01 02:10:44 -07:00