Commit Graph

227 Commits

Author SHA1 Message Date
Gary R Hook
0b93a6dbec ntb: Remove debug-fs variables from the context structure
The Debug FS entries manage themselves; we don't need to hang onto
them in the context structure.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Gary R Hook
e9410ff810 ntb: Add a module option to control affinity of DMA channels
The DMA channel(s)/memory used to transfer data to an NTB device
may not be required to be on the same node as the device. Add a
module parameter that allows any candidate channel (aside from
node assocation) and allocated memory to be used.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Serge Semin
bf2a952d31 NTB: Add IDT 89HPESxNTx PCIe-switches support
IDT 89HPESxNTx device series is PCIe-switches, which support
Non-Transparent bridging between domains connected to the device ports.
Since new NTB API exposes multi-port interface and messaging API, the
IDT NT-functions can be now supported in the kernel. This driver adds
the following functionality:
1) Multi-port NTB API to have information of possible NT-functions
activated in compliance with available device ports.
2) Memory windows of direct and look up table based address translation
with all possible combinations of BARs setup.
3) Traditional doorbell NTB API.
4) One-on-one messaging NTB API.

There are some IDT PCIe-switch setups, which must be done before any of
the NTB peers started. It can be performed either by system BIOS via
IDT SMBus-slave interface or by pre-initialized IDT PCIe-switch EEPROM:
1) NT-functions of corresponding ports must be activated using
SWPARTxCTL and SWPORTxCTL registers.
2) BAR0 must be configured to expose NT-function configuration
registers map.
3) The rest of the BARs must have at least one memory window
configured, otherwise the driver will just return an error.
Temperature sensor of IDT PCIe-switches can be also optionally
activated by BIOS or EEPROM.
(See IDT documentations for details of how the pre-initialization can
be done)

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Logan Gunthorpe
48ea02184a ntb_hw_intel: Style fixes: open code macros that just obfuscate code
As per a comments in [1] by Greg Kroah-Hartman, the ndev_* macros should
be cleaned up. This makes it more clear what's actually going on when
reading the code.

[1] http://www.spinics.net/lists/linux-pci/msg56904.html

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Logan Gunthorpe
0f9bfb979a ntb_hw_amd: Style fixes: open code macros that just obfuscate code
As per a comments in [1] by Greg Kroah-Hartman, the ndev_* macros should
be cleaned up. This makes it more clear what's actually going on when
reading the code.

[1] http://www.spinics.net/lists/linux-pci/msg56904.html

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Serge Semin
bc3e49adc2 NTB: Add Messaging NTB API
Some IDT NTB-capable PCIe-switches have message registers to communicate with
peer devices. This patch adds new NTB API callback methods, which can be used
to utilize these registers functionality:
 ntb_msg_count(); - get number of message registers
 ntb_msg_inbits(); - get bitfield of inbound message registers status
 ntb_msg_outbits(); - get bitfield of outbound message registers status
 ntb_msg_read_sts(); - read the inbound and outbound message registers status
 ntb_msg_clear_sts(); - clear status bits of message registers
 ntb_msg_set_mask(); - mask interrupts raised by status bits of message
registers.
 ntb_msg_clear_mask(); - clear interrupts mask bits of message registers
 ntb_msg_read(midx, *pidx); - read message register with specified index,
additionally getting peer port index which data received from
 ntb_msg_write(midx, pidx); - write data to the specified message register
sending it to the passed peer device connected over a pidx port
 ntb_msg_event(); - notify driver context of a new message event

Of course there is hardware which doesn't support Message registers, so
this API is made optional.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Serge Semin
d67288a395 NTB: Alter Scratchpads API to support multi-ports devices
Even though there is no any real NTB hardware, which would have both more
than two ports and Scratchpad registers, it is logically correct to have
Scratchpad API accepting a peer port index as well. Intel/AMD drivers utilize
Primary and Secondary topology to split Scratchpad between connected root
devices. Since port-index API introduced, Intel/AMD NTB hardware drivers can
use device port to determine which Scratchpad registers actually belong to
local and peer devices. The same approach can be used if some potential
hardware in future will be multi-port and have some set of Scratchpads.
Here are the brief of changes in the API:
 ntb_spad_count() - return number of Scratchpads per each port
 ntb_peer_spad_addr(pidx, sidx) - address of Scratchpad register of the
peer device with pidx-index
 ntb_peer_spad_read(pidx, sidx) - read specified Scratchpad register of the
peer with pidx-index
 ntb_peer_spad_write(pidx, sidx) - write data to Scratchpad register of the
peer with pidx-index

Since there is hardware which doesn't support Scratchpad registers, the
corresponding API methods are now made optional.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Serge Semin
443b9a14ec NTB: Alter MW API to support multi-ports devices
Multi-port NTB devices permit to share a memory between all accessible peers.
Memory Windows API is altered to correspondingly initialize and map memory
windows for such devices:
 ntb_mw_count(pidx); - number of inbound memory windows, which can be allocated
for shared buffer with specified peer device.
 ntb_mw_get_align(pidx, widx); - get alignment and size restriction parameters
to properly allocate inbound memory region.
 ntb_peer_mw_count(); - get number of outbound memory windows.
 ntb_peer_mw_get_addr(widx); - get mapping address of an outbound memory window

If hardware supports inbound translation configured on the local ntb port:
 ntb_mw_set_trans(pidx, widx); - set translation address of allocated inbound
memory window so a peer device could access it.
 ntb_mw_clear_trans(pidx, widx); - clear the translation address of an inbound
memory window.

If hardware supports outbound translation configured on the peer ntb port:
 ntb_peer_mw_set_trans(pidx, widx); - set translation address of a memory
window retrieved from a peer device
 ntb_peer_mw_clear_trans(pidx, widx); - clear the translation address of an
outbound memory window

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Serge Semin
4e8c11b7fd NTB: Alter link-state API to support multi-port devices
Multi-port devices permit the NTB connections between multiple domains,
so a local device can have NTB link being up with one peer and being
down with another. NTB link-state API is appropriately altered to return
a bitfield of the link-states between the local device and possible peers.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Serge Semin
1e5301196a NTB: Add indexed ports NTB API
There is some NTB hardware, which can combine more than just two domains
over NTB. For instance, some IDT PCIe-switches can have NTB-functions
activated on more than two-ports. The different domains are distinguished
by ports they are connected to. So the new port-related methods are added to
the NTB API:
 ntb_port_number() - return local port
 ntb_peer_port_count() - return number of peers local port can connect to
 ntb_peer_port_number(pdix) - return port number by it index
 ntb_peer_port_idx(port) - return port index by it number

Current test-drivers aren't changed much. They still support two-ports devices
for the time being while multi-ports hardware drivers aren't added.

By default port-related API is declared for two-ports hardware.
So corresponding hardware drivers won't need to implement it.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-07-06 11:30:07 -04:00
Allen Hubbe
88931ec3dc ntb: no sleep in ntb_async_tx_submit
Do not sleep in ntb_async_tx_submit, which could deadlock.
This reverts commit "8c874cc140d667f84ae4642bb5b5e0d6396d2ca4"

Fixes: 8c874cc140 ("NTB: Address out of DMA descriptor issue with NTB")
Reported-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Dave Jiang
5eb449e15d ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits
Fixing doorbell register length to 32bits per spec. On Skylake NTB, the
doorbell registers are 32bit write only registers. The source for the
doorbell is a 64bit register that shows the interrupt bits.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 783dfa6cc4 ("ntb: Adding Skylake Xeon NTB support")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Logan Gunthorpe
8e8496e0e9 ntb_transport: fix bug calculating num_qps_mw
A divide by zero error occurs if qp_count is less than mw_count because
num_qps_mw is calculated to be zero. The calculation appears to be
incorrect.

The requirement is for num_qps_mw to be set to qp_count / mw_count
with any remainder divided among the earlier mws.

For example, if mw_count is 5 and qp_count is 12 then mws 0 and 1
will have 3 qps per window and mws 2 through 4 will have 2 qps per window.
Thus, when mw_num < qp_count % mw_count, num_qps_mw is 1 higher
than when mw_num >= qp_count.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Logan Gunthorpe
cb827ee6cc ntb_transport: fix qp count bug
In cases where there are more mw's than spads/2-2, the mw count gets
reduced to match the limitation. ntb_transport also tries to ensure that
there are fewer qps than mws but uses the full mw count instead of
the reduced one. When this happens, the math in
'ntb_transport_setup_qp_mw' will get confused and result in a kernel
paging request bug.

This patch fixes the bug by reducing qp_count to the reduced mw count
instead of the full mw count.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Gary R Hook
94fc795454 ntb: Correct modinfo usage statement for ntb_perf
The order parameters are powers of 2; adjust the usage information
to use correct mathematical representations.

Signed-off-by: Gary R Hook <gary.hook@amd.com>
Fixes: 8a7b6a778a ("ntb: ntb perf tool")
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-06-19 14:24:41 -04:00
Dave Jiang
939ada5fb5 ntb: ntb_hw_intel: link_poll isn't clearing the pending status properly
On Skylake hardware, the link_poll isn't clearing the pending interrupt
bit.  Adding a new function for SKX that handles clearing of status bit the
right way.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 783dfa6c ("ntb: Adding Skylake Xeon NTB support")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-02-16 23:11:26 -05:00
Thomas VanSelus
8fcd0950c0 ntb_transport: Pick an unused queue
Fix typo causing ntb_transport_create_queue to select the first
queue every time, instead of using the next free queue.

Signed-off-by: Thomas VanSelus <tvanselus@xes-inc.com>
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Fixes: fce8a7bb5 ("PCI-Express Non-Transparent Bridge Support")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-02-16 23:11:26 -05:00
Dave Jiang
9644347c52 ntb: ntb_perf missing dmaengine_unmap_put
In the normal I/O execution path, ntb_perf is missing a call to
dmaengine_unmap_put() after submission. That causes us to leak
unmap objects.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 8a7b6a77 ("ntb: ntb perf tool")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-02-16 23:11:26 -05:00
Allen Hubbe
dd62245e73 NTB: ntb_transport: fix debugfs_remove_recursive
The call to debugfs_remove_recursive(qp->debugfs_dir) of the sub-level
directory must not be later than
debugfs_remove_recursive(nt_debugfs_dir) of the top-level directory.
Otherwise, the sub-level directory will not exist, and it would be
invalid (panic) to attempt to remove it.  This removes the top-level
directory last, after sub-level directories have been cleaned up.

Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com>
Fixes: e26a5843f ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2017-02-16 23:10:52 -05:00
Steve Wahl
dfb7d24c5a ntb_transport: Remove unnecessary call to ntb_peer_spad_read
The results were previously ignored, anyway.

Signed-off-by: Steve Wahl <Steve.Wahl@dell.com>
Fixes: e26a5843f7
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 16:11:07 -05:00
Christophe JAILLET
28734e8f69 NTB: Fix 'request_irq()' and 'free_irq()' inconsistancy
'request_irq()' and 'free_irq()' should have the same 'dev_id'.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 16:11:03 -05:00
Dave Jiang
09e71a6f13 ntb: fix SKX NTB config space size register offsets
The offsets for the SZ registers are wrong. Updated.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reported-by: Sandeep Mann <sandeep@purestorage.com>
Tested-by: Zachary Ross <zacharyx.ross@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 16:10:54 -05:00
Shyam Sundar S K
b17faba03f ntb_transport: Limit memory windows based on available, scratchpads
When the underlying NTB H/W driver advertises more memory windows
than the number of scratchpads available to setup MW's, it is likely
that we may end up filling the remaining memory windows with garbage.
So to avoid that, lets limit the memory windows that transport driver
can setup based on the available scratchpads.

Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 16:10:35 -05:00
Shyam Sundar S K
872deb2103 NTB: Register and offset values fix for memory window
Due to incorrect limit and translation register values, NTB link was
going down when the memory window was setup. Made appropriate changes
as per spec.

Fix limit register values for BAR1, which was overlapping
with the BAR23 address.

Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 16:09:18 -05:00
Xiangliang Yu
e5b0d2d1ba NTB: add support for hotplug feature
AMD NTB support hotplug under B2B mode. NTB will trigger link
up/down interrupt event when doing plug add/remove, this patch
implements the two interrupt event to support B2B hotplug function.

Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 16:09:15 -05:00
Dave Jiang
783dfa6cc4 ntb: Adding Skylake Xeon NTB support
The Skylake Xeon NTB hardware has made some changes to the register name,
offset, and the way doorbells work. Adding driver support for the new
hardware.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-12-23 16:09:10 -05:00
Dan Carpenter
819baf8859 ntb_perf: potential info leak in debugfs
This is a static checker warning, not something I'm desperately
concerned about.  But snprintf() returns the number of bytes that
would have been copied if there were space.  We really care about the
number of bytes that actually were copied so we should use scnprintf()
instead.

It probably won't overrun, and in that case we may as well just use
sprintf() but these sorts of things make static checkers and code
reviewers happier.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-11-13 16:48:30 -05:00
Dave Jiang
25ea9f2bf5 ntb: ntb_hw_intel: init peer_addr in struct intel_ntb_dev
The peer_addr member of intel_ntb_dev is not set, therefore when
acquiring ntb_peer_db and ntb_peer_spad we only get the offset rather
than the actual physical address. Adding fix to correct that.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-11-13 16:48:29 -05:00
Nicholas Mc Guire
cdc08982a5 ntb: make DMA_OUT_RESOURCE_TO HZ independent
schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent, so pass it
through msecs_to_jiffies() to fix this up.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-11-13 16:48:29 -05:00
Nicholas Mc Guire
c0a88032ef ntb_transport: make DMA_OUT_RESOURCE_TO HZ independent
schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent, so pass it
through msecs_to_jiffies() to fix this up.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-11-13 16:48:29 -05:00
Wei Yongjun
49b89de41f NTB: ntb_hw_intel: Fix typo in module parameter descriptions
Fix typo in module parameter descriptions.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-11-13 16:48:29 -05:00
Wei Yongjun
cedecbc5e0 ntb_pingpong: Fix db_init parameter description
Fix 'db_init' parameter description.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-11-13 16:48:29 -05:00
Dave Jiang
72203572af ntb: add DMA error handling for RX DMA
Adding support on the rx DMA path to allow recovery of errors when
DMA responds with error status and abort all the subsequent ops.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: linux-ntb@googlegroups.com
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-08-08 08:11:42 +05:30
Dave Jiang
9cabc2691e ntb: add DMA error handling for TX DMA
Adding support on the tx DMA path to allow recovery of errors when
DMA responds with error status and abort all the subsequent ops.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: linux-ntb@googlegroups.com
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2016-08-08 08:11:42 +05:30
Allen Hubbe
95f1464f69 NTB: ntb_hw_intel: use local variable pdev
Clean up duplicated expression by replacing it with the equivalent local
variable pdev.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:34:13 -04:00
Allen Hubbe
4089527388 NTB: ntb_hw_intel: show BAR size in debugfs info
It will be useful to know the hardware configured BAR size to diagnose
issues with NTB memory windows.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:33:47 -04:00
Logan Gunthorpe
35539b54ac ntb_perf: clear link_is_up flag when the link goes down.
When the link goes down, the link_is_up flag did not return to
false. This could have caused some subtle corner case bugs
when the link goes up and down quickly.

Once that was fixed, there was found to be a race if the link was
brought down then immediately up. The link_cleanup work would
occasionally be scheduled after the next link up event. This would
cancel the link_work that was supposed to occur and leave ntb_perf
in an unusable state.

To fix this we get rid of the link_cleanup work and put the actions
directly in the link_down event.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:08 -04:00
Logan Gunthorpe
20572ee1c5 ntb_pingpong: Add a debugfs file to get the ping count
This commit adds a debugfs 'count' file to ntb_pingpong. This is so
testing with ntb_pingpong can be automated beyond just checking the
logs for pong messages.

The count file returns a number which increments every pong. The
counter can be cleared by writing a zero.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:07 -04:00
Logan Gunthorpe
bfcaa39652 ntb_tool: Add link status and files to debugfs
In order to more successfully script with ntb_tool it's useful to
have a link file to check the link status so that the script
doesn't use the other files until the link is up.

This commit adds a 'link' file to the debugfs directory which reads
boolean (Y or N) depending on the link status. Writing to the file
change the link state using ntb_link_enable or ntb_link_disable.

A 'link_event' file is also provided so an application can block until
the link changes to the desired state. If the user writes a 1, it will
block until the link is up. If the user writes a 0, it will block until
the link is down.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:07 -04:00
Logan Gunthorpe
717146a2a8 ntb_tool: Postpone memory window initialization for the user
In order to make the interface closer to the raw NTB API, this commit
changes memory windows so they are not initialized on link up.
Instead, the 'peer_trans*' debugfs files are introduced. When read,
they return information provided by ntb_mw_get_range. When written,
they create a buffer and initialize the memory window. The
value written is taken as the requested size of the buffer (which
is then rounded for alignment). Writing a value of zero frees the buffer
and tears down the memory window translation. The 'peer_mw*' file is
only created once the memory window translation is setup by the user.

Additionally, it was noticed that the read and write functions for the
'peer_mw*' files should have checked for a NULL pointer.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:07 -04:00
Logan Gunthorpe
26dc638ae6 ntb_perf: Wait for link before running test
Instead of returning immediately with an error when the link is
down, wait for the link to come up (or the user sends a SIGINT).

This is to make scripting ntb_perf easier.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:07 -04:00
Logan Gunthorpe
58fd0f3b15 ntb_perf: Return results by reading the run file
Instead of having to watch logs, allow the results to be retrieved
by reading back the run file. This file will return "running" when
the test is running and nothing if no tests have been run yet.
It returns 1 line per thread, and will display an error message if the
corresponding thread returns an error.

With the above change, the pr_info calls that returned the results are
then changed to pr_debug calls.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:07 -04:00
Logan Gunthorpe
da573eaa3a ntb_perf: Improve thread handling to increase robustness
This commit accomplishes a few things:

1) Properly prevent multiple sets of threads from running at once using
a mutex. Lots of race issues existed with the thread_cleanup.

2) The mutex allows us to ensure that threads are finished before
tearing down the device or module.

3) Don't use kthread_stop when the threads can exit by themselves, as
this is counter-indicated by the kthread_create documentation. Threads
now wait for kthread_stop to occur.

4) Writing to the run file now blocks until the threads are complete.
The test can then be safely interrupted by a SIGINT.

Also, while I was at it:

5) debugfs_run_write shouldn't return 0 in the early check cases as this
could cause debugfs_run_write to loop undesirably.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:06 -04:00
Logan Gunthorpe
fd2ecd885b ntb_perf: Schedule based on time not on performance
When debugging performance problems, if some issue causes the ntb
hardware to be significantly slower than expected, ntb_perf will
hang requiring a reboot because it only schedules once every 4GB.

Instead, schedule based on jiffies so it will not hang the CPU if
the transfer is slow.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:06 -04:00
Logan Gunthorpe
19645a0771 ntb_transport: Check the number of spads the hardware supports
I'm working on hardware that currently has a limited number of
scratchpad registers and ntb_ndev fails with no clue as to why. I
feel it is better to fail early and provide a reasonable error message
then to fail later on.

The same is done to ntb_perf, but it doesn't currently require enough
spads to actually fail. I've also removed the unused SPAD_MSG and
SPAD_ACK enums so that MAX_SPAD accurately reflects the number of
spads used.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:06 -04:00
Logan Gunthorpe
8b71d28506 ntb_tool: Add memory window debug support
We allocate some memory window buffers when the link comes up, then we
provide debugfs files to read/write each side of the link.

This is useful for debugging the mapping when writing new drivers.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:06 -04:00
Logan Gunthorpe
4aae977721 ntb_perf: Allow limiting the size of the memory windows
On my system, dma_alloc_coherent won't produce memory anywhere
near the size of the BAR. So I needed a way to limit this.

It's pretty much copied straight from ntb_transport.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:06 -04:00
Dave Jiang
a754a8fcaf NTB: allocate number transport entries depending on size of ring size
Currently we only allocate a fixed default number of descriptors for the tx
and rx side. We should dynamically resize it to the number of descriptors
resides in the transport rings. We should know the number of transmit
descriptors at initializaiton. We will allocate the default number of
descriptors for receive side and allocate additional ones when we know the
actual max entries for receive.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <allen.hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:21:05 -04:00
Logan Gunthorpe
625f0802e8 ntb_tool: BUG: Ensure the buffer size is large enough to return all spads
On hardware with 32 scratchpad registers the spad field in ntb tool
could chop off the end. The maximum buffer size is increased from
256 to 15 times the number or scratchpads.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:05:31 -04:00
Logan Gunthorpe
c792eba12c ntb_tool: Fix infinite loop bug when writing spad/peer_spad file
If you tried to write two spads in one line, as per the example:

root@peer# echo '0 0x01010101 1 0x7f7f7f7f' > $DBG_DIR/peer_spad

then the CPU would freeze in an infinite loop.

This wasn't immediately obvious but 'pos' was not incrementing the
buffer, so after reading the second pair of values, 'pos' would once
again be 3 and it would re-read the second pair of values ad infinitum.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-08-05 10:05:31 -04:00
Allen Hubbe
4f1b50c3e3 NTB: Remove _addr functions from ntb_hw_amd
Kernel zero day testing warned about address space confusion.  A virtual
iomem address was used where a physical address is expected.  The
offending functions implement an optional part of the api, so they are
removed.  They can be added later, after testing.

Fixes: a1b3695820

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Acked-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-03-26 11:44:33 -04:00
Dave Jiang
838850ee0b NTB: Fix incorrect clean up routine in ntb_perf
The clean up routine when we failed to allocate kthread is not cleaning
up all the threads, only the same one over and over again.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-03-21 19:28:30 -04:00
Dave Jiang
ddc8f6feec NTB: Fix incorrect return check in ntb_perf
kthread_create_no_node() returns error pointers, never NULL. Fix check so
it handles error correctly.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-03-21 19:28:25 -04:00
Sudip Mukherjee
2572c7fb4e ntb: fix possible NULL dereference
kmalloc can fail and we should check for NULL before using the pointer
returned by kmalloc.

Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-03-17 20:52:15 -04:00
Dave Jiang
ee5f750f1c ntb: add missing setup of translation window
The perf tool is missing the setup of translation window. Adding call to
setup the translation window for backed memory.

Signed-off-by: John Kading <john.kading@gd-ms.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-03-17 20:50:15 -04:00
Dave Jiang
84f766855f ntb: stop link work when we do not have memory
Instead of keep trying to go through the init routine when we aren't able
to allocate memory, we should just stop and go down.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-03-17 20:38:40 -04:00
Dave Jiang
e902133162 ntb: stop tasklet from spinning forever during shutdown.
We can leave tasklet spinning forever if we disable the tasklet during
qp shutdown and the tasklets are still being kicked off. This hopefully
should avoid that race condition.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reported-by: Alex Depoutovitch <alex@pernixdata.com>
Tested-by: Alex Depoutovitch <alex@pernixdata.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-03-17 20:38:40 -04:00
Arnd Bergmann
1985a88107 ntb: perf test: fix address space confusion
The ntb driver assigns between pointers an __iomem tokens, and
also casts them to 64-bit integers, which results in compiler
warnings on 32-bit systems:

drivers/ntb/test/ntb_perf.c: In function 'perf_copy':
drivers/ntb/test/ntb_perf.c:213:10: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  vbase = (u64)(u64 *)mw->vbase;
          ^
drivers/ntb/test/ntb_perf.c:214:14: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
  dst_vaddr = (u64)(u64 *)dst;
              ^

This adds __iomem annotations where needed and changes the temporary
variables to iomem pointers to avoid casting them to u64. I did not
see the problem in linux-next earlier, but it show showed up in
4.5-rc1.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 8a7b6a778a ("ntb: ntb perf tool")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-03-17 20:38:40 -04:00
Allen Hubbe
03beaec80d NTB: Fix macro parameter conflict with field name
If the parameter given to the macro is replaced throughout the macro as
it is evaluated.  The intent is that the macro parameter should replace
the only the first parameter to container_of().  However, the way the
macro was written, it would also inadvertantly replace a structure field
name.  If a parameter of any other name is given to the macro, it will
fail to compile, if the structure does not contain a field of the same
name.  At worst, it will compile, and hide improper access of an
unintended field in the structure.

Change the macro parameter name, so it does not conflict with the
structure field name.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-01-21 19:53:10 -05:00
Xiangliang Yu
a1b3695820 NTB: Add support for AMD PCI-Express Non-Transparent Bridge
This adds support for AMD's PCI-Express Non-Transparent Bridge
(NTB) device on the Zeppelin platform. The driver connnects to the
standard NTB sub-system interface, with modification to add hooks
for power management in a separate patch. The AMD NTB device has 3
memory windows, 16 doorbell, 16 scratch-pad registers, and supports
up to 16 PCIe lanes running a Gen3 speeds.

Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-01-21 19:51:04 -05:00
Dave Jiang
8a7b6a778a ntb: ntb perf tool
Providing raw performance data via a tool that directly access data from
NTB w/o any software overhead. This allows measurement of the hardware
performance limit. In revision one we are only doing single direction
CPU and DMA writes. Eventually we will provide bi-directional writes.

The measurement using DMA engine for NTB performance measure does
not measure the raw performance of DMA engine over NTB due to software
overhead. But it should provide the peak performance through the Linux DMA
driver.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-01-17 22:08:05 -05:00
Dave Jiang
8c874cc140 NTB: Address out of DMA descriptor issue with NTB
The transport right now does not handle the case where we run out of DMA
descriptors. We just fail when we do not succeed. Adding code to retry for
a bit attempting to use the DMA engine instead of instantly fail to CPU
copy.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-01-11 10:01:27 -05:00
Dave Jiang
703872c2c5 NTB: Clear property bits in BAR value
The lower bits read from a BAR register will contain property bits
that we do not care about. Clear those so that we can use the BAR
values for limit and xlat registers.

Reported-by: Conrad Meyer <cem@freebsd.org>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-01-11 09:51:17 -05:00
Jon Mason
179f912a39 NTB: ntb_process_tx error path bug
The transmit overrun avoidance error path in ntb_process_tx accidentally
swapped the first two values being passed to the tx_handler client.
This could result in crashes in the ntb_netdev (or other out-of-tree NTB
clients).

Reported-by: Alex Depoutovitch <alex@pernixdata.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2016-01-11 09:51:17 -05:00
Arnd Bergmann
fdcb4b2e78 NTB: fix 32-bit compiler warning
resource_size_t may be 32-bit wide on some architectures, which causes
this warning when building the NTB code:

drivers/ntb/ntb_transport.c: In function 'ntb_transport_link_work':
drivers/ntb/ntb_transport.c:828:46: warning: right shift count >= width of type [-Wshift-count-overflow]

The warning is harmless but can be avoided by using the upper_32_bits()
macro.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-11-08 16:24:43 -05:00
Dave Jiang
8b782fab4d NTB: unify translation addresses
There is no need for the upstream and downstream addresses to be different
for the NTB configs. Go to using a single set of address. It is still
possible to configure them differently using module parameter override
however.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked and Tested-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-11-08 16:11:21 -05:00
Jon Mason
c92ba3c5d9 NTB: invalid buf pointer in multi-MW setups
Order of operations issue with the QP Num and MW count, which would
result in the receive buffer pointer being invalid if there are more
than 1 MW.  Corrected with parenthesis to enforce the proper order of
operations.

Reported-by: John I. Kading <John.Kading@gd-ms.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-11-08 16:11:21 -05:00
Sudip Mukherjee
70d4687d60 NTB: remove unused variable
These variables were not used anywhere. So remove them.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-11-08 16:11:21 -05:00
Sudip Mukherjee
d4adee09fd NTB: fix access of free-ed pointer
We were accessing nt->mw_vec after freeing it. Fix the error path so
that we free nt->mw_vec after we have finished using it.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-11-08 16:11:21 -05:00
Dave Jiang
04afde45e0 NTB: Fix issue where we may be accessing NULL ptr
smatch detected an issue in the function ntb_transport_max_size() where
we could be dereferencing a dma channel pointer when it is NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-11-08 16:11:21 -05:00
Allen Hubbe
9a07826f99 NTB: Fix range check on memory window index
The range check must exclude the upper bound.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:27:12 -04:00
Allen Hubbe
2aa2a77a48 NTB: Improve index handling in B2B MW workaround
Check that b2b_mw_idx is in range of the number of memory windows when
initializing the device.  The workaround is considered to be in effect
only if the device b2b_idx is exactly UINT_MAX, instead of any index
past the last memory window.

Only print B2B MW workaround information in debugfs if the workaround is
in effect.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:27:12 -04:00
Dave Jiang
569410ca75 NTB: Use unique DMA channels for TX and RX
Allocate two DMA channels, one for TX operation and one for RX
operation, instead of having one DMA channel for everything. This
provides slightly better performance, and also will make error handling
cleaner later on.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:09 -04:00
Allen Hubbe
905921e748 NTB: Remove dma_sync_wait from ntb_async_rx
The dma_sync_wait can hurt the performance of workloads mixed with both
large and small frames.  Large frames will be copied using the dma
engine.  Small frames will be copied by the cpu.  The dma_sync_wait
prevents the cpu and dma engine copying in parallel.

In the period where the cpu is copying, the dma engine is stopped.  The
dma engine is not doing any useful work to copy large frames during that
time, and the additional time to restart the dma engine for the next
large frame.  This will decrease the throughput for the portion of a
workload with large frames.

In the period where the dma engine is copying, the cpu is held up
waiting for dma to complete.  The small frames processing will be
delayed until the dma is complete.  The RX frames are completed
in-order, and the processing of small frames takes very little time, so
dma_sync_wait may have an insignificant impact on the respose time of
frames.  The more significant impact is to the system, because the delay
in dma_sync_wait is implemented as busy non-blocking wait.  This can
prevent the delayed core from doing any useful work, even if it could be
processing work for other drivers, unrelated to transport RX processing.

After applying the earlier patch to fix out-of-order RX acknoledgement,
the dma_sync_wait is no longer necessary.  Remove it, so that cpu memcpy
will proceed immediately for small frames, in parallel with ongoing dma
for large frames.  Do not hold up the cpu from doing work while dma is
in progress.  The prior fix will continue to ensure in-order completion
of the RX frames to the upper layer, and in-order delivery of the RX
acknoledgement.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Dave Jiang
d98ef99e37 NTB: Clean up QP stats info
Make QP stats info more readable for debugging purposes.  Also add an
entry to indicate whether DMA is being used.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Dave Jiang
315100004f NTB: Make the transport list in order of discovery
The list should be added from the bottom and not the top in order to
ensure the transport is provided in the same order to clients as ntb
devices are discovered.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Dave Jiang
0a5d19d9f0 NTB: Add PCI Device IDs for Broadwell Xeon
Adding PCI Device IDs for B2B (back to back), RP (root port, primary),
and TB (transparent bridge, secondary) devices.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Dave Jiang
e74bfeedad NTB: Add flow control to the ntb_netdev
Right now if we push the NTB really hard, we start dropping packets due
to not able to process the packets fast enough. We need to st:qop the
upper layer from flooding us when that happens.

A timer is necessary in order to restart the queue once the resource has
been processed on the receive side. Due to the way NTB is setup, the
resources on the tx side are tied to the processing of the rx side and
there's no async way to know when the rx side has released those
resources.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-09-07 15:17:08 -04:00
Kees Cook
e15f940908 ntb: avoid format string in dev_set_name
Avoid any chance of format string expansion when calling dev_set_name.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
Allen Hubbe
30a4bb1e5a NTB: Fix dereference before check
Remove early dereference of a pointer that is checked later in the code.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
Allen Hubbe
8c9edf63e7 NTB: Fix zero size or integer overflow in ntb_set_mw
A plain 32 bit integer will overflow for values over 4GiB.

Change the plain integer size to the appropriate size type in
ntb_set_mw.  Change the type of the size parameter and two local
variables used for size.

Even if there is no overflow, a size of zero is invalid here.

Reported-by: Juyoung Jung <jjung@micron.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
Allen Hubbe
8b5a22d8f1 NTB: Schedule to receive on QP link up
Schedule to receive on QP link up, to make sure that the doorbell is
properly cleared for interrupts.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
Dave Jiang
260bee9451 NTB: Fix oops in debugfs when transport is half-up
When the remote side is not up, we do not have all the context for the
transport, and that causes NULL ptr access. Have the debugfs reads check
to see if transport is up before we make access.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:22 -04:00
Dave Jiang
c8650fd03d NTB: Fix transport stats for multiple devices
Currently the debugfs does not have files for all NTB transport queue
pairs.  When there are multiple NTBs present in a system, the QP names
of the last transport clobber the names of previously added transport
QPs.  Only the last added QPs can be observed via debugfs.

Create a directory per NTB transport to associate the QPs with that
transport.  Name the directory the same as the PCI device.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:21 -04:00
Allen Hubbe
da2e5ae561 NTB: Fix ntb_transport out-of-order RX update
It was possible for a synchronous update of the RX index in the error
case to get ahead of the asynchronous RX index update in the normal
case.  Change the RX processing to preserve an RX completion order.

There were two error cases.  First, if a buffer is not present to
receive data, there would be no queue entry to preserve the RX
completion order.  Instead of dropping the RX frame, leave the RX frame
in the ring.  Schedule RX processing when RX entries are enqueued, in
case there are RX frames waiting in the ring to be received.

Second, if a buffer is too small to receive data, drop the frame in the
ring, mark the RX entry as done, and indicate the error in the RX entry
length.  Check for a negative length in the receive callback in
ntb_netdev, and count occurrences as rx_length_errors.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-08-09 16:32:21 -04:00
Dave Jiang
bf44fe4671 NTB: Add split BAR output for debugfs stats
When split BAR is enabled, the driver needs to dump out the split BAR
registers rather than the original 64bit BAR registers.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:32 -04:00
Dave Jiang
fd839bf884 NTB: Change WARN_ON_ONCE to pr_warn_once on unsafe
The unsafe doorbell and scratchpad access should display reason when
WARN is called.  Otherwise we get a stack dump without any explanation.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:30 -04:00
Dave Jiang
7eb387813d NTB: Print driver name and version in module init
Printouts driver name and version to indicate what is being loaded.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:28 -04:00
Dave Jiang
9891417de8 NTB: Increase transport MTU to 64k from 16k
Benchmarking showed a significant performance increase with the MTU size
to 64k instead of 16k.  Change the driver default to 64k.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:27 -04:00
Dave Jiang
2f887b9a44 NTB: Rename Intel code names to platform names
Instead of using the platform code names, use the correct platform names
to identify the respective Intel NTB hardware.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:25 -04:00
Dave Jiang
a41ef053f7 NTB: Default to CPU memcpy for performance
Disable DMA usage by default, since the CPU provides much better
performance with write combining.  Provide a module parameter to enable
DMA usage when offloading the memcpy is preferred.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:24 -04:00
Dave Jiang
06917f7535 NTB: Improve performance with write combining
Changing the memory window BAR mappings to write combining significantly
boosts the performance.  We will also use memcpy that uses non-temporal
store, which showed performance improvement when doing non-cached
memcpys.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:21 -04:00
Allen Hubbe
0e041fb536 NTB: Use NUMA memory in Intel driver
Allocate memory for the NUMA node of the NTB device.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:19 -04:00
Allen Hubbe
1199aa6126 NTB: Use NUMA memory and DMA chan in transport
Allocate memory and request the DMA channel for the same NUMA node as
the NTB device.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:08:33 -04:00
Allen Hubbe
2876228941 NTB: Rate limit ntb_qp_link_work
When the ntb transport is connecting and waiting for the peer, the debug
console receives lots of debug level messages about the remote qp link
status being down.  Rate limit those messages.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:08:30 -04:00
Allen Hubbe
578b881ba9 NTB: Add tool test client
This is a simple debugging driver that enables the doorbell and
scratch pad registers to be read and written from the debugfs.  This
tool enables more complicated debugging to be scripted from user space.
This driver may be used to test that your ntb hardware and drivers are
functioning at a basic level.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:08:17 -04:00
Allen Hubbe
963de4739f NTB: Add ping pong test client
This is a simple ping pong driver that exercises the scratch pads and
doorbells of the ntb hardware.  This driver may be used to test that
your ntb hardware and drivers are functioning at a basic level.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:07:42 -04:00
Allen Hubbe
42fefc86a6 NTB: Add parameters for Intel SNB B2B addresses
Add module parameters for the addresses to be used in B2B topology.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:06:40 -04:00
Allen Hubbe
2849b5d706 NTB: Reset transport QP link stats on down
Reset the link stats when the link goes down.  In particular, the TX and
RX index and count must be reset, or else the TX side will be sending
packets to the RX side where the RX side is not expecting them.  Reset
all the stats, to be consistent.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:06:34 -04:00
Allen Hubbe
c0900b33d1 NTB: Do not advance transport RX on link down
On link down, don't advance RX index to the next entry.  The next entry
should never be valid after receiving the link down flag.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:06:24 -04:00