linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-30 14:36:46 +07:00

Author	SHA1	Message	Date
Dan Williams	3208ca52f3	ioat: driver version 4.0 A new ring implementation and the addition of raid functionality constitutes a bump in the driver major version number. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-10 11:27:36 -07:00
Maciej Sosnowski	1a5aeeecd5	dca: registering requesters in multiple dca domains This patch enables DCA support on multiple-IOH/multiple-IIO architectures. It modifies dca module by replacing single dca_providers list with dca_domains list, each domain containing separate list of providers. This approach lets dca driver manage multiple domains, i.e. sets of providers and requesters mapped back to the same PCI root complex device. The driver takes care to register each requester to a provider from the same domain. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>	2009-09-10 10:00:05 -07:00
Dan Williams	9a8de639f3	async_tx: remove HIGHMEM64G restriction This restriction prevented ASYNC_TX_DMA from being enabled on platform configurations where DMA address conversion could not be performed in place on the stack. Since commit `04ce9ab3` ("async_xor: permit callers to pass in a 'dma/page scribble' region") the async_tx api now either uses a caller provided 'scribble' buffer, or performs the conversion in place when sizeof(dma_addr_t) <= sizeof(struct page *). Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:56:37 -07:00
Nobuhiro Iwamatsu	d8902adcc1	dmaengine: sh: Add Support SuperH DMA Engine driver This supported all DMA channels, and it was tested in SH7722, SH7780, SH7785 and SH7763. This can not use with SH DMA API. Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com> Reviewed-by: Matt Fleming <matt@console-pimps.org> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:56:02 -07:00
Dan Williams	bbb20089a3	Merge branch 'dmaengine' into async-tx-next Conflicts: crypto/async_tx/async_xor.c drivers/dma/ioat/dma_v2.h drivers/dma/ioat/pci.c drivers/md/raid5.c	2009-09-08 17:55:21 -07:00
Dan Williams	3e48e65690	Merge branch 'iop-raid6' into async-tx-next	2009-09-08 17:53:57 -07:00
Atsushi Nemoto	657a77fa72	dmaengine: Move all map_sg/unmap_sg for slave channel to its client Dan Williams wrote: ... DMA-slave clients request specific channels and know the hardware details at a low level, so it should not be too high an expectation to push dma mapping responsibility to the client. Also this patch includes DMA_COMPL_{SRC,DEST}_UNMAP_SINGLE support for dw_dmac driver. Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:05 -07:00
Ira Snyder	bbea0b6e0d	fsldma: Add DMA_SLAVE support Use the DMA_SLAVE capability of the DMAEngine API to copy/from a scatterlist into an arbitrary list of hardware address/length pairs. This allows a single DMA transaction to copy data from several different devices into a scatterlist at the same time. This also adds support to enable some controller-specific features such as external start and external pause for a DMA transaction. [dan.j.williams@intel.com: rebased on tx_list movement] Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Acked-by: Li Yang <leoli@freescale.com> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:04 -07:00
Ira Snyder	e6c7ecb64e	fsldma: split apart external pause and request count features When using the Freescale DMA controller in external control mode, both the request count and external pause bits need to be setup correctly. This was being done with the same function. The 83xx controller lacks the external pause feature, but has a similar feature called external start. This feature requires that the request count bits be setup correctly. Split the function into two parts, to make it possible to use the external start feature on the 83xx controller. Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:04 -07:00
Dan Williams	162b96e63e	ioat2,3: cacheline align software descriptor allocations All the necessary fields for handling an ioat2,3 ring entry can fit into one cacheline. Move ->len prior to ->txd in struct ioat_ring_ent, and move allocation of these entries to a hw-cache-aligned kmem cache to reduce the number of cachelines dirtied for descriptor management. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:04 -07:00
Dan Williams	0803172778	dmaengine: kill tx_list The tx_list attribute of struct dma_async_tx_descriptor is common to most, but not all dma driver implementations. None of the upper level code (dmaengine/async_tx) uses it, so allow drivers to implement it locally if they need it. This saves sizeof(struct list_head) bytes for drivers that do not manage descriptors with a linked list (e.g.: ioatdma v2,3). Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:04 -07:00
Dan Williams	1979b186b8	txx9dmac: implement a private tx_list Drop txx9dmac's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:03 -07:00
Dan Williams	285a3c7164	at_hdmac: implement a private tx_list Drop at_hdmac's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:03 -07:00
Dan Williams	64203b6727	mv_xor: implement a private tx_list Drop mv_xor's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Saeed Bishara <saeed@marvell.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:03 -07:00
Dan Williams	ea25968a32	ioat: implement a private tx_list Drop ioatdma's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:02 -07:00
Dan Williams	308136d1ab	iop-adma: implement a private tx_list Drop iop-adma's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:02 -07:00
Dan Williams	eda3423457	fsldma: implement a private tx_list Drop fsldma's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Li Yang <leoli@freescale.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:02 -07:00
Dan Williams	e0bd0f8cb0	dw_dmac: implement a private tx_list Drop dw_dmac's use of tx_list from struct dma_async_tx_descriptor in preparation for removal of this field. Cc: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:53:02 -07:00
Dan Williams	e12c4fa377	Merge branch 'ioat' into dmaengine	2009-09-08 17:52:57 -07:00
Roland Dreier	a6417dd58d	I/OAT: Convert to PCI_VDEVICE() Trivial cleanup to make the PCI ID table easier to read. [dan.j.williams@intel.com: extended to v3.2 devices] Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:43:03 -07:00
Roland Dreier	6506cbca6b	Add MODULE_DEVICE_TABLE() so ioatdma module is autoloaded The ioatdma module is missing aliases for the PCI devices it supports, so it is not autoloaded on boot. Add a MODULE_DEVICE_TABLE() to get these aliases. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:43:03 -07:00
Dan Williams	e3232714d4	ioat3: segregate raid engines The cleanup routine for the raid cases imposes extra checks for handling raid descriptors and extended descriptors. If the channel does not support raid it can avoid this extra overhead by using the ioat2 cleanup path. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:43:02 -07:00
Tom Picard	b265b11fc1	ioat3: ioat3.2 pci ids for Jasper Forest Jasper Forest introduces raid offload support via ioat3.2 support. When raid offload is enabled two (out of 8 channels) will report raid5/raid6 offload capabilities. The remaining channels will only report ioat3.0 capabilities (memcpy). Signed-off-by: Tom Picard <tom.s.picard@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:43:01 -07:00
Dan Williams	58c8649e0e	ioat3: interrupt descriptor support The async_tx api uses the DMA_INTERRUPT operation type to terminate a chain of issued operations with a callback routine. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:43:00 -07:00
Dan Williams	ae786624c2	ioat3: support xor via pq descriptors If a platform advertises pq capabilities, but not xor, then use ioat3_prep_pqxor and ioat3_prep_pqxor_val to simulate xor support. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:43:00 -07:00
Dan Williams	d69d235b7d	ioat3: pq support ioat3.2 adds support for raid6 syndrome generation (xor sum of galois field multiplication products) using up to 8 sources. It can also perform an pq-zero-sum operation to validate whether the syndrome for a given set of sources matches a previously computed syndrome. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:59 -07:00
Dan Williams	9de6fc717b	ioat3: xor self test This adds a hardware specific self test to be called from ioat_probe. In the ioat3 case we will have tests for all the different raid operations, while ioat1 and ioat2 will continue to just test memcpy. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:58 -07:00
Dan Williams	b094ad3be5	ioat3: xor support ioat3.2 adds xor offload support for up to 8 sources. It can also perform an xor-zero-sum operation to validate whether all given sources sum to zero, without writing to a destination. Xor descriptors differ from memcpy in that one operation may require multiple descriptors depending on the number of sources. When the number of sources exceeds 5 an extended descriptor is needed. These descriptors need to be accounted for when updating the DMA_COUNT register. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:57 -07:00
Dan Williams	e61dacaeb3	ioat3: enable dca for completion writes Tag completion writes for direct cache access to reduce the latency of checking for descriptor completions. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:57 -07:00
Dan Williams	5669e31c5a	ioat: add 'ioat' sysfs attributes Export driver attributes for diagnostic purposes: 'ring_size': total number of descriptors available to the engine 'ring_active': number of descriptors in-flight 'capabilities': supported operation types for this channel 'version': Intel(R) QuickData specfication revision This also allows some chattiness to be removed from the driver startup as this information is now available via sysfs. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:56 -07:00
Dan Williams	bf40a6869c	ioat3: split ioat3 support to its own file, add memset Up until this point the driver for Intel(R) QuickData Technology engines, specification versions 2 and 3, were mostly identical save for a few quirks. Version 3.2 hardware adds many new capabilities (like raid offload support) requiring some infrastructure that is not relevant for v2. For better code organization of the new funcionality move v3 and v3.2 support to its own file dma_v3.c, and export some routines from the base files (dma.c and dma_v2.c) that can be reused directly. The first new capability included in this code reorganization is support for v3.2 memset operations. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:55 -07:00
Dan Williams	2aec048cdc	ioat3: hardware version 3.2 register / descriptor definitions ioat3.2 adds raid5 and raid6 offload capabilities. Signed-off-by: Tom Picard <tom.s.picard@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:54 -07:00
Dan Williams	128f2d567f	ioat2+: add fence support In preparation for adding more operation types to the ioat3 path the driver needs to honor the DMA_PREP_FENCE flag. For example the async_tx api will hand xor->memcpy->xor chains to the driver with the 'fence' flag set on the first xor and the memcpy operation. This flag in turn sets the 'fence' flag in the descriptor control field telling the hardware that future descriptors in the chain depend on the result of the current descriptor, so wait for all writes to complete before starting the next operation. Note that ioat1 does not prefetch the descriptor chain, so does not require/support fenced operations. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:53 -07:00
Dan Williams	83544ae9f3	dmaengine, async_tx: support alignment checks Some engines have transfer size and address alignment restrictions. Add a per-operation alignment property to struct dma_device that the async routines and dmatest can use to check alignment capabilities. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:53 -07:00
Dan Williams	9308add6ea	dmaengine: cleanup unused transaction types No drivers currently implement these operation types, so they can be deleted. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:52 -07:00
Dan Williams	138f4c359d	dmaengine, async_tx: add a "no channel switch" allocator Channel switching is problematic for some dmaengine drivers as the architecture precludes separating the ->prep from ->submit. In these cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify the async_tx allocator to only return channels that support all of the required asynchronous operations. For example MD_RAID456=y selects support for asynchronous xor, xor validate, pq, pq validate, and memcpy. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to quickly locate compatible channels with the guarantee that dependency chains will remain on one channel. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select channels that lead to operation chains that need to cross channel boundaries using the async_tx channel switch capability. Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:42:51 -07:00
Dan Williams	f9dd213437	Merge branch 'md-raid6-accel' into ioat3.2 Conflicts: include/linux/dmaengine.h	2009-09-08 17:42:29 -07:00
Dan Williams	4b652f0db3	net_dma: poll for a descriptor after allocation failure Handle descriptor allocation failures by polling for a descriptor. The driver will force forward progress when polled. In the best case this polling interval will be the time it takes for one dma memcpy transaction to complete. In the worst case, channel hang, we will need to wait 100ms for the cleanup watchdog to fire (ioatdma driver). Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:38:54 -07:00
Dan Williams	a309218ace	ioat2,3: dynamically resize descriptor ring Increment the allocation order of the descriptor ring every time we run out of descriptors up to a maximum of allocation order specified by the module parameter 'ioat_max_alloc_order'. After each idle period decrement the allocation order to a minimum order of 'ioat_ring_alloc_order' (i.e. the default ring size, tunable as a module parameter). Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:38:54 -07:00
Dan Williams	09c8a5b85e	ioat: switch watchdog and reset handler from workqueue to timer In order to support dynamic resizing of the descriptor ring or polling for a descriptor in the presence of a hung channel the reset handler needs to make progress while in a non-preemptible context. The current workqueue implementation precludes polling channel reset completion under spin_lock(). This conversion also allows us to return to opportunistic cleanup in the ioat2 case as the timer implementation guarantees at least one cleanup after every descriptor is submitted. This means the worst case completion latency becomes the timer frequency (for exceptional circumstances), but with the benefit of avoiding busy waiting when the lock is contended. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:30:24 -07:00
Dan Williams	ad643f54c8	ioat1: trim ioat_dma_desc_sw Save 4 bytes per software descriptor by transmitting tx_cnt in an unused portion of the hardware descriptor. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:30:24 -07:00
Dan Williams	345d852391	ioat: ___devinit annotate the initialization paths Mark all single use initialization routines with __devinit. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:30:24 -07:00
Dan Williams	f6ab95b557	ioat: preserve chanctrl bits when re-arming interrupts The register write in ioat_dma_cleanup_tasklet is unfortunate in two ways: 1/ It clears the extra 'enable' bits that we set at alloc_chan_resources time 2/ It gives the impression that it disables interrupts when it is in fact re-arming interrupts [ Impact: fix, persist the value of the chanctrl register when re-arming ] Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:30:24 -07:00
Dan Williams	bb32078630	ioat: ignore reserved bits for chancnt and xfercap Don't trust that the reserved bits are always zero, also sanity check the returned value. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:30:24 -07:00
Dan Williams	4fb9b9e8d5	ioat: cleanup completion status reads The cleanup path makes an effort to only perform an atomic read of the 64-bit completion address. However in the 32-bit case it does not matter if we read the upper-32 and lower-32 non-atomically because the upper-32 will always be zero. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:30:24 -07:00
Dan Williams	6df9183a15	ioat: add some dev_dbg() calls Provide some output for debugging the driver. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:30:23 -07:00
Dan Williams	38e12f64a1	ioat1: kill unused unmap parameters The unified ioat1/ioat2 ioat_dma_unmap() implementation derives the source and dest addresses from the unmap descriptor. There is no longer a need to track this information in struct ioat_desc_sw. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:30:23 -07:00
Dan Williams	5cbafa65b9	ioat2,3: convert to a true ring buffer Replace the current linked list munged into a ring with a native ring buffer implementation. The benefit of this approach is reduced overhead as many parameters can be derived from ring position with simple pointer comparisons and descriptor allocation/freeing becomes just a manipulation of head/tail pointers. It requires a contiguous allocation for the software descriptor information. Since this arrangement is significantly different from the ioat1 chain, move ioat2,3 support into its own file and header. Common routines are exported from driver/dma/ioat/dma.[ch]. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:29:55 -07:00
Dan Williams	dcbc853af6	ioat: prepare the code for ioat[12]_dma_chan split Prepare the code for the conversion of the ioat2 linked-list-ring into a native ring buffer. After this conversion ioat2 channels will share less of the ioat1 infrastructure, but there will still be places where sharing is possible. struct ioat_chan_common is created to house the channel attributes that will remain common between ioat1 and ioat2 channels. For every routine that accesses both common and hardware specific fields the old unified 'ioat_chan' pointer is split into an 'ioat' and 'chan' pointer. Where 'chan' references common fields and 'ioat' the hardware/version specific. [ Impact: pure structure member movement/variable renames, no logic changes ] Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:29:55 -07:00
Dan Williams	a6a39ca1ba	ioat: fix self test interrupts If a callback is to be attached to a descriptor the channel needs to know at ->prep time so it can set the interrupt enable bit. This is in preparation for moving descriptor ioat2 descriptor preparation from ->submit to ->prep. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-09-08 17:29:55 -07:00

1 2 3 4 5 ...

270 Commits