linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-18 18:57:47 +07:00

Author	SHA1	Message	Date
Oded Gabbay	75b3cb2bb0	habanalabs: add uapi to retrieve device utilization Users and sysadmins usually want to know what is the device utilization as a level 0 indication if they are efficiently using the device. Add a new opcode to the INFO IOCTL that will return the device utilization over the last period of 100-1000ms. The return value is 0-100, representing as percentage the total utilization rate. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:27 +03:00
Oded Gabbay	fe9a52c97f	habanalabs: replace __le32_to_cpu with le32_to_cpu In some files the driver uses __le32_to_cpu while in other it uses le32_to_cpu. Replace all __le32_to_cpu instances with le32_to_cpu for consistency. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Oded Gabbay	abca3a8224	habanalabs: replace __cpu_to_le32/64 with cpu_to_le32/64 In some files the code use __cpu_to_le32/64 while in other it use cpu_to_le32/64. Replace all __cpu_to_le32/64 instances with cpu_to_le32/64 for consistency. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Oded Gabbay	b9040c9941	habanalabs: fix endianness handling for internal QMAN submission The PQs of internal H/W queues (QMANs) can be located in different memory areas for different ASICs. Therefore, when writing PQEs, we need to use the correct function according to the location of the PQ. e.g. if the PQ is located in the device's memory (SRAM or DRAM), we need to use memcpy_toio() so it would work in architectures that have separate address ranges for IO memory. This patch makes the code that writes the PQE to be ASIC-specific so we can handle this properly per ASIC. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Tested-by: Ben Segal <bpsegal20@gmail.com>	2019-08-12 09:01:10 +03:00
Oded Gabbay	921a465ba7	habanalabs: pass device pointer to asic-specific function This patch adds a new parameter that is passed to the add_end_of_cb_packets() asic-specific function. The parameter is the pointer to the driver's device structure. The function needs this pointer for future ASICs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-12 16:53:16 +03:00
Tomer Tayar	94cb669ceb	habanalabs: Manipulate DMA addresses in ASIC functions Routing device accesses to the host memory requires the usage of a base offset, which is canceled by the iATU just before leaving the device. The value of the base offset might be distinctive between different ASIC types. The manipulation of the addresses is currently used throughout the driver code, and one should be aware to it whenever providing a host memory address to the device. This patch removes this manipulation from the driver common code, and moves it to the ASIC specific functions that are responsible for host memory allocation/mapping. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-01 11:28:15 +03:00
Oded Gabbay	d9c3aa8038	habanalabs: rename functions to improve code readability This patch renames four functions in the ASIC-specific functions section, so it will be easier to differentiate them from the generic kernel functions with the same name. This will help in future code reviews, to make sure we don't use the kernel functions directly. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-01 11:47:04 +03:00
Tomer Tayar	03d5f641dc	habanalabs: Use single pool for CPU accessible host memory The device's CPU accessible memory on host is managed in a dedicated pool, except for 2 regions - Primary Queue (PQ) and Event Queue (EQ) - which are allocated from generic DMA pools. Due to address length limitations of the CPU, the addresses of all these memory regions must have the same MSBs starting at bit 40. This patch modifies the allocation of the PQ and EQ to be also from the dedicated pool, to ensure compliance with the limitation. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-28 19:17:38 +03:00
Oded Gabbay	cbaa99ed1b	habanalabs: perform accounting for active CS This patch adds accounting for active CS. Active means that the CS was submitted to the H/W queues and was not completed yet. This is necessary to support suspend operation. Because the device will be reset upon suspend, we can only suspend after all active CS have been completed. Hence, we need to perform accounting on their number. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-03-03 15:13:15 +02:00
Oded Gabbay	8c8448792a	habanalabs: fix little-endian<->cpu conversion warnings Add __cpu_to_le16/32/64 and __le16/32/64_to_cpu where needed according to sparse. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-28 13:07:52 +01:00
Oded Gabbay	eff6f4a0e7	habanalabs: add command submission module This patch adds the main flow for the user to submit work to the device. Each work is described by a command submission object (CS). The CS contains 3 arrays of command buffers: One for execution, and two for context-switch (store and restore). For each CB, the user specifies on which queue to put that CB. In case of an internal queue, the entry doesn't contain a pointer to the CB but the address in the on-chip memory that the CB resides at. The driver parses some of the CBs to enforce security restrictions. The user receives a sequence number that represents the CS object. The user can then query the driver regarding the status of the CS, using that sequence number. In case the CS doesn't finish before the timeout expires, the driver will perform a soft-reset of the device. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-18 09:46:45 +01:00
Oded Gabbay	9494a8dd8d	habanalabs: add h/w queues module This patch adds the H/W queues module and the code to initialize Goya's various compute and DMA engines and their queues. Goya has 5 DMA channels, 8 TPC engines and a single MME engine. For each channel/engine, there is a H/W queue logic which is used to pass commands from the user to the H/W. That logic is called QMAN. There are two types of QMANs: external and internal. The DMA QMANs are considered external while the TPC and MME QMANs are considered internal. For each external queue there is a completion queue, which is located on the Host memory. The differences between external and internal QMANs are: 1. The location of the queue's memory. External QMANs are located on the Host memory while internal QMANs are located on the on-chip memory. 2. The external QMAN write an entry to a completion queue and sends an MSI-X interrupt upon completion of a command buffer that was given to it. The internal QMAN doesn't do that. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-18 09:46:45 +01:00

12 Commits