linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-25 22:35:37 +07:00

Author	SHA1	Message	Date
Omer Shpigelman	54bb67444e	habanalabs: split MMU properties to PCI/DRAM Split the properties used for MMU mappings to DRAM and PCI (host) types. This is a prerequisite for future ASICs support. Note that in Goya ASIC, the PMMU and DMMU are the same (except of page sizes) as only one MMU mechanism is used for both of the mapping types. Hence this patch should not have any effect on current behavior. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	7b6e4ea0f7	habanalabs: type specific MMU cache invalidation Add the ability to invalidate the necessary MMU cache only. This ability is a prerequisite for future ASICs support. Note that in Goya ASIC, a single cache is used for both host/DRAM mappings and hence this patch should not have any effect on current behavior. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Omer Shpigelman	7f74d4d335	habanalabs: re-factor memory module code Some of the functions in the memory module code were too long and/or contained multiple operations that are not always done together. Re-factor the code by dividing those functions to smaller functions which are more readable and maintainable. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:46 +02:00
Oded Gabbay	5d1012576d	habanalabs: export uapi defines to user-space The two defines that control the maximum size of a command buffer and the maximum number of JOBS per CS need to be exported to the user as they are part of the API towards user-space. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:46 +02:00
Oded Gabbay	bd4c8cb17d	habanalabs: increase max jobs number to 512 In training, there is a need for a large amount of patching to the recipe. This results in many command buffers contains a lot of DMA packets. The number of command buffers per CS is larger than the current maximum of 64, which is an arbitrary number that is enough for inference, but it has no real affect on the code and/or resources of the host machine. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-11-21 11:35:45 +02:00
Oded Gabbay	62c1e124a9	habanalabs: add opcode to INFO IOCTL to return clock rate Add a new opcode to the INFO IOCTL to allow the user application to retrieve the ASIC's current and maximum clock rate. The rate is returned in MHz. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-11-21 11:35:45 +02:00
Oded Gabbay	8fdacf2a53	habanalabs: set TPC Icache to 16 cache lines Reduce latency to memory during TPC kernel execution. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-11-21 11:35:45 +02:00
Tomer Tayar	cb596aee88	habanalabs: Add a new H/W queue type This patch adds a support for a new H/W queue type. This type of queue is for DMA and compute engines jobs, for which completion notification are sent by H/W. Command buffer for this queue can be created either through the CB IOCTL and using the retrieved CB handle, or by preparing a buffer on the host or device SRAM/DRAM, and using the device address to that buffer. The patch includes the handling of the 2 options, as well as the initialization of the H/W queue and its jobs scheduling. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:45 +02:00
Tomer Tayar	df762375f1	habanalabs: Mark queue as expecting CB handle or address Jobs on some queues must be provided with a handle to a driver command buffer object, while for other queues, jobs must be provided with an address to a command buffer. Currently the distinction is done based on the queue type, which is less flexible if the same queue type behaves differently on different types of ASICs. This patch adds a new queue property for this target, which is configured per queue type per ASIC type. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:45 +02:00
Tomer Tayar	f435614ff5	habanalabs: Fix typos s/paerser/parser/ s/requeusted/requested/ s/an JOB/a JOB/ Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-11-21 11:35:45 +02:00
Oded Gabbay	6dc66f7c26	habanalabs: correctly cast variable to __le32 When using the macro le32_to_cpu(x), we need to correctly convert x to be __le32 in case it is defined as u32 variable. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Tomer Tayar <ttayar@habana.ai>	2019-09-05 14:55:28 +03:00
Oded Gabbay	4c172bbfaa	habanalabs: stop using the acronym KMD We want to stop using the acronym KMD. Therefore, replace all locations (except for register names we can't modify) where KMD is written to other terms such as "Linux kernel driver" or "Host kernel driver", etc. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:27 +03:00
Oded Gabbay	e9730763a2	habanalabs: add uapi to retrieve aggregate H/W events Add a new opcode to INFO IOCTL to retrieve aggregate H/W events. i.e. the events counters are NOT cleared upon device reset, but count from the loading of the driver. Add the code to support it in the device event handling function. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:27 +03:00
Oded Gabbay	75b3cb2bb0	habanalabs: add uapi to retrieve device utilization Users and sysadmins usually want to know what is the device utilization as a level 0 indication if they are efficiently using the device. Add a new opcode to the INFO IOCTL that will return the device utilization over the last period of 100-1000ms. The return value is 0-100, representing as percentage the total utilization rate. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:27 +03:00
Tomer Tayar	ea451f88ef	habanalabs: Expose devices after initialization is done The char devices are currently exposed to user before the device and driver initialization are done. This patch moves the cdev and device adding to the system to the end of the initialization sequence, while keeping the creation of the structures at the beginning to allow the usage of dev_*(). Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-09-05 14:55:27 +03:00
Oded Gabbay	4d6a7751f6	habanalabs: create two char devices per ASIC This patch changes the driver to create two char devices for each ASIC it discovers. This is done to allow system/monitoring applications to query the device for stats, information, idle state and more, while also allowing the deep-learning application to send work to the ASIC. One char device is the original device, hlX. IOCTL calls through this device file can perform any task on the device (compute, memory, queries). The open function for this device will fail if it was called before but the file-descriptor it created was not completely released yet (the release callback function is not called from the kernel until all instances of that FD are closed). The driver needs to keep this behavior to support backward compatibility with existing userspace, which count that the open will fail if the device is "occupied". The second char device is called "hl_controlDx", where x is the same index of the main device with a minor number of the original char device + 1. Applications that open this device can only call the INFO IOCTL. There is no limitation on the number of applications opening this device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	eb7caf84b0	habanalabs: maintain a list of file private data objects This patch adds a new list to the driver's device structure. The list will keep the file private data structures that the driver creates when a user process opens the device. This change is needed because it is useless to try to count how many FD are open. Instead, track our own private data structure per open file and once it is released, remove it from the list. As long as the list is not empty, it means we have a user that can do something with our device. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	86d5307a6d	habanalabs: rename user_ctx as compute_ctx This patch renames the "user_ctx" field in the device structure to "compute_ctx". This better reflects the meaning of this context. In addition, we also check in the ctx_fini() that the debug mode should be disabled only if the context being destroyed is the compute context. This has no effect right now as we only have a single process and a single context, but this makes the code more ready for multiple process support. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	b888751a02	habanalabs: add handle field to context structure This patch adds a field to the context's structure that will hold a unique handle for the context. This will be needed when the user will create the context. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-09-05 14:55:26 +03:00
Oded Gabbay	ed0fc50535	habanalabs: cap simulator timeout In the driver timeout functions, we give the simulator a factor of 10 in the timeout. This was necessary when the requested timeout is small but if it was a few seconds, this can result in a very large timeout which is unnecessary. This patch caps the maximum timeout of the simulator to 10 seconds, which is our largest timeout in the code. That is more then enough for anything the simulator is doing. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>	2019-09-05 14:55:26 +03:00
Oded Gabbay	b9040c9941	habanalabs: fix endianness handling for internal QMAN submission The PQs of internal H/W queues (QMANs) can be located in different memory areas for different ASICs. Therefore, when writing PQEs, we need to use the correct function according to the location of the PQ. e.g. if the PQ is located in the device's memory (SRAM or DRAM), we need to use memcpy_toio() so it would work in architectures that have separate address ranges for IO memory. This patch makes the code that writes the PQE to be ASIC-specific so we can handle this properly per ASIC. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Tested-by: Ben Segal <bpsegal20@gmail.com>	2019-08-12 09:01:10 +03:00
Ben Segal	2aa4e41079	habanalabs: fix host memory polling in BE architecture This patch fix a bug in the host memory polling macro. The bug is that the memory being polled can be written by the device, which always writes it in LE. However, if the host is running Linux in BE mode, we need to convert the value that was written by the device before matching it to the required value that the caller has given to the macro. Signed-off-by: Ben Segal <bpsegal20@gmail.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-07-29 11:40:25 +03:00
Tomer Tayar	e8960ca06b	habanalabs: Add busy engines bitmask to HW idle IOCTL The information which is currently provided as a response to the "HL_INFO_HW_IDLE" IOCTL is merely a general boolean value. This patch extends it and provides also a bitmask that indicates which of the device engines are busy. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-07-01 13:59:45 +00:00
Tomer Tayar	06deb86a74	habanalabs: Add debugfs node for engines status Command submissions sent to the device are composed of command buffers which are targeted to different device engines, like DMA and compute entities. When a command submission gets stuck, knowing in which engine the stuck is, is crucial for debugging. This patch adds a debugfs node that exports this information, by displaying the engines' various registers that assemble their idle/busy status. The information retrieval is based on the is_device_idle ASIC function. The printout in this function, of the first detected busy engine, is removed because it becomes redundant in the presence of the more elaborated info of the new debugfs node. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-07-01 13:59:45 +00:00
Oded Gabbay	95b5a8b83e	habanalabs: add MMU mappings for Goya CPU This patch adds the necessary MMU mappings for the Goya CPU to access the device DRAM and the host memory. The first 256MB of the device DRAM is being mapped. That's where the F/W is running. The 2MB area located on the host memory for the purpose of communication between the driver and the device CPU is also being mapped. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-29 17:30:04 +03:00
Oded Gabbay	cbb10f1e4a	habanalabs: don't limit packet size for device CPU This patch removes a limitation on the maximum packet size that is read by the device CPU as that limitation is not needed. Therefore, the patch also removes an elaborate calculation that is based on this limitation which is also not needed now. Instead, use a fixed value for the memory pool size of the packets. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-17 01:08:23 +03:00
Omer Shpigelman	a1e537b3f0	habanalabs: increase PCI ELBI timeout for Palladium This patch increases the timeout for PCI ELBI configuration to support low frequency Palladium images. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-13 14:44:50 +03:00
Oded Gabbay	921a465ba7	habanalabs: pass device pointer to asic-specific function This patch adds a new parameter that is passed to the add_end_of_cb_packets() asic-specific function. The parameter is the pointer to the driver's device structure. The function needs this pointer for future ASICs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-12 16:53:16 +03:00
Oded Gabbay	a08b51a9a0	habanalabs: change polling functions to macros This patch changes two polling functions to macros, in order to make their API the same as the standard readl_poll_timeout so we would be able to define the "condition for exit" when calling these macros. This will simplify the code as it will eliminate the need to check both for timeout and for the (cond) in the calling function. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-09 01:48:23 +03:00
Oded Gabbay	19734970c9	habanalabs: force user to set device debug mode This patch adds the implementation of the HL_DEBUG_OP_SET_MODE opcode in the DEBUG IOCTL. It forces the user who wants to debug the device to set the device into debug mode before he can configure the debug engines. The patch also makes sure to disable debug mode upon user releasing FD, in case the user forgot to disable debug mode. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-04 17:36:06 +03:00
Omer Shpigelman	d1287493ab	habanalabs: minor documentation and prints fixes This patch fixes comments on various structure members and some spelling errors in log messages. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-05 13:24:24 +03:00
Omer Shpigelman	89225ce4fc	habanalabs: halt debug engines on user process close This patch fix a potential bug where a user's process has closed unexpectedly without disabling the debug engines. In that case, the debug engines might continue running but because the user's MMU mappings are going away, we will get page fault errors. This behavior is also opposed to the general rule where nothing runs on the device after the user process closes. The patch stops the debug H/W engines upon process termination and thus makes sure nothing runs on the device after the process goes away. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-24 22:46:15 +03:00
Dalit Ben Zoor	b1b537713e	habanalabs: increase timeout if working with simulator Where there is a spike in the CPU consumption, it may cause random failures in the C/I since the KMD timeout for CPU and/or QMAN0 jobs expires and it stops communicating to the simulator. This commit fixes it by increasing timeout on polling functions if working with simulator. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-30 17:18:51 +03:00
Dalit Ben Zoor	5809e18e02	habanalabs: remove redundant member from parser struct use_virt_addr member was used for telling whether to treat the addresses in the CB as virtual during parsing. We disabled it only when calling the parser from the driver memset device function, and since this call had been removed, it should always be enabled. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-01 13:16:18 +03:00
Tomer Tayar	94cb669ceb	habanalabs: Manipulate DMA addresses in ASIC functions Routing device accesses to the host memory requires the usage of a base offset, which is canceled by the iATU just before leaving the device. The value of the base offset might be distinctive between different ASIC types. The manipulation of the addresses is currently used throughout the driver code, and one should be aware to it whenever providing a host memory address to the device. This patch removes this manipulation from the driver common code, and moves it to the ASIC specific functions that are responsible for host memory allocation/mapping. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-01 11:28:15 +03:00
Oded Gabbay	d9c3aa8038	habanalabs: rename functions to improve code readability This patch renames four functions in the ASIC-specific functions section, so it will be easier to differentiate them from the generic kernel functions with the same name. This will help in future code reviews, to make sure we don't use the kernel functions directly. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-05-01 11:47:04 +03:00
Tomer Tayar	03d5f641dc	habanalabs: Use single pool for CPU accessible host memory The device's CPU accessible memory on host is managed in a dedicated pool, except for 2 regions - Primary Queue (PQ) and Event Queue (EQ) - which are allocated from generic DMA pools. Due to address length limitations of the CPU, the addresses of all these memory regions must have the same MSBs starting at bit 40. This patch modifies the allocation of the PQ and EQ to be also from the dedicated pool, to ensure compliance with the limitation. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-28 19:17:38 +03:00
Oded Gabbay	a38693d775	habanalabs: return old dram bar address upon change This patch changes the ASIC interface function that changes the DRAM bar window. The change is to return the old address that the DRAM bar pointed to instead of an error code. This simplifies the code that use this function (mainly in debugfs) to restore the bar to the old setting. This is also needed for easier support in future ASICs. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-28 10:18:35 +03:00
Oded Gabbay	027d35d0b6	habanalabs: rename restore to ctx_switch when appropriate This patch only does renaming of certain variables and structure members, and their accompanied comments. This is done to better reflect the actions these variables and members represent. There is no functional change in this patch. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-25 20:15:42 +03:00
Oded Gabbay	b2377e032f	habanalabs: use ASIC functions interface for rreg/wreg This patch slightly changes the macros of RREG32 and WREG32, which are used when reading or writing from registers. Instead of directly calling a function in the common code from these macros, the new code calls a function from the ASIC functions interface. This change allows us to share much more code between real ASICs and simulators, which in turn reduces the maintenance burden and the chances for forgetting to port code between the ASIC files. The patch also implements the hl_poll_timeout macro, instead of calling the generic readl_poll_timeout macro. This is required to allow use of this macro in the simulator files. As a result from this change, more functions in goya.c are shared with the simulator and therefore, should not be defined as static. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-22 11:49:06 +03:00
Tomer Tayar	e00dac3daa	habanalabs: Cancel pr_fmt() definition dependency on includes order pr_fmt() should be defined before including linux/printk.h, either directly or indirectly, in order to avoid redefinition of the macro. Currently the macro definition is in habanalabs.h, which is included in many files, and that makes the addition/reorder of includes to be prone to compilation errors. This patch cancels this dependency by defining the macro only in the few source files that use it. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-10 15:18:46 +03:00
Oded Gabbay	295938406c	habanalabs: ASIC_AUTO_DETECT enum value is redundant This patch removes the enum value of ASIC_AUTO_DETECT because we can use the validity of the pdev variable to know whether we have a real device or a simulator. For a real device, we detect the asic type from the device ID while for a simulator, the simulator code calls create_hdev() with the specified ASIC type. Set ASIC_INVALID as the first option in the enum to make sure that no other enum value will receive the value 0 (which indicates a non-existing entry in the simulator array). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-04 14:33:34 +03:00
Oded Gabbay	bedd14425d	habanalabs: refactoring in goya.c This patch does some refactoring in goya.c to make code more reusable between goya code and the goya simulator code (which is not upstreamed). In addition, the patch removes some dead functions from goya.c which are not used by the current upstream code Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-02 15:56:16 +03:00
Omer Shpigelman	315bc055ed	habanalabs: add new IOCTL for debug, tracing and profiling Habanalabs ASICs use the ARM coresight infrastructure to support debug, tracing and profiling of neural networks topologies. Because the coresight is configured using register writes and reads, and some of the registers hold sensitive information (e.g. the address in the device's DRAM where the trace data is written to), the user must go through the kernel driver to configure this mechanism. This patch implements the common code of the IOCTL and calls the ASIC-specific function for the actual H/W configuration. The IOCTL supports configuration of seven coresight components: ETR, ETF, STM, FUNNEL, BMON, SPMU and TIMESTAMP The user specifies which component he wishes to configure and provides a pointer to a structure (located in its process space) that contains the relevant configuration. The common code copies the relevant data from the user-space to kernel space and then calls the ASIC-specific function to do the H/W configuration. After the configuration is done, which is usually composed of several IOCTL calls depending on what the user wanted to trace, the user can start executing the topology. The trace data will be written to the user's area in the device's DRAM. After the tracing operation is complete, and user will call the IOCTL again to disable the tracing operation. The user also need to read values from registers for some of the components (e.g. the size of the trace data in the device's DRAM). In that case, the user will provide a pointer to an "output" structure in user-space, which the IOCTL code will fill according the to selected component. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-04-01 22:31:22 +03:00
Dalit Ben Zoor	aa957088b4	habanalabs: add device status option to INFO IOCTL This patch adds a new opcode to INFO IOCTL that returns the device status. This will allow users to query the device status in order to avoid sending command submissions while device is in reset. Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-03-24 10:15:44 +02:00
Oded Gabbay	d9973871da	habanalabs: keep track of the device's dma mask This patch refactors the code that is responsible to set the DMA mask for the device. Upon each change of the dma mask, the driver will save the new value that was set. This is needed in order to make sure we don't try to increase the mask a second time, in case we failed in the first time. This is especially relevant for Power machines, as that may cause a change in configuration of the TVT which will break the device. Goya will first try to set the device's dma mask to 39 bits, so that the memory that is allocated on the host machine for communication with the device's cpu will be in a bus address which is lower then 39 bits. Later, Goya will try to increase that mask to 48 bits, but only if setting the mask to 39 bits was successful. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-03-07 18:03:23 +02:00
Omer Shpigelman	66542c3b9d	habanalabs: add MMU shadow mapping This patch adds shadow mapping to the MMU module. The shadow mapping allows traversing the page table in host memory rather reading each PTE from the device memory. It brings better performance and avoids reading from invalid device address upon PCI errors. Only at the end of map/unmap flow, writings to the device are performed in order to sync the H/W page tables with the shadow ones. Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-02-24 09:17:55 +02:00
Tomer Tayar	c811f7bc77	habanalabs: Add a printout with the name of a busy engine Print the name of a busy engine when checking if a device is idle. The change is done mainly to help a user to pinpoint problems in his topology's recipe. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-03-07 14:26:02 +02:00
Tomer Tayar	b6f897d75d	habanalabs: Move PCI code into common file Move duplicated PCI-related code from ASIC-specific files into the common pci.c file. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-03-05 16:48:42 +02:00
Tomer Tayar	3110c60fdc	habanalabs: Move device CPU code into common file This patch moves the code that is responsible of the communication vs. the F/W to a dedicated file. This will allow us to share the code between different ASICs. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2019-03-04 10:22:09 +02:00

1 2

69 Commits