Read and write io memory should address align on ARCH ARM. Change to use
memcpy_toio to avoid kernel panic caused by the address un-align issue.
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20200929091106.24624-5-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since struct _mic_vring_info and vring are allocated together and follow
vring, if the vring_size() is not four bytes aligned, which will cause
the start address of struct _mic_vring_info is not four byte aligned.
For example, when vring entries is 128, the vring_size() will be 5126
bytes. The _mic_vring_info struct layout in ddr looks like:
0x90002400: 00000000 00390000 EE010000 0000C0FF
Here 0x39 is the avail_idx member, and 0xC0FFEE01 is the magic member.
When EP use ioread32(magic) to reads the magic in RC's share memory, it
will cause kernel panic on ARM64 platform due to the cross-byte io read.
Here read magic in user space use le32toh(vr0->info->magic) will meet
the same issue.
So add round_up(x,4) for vring_size, then the struct _mic_vring_info
will store in this way:
0x90002400: 00000000 00000000 00000039 C0FFEE01
Which will avoid kernel panic when read magic in struct _mic_vring_info.
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20200929091106.24624-4-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Set VIRTIO_F_ACCESS_PLATFORM feature for vop driver, as the DMA mapping
details shouldn't decide on the virtio implementation, but the host PCIe
implementation.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20200929084944.24146-1-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When OCXL is enabled and HOTPLUG_PCI is disabled, it results in the
following Kbuild warning:
WARNING: unmet direct dependencies detected for HOTPLUG_PCI_POWERNV
Depends on [n]: PCI [=y] && HOTPLUG_PCI [=n] && PPC_POWERNV [=y] && EEH [=y]
Selected by [y]:
- OCXL [=y] && PPC_POWERNV [=y] && PCI [=y] && EEH [=y]
The reason is that OCXL selects HOTPLUG_PCI_POWERNV without depending on
or selecting HOTPLUG_PCI while HOTPLUG_PCI_POWERNV is subordinate to
HOTPLUG_PCI.
HOTPLUG_PCI_POWERNV is a visible symbol with a set of dependencies.
Selecting it will lead to overlooking its other dependencies as well.
Let OCXL depend on HOTPLUG_PCI_POWERNV instead to avoid Kbuild issues.
Fixes: 49ce94b867 ("ocxl: Add PCI hotplug dependency to Kconfig")
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Necip Fazil Yildiran <fazilyildiran@gmail.com>
Link: https://lore.kernel.org/r/20200918094148.20525-1-fazilyildiran@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sg_init_table zeroes its first argument, so the allocation of that argument
doesn't have to.
the semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression x,n,flags;
@@
x =
- kcalloc
+ kmalloc_array
(n,sizeof(struct scatterlist),flags)
...
sg_init_table(x,n)
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Link: https://lore.kernel.org/r/1600601186-7420-14-git-send-email-Julia.Lawall@inria.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
LDMA registers are configured with a fixed value.
We add new define set which gives the configuration
a proper meaning.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The device should be idle after a context is closed. If not, print a
notice.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
During debugging of error we sometimes need to know whether the error
happened when a user context was open. Add debug prints when opening and
closing user contexts.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Some engines use resources that belong to the kernel context (e.g. MMU
mappings). In case the halt-engines doesn't work properly due to H/W
restriction, we need to make sure the kernel context lives on until after
the hw_fini. The hw_fini resets the ASIC after that no engine is alive and
we can safely close the kernel context.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
We don't try to allocate huge pages here so remove the huge word.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This return statement is indented one tab too far.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20200918143405.GF909725@mwanda
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Inside __scif_pin_pages(), when map_flags != SCIF_MAP_KERNEL it
will call pin_user_pages_fast() to map nr_pages. However,
pin_user_pages_fast() might fail with a return value -ERRNO.
The return value is stored in pinned_pages->nr_pages. which in
turn is passed to unpin_user_pages(), which expects
pinned_pages->nr_pages >=0, else disaster.
Fix this by assigning pinned_pages->nr_pages to 0 if
pin_user_pages_fast() returns -ERRNO.
Fixes: ba612aa8b4 ("misc: mic: SCIF memory registration and unregistration")
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Link: https://lore.kernel.org/r/1600570295-29546-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make use of devm_platform_ioremap_resource() provided by driver
core platform instead of duplicated analogue.
Acked-by: Arnd Bergmann <arn@arndb.de>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lore.kernel.org/r/20200918083634.33124-1-bobo.shaobowang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Our firmware use some scratchpad registers in the device for different
roles. Update the file to the latest version of the firmware code.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Future F/W versions will have enhanced security measures and the driver
won't be able to do certain configurations that it always did and those
configurations will be done by the firmware.
We use the firmware's preboot version to determine whether security
measures are enabled or not. Because we need this very early in our code,
the read of the preboot version is moved to the earliest possible place,
right after the device's PCI initialization.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This is a workaround for H/W bug H3-2116, where if there are more than 16
outstanding completions in the DMA transpose engine, there can be a
deadlock in the engine.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Add new packet to fetch PLL information from firmware. This will be needed
in the future when the driver won't be able to access the PLL registers
directly
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
There are cases in which the device should access the host memory of a
CB through the device MMU, and thus this memory should be mapped.
The patch adds a flag to the CB IOCTL, in which a user can ask the
driver to perform the mapping when creating a CB.
The mapping is allowed only if a dedicated VA range was allocated for
the specific ASIC.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Future changes require using a context while handling a command buffer,
and thus need to save the context in the command buffer object.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Now that the driver no longer uses dma_buf, we can remove the select of
DMA_SHARED_BUFFER from kconfig.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The user sometimes wants to check if a CS has completed to clean resources.
In that case, the user doesn't want to sleep but just to check if the CS
has finished and continue with his code.
Add a new definition to the API of the wait on CS. The new definition says
that if the timeout is 0, the driver won't sleep at all but return
immediately after checking if the CS has finished.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The firmware running in the boot stage takes more time to execute due to
increased security mechanisms. Therefore, we need to increase the timeout
we wait for the boot fit to finish loading.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This commit modify the existing debugfs code to support future devices that
have a 6 HOPs MMU implementation instead of 5 HOPs implementation.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This commit adds the number of HOPs supported by the device to the
device MMU properties.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
As preparation to MMU v2, rework MMU to be device oriented
instantiated according to the device in hand.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
In the future we will have MMU v2 code, so we need to prepare the
driver for it. The first step is to rename the current MMU file to
mmu_v1.c.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Change the acquiring of a device virtual address for mapping by using the
smallest possible alignment, rather than the biggest, depending on the
page size used by the user for allocating the memory. This will lower the
virtual space memory consumption.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
For consistency with GAUDI code, add check of the relevant flag in the
device structure before resetting the GOYA device in case of firmware
event.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
For future ASICs, we increase this field by one nibble. This field was not
used by the current ASICs so this change doesn't break anything.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Because the device CPU compiler aligns structures to 8 bytes,
struct cpucp_info has an alignment issue as some parts
in the structure are not aligned to 8 bytes.
It is preferred that we explicitly insert placeholders inside
the structure to avoid confusion
in order to validate this scenario, we printed both pointers:
__u8 cpucp_version[VERSION_MAX_LEN]; (0xffff899c67ed4cbc)
__le64 dram_size; (0xffff899c67ed4d40)
we see difference of 132 bytes although the first array
is only 128 bytes long, Meaning compiler added a 4 byte padding.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
There were a couple of comments where the name ArmCP was still used. Rename
it to CPU-CP.
In addition, rename ArmCP or ARM in log messages to "device CPU".
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
There is a case where the user reaches the maximum number of CS in-flight.
In that case, the driver rejects the new CS of the user with EAGAIN. Count
that event so the user can query the driver later to see if it happened.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Add dma_mmap_coherent() for goya and gaudi to match their use of
dma_alloc_coherent(), see the Link tag for why.
Link: https://lore.kernel.org/lkml/20200609091727.GA23814@lst.de/
Cc: Christoph Hellwig <hch@lst.de>
Cc: Zhang Li <li.zhang@bitmain.com>
Cc: Ding Z Nan <oshack@hotmail.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The driver use vm_pgoff to hold the CB idr handle. Before we actually call
the mapping function, we need to clear the handle so there won't be any
garbage left in vm_pgoff.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Arrange the hl_mmap code to be more structured and expandable for the
future. Add better defines that describe our usage of the vm_pgoff.
Note that I shamelessly took the code and defines from the amdkfd driver
(my previous driver).
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
When shifting a boolean variable by more than 31 bits and putting the
result into a u64 variable, we need to cast the boolean into unsigned 64
bits to prevent possible overflow.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
ArmCP mandates that the device CPU is always an ARM processor, which might
be wrong in the future.
Most of this change is an internal renaming of variables, functions and
defines but there are two entries in sysfs which have armcp in their
names. Add identical cpucp entries but don't remove yet the armcp entries.
Those will be deprecated next year. Add the documentation about it in sysfs
documentation.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Add driver implementation for reading the total energy consumption
from the device ARM FW.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Include linux/bitfield.h only in habanalabs.h, instead of in each and
every file that needs it, as habanalabs.h is already included by all.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
change busy engines bitmask to 64 bits in order to represent
more engines, needed for future ASIC support.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Eliminate following warning:
warning: Shifting signed 32-bit value by 31 bits is undefined behavior
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The driver waits for the TPC vector pipe to be empty before checking if the
TPC kernel has finished executing, but the code doesn't validate that the
pipe was indeed empty, it just wait for it without checking the return
value.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>