There is a case where the user reaches the maximum number of CS in-flight.
In that case, the driver rejects the new CS of the user with EAGAIN. Count
that event so the user can query the driver later to see if it happened.
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Add dma_mmap_coherent() for goya and gaudi to match their use of
dma_alloc_coherent(), see the Link tag for why.
Link: https://lore.kernel.org/lkml/20200609091727.GA23814@lst.de/
Cc: Christoph Hellwig <hch@lst.de>
Cc: Zhang Li <li.zhang@bitmain.com>
Cc: Ding Z Nan <oshack@hotmail.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The driver use vm_pgoff to hold the CB idr handle. Before we actually call
the mapping function, we need to clear the handle so there won't be any
garbage left in vm_pgoff.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Arrange the hl_mmap code to be more structured and expandable for the
future. Add better defines that describe our usage of the vm_pgoff.
Note that I shamelessly took the code and defines from the amdkfd driver
(my previous driver).
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
When shifting a boolean variable by more than 31 bits and putting the
result into a u64 variable, we need to cast the boolean into unsigned 64
bits to prevent possible overflow.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
ArmCP mandates that the device CPU is always an ARM processor, which might
be wrong in the future.
Most of this change is an internal renaming of variables, functions and
defines but there are two entries in sysfs which have armcp in their
names. Add identical cpucp entries but don't remove yet the armcp entries.
Those will be deprecated next year. Add the documentation about it in sysfs
documentation.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Add driver implementation for reading the total energy consumption
from the device ARM FW.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Include linux/bitfield.h only in habanalabs.h, instead of in each and
every file that needs it, as habanalabs.h is already included by all.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
change busy engines bitmask to 64 bits in order to represent
more engines, needed for future ASIC support.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Eliminate following warning:
warning: Shifting signed 32-bit value by 31 bits is undefined behavior
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The driver waits for the TPC vector pipe to be empty before checking if the
TPC kernel has finished executing, but the code doesn't validate that the
pipe was indeed empty, it just wait for it without checking the return
value.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
new_dma_pkt->ctl is assigned a value and then is reassigned a new value
without the first value ever being used.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Use the standard FIELD_PREP() macro instead of << operator to perform
bitmask operations. This ensures type check safety and eliminate compiler
warnings.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Use the standard macros to define bitmasks.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
If both parts of if-else are goto statements, we can remove the else and
put the else goto statement after the if statement.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
%u is used for unsigned so we need to cast the int variable to u32 to avoid
compiler warning.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Although the possible values for CB's ID are only 32 bits, there are a few
places in the code where this field is shifted and passed into a function
which expects 64 bits.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
If there is a failure during the testing of a queue,
to ease up debugging - print the queue id.
Signed-off-by: Dotan Barak <dbarak@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Allow user application to write to this register in order
to be able to configure the quiet period of the QMAN between grants.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
driver will now get notified upon any PCI error occurred and
will respond according to the severity of the error.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Although the driver defines the first user-available sync manager object
and monitor in habanalabs.h, we would like to also expose this information
via the INFO IOCTL so the runtime can get this information dynamically.
This is because in future ASICs we won't need to define it statically.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Update firmware header with new API for getting pcie info
such as tx/rx throughput and replay counter.
These counters are needed by customers for monitor and maintenance
of multiple devices.
Add new opcodes to the INFO ioctl to retrieve these counters.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
habanalabs driver uses dma-fence mechanism for synchronization.
dma-fence mechanism was designed solely for GPUs, hence we purpose
a simpler mechanism based on completions to replace current
dma-fence objects.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Future ASIC names are longer than 15 chars so increase the variable length
to 32 chars.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Add entry in the MAINTAINERS file for the Nitro Enclaves files such as
the documentation, the header files, the driver itself and the user
space sample.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Update the location of the documentation, as it has been moved to the
"virt" directory.
v7 -> v8
* No changes.
v6 -> v7
* No changes.
v5 -> v6
* No changes.
v4 -> v5
* No changes.
v3 -> v4
* No changes.
v2 -> v3
* Update file entries to be in alphabetical order.
v1 -> v2
* No changes.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-19-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add documentation on the overview of Nitro Enclaves. Include it in the
virtualization specific directory.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Move the Nitro Enclaves documentation to the "virt" directory and add
an entry for it in the corresponding index file.
v7 -> v8
* Add info about the primary / parent VM CID value.
* Update reference link for huge pages.
* Add reference link for the x86 boot protocol.
* Add license mention and update doc title / chapter formatting.
v6 -> v7
* No changes.
v5 -> v6
* No changes.
v4 -> v5
* No changes.
v3 -> v4
* Update doc type from .txt to .rst.
* Update documentation based on the changes from v4.
v2 -> v3
* No changes.
v1 -> v2
* New in v2.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-18-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a user space sample for the usage of the ioctl interface provided by
the Nitro Enclaves driver.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* No changes.
v7 -> v8
* Track NE custom error codes for invalid page size, invalid flags and
enclave CID.
* Update the heartbeat logic to have a listener fd first, then start the
enclave and then accept connection to get the heartbeat.
* Update the reference link to the hugetlb documentation.
v6 -> v7
* Track POLLNVAL as poll event in addition to POLLHUP.
v5 -> v6
* Remove "rc" mentioning when printing errno string.
* Remove the ioctl to query API version.
* Include usage info for NUMA-aware hugetlb configuration.
* Update documentation to kernel-doc format.
* Add logic for enclave image loading.
v4 -> v5
* Print enclave vCPU ids when they are created.
* Update logic to map the modified vCPU ioctl call.
* Add check for the path to the enclave image to be less than PATH_MAX.
* Update the ioctl calls error checking logic to match the NE specific
error codes.
v3 -> v4
* Update usage details to match the updates in v4.
* Update NE ioctl interface usage.
v2 -> v3
* Remove the include directory to use the uapi from the kernel.
* Remove the GPL additional wording as SPDX-License-Identifier is
already in place.
v1 -> v2
* New in v2.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-17-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add Makefile for the Nitro Enclaves driver, considering the option set
in the kernel config.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Remove -Wall flags, could use W=1 as an option for this.
v7 -> v8
* No changes.
v6 -> v7
* No changes.
v5 -> v6
* No changes.
v4 -> v5
* No changes.
v3 -> v4
* No changes.
v2 -> v3
* Remove the GPL additional wording as SPDX-License-Identifier is
already in place.
v1 -> v2
* Update path to Makefile to match the drivers/virt/nitro_enclaves
directory.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-16-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add kernel config entry for Nitro Enclaves, including dependencies.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* No changes.
v7 -> v8
* No changes.
v6 -> v7
* Remove, for now, the dependency on ARM64 arch. x86 is currently
supported, with Arm to come afterwards. The NE kernel driver can be
built for aarch64 arch.
v5 -> v6
* No changes.
v4 -> v5
* Add arch dependency for Arm / x86.
v3 -> v4
* Add PCI and SMP dependencies.
v2 -> v3
* Remove the GPL additional wording as SPDX-License-Identifier is
already in place.
v1 -> v2
* Update path to Kconfig to match the drivers/virt/nitro_enclaves
directory.
* Update help in Kconfig.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-15-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
An enclave is associated with an fd that is returned after the enclave
creation logic is completed. This enclave fd is further used to setup
enclave resources. Once the enclave needs to be terminated, the enclave
fd is closed.
Add logic for enclave termination, that is mapped to the enclave fd
release callback. Free the internal enclave info used for bookkeeping.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Use the ne_devs data structure to get the refs for the NE PCI device.
v7 -> v8
* No changes.
v6 -> v7
* Remove the pci_dev_put() call as the NE misc device parent field is
used now to get the NE PCI device.
* Update the naming and add more comments to make more clear the logic
of handling full CPU cores and dedicating them to the enclave.
v5 -> v6
* Update documentation to kernel-doc format.
* Use directly put_page() instead of unpin_user_pages(), to match the
get_user_pages() calls.
v4 -> v5
* Release the reference to the NE PCI device on enclave fd release.
* Adapt the logic to cpumask enclave vCPU ids and CPU cores.
* Remove sanity checks for situations that shouldn't happen, only if
buggy system or broken logic at all.
v3 -> v4
* Use dev_err instead of custom NE log pattern.
v2 -> v3
* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Update kzfree() calls to kfree().
v1 -> v2
* Add log pattern for NE.
* Remove the BUG_ON calls.
* Update goto labels to match their purpose.
* Add early exit in release() if there was a slot alloc error in the fd
creation path.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-14-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After all the enclave resources are set, the enclave is ready for
beginning to run.
Add ioctl command logic for starting an enclave after all its resources,
memory regions and CPUs, have been set.
The enclave start information includes the local channel addressing -
vsock CID - and the flags associated with the enclave.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Use the ne_devs data structure to get the refs for the NE PCI device.
v7 -> v8
* Add check for invalid enclave CID value e.g. well-known CIDs and
parent VM CID.
* Add custom error code for incorrect flag in enclave start info and
invalid enclave CID.
v6 -> v7
* Update the naming and add more comments to make more clear the logic
of handling full CPU cores and dedicating them to the enclave.
v5 -> v6
* Check for invalid enclave start flags.
* Update documentation to kernel-doc format.
v4 -> v5
* Add early exit on enclave start ioctl function call error.
* Move sanity checks in the enclave start ioctl function, outside of the
switch-case block.
* Remove log on copy_from_user() / copy_to_user() failure.
v3 -> v4
* Use dev_err instead of custom NE log pattern.
* Update the naming for the ioctl command from metadata to info.
* Check for minimum enclave memory size.
v2 -> v3
* Remove the WARN_ON calls.
* Update static calls sanity checks.
v1 -> v2
* Add log pattern for NE.
* Check if enclave state is init when starting an enclave.
* Remove the BUG_ON calls.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-13-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Another resource that is being set for an enclave is memory. User space
memory regions, that need to be backed by contiguous memory regions,
are associated with the enclave.
One solution for allocating / reserving contiguous memory regions, that
is used for integration, is hugetlbfs. The user space process that is
associated with the enclave passes to the driver these memory regions.
The enclave memory regions need to be from the same NUMA node as the
enclave CPUs.
Add ioctl command logic for setting user space memory region for an
enclave.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Use the ne_devs data structure to get the refs for the NE PCI device.
v7 -> v8
* Add early check, while getting user pages, to be multiple of 2 MiB for
the pages that back the user space memory region.
* Add custom error code for incorrect user space memory region flag.
* Include in a separate function the sanity checks for each page of the
user space memory region.
v6 -> v7
* Update check for duplicate user space memory regions to cover
additional possible scenarios.
v5 -> v6
* Check for max number of pages allocated for the internal data
structure for pages.
* Check for invalid memory region flags.
* Check for aligned physical memory regions.
* Update documentation to kernel-doc format.
* Check for duplicate user space memory regions.
* Use directly put_page() instead of unpin_user_pages(), to match the
get_user_pages() calls.
v4 -> v5
* Add early exit on set memory region ioctl function call error.
* Remove log on copy_from_user() failure.
* Exit without unpinning the pages on NE PCI dev request failure as
memory regions from the user space range may have already been added.
* Add check for the memory region user space address to be 2 MiB
aligned.
* Update logic to not have a hardcoded check for 2 MiB memory regions.
v3 -> v4
* Check enclave memory regions are from the same NUMA node as the
enclave CPUs.
* Use dev_err instead of custom NE log pattern.
* Update the NE ioctl call to match the decoupling from the KVM API.
v2 -> v3
* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Update kzfree() calls to kfree().
v1 -> v2
* Add log pattern for NE.
* Update goto labels to match their purpose.
* Remove the BUG_ON calls.
* Check if enclave max memory regions is reached when setting an enclave
memory region.
* Check if enclave state is init when setting an enclave memory region.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-12-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Before setting the memory regions for the enclave, the enclave image
needs to be placed in memory. After the memory regions are set, this
memory cannot be used anymore by the VM, being carved out.
Add ioctl command logic to get the offset in enclave memory where to
place the enclave image. Then the user space tooling copies the enclave
image in the memory using the given memory offset.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* No changes.
v7 -> v8
* Add custom error code for incorrect enclave image load info flag.
v6 -> v7
* No changes.
v5 -> v6
* Check for invalid enclave image load flags.
v4 -> v5
* Check for the enclave not being started when invoking this ioctl call.
* Remove log on copy_from_user() / copy_to_user() failure.
v3 -> v4
* Use dev_err instead of custom NE log pattern.
* Set enclave image load offset based on flags.
* Update the naming for the ioctl command from metadata to info.
v2 -> v3
* No changes.
v1 -> v2
* New in v2.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-11-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
An enclave, before being started, has its resources set. One of its
resources is CPU.
A NE CPU pool is set and enclave CPUs are chosen from it. Offline the
CPUs from the NE CPU pool during the pool setup and online them back
during the NE CPU pool teardown. The CPU offline is necessary so that
there would not be more vCPUs than physical CPUs available to the
primary / parent VM. In that case the CPUs would be overcommitted and
would change the initial configuration of the primary / parent VM of
having dedicated vCPUs to physical CPUs.
The enclave CPUs need to be full cores and from the same NUMA node. CPU
0 and its siblings have to remain available to the primary / parent VM.
Add ioctl command logic for setting an enclave vCPU.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Use the ne_devs data structure to get the refs for the NE PCI device.
v7 -> v8
* No changes.
v6 -> v7
* Check for error return value when setting the kernel parameter string.
* Use the NE misc device parent field to get the NE PCI device.
* Update the naming and add more comments to make more clear the logic
of handling full CPU cores and dedicating them to the enclave.
* Calculate the number of threads per core and not use smp_num_siblings
that is x86 specific.
v5 -> v6
* Check CPUs are from the same NUMA node before going through CPU
siblings during the NE CPU pool setup.
* Update documentation to kernel-doc format.
v4 -> v5
* Set empty string in case of invalid NE CPU pool.
* Clear NE CPU pool mask on pool setup failure.
* Setup NE CPU cores out of the NE CPU pool.
* Early exit on NE CPU pool setup if enclave(s) already running.
* Remove sanity checks for situations that shouldn't happen, only if
buggy system or broken logic at all.
* Add check for maximum vCPU id possible before looking into the CPU
pool.
* Remove log on copy_from_user() / copy_to_user() failure and on admin
capability check for setting the NE CPU pool.
* Update the ioctl call to not create a file descriptor for the vCPU.
* Split the CPU pool usage logic in 2 separate functions - one to get a
CPU from the pool and the other to check the given CPU is available in
the pool.
v3 -> v4
* Setup the NE CPU pool at runtime via a sysfs file for the kernel
parameter.
* Check enclave CPUs to be from the same NUMA node.
* Use dev_err instead of custom NE log pattern.
* Update the NE ioctl call to match the decoupling from the KVM API.
v2 -> v3
* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Update kzfree() calls to kfree().
* Remove file ops that do nothing for now - open, ioctl and release.
v1 -> v2
* Add log pattern for NE.
* Update goto labels to match their purpose.
* Remove the BUG_ON calls.
* Check if enclave state is init when setting enclave vCPU.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-10-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add ioctl command logic for enclave VM creation. It triggers a slot
allocation. The enclave resources will be associated with this slot and
it will be used as an identifier for triggering enclave run.
Return a file descriptor, namely enclave fd. This is further used by the
associated user space enclave process to set enclave resources and
trigger enclave termination.
The poll function is implemented in order to notify the enclave process
when an enclave exits without a specific enclave termination command
trigger e.g. when an enclave crashes.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Use the ne_devs data structure to get the refs for the NE PCI device.
v7 -> v8
* No changes.
v6 -> v7
* Use the NE misc device parent field to get the NE PCI device.
* Update the naming and add more comments to make more clear the logic
of handling full CPU cores and dedicating them to the enclave.
v5 -> v6
* Update the code base to init the ioctl function in this patch.
* Update documentation to kernel-doc format.
v4 -> v5
* Release the reference to the NE PCI device on create VM error.
* Close enclave fd on copy_to_user() failure; rename fd to enclave fd
while at it.
* Remove sanity checks for situations that shouldn't happen, only if
buggy system or broken logic at all.
* Remove log on copy_to_user() failure.
v3 -> v4
* Use dev_err instead of custom NE log pattern.
* Update the NE ioctl call to match the decoupling from the KVM API.
* Add metadata for the NUMA node for the enclave memory and CPUs.
v2 -> v3
* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Update kzfree() calls to kfree().
* Remove file ops that do nothing for now - open.
v1 -> v2
* Add log pattern for NE.
* Update goto labels to match their purpose.
* Remove the BUG_ON calls.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-9-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Nitro Enclaves driver provides an ioctl interface to the user space
for enclave lifetime management e.g. enclave creation / termination and
setting enclave resources such as memory and CPU.
This ioctl interface is mapped to a Nitro Enclaves misc device.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Use the ne_devs data structure to get the refs for the NE misc device
in the NE PCI device driver logic.
v7 -> v8
* Add define for the CID of the primary / parent VM.
* Update the NE PCI driver shutdown logic to include misc device
deregister.
v6 -> v7
* Set the NE PCI device the parent of the NE misc device to be able to
use it in the ioctl logic.
* Update the naming and add more comments to make more clear the logic
of handling full CPU cores and dedicating them to the enclave.
v5 -> v6
* Remove the ioctl to query API version.
* Update documentation to kernel-doc format.
v4 -> v5
* Update the size of the NE CPU pool string from 4096 to 512 chars.
v3 -> v4
* Use dev_err instead of custom NE log pattern.
* Remove the NE CPU pool init during kernel module loading, as the CPU
pool is now setup at runtime, via a sysfs file for the kernel
parameter.
* Add minimum enclave memory size definition.
v2 -> v3
* Remove the GPL additional wording as SPDX-License-Identifier is
already in place.
* Remove the WARN_ON calls.
* Remove linux/bug and linux/kvm_host includes that are not needed.
* Remove "ratelimited" from the logs that are not in the ioctl call
paths.
* Remove file ops that do nothing for now - open and release.
v1 -> v2
* Add log pattern for NE.
* Update goto labels to match their purpose.
* Update ne_cpu_pool data structure to include the global mutex.
* Update NE misc device mode to 0660.
* Check if the CPU siblings are included in the NE CPU pool, as full CPU
cores are given for the enclave(s).
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-8-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In addition to the replies sent by the Nitro Enclaves PCI device in
response to command requests, out-of-band enclave events can happen e.g.
an enclave crashes. In this case, the Nitro Enclaves driver needs to be
aware of the event and notify the corresponding user space process that
abstracts the enclave.
Register an MSI-X interrupt vector to be used for this kind of
out-of-band events. The interrupt notifies that the state of an enclave
changed and the driver logic scans the state of each running enclave to
identify for which this notification is intended.
Create an workqueue to handle the out-of-band events. Notify user space
enclave process that is using a polling mechanism on the enclave fd.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Use the reference to the pdev directly from the ne_pci_dev instead of
the one from the enclave data structure.
v7 -> v8
* No changes.
v6 -> v7
* No changes.
v5 -> v6
* Update documentation to kernel-doc format.
v4 -> v5
* Remove sanity checks for situations that shouldn't happen, only if
buggy system or broken logic at all.
v3 -> v4
* Use dev_err instead of custom NE log pattern.
* Return IRQ_NONE when interrupts are not handled.
v2 -> v3
* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Remove "ratelimited" from the logs that are not in the ioctl call
paths.
v1 -> v2
* Add log pattern for NE.
* Update goto labels to match their purpose.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru-Catalin Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-7-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Nitro Enclaves PCI device exposes a MMIO space that this driver
uses to submit command requests and to receive command replies e.g. for
enclave creation / termination or setting enclave resources.
Add logic for handling PCI device command requests based on the given
command type.
Register an MSI-X interrupt vector for command reply notifications to
handle this type of communication events.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* No changes.
v7 -> v8
* Update function signature for submit request and retrive reply
functions as they only returned 0, no error code.
* Include command type value in the error logs of ne_do_request().
v6 -> v7
* No changes.
v5 -> v6
* Update documentation to kernel-doc format.
v4 -> v5
* Remove sanity checks for situations that shouldn't happen, only if
buggy system or broken logic at all.
v3 -> v4
* Use dev_err instead of custom NE log pattern.
* Return IRQ_NONE when interrupts are not handled.
v2 -> v3
* Remove the WARN_ON calls.
* Update static calls sanity checks.
* Remove "ratelimited" from the logs that are not in the ioctl call
paths.
v1 -> v2
* Add log pattern for NE.
* Remove the BUG_ON calls.
* Update goto labels to match their purpose.
* Add fix for kbuild report:
https://lore.kernel.org/lkml/202004231644.xTmN4Z1z%25lkp@intel.com/
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru-Catalin Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-6-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Nitro Enclaves PCI device is used by the kernel driver as a means of
communication with the hypervisor on the host where the primary VM and
the enclaves run. It handles requests with regard to enclave lifetime.
Setup the PCI device driver and add support for MSI-X interrupts.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Init the reference to the ne_pci_dev in the ne_devs data structure.
v7 -> v8
* Add NE PCI driver shutdown logic.
v6 -> v7
* No changes.
v5 -> v6
* Update documentation to kernel-doc format.
v4 -> v5
* Remove sanity checks for situations that shouldn't happen, only if
buggy system or broken logic at all.
v3 -> v4
* Use dev_err instead of custom NE log pattern.
* Update NE PCI driver name to "nitro_enclaves".
v2 -> v3
* Remove the GPL additional wording as SPDX-License-Identifier is
already in place.
* Remove the WARN_ON calls.
* Remove linux/bug include that is not needed.
* Update static calls sanity checks.
* Remove "ratelimited" from the logs that are not in the ioctl call
paths.
* Update kzfree() calls to kfree().
v1 -> v2
* Add log pattern for NE.
* Update PCI device setup functions to receive PCI device data structure and
then get private data from it inside the functions logic.
* Remove the BUG_ON calls.
* Add teardown function for MSI-X setup.
* Update goto labels to match their purpose.
* Implement TODO for NE PCI device disable state check.
* Update function name for NE PCI device probe / remove.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru-Catalin Vasile <lexnv@amazon.com>
Signed-off-by: Alexandru Ciobotaru <alcioa@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-5-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Nitro Enclaves driver keeps an internal info per each enclave.
This is needed to be able to manage enclave resources state, enclave
notifications and have a reference of the PCI device that handles
command requests for enclave lifetime management.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Add data structure to keep references to both Nitro Enclaves misc and
PCI devices.
v7 -> v8
* No changes.
v6 -> v7
* Update the naming and add more comments to make more clear the logic
of handling full CPU cores and dedicating them to the enclave.
v5 -> v6
* Update documentation to kernel-doc format.
* Include in the enclave memory region data structure the user space
address and size for duplicate user space memory regions checks.
v4 -> v5
* Include enclave cores field in the enclave metadata.
* Update the vCPU ids data structure to be a cpumask instead of a list.
v3 -> v4
* Add NUMA node field for an enclave metadata as the enclave memory and
CPUs need to be from the same NUMA node.
v2 -> v3
* Remove the GPL additional wording as SPDX-License-Identifier is
already in place.
v1 -> v2
* Add enclave memory regions and vcpus count for enclave bookkeeping.
* Update ne_state comments to reflect NE_START_ENCLAVE ioctl naming
update.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru-Catalin Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-4-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Nitro Enclaves (NE) driver communicates with a new PCI device, that
is exposed to a virtual machine (VM) and handles commands meant for
handling enclaves lifetime e.g. creation, termination, setting memory
regions. The communication with the PCI device is handled using a MMIO
space and MSI-X interrupts.
This device communicates with the hypervisor on the host, where the VM
that spawned the enclave itself runs, e.g. to launch a VM that is used
for the enclave.
Define the MMIO space of the NE PCI device, the commands that are
provided by this device. Add an internal data structure used as private
data for the PCI device driver and the function for the PCI device
command requests handling.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* Fix indent for the NE PCI device command types enum.
v7 -> v8
* No changes.
v6 -> v7
* Update the documentation to include references to the NE PCI device id
and MMIO bar.
v5 -> v6
* Update documentation to kernel-doc format.
v4 -> v5
* Add a TODO for including flags in the request to the NE PCI device to
set a memory region for an enclave. It is not used for now.
v3 -> v4
* Remove the "packed" attribute and include padding in the NE data
structures.
v2 -> v3
* Remove the GPL additional wording as SPDX-License-Identifier is
already in place.
v1 -> v2
* Update path naming to drivers/virt/nitro_enclaves.
* Update NE_ENABLE_OFF / NE_ENABLE_ON defines.
Reviewed-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Alexandru-Catalin Vasile <lexnv@amazon.com>
Signed-off-by: Alexandru Ciobotaru <alcioa@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-3-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Nitro Enclaves driver handles the enclave lifetime management. This
includes enclave creation, termination and setting up its resources such
as memory and CPU.
An enclave runs alongside the VM that spawned it. It is abstracted as a
process running in the VM that launched it. The process interacts with
the NE driver, that exposes an ioctl interface for creating an enclave
and setting up its resources.
Changelog
v9 -> v10
* Update commit message to include the changelog before the SoB tag(s).
v8 -> v9
* No changes.
v7 -> v8
* Add NE custom error codes for user space memory regions not backed by
pages multiple of 2 MiB, invalid flags and enclave CID.
* Add max flag value for enclave image load info.
v6 -> v7
* Clarify in the ioctls documentation that the return value is -1 and
errno is set on failure.
* Update the error code value for NE_ERR_INVALID_MEM_REGION_SIZE as it
gets in user space as value 25 (ENOTTY) instead of 515. Update the
NE custom error codes values range to not be the same as the ones
defined in include/linux/errno.h, although these are not propagated
to user space.
v5 -> v6
* Fix typo in the description about the NE CPU pool.
* Update documentation to kernel-doc format.
* Remove the ioctl to query API version.
v4 -> v5
* Add more details about the ioctl calls usage e.g. error codes, file
descriptors used.
* Update the ioctl to set an enclave vCPU to not return a file
descriptor.
* Add specific NE error codes.
v3 -> v4
* Decouple NE ioctl interface from KVM API.
* Add NE API version and the corresponding ioctl call.
* Add enclave / image load flags options.
v2 -> v3
* Remove the GPL additional wording as SPDX-License-Identifier is
already in place.
v1 -> v2
* Add ioctl for getting enclave image load metadata.
* Update NE_ENCLAVE_START ioctl name to NE_START_ENCLAVE.
* Add entry in Documentation/userspace-api/ioctl/ioctl-number.rst for NE
ioctls.
* Update NE ioctls definition based on the updated ioctl range for major
and minor.
Reviewed-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexandru Vasile <lexnv@amazon.com>
Signed-off-by: Andra Paraschiv <andraprs@amazon.com>
Link: https://lore.kernel.org/r/20200921121732.44291-2-andraprs@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Common pattern of handling deferred probe can be simplified with
dev_err_probe(). Less code and the error value gets printed.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200902172433.1138-2-krzk@kernel.org
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Common pattern of handling deferred probe can be simplified with
dev_err_probe(). Less code and the error value gets printed.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200902172433.1138-1-krzk@kernel.org
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
* icc-syncstate:
interconnect: Add get_bw() callback
interconnect: Add sync state support
interconnect: qcom: Use icc_sync_state
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Lowering the bandwidth on the bus might have negative consequences if
it's done before all consumers had a chance to cast their vote. Now by
default the framework sets the bandwidth to maximum during boot. We need
to use the icc_sync_state callback to notify the framework when all
consumers are probed and there is no need to keep the bandwidth set to
maximum anymore.
Link: https://lore.kernel.org/r/20200825170152.6434-4-georgi.djakov@linaro.org
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
The bootloaders often do some initial configuration of the interconnects
in the system and we want to keep this configuration until all consumers
have probed and expressed their bandwidth needs. This is because we don't
want to change the configuration by starting to disable unused paths until
every user had a chance to request the amount of bandwidth it needs.
To accomplish this we will implement an interconnect specific sync_state
callback which will synchronize (aggregate and set) the current bandwidth
settings when all consumers have been probed.
Link: https://lore.kernel.org/r/20200825170152.6434-3-georgi.djakov@linaro.org
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
The interconnect controller hardware may support querying the current
bandwidth settings, so add a callback for providers to implement this
functionality if supported.
Link: https://lore.kernel.org/r/20200825170152.6434-2-georgi.djakov@linaro.org
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>