This removes the 'write' and 'force' from get_user_pages() and replaces
them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers
as use of this flag can result in surprising behaviour (and hence bugs)
within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Replace explicit computation of vma page count by a call to
vma_pages()
Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When building with W=1, the __scif_rma_destroy_tcw function
causes a harmless warning about an argument variable that is
modified but not used:
drivers/misc/mic/scif/scif_dma.c: In function ‘__scif_rma_destroy_tcw’:
drivers/misc/mic/scif/scif_dma.c:118:27: error: parameter ‘ep’ set but not used [-Werror=unused-but-set-parameter]
In this case, we can just remove the argument, since all callers
are in the same file.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- New vsock device support in host and guest
- Platform IOMMU support in host and guest,
including compatibility quirks for legacy systems.
- Misc fixes and cleanups.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXofvbAAoJECgfDbjSjVRpUTIH/iEoK9h636tBayXy0PXkPby0
6fMaRFy6H1HgEttgDhJE8Pqg/ba3qaW9Em0fHyFq7Mp2waFHAZ8hAT8phC6TAK3c
CIBnfzyyuI8u3N9SnNOfelPVcwCBfuALuuTsXB/rwKbYQEVv+U5Rdt3Vyx9+lXkj
P005klz7PfqxFhQrrnj4Eh7VawtHwmMuLH8YoWpCZpM71dHPo6eL+3ftKwhH2boo
qK86uVprwba03Pewpm13vQnotemfVfUUkjXd4EJpG3dx7E0KZosuj0ZG9OV8mPGQ
Cl2gBdUhocdJgeUnAHmf6tumYi9KFlYfy6xLy44YMmN7FL3E9nQjaKZp25UKfiM=
=ztIm
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost updates from Michael Tsirkin:
- new vsock device support in host and guest
- platform IOMMU support in host and guest, including compatibility
quirks for legacy systems.
- misc fixes and cleanups.
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
VSOCK: Use kvfree()
vhost: split out vringh Kconfig
vhost: detect 32 bit integer wrap around
vhost: new device IOTLB API
vhost: drop vringh dependency
vhost: convert pre sorted vhost memory array to interval tree
vhost: introduce vhost memory accessors
VSOCK: Add Makefile and Kconfig
VSOCK: Introduce vhost_vsock.ko
VSOCK: Introduce virtio_transport.ko
VSOCK: Introduce virtio_vsock_common.ko
VSOCK: defer sock removal to transports
VSOCK: transport-specific vsock_transport functions
vhost: drop vringh dependency
vop: pull in vhost Kconfig
virtio: new feature to detect IOMMU device quirk
balloon: check the number of available pages in leak balloon
vhost: lockless enqueuing
vhost: simplify work flushing
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer. Thus the pointer can point to const data.
However the attributes do not have to be a bitfield. Instead unsigned
long will do fine:
1. This is just simpler. Both in terms of reading the code and setting
attributes. Instead of initializing local attributes on the stack
and passing pointer to it to dma_set_attr(), just set the bits.
2. It brings safeness and checking for const correctness because the
attributes are passed by value.
Semantic patches for this change (at least most of them):
virtual patch
virtual context
@r@
identifier f, attrs;
@@
f(...,
- struct dma_attrs *attrs
+ unsigned long attrs
, ...)
{
...
}
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
and
// Options: --all-includes
virtual patch
virtual context
@r@
identifier f, attrs;
type t;
@@
t f(..., struct dma_attrs *attrs);
@@
identifier r.f;
@@
f(...,
- NULL
+ 0
)
Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vringh is pulled in by caif and mic, but the other
vhost config does not need to be there.
In particular, it makes no sense to have vhost net/scsi/sock
under caif/mic.
Create a separate Kconfig file and put vringh bits there.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Return statements at the end of void functions are useless.
The Coccinelle semantic patch used to make this change is as follows:
//<smpl>
@@
identifier f;
expression e;
@@
void f(...) {
<...
- return
e;
...>
}
//</smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
My static checker complains that we still use "mark" even when the
_scif_fence_mark() call fails so it can be uninitialized.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The MIC VOP driver does two successive reads from user space to read a
variable length data structure. Kernel memory corruption can result if
the data structure changes between the two reads. This patch disallows
the chance of this happening.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=116651
Reported by: Pengfei Wang <wpengfeinudt@gmail.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull x86 protection key support from Ingo Molnar:
"This tree adds support for a new memory protection hardware feature
that is available in upcoming Intel CPUs: 'protection keys' (pkeys).
There's a background article at LWN.net:
https://lwn.net/Articles/643797/
The gist is that protection keys allow the encoding of
user-controllable permission masks in the pte. So instead of having a
fixed protection mask in the pte (which needs a system call to change
and works on a per page basis), the user can map a (handful of)
protection mask variants and can change the masks runtime relatively
cheaply, without having to change every single page in the affected
virtual memory range.
This allows the dynamic switching of the protection bits of large
amounts of virtual memory, via user-space instructions. It also
allows more precise control of MMU permission bits: for example the
executable bit is separate from the read bit (see more about that
below).
This tree adds the MM infrastructure and low level x86 glue needed for
that, plus it adds a high level API to make use of protection keys -
if a user-space application calls:
mmap(..., PROT_EXEC);
or
mprotect(ptr, sz, PROT_EXEC);
(note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
this special case, and will set a special protection key on this
memory range. It also sets the appropriate bits in the Protection
Keys User Rights (PKRU) register so that the memory becomes unreadable
and unwritable.
So using protection keys the kernel is able to implement 'true'
PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
PROT_READ as well. Unreadable executable mappings have security
advantages: they cannot be read via information leaks to figure out
ASLR details, nor can they be scanned for ROP gadgets - and they
cannot be used by exploits for data purposes either.
We know about no user-space code that relies on pure PROT_EXEC
mappings today, but binary loaders could start making use of this new
feature to map binaries and libraries in a more secure fashion.
There is other pending pkeys work that offers more high level system
call APIs to manage protection keys - but those are not part of this
pull request.
Right now there's a Kconfig that controls this feature
(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
(like most x86 CPU feature enablement code that has no runtime
overhead), but it's not user-configurable at the moment. If there's
any serious problem with this then we can make it configurable and/or
flip the default"
* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
mm/core, x86/mm/pkeys: Add execute-only protection keys support
x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
x86/mm/pkeys: Allow kernel to modify user pkey rights register
x86/fpu: Allow setting of XSAVE state
x86/mm: Factor out LDT init from context init
mm/core, x86/mm/pkeys: Add arch_validate_pkey()
mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
x86/mm/pkeys: Add Kconfig prompt to existing config option
x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
x86/mm/pkeys: Dump PKRU with other kernel registers
mm/core, x86/mm/pkeys: Differentiate instruction fetches
x86/mm/pkeys: Optimize fault handling in access_error()
mm/core: Do not enforce PKEY permissions on remote mm access
um, pkeys: Add UML arch_*_access_permitted() methods
mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
x86/mm/gup: Simplify get_user_pages() PTE bit handling
...
We will soon modify the vanilla get_user_pages() so it can no
longer be used on mm/tasks other than 'current/current->mm',
which is by far the most common way it is called. For now,
we allow the old-style calls, but warn when they are used.
(implemented in previous patch)
This patch switches all callers of:
get_user_pages()
get_user_pages_unlocked()
get_user_pages_locked()
to stop passing tsk/mm so they will no longer see the warnings.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: jack@suse.cz
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160212210156.113E9407@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Static checkers complain that the this is a potential array overflow.
We verify that it's not on the next line so this code is OK, but
static checker warnings are annoying.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Swap the printk and kfree() to avoid a use after free bug.
Fixes: 61e9c905df ('misc: mic: Enable VOP host side functionality')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch modifies the MIC host and card drivers to start using the
VOP driver. The MIC host and card drivers now implement the VOP bus
operations and register a VOP device on the VOP bus. MIC driver stack
documentation is also updated to include the new VOP driver.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch moves the virtio specific debugfs hooks previously in
mic_debugfs.c in the MIC host driver into the VOP driver. The
Kconfig/Makefile is also updated to allow building the VOP driver.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch moves virtio functionality from the MIC card driver into a
separate hardware independent Virtio Over PCIe (VOP) driver. This
functionality was introduced in commit 2141c7c5ee ("Intel MIC Card
Driver Changes for Virtio Devices.") in
drivers/misc/mic/card/mic_virtio.c. Apart from being moved into a
separate driver the functionality is essentially unchanged. See the
above mentioned commit for a description of this functionality.
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch moves virtio functionality from the MIC host driver into a
separate hardware independent Virtio Over PCIe (VOP) driver. This
functionality was introduced in commit f69bcbf3b4 ("Intel MIC Host
Driver Changes for Virtio Devices.") in
drivers/misc/mic/host/mic_virtio.c. Apart from being moved into a
separate driver the functionality is essentially unchanged. See the
above mentioned commit for a description of this functionality.
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds VOP driver data structures used in subsequent
patches. These data structures are refactored from similar data
structures used in the virtio parts of previous MIC host and card
drivers.
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Virtio Over PCIe (VOP) bus abstracts the low level hardware
details like interrupts and mapping remote memory so that the same VOP
driver can work without changes with different MIC host or card
drivers as long as the hardware bus operations are implemented. The
VOP driver registers itself on the VOP bus. The base PCIe drivers
implement the bus ops and register VOP devices on the bus, resulting
in the VOP driver being probed with the VOP devices. This allows the
VOP functionality to be shared between multiple generations of Intel
MIC products.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch deletes the virtio functionality from the MIC X100 card
driver. A subsequent patch will re-enable this functionality by
consolidating the hardware independent logic in a new Virtio over PCIe
(VOP) driver.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch deletes the virtio functionality from the MIC X100 host
driver. A subsequent patch will re-enable this functionality by
consolidating the hardware independent logic in a new Virtio over PCIe
(VOP) driver.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of calling release_firmware() on every error and then jumping
lets have a common release_firmware() in the error path.
This patch also fixes a memory leak where we missed release_firmware()
if mic_x100_load_command_line() fails.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of jumping to a label and then returning from there lets return
directly.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If request_firmware() succeeds then rc becomes 0. After that if the test
for strcmp() fails then we were jumping to label done: and returning rc.
But rc being 0 we returned success whereas we have failed here and we
were supposed to return an error.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>From the error path we are printing an error message with dev_err(). No
need to print almost same message with dev_dbg().
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After the loop we test "if (!retry)" to see if we timedout. The problem
is "retry--" is a post-op so retry will be -1 at the end of the loop. I
have fixed this by changing it to a pre-op instead.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes the following crash seen when MIC reset is invoked in
RESET_FAILED state due to device_del being called a second time on an
already deleted device:
[<ffffffff813b2295>] device_del+0x45/0x1d0
[<ffffffff813b243e>] device_unregister+0x1e/0x60
[<ffffffffa040f1c2>] scif_unregister_device+0x12/0x20 [scif_bus]
[<ffffffffa042f75a>] cosm_stop+0xaa/0xe0 [mic_cosm]
[<ffffffffa042f844>] cosm_reset_trigger_work+0x14/0x20 [mic_cosm]
The fix consists in realizing that because cosm_reset changes the
state to MIC_RESETTING, cosm_stop needs the previous state, before it
changed to MIC_RESETTING, to decide whether a hw_ops->stop had
previously been issued. This is now provided in a new cosm_device
member cdev->prev_state.
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The error code passed to ERR_PTR() always should be negated. Also, the
return value of scif_add_mmu_notifier() was never checked.
Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
list_next_entry has been defined in list.h, so I replace list_entry_next
with it.
Signed-off-by: Geliang Tang <geliangtang@163.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed integer overflow is undefined. Also I added a check for
"(offset < 0)" in scif_unregister() because that makes it match the
other conditions and because I didn't want to subtract a negative.
Fixes: ba612aa8b4 ('misc: mic: SCIF memory registration and unregistration')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
checkpatch.pl wants arrays of strings declared as follows:
static const char * const names[] = { "vq-1", "vq-2", "vq-3" };
Currently the find_vqs() function takes a const char *names[] argument
so passing checkpatch.pl's const char * const names[] results in a
compiler error due to losing the second const.
This patch adjusts the find_vqs() prototype and updates all virtio
transports. This makes it possible for virtio_balloon.c, virtio_input.c,
virtgpu_kms.c, and virtio_rpmsg_bus.c to use the checkpatch.pl-friendly
type.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
We should be returning -ENOMEM here instead of success.
Fixes: ba612aa8b4 ('misc: mic: SCIF memory registration and unregistration')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The caller expects that we take this lock again before returning
otherwise it you get double unlocks and races.
Fixes: ba612aa8b4 ('misc: mic: SCIF memory registration and unregistration')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In scif_node_connect() we were returning if the initialization of p2p_ji
fails. But at that time p2p_ij has already been initialized and
resources allocated for it. And since p2p_ij is not added to the list
till now so we will have a leak.
Lets deinitialize and release the resources connected to p2p_ij.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Handle a failed device_register(), replace kfree() with put_device(),
which will call cosm/mbus/scif_release_dev().
Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SCIF depends on IOVA which requires IOMMU_SUPPORT to be enabled.
The long term fix is to move IOVA from drivers/iommu to lib/
but this current patch should fix the reported issue.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds the SCIF kernel node QP control messages required to
enable SCIF RMAs. Examples of such node QP control messages include
registration, unregistration, remote memory allocation requests,
remote memory unmap and SCIF remote fence requests.
The patch also updates the SCIF driver with minor changes required to
enable SCIF RMAs by adding the new files to the build, initializing
RMA specific information during SCIF endpoint creation, reserving SCIF
DMA channels, initializing SCIF RMA specific global data structures,
adding the IOCTL hooks required for SCIF RMAs and updating RMA
specific debugfs hooks.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch implements the fence APIs required to synchronize
DMAs. SCIF provides an interface to return a "mark" for all DMAs
programmed at the instant the API was called. Users can then "wait" on
the mark provided previously by blocking inside the kernel. Upon
receipt of a DMA completion interrupt the waiting thread is woken
up. There is also an interface to signal DMA completion by polling for
a location to be updated via a "signal" cookie to avoid the interrupt
overhead in the mark/wait interface. SCIF allows programming fences on
both the local and the remote node for both the mark/wait or the fence
signal APIs.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SCIF allows users to read from or write to registered remote memory
via CPU copies or DMA. The API verifies that both local and remote
windows are valid before initiating the CPU or DMA transfers. SCIF has
optimized algorithms for handling byte aligned as well as cache line
aligned DMA engines. A registration cache is maintained to avoid the
overhead of pinning pages repeatedly if buffers are reused. The
registration cache is invalidated upon receipt of MMU notifier
callbacks. SCIF windows are destroyed and the pages are unpinned only
once all prior DMAs initiated using that window are drained. Users can
request synchronous DMA operations as well as tail byte ordering if
required. CPU copies are always performed synchronously.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch implements the SCIF mmap/munmap interface. A similar
capability is provided to kernel clients via the
scif_get_pages()/scif_put_pages() APIs. The SCIF mmap interface
queries to check if a window is valid and then remaps the local
virtual address to the remote physical pages. These mappings are
subsequently destroyed upon receipt of the VMA close operation or
scif_get_pages(). This functionality allows SCIF users to directly
access remote memory without any driver interaction once the mappings
are created thereby providing bare-metal PCIe latency. These mappings
are zapped to avoid RMA accesses from user space, if a Coprocessor is
reset.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds the implementation for operations performed on the
list of SCIF windows. Examples of such operations includes adding the
windows to the list of registered (or cached) windows, querying the
list of self or remote windows and unregistering windows. The query
operation is used by SCIF APIs which initiate DMAs, CPU copies or
fences to ensure that a window remains valid during a transfer.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch implements the SCIF APIs required to pin and unpin
pages. SCIF registration locks down the pages. It then sends a remote
window allocation request to the peer. Once the peer has allocated
memory, the local SCIF endpoint copies the pinned page information to
the peer and notifies the peer once the copy has complete. The peer
upon receipt of the registration notification adds the new remote
window to its list. At this point the window page information is
available on both self and remote nodes so that they can start
performing SCIF DMAs, CPU copies and fences. The unregistration API
tears down the registration at both self and remote nodes.
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds the internal data structures required to perform SCIF
RMAs. The data structures required to maintain per SCIF endpoint, RMA
information are contained in scif_endpt_rma_info. scif_pinned_pages
describes a set of SCIF pinned pages maintained locally. The
scif_window is a data structure which contains all the fields required
to describe a SCIF registered window on self and remote nodes. It
contains an offset which is used as a key to perform SCIF DMAs and CPU
copies between self and remote registered windows.
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch updates the MIC host daemon to work with corresponding
changes in COSM. Other MIC daemon fixes, cleanups and enhancements as
are also rolled into this patch. Changes to MIC sysfs ABI which go
into effect with this patch are also documented.
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since card side COSM functionality, to trigger MIC device shutdowns
and communicate shutdown status to the host, is now moved into a
separate COSM client driver, this patch removes this functionality
from the base MIC card driver. The mic_bus driver is also updated to
use the device index provided by COSM rather than maintain its own
device index.
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since COSM functionality is now moved into a separate COSM driver
drivers, this patch removes this functionality from the base MIC host
driver. The MIC host driver now implements cosm_hw_ops and registers a
COSM device which allows the COSM driver to trigger
boot/shutdown/reset of the MIC devices via the cosm_hw_ops.
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The COSM client driver running on the MIC cards is implemented as a
kernel mode SCIF client. It responds to a "shutdown" message from the
host by triggering a card shutdown and also communicates the shutdown
or reboot status back the host. It is also responsible for syncing the
card time to that of the host. Because SCIF messaging cannot be used
in a panic context, the COSM client driver also periodically sends a
heartbeat SCIF message to the host thereby enabling the host to detect
card crashes.
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Reviewed-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>