linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-20 17:07:17 +07:00

Author	SHA1	Message	Date
Matan Barak	64b19e1323	IB/core: Export ioctl enum types to user-space Add a new ib_user_ioctl_verbs.h which exports all required ABI enums and structs to the user-space. Export the default types to user-space through this file. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-31 08:35:12 -04:00
Matan Barak	fac9658cab	IB/core: Add new ioctl interface In this ioctl interface, processing the command starts from properties of the command and fetching the appropriate user objects before calling the handler. Parsing and validation is done according to a specifier declared by the driver's code. In the driver, all supported objects are declared. These objects are separated to different object namepsaces. Dividing objects to namespaces is done at initialization by using the higher bits of the object ids. This initialization can mix objects declared in different places to one parsing tree using in this ioctl interface. For each object we list all supported methods. Similarly to objects, methods are separated to method namespaces too. Namespacing is done similarly to the objects case. This could be used in order to add methods to an existing object. Each method has a specific handler, which could be either a default handler or a driver specific handler. Along with the handler, a bunch of attributes are specified as well. Similarly to objects and method, attributes are namespaced and hashed by their ids at initialization too. All supported attributes are subject to automatic fetching and validation. These attributes include the command, response and the method's related objects' ids. When these entities (objects, methods and attributes) are used, the high bits of the entities ids are used in order to calculate the hash bucket index. Then, these high bits are masked out in order to have a zero based index. Since we use these high bits for both bucketing and namespacing, we get a compact representation and O(1) array access. This is mandatory for efficient dispatching. Each attribute has a type (PTR_IN, PTR_OUT, IDR and FD) and a length. Attributes could be validated through some attributes, like: () Minimum size / Exact size () Fops for FD () Object type for IDR If an IDR/fd attribute is specified, the kernel also states the object type and the required access (NEW, WRITE, READ or DESTROY). All uobject/fd management is done automatically by the infrastructure, meaning - the infrastructure will fail concurrent commands that at least one of them requires concurrent access (WRITE/DESTROY), synchronize actions with device removals (dissociate context events) and take care of reference counting (increase/decrease) for concurrent actions invocation. The reference counts on the actual kernel objects shall be handled by the handlers. objects +--------+ \| \| \| \| methods +--------+ \| \| ns method method_spec +-----+ \|len \| +--------+ +------+[d]+-------+ +----------------+[d]+------------+ \|attr1+-> \|type \| \| object +> \|method+-> \| spec +-> + attr_buckets +-> \|default_chain+--> +-----+ \|idr_type\| +--------+ +------+ \|handler\| \| \| +------------+ \|attr2\| \|access \| \| \| \| \| +-------+ +----------------+ \|driver chain\| +-----+ +--------+ \| \| \| \| +------------+ \| \| +------+ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| +--------+ [d] = Hash ids to groups using the high order bits The right types table is also chosen by using the high bits from the ids. Currently we have either default or driver specific groups. Once validation and object fetching (or creation) completed, we call the handler: int (handler)(struct ib_device ib_dev, struct ib_uverbs_file ufile, struct uverbs_attr_bundle *ctx); ctx bundles attributes of different namespaces. Each element there is an array of attributes which corresponds to one namespaces of attributes. For example, in the usually used case: ctx core +----------------------------+ +------------+ \| core: +---> \| valid \| +----------------------------+ \| cmd_attr \| \| driver: \| +------------+ \|----------------------------+--+ \| valid \| \| \| cmd_attr \| \| +------------+ \| \| valid \| \| \| obj_attr \| \| +------------+ \| \| drivers \| +------------+ +> \| valid \| \| cmd_attr \| +------------+ \| valid \| \| cmd_attr \| +------------+ \| valid \| \| obj_attr \| +------------+ Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-31 08:35:09 -04:00
Aditya Sarwade	72f9b089ec	RDMA/vmw_pvrdma: Report network header type in WC We should report the network header type in the work completion so that the kernel can infer the right RoCE type headers. Reviewed-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Aditya Sarwade <asarwade@vmware.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-31 08:35:08 -04:00
Paul Mackerras	e3bfed1df3	KVM: PPC: Book3S HV: Report storage key support to userspace This adds information about storage keys to the struct returned by the KVM_PPC_GET_SMMU_INFO ioctl. The new fields replace a pad field, which was zeroed by previous kernel versions. Thus userspace that knows about the new fields will see zeroes when running on an older kernel, indicating that storage keys are not supported. The size of the structure has not changed. The number of keys is hard-coded for the CPUs supported by HV KVM, which is just POWER7, POWER8 and POWER9. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2017-08-31 12:36:44 +10:00
Subash Abhinov Kasiviswanathan	cdf4969c42	net: arp: Add support for raw IP device Define the raw IP type. This is needed for raw IP net devices like rmnet. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-30 11:41:13 -07:00
Subash Abhinov Kasiviswanathan	7373ae7e8f	net: ether: Add support for multiplexing and aggregation type Define the Qualcomm multiplexing and aggregation (MAP) ether type 0x00F9. This is needed for receiving data in the MAP protocol like RMNET. This is not an officially registered ID. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-30 11:41:13 -07:00
Florian Westphal	31770e34e4	tcp: Revert "tcp: remove header prediction" This reverts commit `45f119bf93`. Eric Dumazet says: We found at Google a significant regression caused by `45f119bf93` tcp: remove header prediction In typical RPC (TCP_RR), when a TCP socket receives data, we now call tcp_ack() while we used to not call it. This touches enough cache lines to cause a slowdown. so problem does not seem to be HP removal itself but the tcp_ack() call. Therefore, it might be possible to remove HP after all, provided one finds a way to elide tcp_ack for most cases. Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-30 11:20:09 -07:00
Jiri Benc	155e6f6497	ether: add NSH ethertype The NSH draft says: An IEEE EtherType, 0x894F, has been allocated for NSH. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 15:16:52 -07:00
Alexander Aring	2804fd3af6	if_ether: add forces ife lfb type This patch adds the forces IFE lfb type according to IEEE registered ethertypes. See http://standards-oui.ieee.org/ethertype/eth.txt for more information. Since there exists the IFE subsystem it can be used there. This patch also use the correct word "ForCES" instead of "FoRCES" which is a spelling error inside the IEEE ethertype specification. Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 15:14:18 -07:00
Kan Liang	fc7ce9c74c	perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR For understanding how the workload maps to memory channels and hardware behavior, it's very important to collect address maps with physical addresses. For example, 3D XPoint access can only be found by filtering the physical address. Add a new sample type for physical address. perf already has a facility to collect data virtual address. This patch introduces a function to convert the virtual address to physical address. The function is quite generic and can be extended to any architecture as long as a virtual address is provided. - For kernel direct mapping addresses, virt_to_phys is used to convert the virtual addresses to physical address. - For user virtual addresses, __get_user_pages_fast is used to walk the pages tables for user physical address. - This does not work for vmalloc addresses right now. These are not resolved, but code to do that could be added. The new sample type requires collecting the virtual address. The virtual address will not be output unless SAMPLE_ADDR is applied. For security, the physical address can only be exposed to root or privileged user. Tested-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: mpe@ellerman.id.au Link: http://lkml.kernel.org/r/1503967969-48278-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-29 15:09:25 +02:00
Ingo Molnar	e0563e0495	Merge branch 'perf/urgent' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-29 15:09:03 +02:00
Artemy Kovalyov	8d50505ada	IB/uverbs: Expose XRQ capabilities Make XRQ capabilities available via ibv_query_device() verb. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-29 08:30:18 -04:00
Artemy Kovalyov	9382d4e1d3	IB/uverbs: Add XRQ creation parameter to UAPI Add tm_list_size parameter to struct ib_uverbs_create_xsrq. If SRQ type is tag-matching this field defines maximum size of tag matching list. Otherwise, it is expected to be zero. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-29 08:30:17 -04:00
Dave Airlie	7846b12fe0	Merge branch 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux into drm-next vmwgfx add fence fd support. * 'drm-vmwgfx-next' of git://people.freedesktop.org/~syeh/repos_linux: drm/vmwgfx: Bump the version for fence FD support drm/vmwgfx: Add export fence to file descriptor support drm/vmwgfx: Add support for imported Fence File Descriptor drm/vmwgfx: Prepare to support fence fd drm/vmwgfx: Fix incorrect command header offset at restart drm/vmwgfx: Support the NOP_ERROR command drm/vmwgfx: Restart command buffers after errors drm/vmwgfx: Move irq bottom half processing to threads drm/vmwgfx: Don't use drm_irq_[un]install	2017-08-29 10:38:14 +10:00
Dave Airlie	095e2d04f9	Merge tag 'drm-misc-next-fixes-2017-08-28' of git://anongit.freedesktop.org/git/drm-misc into drm-next UAPI Changes: - Rename u32 to __u32 in struct drm_format_modifier_blob (Lionel) Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> * tag 'drm-misc-next-fixes-2017-08-28' of git://anongit.freedesktop.org/git/drm-misc: drm: rename u32 in __u32 in uapi	2017-08-29 10:36:06 +10:00
Jason Ekstrand	ffa9443fb3	drm/syncobj: Add a signal ioctl (v3) This IOCTL provides a mechanism for userspace to trigger a sync object directly. There are other ways that userspace can trigger a syncobj such as submitting a dummy batch somewhere or hanging on to a triggered sync_file and doing an import. This just provides an easy way to manually trigger the sync object without weird hacks. The motivation for this IOCTL is Vulkan fences. Vulkan lets you create a fence already in the signaled state so that you can wait on it immediatly without stalling. We could also handle this with a new create flag to ask the driver to create a syncobj that is already signaled but the IOCTL seemed a bit cleaner and more generic. v2: - Take an array of sync objects (Dave Airlie) v3: - Throw -EINVAL if pad != 0 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-29 10:16:25 +10:00
Jason Ekstrand	aa4035d2c7	drm/syncobj: Add a reset ioctl (v3) This just resets the dma_fence to NULL so it looks like it's never been signaled. This will be useful once we add the new wait API for allowing wait on "submit and signal" behavior. v2: - Take an array of sync objects (Dave Airlie) v3: - Throw -EINVAL if pad != 0 Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-29 10:16:19 +10:00
Jason Ekstrand	e7aca5031a	drm/syncobj: Allow wait for submit and signal behavior (v5) Vulkan VkFence semantics require that the application be able to perform a CPU wait on work which may not yet have been submitted. This is perfectly safe because the CPU wait has a timeout which will get triggered eventually if no work is ever submitted. This behavior is advantageous for multi-threaded workloads because, so long as all of the threads agree on what fences to use up-front, you don't have the extra cross-thread synchronization cost of thread A telling thread B that it has submitted its dependent work and thread B is now free to wait. Within a single process, this can be implemented in the userspace driver by doing exactly the same kind of tracking the app would have to do using posix condition variables or similar. However, in order for this to work cross-process (as is required by VK_KHR_external_fence), we need to handle this in the kernel. This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which instructs the IOCTL to wait for the syncobj to have a non-null fence and then wait on the fence. Combined with DRM_IOCTL_SYNCOBJ_RESET, you can easily get the Vulkan behavior. v2: - Fix a bug in the invalid syncobj error path - Unify the wait-all and wait-any cases v3: - Unify the timeout == 0 case a bit with the timeout > 0 case - Use wait_event_interruptible_timeout v4: - Use proxy fence v5: - Revert to a combination of v2 and v3 - Don't use proxy fences - Don't use wait_event_interruptible_timeout because it just adds an extra layer of callbacks Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Dave Airlie <airlied@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-29 06:28:17 +10:00
Jason Ekstrand	1fc08218ed	drm/syncobj: Add a CREATE_SIGNALED flag This requests that the driver create the sync object such that it already has a signaled dma_fence attached. Because we don't need anything in particular (just something signaled), we use a dummy null fence. This is useful for Vulkan which has a similar flag that can be passed to vkCreateFence. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-29 06:27:41 +10:00
Dave Airlie	5e60a10eae	drm/syncobj: add sync obj wait interface. (v8) This interface will allow sync object to be used to back Vulkan fences. This API is pretty much the vulkan fence waiting API, and I've ported the code from amdgpu. v2: accept relative timeout, pass remaining time back to userspace. v3: return to absolute timeouts. v4: absolute zero = poll, rewrite any/all code to have same operation for arrays return -EINVAL for 0 fences. v4.1: fixup fences allocation check, use u64_to_user_ptr v5: move to sec/nsec, and use timespec64 for calcs. v6: use -ETIME and drop the out status flag. (-ETIME is suggested by ickle, I can feel a shed painting) v7: talked to Daniel/Arnd, use ktime and ns everywhere. v8: be more careful in the timeout calculations use uint32_t for counter variables so we don't overflow graciously handle -ENOINT being returned from dma_fence_wait_timeout Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-08-29 06:26:32 +10:00
Jens Axboe	cd996fb47c	Linux 4.13-rc7 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZo2HiAAoJEHm+PkMAQRiG3OcIAJqSeVK2uQ/QhmqFN1ExYay4 bdTjSTtSk7GH6PxI2C0cqfZvsxOUU7ICDHG8bYM1LA0S0SxfOtoFhHGKc/BcFLX8 MiKJWlF51ZbX0mkIEpKF+C8pRrXPgSqtk3N450/k2BzG9qCZSM93A2NCOB7v9T9w XOBUIYHqfTS2tdmCinjwu8Ls+w8oPOGH1gLjxZyGnBlg4lTqHMcUufmHeVEAh11d giGByqqqXH69kGD1HNC7H6quzXN9rz4n0gEwEG0mIhfkJ98b+ESSWwSEXXypOAQD QT5/6+2YizXf5DPCqR46xasQCPjRsS6Sv0cF2cntW2PEAb4jBjhx5gTFlJcoOC8= =efWJ -----END PGP SIGNATURE----- Merge tag 'v4.13-rc7' into for-4.14/block-postmerge Linux 4.13-rc7 Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-28 13:00:44 -06:00
Sean Wang	1c16ae65e2	serial: 8250: of: Add new port type for MediaTek BTIF controller on MT7622/23 SoC MediaTek BTIF controller is the serial interface similar to UART but it works only as the digital device which is mainly used to communicate with the connectivity module called CONNSYS inside the SoC which could be mostly found on those MediaTek SoCs with Bluetooth feature such as MT7622 and MT7623 SoCs. And the controller is made as being compatible with the 8250 register layout with extra registers such as DMA enablement so it tends to be integrated with reusing 8250 OF driver. However, DMA mode is not being supported yet in the current driver. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-28 20:51:22 +02:00
Andy Shevchenko	ee1c90cc2c	serial: Fix port type numbering for TI DA8xx The UAPI has a global list of unique numbers for different port types. The commit `a2d6a987bf` ("serial: 8250: Add new port type for TI DA8xx/66AK2x") introduced a new port type and brought the collision with two other port types. Reuse 95 for it instead. Fixes: `a2d6a987bf` ("serial: 8250: Add new port type for TI DA8xx/66AK2x") Cc: David Lechner <david@lechnology.com> Cc: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-28 20:51:20 +02:00
Andy Shevchenko	3f3dac7e4d	serial: Remove unused port type PORT_MFD is not in use since commit `1bd187de53` ("x86, intel-mid: remove Intel MID specific serial support") Remove leftover. Fixes: `1bd187de53` ("x86, intel-mid: remove Intel MID specific serial support") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-28 20:51:20 +02:00
Andy Shevchenko	63e8d4394a	serial: pch_uart: Make port type explicit It used to be a gap in port definitions after PORT_MAX_8250. Since the new drivers are coming the gap become shorter and shorter until the commit `a2d6a987bf` ("serial: 8250: Add new port type for TI DA8xx/66AK2x") completely removed it. So, while type here is just a formality, make things a little bit more explicit for this driver and move port types to UAPI header. Note, it uses two types for now. Fixes: `fddceb8b53` ("tty: 8250: Add 64byte UART support for FSL platforms") Cc: Priyanka Jain <Priyanka.Jain@freescale.com> Cc: Poonam Aggrwal <poonam.aggrwal@freescale.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-28 20:51:20 +02:00
John Fastabend	2f857d0460	bpf: sockmap, remove STRPARSER map_flags and add multi-map support The addition of map_flags BPF_SOCKMAP_STRPARSER flags was to handle a specific use case where we want to have BPF parse program disabled on an entry in a sockmap. However, Alexei found the API a bit cumbersome and I agreed. Lets remove the STRPARSER flag and support the use case by allowing socks to be in multiple maps. This allows users to create two maps one with programs attached and one without. When socks are added to maps they now inherit any programs attached to the map. This is a nice generalization and IMO improves the API. The API rules are less ambiguous and do not need a flag: - When a sock is added to a sockmap we have two cases, i. The sock map does not have any attached programs so we can add sock to map without inheriting bpf programs. The sock may exist in 0 or more other maps. ii. The sock map has an attached BPF program. To avoid duplicate bpf programs we only add the sock entry if it does not have an existing strparser/verdict attached, returning -EBUSY if a program is already attached. Otherwise attach the program and inherit strparser/verdict programs from the sock map. This allows for socks to be in a multiple maps for redirects and inherit a BPF program from a single map. Also this patch simplifies the logic around BPF_{EXIST\|NOEXIST\|ANY} flags. In the original patch I tried to be extra clever and only update map entries when necessary. Now I've decided the complexity is not worth it. If users constantly update an entry with the same sock for no reason (i.e. update an entry without actually changing any parameters on map or sock) we still do an alloc/release. Using this and allowing multiple entries of a sock to exist in a map the logic becomes much simpler. Note: Now that multiple maps are supported the "maps" pointer called when a socket is closed becomes a list of maps to remove the sock from. To keep the map up to date when a sock is added to the sockmap we must add the map/elem in the list. Likewise when it is removed we must remove it from the list. This results in searching the per psock list on delete operation. On TCP_CLOSE events we walk the list and remove the psock from all map/entry locations. I don't see any perf implications in this because at most I have a psock in two maps. If a psock were to be in many maps its possibly this might be noticeable on delete but I can't think of a reason to dup a psock in many maps. The sk_callback_lock is used to protect read/writes to the list. This was convenient because in all locations we were taking the lock anyways just after working on the list. Also the lock is per sock so in normal cases we shouldn't see any contention. Suggested-by: Alexei Starovoitov <ast@kernel.org> Fixes: `174a79ff95` ("bpf: sockmap with sk redirect support") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-28 11:13:21 -07:00
John Fastabend	464bc0fd62	bpf: convert sockmap field attach_bpf_fd2 to type In the initial sockmap API we provided strparser and verdict programs using a single attach command by extending the attach API with a the attach_bpf_fd2 field. However, if we add other programs in the future we will be adding a field for every new possible type, attach_bpf_fd(3,4,..). This seems a bit clumsy for an API. So lets push the programs using two new type fields. BPF_SK_SKB_STREAM_PARSER BPF_SK_SKB_STREAM_VERDICT This has the advantage of having a readable name and can easily be extended in the future. Updates to samples and sockmap included here also generalize tests slightly to support upcoming patch for multiple map support. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Fixes: `174a79ff95` ("bpf: sockmap with sk redirect support") Suggested-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-28 11:13:21 -07:00
Sinclair Yeh	2cfa0bb25d	drm/vmwgfx: Prepare to support fence fd Make the fields and flags available. Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Singh Rawat <drawat@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2017-08-28 17:51:28 +02:00
Dan Williams	7a14724f54	libnvdimm: clean up command definitions Remove the command payloads that do not have an associated libnvdimm ioctl. I.e. remove the payloads that would only ever be carried in the ND_CMD_CALL envelope. This prevents userspace from growing unnecessary dependencies on this kernel header when userspace already has everything it needs to craft and send these commands. Cc: Jerry Hoemann <jerry.hoemann@hpe.com> Reported-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-08-28 08:33:20 -07:00
Pawel Baldysiak	ddc088238c	md: Runtime support for multiple ppls Increase PPL area to 1MB and use it as circular buffer to store PPL. The entry with highest generation number is the latest one. If PPL to be written is larger then space left in a buffer, rewind the buffer to the start (don't wrap it). Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Shaohua Li <shli@fb.com>	2017-08-28 07:45:48 -07:00
Greg Kroah-Hartman	9749c37275	Merge 4.13-rc7 into char-misc-next We want the binder fix in here as well for testing and merge issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-28 10:19:01 +02:00
Colin Ian King	a9e4998073	media: dvb_frontend: ensure that inital front end status initialized The fe_status variable s is not initialized meaning it can have any random garbage status. This could be problematic if fe->ops.tune is false as s is not updated by the call to fe->ops.tune() and a subsequent check on the change status will using a garbage value. Fix this by adding FE_NONE to the enum fe_status and initializing s to this. Detected by CoverityScan, CID#112887 ("Uninitialized scalar variable") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-08-27 17:55:51 -04:00
Sakari Ailus	d1b3437ed7	media: v4l: Add packed Bayer raw12 pixel formats These formats are compressed 12-bit raw bayer formats with four different pixel orders. They are similar to 10-bit variants. The formats added by this patch are V4L2_PIX_FMT_SBGGR12P V4L2_PIX_FMT_SGBRG12P V4L2_PIX_FMT_SGRBG12P V4L2_PIX_FMT_SRGGB12P Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-08-26 14:45:24 -04:00
David Lebrun	38ee7f2d47	ipv6: sr: add support for encapsulation of L2 frames This patch implements the L2 frame encapsulation mechanism, referred to as T.Encaps.L2 in the SRv6 specifications [1]. A new type of SRv6 tunnel mode is added (SEG6_IPTUN_MODE_L2ENCAP). It only accepts packets with an existing MAC header (i.e., it will not work for locally generated packets). The resulting packet looks like IPv6 -> SRH -> Ethernet -> original L3 payload. The next header field of the SRH is set to NEXTHDR_NONE. [1] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01 Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-25 17:10:23 -07:00
Shreyas NC	3d52a79241	ASoC: Intel: uapi: Add new tokens for module common data The module private data can be modelled independent of its instances so that it can be reused by the module instances. So move module data to common manifest which can be referenced by the module instances. This requires new tokens to be defined to accommodate these changes. The new tokens will specify buffer sizes, DSP cycles and respective indexes corresponding to the pcm params in the topology manifest so that driver need not compute them. Signed-off-by: Shreyas NC <shreyas.nc@intel.com> Signed-off-by: Guneshwor Singh <guneshwor.o.singh@intel.com> Acked-By: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2017-08-25 14:53:51 +01:00
Lionel Landwerlin	f44d85389e	drm: rename u32 in __u32 in uapi All other fields use __ Cc: Ben Widawsky <ben@bwidawsk.net> Fixes: `db1689aa61` ("drm: Create a format/modifier blob") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Daniel Stone <daniels@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170824150814.5878-1-lionel.g.landwerlin@intel.com	2017-08-25 10:07:30 +01:00
Andi Kleen	6ae5fa61d2	perf/x86: Fix data source decoding for Skylake Skylake changed the encoding of the PEBS data source field. Some combinations are not available anymore, but some new cases e.g. for L4 cache hit are added. Fix up the conversion table for Skylake, similar as had been done for Nehalem. On Skylake server the encoding for L4 actually means persistent memory. Handle this case too. To properly describe it in the abstracted perf format I had to add some new fields. Since a hit can have only one level add a new field that is an enumeration, not a bit field to describe the level. It can describe any level. Some numbers are also used to describe PMEM and LFB. Also add a new generic remote flag that can be combined with the generic level to signify a remote cache. And there is an extension field for the snoop indication to handle the Forward state. I didn't add a generic flag for hops because it's not needed for Skylake. I changed the existing encodings for older CPUs to also fill in the new level and remote fields. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/20170816222156.19953-3-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-25 11:04:17 +02:00
Bodong Wang	050da902ad	IB/mlx5: Report mlx5 enhanced multi packet WQE capability Expose enhanced multi packet WQE capability to user space through query_device by uhw. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-24 17:47:35 -04:00
Bodong Wang	795b609c8b	IB/mlx5: Allow posting multi packet send WQEs if hardware supports Set the field to allow posting multi packet send WQEs if hardware supports this feature. This doesn't mean the send WQEs will be for multi packet unless the send WQE was prepared according to multi packet send WQE format. User space shall use flag MLX5_IB_ALLOW_MPW to check if hardware supports MPW and allows MPW in SQ context. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-24 17:47:35 -04:00
Noa Osherovich	96dc3fc5f1	IB/mlx5: Expose software parsing for Raw Ethernet QP Software parsing (SWP) is a feature that can be used to instruct the device to stop using its internal parser and to parse packets on the transmit path according to offsets set for each packets. Through this feature, the device allows the handling of checksum and LSO by the hardware according to the location of IP and TCP/UDP headers. Enable SW parsing on Raw Ethernet send queue by default if firmware supports it and report these capabilities to user space. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-24 17:47:34 -04:00
Guy Levi	b23673f86f	IB/mlx4: Remove redundant attribute in mlx4_ib_create_qp_rss struct rx_key_len is not in use and needs to be removed. Fixes: `3078f5f1bd` ("IB/mlx4: Add support for RSS QP") Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-24 16:27:11 -04:00
Guy Levi	078b357303	IB/mlx4: Fix struct mlx4_ib_create_wq alignment The mlx4 ABI defines to have structures with alignment of 64B. Fixes: `400b1ebcfe` ("IB/mlx4: Add support for WQ related verbs") Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-24 16:27:11 -04:00
Maor Gottlieb	55f2467cd7	RDMA/mlx4: Fix create qp command alignment Avoid extra padding by replacing the order of inl_recv_sz and reserved, otherwise 'mlx4_ib_create_qp' structure might be larger than legacy user input leading to copy of some garbage data from the user space buffer. Fixes: `ea30b966f7` ('IB/mlx4: Add inline-receive support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-08-24 16:27:10 -04:00
Arkadi Sharshevsky	3fb886ecea	devlink: Add IPv4 header for dpipe This will be used by the IPv4 host table which will be introduced in the following patches. This header is global and can be reused by many drivers. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:16 -07:00
Arkadi Sharshevsky	1177009131	devlink: Add Ethernet header for dpipe This will be used by the IPv4 host table which will be introduced in the following patches. This header is global and can be reused by many drivers. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-24 09:33:15 -07:00
Dongdong Liu	f20c4ea49e	PCI/DPC: Add eDPC support Add eDPC support. Get and print the RP PIO error information when the trigger condition is RP PIO error. For more information on eDPC, please see PCI Express Base Specification Revision 3.1, section 6.2.10.3, or view the PCI-SIG eDPC ECN here: https://pcisig.com/sites/default/files/specification_documents/ECN_Enhanced_DPC_2012-11-19_final.pdf Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>	2017-08-24 11:28:44 -05:00
Alex Williamson	ea5311c7e7	PCI: Fix PCIe capability sizes PCI_CAP_EXP_ENDPOINT_SIZEOF_V1 defines the size of the PCIe capability structure for v1 devices with link, but we also have a need in the vfio code for sizing the capability for devices without link, such as Root Complex Integrated Endpoints. Create a separate define for this ending the structure before the link fields. Additionally, this reveals that PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 is currently incorrect, ending the capability length before the v2 link fields. Rename this to specify an RC Integrated Endpoint (no link) capability length and move PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 to include the link fields as we have for the v1 version. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> [bhelgaas: add "_" in "PCI_CAP_EXP_RC ENDPOINT_SIZEOF_V2 44"] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>	2017-08-24 11:24:59 -05:00
Juergen Gross	ecda85e702	x86/lguest: Remove lguest support Lguest seems to be rather unused these days. It has seen only patches ensuring it still builds the last two years and its official state is "Odd Fixes". Remove it in order to be able to clean up the paravirt code. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: boris.ostrovsky@oracle.com Cc: lguest@lists.ozlabs.org Cc: rusty@rustcorp.com.au Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20170816173157.8633-3-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-24 09:57:28 +02:00
Omar Sandoval	1e6ec9ea89	Revert "loop: support 4k physical blocksize" There's some stuff still up in the air, let's not get stuck with a subpar ABI. I'll follow up with something better for 4.14. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-23 15:57:55 -06:00
Martijn Coenen	5cdcf4c6a6	ANDROID: binder: add padding to binder_fd_array_object. binder_fd_array_object starts with a 4-byte header, followed by a few fields that are 8 bytes when ANDROID_BINDER_IPC_32BIT=N. This can cause alignment issues in a 64-bit kernel with a 32-bit userspace, as on x86_32 an 8-byte primitive may be aligned to a 4-byte address. Pad with a __u32 to fix this. Signed-off-by: Martijn Coenen <maco@android.com> Cc: stable <stable@vger.kernel.org> # 4.11+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-22 18:43:23 -07:00
William Tu	84e54fe0a5	gre: introduce native tunnel support for ERSPAN The patch adds ERSPAN type II tunnel support. The implementation is based on the draft at [1]. One of the purposes is for Linux box to be able to receive ERSPAN monitoring traffic sent from the Cisco switch, by creating a ERSPAN tunnel device. In addition, the patch also adds ERSPAN TX, so Linux virtual switch can redirect monitored traffic to the ERSPAN tunnel device. The traffic will be encapsulated into ERSPAN and sent out. The implementation reuses tunnel key as ERSPAN session ID, and field 'erspan' as ERSPAN Index fields: ./ip link add dev ers11 type erspan seq key 100 erspan 123 \ local 172.16.1.200 remote 172.16.1.100 To use the above device as ERSPAN receiver, configure Nexus 5000 switch as below: monitor session 100 type erspan-source erspan-id 123 vrf default destination ip 172.16.1.200 source interface Ethernet1/11 both source interface Ethernet1/12 both no shut monitor erspan origin ip-address 172.16.1.100 global [1] https://tools.ietf.org/html/draft-foschiano-erspan-01 [2] iproute2 patch: http://marc.info/?l=linux-netdev&m=150306086924951&w=2 [3] test script: http://marc.info/?l=linux-netdev&m=150231021807304&w=2 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Meenakshi Vohra <mvohra@vmware.com> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-22 14:29:30 -07:00
Greg Kroah-Hartman	34a0036748	usb: changes for v4.14 merge window Not a big pull request this time around. Only 49 non-merge commits. This pull request is, however, all over the place. Most of the changes are in the bdc driver adding support for USB Phy layer and PM. Renesas adds support for R-Car H3 ES2.0 and R-Car M3-W SoCs. Also here is PM_RUNTIME support for dwc3-keystone. UDC Core got a DMA unmap fix to make sure we only unmap requests that were, indeed, mapped. Other than these, we have a lot of cleanups, many of them adding 'const' to several places. -----BEGIN PGP SIGNATURE----- iQJRBAABCAA7FiEElLzh7wn96CXwjh2IzL64meEamQYFAlmaxjsdHGZlbGlwZS5i YWxiaUBsaW51eC5pbnRlbC5jb20ACgkQzL64meEamQZOXRAAlxA/jU0DVtZ8YAAC 6GRVni3xiEkoEoGA8tQwPZVZh0QRlupKtq5E31xWxjW2Hmtd/159WlIWZcFo3Cgi ChDZ7yBQ0NGJhOS8X1zYSM6Rl9DDzKtk1vh7EXhz1CjG3aYTqwELLy9ZvCJUTG4w O57XNH8GtRptmloOrFT+LvI6lb2EpcxGSTMhf0kUdnAC3Iw6gBL8PGr+jnoTLQ4M gcddC5oQ24CpPFB/mszGd3Ro2USiNipttJmZJ/yrYLGcBIoIL4oyNbbAkhvcXAai 4cBsC7SELifgTJZ2npoUh0z+tRsmQ668zqXS81QgS8/Y6uFgmp4PWoTiGCLbpEm0 l/T5jMDeHFySmBM8+0opbXjXzK2FVG+NnGNkRPxpwoue6cUlay2dw2PGVhP0/6Y9 DQfv3bOHsEEe2ywp3J7UAZPFw4UaLKM8yWJ/Tv0DUhDolG8bA2PerU5vX5W6FufY tWkvpTjRWKBVgHfYIUJoYtxLMsEubHQeuuc/hKEU+uqJ3161IQyUbOx547Y7toVi E2mdDfN+Zv/PW5lNJ+GguK3eTxeJ8w4aBKC6VY+mLoATEDa5xzq9OzGJ118LcV3V f4Nw7UsqinJYxCJMKgZHHCnrd6ct0wmhcsiWq9J0fSRPSzwzY3IrdbkWj24Gcc8V Z5/lfdjee2Jx1TrUI3Lasew0qP4= =FSil -----END PGP SIGNATURE----- Merge tag 'usb-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next Felipe writes: usb: changes for v4.14 merge window Not a big pull request this time around. Only 49 non-merge commits. This pull request is, however, all over the place. Most of the changes are in the bdc driver adding support for USB Phy layer and PM. Renesas adds support for R-Car H3 ES2.0 and R-Car M3-W SoCs. Also here is PM_RUNTIME support for dwc3-keystone. UDC Core got a DMA unmap fix to make sure we only unmap requests that were, indeed, mapped. Other than these, we have a lot of cleanups, many of them adding 'const' to several places.	2017-08-22 13:16:06 -07:00
Dave Airlie	735f463af7	Merge tag 'drm-intel-next-2017-08-18' of git://anongit.freedesktop.org/git/drm-intel into drm-next Final pile of features for 4.14 - New ioctl to change NOA configurations, plus prep (Lionel) - CCS (color compression) scanout support, based on the fancy new modifier additions (Ville&Ben) - Document i915 register macro style (Jani) - Many more gen10/cnl patches (Rodrigo, Pualo, ...) - More gpu reset vs. modeset duct-tape to restore the old way. - prep work for cnl: hpd_pin reorg (Rodrigo), support for more power wells (Imre), i2c pin reorg (Anusha) - drm_syncobj support (Jason Ekstrand) - forcewake vs gpu reset fix (Chris) - execbuf speedup for the no-relocs fastpath, anv/vk low-overhead ftw (Chris) - switch to idr/radixtree instead of the resizing ht for execbuf id->vma lookups (Chris) gvt: - MMIO save/restore optimization (Changbin) - Split workload scan vs. dispatch for more parallel exec (Ping) - vGPU full 48bit ppgtt support (Joonas, Tina) - vGPU hw id expose for perf (Zhenyu) Bunch of work all over to make the igt CI runs more complete/stable. Watch https://intel-gfx-ci.01.org/tree/drm-tip/shards-all.html for progress in getting this ready. Next week we're going into production mode (i.e. will send results to intel-gfx) on hsw, more platforms to come. Also, a new maintainer tram, I'm stepping out. Huge thanks to Jani for being an awesome co-maintainer the past few years, and all the best for Jani, Joonas&Rodrigo as the new maintainers! * tag 'drm-intel-next-2017-08-18' of git://anongit.freedesktop.org/git/drm-intel: (179 commits) drm/i915: Update DRIVER_DATE to 20170818 drm/i915/bxt: use NULL for GPIO connection ID drm/i915: Mark the GT as busy before idling the previous request drm/i915: Trivial grammar fix s/opt of/opt out of/ in comment drm/i915: Replace execbuf vma ht with an idr drm/i915: Simplify eb_lookup_vmas() drm/i915: Convert execbuf to use struct-of-array packing for critical fields drm/i915: Check context status before looking up our obj/vma drm/i915: Don't use MI_STORE_DWORD_IMM on Sandybridge/vcs drm/i915: Stop touching forcewake following a gen6+ engine reset MAINTAINERS: drm/i915 has a new maintainer team drm/i915: Split pin mapping into per platform functions drm/i915/opregion: let user specify override VBT via firmware load drm/i915/cnl: Reuse skl_wm_get_hw_state on Cannonlake. drm/i915/gen10: implement gen 10 watermarks calculations drm/i915/cnl: Fix LSPCON support. drm/i915/vbt: ignore extraneous child devices for a port drm/i915/cnl: Setup PAT Index. drm/i915/edp: Allow alternate fixed mode for eDP if available. drm/i915: Add support for drm syncobjs ...	2017-08-22 10:03:07 +10:00
David S. Miller	a43dce9358	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-08-21 1) Support RX checksum with IPsec crypto offload for esp4/esp6. From Ilan Tayari. 2) Fixup IPv6 checksums when doing IPsec crypto offload. From Yossi Kuperman. 3) Auto load the xfrom offload modules if a user installs a SA that requests IPsec offload. From Ilan Tayari. 4) Clear RX offload informations in xfrm_input to not confuse the TX path with stale offload informations. From Ilan Tayari. 5) Allow IPsec GSO for local sockets if the crypto operation will be offloaded. 6) Support setting of an output mark to the xfrm_state. This mark can be used to to do the tunnel route lookup. From Lorenzo Colitti. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-21 09:29:47 -07:00
Ingo Molnar	94edf6f3c2	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Removal of spin_unlock_wait() - SRCU updates - Torture-test updates - Documentation updates - Miscellaneous fixes - CPU-hotplug fixes - Miscellaneous non-RCU fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-21 09:45:19 +02:00
Hans Verkuil	9a6b2a8740	media: cec: rename pin events/function The CEC_EVENT_PIN_LOW/HIGH defines and the cec_queue_pin_event() function did not specify that these were about CEC pin events. Since in the future there will also be HPD pin events it is wise to rename the event defines and function to CEC_EVENT_PIN_CEC_LOW/HIGH and cec_queue_pin_cec_event() now before these become part of the ABI. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-08-20 08:14:03 -04:00
Martin KaFai Lau	96eabe7a40	bpf: Allow selecting numa node during map creation The current map creation API does not allow to provide the numa-node preference. The memory usually comes from where the map-creation-process is running. The performance is not ideal if the bpf_prog is known to always run in a numa node different from the map-creation-process. One of the use case is sharding on CPU to different LRU maps (i.e. an array of LRU maps). Here is the test result of map_perf_test on the INNER_LRU_HASH_PREALLOC test if we force the lru map used by CPU0 to be allocated from a remote numa node: [ The machine has 20 cores. CPU0-9 at node 0. CPU10-19 at node 1 ] ># taskset -c 10 ./map_perf_test 512 8 1260000 8000000 5:inner_lru_hash_map_perf pre-alloc 1628380 events per sec 4:inner_lru_hash_map_perf pre-alloc 1626396 events per sec 3:inner_lru_hash_map_perf pre-alloc 1626144 events per sec 6:inner_lru_hash_map_perf pre-alloc 1621657 events per sec 2:inner_lru_hash_map_perf pre-alloc 1621534 events per sec 1:inner_lru_hash_map_perf pre-alloc 1620292 events per sec 7:inner_lru_hash_map_perf pre-alloc 1613305 events per sec 0:inner_lru_hash_map_perf pre-alloc 1239150 events per sec #<<< After specifying numa node: ># taskset -c 10 ./map_perf_test 512 8 1260000 8000000 5:inner_lru_hash_map_perf pre-alloc 1629627 events per sec 3:inner_lru_hash_map_perf pre-alloc 1628057 events per sec 1:inner_lru_hash_map_perf pre-alloc 1623054 events per sec 6:inner_lru_hash_map_perf pre-alloc 1616033 events per sec 2:inner_lru_hash_map_perf pre-alloc `1614630` events per sec 4:inner_lru_hash_map_perf pre-alloc 1612651 events per sec 7:inner_lru_hash_map_perf pre-alloc 1609337 events per sec 0:inner_lru_hash_map_perf pre-alloc 1619340 events per sec #<<< This patch adds one field, numa_node, to the bpf_attr. Since numa node 0 is a valid node, a new flag BPF_F_NUMA_NODE is also added. The numa_node field is honored if and only if the BPF_F_NUMA_NODE flag is set. Numa node selection is not supported for percpu map. This patch does not change all the kmalloc. F.e. 'htab = kzalloc()' is not changed since the object is small enough to stay in the cache. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-19 21:35:43 -07:00
Florian Westphal	6b5dc98e8f	netfilter: rt: add support to fetch path mss to be used in combination with tcp option set support to mimic iptables TCPMSS --clamp-mss-to-pmtu. v2: Eric Dumazet points out dst must be initialized. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-08-19 13:15:10 +02:00
Florian Westphal	99d1712bc4	netfilter: exthdr: tcp option set support This allows setting 2 and 4 byte quantities in the tcp option space. Main purpose is to allow native replacement for xt_TCPMSS to work around pmtu blackholes. Writes to kind and len are now allowed at the moment, it does not seem useful to do this as it causes corruption of the tcp option space. We can always lift this restriction later if a use-case appears. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-08-19 13:15:10 +02:00
Levin, Alexander (Sasha Levin)	0888e372c3	net: inet: diag: expose sockets cgroup classid This is useful for directly looking up a task based on class id rather than having to scan through all open file descriptors. Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-18 16:10:50 -07:00
Mathieu Desnoyers	22e4ebb975	membarrier: Provide expedited private command Implement MEMBARRIER_CMD_PRIVATE_EXPEDITED with IPIs using cpumask built from all runqueues for which current thread's mm is the same as the thread calling sys_membarrier. It executes faster than the non-expedited variant (no blocking). It also works on NOHZ_FULL configurations. Scheduler-wise, it requires a memory barrier before and after context switching between processes (which have different mm). The memory barrier before context switch is already present. For the barrier after context switch: * Our TSO archs can do RELEASE without being a full barrier. Look at x86 spin_unlock() being a regular STORE for example. But for those archs, all atomics imply smp_mb and all of them have atomic ops in switch_mm() for mm_cpumask(), and on x86 the CR3 load acts as a full barrier. * From all weakly ordered machines, only ARM64 and PPC can do RELEASE, the rest does indeed do smp_mb(), so there the spin_unlock() is a full barrier and we're good. * ARM64 has a very heavy barrier in switch_to(), which suffices. * PPC just removed its barrier from switch_to(), but appears to be talking about adding something to switch_mm(). So add a smp_mb__after_unlock_lock() for now, until this is settled on the PPC side. Changes since v3: - Properly document the memory barriers provided by each architecture. Changes since v2: - Address comments from Peter Zijlstra, - Add smp_mb__after_unlock_lock() after finish_lock_switch() in finish_task_switch() to add the memory barrier we need after storing to rq->curr. This is much simpler than the previous approach relying on atomic_dec_and_test() in mmdrop(), which actually added a memory barrier in the common case of switching between userspace processes. - Return -EINVAL when MEMBARRIER_CMD_SHARED is used on a nohz_full kernel, rather than having the whole membarrier system call returning -ENOSYS. Indeed, CMD_PRIVATE_EXPEDITED is compatible with nohz_full. Adapt the CMD_QUERY mask accordingly. Changes since v1: - move membarrier code under kernel/sched/ because it uses the scheduler runqueue, - only add the barrier when we switch from a kernel thread. The case where we switch from a user-space thread is already handled by the atomic_dec_and_test() in mmdrop(). - add a comment to mmdrop() documenting the requirement on the implicit memory barrier. CC: Peter Zijlstra <peterz@infradead.org> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Boqun Feng <boqun.feng@gmail.com> CC: Andrew Hunter <ahh@google.com> CC: Maged Michael <maged.michael@gmail.com> CC: gromer@google.com CC: Avi Kivity <avi@scylladb.com> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Dave Watson <davejwatson@fb.com>	2017-08-17 07:28:05 -07:00
Ingo Molnar	927d2c21f2	Merge branch 'linus' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-08-17 09:41:41 +02:00
Dave Airlie	3154b13371	Merge tag 'drm-misc-next-2017-08-16' of git://anongit.freedesktop.org/git/drm-misc into drm-next UAPI Changes: - vc4: Allow userspace to dictate rendering order in submit_cl ioctl (Eric) Cross-subsystem Changes: - vboxvideo: One of Cihangir's patches applies to vboxvideo which is maintained in staging Core Changes: - atomic_legacy_backoff is officially killed (Daniel) - Extract drm_device.h (Daniel) - Unregister drm device on unplug (Daniel) - Rename deprecated drm__(un)?reference functions to drm__{get\|put} (Cihangir) Driver Changes: - vc4: Error/destroy path cleanups, log level demotion, edid leak (Eric) - various: Make various drm__funcs structs const (Bhumika) - tinydrm: add support for LEGO MINDSTORMS EV3 LCD (David) - various: Second half of .dumb_{map_offset\|destroy} defaults set (Noralf) Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Eric Anholt <eric@anholt.net> Cc: Bhumika Goyal <bhumirks@gmail.com> Cc: Cihangir Akturk <cakturk@gmail.com> Cc: David Lechner <david@lechnology.com> Cc: Noralf Trønnes <noralf@tronnes.org> tag 'drm-misc-next-2017-08-16' of git://anongit.freedesktop.org/git/drm-misc: (50 commits) drm/gem-cma-helper: Remove drm_gem_cma_dumb_map_offset() drm/virtio: Use the drm_driver.dumb_destroy default drm/bochs: Use the drm_driver.dumb_destroy default drm/mgag200: Use the drm_driver.dumb_destroy default drm/exynos: Use .dumb_map_offset and .dumb_destroy defaults drm/msm: Use the drm_driver.dumb_destroy default drm/ast: Use the drm_driver.dumb_destroy default drm/qxl: Use the drm_driver.dumb_destroy default drm/udl: Use the drm_driver.dumb_destroy default drm/cirrus: Use the drm_driver.dumb_destroy default drm/tegra: Use .dumb_map_offset and .dumb_destroy defaults drm/gma500: Use .dumb_map_offset and .dumb_destroy defaults drm/mxsfb: Use .dumb_map_offset and .dumb_destroy defaults drm/meson: Use .dumb_map_offset and .dumb_destroy defaults drm/kirin: Use .dumb_map_offset and .dumb_destroy defaults drm/vc4: Continue the switch to drm_*_put() helpers drm/vc4: Fix leak of HDMI EDID dma-buf: fix reservation_object_wait_timeout_rcu to wait correctly v2 dma-buf: add reservation_object_copy_fences (v2) drm/tinydrm: add support for LEGO MINDSTORMS EV3 LCD ...	2017-08-17 07:33:41 +10:00
John Fastabend	8a31db5615	bpf: add access to sock fields and pkt data from sk_skb programs Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-16 11:27:53 -07:00
John Fastabend	174a79ff95	bpf: sockmap with sk redirect support Recently we added a new map type called dev map used to forward XDP packets between ports (`6093ec2dc3`). This patches introduces a similar notion for sockets. A sockmap allows users to add participating sockets to a map. When sockets are added to the map enough context is stored with the map entry to use the entry with a new helper bpf_sk_redirect_map(map, key, flags) This helper (analogous to bpf_redirect_map in XDP) is given the map and an entry in the map. When called from a sockmap program, discussed below, the skb will be sent on the socket using skb_send_sock(). With the above we need a bpf program to call the helper from that will then implement the send logic. The initial site implemented in this series is the recv_sock hook. For this to work we implemented a map attach command to add attributes to a map. In sockmap we add two programs a parse program and a verdict program. The parse program uses strparser to build messages and pass them to the verdict program. The parse programs use the normal strparser semantics. The verdict program is of type SK_SKB. The verdict program returns a verdict SK_DROP, or SK_REDIRECT for now. Additional actions may be added later. When SK_REDIRECT is returned, expected when bpf program uses bpf_sk_redirect_map(), the sockmap logic will consult per cpu variables set by the helper routine and pull the sock entry out of the sock map. This pattern follows the existing redirect logic in cls and xdp programs. This gives the flow, recv_sock -> str_parser (parse_prog) -> verdict_prog -> skb_send_sock \ -> kfree_skb As an example use case a message based load balancer may use specific logic in the verdict program to select the sock to send on. Sample programs are provided in future patches that hopefully illustrate the user interfaces. Also selftests are in follow-on patches. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-16 11:27:53 -07:00
John Fastabend	b005fd189c	bpf: introduce new program type for skbs on sockets A class of programs, run from strparser and soon from a new map type called sock map, are used with skb as the context but on established sockets. By creating a specific program type for these we can use bpf helpers that expect full sockets and get the verifier to ensure these helpers are not used out of context. The new type is BPF_PROG_TYPE_SK_SKB. This patch introduces the infrastructure and type. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-16 11:27:53 -07:00
David S. Miller	463910e2df	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-08-15 20:23:23 -07:00
Yong Zhao	5d71dbc3a5	drm/amdkfd: Implement image tiling mode support v2 v2: Removed hole in ioctl number space Signed-off-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-08-15 23:00:22 -04:00
Moses Reuben	6a1c951069	drm/amdkfd: Adding new IOCTL for scratch memory v2 v2: * Renamed ALLOC_MEMORY_OF_SCRATCH to SET_SCRATCH_BACKING_VA * Removed size parameter from the ioctl, it was unused * Removed hole in ioctl number space * No more call to write_config_static_mem * Return correct error code from ioctl Signed-off-by: Moses Reuben <moses.reuben@amd.com> Signed-off-by: Ben Goz <ben.goz@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-08-15 23:00:20 -04:00
Ido Schimmel	fe40079995	ipv6: fib: Provide offload indication using nexthop flags IPv6 routes currently lack nexthop flags as in IPv4. This has several implications. In the forwarding path, it requires us to check the carrier state of the nexthop device and potentially ignore a linkdown route, instead of checking for RTNH_F_LINKDOWN. It also requires capable drivers to use the user facing IPv6-specific route flags to provide offload indication, instead of using the nexthop flags as in IPv4. Add nexthop flags to IPv6 routes in the 40 bytes hole and use it to provide offload indication instead of the RTF_OFFLOAD flag, which is removed while it's still not part of any official kernel release. In the near future we would like to use the field for the RTNH_F_{LINKDOWN,DEAD} flags, but this change is more involved and might not be ready in time for the current cycle. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-15 17:05:03 -07:00
Nick Terrell	5c1aab1dd5	btrfs: Add zstd support Add zstd compression and decompression support to BtrFS. zstd at its fastest level compresses almost as well as zlib, while offering much faster compression and decompression, approaching lzo speeds. I benchmarked btrfs with zstd compression against no compression, lzo compression, and zlib compression. I benchmarked two scenarios. Copying a set of files to btrfs, and then reading the files. Copying a tarball to btrfs, extracting it to btrfs, and then reading the extracted files. After every operation, I call `sync` and include the sync time. Between every pair of operations I unmount and remount the filesystem to avoid caching. The benchmark files can be found in the upstream zstd source repository under `contrib/linux-kernel/{btrfs-benchmark.sh,btrfs-extract-benchmark.sh}` [1] [2]. I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor, 16 GB of RAM, and a SSD. The first compression benchmark is copying 10 copies of the unzipped Silesia corpus [3] into a BtrFS filesystem mounted with `-o compress-force=Method`. The decompression benchmark times how long it takes to `tar` all 10 copies into `/dev/null`. The compression ratio is measured by comparing the output of `df` and `du`. See the benchmark file [1] for details. I benchmarked multiple zstd compression levels, although the patch uses zstd level 1. \| Method \| Ratio \| Compression MB/s \| Decompression speed \| \|---------\|-------\|------------------\|---------------------\| \| None \| 0.99 \| 504 \| 686 \| \| lzo \| 1.66 \| 398 \| 442 \| \| zlib \| 2.58 \| 65 \| 241 \| \| zstd 1 \| 2.57 \| 260 \| 383 \| \| zstd 3 \| 2.71 \| 174 \| 408 \| \| zstd 6 \| 2.87 \| 70 \| 398 \| \| zstd 9 \| 2.92 \| 43 \| 406 \| \| zstd 12 \| 2.93 \| 21 \| 408 \| \| zstd 15 \| 3.01 \| 11 \| 354 \| The next benchmark first copies `linux-4.11.6.tar` [4] to btrfs. Then it measures the compression ratio, extracts the tar, and deletes the tar. Then it measures the compression ratio again, and `tar`s the extracted files into `/dev/null`. See the benchmark file [2] for details. \| Method \| Tar Ratio \| Extract Ratio \| Copy (s) \| Extract (s)\| Read (s) \| \|--------\|-----------\|---------------\|----------\|------------\|----------\| \| None \| 0.97 \| 0.78 \| 0.981 \| 5.501 \| 8.807 \| \| lzo \| 2.06 \| 1.38 \| 1.631 \| 8.458 \| 8.585 \| \| zlib \| 3.40 \| 1.86 \| 7.750 \| 21.544 \| 11.744 \| \| zstd 1 \| 3.57 \| 1.85 \| 2.579 \| 11.479 \| 9.389 \| [1] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-benchmark.sh [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-extract-benchmark.sh [3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia [4] https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.11.6.tar.xz zstd source repository: https://github.com/facebook/zstd Signed-off-by: Nick Terrell <terrelln@fb.com> Signed-off-by: Chris Mason <clm@fb.com>	2017-08-15 09:02:09 -07:00
Jason Ekstrand	cf6e7bac63	drm/i915: Add support for drm syncobjs This commit adds support for waiting on or signaling DRM syncobjs as part of execbuf. It does so by hijacking the currently unused cliprects pointer to instead point to an array of i915_gem_exec_fence structs which containe a DRM syncobj and a flags parameter which specifies whether to wait on it or to signal it. This implementation theoretically allows for both flags to be set in which case it waits on the dma_fence that was in the syncobj and then immediately replaces it with the dma_fence from the current execbuf. v2: - Rebase on new syncobj API v3: - Pull everything out into helpers - Do all allocation in gem_execbuffer2 - Pack the flags in the bottom 2 bits of the drm_syncobj* v4: - Prevent a potential race on syncobj->fence Testcase: igt/gem_exec_fence/syncobj* Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/1499289202-25441-1-git-send-email-jason.ekstrand@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20170815145733.4562-1-chris@chris-wilson.co.uk	2017-08-15 16:46:57 +01:00
Baolin Wang	44dd8a989c	include: uapi: usb: Introduce USB charger type and state definition Introducing USB charger type and state definition can help to support USB charging which will be added in USB phy core. Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>	2017-08-15 15:05:00 +03:00
Dave Airlie	0c697fafc6	Linux 4.13-rc5 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZkNpUAAoJEHm+PkMAQRiGr68H/2nr8kxpoUhZ7eA5C71waCjh gnJSevkzJAp+fCb0KfQFAp1qvpmLLle4e6tAxYgTQZg4Z3W5cJJNfxu9TzY5sGuL o9QUr43XzABepW4e4jhRtZv6dj3K6XruNeDQKXDZTDcc/S8zoiS/Pltq7VgPcAuM kX+3qsNdUyknngD6b0z9NtJkb0mHKY6J8MpraWRO34egDwsaN/tuhRj0DRQpCoyQ x/k+hMbc9MB9Dn8cfACo6Omb+r5Rfd7dTBUAju/TnIIgs//9voHba307N7XvLJZg kWc8MqMQQZXfRZHB0atpDMHyZS/XQRlNPXj76j0+Ud/byODKTFkkazmgTpALvj8= =CxeU -----END PGP SIGNATURE----- Backmerge tag 'v4.13-rc5' into drm-next Linux 4.13-rc5 There's a really nasty nouveau collision, hopefully someone can take a look once I pushed this out.	2017-08-15 16:16:58 +10:00
Greg Kroah-Hartman	cf0a1579dd	Merge 4.13-rc5 into tty-next We want the fixes in here, and we resolve the merge issue in the 8250_core.c file. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-08-14 14:46:59 -07:00
Kees Cook	0466bdb99e	seccomp: Implement SECCOMP_RET_KILL_PROCESS action Right now, SECCOMP_RET_KILL_THREAD (neé SECCOMP_RET_KILL) kills the current thread. There have been a few requests for this to kill the entire process (the thread group). This cannot be just changed (discovered when adding coredump support since coredumping kills the entire process) because there are userspace programs depending on the thread-kill behavior. Instead, implement SECCOMP_RET_KILL_PROCESS, which is 0x80000000, and can be processed as "-1" by the kernel, below the existing RET_KILL that is ABI-set to "0". For userspace, SECCOMP_RET_ACTION_FULL is added to expand the mask to the signed bit. Old userspace using the SECCOMP_RET_ACTION mask will see SECCOMP_RET_KILL_PROCESS as 0 still, but this would only be visible when examining the siginfo in a core dump from a RET_KILL_*, where it will think it was thread-killed instead of process-killed. Attempts to introduce this behavior via other ways (filter flags, seccomp struct flags, masked RET_DATA bits) all come with weird side-effects and baggage. This change preserves the central behavioral expectations of the seccomp filter engine without putting too great a burden on changes needed in userspace to use the new action. The new action is discoverable by userspace through either the new actions_avail sysctl or through the SECCOMP_GET_ACTION_AVAIL seccomp operation. If used without checking for availability, old kernels will treat RET_KILL_PROCESS as RET_KILL_THREAD (since the old mask will produce RET_KILL_THREAD). Cc: Paul Moore <paul@paul-moore.com> Cc: Fabricio Voznika <fvoznika@google.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2017-08-14 13:46:50 -07:00
Kees Cook	4d3b0b05aa	seccomp: Introduce SECCOMP_RET_KILL_PROCESS This introduces the BPF return value for SECCOMP_RET_KILL_PROCESS to kill an entire process. This cannot yet be reached by seccomp, but it changes the default-kill behavior (for unknown return values) from kill-thread to kill-process. Signed-off-by: Kees Cook <keescook@chromium.org>	2017-08-14 13:46:49 -07:00
Kees Cook	fd76875ca2	seccomp: Rename SECCOMP_RET_KILL to SECCOMP_RET_KILL_THREAD In preparation for adding SECCOMP_RET_KILL_PROCESS, rename SECCOMP_RET_KILL to the more accurate SECCOMP_RET_KILL_THREAD. The existing selftest values are intentionally left as SECCOMP_RET_KILL just to be sure we're exercising the alias. Signed-off-by: Kees Cook <keescook@chromium.org>	2017-08-14 13:46:48 -07:00
Tyler Hicks	59f5cf44a3	seccomp: Action to log before allowing Add a new action, SECCOMP_RET_LOG, that logs a syscall before allowing the syscall. At the implementation level, this action is identical to the existing SECCOMP_RET_ALLOW action. However, it can be very useful when initially developing a seccomp filter for an application. The developer can set the default action to be SECCOMP_RET_LOG, maybe mark any obviously needed syscalls with SECCOMP_RET_ALLOW, and then put the application through its paces. A list of syscalls that triggered the default action (SECCOMP_RET_LOG) can be easily gleaned from the logs and that list can be used to build the syscall whitelist. Finally, the developer can change the default action to the desired value. This provides a more friendly experience than seeing the application get killed, then updating the filter and rebuilding the app, seeing the application get killed due to a different syscall, then updating the filter and rebuilding the app, etc. The functionality is similar to what's supported by the various LSMs. SELinux has permissive mode, AppArmor has complain mode, SMACK has bring-up mode, etc. SECCOMP_RET_LOG is given a lower value than SECCOMP_RET_ALLOW as allow while logging is slightly more restrictive than quietly allowing. Unfortunately, the tests added for SECCOMP_RET_LOG are not capable of inspecting the audit log to verify that the syscall was logged. With this patch, the logic for deciding if an action will be logged is: if action == RET_ALLOW: do not log else if action == RET_KILL && RET_KILL in actions_logged: log else if action == RET_LOG && RET_LOG in actions_logged: log else if filter-requests-logging && action in actions_logged: log else if audit_enabled && process-is-being-audited: log else: do not log Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2017-08-14 13:46:47 -07:00
Tyler Hicks	e66a399779	seccomp: Filter flag to log all actions except SECCOMP_RET_ALLOW Add a new filter flag, SECCOMP_FILTER_FLAG_LOG, that enables logging for all actions except for SECCOMP_RET_ALLOW for the given filter. SECCOMP_RET_KILL actions are always logged, when "kill" is in the actions_logged sysctl, and SECCOMP_RET_ALLOW actions are never logged, regardless of this flag. This flag can be used to create noisy filters that result in all non-allowed actions to be logged. A process may have one noisy filter, which is loaded with this flag, as well as a quiet filter that's not loaded with this flag. This allows for the actions in a set of filters to be selectively conveyed to the admin. Since a system could have a large number of allocated seccomp_filter structs, struct packing was taken in consideration. On 64 bit x86, the new log member takes up one byte of an existing four byte hole in the struct. On 32 bit x86, the new log member creates a new four byte hole (unavoidable) and consumes one of those bytes. Unfortunately, the tests added for SECCOMP_FILTER_FLAG_LOG are not capable of inspecting the audit log to verify that the actions taken in the filter were logged. With this patch, the logic for deciding if an action will be logged is: if action == RET_ALLOW: do not log else if action == RET_KILL && RET_KILL in actions_logged: log else if filter-requests-logging && action in actions_logged: log else if audit_enabled && process-is-being-audited: log else: do not log Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Kees Cook <keescook@chromium.org>	2017-08-14 13:46:46 -07:00
Tyler Hicks	d612b1fd80	seccomp: Operation for checking if an action is available Userspace code that needs to check if the kernel supports a given action may not be able to use the /proc/sys/kernel/seccomp/actions_avail sysctl. The process may be running in a sandbox and, therefore, sufficient filesystem access may not be available. This patch adds an operation to the seccomp(2) syscall that allows userspace code to ask the kernel if a given action is available. If the action is supported by the kernel, 0 is returned. If the action is not supported by the kernel, -1 is returned with errno set to -EOPNOTSUPP. If this check is attempted on a kernel that doesn't support this new operation, -1 is returned with errno set to -EINVAL meaning that userspace code will have the ability to differentiate between the two error cases. Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Suggested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Kees Cook <keescook@chromium.org>	2017-08-14 13:46:44 -07:00
Florian Weimer	34fc75bfc6	uapi/linux/quota.h: Do not include linux/errno.h linux/errno.h is very sensitive to coordination with libc headers. Nothing in linux/quota.h needs it, so this change allows using this header in more contexts. Signed-off-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>	2017-08-14 11:53:34 +02:00
Lorenzo Colitti	077fbac405	net: xfrm: support setting an output mark. On systems that use mark-based routing it may be necessary for routing lookups to use marks in order for packets to be routed correctly. An example of such a system is Android, which uses socket marks to route packets via different networks. Currently, routing lookups in tunnel mode always use a mark of zero, making routing incorrect on such systems. This patch adds a new output_mark element to the xfrm state and a corresponding XFRMA_OUTPUT_MARK netlink attribute. The output mark differs from the existing xfrm mark in two ways: 1. The xfrm mark is used to match xfrm policies and states, while the xfrm output mark is used to set the mark (and influence the routing) of the packets emitted by those states. 2. The existing mark is constrained to be a subset of the bits of the originating socket or transformed packet, but the output mark is arbitrary and depends only on the state. The use of a separate mark provides additional flexibility. For example: - A packet subject to two transforms (e.g., transport mode inside tunnel mode) can have two different output marks applied to it, one for the transport mode SA and one for the tunnel mode SA. - On a system where socket marks determine routing, the packets emitted by an IPsec tunnel can be routed based on a mark that is determined by the tunnel, not by the marks of the unencrypted packets. - Support for setting the output marks can be introduced without breaking any existing setups that employ both mark-based routing and xfrm tunnel mode. Simply changing the code to use the xfrm mark for routing output packets could xfrm mark could change behaviour in a way that breaks these setups. If the output mark is unspecified or set to zero, the mark is not set or changed. Tested: make allyesconfig; make -j64 Tested: https://android-review.googlesource.com/452776 Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2017-08-11 07:03:00 +02:00
Ville Syrjälä	bbfb6ce86c	drm/i915: Implement .get_format_info() hook for CCS SKL+ display engine can scan out certain kinds of compressed surfaces produced by the render engine. This involved telling the display engine the location of the color control surfae (CCS) which describes which parts of the main surface are compressed and which are not. The location of CCS is provided by userspace as just another plane with its own offset. By providing our own format information for the CCS formats, we should be able to make framebuffer_check() do the right thing for the CCS surface as well. Note that we'll return the same format info for both Y and Yf tiled format as that's what happens with the non-CCS Y vs. Yf as well. If desired, we could potentially return a unique pointer for each pixel_format+tiling+ccs combination, in which case we immediately be able to tell if any of that stuff changed by just comparing the pointers. But that does sound a bit wasteful space wise. v2: Drop the 'dev' argument from the hook v3: Include the description of the CCS surface layout v4: Pretend CCS tiles are regular 128 byte wide Y tiles (Jason) v5: Re-drop 'dev', fix commit message, add missing drm_fourcc.h description of CCS layout. (daniels) Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v3) Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Ville Syrjä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Stone <daniels@collabora.com>	2017-08-10 17:58:32 +01:00
Daniel Vetter	148b1e115e	Merge airlied/drm-next into drm-intel-next-queued Ben Widawsky/Daniel Stone need the extended modifier support from drm-misc to be able to merge CCS support for i915.ko Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-08-10 18:12:01 +02:00
Leon Romanovsky	1bb77b8c1d	RDMA/netlink: Export node_type Add ability to get node_type for RDAM netlink users. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-08-10 13:28:14 +03:00
Leon Romanovsky	5654e49db0	RDMA/netlink: Provide port state and physical link state Add port state and physical link state to the users of RDMA netlink. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-08-10 13:28:13 +03:00
Leon Romanovsky	34840fea11	RDMA/netlink: Export LID mask control (LMC) Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-08-10 13:28:13 +03:00
Leon Romanovsky	80a06dd36f	RDMA/netink: Export lids and sm_lids According to the IB specification, the LID and SM_LID are 16-bit wide, but to support OmniPath users, export it as 32-bit value from the beginning. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-08-10 13:28:12 +03:00
Leon Romanovsky	12026fbba6	RDMA/netlink: Advertise IB subnet prefix Add IB subnet prefix to the port properties exported by RDMA netlink. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-08-10 13:28:12 +03:00
Leon Romanovsky	1aaff896ca	RDMA/netlink: Export node_guid and sys_image_guid Add Node GUID and system image GUID to the device properties exported by RDMA netlink, to be used by RDMAtool. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-08-10 13:28:11 +03:00
Leon Romanovsky	8621a7e3c1	RDMA/netlink: Export FW version Add FW version to the device properties exported by RDMA netlink, to be used by RDMAtool. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>	2017-08-10 13:28:11 +03:00
Leon Romanovsky	ac50525374	RDMA/netlink: Expose device and port capability masks The port capability mask is exposed to user space via sysfs interface, while device capabilities are available for verbs only. This patch provides those capabilities through netlink interface. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>	2017-08-10 13:28:10 +03:00
Leon Romanovsky	1a6e7c31d7	RDMA/netlink: Add netlink device definitions to UAPI Introduce new defines to rdma_netlink.h, so the RDMA configuration tool will be able to communicate with RDMA subsystem by using the shared defines. The addition of new client (NLDEV) revealed the fact that we exposed by mistake the RDMA_NL_I40IW define which is not backed by any RDMA netlink by now and it won't be exposed in the future too. So this patch reuses the value and deletes the old defines. The NLDEV operates with objects. The struct ib_device has two straightforward objects: device itself and ports of that device. This brings us to propose the following commands to work on those objects: * RDMA_NLDEV_CMD_{GET,SET,NEW,DEL} - works on ib_device itself * RDMA_NLDEV_CMD_PORT_{GET,SET,NEW,DEL} - works on ports of specific ib_device Those commands receive/return the device index (RDMA_NLDEV_ATTR_DEV_INDEX) and port index (RDMA_NLDEV_ATTR_PORT_INDEX). For device object accesses, the RDMA_NLDEV_ATTR_PORT_INDEX will return the maximum number of ports for specific ib_device and for port access the actual port index. The port index starts from 1 to follow RDMA/core internal semantics and the sysfs exposed knobs. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>	2017-08-10 13:28:07 +03:00
Daniel Borkmann	92b31a9af7	bpf: add BPF_J{LT,LE,SLT,SLE} instructions Currently, eBPF only understands BPF_JGT (>), BPF_JGE (>=), BPF_JSGT (s>), BPF_JSGE (s>=) instructions, this means that particularly JLT/JLE counterparts involving immediates need to be rewritten from e.g. X < [IMM] by swapping arguments into [IMM] > X, meaning the immediate first is required to be loaded into a register Y := [IMM], such that then we can compare with Y > X. Note that the destination operand is always required to be a register. This has the downside of having unnecessarily increased register pressure, meaning complex program would need to spill other registers temporarily to stack in order to obtain an unused register for the [IMM]. Loading to registers will thus also affect state pruning since we need to account for that register use and potentially those registers that had to be spilled/filled again. As a consequence slightly more stack space might have been used due to spilling, and BPF programs are a bit longer due to extra code involving the register load and potentially required spill/fills. Thus, add BPF_JLT (<), BPF_JLE (<=), BPF_JSLT (s<), BPF_JSLE (s<=) counterparts to the eBPF instruction set. Modifying LLVM to remove the NegateCC() workaround in a PoC patch at [1] and allowing it to also emit the new instructions resulted in cilium's BPF programs that are injected into the fast-path to have a reduced program length in the range of 2-3% (e.g. accumulated main and tail call sections from one of the object file reduced from 4864 to 4729 insns), reduced complexity in the range of 10-30% (e.g. accumulated sections reduced in one of the cases from 116432 to 88428 insns), and reduced stack usage in the range of 1-5% (e.g. accumulated sections from one of the object files reduced from 824 to 784b). The modification for LLVM will be incorporated in a backwards compatible way. Plan is for LLVM to have i) a target specific option to offer a possibility to explicitly enable the extension by the user (as we have with -m target specific extensions today for various CPU insns), and ii) have the kernel checked for presence of the extensions and enable them transparently when the user is selecting more aggressive options such as -march=native in a bpf target context. (Other frontends generating BPF byte code, e.g. ply can probe the kernel directly for its code generation.) [1] https://github.com/borkmann/llvm/tree/bpf-insns Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-09 16:53:56 -07:00
Hans Verkuil	79bcd34ccf	media: cec-funcs.h: cec_ops_report_features: set *dev_features to NULL gcc can get confused by this code and it thinks dev_features can be returned uninitialized. So initialize to NULL at the beginning to shut up the warning. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-08-09 09:36:13 -04:00
Eric Anholt	3be8eddd9d	drm/vc4: Add exec flags to allow forcing a specific X/Y tile walk order. This is useful to allow GL to provide defined results for overlapping glBlitFramebuffer, which X11 in turn uses to accelerate uncomposited window movement without first blitting to a temporary. x11perf -copywinwin100 goes from 1850/sec to 4850/sec. v2: Default to the same behavior as before when the flags aren't passed. (suggested by Boris) Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20170725162733.28007-2-eric@anholt.net Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>	2017-08-08 13:26:44 -07:00
Hans Verkuil	6c2c188f35	media: drop use of MEDIA_API_VERSION Set media_version to LINUX_VERSION_CODE, just as we did for driver_version. Nobody ever rememebers to update the version number, but LINUX_VERSION_CODE will always be updated. Move the MEDIA_API_VERSION define to the ifndef __KERNEL__ section of the media.h header. That way kernelspace can't accidentally start to use it again. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-08-08 06:03:15 -04:00
Mauro Carvalho Chehab	1d54267b23	Linux 4.13-rc4 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZh8YYAAoJEHm+PkMAQRiG46QIAKOBbLlOY38zIJwDfJs6ydvH eFLryznS7RM2w0Gw1RyVAyWS43QS9RUNGDMa4UOb9AvurBHYpK29t1uq6LejQ/hn 2Uvxuq95qEVVYzN1OA3WzLKUa35g3qRM9rTYFz7xGMRp2Ldk/aPRi/PVJLhSO3YQ HFRLsfNMWTkSR4imuxm7NS+cYMcqWbDbanvW5IwQ+RFRPo8Ac1PbFpGUdVtar6+O Fm3GLBsRB3dijJwYyWQKeDvtLr608i50by4yS7EIAqbUSfoDpJEyTL57oTCRok7P 5ZycGpK4bXWF0OpBWpKgrFO5tB7xfzUDa3TmNhS3Q8ep4KLHNXwM3V6p8Y+YZco= =FId5 -----END PGP SIGNATURE----- Merge tag 'v4.13-rc4' into patchwork Linux 4.13-rc4 * tag 'v4.13-rc4': (863 commits) Linux 4.13-rc4 Fix compat_sys_sigpending breakage ext4: fix copy paste error in ext4_swap_extents() ext4: fix overflow caused by missing cast in ext4_resize_fs() ext4, project: expand inode extra size if possible ext4: cleanup ext4_expand_extra_isize_ea() ext4: restructure ext4_expand_extra_isize ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize ext4: make xattr inode reads faster ext4: inplace xattr block update fails to deduplicate blocks ext4: remove unused mode parameter ext4: fix warning about stack corruption ext4: fix dir_nlink behaviour ext4: silence array overflow warning ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize platform/x86: intel-vbtn: match power button on press rather than release ext4: release discard bio after sending discard commands sparc64: Fix exception handling in UltraSPARC-III memcpy. arm64: avoid overflow in VA_START and PAGE_OFFSET arm64: Fix potential race with hardware DBM in ptep_set_access_flags() ...	2017-08-08 05:38:41 -04:00
David Lebrun	d1df6fd8a1	ipv6: sr: define core operations for seg6local lightweight tunnel This patch implements a new type of lightweight tunnel named seg6local. A seg6local lwt is defined by a type of action and a set of parameters. The action represents the operation to perform on the packets matching the lwt's route, and is not necessarily an encapsulation. The set of parameters are arguments for the processing function. Each action is defined in a struct seg6_action_desc within seg6_action_table[]. This structure contains the action, mandatory attributes, the processing function, and a static headroom size required by the action. The mandatory attributes are encoded as a bitmask field. The static headroom is set to a non-zero value when the processing function always add a constant number of bytes to the skb (e.g. the header size for encapsulations). To facilitate rtnetlink-related operations such as parsing, fill_encap, and cmp_encap, each type of action parameter is associated to three function pointers, in seg6_action_params[]. All actions defined in seg6_local.h are detailed in [1]. [1] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01 Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-07 14:16:22 -07:00
Mikko Rapeli	f02a60924c	uapi linux/dlm_netlink.h: include linux/dlmconstants.h Fixes userspace compilation error: error: ‘DLM_RESNAME_MAXLEN’ undeclared here (not in a function) char resource_name[DLM_RESNAME_MAXLEN]; Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi> Signed-off-by: David Teigland <teigland@redhat.com>	2017-08-07 11:23:09 -05:00
Mikko Rapeli	adb8a5a5eb	uapi drm/armada_drm.h: use __u32 and __u64 instead of uint32_t and uint64_t These are defined in linux/types.h or drm/drm.h. Fixes user space compilation errors like: drm/armada_drm.h:26:2: error: unknown type name ‘uint32_t’ uint32_t handle; ^~~~~~~~ Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Gabriel Laskar <gabriel@lse.epita.fr> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20170806164428.2273-33-mikko.rapeli@iki.fi	2017-08-07 17:01:15 +02:00
Mikko Rapeli	472b46c352	uapi linux/kfd_ioctl.h: only use __u32 and __u64 Include <drm/drm.h> instead of <linux/types.h> which on Linux includes <linux/types.h> and on non-Linux platforms defines __u32 etc types. Fixes user space compilation errors like: linux/kfd_ioctl.h:33:2: error: unknown type name ‘uint32_t’ uint32_t major_version; /* from KFD */ ^~~~~~~~ Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>	2017-08-06 18:44:27 +02:00
John Fastabend	56ce097c1c	net: comment fixes against BPF devmap helper calls Update BPF comments to accurately reflect XDP usage. Fixes: `97f91a7cf0` ("bpf: add bpf_redirect_map helper routine") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-04 11:29:03 -07:00
Jens Wiklander	059cf566e1	tee: indicate privileged dev in gen_caps Mirrors the TEE_DESC_PRIVILEGED bit of struct tee_desc:flags into struct tee_ioctl_version_data:gen_caps as TEE_GEN_CAP_PRIVILEGED in tee_ioctl_version() Reviewed-by: Jerome Forissier <jerome.forissier@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>	2017-08-04 10:30:27 +02:00
Willem de Bruijn	76851d1212	sock: add SOCK_ZEROCOPY sockopt The send call ignores unknown flags. Legacy applications may already unwittingly pass MSG_ZEROCOPY. Continue to ignore this flag unless a socket opts in to zerocopy. Introduce socket option SO_ZEROCOPY to enable MSG_ZEROCOPY processing. Processes can also query this socket option to detect kernel support for the feature. Older kernels will return ENOPROTOOPT. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 21:37:29 -07:00
Willem de Bruijn	52267790ef	sock: add MSG_ZEROCOPY The kernel supports zerocopy sendmsg in virtio and tap. Expand the infrastructure to support other socket types. Introduce a completion notification channel over the socket error queue. Notifications are returned with ee_origin SO_EE_ORIGIN_ZEROCOPY. ee_errno is 0 to avoid blocking the send/recv path on receiving notifications. Add reference counting, to support the skb split, merge, resize and clone operations possible with SOCK_STREAM and other socket types. The patch does not yet modify any datapaths. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 21:37:29 -07:00
Ido Schimmel	61e4d01e16	ipv6: fib: Add offload indication to routes Allow user space applications to see which routes are offloaded and which aren't by setting the RTNH_F_OFFLOAD flag when dumping them. To be consistent with IPv4, offload indication is provided on a per-nexthop basis. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:00 -07:00
Lionel Landwerlin	f89823c212	drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface The motivation behind this new interface is expose at runtime the creation of new OA configs which can be used as part of the i915 perf open interface. This will enable the kernel to learn new configs which may be experimental, or otherwise not part of the core set currently available through the i915 perf interface. v2: Drop DRM_ERROR for userspace errors (Matthew) Add padding to userspace structure (Matthew) s/guid/uuid/ (Matthew) v3: Use u32 instead of int to iterate through registers (Matthew) v4: Lock access to dynamic config list (Lionel) v5: by Matthew: Fix uninitialized error values Fix incorrect unwiding when opening perf stream Use kmalloc_array() to store register Use uuid_is_valid() to valid config uuids Declare ioctls as write only Check padding members are set to 0 by Lionel: Return ENOENT rather than EINVAL when trying to remove non existing config v6: by Chris: Use ref counts for OA configs Store UUID in drm_i915_perf_oa_config rather then using pointer Shuffle fields of drm_i915_perf_oa_config to avoid padding v7: by Chris Rename uapi pointers fields to end with '_ptr' v8: by Andrzej, Marek, Sebastian Update register whitelisting by Lionel Add more register names for documentation Allow configuration programming in non-paranoid mode Add support for value filter for a couple of registers already programmed in other part of the kernel v9: Documentation fix (Lionel) Allow writing WAIT_FOR_RC6_EXIT only on Gen8+ (Andrzej) v10: Perform read access_ok() on register pointers (Lionel) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Andrzej Datczuk <andrzej.datczuk@intel.com> Reviewed-by: Andrzej Datczuk <andrzej.datczuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170803165812.2373-2-lionel.g.landwerlin@intel.com	2017-08-03 18:19:53 +01:00
Jordan Crouse	cdbc78ba70	drm/msm: Remove __user from __u64 data types __user should be used to identify user pointers and not __u64 variables containing pointers. Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com>	2017-08-01 19:11:48 -04:00
David S. Miller	29fda25a2d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two minor conflicts in virtio_net driver (bug fix overlapping addition of a helper) and MAINTAINERS (new driver edit overlapping revamp of PHY entry). Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 10:07:50 -07:00
Ben Widawsky	db1689aa61	drm: Create a format/modifier blob Updated blob layout (Rob, Daniel, Kristian, xerpi) v2: * Removed __packed, and alignment (.+) * Fix indent in drm_format_modifier fields (Liviu) * Remove duplicated modifier > 64 check (Liviu) * Change comment about modifier (Liviu) * Remove arguments to blob creation, use plane instead (Liviu) * Fix data types (Ben) * Make the blob part of uapi (Daniel) v3: Remove unused ret field. Change i, and j to unsigned int (Emil) v4: Use plane->modifier_count instead of recounting (Daniel) v5: Rename modifiers to modifiers_property (Ville) Use sizeof(__u32) instead to reflect UAPI nature (Ville) Make BUILD_BUG_ON for blob header size Cc: Rob Clark <robdclark@gmail.com> Cc: Kristian H. Kristensen <hoegsberg@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Daniel Stone <daniels@collabora.com> (v2) Reviewed-by: Liviu Dudau <liviu@dudau.co.uk> (v2) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v3) Signed-off-by: Daniel Stone <daniels@collabora.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170724034641.13369-2-ben@bwidawsk.net	2017-08-01 17:50:06 +01:00
Ben Widawsky	e6fc3b6855	drm: Plumb modifiers through plane init This is the plumbing for supporting fb modifiers on planes. Modifiers have already been introduced to some extent, but this series will extend this to allow querying modifiers per plane. Based on this, the client to enable optimal modifications for framebuffers. This patch simply allows the DRM drivers to initialize their list of supported modifiers upon initializing the plane. v2: A minor addition from Daniel v3: * Updated commit message * s/INVALID/DRM_FORMAT_MOD_INVALID (Liviu) * Remove some excess newlines (Liviu) * Update comment for > 64 modifiers (Liviu) v4: Minor comment adjustments (Liviu) v5: Some new platforms added due to rebase v6: Add some missed plane inits (or maybe they're new - who knows at this point) (Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Daniel Stone <daniels@collabora.com> (v2) Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com> Signed-off-by: Daniel Stone <daniels@collabora.com>	2017-08-01 17:50:06 +01:00
Wei Wang	bb7c19f960	tcp: add related fields into SCM_TIMESTAMPING_OPT_STATS Add the following stats into SCM_TIMESTAMPING_OPT_STATS control msg: TCP_NLA_PACING_RATE TCP_NLA_DELIVERY_RATE TCP_NLA_SND_CWND TCP_NLA_REORDERING TCP_NLA_MIN_RTT TCP_NLA_RECUR_RETRANS TCP_NLA_DELIVERY_RATE_APP_LMT Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 17:26:18 -07:00
Florian Westphal	3282e65558	tcp: remove unused mib counters was used by tcp prequeue and header prediction. TCPFORWARDRETRANS use was removed in january. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:37:50 -07:00
Phil Sutter	6150957521	netfilter: nf_tables: Allow object names of up to 255 chars Same conversion as for table names, use NFT_NAME_MAXLEN as upper boundary as well. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-07-31 20:41:59 +02:00
Phil Sutter	387454901b	netfilter: nf_tables: Allow set names of up to 255 chars Same conversion as for table names, use NFT_NAME_MAXLEN as upper boundary as well. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-07-31 20:41:58 +02:00
Phil Sutter	b7263e071a	netfilter: nf_tables: Allow chain name of up to 255 chars Same conversion as for table names, use NFT_NAME_MAXLEN as upper boundary as well. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-07-31 20:41:57 +02:00
Phil Sutter	e46abbcc05	netfilter: nf_tables: Allow table names of up to 255 chars Allocate all table names dynamically to allow for arbitrary lengths but introduce NFT_NAME_MAXLEN as an upper sanity boundary. It's value was chosen to allow using a domain name as per RFC 1035. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-07-31 20:41:57 +02:00
Jamal Hadi Salim	e62e484df0	net sched actions: add time filter for action dumping This patch adds support for filtering based on time since last used. When we are dumping a large number of actions it is useful to have the option of filtering based on when the action was last used to reduce the amount of data crossing to user space. With this patch the user space app sets the TCA_ROOT_TIME_DELTA attribute with the value in milliseconds with "time of interest since now". The kernel converts this to jiffies and does the filtering comparison matching entries that have seen activity since then and returns them to user space. Old kernels and old tc continue to work in legacy mode since they dont specify this attribute. Some example (we have 400 actions bound to 400 filters); at installation time. Using updated when tc setting the time of interest to 120 seconds earlier (we see 400 actions): prompt$ hackedtc actions ls action gact since 120000\| grep index \| wc -l 400 go get some coffee and wait for > 120 seconds and try again: prompt$ hackedtc actions ls action gact since 120000 \| grep index \| wc -l 0 Lets see a filter bound to one of these actions: .... filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 2 success 1) match 7f000002/ffffffff at 12 (success 1 ) action order 1: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1145 sec used 802 sec Action statistics: Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 .... that coffee took long, no? It was good. Now lets ping -c 1 127.0.0.2, then run the actions again: prompt$ hackedtc actions ls action gact since 120 \| grep index \| wc -l 1 More details please: prompt$ hackedtc -s actions ls action gact since 120000 action order 0: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1270 sec used 30 sec Action statistics: Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 And the filter? filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 4 success 2) match 7f000002/ffffffff at 12 (success 2 ) action order 1: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1324 sec used 84 sec Action statistics: Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:28:08 -07:00
Jamal Hadi Salim	90825b23a8	net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch When you dump hundreds of thousands of actions, getting only 32 per dump batch even when the socket buffer and memory allocations allow is inefficient. With this change, the user will get as many as possibly fitting within the given constraints available to the kernel. The top level action TLV space is extended. An attribute TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON is set by the user indicating the user is capable of processing these large dumps. Older user space which doesnt set this flag doesnt get the large (than 32) batches. The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many actions are put in a single batch. As such user space app knows how long to iterate (independent of the type of action being dumped) instead of hardcoded maximum of 32 thus maintaining backward compat. Some results dumping 1.5M actions below: first an unpatched tc which doesnt understand these features... prompt$ time -p tc actions ls action gact \| grep index \| wc -l 1500000 real 1388.43 user 2.07 sys 1386.79 Now lets see a patched tc which sets the correct flags when requesting a dump: prompt$ time -p updatedtc actions ls action gact \| grep index \| wc -l 1500000 real 178.13 user 2.02 sys 176.96 That is about 8x performance improvement for tc app which sets its receive buffer to about 32K. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:28:08 -07:00
Jamal Hadi Salim	64c83d8373	net netlink: Add new type NLA_BITFIELD32 Generic bitflags attribute content sent to the kernel by user. With this netlink attr type the user can either set or unset a flag in the kernel. The value is a bitmap that defines the bit values being set The selector is a bitmask that defines which value bit is to be considered. A check is made to ensure the rules that a kernel subsystem always conforms to bitflags the kernel already knows about. i.e if the user tries to set a bit flag that is not understood then the _it will be rejected_. In the most basic form, the user specifies the attribute policy as: [ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags }, where myvalidflags is the bit mask of the flags the kernel understands. If the user _does not_ provide myvalidflags then the attribute will also be rejected. Examples: value = 0x0, and selector = 0x1 implies we are selecting bit 1 and we want to set its value to 0. value = 0x2, and selector = 0x2 implies we are selecting bit 2 and we want to set its value to 1. Suggested-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-30 19:28:08 -07:00
Ingo Molnar	f5db340f19	Merge branch 'perf/urgent' into perf/core, to pick up latest fixes and refresh the tree Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-07-30 11:15:13 +02:00
Vidya Sagar Ravipati	1a5f3da20b	net: ethtool: add support for forward error correction modes Forward Error Correction (FEC) modes i.e Base-R and Reed-Solomon modes are introduced in 25G/40G/100G standards for providing good BER at high speeds. Various networking devices which support 25G/40G/100G provides ability to manage supported FEC modes and the lack of FEC encoding control and reporting today is a source for interoperability issues for many vendors. FEC capability as well as specific FEC mode i.e. Base-R or RS modes can be requested or advertised through bits D44:47 of base link codeword. This patch set intends to provide option under ethtool to manage and report FEC encoding settings for networking devices as per IEEE 802.3 bj, bm and by specs. set-fec/show-fec option(s) are designed to provide control and report the FEC encoding on the link. SET FEC option: root@tor: ethtool --set-fec swp1 encoding [off \| RS \| BaseR \| auto] Encoding: Types of encoding Off : Turning off any encoding RS : enforcing RS-FEC encoding on supported speeds BaseR : enforcing Base R encoding on supported speeds Auto : IEEE defaults for the speed/medium combination Here are a few examples of what we would expect if encoding=auto: - if autoneg is on, we are expecting FEC to be negotiated as on or off as long as protocol supports it - if the hardware is capable of detecting the FEC encoding on it's receiver it will reconfigure its encoder to match - in absence of the above, the configuration would be set to IEEE defaults. >From our understanding , this is essentially what most hardware/driver combinations are doing today in the absence of a way for users to control the behavior. SHOW FEC option: root@tor: ethtool --show-fec swp1 FEC parameters for swp1: Active FEC encodings: RS Configured FEC encodings: RS \| BaseR ETHTOOL DEVNAME output modification: ethtool devname output: root@tor:~# ethtool swp1 Settings for swp1: root@hpe-7712-03:~# ethtool swp18 Settings for swp18: Supported ports: [ FIBRE ] Supported link modes: 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 100000baseSR4/Full 100000baseCR4/Full 100000baseLR4_ER4/Full Supported pause frame use: No Supports auto-negotiation: Yes Supported FEC modes: [RS \| BaseR \| None \| Not reported] Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Advertised FEC modes: [RS \| BaseR \| None \| Not reported] <<<< One or more FEC modes Speed: 100000Mb/s Duplex: Full Port: FIBRE PHYAD: 106 Transceiver: internal Auto-negotiation: off Link detected: yes This patch includes following changes a) New ETHTOOL_SFECPARAM/SFECPARAM API, handled by the new get_fecparam/set_fecparam callbacks, provides support for configuration of forward error correction modes. b) Link mode bits for FEC modes i.e. None (No FEC mode), RS, BaseR/FC are defined so that users can configure these fec modes for supported and advertising fields as part of link autonegotiation. Signed-off-by: Vidya Sagar Ravipati <vidya.chowdary@gmail.com> Signed-off-by: Dustin Byford <dustin@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-29 23:23:44 -07:00
Shaohua Li	ca1136c99b	blktrace: export cgroup info in trace Currently blktrace isn't cgroup aware. blktrace prints out task name of current context, but the task of current context isn't always in the cgroup where the BIO comes from. We can't use task name to find out IO cgroup. For example, Writeback BIOs always comes from flusher thread but the BIOs are for different blk cgroups. Request could be requeued and dispatched from completely different tasks. MD/DM are another examples. This patch tries to fix the gap. We print out cgroup fhandle info in blktrace. Userspace can use open_by_handle_at() syscall to find the cgroup by fhandle. Or userspace can use name_to_handle_at() syscall to find fhandle for a cgroup and use a BPF program to filter out blktrace for a specific cgroup. We add a new 'blk_cgroup' trace option for blk tracer. It's default off. Application which doesn't know the new option isn't affected. When it's on, we output fhandle info right after blk_io_trace with an extra bit set in event action. So from application point of view, blktrace with the option will output new actions. I didn't change blk trace event yet, since I'm not sure if changing the trace event output is an ABI issue. If not, I'll do it later. Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-07-29 09:00:03 -06:00
Eric Anholt	f30994622b	drm/vc4: Add an ioctl for labeling GEM BOs for summary stats This has proven immensely useful for debugging memory leaks and overallocation (which is a rather serious concern on the platform, given that we typically run at about 256MB of CMA out of up to 1GB total memory, with framebuffers that are about 8MB ecah). The state of the art without this is to dump debug logs from every GL application, guess as to kernel allocations based on bo_stats, and try to merge that all together into a global picture of memory allocation state. With this, you can add a couple of calls to the debug build of the 3D driver and get a pretty detailed view of GPU memory usage from /debug/dri/0/bo_stats (or when we debug print to dmesg on allocation failure). The Mesa side currently labels at the gallium resource level (so you see that a 1920x20 pixmap has been created, presumably for the window system panel), but we could extend that to be even more useful with glObjectLabel() names being sent all the way down to the kernel. (partial) example of sorted debugfs output with Mesa labeling all resources: kernel BO cache: 16392kb BOs (3) tiling shadow 1920x1080: 8160kb BOs (1) resource 1920x1080@32/0: 8160kb BOs (1) scanout resource 1920x1080@32/0: 8100kb BOs (1) kernel: 8100kb BOs (1) v2: Use strndup_user(), use lockdep assertion instead of just a comment, fix an array[-1] reference, extend comment about name freeing. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20170725182718.31468-2-eric@anholt.net Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-07-28 16:04:53 -07:00
Doug Ledford	a5f66725c7	Merge branch 'misc' into k.o/for-next	2017-07-27 09:00:38 -04:00
Amrani, Ram	67cbe3532c	RDMA/qedr: notify user application of supported WIDs The number of supported WIDs, if they are supported at all, can be limited due to resources. Notifying the user space application the number of available WIDs allows it to utilize them correctly. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-27 08:59:52 -04:00
Amrani, Ram	ad84dad216	RDMA/qedr: notify user application if DPM is supported Direct Packet Mode support may be disabled, e.g, due to limited resources. Notifying the user application prevents wasting cycles on attempting to send these kind of packets. Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-27 08:59:52 -04:00
Dave Airlie	0eb2c0ae57	Linux 4.13-rc2 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZdS4PAAoJEHm+PkMAQRiGbEYH/2mukTPOUAfNoWaVjO2YHxuL 5yI3n1838tKIJm967IUmGdckN/RYGPjJxvZ+muXN2/rv23+9j3LVq9vQcsYqRQop vrWP+hvGGJvOGJ2NYBDB+4AUrPPdeX9stolwyAcYvyCZ8AilPIovm4s2poA+fuQX D78c8JSfpse32oc93dy4bUz3mRFKTeufstrWEuzqXI691mthF2G9EpA0R3hlbqv+ GiUnNcZVOnOuCt/47GnpWVKsyv91l3CkGq3bV1GSUi8a/1PnyFxHQxQI/qgbkLXs NuswRupSeLDQKRgiDLgWF/BpdHEp4dpFFWXm00KWlgxeGSQnKat9bpW/d5OgnhA= =mv3H -----END PGP SIGNATURE----- Backmerge tag 'v4.13-rc2' into drm-next Linux 4.13-rc2 This is required for drm-misc fixing.	2017-07-27 08:15:43 +10:00
Daniel Vetter	af05559854	Merge airlied/drm-next into drm-misc-next I need this to be able to apply the deferred fbdev setup patches, I need the relevant prep work that landed through the drm-intel tree. Also squash in conflict fixup from Laurent Pinchart. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-07-26 13:43:33 +02:00
Eric W. Biederman	cc731525f2	signal: Remove kernel interal si_code magic struct siginfo is a union and the kernel since 2.4 has been hiding a union tag in the high 16bits of si_code using the values: __SI_KILL __SI_TIMER __SI_POLL __SI_FAULT __SI_CHLD __SI_RT __SI_MESGQ __SI_SYS While this looks plausible on the surface, in practice this situation has not worked well. - Injected positive signals are not copied to user space properly unless they have these magic high bits set. - Injected positive signals are not reported properly by signalfd unless they have these magic high bits set. - These kernel internal values leaked to userspace via ptrace_peek_siginfo - It was possible to inject these kernel internal values and cause the the kernel to misbehave. - Kernel developers got confused and expected these kernel internal values in userspace in kernel self tests. - Kernel developers got confused and set si_code to __SI_FAULT which is SI_USER in userspace which causes userspace to think an ordinary user sent the signal and that it was not kernel generated. - The values make it impossible to reorganize the code to transform siginfo_copy_to_user into a plain copy_to_user. As si_code must be massaged before being passed to userspace. So remove these kernel internal si codes and make the kernel code simpler and more maintainable. To replace these kernel internal magic si_codes introduce the helper function siginfo_layout, that takes a signal number and an si_code and computes which union member of siginfo is being used. Have siginfo_layout return an enumeration so that gcc will have enough information to warn if a switch statement does not handle all of union members. A couple of architectures have a messed up ABI that defines signal specific duplications of SI_USER which causes more special cases in siginfo_layout than I would like. The good news is only problem architectures pay the cost. Update all of the code that used the previous magic __SI_ values to use the new SIL_ values and to call siginfo_layout to get those values. Escept where not all of the cases are handled remove the defaults in the switch statements so that if a new case is missed in the future the lack will show up at compile time. Modify the code that copies siginfo si_code to userspace to just copy the value and not cast si_code to a short first. The high bits are no longer used to hold a magic union member. Fixup the siginfo header files to stop including the __SI_ values in their constants and for the headers that were missing it to properly update the number of si_codes for each signal type. The fixes to copy_siginfo_from_user32 implementations has the interesting property that several of them perviously should never have worked as the __SI_ values they depended up where kernel internal. With that dependency gone those implementations should work much better. The idea of not passing the __SI_ values out to userspace and then not reinserting them has been tested with criu and criu worked without changes. Ref: 2.4.0-test1 Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-07-24 14:30:28 -05:00
Eric W. Biederman	d08477aa97	fcntl: Don't use ambiguous SIG_POLL si_codes We have a weird and problematic intersection of features that when they all come together result in ambiguous siginfo values, that we can not support properly. - Supporting fcntl(F_SETSIG,...) with arbitrary valid signals. - Using positive values for POLL_IN, POLL_OUT, POLL_MSG, ..., etc that imply they are signal specific si_codes and using the aforementioned arbitrary signal to deliver them. - Supporting injection of arbitrary siginfo values for debugging and checkpoint/restore. The result is that just looking at siginfo si_codes of 1 to 6 are ambigious. It could either be a signal specific si_code or it could be a generic si_code. For most of the kernel this is a non-issue but for sending signals with siginfo it is impossible to play back the kernel signals and get the same result. Strictly speaking when the si_code was changed from SI_SIGIO to POLL_IN and friends between 2.2 and 2.4 this functionality was not ambiguous, as only real time signals were supported. Before 2.4 was released the kernel began supporting siginfo with non realtime signals so they could give details of why the signal was sent. The result is that if F_SETSIG is set to one of the signals with signal specific si_codes then user space can not know why the signal was sent. I grepped through a bunch of userspace programs using debian code search to get a feel for how often people choose a signal that results in an ambiguous si_code. I only found one program doing so and it was using SIGCHLD to test the F_SETSIG functionality, and did not appear to be a real world usage. Therefore the ambiguity does not appears to be a real world problem in practice. Remove the ambiguity while introducing the smallest chance of breakage by changing the si_code to SI_SIGIO when signals with signal specific si_codes are targeted. Fixes: v2.3.40 -- Added support for queueing non-rt signals Fixes: v2.3.21 -- Changed the si_code from SI_SIGIO Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-07-24 14:29:23 -05:00
Guy Levi	3078f5f1bd	IB/mlx4: Add support for RSS QP Add support to work with a RSS QP by using an indirection table object upon QP creation. Other related QP verbs (e.g. modify/destroy/query) were updated as well for that QP mode. Notes: - The RX hash properties are supplied as driver private data. - The RSS QP port is used on the associated WQs in its indirection table. Applying different ports during WQ life time is not allowed. - The expected RSS QP flow is: create, modify(RST->INIT), modify(RST->RTR), destroy. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:45:53 -04:00
Guy Levi	b8d46ca035	IB/mlx4: Add support for WQ indirection table related verbs To enable RSS functionality the IB indirection table object (i.e. ib_rwq_ind_table) should be used. This patch implements the related verbs as of create and destroy an indirection table. In downstream patches the indirection table will be used as part of RSS QP creation. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:45:50 -04:00
Guy Levi	400b1ebcfe	IB/mlx4: Add support for WQ related verbs Support create/modify/destroy WQ related verbs. The base IB object to enable RSS functionality is a WQ (i.e. ib_wq). This patch implements the related WQ verbs as of create, modify and destroy. In downstream patches the WQ will be used as part of an indirection table (i.e. ib_rwq_ind_table) to enable RSS QP creation. Notes: ConnectX-3 hardware requires consecutive WQNs list as receive descriptor queues for the RSS QP. Hence, the driver manages consecutive ranges lists per context which the user must respect. Destroying the WQ does not return its WQN back to its range for reusing. However, destroying all WQs from the same range releases the range and in turn releases its WQNs for reusing. Since the WQ object is not a natural object in the hardware, the driver implements the WQ by the hardware QP. As such, the WQ inherits its port from its RSS QP parent upon its RST->INIT transition and by that time its state is applied to the hardware. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:45:01 -04:00
Maor Gottlieb	ea30b966f7	IB/mlx4: Add inline-receive support When inline-receive is enabled, the HCA may write received data into the receive WQE. Inline-receive is enabled by setting its matching bit in the QP context and each single-packet message with payload not exceeding the receive WQE size will be delivered to the WQE. The completion report will indicate that the payload was placed to the WQE. It includes: 1) Return maximum supported size of inline-receive by the hardware in query_device vendor's data part. 2) Enable the feature when requested by the vendor data input. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:41:02 -04:00
Yishai Hadas	2dee0e5458	IB/uverbs: Enable QP creation with a given source QP number Enable QP creation with a given source QP number, the created QP will use the source QPN as its wire QP number. To create such a QP, root privileges (i.e. CAP_NET_RAW) are required from the user application. This comes as a pre-patch for downstream patches in this series to allow user space applications to accelerate traffic which is typically handled by IPoIB ULP. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-07-24 10:40:46 -04:00
Phil Sutter	784b4e612d	netfilter: nf_tables: Attach process info to NFT_MSG_NEWGEN notifications This is helpful for 'nft monitor' to track which process caused a given change to the ruleset. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-07-24 13:25:07 +02:00
Greg Kroah-Hartman	329f0a8a35	Merge 4.13-rc2 into tty-next We want the tty/serial fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-23 20:04:33 -07:00
Linus Torvalds	ae75d1aefe	TTY/Serial fixes for 4.13-rc2 Here are some small tty and serial driver fixes for 4.13-rc2. Nothing huge at all, a revert of a patch that turned out to break things, a fix up for a new tty ioctl we added in 4.13-rc1 to get the uapi definition correct, and a few minor serial driver fixes for reported issues. All of these have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWXMebA8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yn9MgCdGsIEQlTvTeG4QikxUPB7dkEJQacAoLWOKJzQ A65mBAeOfiJztczi8uo4 =KEXp -----END PGP SIGNATURE----- Merge tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some small tty and serial driver fixes for 4.13-rc2. Nothing huge at all, a revert of a patch that turned out to break things, a fix up for a new tty ioctl we added in 4.13-rc1 to get the uapi definition correct, and a few minor serial driver fixes for reported issues. All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: Fix TIOCGPTPEER ioctl definition tty: hide unused pty_get_peer function tty: serial: lpuart: Fix the logic for detecting the 32-bit type UART serial: imx: Prevent TX buffer PIO write when a DMA has been started Revert "serial: imx-serial - move DMA buffer configuration to DT" serial: sh-sci: Uninitialized variables in sysfs files serial: st-asc: Potential error pointer dereference	2017-07-22 09:00:24 -07:00
David Howells	ddc6c70f07	rxrpc: Move the packet.h include file into net/rxrpc/ Move the protocol description header file into net/rxrpc/ and rename it to protocol.h. It's no longer necessary to expose it as packets are no longer exposed to kernel services (such as AFS) that use the facility. The abort codes are transferred to the UAPI header instead as we pass these back to userspace and also to kernel services. Signed-off-by: David Howells <dhowells@redhat.com>	2017-07-21 11:00:20 +01:00
David Howells	727f891447	rxrpc: Expose UAPI definitions to userspace Move UAPI definitions from the internal header and place them in a UAPI header file so that userspace can make use of them. Signed-off-by: David Howells <dhowells@redhat.com>	2017-07-21 10:39:26 +01:00
David S. Miller	7a68ada6ec	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-07-21 03:38:43 +01:00
Jin Yao	eb0baf8a0d	perf/core: Define the common branch type classification It is often useful to know the branch types while analyzing branch data. For example, a call is very different from a conditional branch. Currently we have to look it up in binary while the binary may later not be available and even the binary is available but user has to take some time. It is very useful for user to check it directly in perf report. Perf already has support for disassembling the branch instruction to get the x86 branch type. To keep consistent on kernel and userspace and make the classification more common, the patch adds the common branch type classification in perf_event.h. The patch only defines a minimum but most common set of branch types. PERF_BR_UNKNOWN : unknown PERF_BR_COND ：conditional PERF_BR_UNCOND : unconditional PERF_BR_IND : indirect PERF_BR_CALL : function call PERF_BR_IND_CALL : indirect function call PERF_BR_RET : function return PERF_BR_SYSCALL : syscall PERF_BR_SYSRET : syscall return PERF_BR_COND_CALL : conditional function call PERF_BR_COND_RET : conditional function return The patch also adds a new field type (4 bits) in perf_branch_entry to record the branch type. Since the disassembling of branch instruction needs some overhead, a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it needs to disassemble the branch instruction and record the branch type. Change log: v10: Not changed. v9: Not changed. v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN. No other change. v7: Just keep the most common branch types. Others are removed. v6: Not changed. v5: Not changed. The v5 patch series just change the userspace. v4: Comparing to previous version, the major changes are: 1. Remove the PERF_BR_JCC_FWD/PERF_BR_JCC_BWD, they will be computed later in userspace. 2. Remove the "cross" field in perf_branch_entry. The cross page computing will be done later in userspace. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Link: http://lkml.kernel.org/r/1500379995-6449-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-07-18 23:14:38 -03:00
Hans Verkuil	6b2bbb0874	media: cec: rework the cec event handling Event handling was always fairly simplistic since there were only two events. With the addition of pin events this needed to be redesigned. The state_change and lost_msgs events are now core events with the guarantee that the last state is always available. The new pin events are a queue of events (up to 64 for each event) and the oldest event will be dropped if the application cannot keep up. Lost events are marked with a new event flag. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-07-18 12:49:36 -03:00
Hans Verkuil	6303d97873	media: linux/cec.h: add pin monitoring API support Add support for low-level CEC pin monitoring. This adds a new monitor mode, a new capability and two new events. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-07-18 12:48:37 -03:00
Andreas Färber	fc60a8b675	tty: serial: owl: Implement console driver Implement serial console driver to complement earlycon. Based on LeMaker linux-actions tree. Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-18 09:28:29 +02:00
Ruslan Bilovol	8bd226f9a7	include: usb: audio: specify exact endiannes of descriptors USB spec says that multiple byte fields are stored in little-endian order (see chapter 8.1 of USB2.0 spec and chapter 7.1 of USB3.0 spec), thus mark such fields as LE for UAC1 and UAC2 headers Signed-off-by: Ruslan Bilovol <ruslan.bilovol@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>	2017-07-18 09:33:06 +03:00
John Fastabend	97f91a7cf0	bpf: add bpf_redirect_map helper routine BPF programs can use the devmap with a bpf_redirect_map() helper routine to forward packets to netdevice in map. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-17 09:48:06 -07:00

1 2 3 4 5 ...

5174 Commits