linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-21 08:46:49 +07:00

Author	SHA1	Message	Date
David S. Miller	1cdba55055	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter/IPVS/OVS updates for net-next The following patchset contains Netfilter/IPVS fixes and OVS NAT support, more specifically this batch is composed of: 1) Fix a crash in ipset when performing a parallel flush/dump with set:list type, from Jozsef Kadlecsik. 2) Make sure NFACCT_FILTER_* netlink attributes are in place before accessing them, from Phil Turnbull. 3) Check return error code from ip_vs_fill_iph_skb_off() in IPVS SIP helper, from Arnd Bergmann. 4) Add workaround to IPVS to reschedule existing connections to new destination server by dropping the packet and wait for retransmission of TCP syn packet, from Julian Anastasov. 5) Allow connection rescheduling in IPVS when in CLOSE state, also from Julian. 6) Fix wrong offset of SIP Call-ID in IPVS helper, from Marco Angaroni. 7) Validate IPSET_ATTR_ETHER netlink attribute length, from Jozsef. 8) Check match/targetinfo netlink attribute size in nft_compat, patch from Florian Westphal. 9) Check for integer overflow on 32-bit systems in x_tables, from Florian Westphal. Several patches from Jarno Rajahalme to prepare the introduction of NAT support to OVS based on the Netfilter infrastructure: 10) Schedule IP_CT_NEW_REPLY definition for removal in nf_conntrack_common.h. 11) Simplify checksumming recalculation in nf_nat. 12) Add comments to the openvswitch conntrack code, from Jarno. 13) Update the CT state key only after successful nf_conntrack_in() invocation. 14) Find existing conntrack entry after upcall. 15) Handle NF_REPEAT case due to templates in nf_conntrack_in(). 16) Call the conntrack helper functions once the conntrack has been confirmed. 17) And finally, add the NAT interface to OVS. The batch closes with: 18) Cleanup to use spin_unlock_wait() instead of spin_lock()/spin_unlock(), from Nicholas Mc Guire. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-14 22:10:25 -04:00
Jarno Rajahalme	05752523e5	openvswitch: Interface with NAT. Extend OVS conntrack interface to cover NAT. New nested OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action. A bare OVS_CT_ATTR_NAT only mangles existing and expected connections. If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested attributes, new (non-committed/non-confirmed) connections are mangled according to the rest of the nested attributes. The corresponding OVS userspace patch series includes test cases (in tests/system-traffic.at) that also serve as example uses. This work extends on a branch by Thomas Graf at https://github.com/tgraf/ovs/tree/nat. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-03-14 23:47:29 +01:00
Jarno Rajahalme	bfa3f9d7f3	netfilter: Remove IP_CT_NEW_REPLY definition. Remove the definition of IP_CT_NEW_REPLY from the kernel as it does not make sense. This allows the definition of IP_CT_NUMBER to be simplified as well. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-03-14 23:47:27 +01:00
Martin KaFai Lau	a44d6eacda	tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/In Per RFC4898, they count segments sent/received containing a positive length data segment (that includes retransmission segments carrying data). Unlike tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments carrying no data (e.g. pure ack). The patch also updates the segs_in in tcp_fastopen_add_skb() so that segs_in >= data_segs_in property is kept. Together with retransmission data, tcpi_data_segs_out gives a better signal on the rxmit rate. v6: Rebase on the latest net-next v5: Eric pointed out that checking skb->len is still needed in tcp_fastopen_add_skb() because skb can carry a FIN without data. Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in() helper is used. Comment is added to the fastopen case to explain why segs_in has to be reset and tcp_segs_in() has to be called before __skb_pull(). v4: Add comment to the changes in tcp_fastopen_add_skb() and also add remark on this case in the commit message. v3: Add const modifier to the skb parameter in tcp_segs_in() v2: Rework based on recent fix by Eric: commit `a9d99ce28e` ("tcp: fix tcpi_segs_in after connection establishment") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Chris Rapier <rapier@psc.edu> Cc: Eric Dumazet <edumazet@google.com> Cc: Marcelo Ricardo Leitner <mleitner@redhat.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-14 14:55:26 -04:00
Sabrina Dubroca	dece8d2b78	uapi: add MACsec bits Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-13 22:40:24 -04:00
Zhang Shengju	136ba622de	netconf: add macro to represent all attributes This patch adds macro NETCONFA_ALL to represent all type of netconf attributes for IPv4 and IPv6. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-13 21:54:44 -04:00
Dave Airlie	9b61c0fcdf	Merge drm-fixes into drm-next. Nouveau wanted this to avoid some worse conflicts when I merge that.	2016-03-14 09:46:02 +10:00
Linus Torvalds	95f41fb203	media fixes for v4.5-rc8 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW4xgMAAoJEAhfPr2O5OEVKVAP/1hSigOgCEWrCbXL+mp9xl2P WXYA87O0ckk6rfIKOi6tv72bkxUlrik9t/F6DIzQejh5SO4IxWeDr8v4iW9Zq+PT r7ondfq7Sw0VxZfJ/7sulDvtySVBL7V1osJGocrKhtXknmlHdspMX4tuEkB8HYy/ dCpl5yf9ZGYXJrxkRC9rCWFzyEyI8Mg9GE0YORlYYSjaRbl9mYQNQQ6pFjRzlR99 MaPaSfMA7UPQvapyUNplgqHvq8Bo459cLiAL2aR2Z3zdJr8aJvpDYaGBGdzdBIoM kR55OrDfS/DPX9sou2Xsmty6bMRAynkzI6lGWd5muGfznJ2O5j2s1AY0pkX+wj6O 7S1AfCG8ryi7rvUsfxHkBV6mE2vbKtHU9CnZBIu25B7Dtp2rKNimPh7FqPR6U38h snWSGNCxayJchAxBBkhXE5BNdCpopLCed6Y9jIQbTelzghNhFKP96APIwHOKvfAq WmfHT6/diTst7Bu859WS/1UqCf1xIcY6jqofz7El/GIECbAxR6k9eFaPW55tecss M/60e58U6MLVZxZUqSykKw1bTXq7PeceH5b3dpg1Yv/ST5kNqZZS082rHi1Qpv5o 9llLHIwa/Nu+v4bjeLbiHPOK2VOTcMZp9RAknc4TNRuy3FCX0ntWxGLq24r2FPg+ UzRT+MzaP9slkbb2M80B =btba -----END PGP SIGNATURE----- Merge tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fix from Mauro Carvalho Chehab: "One last time fix: It adds a code that prevents some media tools like media-ctl to hide some entities that have their IDs out of the range expected by those apps" * tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] media-device: map new functions into old types for legacy API	2016-03-11 12:32:02 -08:00
Daniel Borkmann	4018ab1875	bpf: support flow label for bpf_skb_{set, get}_tunnel_key This patch extends bpf_tunnel_key with a tunnel_label member, that maps to ip_tunnel_key's label so underlying backends like vxlan and geneve can propagate the label to udp_tunnel6_xmit_skb(), where it's being set in the IPv6 header. It allows for having 20 more bits to encode/decode flow related meta information programmatically. Tested with vxlan and geneve. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-11 15:14:27 -05:00
Daniel Borkmann	8eb3b99554	geneve: support setting IPv6 flow label This work adds support for setting the IPv6 flow label for geneve per device and through collect metadata (ip_tunnel_key) frontends. Also here, the geneve dst cache does not need any special considerations, for the cases where caches can be used, the label is static per cache. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-11 15:14:27 -05:00
Daniel Borkmann	e7f70af111	vxlan: support setting IPv6 flow label This work adds support for setting the IPv6 flow label for vxlan per device and through collect metadata (ip_tunnel_key) frontends. The vxlan dst cache does not need any special considerations here, for the cases where caches can be used, the label is static per cache. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-11 15:14:26 -05:00
Dmitry Torokhov	62d5bdf972	Merge branch 'synaptics-rmi4' into next Bring in support for devices using Synaptics RMI4 protocol, including RMI4 bus, 2D sensor and button handlers, and SPI and I2C interface drivers.	2016-03-11 09:51:25 -08:00
Sheng Yang	32c76de346	target/user: Report capability of handling out-of-order completions to userspace TCMU_MAILBOX_FLAG_CAP_OOOC was introduced, and userspace can check the flag for out-of-order completion capability support. Also update the document on how to use the feature. Signed-off-by: Sheng Yang <sheng@yasker.org> Reviewed-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2016-03-10 21:49:09 -08:00
Jason Wang	0308813724	vhost_net: basic polling support This patch tries to poll for new added tx buffer or socket receive queue for a while at the end of tx/rx processing. The maximum time spent on polling were specified through a new kind of vring ioctl. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-03-11 02:18:53 +02:00
Andrew Duggan	2b6a321da9	Input: synaptics-rmi4 - add support for Synaptics RMI4 devices Synaptics uses the Register Mapped Interface (RMI) protocol as a communications interface for their devices. This driver adds the core functionality needed to interface with RMI4 devices. RMI devices can be connected to the host via several transport protocols and can supports a wide variety of functionality defined by RMI functions. Support for transport protocols and RMI functions are implemented in individual drivers. The RMI4 core driver uses a bus architecture to facilitate the various combinations of transport and function drivers needed by a particular device. Signed-off-by: Andrew Duggan <aduggan@synaptics.com> Signed-off-by: Christopher Heiny <cheiny@synaptics.com> Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Tested-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2016-03-10 16:02:39 -08:00
Amir Vadai	5b33f48842	net/flower: Introduce hardware offload support This patch is based on a patch made by John Fastabend. It adds support for offloading cls_flower. when NETIF_F_HW_TC is on: flags = 0 => Rule will be processed twice - by hardware, and if still relevant, by software. flags = SKIP_HW => Rull will be processed by software only If hardware fail/not capabale to apply the rule, operation will NOT fail. Filter will be processed by SW only. Acked-by: Jiri Pirko <jiri@mellanox.com> Suggested-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-10 16:24:02 -05:00
Mauro Carvalho Chehab	b2cd27448b	[media] media-device: map new functions into old types for legacy API The legacy media controller userspace API exposes entity types that carry both type and function information. The new API replaces the type with a function. It preserves backward compatibility by defining legacy functions for the existing types and using them in drivers. This works fine, as long as newer entity functions won't be added. Unfortunately, some tools, like media-ctl with --print-dot argument rely on the now legacy MEDIA_ENT_T_V4L2_SUBDEV and MEDIA_ENT_T_DEVNODE numeric ranges to identify what entities will be shown. Also, if the entity doesn't match those ranges, it will ignore the major/minor information on devnodes, and won't be getting the devnode name via udev or sysfs. As we're now adding devices outside the old range, the legacy ioctl needs to map the new entity functions into a type at the old range, or otherwise we'll have a regression. Detected on all released media-ctl versions (e. g. versions <= 1.10). Fix this by deriving the type from the function to emulate the legacy API if the function isn't in the legacy functions range. Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-03-10 15:10:59 -03:00
Linus Walleij	fae9816446	gpio: uapi: use 0xB4 as ioctl() major The previous 'o' is in conflict and not very orderly assigned. We want to select an ioctl() major that does not conflict with the existining ones. Add the new reserved major (0xB4) to Documentation/ioctl/ioctl-number.txt Fixes: `3c702e9987` ("gpio: add a userspace chardev ABI for GPIOs") Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2016-03-10 16:02:52 +07:00
Tom Herbert	ab7ac4eb98	kcm: Kernel Connection Multiplexor module This module implements the Kernel Connection Multiplexor. Kernel Connection Multiplexor (KCM) is a facility that provides a message based interface over TCP for generic application protocols. With KCM an application can efficiently send and receive application protocol messages over TCP using datagram sockets. For more information see the included Documentation/networking/kcm.txt Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-09 16:36:14 -05:00
Paolo Bonzini	ab92f30875	KVM/ARM updates for 4.6 - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - Various optimizations to the vgic save/restore code -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW36xjAAoJECPQ0LrRPXpDGQkQAMDppzcTOixT3e8VPdHAX09a Z5PO0gyTMVV7Jyz5Ul3pedPJA2GSK9mxOCwqvIFbdxLAR6ZB00juO5FrTHkSdI91 1XLPj4bKoMWcVvhL/g5A4Glp/pVMW1k/9Yq8zZAtYlsLRlqG5rLOutSadcqHcYaJ cTD/pFf7b2oPtkTPyoFml75KgHBT/8uvAvFDOWA66Id2z6T11+PsBT/6XnGDiwKg tpGTNzx3kPIKIzOAOHqVW6UBxFOeabebXLT8wUz3VwNn/UbG6gkumMNApMAyF2q1 zU0nAh8+7Ek6Dr4OFWE6BfW6sgg/l7i1lA8XoAmqG7ZTrSptCc59fvaZJxPruG+Q dMsU6QgR77JJjbZTinf9a1jReZ/liZrx2gZXedVKdILrjmDSq0UnGcxjUOEDZOGy 2/dbrlJhv+LhpcJtuPpxPCfoqbW5L0ynzmuYuXRdRz3lTHiOWIRx5gugrhO+wH4D 4gvZhbw3XCiYfpYHYhl8A1EH5kanKgdXDocz9yIm7mZm89gngufF/HkeXS3ZU25T yThyBGulGjqN4FCdgf1HolkTfFjnfSx4qJovJ58eHga+HNLXRkTecZZcbFy2OOHv 8Bx0PIlwj4RgSaRLWQUudAhdhKS2g22DKDDljxFwhkMPNghvqkYMJCRDKLu6GBXQ 4YsLKM+TaShHFjSpx+ao =rpvb -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.6 - VHE support so that we can run the kernel at EL2 on ARMv8.1 systems - PMU support for guests - 32bit world switch rewritten in C - Various optimizations to the vgic save/restore code Conflicts: include/uapi/linux/kvm.h	2016-03-09 11:50:42 +01:00
Alexei Starovoitov	6c90598174	bpf: pre-allocate hash map elements If kprobe is placed on spin_unlock then calling kmalloc/kfree from bpf programs is not safe, since the following dead lock is possible: kfree->spin_lock(kmem_cache_node->lock)...spin_unlock->kprobe-> bpf_prog->map_update->kmalloc->spin_lock(of the same kmem_cache_node->lock) and deadlocks. The following solutions were considered and some implemented, but eventually discarded - kmem_cache_create for every map - add recursion check to slow-path of slub - use reserved memory in bpf_map_update for in_irq or in preempt_disabled - kmalloc via irq_work At the end pre-allocation of all map elements turned out to be the simplest solution and since the user is charged upfront for all the memory, such pre-allocation doesn't affect the user space visible behavior. Since it's impossible to tell whether kprobe is triggered in a safe location from kmalloc point of view, use pre-allocation by default and introduce new BPF_F_NO_PREALLOC flag. While testing of per-cpu hash maps it was discovered that alloc_percpu(GFP_ATOMIC) has odd corner cases and often fails to allocate memory even when 90% of it is free. The pre-allocation of per-cpu hash elements solves this problem as well. Turned out that bpf_map_update() quickly followed by bpf_map_lookup()+bpf_map_delete() is very common pattern used in many of iovisor/bcc/tools, so there is additional benefit of pre-allocation, since such use cases are must faster. Since all hash map elements are now pre-allocated we can remove atomic increment of htab->count and save few more cycles. Also add bpf_map_precharge_memlock() to check rlimit_memlock early to avoid large malloc/free done by users who don't have sufficient limits. Pre-allocation is done with vmalloc and alloc/free is done via percpu_freelist. Here are performance numbers for different pre-allocation algorithms that were implemented, but discarded in favor of percpu_freelist: 1 cpu: pcpu_ida 2.1M pcpu_ida nolock 2.3M bt 2.4M kmalloc 1.8M hlist+spinlock 2.3M pcpu_freelist 2.6M 4 cpu: pcpu_ida 1.5M pcpu_ida nolock 1.8M bt w/smp_align 1.7M bt no/smp_align 1.1M kmalloc 0.7M hlist+spinlock 0.2M pcpu_freelist 2.0M 8 cpu: pcpu_ida 0.7M bt w/smp_align 0.8M kmalloc 0.4M pcpu_freelist 1.5M 32 cpu: kmalloc 0.13M pcpu_freelist 0.49M pcpu_ida nolock is a modified percpu_ida algorithm without percpu_ida_cpu locks and without cross-cpu tag stealing. It's faster than existing percpu_ida, but not as fast as pcpu_freelist. bt is a variant of block/blk-mq-tag.c simlified and customized for bpf use case. bt w/smp_align is using cache line for every 'long' (similar to blk-mq-tag). bt no/smp_align allocates 'long' bitmasks continuously to save memory. It's comparable to percpu_ida and in some cases faster, but slower than percpu_freelist hlist+spinlock is the simplest free list with single spinlock. As expeceted it has very bad scaling in SMP. kmalloc is existing implementation which is still available via BPF_F_NO_PREALLOC flag. It's significantly slower in single cpu and in 8 cpu setup it's 3 times slower than pre-allocation with pcpu_freelist, but saves memory, so in cases where map->max_entries can be large and number of map update/delete per second is low, it may make sense to use it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-08 15:28:31 -05:00
David S. Miller	4c38cd61ae	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Remove useless debug message when deleting IPVS service, from Yannick Brosseau. 2) Get rid of compilation warning when CONFIG_PROC_FS is unset in several spots of the IPVS code, from Arnd Bergmann. 3) Add prandom_u32 support to nft_meta, from Florian Westphal. 4) Remove unused variable in xt_osf, from Sudip Mukherjee. 5) Don't calculate IP checksum twice from netfilter ipv4 defrag hook since fixing af_packet defragmentation issues, from Joe Stringer. 6) On-demand hook registration for iptables from netns. Instead of registering the hooks for every available netns whenever we need one of the support tables, we register this on the specific netns that needs it, patchset from Florian Westphal. 7) Add missing port range selection to nf_tables masquerading support. BTW, just for the record, there is a typo in the description of `5f6c253ebe` ("netfilter: bridge: register hooks only when bridge interface is added") that refers to the cluster match as deprecated, but it is actually the CLUSTERIP target (which registers hooks inconditionally) the one that is scheduled for removal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-08 14:25:20 -05:00
Daniel Borkmann	14ca0751c9	bpf: support for access to tunnel options After eBPF being able to programmatically access/manage tunnel key meta data via commit `d3aa45ce6b` ("bpf: add helpers to access tunnel metadata") and more recently also for IPv6 through `c6c3345407` ("bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key"), this work adds two complementary helpers to generically access their auxiliary tunnel options. Geneve and vxlan support this facility. For geneve, TLVs can be pushed, and for the vxlan case its GBP extension. I.e. setting tunnel key for geneve case only makes sense, if we can also read/write TLVs into it. In the GBP case, it provides the flexibility to easily map the group policy ID in combination with other helpers or maps. I chose to model this as two separate helpers, bpf_skb_{set,get}_tunnel_opt(), for a couple of reasons. bpf_skb_{set,get}_tunnel_key() is already rather complex by itself, and there may be cases for tunnel key backends where tunnel options are not always needed. If we would have integrated this into bpf_skb_{set,get}_tunnel_key() nevertheless, we are very limited with remaining helper arguments, so keeping compatibility on structs in case of passing in a flat buffer gets more cumbersome. Separating both also allows for more flexibility and future extensibility, f.e. options could be fed directly from a map, etc. Moreover, change geneve's xmit path to test only for info->options_len instead of TUNNEL_GENEVE_OPT flag. This makes it more consistent with vxlan's xmit path and allows for avoiding to specify a protocol flag in the API on xmit, so it can be protocol agnostic. Having info->options_len is enough information that is needed. Tested with vxlan and geneve. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-08 13:58:46 -05:00
Daniel Borkmann	2208087061	bpf: allow to propagate df in bpf_skb_set_tunnel_key Added by `9a628224a6` ("ip_tunnel: Add dont fragment flag."), allow to feed df flag into tunneling facilities (currently supported on TX by vxlan, geneve and gre) as a hint from eBPF's bpf_skb_set_tunnel_key() helper. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-08 13:58:46 -05:00
Daniel Borkmann	8afd54c87a	bpf: add flags to bpf_skb_store_bytes for clearing hash When overwriting parts of the packet with bpf_skb_store_bytes() that were fed previously into skb->hash calculation, we should clear the current hash with skb_clear_hash(), so that a next skb_get_hash() call can determine the correct hash related to this skb. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-08 13:55:15 -05:00
David S. Miller	810813c47a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-08 12:34:12 -05:00
Wilson Ding	30530791a7	serial: mvebu-uart: initial support for Armada-3700 serial port Armada-3700's uart is a simple serial port, which doesn't support. Configuring the modem control lines. The uart port has a 32 bytes Tx FIFO and a 64 bytes Rx FIFO The uart driver implements the uart core operations. It also support the system (early) console based on Armada-3700's serial port. Known Issue: The uart driver currently doesn't support clock programming, which means the baud-rate stays with the default value configured by the bootloader at boot time [gregory.clement@free-electrons.com: Rewrite many part which are too long to enumerate] Signed-off-by: Wilson Ding <dingwei@marvell.com> Signed-off-by: Nadav Haklai <nadavh@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-07 16:11:14 -08:00
Linus Torvalds	e2857b8f11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix ordering of WEXT netlink messages so we don't see a newlink after a dellink, from Johannes Berg. 2) Out of bounds access in minstrel_ht_set_best_prob_rage, from Konstantin Khlebnikov. 3) Paging buffer memory leak in iwlwifi, from Matti Gottlieb. 4) Wrong units used to set initial TCP rto from cached metrics, also from Konstantin Khlebnikov. 5) Fix stale IP options data in the SKB control block from leaking through layers of encapsulation, from Bernie Harris. 6) Zero padding len miscalculated in bnxt_en, from Michael Chan. 7) Only CHECKSUM_PARTIAL packets should be passed down through GSO, fix from Hannes Frederic Sowa. 8) Fix suspend/resume with JME networking devices, from Diego Violat and Guo-Fu Tseng. 9) Checksums not validated properly in bridge multicast support due to the placement of the SKB header pointers at the time of the check, fix from Álvaro Fernández Rojas. 10) Fix hang/tiemout with r8169 if a stats fetch is done while the device is runtime suspended. From Chun-Hao Lin. 11) The forwarding database netlink dump facilities don't track the state of the dump properly, resulting in skipped/missed entries. From Minoura Makoto. 12) Fix regression from a recent 3c59x bug fix, from Neil Horman. 13) Fix list corruption in bna driver, from Ivan Vecera. 14) Big endian machines crash on vlan add in bnx2x, fix from Michal Schmidt. 15) Ethtool RSS configuration not propagated properly in mlx5 driver, from Tariq Toukan. 16) Fix regression in PHY probing in stmmac driver, from Gabriel Fernandez. 17) Fix SKB tailroom calculation in igmp/mld code, from Benjamin Poirier. 18) A past change to skip empty routing headers in ipv6 extention header parsing accidently caused fragment headers to not be matched any longer. Fix from Florian Westphal. 19) eTSEC-106 erratum needs to be applied to more gianfar chips, from Atsushi Nemoto. 20) Fix netdev reference after free via workqueues in usb networking drivers, from Oliver Neukum and Bjørn Mork. 21) mdio->irq is now an array rather than a pointer to dynamic memory, but several drivers were still trying to free it :-/ Fixes from Colin Ian King. 22) act_ipt iptables action forgets to set the family field, thus LOG netfilter targets don't work with it. Fix from Phil Sutter. 23) SKB leak in ibmveth when skb_linearize() fails, from Thomas Falcon. 24) pskb_may_pull() cannot be called with interrupts disabled, fix code that tries to do this in vmxnet3 driver, from Neil Horman. 25) be2net driver leaks iomap'd memory on removal, fix from Douglas Miller. 26) Forgotton RTNL mutex unlock in ppp_create_interface() error paths, from Guillaume Nault. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (97 commits) ppp: release rtnl mutex when interface creation fails cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind tcp: fix tcpi_segs_in after connection establishment net: hns: fix the bug about loopback jme: Fix device PM wakeup API usage jme: Do not enable NIC WoL functions on S0 udp6: fix UDP/IPv6 encap resubmit path be2net: Don't leak iomapped memory on removal. vmxnet3: avoid calling pskb_may_pull with interrupts disabled net: ethernet: Add missing MFD_SYSCON dependency on HAS_IOMEM ibmveth: check return of skb_linearize in ibmveth_start_xmit cdc_ncm: toggle altsetting to force reset before setup usbnet: cleanup after bind() in probe() mlxsw: pci: Correctly determine if descriptor queue is full mlxsw: spectrum: Always decrement bridge's ref count tipc: fix nullptr crash during subscription cancel net: eth: altera: do not free array priv->mdio->irq net/ethoc: do not free array priv->mdio->irq net: sched: fix act_ipt for LOG target asix: do not free array priv->mdio->irq ...	2016-03-07 15:41:10 -08:00
Dan Williams	d4f323672a	nfit, libnvdimm: clear poison command support Add the boiler-plate for a 'clear error' command based on section 9.20.7.6 "Function Index 4 - Clear Uncorrectable Error" from the ACPI 6.1 specification, and add a reference implementation in nfit_test. Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-03-05 18:06:14 -08:00
Linus Torvalds	ee8f3955c0	media fixes for v4.5-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJW2tpIAAoJEAhfPr2O5OEVJOkP/0QKNBOyvonEvVGCNDzvlPnz stFLRKFyzHRA9KQKPcZ2UMHaP3UFy4uUbdK1YOUMvYN8CUBmhr4QOlZcmLwCGROx 0BtGqWeGwjM1gbDby4c+/8nvl+dPehNNtv1d5jjtu53bPylg7rQTM337QnBykXFC qWW/NvlHWqcfR2TUW+8saAJ5l/R2gWYuAIreIgbImXFB5mBHBZ0QmtnW/radPlU6 pUTsRxIaw1IYJ0qpEmVYaTZiVwax6i55KJBKONjzqGPM3Bk/+XOuqyUfID3Ogvb5 u10B4x6l+UvFMKqWZNXeCSalsdw5NI3yaBo6MAjUCpIlVPR4o15RM1mlvkFn0x+1 fNnX+lpJcFamytXAGkQ8qbCNGd03AmXVusMs+gXnJIET98UGDa44F0l5/D9Uy+Wg dcGuVTDH/WnwO/UndCFqT2R1hAx1CwOoVseIRL3stQ0xrxHA39kuoB98r4knBh+o AD4bVzHX+lwZmtOAqOgS6mIx5h+lCGlOomDLmfRt7T6UP8YVCFg2tuCRrO83OR+e +6u7z3fnhn6zpUQ3VsjI8qoILVg4UctHeJ8u0Ygks3FYFWsFaNJriZH0iiNhiFcS dbGQjSvBp9svbFz9KmvB/mh4hrJjwTOFf/U9sR/KkBqRb/rAsPv6DFkDZBtV/91D H9B5sI6sYD4CCsldqXph =lhV9 -----END PGP SIGNATURE----- Merge tag 'media/v4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - some last time changes before we stablize the new entity function integer numbers at uAPI - probe: fix erroneous return value on i2c/adp1653 driver - fix tx 5v detect regression on adv7604 driver - fix missing unlock on error in vpfe_prepare_pipeline() on davinci_vpfe driver * tag 'media/v4.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] media: Sanitise the reserved fields of the G_TOPOLOGY IOCTL arguments [media] media.h: postpone connectors entities [media] media.h: use hex values for range offsets, move connectors base up. [media] adv7604: fix tx 5v detect regression [media] media.h: get rid of MEDIA_ENT_F_CONN_TEST [media] [for,v4.5] media.h: increase the spacing between function ranges [media] media: i2c/adp1653: probe: fix erroneous return value [media] media: davinci_vpfe: fix missing unlock on error in vpfe_prepare_pipeline()	2016-03-05 12:32:34 -08:00
Reilly Grant	d883f52e1f	usb: devio: Add ioctl to disallow detaching kernel USB drivers. The new USBDEVFS_DROP_PRIVILEGES ioctl allows a process to voluntarily relinquish the ability to issue other ioctls that may interfere with other processes and drivers that have claimed an interface on the device. This commit also includes a simple utility to be able to test the ioctl, located at Documentation/usb/usbdevfs-drop-permissions.c Example (with qemu-kvm's input device): $ lsusb ... Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd $ usb-devices ... C: #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=100mA I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=02 Driver=usbhid $ sudo ./usbdevfs-drop-permissions /dev/bus/usb/001/002 OK: privileges dropped! Available options: [0] Exit now [1] Reset device. Should fail if device is in use [2] Claim 4 interfaces. Should succeed where not in use [3] Narrow interface permission mask Which option shall I run?: 1 ERROR: USBDEVFS_RESET failed! (1 - Operation not permitted) Which test shall I run next?: 2 ERROR claiming if 0 (1 - Operation not permitted) ERROR claiming if 1 (1 - Operation not permitted) ERROR claiming if 2 (1 - Operation not permitted) ERROR claiming if 3 (1 - Operation not permitted) Which test shall I run next?: 0 After unbinding usbhid: $ usb-devices ... I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=02 Driver=(none) $ sudo ./usbdevfs-drop-permissions /dev/bus/usb/001/002 ... Which option shall I run?: 2 OK: claimed if 0 ERROR claiming if 1 (1 - Operation not permitted) ERROR claiming if 2 (1 - Operation not permitted) ERROR claiming if 3 (1 - Operation not permitted) Which test shall I run next?: 1 OK: USBDEVFS_RESET succeeded Which test shall I run next?: 0 After unbinding usbhid and restricting the mask: $ sudo ./usbdevfs-drop-permissions /dev/bus/usb/001/002 ... Which option shall I run?: 3 Insert new mask: 0 OK: privileges dropped! Which test shall I run next?: 2 ERROR claiming if 0 (1 - Operation not permitted) ERROR claiming if 1 (1 - Operation not permitted) ERROR claiming if 2 (1 - Operation not permitted) ERROR claiming if 3 (1 - Operation not permitted) Signed-off-by: Reilly Grant <reillyg@chromium.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Emilio López <emilio.lopez@collabora.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-05 12:05:01 -08:00
Nicolas Dichtel	14e2037902	ethtool.h: define INT_MAX for userland INT_MAX needs limits.h in userland. When ethtool.h is included by a userland app, we got the following error: .../usr/include/linux/ethtool.h: In function 'ethtool_validate_speed': .../usr/include/linux/ethtool.h:1471:18: error: 'INT_MAX' undeclared (first use in this function) return speed <= INT_MAX \|\| speed == SPEED_UNKNOWN ^ Fixes: `e02564ee33` ("ethtool: make validate_speed accept all speeds between 0 and INT_MAX") CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-04 16:10:37 -05:00
Nicolas Dichtel	b5d3755a22	uapi: define DIV_ROUND_UP for userland DIV_ROUND_UP is defined in linux/kernel.h only for the kernel. When ethtool.h is included by a userland app, we got the following error: include/linux/ethtool.h:1218:8: error: variably modified 'queue_mask' at file scope __u32 queue_mask[DIV_ROUND_UP(MAX_NUM_QUEUE, 32)]; ^ Let's add a common definition in uapi and use it everywhere. Fixes: `ac2c7ad0e5` ("net/ethtool: introduce a new ioctl for per queue setting") CC: Kan Liang <kan.liang@intel.com> Suggested-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-04 16:10:36 -05:00
Dmitry Torokhov	52cdce8adb	Merge branch 'rotary-encoder' into next Bring in updates to roraty encoder driver switching it away from legacy platform data and over to generic device properties and adding support for encoders using more than 2 GPIOs.	2016-03-04 11:32:40 -08:00
Christoph Hellwig	97be7ebe53	vfs: add the RWF_HIPRI flag for preadv2/pwritev2 This adds a flag that tells the file system that this is a high priority request for which it's worth to poll the hardware. The flag is purely advisory and can be ignored if not supported. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stephen Bates <stephen.bates@pmcs.com> Tested-by: Stephen Bates <stephen.bates@pmcs.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-03-04 12:20:10 -05:00
Li Jun	346dbc6993	usb: add OTG status selector definition for HNP polling A host is required to use the GetStatus command, with wIndex set to the OTG status selector(F000H) to request the Host request flag from the peripheral. Acked-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Li Jun <jun.li@nxp.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>	2016-03-04 15:14:35 +02:00
John Youn	446fa3a95d	usb: ch9: Add size macro for SSP dev cap descriptor The SuperspeedPlus Device Capability Descriptor has a variable size depending on the number of sublink speed attributes. This patch adds a macro to calculate that size. The macro takes one argument, the Sublink Speed Attribute Count (SSAC) as reported by the descriptor in bmAttributes[4:0]. See USB 3.1 9.6.2.5, Table 9-19. Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org>	2016-03-04 15:14:22 +02:00
Christopher S. Hall	719f1aa4a6	ptp: Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping Currently, network /system cross-timestamping is performed in the PTP_SYS_OFFSET ioctl. The PTP clock driver reads gettimeofday() and the gettime64() callback provided by the driver. The cross-timestamp is best effort where the latency between the capture of system time (getnstimeofday()) and the device time (driver callback) may be significant. The getcrosststamp() callback and corresponding PTP_SYS_OFFSET_PRECISE ioctl allows the driver to perform this device/system correlation when for example cross timestamp hardware is available. Modern Intel systems can do this for onboard Ethernet controllers using the ART counter. There is virtually zero latency between captures of the ART and network device clock. The capabilities ioctl (PTP_CLOCK_GETCAPS), is augmented allowing applications to query whether or not drivers implement the getcrosststamp callback, providing more precise cross timestamping. Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: kevin.b.stanton@intel.com Cc: kevin.j.clarke@intel.com Cc: hpa@zytor.com Cc: jeffrey.t.kirsher@intel.com Cc: netdev@vger.kernel.org Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com> [jstultz: Commit subject tweaks] Signed-off-by: John Stultz <john.stultz@linaro.org>	2016-03-03 14:23:43 -08:00
Hans Verkuil	64bd1971de	[media] media.h: always start with 1 for the audio entities Start the audio defines with BASE + 0x03001 instead of 0x03000. This is consistent with the other defines, and I think it is good practice not to start with 0, just in case we want to do something like (id & 0xfff) in the future and treat the value 0 as a special case. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Suggested-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-03-03 18:16:25 -03:00
Sakari Ailus	fbe093ac9f	[media] media: Sanitise the reserved fields of the G_TOPOLOGY IOCTL arguments The argument structs are used in arrays for G_TOPOLOGY IOCTL. The arguments themselves do not need to be aligned to a power of two, but aligning them up to the largest basic type alignment (u64) on common ABIs is a good thing to do. The patch changes the size of the reserved fields to 5 or 6 u32's and aligns the size of the struct to 8 bytes so we do no longer depend on the compiler to perform the alignment. While at it, add __attribute__ ((packed)) to these structs as well. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-03-03 18:10:53 -03:00
Mauro Carvalho Chehab	93125094c0	[media] media.h: postpone connectors entities The representation of external connections got some heated discussions recently. As we're too close to the merge window, let's not set those entities into a stone. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-03-03 14:52:51 -03:00
Sakari Ailus	0629e991a2	[media] media: Move media_get_uptr() macro out of the media.h user space header The media_get_uptr() macro is mostly useful only for the IOCTL handling code in media-device.c so move it there. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-03-03 13:02:44 -03:00
Paolo Bonzini	61ec84f145	Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD The highlights are: * Enable VFIO device on PowerPC, from David Gibson * Optimizations to speed up IPIs between vcpus in HV KVM, from Suresh Warrier (who is also Suresh E. Warrier) * In-kernel handling of IOMMU hypercalls, and support for dynamic DMA windows (DDW), from Alexey Kardashevskiy. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-03 14:36:07 +01:00
Aviv Greenberg	5d8d8db851	[media] UVC: Add support for R200 depth camera Add support for Intel R200 depth camera in uvc driver. This includes adding new uvc GUIDs for the new pixel formats, adding new V4L pixel format definition to user api headers, and updating the uvc driver GUID-to-4cc tables with the new formats. Tested-by: Greenberg, Aviv D <aviv.d.greenberg@intel.com> Signed-off-by: Aviv Greenberg <aviv.d.greenberg@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-03-03 06:49:20 -03:00
Hans Verkuil	87294e9db5	[media] media.h: use hex values for IF and AUDIO entities too Make the base offset hexadecimal to simplify debugging since the base addresses are hex too. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-03-03 06:10:14 -03:00
Mauro Carvalho Chehab	664716ad01	Merge branch 'v4l_for_linus' into patchwork We need to import the changes at media.h, as we have a followup patch that depends on it. * v4l_for_linus: [media] media.h: use hex values for range offsets, move connectors base up. [media] adv7604: fix tx 5v detect regression	2016-03-03 06:07:41 -03:00
Hans Verkuil	58402b6e98	[media] media.h: use hex values for range offsets, move connectors base up. Make the base offset hexadecimal to simplify debugging since the base addresses are hex too. The offsets for connectors is also changed to start after the 'reserved' range 0x10000-0x2ffff. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-03-03 05:56:33 -03:00
Pablo Neira Ayuso	8a6bf5da1a	netfilter: nft_masq: support port range Complete masquerading support by allowing port range selection. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-03-02 20:05:27 +01:00
Michael S. Tsirkin	592002f55e	virtio_blk: VIRTIO_BLK_F_WCE->VIRTIO_BLK_F_FLUSH Latest virtio spec says the feature bit name is VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_WCE is the legacy name. virtio blk header says exactly the reverse - fix that and update driver code to match. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-03-02 17:01:59 +02:00
Greg Kroah-Hartman	71e41bbb43	Merge 4.5-rc6 into usb-next We want the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:13:54 -08:00
Greg Kroah-Hartman	3e66848a32	Merge 4.5-rc6 into staging-next We want the staging fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:10:45 -08:00
Alexey Kardashevskiy	58ded4201f	KVM: PPC: Add support for 64bit TCE windows The existing KVM_CREATE_SPAPR_TCE only supports 32bit windows which is not enough for directly mapped windows as the guest can get more than 4GB. This adds KVM_CREATE_SPAPR_TCE_64 ioctl and advertises it via KVM_CAP_SPAPR_TCE_64 capability. The table size is checked against the locked memory limit. Since 64bit windows are to support Dynamic DMA windows (DDW), let's add @bus_offset and @page_shift which are also required by DDW. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org>	2016-03-02 09:56:50 +11:00
Alexey Kardashevskiy	01d01d6919	KVM: PPC: Reserve KVM_CAP_SPAPR_TCE_64 capability number This adds a capability number for 64-bit TCE tables support. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org>	2016-03-02 09:56:50 +11:00
Jamal Hadi Salim	ef6980b6be	introduce IFE action This action allows for a sending side to encapsulate arbitrary metadata which is decapsulated by the receiving end. The sender runs in encoding mode and the receiver in decode mode. Both sender and receiver must specify the same ethertype. At some point we hope to have a registered ethertype and we'll then provide a default so the user doesnt have to specify it. For now we enforce the user specify it. Lets show example usage where we encode icmp from a sender towards a receiver with an skbmark of 17; both sender and receiver use ethertype of 0xdead to interop. YYYY: Lets start with Receiver-side policy config: xxx: add an ingress qdisc sudo tc qdisc add dev $ETH ingress xxx: any packets with ethertype 0xdead will be subjected to ife decoding xxx: we then restart the classification so we can match on icmp at prio 3 sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xdead \ u32 match u32 0 0 flowid 1:1 \ action ife decode reclassify xxx: on restarting the classification from above if it was an icmp xxx: packet, then match it here and continue to the next rule at prio 4 xxx: which will match based on skb mark of 17 sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \ u32 match ip protocol 1 0xff flowid 1:1 \ action continue xxx: match on skbmark of 0x11 (decimal 17) and accept sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \ handle 0x11 fw flowid 1:1 \ action ok xxx: Lets show the decoding policy sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead xxx: filter pref 2 u32 filter pref 2 u32 fh 800: ht divisor 1 filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 (rule hit 0 success 0) match 00000000/00000000 at 0 (success 0 ) action order 1: ife decode action reclassify index 1 ref 1 bind 1 installed 14 sec used 14 sec type: 0x0 Metadata: allow mark allow hash allow prio allow qmap Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 xxx: Observe that above lists all metadatum it can decode. Typically these submodules will already be compiled into a monolithic kernel or loaded as modules YYYY: Lets show the sender side now .. xxx: Add an egress qdisc on the sender netdev sudo tc qdisc add dev $ETH root handle 1: prio xxx: xxx: Match all icmp packets to 192.168.122.237/24, then xxx: tag the packet with skb mark of decimal 17, then xxx: Encode it with: xxx: ethertype 0xdead xxx: add skb->mark to whitelist of metadatum to send xxx: rewrite target dst MAC address to 02:15:15:15:15:15 xxx: sudo $TC filter add dev $ETH parent 1: protocol ip prio 10 u32 \ match ip dst 192.168.122.237/24 \ match ip protocol 1 0xff \ flowid 1:2 \ action skbedit mark 17 \ action ife encode \ type 0xDEAD \ allow mark \ dst 02:15:15:15:15:15 xxx: Lets show the encoding policy sudo tc -s filter ls dev $ETH parent 1: protocol ip xxx: filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2 (rule hit 0 success 0) match c0a87aed/ffffffff at 16 (success 0 ) match 00010000/00ff0000 at 8 (success 0 ) action order 1: skbedit mark 17 index 6 ref 1 bind 1 Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 action order 2: ife encode action pipe index 3 ref 1 bind 1 dst MAC: 02:15:15:15:15:15 type: 0xDEAD Metadata: allow mark Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 xxx: test by sending ping from sender to destination Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-01 17:15:22 -05:00
David S. Miller	d67703fced	Here's another round of updates for -next: * big A-MSDU RX performance improvement (avoid linearize of paged RX) * rfkill changes: cleanups, documentation, platform properties * basic PBSS support in cfg80211 * MU-MIMO action frame processing support * BlockAck reordering & duplicate detection offload support * various cleanups & little fixes -----BEGIN PGP SIGNATURE----- iQIcBAABCgAGBQJW0LZ0AAoJEGt7eEactAAdde0P/2meIOehuHBuAtL7REVoNhri bz9eSHMTg+ozCspL7F6vW1ifDI9AaJEaqJccmriueE/UQVC3VXRPGRJ4SFCwZGo9 Zrtys2v9wOq0+XhxyN65Ucf41O9F/5FFabR5OFbf/pZhW5b2cubEjD1P4BB76Iya 8O6wf9oDDjt3zJgYK+sygm3k9wtDVrH3qEbj8IDnCy22P7010qCsfok9swfaq8OB DBgb6BVfDOFTNXvJGH5fRuUKZdtovzzxorXnoG+zjmKmFdMVdgIYj9+2QfnMjW03 B4/W85svcLLH8V3lHZc4G8oKM4J4XtjH1PskKIMF7ThJsKGMf8tL2vpt9rr8iscd Y9SwTEGc9JmhL7n2FaQFlY6ScLcp4ML+2rXxDOMpBmgF3Ne3yfBsJhLKZEl8vSfI mKhzGXpUKjJxJWIxkR0ylJy4/zHeIXkgRlUEhb8t+jgAqvOBTwiVY+vljHCDUERa sH40r1OqnGJtOHkSRqXSpxwXW+eKgyDd7fnnRX/tyttp2Fuew27/fN63SjpsfN6O 3lfSM5bl3FcCKx7vqTLuqzsoqGvDDYkSq6GDfKDqeZIk0vaXA3SJNEOKgymFWQfR rzsaXvTbBT34GYRg3xS2NCxlmcBPemei/q0x6ZOffxhF41Qpqjs1dPB1Yq3AW4jD HGF+NdRbWEqEFVIjQa8w =JHOe -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-davem-2016-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Here's another round of updates for -next: * big A-MSDU RX performance improvement (avoid linearize of paged RX) * rfkill changes: cleanups, documentation, platform properties * basic PBSS support in cfg80211 * MU-MIMO action frame processing support * BlockAck reordering & duplicate detection offload support * various cleanups & little fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-01 17:03:27 -05:00
Nikolay Aleksandrov	59f78f9f6c	bridge: mcast: add support for more router port information dumping Allow for more multicast router port information to be dumped such as timer and type attributes. For that that purpose we need to extend the MDBA_ROUTER_PORT attribute similar to how it was done for the mdb entries recently. The new format is thus: [MDBA_ROUTER_PORT] = { <- nested attribute u32 ifindex <- router port ifindex for user-space compatibility [MDBA_ROUTER_PATTR attributes] } This way it remains compatible with older users (they'll simply retrieve the u32 in the beginning) and new users can parse the remaining attributes. It would also allow to add future extensions to the router port without breaking compatibility. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-01 16:55:07 -05:00
Nikolay Aleksandrov	a55d8246ab	bridge: mcast: add support for temporary port router Add support for a temporary router port which doesn't depend only on the incoming query. It can be refreshed if set to the same value, which is a no-op for the rest. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-01 16:55:07 -05:00
Nikolay Aleksandrov	7f0aec7a66	bridge: mcast: use names for the different multicast_router types Using raw values makes it difficult to extend and also understand the code, give them names and do explicit per-option manipulation in br_multicast_set_port_router. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-01 16:55:07 -05:00
Jiri Pirko	bfcd3a4661	Introduce devlink infrastructure Introduce devlink infrastructure for drivers to register and expose to userspace via generic Netlink interface. There are two basic objects defined: devlink - one instance for every "parent device", for example switch ASIC devlink port - one instance for every physical port of the device. This initial portion implements basic get/dump of objects to userspace. Also, port splitter and port type setting is implemented. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-01 16:07:29 -05:00
John Fastabend	9e8ce79cd7	net: sched: cls_u32 add bit to specify software only rules In the initial implementation the only way to stop a rule from being inserted into the hardware table was via the device feature flag. However this doesn't work well when working on an end host system where packets are expect to hit both the hardware and software datapaths. For example we can imagine a rule that will match an IP address and increment a field. If we install this rule in both hardware and software we may increment the field twice. To date we have only added support for the drop action so we have been able to ignore these cases. But as we extend the action support we will hit this example plus more such cases. Arguably these are not even corner cases in many working systems these cases will be common. To avoid forcing the driver to always abort (i.e. the above example) this patch adds a flag to add a rule in software only. A careful user can use this flag to build software and hardware datapaths that work together. One example we have found particularly useful is to use hardware resources to set the skb->mark on the skb when the match may be expensive to run in software but a mark lookup in a hash table is cheap. The idea here is hardware can do in one lookup what the u32 classifier may need to traverse multiple lists and hash tables to compute. The flag is only passed down on inserts. On deletion to avoid stale references in hardware we always try to remove a rule if it exists. The flags field is part of the classifier specific options. Although it is tempting to lift this into the generic structure doing this proves difficult do to how the tc netlink attributes are implemented along with how the dump/change routines are called. There is also precedence for putting seemingly generic pieces in the specific classifier options such as TCA_U32_POLICE, TCA_U32_ACT, etc. So although not ideal I've left FLAGS in the u32 options as well as it simplifies the code greatly and user space has already learned how to manage these bits ala 'tc' tool. Another thing if trying to update a rule we require the flags to be unchanged. This is to force user space, software u32 and the hardware u32 to keep in sync. Thanks to Simon Horman for catching this case. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-01 16:05:39 -05:00
Shannon Zhao	f577f6c2a6	arm64: KVM: Introduce per-vcpu kvm device controls In some cases it needs to get/set attributes specific to a vcpu and so needs something else than ONE_REG. Let's copy the KVM_DEVICE approach, and define the respective ioctls for the vcpu file descriptor. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Acked-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-02-29 18:34:21 +00:00
Shannon Zhao	808e738142	arm64: KVM: Add a new feature bit for PMUv3 To support guest PMUv3, use one bit of the VCPU INIT feature array. Initialize the PMU when initialzing the vcpu with that bit and PMU overflow interrupt set. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>	2016-02-29 18:34:21 +00:00
Florian Westphal	b07edbe1cf	netfilter: meta: add PRANDOM support Can be used to randomly match packets e.g. for statistic traffic sampling. See commit `3ad0040573` ("bpf: split state from prandom_u32() and consolidate {c, e}BPF prngs") for more info why this doesn't use prandom_u32 directly. Unlike bpf nft_meta can be built as a module, so add an EXPORT_SYMBOL for prandom_seed_full_state too. Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-02-29 13:55:59 +01:00
Shuah Khan	7a2eba12ff	[media] media: Add ALSA Media Controller function entities Add ALSA Media Controller capture, playback, and mixer function entity defines. [mchehab@osg.samsung.com: fix a trivial merge conflict and start MEDIA_ENT_AUDIO_F from 3000] Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Acked-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-02-27 08:32:54 -03:00
Shuah Khan	5af557a6d2	[media] uapi/media.h: Declare interface types for ALSA Declare the interface types to be used on alsa for the new G_TOPOLOGY ioctl. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Acked-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-02-27 08:14:44 -03:00
David Decotigny	3f1ac7a700	net: ethtool: add new ETHTOOL_xLINKSETTINGS API This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-25 22:06:45 -05:00
David Ahern	f1705ec197	net: ipv6: Make address flushing on ifdown optional Currently, all ipv6 addresses are flushed when the interface is configured down, including global, static addresses: $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 inet6 2100:1::2/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ip link set dev eth1 down $ ip -6 addr show dev eth1 << nothing; all addresses have been flushed>> Add a new sysctl to make this behavior optional. The new setting defaults to flush all addresses to maintain backwards compatibility. When the set global addresses with no expire times are not flushed on an admin down. The sysctl is per-interface or system-wide for all interfaces $ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1 or $ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1 Will keep addresses on eth1 on an admin down. $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000 inet6 2100:1::2/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ip link set dev eth1 down $ ip -6 addr show dev eth1 3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000 inet6 2100:1::2/120 scope global tentative valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative valid_lft forever preferred_lft forever Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-25 21:45:15 -05:00
Linus Walleij	214338e372	gpio: present the consumer of a line to userspace I named the field representing the current user of GPIO line as "label" but this is too vague and ambiguous. Before anyone gets confused, rename it to "consumer" and indicate clearly in the documentation that this is a string set by the user of the line. Also clean up leftovers in the documentation. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2016-02-25 21:07:23 +01:00
Daniel Borkmann	2da897e51d	bpf: fix csum setting for bpf_set_tunnel_key The fix in `35e2d1152b` ("tunnels: Allow IPv6 UDP checksums to be correctly controlled.") changed behavior for bpf_set_tunnel_key() when in use with IPv6 and thus uncovered a bug that TUNNEL_CSUM needed to be set but wasn't. As a result, the stack dropped ingress vxlan IPv6 packets, that have been sent via eBPF through collect meta data mode due to checksum now being zero. Since after LCO, we enable IPv4 checksum by default, so make that analogous and only provide a flag BPF_F_ZERO_CSUM_TX for the user to turn it off in IPv4 case. Fixes: `35e2d1152b` ("tunnels: Allow IPv6 UDP checksums to be correctly controlled.") Fixes: `c6c3345407` ("bpf: support ipv6 for bpf_skb_{set,get}_tunnel_key") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-24 16:23:47 -05:00
Beni Lev	0c9ca11b1a	cfg80211: Add global RRM capability Today, the supplicant will add the RRM capabilities Information Element in the association request only if Quiet period is supported (NL80211_FEATURE_QUIET). Quiet is one of many RRM features, and there are other RRM features that are not related to Quiet (e.g. neighbor report). Therefore, requiring Quiet to enable RRM is too restrictive. Some of the features, like neighbor report, can be supported by user space without any help from the kernel. Hence adding the RRM capabilities IE to association request should be the sole user space's decision. Removing the RRM dependency on Quiet in the driver solves this problem, but using an old driver with a user space tool that would not require Quiet feature would be problematic: the user space would add NL80211_ATTR_USE_RRM in the association request even if the kernel doesn't advertize NL80211_FEATURE_QUIET and the association would be denied by the kernel. This solution adds a global RRM capability, that tells user space that it can request RRM capabilities IE publishment without any specific feature support in the kernel. Signed-off-by: Beni Lev <beni.lev@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2016-02-24 09:04:41 +01:00
Lior David	34d505193b	cfg80211: basic support for PBSS network type PBSS (Personal Basic Service Set) is a new BSS type for DMG networks. It is similar to infrastructure BSS, having an AP-like entity called PCP (PBSS Control Point), but it has few differences. PBSS support is mandatory for 11ad devices. Add support for PBSS by introducing a new PBSS flag attribute. The PBSS flag is used in the START_AP command to request starting a PCP instead of an AP, and in the CONNECT command to request connecting to a PCP instead of an AP. Signed-off-by: Lior David <liord@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2016-02-24 09:04:34 +01:00
João Paulo Rechi Vita	d4634e8dea	rfkill: Update userspace API documentation Add a note to userspace on the effect of RFKILL_OP_CHANGE_ALL also updating the default state for hotplugged devices. Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com> [reword a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2016-02-24 09:04:25 +01:00
Dan Williams	4577b06655	nfit: update address range scrub commands to the acpi 6.1 format The original format of these commands from the "NVDIMM DSM Interface Example" [1] are superseded by the ACPI 6.1 definition of the "NVDIMM Root Device _DSMs" [2]. [1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf [2]: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf "9.20.7 NVDIMM Root Device _DSMs" Changes include: 1/ New 'restart' fields in ars_status, unfortunately these are implemented in the middle of the existing definition so this change is not backwards compatible. The expectation is that shipping platforms will only ever support the ACPI 6.1 definition. 2/ New status values for ars_start ('busy') and ars_status ('overflow'). Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2016-02-23 17:17:20 -08:00
Alex Williamson	f572a960a1	vfio/pci: Intel IGD host and LCP bridge config space access Provide read-only access to PCI config space of the PCI host bridge and LPC bridge through device specific regions. This may be used to configure a VM with matching register contents to satisfy driver requirements. Providing this through the vfio file descriptor removes an additional userspace requirement for access through pci-sysfs and removes the CAP_SYS_ADMIN requirement that doesn't appear to apply to the specific devices we're accessing. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2016-02-22 16:10:09 -07:00
Alex Williamson	5846ff54e8	vfio/pci: Intel IGD OpRegion support This is the first consumer of vfio device specific resource support, providing read-only access to the OpRegion for Intel graphics devices. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2016-02-22 16:10:09 -07:00
Alex Williamson	c7bb4cb40f	vfio: Define device specific region type capability To this point vfio has only provided an interface to the user that allows them to determine the number of regions and specifics about each region. What the region represents is left to the vfio bus driver. vfio-pci chooses to use fixed indexes for fixed resources, index 0 is BAR0, 1 is BAR1,... 7 is config space, etc. This works pretty well since all PCI devices have these regions, even if they don't necessarily populate all of them. Then we start to add things like VGA, which only certain device even support. We added this the same way, but now we've wasted a region index, and due to our offset implementation the corresponding address space, for all devices. Rather than continuing that process, let's try to make regions self describing by including a capability that defines their type. For vfio-pci we'll make the current VFIO_PCI_NUM_REGIONS fixed, defining the end of the static indexes and the beginning of self describing regions. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2016-02-22 16:10:09 -07:00
Alex Williamson	ff63eb638d	vfio: Define sparse mmap capability for regions We can't always support mmap across an entire device region, for example we deny mmaps covering the MSI-X table of PCI devices, but we don't really have a way to report it. We expect the user to implicitly know this restriction. We also can't split the region because vfio-pci defines an API with fixed region index to BAR number mapping. We therefore define a new capability which lists areas within the region that may be mmap'd. In addition to the MSI-X case, this potentially enables in-kernel emulation and extensions to devices. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2016-02-22 16:10:08 -07:00
Alex Williamson	c84982adb2	vfio: Define capability chains We have a few cases where we need to extend the data returned from the INFO ioctls in VFIO. For instance we already have devices exposed through vfio-pci where VFIO_DEVICE_GET_REGION_INFO reports the region as mmap-capable, but really only supports sparse mmaps, avoiding the MSI-X table. If we wanted to provide in-kernel emulation or extended functionality for devices, we'd also want the ability to tell the user not to mmap various regions, rather than forcing them to figure it out on their own. Another example is VFIO_IOMMU_GET_INFO. We'd really like to expose the actual IOVA capabilities of the IOMMU rather than letting the user assume the address space they have available to them. We could add IOVA base and size fields to struct vfio_iommu_type1_info, but what if we have multiple IOVA ranges. For instance x86 uses a range of addresses at 0xfee00000 for MSI vectors. These typically are not available for standard DMA IOVA mappings and splits our available IOVA space into two ranges. POWER systems have both an IOVA window below 4G as well as dynamic data window which they can use to remap all of guest memory. Representing variable sized arrays within a fixed structure makes it very difficult to parse, we'd therefore like to put this data beyond fixed fields within the data structures. One way to do this is to emulate capabilities in PCI configuration space. A new flag indciates whether capabilties are supported and a new fixed field reports the offset of the first entry. Users can then walk the chain to find capabilities, adding capabilities does not require additional fields in the fixed structure, and parsing variable sized data becomes trivial. This patch outlines the theory and base header structure, which should be shared by all future users. Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2016-02-22 16:10:08 -07:00
Daniel Borkmann	2f72959a9c	bpf: fix csum update in bpf_l4_csum_replace helper for udp When using this helper for updating UDP checksums, we need to extend this in order to write CSUM_MANGLED_0 for csum computations that result into 0 as sum. Reason we need this is because packets with a checksum could otherwise become incorrectly marked as a packet without a checksum. Likewise, if the user indicates BPF_F_MARK_MANGLED_0, then we should not turn packets without a checksum into ones with a checksum. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-21 22:07:10 -05:00
Daniel Borkmann	7d672345ed	bpf: add generic bpf_csum_diff helper For L4 checksums, we currently have bpf_l4_csum_replace() helper. It's currently limited to handle 2 and 4 byte changes in a header and feeds the from/to into inet_proto_csum_replace{2,4}() helpers of the kernel. When working with IPv6, for example, this makes it rather cumbersome to deal with, similarly when editing larger parts of a header. Instead, extend the API in a more generic way: For bpf_l4_csum_replace(), add a case for header field mask of 0 to change the checksum at a given offset through inet_proto_csum_replace_by_diff(), and provide a helper bpf_csum_diff() that can generically calculate a from/to diff for arbitrary amounts of data. This can be used in multiple ways: for the bpf_l4_csum_replace() only part, this even provides us with the option to insert precalculated diffs from user space f.e. from a map, or from bpf_csum_diff() during runtime. bpf_csum_diff() has a optional from/to stack buffer input, so we can calculate a diff by using a scratchbuffer for scenarios where we're inserting (from is NULL), removing (to is NULL) or diffing (from/to buffers don't need to be of equal size) data. Also, bpf_csum_diff() allows to feed a previous csum into csum_partial(), so the function can also be cascaded. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-21 22:07:09 -05:00
Alexei Starovoitov	d5a3b1f691	bpf: introduce BPF_MAP_TYPE_STACK_TRACE add new map type to store stack traces and corresponding helper bpf_get_stackid(ctx, map, flags) - walk user or kernel stack and return id @ctx: struct pt_regs* @map: pointer to stack_trace map @flags: bits 0-7 - numer of stack frames to skip bit 8 - collect user stack instead of kernel bit 9 - compare stacks by hash only bit 10 - if two different stacks hash into the same stackid discard old other bits - reserved Return: >= 0 stackid on success or negative error stackid is a 32-bit integer handle that can be further combined with other data (including other stackid) and used as a key into maps. Userspace will access stackmap using standard lookup/delete syscall commands to retrieve full stack trace for given stackid. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-20 00:21:44 -05:00
Kan Liang	ac2c7ad0e5	net/ethtool: introduce a new ioctl for per queue setting Introduce a new ioctl ETHTOOL_PERQUEUE for per queue parameters setting. The following patches will enable some SUB_COMMANDs for per queue setting. Signed-off-by: Kan Liang <kan.liang@intel.com> Reviewed-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-19 22:54:09 -05:00
Laurent Pinchart	4c4400504f	Merge remote-tracking branch 'linuxtv/vsp1' into HEAD	2016-02-20 02:58:43 +02:00
Nikolay Aleksandrov	2125715635	bridge: mdb: add support for more attributes and export timer Currently mdb entries are exported directly as a structure inside MDBA_MDB_ENTRY_INFO attribute, we can't really extend it without breaking user-space. In order to export new mdb fields, I've converted the MDBA_MDB_ENTRY_INFO into a nested attribute which starts like before with struct br_mdb_entry (without header, as it's casted directly in iproute2) and continues with MDBA_MDB_EATTR_ attributes. This way we keep compatibility with older users and can export new data. I've tested this with iproute2, both with and without support for the added attribute and it works fine. So basically we again have MDBA_MDB_ENTRY_INFO with struct br_mdb_entry inside but it may contain also some additional MDBA_MDB_EATTR_ attributes such as MDBA_MDB_EATTR_TIMER which can be parsed by user-space. So the new structure is: [MDBA_MDB] = { [MDBA_MDB_ENTRY] = { [MDBA_MDB_ENTRY_INFO] [MDBA_MDB_ENTRY_INFO] { <- Nested attribute struct br_mdb_entry <- nla_put_nohdr() [MDBA_MDB_ENTRY attributes] <- normal netlink attributes } } } Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-19 15:27:36 -05:00
Laurent Pinchart	d65fae92f9	[media] v4l: Add YUV 4:2:2 and YUV 4:4:4 tri-planar non-contiguous formats The formats use three planes through the multiplanar API, allowing for non-contiguous planes in memory. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-02-19 08:10:37 -02:00
Wu-Cheng Li	cedc12108b	[media] v4l: add V4L2_CID_MPEG_VIDEO_FORCE_KEY_FRAME Some drivers also need a control like V4L2_CID_MPEG_MFC51_VIDEO_FORCE_FRAME_TYPE to force an encoder key frame. Add a general V4L2_CID_MPEG_VIDEO_FORCE_KEY_FRAME so the new drivers and applications can use it. Signed-off-by: Wu-Cheng Li <wuchengli@chromium.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-02-19 08:10:35 -02:00
Linus Walleij	521a2ad6f8	gpio: add userspace ABI for GPIO line information This adds a GPIO line ABI for getting name, label and a few select flags from the kernel. This hides the kernel internals and only tells userspace what it may need to know: the different in-kernel consumers are masked behind the flag "kernel" and that is all userspace needs to know. However electric characteristics like active low, open drain etc are reflected to userspace, as this is important information. We provide information on all lines on all chips, later on we will likely add a flag for the chardev consumer so we can filter and display only the lines userspace actually uses in e.g. lsgpio, but then we first need an ABI for userspace to grab and use (get/set/select direction) a GPIO line. Sample output from "lsgpio" on ux500: GPIO chip: gpiochip7, "8011e000.gpio", 32 GPIO lines line 0: unnamed unlabeled line 1: unnamed unlabeled (...) line 25: unnamed "SFH7741 Proximity Sensor" [kernel output open-drain] line 26: unnamed unlabeled (...) Tested-by: Michael Welling <mwelling@ieee.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2016-02-19 09:48:46 +01:00
Linus Walleij	df4878e969	gpio: store reflect the label to userspace The gpio_chip label is useful for userspace to understand what kind of GPIO chip it is dealing with. Let's store a copy of this label in the gpio_device, add it to the struct passed to userspace for GPIO_GET_CHIPINFO_IOCTL and modify lsgpio to show it. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2016-02-19 09:48:41 +01:00
Florian Westphal	d1b4c689d4	netlink: remove mmapped netlink support mmapped netlink has a number of unresolved issues: - TX zerocopy support had to be disabled more than a year ago via commit `4682a03586` ("netlink: Always copy on mmap TX.") because the content of the mmapped area can change after netlink attribute validation but before message processing. - RX support was implemented mainly to speed up nfqueue dumping packet payload to userspace. However, since commit `ae08ce0021` ("netfilter: nfnetlink_queue: zero copy support") we avoid one copy with the socket-based interface too (via the skb_zerocopy helper). The other problem is that skbs attached to mmaped netlink socket behave different from normal skbs: - they don't have a shinfo area, so all functions that use skb_shinfo() (e.g. skb_clone) cannot be used. - reserving headroom prevents userspace from seeing the content as it expects message to start at skb->head. See for instance commit `aa3a022094` ("netlink: not trim skb for mmaped socket when dump"). - skbs handed e.g. to netlink_ack must have non-NULL skb->sk, else we crash because it needs the sk to check if a tx ring is attached. Also not obvious, leads to non-intuitive bug fixes such as `7c7bdf359` ("netfilter: nfnetlink: use original skbuff when acking batches"). mmaped netlink also didn't play nicely with the skb_zerocopy helper used by nfqueue and openvswitch. Daniel Borkmann fixed this via commit `6bb0fef489` ("netlink, mmap: fix edge-case leakages in nf queue zero-copy")' but at the cost of also needing to provide remaining length to the allocation function. nfqueue also has problems when used with mmaped rx netlink: - mmaped netlink doesn't allow use of nfqueue batch verdict messages. Problem is that in the mmap case, the allocation time also determines the ordering in which the frame will be seen by userspace (A allocating before B means that A is located in earlier ring slot, but this also means that B might get a lower sequence number then A since seqno is decided later. To fix this we would need to extend the spinlocked region to also cover the allocation and message setup which isn't desirable. - nfqueue can now be configured to queue large (GSO) skbs to userspace. Queing GSO packets is faster than having to force a software segmentation in the kernel, so this is a desirable option. However, with a mmap based ring one has to use 64kb per ring slot element, else mmap has to fall back to the socket path (NL_MMAP_STATUS_COPY) for all large packets. To use the mmap interface, userspace not only has to probe for mmap netlink support, it also has to implement a recv/socket receive path in order to handle messages that exceed the size of an rx ring element. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ken-ichirou MATSUZAWA <chamaken@gmail.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-18 11:42:18 -05:00
Eric Dumazet	cd9b266095	tcp: add tcpi_min_rtt and tcpi_notsent_bytes to tcp_info tcpi_min_rtt reports the minimal rtt observed by TCP stack for the flow, in usec unit. Might be ~0U if not yet known. tcpi_notsent_bytes reports the amount of bytes in the write queue that were not yet sent. This is done in a single patch to not add a temporary 32bit padding hole in tcp_info. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-16 20:27:35 -05:00
Aditya Kali	5e2bec7c22	sched: new clone flag CLONE_NEWCGROUP for cgroup namespace CLONE_NEWCGROUP will be used to create new cgroup namespace. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2016-02-16 13:04:58 -05:00
Andrey Smetanin	83326e43f2	kvm/x86: Hyper-V VMBus hypercall userspace exit The patch implements KVM_EXIT_HYPERV userspace exit functionality for Hyper-V VMBus hypercalls: HV_X64_HCALL_POST_MESSAGE, HV_X64_HCALL_SIGNAL_EVENT. Changes v3: * use vcpu->arch.complete_userspace_io to setup hypercall result Changes v2: * use KVM_EXIT_HYPERV for hypercalls Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Joerg Roedel <joro@8bytes.org> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-02-16 18:48:44 +01:00
Mauro Carvalho Chehab	8c755c2910	Merge branch 'fixes' into patchwork Some macros were changed/removed at the material for v4.5. We need to sync with those changes here, in order to avoid troubles. * v4l_for_linus: [media] media.h: get rid of MEDIA_ENT_F_CONN_TEST [media] [for,v4.5] media.h: increase the spacing between function ranges [media] media: i2c/adp1653: probe: fix erroneous return value [media] media: davinci_vpfe: fix missing unlock on error in vpfe_prepare_pipeline()	2016-02-16 09:20:45 -02:00
Mauro Carvalho Chehab	360104e3b8	[media] media.h: get rid of MEDIA_ENT_F_CONN_TEST Defining it as a connector was a bad idea. Remove it while it is not too late. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-02-16 09:16:56 -02:00
Mauro Carvalho Chehab	9727a9545a	[media] media.h: get rid of MEDIA_ENT_F_CONN_TEST Defining it as a connector was a bad idea. Remove it while it is not too late. Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-02-16 09:15:23 -02:00
Hans Verkuil	1f4522400e	[media] [for,v4.5] media.h: increase the spacing between function ranges Each function range is quite narrow and especially for connectors this will pose a problem. Increase the function ranges while we still can and move the connector range to the end so that range is practically limitless. [mchehab@osg.samsung.com: Rebased to apply at Linus tree] Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>	2016-02-16 09:07:16 -02:00
Steinar H. Gunderson	f7d34b445a	USB: Add support for usbfs zerocopy. Add a new interface for userspace to preallocate memory that can be used with usbfs. This gives two primary benefits: - Zerocopy; data no longer needs to be copied between the userspace and the kernel, but can instead be read directly by the driver from userspace's buffers. This works for all kinds of transfers (even if nonsensical for control and interrupt transfers); isochronous also no longer need to memset() the buffer to zero to avoid leaking kernel data. - Once the buffers are allocated, USB transfers can no longer fail due to memory fragmentation; previously, long-running programs could run into problems finding a large enough contiguous memory chunk, especially on embedded systems or at high rates. Memory is allocated by using mmap() against the usbfs file descriptor, and similarly deallocated by munmap(). Once memory has been allocated, using it as pointers to a bulk or isochronous operation means you will automatically get zerocopy behavior. Note that this also means you cannot modify outgoing data until the transfer is complete. The same holds for data on the same cache lines as incoming data; DMA modifying them at the same time could lead to your changes being overwritten. There's a new capability USBDEVFS_CAP_MMAP that userspace can query to see if the running kernel supports this functionality, if just trying mmap() is not acceptable. Largely based on a patch by Markus Rechberger with some updates. The original patch can be found at: http://sundtek.de/support/devio_mmap_v0.4.diff Signed-off-by: Steinar H. Gunderson <sesse@google.com> Signed-off-by: Markus Rechberger <mrechberger@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-14 17:11:48 -08:00
Mathias Nyman	faee822c5a	usb: Add USB 3.1 Precision time measurement capability descriptor support USB 3.1 devices that support precision time measurement have an additional PTM cabaility descriptor as part of the full BOS descriptor Look for this descriptor while parsing the BOS descriptor, and store it in struct usb_hub_bos if it exists. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-14 17:03:23 -08:00
Mathias Nyman	c8b1d8977e	usb: Add USB3.1 SuperSpeedPlus Isoc Endpoint Companion descriptor USB3.1 specifies a SuperSpeedPlus Isoc endpoint companion descriptor which is returned as part of the devices complete configuration descriptor. It contains number of bytes per service interval which is needed when reserving bus time in the schedule for transfers over 48K bytes per service interval. If bmAttributes bit 7 is set in the old SuperSpeed Endpoint Companion descriptor, it will be ollowed by the new SuperSpeedPlus Isoc Endpoint Companion descriptor. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-14 17:03:23 -08:00
Greg Kroah-Hartman	172ad9af55	Merge 4.5-rc4 into usb-next We want the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-14 14:38:30 -08:00

1 2 3 4 5 ...

2701 Commits