Commit Graph

5469 Commits

Author SHA1 Message Date
David S. Miller
f3edacbd69 bpf: Revert bpf_overrid_function() helper changes.
NACK'd by x86 maintainer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-11 18:24:55 +09:00
Maciej Żenczykowski
2210d6b2f2 net: ipv6: sysctl to specify IPv6 ND traffic class
Add a per-device sysctl to specify the default traffic class to use for
kernel originated IPv6 Neighbour Discovery packets.

Currently this includes:

  - Router Solicitation (ICMPv6 type 133)
    ndisc_send_rs() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Neighbour Solicitation (ICMPv6 type 135)
    ndisc_send_ns() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Neighbour Advertisement (ICMPv6 type 136)
    ndisc_send_na() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Redirect (ICMPv6 type 137)
    ndisc_send_redirect() -> ndisc_send_skb() -> ip6_nd_hdr()

and if the kernel ever gets around to generating RA's,
it would presumably also include:

  - Router Advertisement (ICMPv6 type 134)
    (radvd daemon could pick up on the kernel setting and use it)

Interface drivers may examine the Traffic Class value and translate
the DiffServ Code Point into a link-layer appropriate traffic
prioritization scheme.  An example of mapping IETF DSCP values to
IEEE 802.11 User Priority values can be found here:

    https://tools.ietf.org/html/draft-ietf-tsvwg-ieee-802-11

The expected primary use case is to properly prioritize ND over wifi.

Testing:
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  0
  jzem22:~# echo -1 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  -bash: echo: write error: Invalid argument
  jzem22:~# echo 256 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  -bash: echo: write error: Invalid argument
  jzem22:~# echo 0 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# echo 255 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  255
  jzem22:~# echo 34 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  34

  jzem22:~# echo $[0xDC] > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# tcpdump -v -i eth0 icmp6 and src host jzem22.pgc and dst host fe80::1
  tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
  IP6 (class 0xdc, hlim 255, next-header ICMPv6 (58) payload length: 24)
  jzem22.pgc > fe80::1: [icmp6 sum ok] ICMP6, neighbor advertisement,
  length 24, tgt is jzem22.pgc, Flags [solicited]

(based on original change written by Erik Kline, with minor changes)

v2: fix 'suspicious rcu_dereference_check() usage'
    by explicitly grabbing the rcu_read_lock.

Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Erik Kline <ek@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-11 15:13:02 +09:00
Josef Bacik
dd0bb688ea bpf: add a bpf_override_function helper
Error injection is sloppy and very ad-hoc.  BPF could fill this niche
perfectly with it's kprobe functionality.  We could make sure errors are
only triggered in specific call chains that we care about with very
specific situations.  Accomplish this with the bpf_override_funciton
helper.  This will modify the probe'd callers return value to the
specified value and set the PC to an override function that simply
returns, bypassing the originally probed function.  This gives us a nice
clean way to implement systematic error injection for all of our code
paths.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-11 12:18:05 +09:00
Richard Guy Briggs
42d5e37654 audit: filter PATH records keyed on filesystem magic
Tracefs or debugfs were causing hundreds to thousands of PATH records to
be associated with the init_module and finit_module SYSCALL records on a
few modules when the following rule was in place for startup:
	-a always,exit -F arch=x86_64 -S init_module -F key=mod-load

Provide a method to ignore these large number of PATH records from
overwhelming the logs if they are not of interest.  Introduce a new
filter list "AUDIT_FILTER_FS", with a new field type AUDIT_FSTYPE,
which keys off the filesystem 4-octet hexadecimal magic identifier to
filter specific filesystem PATH records.

An example rule would look like:
	-a never,filesystem -F fstype=0x74726163 -F key=ignore_tracefs
	-a never,filesystem -F fstype=0x64626720 -F key=ignore_debugfs

Arguably the better way to address this issue is to disable tracefs and
debugfs on boot from production systems.

See: https://github.com/linux-audit/audit-kernel/issues/16
See: https://github.com/linux-audit/audit-userspace/issues/8
Test case: https://github.com/linux-audit/audit-testsuite/issues/42

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: fixed the whitespace damage in kernel/auditsc.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-11-10 16:08:56 -05:00
David S. Miller
4dc6758d78 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Simple cases of overlapping changes in the packet scheduler.

Must easier to resolve this time.

Which probably means that I screwed it up somehow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-10 10:00:18 +09:00
Mark Greer
4d63adfe12 NFC: Add NFC_CMD_DEACTIVATE_TARGET support
Once an NFC target (i.e., a tag) is found, it remains active until
there is a failure reading or writing it (often caused by the target
moving out of range).  While the target is active, the NFC adapter
and antenna must remain powered.  This wastes power when the target
remains in range but the client application no longer cares whether
it is there or not.

To mitigate this, add a new netlink command that allows userspace
to deactivate an active target.  When issued, this command will cause
the NFC subsystem to act as though the target was moved out of range.
Once the command has been executed, the client application can power
off the NFC adapter to reduce power consumption.

Signed-off-by: Mark Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-11-10 00:03:39 +01:00
Christian Borntraeger
da9a1446d2 KVM: s390: provide a capability for AIS state migration
The AIS capability was introduced in 4.12, while the interface to
migrate the state was added in 4.13. Unfortunately it is not possible
for userspace to detect the migration capability without creating a flic
kvm device. As in QEMU the cpu model detection runs on the "none"
machine this will result in cpu model issues regarding the "ais"
capability.

To get the "ais" capability properly let's add a new KVM capability that
tells userspace that AIS states can be migrated.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
2017-11-09 16:48:51 +01:00
Jason Gerecke
9e429d5649 HID: wacom: generic: Send BTN_STYLUS3 when both barrel switches are set
The Wacom Pro Pen 3D includes a third barrel switch which is intended to
be particularly useful in applications where one frequency uses pan, zoom,
and rotate to navigate around a scene or model. The pen is compatible with
the MobileStudio Pro, 2nd-gen Intuos Pro, and Cintiq Pro. When the third
button is pressed, these devices set both the HID_DG_BARRELSWITCH and
HID_DG_BARRELSWITCH2 usages since their HID descriptors do not include a
usage specific to the button.

Rather than send both BTN_STYLUS and BTN_STYLUS2 when the third button is
pressed, userspace (libinput) has requested that we detect this condition
and report a newly-defined BTN_STYLUS3 event instead. We could define a
quirk specific to devices compatible with the Pro Pen 3D, but the liklihood
of seeing both barrel switch bits set with other pens/devices is low enough
to not worry about (pens mechanically prevent accidental activation of
multiple switches).

Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-11-09 13:32:43 +01:00
Matthew Garrett
096b854648 EVM: Include security.apparmor in EVM measurements
Apparmor will be gaining support for security.apparmor labels, and it
would be helpful to include these in EVM validation now so appropriate
signatures can be generated even before full support is merged.

Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: John Johansen <John.johansen@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Tvrtko Ursulin
40a4884512 drm/i915: Reject unknown syncobj flags
We have to reject unknown flags for uAPI considerations, and also
because the curent implementation limits their i915 storage space
to two bits.

v2: (Chris Wilson)
 * Fix fail in ABI check.
 * Added unknown flags and BUILD_BUG_ON.

v3:
 * Use ARCH_KMALLOC_MINALIGN instead of alignof. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: cf6e7bac63 ("drm/i915: Add support for drm syncobjs")
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171031102326.9738-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit ebcaa1ff8b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-11-08 10:19:45 -08:00
Yi Yang
b2d0f5d5dc openvswitch: enable NSH support
v16->17
 - Fixed disputed check code: keep them in nsh_push and nsh_pop
   but also add them in __ovs_nla_copy_actions

v15->v16
 - Add csum recalculation for nsh_push, nsh_pop and set_nsh
   pointed out by Pravin
 - Move nsh key into the union with ipv4 and ipv6 and add
   check for nsh key in match_validate pointed out by Pravin
 - Add nsh check in validate_set and __ovs_nla_copy_actions

v14->v15
 - Check size in nsh_hdr_from_nlattr
 - Fixed four small issues pointed out By Jiri and Eric

v13->v14
 - Rename skb_push_nsh to nsh_push per Dave's comment
 - Rename skb_pop_nsh to nsh_pop per Dave's comment

v12->v13
 - Fix NSH header length check in set_nsh

v11->v12
 - Fix missing changes old comments pointed out
 - Fix new comments for v11

v10->v11
 - Fix the left three disputable comments for v9
   but not fixed in v10.

v9->v10
 - Change struct ovs_key_nsh to
       struct ovs_nsh_key_base base;
       __be32 context[NSH_MD1_CONTEXT_SIZE];
 - Fix new comments for v9

v8->v9
 - Fix build error reported by daily intel build
   because nsh module isn't selected by openvswitch

v7->v8
 - Rework nested value and mask for OVS_KEY_ATTR_NSH
 - Change pop_nsh to adapt to nsh kernel module
 - Fix many issues per comments from Jiri Benc

v6->v7
 - Remove NSH GSO patches in v6 because Jiri Benc
   reworked it as another patch series and they have
   been merged.
 - Change it to adapt to nsh kernel module added by NSH
   GSO patch series

v5->v6
 - Fix the rest comments for v4.
 - Add NSH GSO support for VxLAN-gpe + NSH and
   Eth + NSH.

v4->v5
 - Fix many comments by Jiri Benc and Eric Garver
   for v4.

v3->v4
 - Add new NSH match field ttl
 - Update NSH header to the latest format
   which will be final format and won't change
   per its author's confirmation.
 - Fix comments for v3.

v2->v3
 - Change OVS_KEY_ATTR_NSH to nested key to handle
   length-fixed attributes and length-variable
   attriubte more flexibly.
 - Remove struct ovs_action_push_nsh completely
 - Add code to handle nested attribute for SET_MASKED
 - Change PUSH_NSH to use the nested OVS_KEY_ATTR_NSH
   to transfer NSH header data.
 - Fix comments and coding style issues by Jiri and Eric

v1->v2
 - Change encap_nsh and decap_nsh to push_nsh and pop_nsh
 - Dynamically allocate struct ovs_action_push_nsh for
   length-variable metadata.

OVS master and 2.8 branch has merged NSH userspace
patch series, this patch is to enable NSH support
in kernel data path in order that OVS can support
NSH in compat mode by porting this.

Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Eric Garver <e@erig.me>
Acked-by: Pravin Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08 16:12:33 +09:00
Nogah Frankel
602f3baf22 net_sch: red: Add offload ability to RED qdisc
Add the ability to offload RED qdisc by using ndo_setup_tc.
There are four commands for RED offloading:
* TC_RED_SET: handles set and change.
* TC_RED_DESTROY: handle qdisc destroy.
* TC_RED_STATS: update the qdiscs counters (given as reference)
* TC_RED_XSTAT: returns red xstats.

Whether RED is being offloaded is being determined every time dump action
is being called because parent change of this qdisc could change its
offload state but doesn't require any RED function to be called.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08 12:23:37 +09:00
Tom Herbert
fddb231ebe ila: Add a hook type for LWT routes
In LWT tunnels both an input and output route method is defined.
If both of these are executed in the same path then double translation
happens and the effect is not correct.

This patch adds a new attribute that indicates the hook type. Two
values are defined for route output and route output. ILA
translation is only done for the one that is set. The default is
to enable ILA on route output.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08 11:20:49 +09:00
Tom Herbert
70d5aef48a ila: allow configuration of identifier type
Allow identifier to be explicitly configured for a mapping.
This can either be one of the identifier types specified in the
ILA draft or a value of ILA_ATYPE_USE_FORMAT which means the
identifier type is inferred from the identifier type field.
If a value other than ILA_ATYPE_USE_FORMAT is set for a
mapping then it is assumed that the identifier type field is
not present in an identifier.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08 11:20:48 +09:00
Tom Herbert
84287bb328 ila: add checksum neutral map auto
Add checksum neutral auto that performs checksum neutral mapping
without using the C-bit. This is enabled by configuration of
a mapping.

The checksum neutral function has been split into
ila_csum_do_neutral_fmt and ila_csum_do_neutral_nofmt. The former
handles the C-bit and includes it in the adjustment value. The latter
just sets the adjustment value on the locator diff only.

Added configuration for checksum neutral map aut in ila_lwt
and ila_xlat.

Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08 11:20:48 +09:00
Felipe Balbi
6f27f4f97e usb: core: add Status Type definitions
USB 3.1 added a PTM_STATUS type. Let's add a define for it and
following patches will let usb_get_status() accept the new argument.

Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-07 15:47:19 +01:00
Ingo Molnar
8c5db92a70 Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts:
	include/linux/compiler-clang.h
	include/linux/compiler-gcc.h
	include/linux/compiler-intel.h
	include/uapi/linux/stddef.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:32:44 +01:00
Roman Gushchin
ebc614f687 bpf, cgroup: implement eBPF-based device controller for cgroup v2
Cgroup v2 lacks the device controller, provided by cgroup v1.
This patch adds a new eBPF program type, which in combination
of previously added ability to attach multiple eBPF programs
to a cgroup, will provide a similar functionality, but with some
additional flexibility.

This patch introduces a BPF_PROG_TYPE_CGROUP_DEVICE program type.
A program takes major and minor device numbers, device type
(block/character) and access type (mknod/read/write) as parameters
and returns an integer which defines if the operation should be
allowed or terminated with -EPERM.

Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 23:26:51 +09:00
David S. Miller
488e5b30d3 Merge tag 'mlx5-updates-2017-11-04' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-updates-2017-11-04

This series includes:

From Huy: dscp to priority mapping for Ethernet packet.

===================================================
First six patches enable differentiated services code point (dscp) to
priority mapping for Ethernet packet. Once this feature is
enabled, the packet is routed to the corresponding priority based on its
dscp. User can combine this feature with priority flow control (pfc)
feature to have priority flow control based on the dscp.

Firmware interface:
Mellanox firmware provides two control knobs for this feature:
  QPTS register allow changing the trust state between dscp and
  pcp mode. The default is pcp mode. Once in dscp mode, firmware will
  route the packet based on its dscp value if the dscp field exists.

  QPDPM register allow mapping a specific dscp (0 to 63) to a
  specific priority (0 to 7). By default, all the dscps are mapped to
  priority zero.

Software interface:
This feature is controlled via application priority TLV. IEEE
specification P802.1Qcd/D2.1 defines priority selector id 5 for
application priority TLV. This APP TLV selector defines DSCP to priority
map. This APP TLV can be sent by the switch or can be set locally using
software such as lldptool. In mlx5 drivers, we add the support for net
dcb's getapp and setapp call back. Mlx5 driver only handles the selector
id 5 application entry (dscp application priority application entry).
If user sends multiple dscp to priority APP TLV entries on the same
dscp, the last sent one will take effect. All the previous sent will be
deleted.

The firmware trust state (in QPTS register) is changed based on the
number of dscp to priority application entries. When the first dscp to
priority application entry is added by the user, the trust state is
changed to dscp. When the last dscp to priority application entry is
deleted by the user, the trust state is changed to pcp.

When the port is in DSCP trust state, the transmit queue is selected
based on the dscp of the skb.

When the port is in DSCP trust state and vport inline mode is not NONE,
firmware requires mlx5 driver to copy the IP header to the
wqe ethernet segment inline header if the skb has it.
This is done by changing the transmit queue sq's min inline mode to L3.
Note that the min inline mode of sqs that belong to other features
such as xdpsq, icosq are not modified.
===================================================

Plus to the dscp series, some small misc changes are include as well:

From Inbar, Ethtool msglvl support and some debug prints in DCBNL logic
From Or Gerlitz, Enlarge the NIC TC offload table size
From Rabie, Initialize destination_flow struct to 0
From Feras, Add inner TTC table to IPoIB flow steering
From Tal, Enable CQE based moderation on TX CQ
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 23:25:02 +09:00
Jakub Kicinski
bd601b6ada bpf: report offload info to user space
Extend struct bpf_prog_info to contain information about program
being bound to a device.  Since the netdev may get destroyed while
program still exists we need a flag to indicate the program is
loaded for a device, even if the device is gone.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 22:26:18 +09:00
Jakub Kicinski
ab3f0063c4 bpf: offload: add infrastructure for loading programs for a specific netdev
The fact that we don't know which device the program is going
to be used on is quite limiting in current eBPF infrastructure.
We have to reverse or limit the changes which kernel makes to
the loaded bytecode if we want it to be offloaded to a networking
device.  We also have to invent new APIs for debugging and
troubleshooting support.

Make it possible to load programs for a specific netdev.  This
helps us to bring the debug information closer to the core
eBPF infrastructure (e.g. we will be able to reuse the verifer
log in device JIT).  It allows device JITs to perform translation
on the original bytecode.

__bpf_prog_get() when called to get a reference for an attachment
point will now refuse to give it if program has a device assigned.
Following patches will add a version of that function which passes
the expected netdev in. @type argument in __bpf_prog_get() is
renamed to attach_type to make it clearer that it's only set on
attachment.

All calls to ndo_bpf are protected by rtnl, only verifier callbacks
are not.  We need a wait queue to make sure netdev doesn't get
destroyed while verifier is still running and calling its driver.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 22:26:18 +09:00
Jiri Benc
79e1ad148c rtnetlink: use netnsid to query interface
Currently, when an application gets netnsid from the kernel (for example as
the result of RTM_GETLINK call on one end of the veth pair), it's not much
useful. There's no reliable way to get to the netns fd from the netnsid, nor
does any kernel API accept netnsid.

Extend the RTM_GETLINK call to also accept netnsid. It will operate on the
netns with the given netnsid in such case. Of course, the calling process
needs to have enough capabilities in the target name space; for now, require
CAP_NET_ADMIN. This can be relaxed in the future.

To signal to the calling process that the kernel understood the new
IFLA_IF_NETNSID attribute in the query, it will include it in the response.
This is needed to detect older kernels, as they will just ignore
IFLA_IF_NETNSID and query in the current name space.

This patch implemetns IFLA_IF_NETNSID only for get and dump. For set
operations, this can be extended later.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 21:49:17 +09:00
Jiri Benc
9354d45203 openvswitch: reliable interface indentification in port dumps
This patch allows reliable identification of netdevice interfaces connected
to openvswitch bridges. In particular, user space queries the netdev
interfaces belonging to the ports for statistics, up/down state, etc.
Datapath dump needs to provide enough information for the user space to be
able to do that.

Currently, only interface names are returned. This is not sufficient, as
openvswitch allows its ports to be in different name spaces and the
interface name is valid only in its name space. What is needed and generally
used in other netlink APIs, is the pair ifindex+netnsid.

The solution is addition of the ifindex+netnsid pair (or only ifindex if in
the same name space) to vport get/dump operation.

On request side, ideally the ifindex+netnsid pair could be used to
get/set/del the corresponding vport. This is not implemented by this patch
and can be added later if needed.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-05 21:49:17 +09:00
Huy Nguyen
ee20598194 net/dcb: Add dscp to priority selector type
IEEE specification P802.1Qcd/D2.1 defines priority selector 5.
This APP TLV selector defines DSCP to priority map.
This patch defines such DSCP selector.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-04 21:23:32 -07:00
Ingo Molnar
fb7df12d64 tools/headers: Synchronize kernel ABI headers
After the SPDX license tags were added a number of tooling headers got out of
sync with their kernel variants, generating lots of build warnings.

Sync them:

 - tools/arch/x86/include/asm/disabled-features.h,
   tools/arch/x86/include/asm/required-features.h,
   tools/include/linux/hash.h:

     Remove the SPDX tag where the kernel version does not have it.

 - tools/include/asm-generic/bitops/__fls.h,
   tools/include/asm-generic/bitops/arch_hweight.h,
   tools/include/asm-generic/bitops/const_hweight.h,
   tools/include/asm-generic/bitops/fls.h,
   tools/include/asm-generic/bitops/fls64.h,
   tools/include/uapi/asm-generic/ioctls.h,
   tools/include/uapi/asm-generic/mman-common.h,
   tools/include/uapi/sound/asound.h,
   tools/include/uapi/linux/kvm.h,
   tools/include/uapi/linux/perf_event.h,
   tools/include/uapi/linux/sched.h,
   tools/include/uapi/linux/vhost.h,
   tools/include/uapi/sound/asound.h:

     Add the SPDX tag of the respective kernel header.

 - tools/include/uapi/linux/bpf_common.h,
   tools/include/uapi/linux/fcntl.h,
   tools/include/uapi/linux/hw_breakpoint.h,
   tools/include/uapi/linux/mman.h,
   tools/include/uapi/linux/stat.h,

     Change the tag to the kernel header version:

       -/* SPDX-License-Identifier: GPL-2.0 */
       +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */

Also sync other header details:

 - include/uapi/sound/asound.h:

     Fix pointless end of line whitespace noise the header grew in this cycle.

 - tools/arch/x86/lib/memcpy_64.S:

     Sync the code and add tools/include/asm/export.h with dummy wrappers
     to support building the kernel side code in a tooling header environment.

 - tools/include/uapi/asm-generic/mman.h,
   tools/include/uapi/linux/bpf.h:

     Sync other details that don't impact tooling's use of the ABIs.

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-04 09:27:46 +01:00
David S. Miller
2a171788ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Files removed in 'net-next' had their license header updated
in 'net'.  We take the remove from 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-04 09:26:51 +09:00
Mario Limonciello
f2645fa317 platform/x86: dell-smbios-wmi: introduce userspace interface
It's important for the driver to provide a R/W ioctl to ensure that
two competing userspace processes don't race to provide or read each
others data.

This userspace character device will be used to perform SMBIOS calls
from any applications.

It provides an ioctl that will allow passing the WMI calling
interface buffer between userspace and kernel space.

This character device is intended to deprecate the dcdbas kernel module
and the interface that it provides to userspace.

To perform an SMBIOS IOCTL call using the character device userspace will
perform a read() on the the character device.  The WMI bus will provide
a u64 variable containing the necessary size of the IOCTL buffer.

The API for interacting with this interface is defined in documentation
as well as the WMI uapi header provides the format of the structures.

Not all userspace requests will be accepted.  The dell-smbios filtering
functionality will be used to prevent access to certain tokens and calls.

All whitelisted commands and tokens are now shared out to userspace so
applications don't need to define them in their own headers.

Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Edward O'Callaghan <quasisec@google.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-11-03 16:34:00 -07:00
Mario Limonciello
44b6b76611 platform/x86: wmi: create userspace interface for drivers
For WMI operations that are only Set or Query readable and writable sysfs
attributes created by WMI vendor drivers or the bus driver makes sense.

For other WMI operations that are run on Method, there needs to be a
way to guarantee to userspace that the results from the method call
belong to the data request to the method call.  Sysfs attributes don't
work well in this scenario because two userspace processes may be
competing at reading/writing an attribute and step on each other's
data.

When a WMI vendor driver declares a callback method in the wmi_driver
the WMI bus driver will create a character device that maps to that
function.  This callback method will be responsible for filtering
invalid requests and performing the actual call.

That character device will correspond to this path:
/dev/wmi/$driver

Performing read() on this character device will provide the size
of the buffer that the character device needs to perform calls.
This buffer size can be set by vendor drivers through a new symbol
or when MOF parsing is available by the MOF.

Performing ioctl() on this character device will be interpretd
by the WMI bus driver. It will perform sanity tests for size of
data, test them for a valid instance, copy the data from userspace
and pass iton to the vendor driver to further process and run.

This creates an implicit policy that each driver will only be allowed
a single character device.  If a module matches multiple GUID's,
the wmi_devices will need to be all handled by the same wmi_driver.

The WMI vendor drivers will be responsible for managing inappropriate
access to this character device and proper locking on data used by
it.

When a WMI vendor driver is unloaded the WMI bus driver will clean
up the character device and any memory allocated for the call.

Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Edward O'Callaghan <quasisec@google.com>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2017-11-03 16:34:00 -07:00
Dave Martin
2d2123bc7c arm64/sve: Add prctl controls for userspace vector length management
This patch adds two arm64-specific prctls, to permit userspace to
control its vector length:

 * PR_SVE_SET_VL: set the thread's SVE vector length and vector
   length inheritance mode.

 * PR_SVE_GET_VL: get the same information.

Although these prctls resemble instruction set features in the SVE
architecture, they provide additional control: the vector length
inheritance mode is Linux-specific and nothing to do with the
architecture, and the architecture does not permit EL0 to set its
own vector length directly.  Both can be used in portable tools
without requiring the use of SVE instructions.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
[will: Fixed up prctl constants to avoid clash with PDEATHSIG]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:24:19 +00:00
Dave Martin
43d4da2c45 arm64/sve: ptrace and ELF coredump support
This patch defines and implements a new regset NT_ARM_SVE, which
describes a thread's SVE register state.  This allows a debugger to
manipulate the SVE state, as well as being included in ELF
coredumps for post-mortem debugging.

Because the regset size and layout are dependent on the thread's
current vector length, it is not possible to define a C struct to
describe the regset contents as is done for existing regsets.
Instead, and for the same reasons, NT_ARM_SVE is based on the
freeform variable-layout approach used for the SVE signal frame.

Additionally, to reduce debug overhead when debugging threads that
might or might not have live SVE register state, NT_ARM_SVE may be
presented in one of two different formats: the old struct
user_fpsimd_state format is embedded for describing the state of a
thread with no live SVE state, whereas a new variable-layout
structure is embedded for describing live SVE state.  This avoids a
debugger needing to poll NT_PRFPREG in addition to NT_ARM_SVE, and
allows existing userspace code to handle the non-SVE case without
too much modification.

For this to work, NT_ARM_SVE is defined with a fixed-format header
of type struct user_sve_header, which the recipient can use to
figure out the content, size and layout of the reset of the regset.
Accessor macros are defined to allow the vector-length-dependent
parts of the regset to be manipulated.

Signed-off-by: Alan Hayward <alan.hayward@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Okamoto Takayuki <tokamoto@jp.fujitsu.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:24:18 +00:00
Dave Martin
7582e22038 arm64/sve: Backend logic for setting the vector length
This patch implements the core logic for changing a task's vector
length on request from userspace.  This will be used by the ptrace
and prctl frontends that are implemented in later patches.

The SVE architecture permits, but does not require, implementations
to support vector lengths that are not a power of two.  To handle
this, logic is added to check a requested vector length against a
possibly sparse bitmap of available vector lengths at runtime, so
that the best supported value can be chosen.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-03 15:24:16 +00:00
Jan Kara
b6fb293f24 mm: Define MAP_SYNC and VM_SYNC flags
Define new MAP_SYNC flag and corresponding VMA VM_SYNC flag. As the
MAP_SYNC flag is not part of LEGACY_MAP_MASK, currently it will be
refused by all MAP_SHARED_VALIDATE map attempts and silently ignored for
everything else.

Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-11-03 06:26:25 -07:00
Dan Williams
1c97259740 mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
The mmap(2) syscall suffers from the ABI anti-pattern of not validating
unknown flags. However, proposals like MAP_SYNC need a mechanism to
define new behavior that is known to fail on older kernels without the
support. Define a new MAP_SHARED_VALIDATE flag pattern that is
guaranteed to fail on all legacy mmap implementations.

It is worth noting that the original proposal was for a standalone
MAP_VALIDATE flag. However, when that  could not be supported by all
archs Linus observed:

    I see why you *think* you want a bitmap. You think you want
    a bitmap because you want to make MAP_VALIDATE be part of MAP_SYNC
    etc, so that people can do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED
		    | MAP_SYNC, fd, 0);

    and "know" that MAP_SYNC actually takes.

    And I'm saying that whole wish is bogus. You're fundamentally
    depending on special semantics, just make it explicit. It's already
    not portable, so don't try to make it so.

    Rename that MAP_VALIDATE as MAP_SHARED_VALIDATE, make it have a value
    of 0x3, and make people do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED_VALIDATE
		    | MAP_SYNC, fd, 0);

    and then the kernel side is easier too (none of that random garbage
    playing games with looking at the "MAP_VALIDATE bit", but just another
    case statement in that map type thing.

    Boom. Done.

Similar to ->fallocate() we also want the ability to validate the
support for new flags on a per ->mmap() 'struct file_operations'
instance basis.  Towards that end arrange for flags to be generically
validated against a mmap_supported_flags exported by 'struct
file_operations'. By default all existing flags are implicitly
supported, but new flags require MAP_SHARED_VALIDATE and
per-instance-opt-in.

Cc: Jan Kara <jack@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-11-03 06:26:22 -07:00
Linus Torvalds
ead751507d License cleanup: add SPDX license identifiers to some files
Many source files in the tree are missing licensing information, which
 makes it harder for compliance tools to determine the correct license.
 
 By default all files without license information are under the default
 license of the kernel, which is GPL version 2.
 
 Update the files which contain no license information with the 'GPL-2.0'
 SPDX license identifier.  The SPDX identifier is a legally binding
 shorthand, which can be used instead of the full boiler plate text.
 
 This patch is based on work done by Thomas Gleixner and Kate Stewart and
 Philippe Ombredanne.
 
 How this work was done:
 
 Patches were generated and checked against linux-4.14-rc6 for a subset of
 the use cases:
  - file had no licensing information it it.
  - file was a */uapi/* one with no licensing information in it,
  - file was a */uapi/* one with existing licensing information,
 
 Further patches will be generated in subsequent months to fix up cases
 where non-standard license headers were used, and references to license
 had to be inferred by heuristics based on keywords.
 
 The analysis to determine which SPDX License Identifier to be applied to
 a file was done in a spreadsheet of side by side results from of the
 output of two independent scanners (ScanCode & Windriver) producing SPDX
 tag:value files created by Philippe Ombredanne.  Philippe prepared the
 base worksheet, and did an initial spot review of a few 1000 files.
 
 The 4.13 kernel was the starting point of the analysis with 60,537 files
 assessed.  Kate Stewart did a file by file comparison of the scanner
 results in the spreadsheet to determine which SPDX license identifier(s)
 to be applied to the file. She confirmed any determination that was not
 immediately clear with lawyers working with the Linux Foundation.
 
 Criteria used to select files for SPDX license identifier tagging was:
  - Files considered eligible had to be source code files.
  - Make and config files were included as candidates if they contained >5
    lines of source
  - File already had some variant of a license header in it (even if <5
    lines).
 
 All documentation files were explicitly excluded.
 
 The following heuristics were used to determine which SPDX license
 identifiers to apply.
 
  - when both scanners couldn't find any license traces, file was
    considered to have no license information in it, and the top level
    COPYING file license applied.
 
    For non */uapi/* files that summary was:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|-------
    GPL-2.0                                              11139
 
    and resulted in the first patch in this series.
 
    If that file was a */uapi/* path one, it was "GPL-2.0 WITH
    Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note                        930
 
    and resulted in the second patch in this series.
 
  - if a file had some form of licensing information in it, and was one
    of the */uapi/* ones, it was denoted with the Linux-syscall-note if
    any GPL family license was found in the file or had no licensing in
    it (per prior point).  Results summary:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|------
    GPL-2.0 WITH Linux-syscall-note                       270
    GPL-2.0+ WITH Linux-syscall-note                      169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
    LGPL-2.1+ WITH Linux-syscall-note                      15
    GPL-1.0+ WITH Linux-syscall-note                       14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
    LGPL-2.0+ WITH Linux-syscall-note                       4
    LGPL-2.1 WITH Linux-syscall-note                        3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
 
    and that resulted in the third patch in this series.
 
  - when the two scanners agreed on the detected license(s), that became
    the concluded license(s).
 
  - when there was disagreement between the two scanners (one detected a
    license but the other didn't, or they both detected different
    licenses) a manual inspection of the file occurred.
 
  - In most cases a manual inspection of the information in the file
    resulted in a clear resolution of the license that should apply (and
    which scanner probably needed to revisit its heuristics).
 
  - When it was not immediately clear, the license identifier was
    confirmed with lawyers working with the Linux Foundation.
 
  - If there was any question as to the appropriate license identifier,
    the file was flagged for further research and to be revisited later
    in time.
 
 In total, over 70 hours of logged manual review was done on the
 spreadsheet to determine the SPDX license identifiers to apply to the
 source files by Kate, Philippe, Thomas and, in some cases, confirmation
 by lawyers working with the Linux Foundation.
 
 Kate also obtained a third independent scan of the 4.13 code base from
 FOSSology, and compared selected files where the other two scanners
 disagreed against that SPDX file, to see if there was new insights.  The
 Windriver scanner is based on an older version of FOSSology in part, so
 they are related.
 
 Thomas did random spot checks in about 500 files from the spreadsheets
 for the uapi headers and agreed with SPDX license identifier in the
 files he inspected. For the non-uapi files Thomas did random spot checks
 in about 15000 files.
 
 In initial set of patches against 4.14-rc6, 3 files were found to have
 copy/paste license identifier errors, and have been fixed to reflect the
 correct identifier.
 
 Additionally Philippe spent 10 hours this week doing a detailed manual
 inspection and review of the 12,461 patched files from the initial patch
 version early this week with:
  - a full scancode scan run, collecting the matched texts, detected
    license ids and scores
  - reviewing anything where there was a license detected (about 500+
    files) to ensure that the applied SPDX license was correct
  - reviewing anything where there was no detection but the patch license
    was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
    SPDX license was correct
 
 This produced a worksheet with 20 files needing minor correction.  This
 worksheet was then exported into 3 different .csv files for the
 different types of files to be modified.
 
 These .csv files were then reviewed by Greg.  Thomas wrote a script to
 parse the csv files and add the proper SPDX tag to the file, in the
 format that the file expected.  This script was further refined by Greg
 based on the output to detect more types of files automatically and to
 distinguish between header and source .c files (which need different
 comment types.)  Finally Greg ran the script using the .csv files to
 generate the patches.
 
 Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
 Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
 Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWfswbQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykvEwCfXU1MuYFQGgMdDmAZXEc+xFXZvqgAoKEcHDNA
 6dVh26uchcEQLN/XqUDt
 =x306
 -----END PGP SIGNATURE-----

Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull initial SPDX identifiers from Greg KH:
 "License cleanup: add SPDX license identifiers to some files

  Many source files in the tree are missing licensing information, which
  makes it harder for compliance tools to determine the correct license.

  By default all files without license information are under the default
  license of the kernel, which is GPL version 2.

  Update the files which contain no license information with the
  'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
  binding shorthand, which can be used instead of the full boiler plate
  text.

  This patch is based on work done by Thomas Gleixner and Kate Stewart
  and Philippe Ombredanne.

  How this work was done:

  Patches were generated and checked against linux-4.14-rc6 for a subset
  of the use cases:

   - file had no licensing information it it.

   - file was a */uapi/* one with no licensing information in it,

   - file was a */uapi/* one with existing licensing information,

  Further patches will be generated in subsequent months to fix up cases
  where non-standard license headers were used, and references to
  license had to be inferred by heuristics based on keywords.

  The analysis to determine which SPDX License Identifier to be applied
  to a file was done in a spreadsheet of side by side results from of
  the output of two independent scanners (ScanCode & Windriver)
  producing SPDX tag:value files created by Philippe Ombredanne.
  Philippe prepared the base worksheet, and did an initial spot review
  of a few 1000 files.

  The 4.13 kernel was the starting point of the analysis with 60,537
  files assessed. Kate Stewart did a file by file comparison of the
  scanner results in the spreadsheet to determine which SPDX license
  identifier(s) to be applied to the file. She confirmed any
  determination that was not immediately clear with lawyers working with
  the Linux Foundation.

  Criteria used to select files for SPDX license identifier tagging was:

   - Files considered eligible had to be source code files.

   - Make and config files were included as candidates if they contained
     >5 lines of source

   - File already had some variant of a license header in it (even if <5
     lines).

  All documentation files were explicitly excluded.

  The following heuristics were used to determine which SPDX license
  identifiers to apply.

   - when both scanners couldn't find any license traces, file was
     considered to have no license information in it, and the top level
     COPYING file license applied.

     For non */uapi/* files that summary was:

       SPDX license identifier                            # files
       ---------------------------------------------------|-------
       GPL-2.0                                              11139

     and resulted in the first patch in this series.

     If that file was a */uapi/* path one, it was "GPL-2.0 WITH
     Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
     was:

       SPDX license identifier                            # files
       ---------------------------------------------------|-------
       GPL-2.0 WITH Linux-syscall-note                        930

     and resulted in the second patch in this series.

   - if a file had some form of licensing information in it, and was one
     of the */uapi/* ones, it was denoted with the Linux-syscall-note if
     any GPL family license was found in the file or had no licensing in
     it (per prior point). Results summary:

       SPDX license identifier                            # files
       ---------------------------------------------------|------
       GPL-2.0 WITH Linux-syscall-note                       270
       GPL-2.0+ WITH Linux-syscall-note                      169
       ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
       ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
       LGPL-2.1+ WITH Linux-syscall-note                      15
       GPL-1.0+ WITH Linux-syscall-note                       14
       ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
       LGPL-2.0+ WITH Linux-syscall-note                       4
       LGPL-2.1 WITH Linux-syscall-note                        3
       ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
       ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

     and that resulted in the third patch in this series.

   - when the two scanners agreed on the detected license(s), that
     became the concluded license(s).

   - when there was disagreement between the two scanners (one detected
     a license but the other didn't, or they both detected different
     licenses) a manual inspection of the file occurred.

   - In most cases a manual inspection of the information in the file
     resulted in a clear resolution of the license that should apply
     (and which scanner probably needed to revisit its heuristics).

   - When it was not immediately clear, the license identifier was
     confirmed with lawyers working with the Linux Foundation.

   - If there was any question as to the appropriate license identifier,
     the file was flagged for further research and to be revisited later
     in time.

  In total, over 70 hours of logged manual review was done on the
  spreadsheet to determine the SPDX license identifiers to apply to the
  source files by Kate, Philippe, Thomas and, in some cases,
  confirmation by lawyers working with the Linux Foundation.

  Kate also obtained a third independent scan of the 4.13 code base from
  FOSSology, and compared selected files where the other two scanners
  disagreed against that SPDX file, to see if there was new insights.
  The Windriver scanner is based on an older version of FOSSology in
  part, so they are related.

  Thomas did random spot checks in about 500 files from the spreadsheets
  for the uapi headers and agreed with SPDX license identifier in the
  files he inspected. For the non-uapi files Thomas did random spot
  checks in about 15000 files.

  In initial set of patches against 4.14-rc6, 3 files were found to have
  copy/paste license identifier errors, and have been fixed to reflect
  the correct identifier.

  Additionally Philippe spent 10 hours this week doing a detailed manual
  inspection and review of the 12,461 patched files from the initial
  patch version early this week with:

   - a full scancode scan run, collecting the matched texts, detected
     license ids and scores

   - reviewing anything where there was a license detected (about 500+
     files) to ensure that the applied SPDX license was correct

   - reviewing anything where there was no detection but the patch
     license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
     applied SPDX license was correct

  This produced a worksheet with 20 files needing minor correction. This
  worksheet was then exported into 3 different .csv files for the
  different types of files to be modified.

  These .csv files were then reviewed by Greg. Thomas wrote a script to
  parse the csv files and add the proper SPDX tag to the file, in the
  format that the file expected. This script was further refined by Greg
  based on the output to detect more types of files automatically and to
  distinguish between header and source .c files (which need different
  comment types.) Finally Greg ran the script using the .csv files to
  generate the patches.

  Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
  Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  License cleanup: add SPDX license identifier to uapi header files with a license
  License cleanup: add SPDX license identifier to uapi header files with no license
  License cleanup: add SPDX GPL-2.0 license identifier to files with no license
2017-11-02 10:04:46 -07:00
Greg Kroah-Hartman
e2be04c7f9 License cleanup: add SPDX license identifier to uapi header files with a license
Many user space API headers have licensing information, which is either
incomplete, badly formatted or just a shorthand for referring to the
license under which the file is supposed to be.  This makes it hard for
compliance tools to determine the correct license.

Update these files with an SPDX license identifier.  The identifier was
chosen based on the license information in the file.

GPL/LGPL licensed headers get the matching GPL/LGPL SPDX license
identifier with the added 'WITH Linux-syscall-note' exception, which is
the officially assigned exception identifier for the kernel syscall
exception:

   NOTE! This copyright does *not* cover user programs that use kernel
   services by normal system calls - this is merely considered normal use
   of the kernel, and does *not* fall under the heading of "derived work".

This exception makes it possible to include GPL headers into non GPL
code, without confusing license compliance tools.

Headers which have either explicit dual licensing or are just licensed
under a non GPL license are updated with the corresponding SPDX
identifier and the GPLv2 with syscall exception identifier.  The format
is:
        ((GPL-2.0 WITH Linux-syscall-note) OR SPDX-ID-OF-OTHER-LICENSE)

SPDX license identifiers are a legally binding shorthand, which can be
used instead of the full boiler plate text.  The update does not remove
existing license information as this has to be done on a case by case
basis and the copyright holders might have to be consulted. This will
happen in a separate step.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.  See the previous patch in this series for the
methodology of how this patch was researched.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:20:11 +01:00
Greg Kroah-Hartman
6f52b16c5b License cleanup: add SPDX license identifier to uapi header files with no license
Many user space API headers are missing licensing information, which
makes it hard for compliance tools to determine the correct license.

By default are files without license information under the default
license of the kernel, which is GPLV2.  Marking them GPLV2 would exclude
them from being included in non GPLV2 code, which is obviously not
intended. The user space API headers fall under the syscall exception
which is in the kernels COPYING file:

   NOTE! This copyright does *not* cover user programs that use kernel
   services by normal system calls - this is merely considered normal use
   of the kernel, and does *not* fall under the heading of "derived work".

otherwise syscall usage would not be possible.

Update the files which contain no license information with an SPDX
license identifier.  The chosen identifier is 'GPL-2.0 WITH
Linux-syscall-note' which is the officially assigned identifier for the
Linux syscall exception.  SPDX license identifiers are a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.  See the previous patch in this series for the
methodology of how this patch was researched.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:19:54 +01:00
David S. Miller
ed29668d1a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Smooth Cong Wang's bug fix into 'net-next'.  Basically put
the bulk of the tcf_block_put() logic from 'net' into
tcf_block_put_ext(), but after the offload unbind.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-02 15:23:39 +09:00
Zygo Blaxell
d24a67b2d9 btrfs: add a flags argument to LOGICAL_INO and call it LOGICAL_INO_V2
Now that check_extent_in_eb()'s extent offset filter can be turned off,
we need a way to do it from userspace.

Add a 'flags' field to the btrfs_logical_ino_args structure to disable
extent offset filtering, taking the place of one of the existing
reserved[] fields.

Previous versions of LOGICAL_INO neglected to check whether any of the
reserved fields have non-zero values.  Assigning meaning to those fields
now may change the behavior of existing programs that left these fields
uninitialized.  The lack of a zero check also means that new programs
have no way to know whether the kernel is honoring the flags field.

To avoid these problems, define a new ioctl LOGICAL_INO_V2.  We can
use the same argument layout as LOGICAL_INO, but shorten the reserved[]
array by one element and turn it into the 'flags' field.  The V2 ioctl
explicitly checks that reserved fields and unsupported flag bits are zero
so that userspace can negotiate future feature bits as they are defined.

Since the memory layouts of the two ioctls' arguments are compatible,
there is no need for a separate function for logical_to_ino_v2 (contrast
with tree_search_v2 vs tree_search where the layout and code are quite
different).  A version parameter and an 'if' statement will suffice.

Now that we have a flags field in logical_ino_args, add a flag
BTRFS_LOGICAL_INO_ARGS_IGNORE_OFFSET to get the behavior we want,
and pass it down the stack to iterate_inodes_from_logical.

Motivation and background, copied from the patchset cover letter:

Suppose we have a file with one extent:

    root@tester:~# zcat /usr/share/doc/cpio/changelog.gz > /test/a
    root@tester:~# sync

Split the extent by overwriting it in the middle:

    root@tester:~# cat /dev/urandom | dd bs=4k seek=2 skip=2 count=1 conv=notrunc of=/test/a

We should now have 3 extent refs to 2 extents, with one block unreachable.
The extent tree looks like:

    root@tester:~# btrfs-debug-tree /dev/vdc -t 2
    [...]
            item 9 key (1103101952 EXTENT_ITEM 73728) itemoff 15942 itemsize 53
                    extent refs 2 gen 29 flags DATA
                    extent data backref root 5 objectid 261 offset 0 count 2
    [...]
            item 11 key (1103175680 EXTENT_ITEM 4096) itemoff 15865 itemsize 53
                    extent refs 1 gen 30 flags DATA
                    extent data backref root 5 objectid 261 offset 8192 count 1
    [...]

and the ref tree looks like:

    root@tester:~# btrfs-debug-tree /dev/vdc -t 5
    [...]
            item 6 key (261 EXTENT_DATA 0) itemoff 15825 itemsize 53
                    extent data disk byte 1103101952 nr 73728
                    extent data offset 0 nr 8192 ram 73728
                    extent compression(none)
            item 7 key (261 EXTENT_DATA 8192) itemoff 15772 itemsize 53
                    extent data disk byte 1103175680 nr 4096
                    extent data offset 0 nr 4096 ram 4096
                    extent compression(none)
            item 8 key (261 EXTENT_DATA 12288) itemoff 15719 itemsize 53
                    extent data disk byte 1103101952 nr 73728
                    extent data offset 12288 nr 61440 ram 73728
                    extent compression(none)
    [...]

There are two references to the same extent with different, non-overlapping
byte offsets:

    [------------------72K extent at 1103101952----------------------]
    [--8K----------------|--4K unreachable----|--60K-----------------]
    ^                                         ^
    |                                         |
    [--8K ref offset 0--][--4K ref offset 0--][--60K ref offset 12K--]
                         |
                         v
                         [-----4K extent-----] at 1103175680

We want to find all of the references to extent bytenr 1103101952.

Without the patch (and without running btrfs-debug-tree), we have to
do it with 18 LOGICAL_INO calls:

    root@tester:~# btrfs ins log 1103101952 -P /test/
    Using LOGICAL_INO
    inode 261 offset 0 root 5

    root@tester:~# for x in $(seq 0 17); do btrfs ins log $((1103101952 + x * 4096)) -P /test/; done 2>&1 | grep inode
    inode 261 offset 0 root 5
    inode 261 offset 4096 root 5   <- same extent ref as offset 0
                                   (offset 8192 returns empty set, not reachable)
    inode 261 offset 12288 root 5
    inode 261 offset 16384 root 5  \
    inode 261 offset 20480 root 5  |
    inode 261 offset 24576 root 5  |
    inode 261 offset 28672 root 5  |
    inode 261 offset 32768 root 5  |
    inode 261 offset 36864 root 5  \
    inode 261 offset 40960 root 5   > all the same extent ref as offset 12288.
    inode 261 offset 45056 root 5  /  More processing required in userspace
    inode 261 offset 49152 root 5  |  to figure out these are all duplicates.
    inode 261 offset 53248 root 5  |
    inode 261 offset 57344 root 5  |
    inode 261 offset 61440 root 5  |
    inode 261 offset 65536 root 5  |
    inode 261 offset 69632 root 5  /

In the worst case the extents are 128MB long, and we have to do 32768
iterations of the loop to find one 4K extent ref.

With the patch, we just use one call to map all refs to the extent at once:
    root@tester:~# btrfs ins log 1103101952 -P /test/
    Using LOGICAL_INO_V2
    inode 261 offset 0 root 5
    inode 261 offset 12288 root 5

The TREE_SEARCH ioctl allows userspace to retrieve the offset and
extent bytenr fields easily once the root, inode and offset are known.
This is sufficient information to build a complete map of the extent
and all of its references.  Userspace can use this information to make
better choices to dedup or defrag.

Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
[ copy background and motivation from cover letter ]
Signed-off-by: David Sterba <dsterba@suse.com>
2017-11-01 20:45:35 +01:00
John Fastabend
04686ef299 bpf: remove SK_REDIRECT from UAPI
Now that SK_REDIRECT is no longer a valid return code. Remove it
from the UAPI completely. Then do a namespace remapping internal
to sockmap so SK_REDIRECT is no longer externally visible.

Patchs primary change is to do a namechange from SK_REDIRECT to
__SK_REDIRECT

Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01 11:43:50 +09:00
Arnd Bergmann
cb91775711 isofs: use unsigned char types consistently
Based on the discussion about the signed character field for the year,
I went through all fields in the iso9660 and rockridge standards to see
whether they should used signed or unsigned characters. Only a single
8-bit value is defined as signed per 'section 7.1.2': the timezone
offset in a timestamp, this has always been handled correctly through
explicit sign-extension.

All others are either '7.1.1 8-bit unsigned numerical values' or
composite fields. I also read the linux source code and came to the
same conclusion, also I could not find any other part of the
implementation that actually behaves differently for signed or
unsigned values.

Since it is still ambigous to use plain 'char' in interface definitions,
I'm changing all fields representing numbers and reserved bytes to
the unambiguous '__u8'. Fields that hold actual strings are left as
'char' arrays. I built the code with '-Wpointer-sign -Wsign-compare'
to see if anything got left out, but couldn't find anything wrong
with the remaining warnings.

This patch should not change runtime behavior and does not need to
be backported.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-10-31 18:11:33 +01:00
David S. Miller
e1ea2f9856 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several conflicts here.

NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to
nfp_fl_output() needed some adjustments because the code block is in
an else block now.

Parallel additions to net/pkt_cls.h and net/sch_generic.h

A bug fix in __tcp_retransmit_skb() conflicted with some of
the rbtree changes in net-next.

The tc action RCU callback fixes in 'net' had some overlap with some
of the recent tcf_block reworking.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-30 21:09:24 +09:00
Qu Wenruo
40c3c40947 btrfs: Add sanity check for EXTENT_DATA when reading out leaf
Add extra checks for item with EXTENT_DATA type.  This checks the
following thing:

0) Key offset
   All key offsets must be aligned to sectorsize.
   Inline extent must have 0 for key offset.

1) Item size
   Uncompressed inline file extent size must match item size.
   (Compressed inline file extent has no information about its on-disk size.)
   Regular/preallocated file extent size must be a fixed value.

2) Every member of regular file extent item
   Including alignment for bytenr and offset, possible value for
   compression/encryption/type.

3) Type/compression/encode must be one of the valid values.

This should be the most comprehensive and strict check in the context
of btrfs_item for EXTENT_DATA.

Signed-off-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ switch to BTRFS_FILE_EXTENT_TYPES, similar to what
  BTRFS_COMPRESS_TYPES does ]
Signed-off-by: David Sterba <dsterba@suse.com>
2017-10-30 12:27:58 +01:00
Linus Torvalds
19e12196da Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix route leak in xfrm_bundle_create().

 2) In mac80211, validate user rate mask before configuring it. From
    Johannes Berg.

 3) Properly enforce memory limits in fair queueing code, from Toke
    Hoiland-Jorgensen.

 4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet.

 5) Fix TSO header allocation and management in mvpp2 driver, from Yan
    Markman.

 6) Don't take socket lock in BH handler in strparser code, from Tom
    Herbert.

 7) Don't show sockets from other namespaces in AF_UNIX code, from
    Andrei Vagin.

 8) Fix double free in error path of tap_open(), from Girish Moodalbail.

 9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker
    and Alexander Duyck.

10) Fix DCB mode programming in stmmac driver, from Jose Abreu.

11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin
    Long.

12) Properly align SKB head before building SKB in tuntap, from Jason
    Wang.

13) Avoid matching qdiscs with a zero handle during lookups, from Cong
    Wang.

14) Fix various endianness bugs in sctp, from Xin Long.

15) Fix tc filter callback races and add selftests which trigger the
    problem, from Cong Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
  selftests: Introduce a new test case to tc testsuite
  selftests: Introduce a new script to generate tc batch file
  net_sched: fix call_rcu() race on act_sample module removal
  net_sched: add rtnl assertion to tcf_exts_destroy()
  net_sched: use tcf_queue_work() in tcindex filter
  net_sched: use tcf_queue_work() in rsvp filter
  net_sched: use tcf_queue_work() in route filter
  net_sched: use tcf_queue_work() in u32 filter
  net_sched: use tcf_queue_work() in matchall filter
  net_sched: use tcf_queue_work() in fw filter
  net_sched: use tcf_queue_work() in flower filter
  net_sched: use tcf_queue_work() in flow filter
  net_sched: use tcf_queue_work() in cgroup filter
  net_sched: use tcf_queue_work() in bpf filter
  net_sched: use tcf_queue_work() in basic filter
  net_sched: introduce a workqueue for RCU callbacks of tc filter
  sctp: fix some type cast warnings introduced since very beginning
  sctp: fix a type cast warnings that causes a_rwnd gets the wrong value
  sctp: fix some type cast warnings introduced by transport rhashtable
  sctp: fix some type cast warnings introduced by stream reconf
  ...
2017-10-29 08:11:49 -07:00
Mahesh Bandewar
fe89aa6b25 ipvlan: implement VEPA mode
This is very similar to the Macvlan VEPA mode, however, there is some
difference. IPvlan uses the mac-address of the lower device, so the VEPA
mode has implications of ICMP-redirects for packets destined for its
immediate neighbors sharing same master since the packets will have same
source and dest mac. The external switch/router will send redirect msg.

Having said that, this will be useful tool in terms of debugging
since IPvlan will not switch packets within its slaves and rely completely
on the external entity as intended in 802.1Qbg.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 18:39:57 +09:00
Mahesh Bandewar
a190d04db9 ipvlan: introduce 'private' attribute for all existing modes.
IPvlan has always operated in bridge mode. However there are scenarios
where each slave should be able to talk through the master device but
not necessarily across each other. Think of an environment where each
of a namespace is a private and independant customer. In this scenario
the machine which is hosting these namespaces neither want to tell who
their neighbor is nor the individual namespaces care to talk to neighbor
on short-circuited network path.

This patch implements the mode that is very similar to the 'private' mode
in macvlan where individual slaves can send and receive traffic through
the master device, just that they can not talk among slave devices.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 18:39:57 +09:00
Xin Long
978aa04741 sctp: fix some type cast warnings introduced since very beginning
These warnings were found by running 'make C=2 M=net/sctp/'.
They are there since very beginning.

Note after this patch, there still one warning left in
sctp_outq_flush():
  sctp_chunk_fail(chunk, SCTP_ERROR_INV_STRM)

Since it has been moved to sctp_stream_outq_migrate on net-next,
to avoid the extra job when merging net-next to net, I will post
the fix for it after the merging is done.

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 18:03:24 +09:00
Wei Wang
2ea2352ede ipv6: prevent user from adding cached routes
Cached routes should only be created by the system when receiving pmtu
discovery or ip redirect msg. Users should not be allowed to create
cached routes.

Furthermore, after the patch series to move cached routes into exception
table, user added cached routes will trigger the following warning in
fib6_add():

WARNING: CPU: 0 PID: 2985 at net/ipv6/ip6_fib.c:1137
fib6_add+0x20d9/0x2c10 net/ipv6/ip6_fib.c:1137
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 2985 Comm: syzkaller320388 Not tainted 4.14.0-rc3+ #74
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 panic+0x1e4/0x417 kernel/panic.c:181
 __warn+0x1c4/0x1d9 kernel/panic.c:542
 report_bug+0x211/0x2d0 lib/bug.c:183
 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
 do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
 do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
 invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:fib6_add+0x20d9/0x2c10 net/ipv6/ip6_fib.c:1137
RSP: 0018:ffff8801cf09f6a0 EFLAGS: 00010297
RAX: ffff8801ce45e340 RBX: 1ffff10039e13eec RCX: ffff8801d749c814
RDX: 0000000000000000 RSI: ffff8801d749c700 RDI: ffff8801d749c780
RBP: ffff8801cf09fa08 R08: 0000000000000000 R09: ffff8801cf09f360
R10: ffff8801cf09f2d8 R11: 1ffff10039c8befb R12: 0000000000000001
R13: dffffc0000000000 R14: ffff8801d749c700 R15: ffffffff860655c0
 __ip6_ins_rt+0x6c/0x90 net/ipv6/route.c:1011
 ip6_route_add+0x148/0x1a0 net/ipv6/route.c:2782
 ipv6_route_ioctl+0x4d5/0x690 net/ipv6/route.c:3291
 inet6_ioctl+0xef/0x1e0 net/ipv6/af_inet6.c:521
 sock_do_ioctl+0x65/0xb0 net/socket.c:961
 sock_ioctl+0x2c2/0x440 net/socket.c:1058
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xbe

So we fix this by failing the attemp to add cached routes from userspace
with returning EINVAL error.

Fixes: 2b760fcf5c ("ipv6: hook up exception table to store dst cache")
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 12:18:58 +09:00
John Fastabend
bfa640757e bpf: rename sk_actions to align with bpf infrastructure
Recent additions to support multiple programs in cgroups impose
a strict requirement, "all yes is yes, any no is no". To enforce
this the infrastructure requires the 'no' return code, SK_DROP in
this case, to be 0.

To apply these rules to SK_SKB program types the sk_actions return
codes need to be adjusted.

This fix adds SK_PASS and makes 'SK_DROP = 0'. Finally, remove
SK_ABORTED to remove any chance that the API may allow aborted
program flows to be passed up the stack. This would be incorrect
behavior and allow programs to break existing policies.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-29 11:18:48 +09:00
Oded Gabbay
7e86a365a8 drm/amdkfd: increase limit of signal events to 4096 per process
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-27 19:35:30 -04:00
Dave Airlie
7a88cbd8d6 Linux 4.14-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZ9kEFAAoJEHm+PkMAQRiGw6wH/0j197qyGd0hkVFMJO6LAgN3
 KQWS4nZ5BkVDocwv0RVnUJTtXqU1eozFgdVEtSoaFXpzlHGuptR2Tau9efDCJ7w3
 /utZxqvhGebZd2T+j+/o/LE8BRQxhADBNJq2D/o0WNt8ecxuG0GIkhkEYt/o3z1v
 /sxlwVwzXB7Dc/h1WcgGJG7cS6L9KzzAzGAS/iNvdFrPOygHBv8c0MxVZIiBIeeK
 1nZdyvbyM8uenSyG+prGt9ENrqXZxxfwUxIchi2V7A9m1WmD5zijNkf1JCWji/O+
 UsA1auxna7MwoxjxqZuGm4MlKOwZ+8xutk4JGgc+aP/ulndJbJYu+4op/3vaFBM=
 =Mhx+
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.14-rc7' into drm-next

Linux 4.14-rc7

Requested by Ben Skeggs for nouveau to avoid major conflicts,
and things were getting a bit conflicty already, esp around amdgpu
reverts.
2017-11-02 12:40:41 +10:00
Dave Airlie
87331c8379 Merge tag 'drm-msm-next-2017-11-01' of git://people.freedesktop.org/~robclark/linux into drm-next
+ preemption support for a5xx[1][2]

 + display fixes for 8x96 (snapdragon 820) including fixes for 4k scanout
   (hwpipe assignment re-work to handle multiple hwpipe assigned to plane
   for wide scanout)

 + async cursor plane updates and fixes

 + refactor adreno_bind/hwinit.. still defer fw loading until device open,
   but move clk/irq/etc to probe/bind time to fix issues when fw isn't
   present in filesys

 + clk/dt bindings cleanups w/ backward compat via msm_clk_get() (dt docs
   part ack'ed by Rob Herring)

 + fw loading re-work with helper to handle either /lib/firmware/qcom/$fw
   or /lib/firmware/$fw.. background, we've started landing fw for some of
   generations in linux-firmware, but there is a preference to put fw files
   under 'qcom' subdirectory, which is not what was done on android or for
   people who copied fw from android.  So now we first look in qcom subdir
   and then fallback to the original location.

 + bunch of GPU debugging enhancements, to dump full cmdline of processes
   that trigger faults, and to add a new debugfs to capture cmdstream of
   just submits that triggered faults.. both quite useful for piglit ;-)

* tag 'drm-msm-next-2017-11-01' of git://people.freedesktop.org/~robclark/linux: (38 commits)
  drm/msm: use %z format modifier for printing size_t
  drm/msm/mdp5: Don't use async plane update path if plane visibility changes
  drm/msm/mdp5: mdp5_crtc: Restore cursor state only if LM cursors are enabled
  drm/msm/mdp5: Update mdp5_pipe_assign to spit out both planes
  drm/msm/mdp5: Prepare mdp5_pipe_assign for some rework
  drm/msm: remove mdp5_cursor_plane_funcs
  drm/msm: update cursors asynchronously through atomic
  drm/msm/atomic: switch to drm_atomic_helper_check
  drm/msm/mdp5: restore cursor state when enabling crtc
  drm/msm/mdp5: don't use autosuspend
  drm/msm/mdp5: ignore planes that are not visible
  drm/msm: dump submits which triggered gpu hang
  drm/msm: preserve IOVAs in submit's bo table
  drm/msm/rd: allow adding addition msg to top of dump
  drm/msm: split rd debugfs file
  drm/msm: add special _get_vaddr_active() for cmdstream dumps
  drm/msm: show task cmdline in gpu recovery messages
  drm/msm: dump a rd GPUADDR header for all buffers in the command
  drm/msm: Removed unused struct_mutex_task
  drm/msm: Implement preemption for A5XX targets
  ...
2017-11-02 11:29:28 +10:00
Jordan Crouse
a6e29a0eea drm/msm: Add a parameter query for the number of ringbuffers
In order to manage ringbuffer priority to its fullest userspace
should know how many ringbuffers it has to work with. Add a
parameter to return the number of active rings.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:37 -04:00
Jordan Crouse
f97decac5f drm/msm: Support multiple ringbuffers
Add the infrastructure to support the idea of multiple ringbuffers.
Assign each ringbuffer an id and use that as an index for the various
ring specific operations.

The biggest delta is to support legacy fences. Each fence gets its own
sequence number but the legacy functions expect to use a unique integer.
To handle this we return a unique identifier for each submission but
map it to a specific ring/sequence under the covers. Newer users use
a dma_fence pointer anyway so they don't care about the actual sequence
ID or ring.

The actual mechanics for multiple ringbuffers are very target specific
so this code just allows for the possibility but still only defines
one ringbuffer for each target family.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:36 -04:00
Jordan Crouse
f7de15450e drm/msm: Add per-instance submit queues
Currently the behavior of a command stream is provided by the user
application during submission and the application is expected to internally
maintain the settings for each 'context' or 'rendering queue' and specify
the correct ones.

This works okay for simple cases but as applications become more
complex we will want to set context specific flags and do various
permission checks to allow certain contexts to enable additional
privileges.

Add kernel-side submit queues to be analogous to 'contexts' or
'rendering queues' on the application side. Each file descriptor
instance will maintain its own list of queues. Queues cannot be
shared between file descriptors.

For backwards compatibility context id '0' is defined as a default
context specifying no priority and no special flags. This is
intended to be the usual configuration for 99% of applications so
that a garden variety application can function correctly without
creating a queue. Only those applications requiring the specific
benefit of different queues need create one.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:35 -04:00
Vinicius Costa Gomes
585d763af0 net/sched: Introduce Credit Based Shaper (CBS) qdisc
This queueing discipline implements the shaper algorithm defined by
the 802.1Q-2014 Section 8.6.8.2 and detailed in Annex L.

It's primary usage is to apply some bandwidth reservation to user
defined traffic classes, which are mapped to different queues via the
mqprio qdisc.

Only a simple software implementation is added for now.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Tested-by: Henrik Austad <henrik@austad.us>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-27 09:48:02 -07:00
Dave Airlie
43106e25ab Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux into drm-next
Just a few fixes for 4.15.

* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux:
  drm/amd/amdgpu: Remove workaround for suspend/resume in uvd7
  drm/amdgpu: don't flush the TLB before initializing GART
  drm/amdgpu: minor cleanup for amdgpu_ttm_bind
  drm/amdgpu/psp: prevent page fault by checking write_frame address(v4)
  drm/amd/powerplay: retrieve the real-time coreClock values
  drm/amd/powerplay: fix performance drop on Vega10
  drm/amd/powerplay: add one smc message for Vega10
  drm/amd/powerplay: fix amd_powerplay_reset()
  amdgpu: add padding to the fence to handle ioctl.
  drm/amdgpu:fix wb_clear
  drm/amdgpu:fix vf_error_put
  drm/amdgpu/sriov:now must reinit psp
  drm/amdgpu: merge bios post checking functions
2017-10-26 14:49:44 +10:00
Maor Gottlieb
309fa3470f IB/mlx5: Add support for RSS on the inner packet
Some user space application would like to do RSS on the inner
packet fields instead on the outer.
When MLX5_RX_HASH_INNER is set with one or more of the other
hash fields, then the RSS will be done using the inner packet.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25 14:19:32 -04:00
Maor Gottlieb
f95ef6cbae IB/mlx5: Add tunneling offloads support
The device can support receive Stateless Offloads for the inner
packet's fields only when the packet is processed by TIR which is
enabled to support tunneling. Otherwise, the device treats the
packet as an ordinary non-tunneling packet and receive offloads
can be done only for the outer packet's field.
In order to enable receive Stateless Offloading support for incoming
tunneling traffic the TIR should be created with tunneled_offload_en.
Tunneling offloads is supported only be raw ethernet QP.

This patch includes:
* New QP creation flag for tunneling offloads.
* Reports device capabilities.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25 14:19:31 -04:00
Guy Levi
7a0c8f4244 IB/mlx5: Support padded 128B CQE feature
In some benchmarks and some CPU architectures, writing the CQE on a full
cache line size improves performance by saving memory access operations
(read-modify-write) relative to partial cache line change. This patch
lets the user to configure the device to pad the CQE up to 128B in case
its content is less than 128B. Currently the driver supports only padding
for a CQE size of 128B.

Signed-off-by: Guy Levi <guyle@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25 14:17:06 -04:00
Guy Levi
de57f2ad06 IB/mlx5: Support 128B CQE compression feature
In commit 1cbe6fc86c ("IB/mlx5: Add support for CQE compressing") the
concept of CQE compression was introduced and added a support for 64B
CQE size. This change update the code to support 128B CQE size as well.

Signed-off-by: Guy Levi <guyle@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25 14:17:06 -04:00
Noa Osherovich
ccc8708790 IB/mlx5: Allow creation of a multi-packet RQ
Allow creation of a multi-packet receive queue.

In order to create a multi-packet RQ, the following fields in
the mlx5_ib_rwq should be set:
- log_num_strides: Log of number of strides per WQE
- single_stride_log_num_of_bytes: Log of a single stride size
- two_byte_shift_en: When enabled, hardware pads 2 bytes of zeros
  before writing the message to memory (e.g. for the IP alignment).

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25 14:03:44 -04:00
Noa Osherovich
b4f34597a5 IB/mlx5: Expose multi-packet RQ capabilities
This patch reports the device's striding RQ capabilities to
the user-space:
- min/max_single_stride_log_num_of_bytes: Log of min/max number of
  bytes in a single stride.
- min/max_single_wqe_log_num_of_strides: Log of min/max number of
  strides in a single WQE.
- supported_qpts: A bit mask to know which QP types support multi-
  packet RQ, for now only Raw Packet QPs.

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-25 14:03:44 -04:00
Mark Brown
7555aa766b Merge remote-tracking branches 'spi/fix/armada', 'spi/fix/idr', 'spi/fix/qspi', 'spi/fix/stm32' and 'spi/fix/uapi' into spi-linus 2017-10-25 14:06:34 +02:00
Keith Packard
62884cd386 drm: Add four ioctls for managing drm mode object leases [v7]
drm_mode_create_lease

	Creates a lease for a list of drm mode objects, returning an
	fd for the new drm_master and a 64-bit identifier for the lessee

drm_mode_list_lesees

	List the identifiers of the lessees for a master file

drm_mode_get_lease

	List the leased objects for a master file

drm_mode_revoke_lease

	Erase the set of objects managed by a lease.

This should suffice to at least create and query leases.

Changes for v2 as suggested by Daniel Vetter <daniel.vetter@ffwll.ch>:

 * query ioctls only query the master associated with
   the provided file.

 * 'mask_lease' value has been removed

 * change ioctl has been removed.

Changes for v3 suggested in part by Dave Airlie <airlied@gmail.com>

 * Add revoke ioctl.

Changes for v4 suggested by Dave Airlie <airlied@gmail.com>

 * Expand on the comment about the magic use of &drm_lease_idr_object
 * Pad lease ioctl structures to align on 64-bit boundaries

Changes for v5 suggested by Dave Airlie <airlied@gmail.com>

 * Check for non-negative object_id in create_lease to avoid debug
   output from the kernel.

Changes for v6 provided by Dave Airlie <airlied@gmail.com>

 * For non-universal planes add primary/cursor planes to lease

   If we aren't exposing universal planes to this userspace client,
   and it requests a lease on a crtc, we should implicitly export the
   primary and cursor planes for the crtc.

   If the lessee doesn't request universal planes, it will just see
   the crtc, but if it does request them it will then see the plane
   objects as well.

   This also moves the object look ups earlier as a side effect, so
   we'd exit the ioctl quicker for non-existant objects.

 * Restrict leases to crtc/connector/planes.

   This only allows leasing for objects we wish to allow.

Changes for v7 provided by Dave Airlie <airlied@gmail.com>

 * Check pad args are 0
 * Check create flags and object count are valid.
 * Check return from fd allocation
 * Refactor lease idr setup and add some simple validation
 * Use idr_mutex uniformly (Keith)

Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-25 16:31:30 +10:00
Shmulik Ladkani
908d140a87 ip6_tunnel: Allow rcv/xmit even if remote address is a local address
Currently, ip6_tnl_xmit_ctl drops tunneled packets if the remote
address (outer v6 destination) is one of host's locally configured
addresses.
Same applies to ip6_tnl_rcv_ctl: it drops packets if the remote address
(outer v6 source) is a local address.

This prevents using ipxip6 (and ip6_gre) tunnels whose local/remote
endpoints are on same host; OTOH v4 tunnels (ipip or gre) allow such
configurations.

An example where this proves useful is a system where entities are
identified by their unique v6 addresses, and use tunnels to encapsulate
traffic between them. The limitation prevents placing several entities
on same host.

Introduce IP6_TNL_F_ALLOW_LOCAL_REMOTE which allows to bypass this
restriction.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-25 10:33:27 +09:00
Christian König
276b738deb PCI: Add resizable BAR infrastructure
Add resizable BAR infrastructure, including defines and helper functions to
read the possible sizes of a BAR and update its size.  See PCIe r3.1, sec
7.22.

Link: https://pcisig.com/sites/default/files/specification_documents/ECN_Resizable-BAR_24Apr2008.pdf
Signed-off-by: Christian König <christian.koenig@amd.com>
[bhelgaas: rename to functions with "rebar" (to match #defines), drop shift
#defines, drop "_MASK" suffixes, fix typos, fix kerneldoc]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
2017-10-24 14:40:20 -05:00
Will Deacon
d15155824c linux/compiler.h: Split into compiler.h and compiler_types.h
linux/compiler.h is included indirectly by linux/types.h via
uapi/linux/types.h -> uapi/linux/posix_types.h -> linux/stddef.h
-> uapi/linux/stddef.h and is needed to provide a proper definition of
offsetof.

Unfortunately, compiler.h requires a definition of
smp_read_barrier_depends() for defining lockless_dereference() and soon
for defining READ_ONCE(), which means that all
users of READ_ONCE() will need to include asm/barrier.h to avoid splats
such as:

   In file included from include/uapi/linux/stddef.h:1:0,
                    from include/linux/stddef.h:4,
                    from arch/h8300/kernel/asm-offsets.c:11:
   include/linux/list.h: In function 'list_empty':
>> include/linux/compiler.h:343:2: error: implicit declaration of function 'smp_read_barrier_depends' [-Werror=implicit-function-declaration]
     smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
     ^

A better alternative is to include asm/barrier.h in linux/compiler.h,
but this requires a type definition for "bool" on some architectures
(e.g. x86), which is defined later by linux/types.h. Type "bool" is also
used directly in linux/compiler.h, so the whole thing is pretty fragile.

This patch splits compiler.h in two: compiler_types.h contains type
annotations, definitions and the compiler-specific parts, whereas
compiler.h #includes compiler-types.h and additionally defines macros
such as {READ,WRITE.ACCESS}_ONCE().

uapi/linux/stddef.h and linux/linkage.h are then moved over to include
linux/compiler_types.h, which fixes the build for h8 and blackfin.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1508840570-22169-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-24 13:17:32 +02:00
Christoph Paasch
71c02379c7 tcp: Configure TFO without cookie per socket and/or per route
We already allow to enable TFO without a cookie by using the
fastopen-sysctl and setting it to TFO_SERVER_COOKIE_NOT_REQD (or
TFO_CLIENT_NO_COOKIE).
This is safe to do in certain environments where we know that there
isn't a malicous host (aka., data-centers) or when the
application-protocol already provides an authentication mechanism in the
first flight of data.

A server however might be providing multiple services or talking to both
sides (public Internet and data-center). So, this server would want to
enable cookie-less TFO for certain services and/or for connections that
go to the data-center.

This patch exposes a socket-option and a per-route attribute to enable such
fine-grained configurations.

Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 18:48:08 +09:00
Dave Airlie
fef1aa48f4 Merge tag 'drm-misc-next-2017-10-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
Final drm-misc feature pull for 4.15:

UAPI Changes:
- new madvise ioctl for vc4 (Boris)

Core Changes:
- plane commit tracking fixes (Maarten)
- vgaarb improvements for fancy new platforms (aka ppc64 and arm64) by
  Bjorn Helgaas

Driver Changes:
- pile of new panel drivers: Toshiba LT089AC19000, Innolux AT043TN24
- more sun4i work to support A10/A20 Tcon and hdmi outputs
- vc4: fix sleep in irq handler by making it threaded (Eric)
- udl probe/edid read fixes (Robert Tarasov)

And a bunch of misc small cleanups/refactors and doc fixes all over.

* tag 'drm-misc-next-2017-10-20' of git://anongit.freedesktop.org/drm/drm-misc: (32 commits)
  drm/vc4: Fix sleeps during the IRQ handler for DSI transactions.
  drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl
  drm/panel: simple: add Toshiba LT089AC19000
  dma-fence: remove duplicate word in comment
  drm/panel: simple: add delays for Innolux AT043TN24
  drm/panel: simple: add bus flags for Innolux AT043TN24
  drm/panel: simple: fix vertical timings for Innolux AT043TN24
  drm/atomic-helper: check that drivers call drm_crtc_vblank_off
  drm: some KMS todo ideas
  vgaarb: Factor out EFI and fallback default device selection
  vgaarb: Select a default VGA device even if there's no legacy VGA
  drm/bridge: adv7511: Fix a use after free
  drm/sun4i: Add support for A20 display pipeline components
  drm/sun4i: Add support for A10 display pipeline components
  drm/sun4i: hdmi: Support HDMI controller on A10
  drm/sun4i: tcon: Add support for A10 TCON
  drm/sun4i: backend: Support output muxing
  drm/sun4i: tcon: Move out the tcon0 common setup
  drm/sun4i: tcon: Don't rely on encoders to set the TCON mode
  drm/sun4i: tcon: Don't rely on encoders to enable the TCON
  ...
2017-10-24 16:51:05 +10:00
David S. Miller
5908064a0b This documentation/cleanup patchset includes the following patches:
- Fix parameter kerneldoc which caused kerneldoc warnings, by Sven Eckelmann
 
  - Remove spurious warnings in B.A.T.M.A.N. V neighbor comparison,
    by Sven Eckelmann
 
  - Use inline kernel-doc style for UAPI constants, by Sven Eckelmann
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlnuC9IWHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeob0tEACwqVIrUSX9ru7N+uAv04jm9GXy
 zR43z5cEzHEiH/yZZKxL7z7sOXTO+fHj06JKWDKUvEYYhUhDBL7xqvgBBcHZO0AK
 TQLHirY+gMRJXphNsfrUWJ5IbB+C0wz7BpBuA+TLG8I0ibc9tjIt/pGdwJO34Z0C
 Ym0PSB8A1ej+l3CyGMCICramOexuBPRrbtgKrxi4uO56c+RYnfZkPdB+EFMvPGA+
 mcofumu/rKr/NFk/ZS77yooqBe2q93IVFzvRc7h3anm84XD9n4HwD5Rm4ruvXEBt
 FH4w0ZPmMbkBnwqkiWgefV5hDqKt1tLCEW9lxvokceoKy0ntDvzSqCU/yad2MGVw
 Kf27E6/iYhshVVgXFY5Q4E3YC5Z+Y2+rKL9xO7Dr4xYKi3GCIsCzz+aCJn3rRVgI
 q4qTVS/+bLR2YxYkHyPs5ux82G7VfhnaPI6jQkUM6ZeTKE1hqt30J7fDnHFRb7b4
 WVz6MGr46HGs9mBwhZ64mCercdS4+dIKYFjKS0avm39LWtOStgqoZs9rasLKE4/T
 wYfmrkX2tWpOrmZjO4ySuSBz9F3o4Yy0zwBeBNcnb11Mm8/08D6CwXLKg2q9wiY0
 E3hFjwg2AGivf0Lq9KMSS30tpC0iDz+xpJdGwxZumALX602pWv7f20cvMT31tR57
 Rmqf4ZEK82N2YpiGWA==
 =qGzP
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-for-davem-20171023' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This documentation/cleanup patchset includes the following patches:

 - Fix parameter kerneldoc which caused kerneldoc warnings, by Sven Eckelmann

 - Remove spurious warnings in B.A.T.M.A.N. V neighbor comparison,
   by Sven Eckelmann

 - Use inline kernel-doc style for UAPI constants, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 01:15:03 +01:00
Sven Eckelmann
40b16b9be5 batman-adv: use inline kernel-doc for uapi constants
The enums of constants for netlink tends to become rather large over time.
Documenting them is easier when the kernel-doc is actually next to constant
and not in a different block above the enum.

Also inline kernel-doc allows multi-paragraph description. This could be
required to better document the netlink command types and the expected
return values.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2017-10-23 14:22:25 +02:00
Keith Packard
3064abfa93 drm: Add CRTC_GET_SEQUENCE and CRTC_QUEUE_SEQUENCE ioctls [v3]
These provide crtc-id based functions instead of pipe-number, while
also offering higher resolution time (ns) and wider frame count (64)
as required by the Vulkan API.

v2:

 * Check for DRIVER_MODESET in new crtc-based vblank ioctls

	Failing to check this will oops the driver.

 * Ensure vblank interupt is running in crtc_get_sequence ioctl

	The sequence and timing values are not correct while the
	interrupt is off, so make sure it's running before asking for
	them.

 * Short-circuit get_sequence if the counter is enabled and accurate

	Steal the idea from the code in wait_vblank to avoid the
	expense of drm_vblank_get/put

 * Return active state of crtc in crtc_get_sequence ioctl

	Might be useful for applications that aren't in charge of
	modesetting?

 * Use drm_crtc_vblank_get/put in new crtc-based vblank sequence ioctls

	Daniel Vetter prefers these over the old drm_vblank_put/get
	APIs.

 * Return s64 ns instead of u64 in new sequence event

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

v3:

 * Removed FIRST_PIXEL_OUT_FLAG
 * Document that the timestamp in the query and event are
   that of the first pixel leaving the display engine for
   the display (using the same wording as the Vulkan spec).

Suggested-by: Michel Dänzer <michel@daenzer.net>
Acked-by: Dave Airlie <airlied@redhat.com>

[airlied: left->leaves (Michel)]

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-23 11:15:03 +10:00
David S. Miller
f8ddadc4db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
There were quite a few overlapping sets of changes here.

Daniel's bug fix for off-by-ones in the new BPF branch instructions,
along with the added allowances for "data_end > ptr + x" forms
collided with the metadata additions.

Along with those three changes came veritifer test cases, which in
their final form I tried to group together properly.  If I had just
trimmed GIT's conflict tags as-is, this would have split up the
meta tests unnecessarily.

In the socketmap code, a set of preemption disabling changes
overlapped with the rename of bpf_compute_data_end() to
bpf_compute_data_pointers().

Changes were made to the mv88e6060.c driver set addr method
which got removed in net-next.

The hyperv transport socket layer had a locking change in 'net'
which overlapped with a change of socket state macro usage
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 13:39:14 +01:00
Lawrence Brakmo
cd86d1fd21 bpf: Adding helper function bpf_getsockops
Adding support for helper function bpf_getsockops to socket_ops BPF
programs. This patch only supports TCP_CONGESTION.

Signed-off-by: Vlad Vysotsky <vlad@cs.ucla.edu>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 03:12:05 +01:00
Lawrence Brakmo
e6546ef6d8 bpf: add support for BPF_SOCK_OPS_BASE_RTT
A congestion control algorithm can make a call to the BPF socket_ops
program to request the base RTT. The base RTT can be congestion control
dependent and is meant to represent a congestion threshold such that
RTTs above it indicate congestion. This is especially useful for flows
within a DC where the base RTT is easy to obtain.

Being provided a base RTT solves a basic problem in RTT based congestion
avoidance algorithms (such as Vegas, NV and BBR). Although it is easy
to get the base RTT when the network is not congested, it is very
diffcult to do when it is very congested. Newer connections get an
inflated value of the base RTT leading to unfariness (newer flows with a
larger base RTT get more bandwidth). As a result, RTT based congestion
avoidance algorithms tend to update their base RTTs to improve fairness.
In very congested networks this can lead to base RTT inflation, reducing
the ability of these RTT based congestion control algorithms to prevent
congestion.

Note that in my experiments with TCP-NV, the base RTT provided can be
much larger than the actual hardware RTT. For example, experimenting
with hosts within a rack where the hardware RTT is 16-20us, I've used
base RTTs up to 150us. The effect of using a larger base RTT is that the
congestion avoidance algorithm will allow more queueing. When there are
only a few flows the main effect is larger measured RTTs and RPC
latencies due to the increased queueing. When there are a lot of flows,
a larger base RTT can lead to more congestion and more packet drops.
For this case, where the hardware RTT is 20us, a base RTT of 80us
produces good results.

This patch only introduces BPF_SOCK_OPS_BASE_RTT, a later patch in this
set adds support for using it in TCP-NV. Further study and testing is
needed before support can be added to other delay based congestion
avoidance algorithms.

Signed-off-by: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 03:12:05 +01:00
Dave Airlie
56e0349f38 amdgpu: add padding to the fence to handle ioctl.
I don't think this ioctl is in a Linus release yet.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-20 13:29:07 -04:00
Chenbo Feng
6e71b04a82 bpf: Add file mode configuration into bpf maps
Introduce the map read/write flags to the eBPF syscalls that returns the
map fd. The flags is used to set up the file mode when construct a new
file descriptor for bpf maps. To not break the backward capability, the
f_flags is set to O_RDWR if the flag passed by syscall is 0. Otherwise
it should be O_RDONLY or O_WRONLY. When the userspace want to modify or
read the map content, it will check the file mode to see if it is
allowed to make the change.

Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:32:59 +01:00
Yuchung Cheng
1fba70e5b6 tcp: socket option to set TCP fast open key
New socket option TCP_FASTOPEN_KEY to allow different keys per
listener.  The listener by default uses the global key until the
socket option is set.  The key is a 16 bytes long binary data. This
option has no effect on regular non-listener TCP sockets.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:21:36 +01:00
Mathieu Desnoyers
a961e40917 membarrier: Provide register expedited private command
This introduces a "register private expedited" membarrier command which
allows eventual removal of important memory barrier constraints on the
scheduler fast-paths. It changes how the "private expedited" membarrier
command (new to 4.14) is used from user-space.

This new command allows processes to register their intent to use the
private expedited command.  This affects how the expedited private
command introduced in 4.14-rc is meant to be used, and should be merged
before 4.14 final.

Processes are now required to register before using
MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM.

This fixes a problem that arose when designing requested extensions to
sys_membarrier() to allow JITs to efficiently flush old code from
instruction caches.  Several potential algorithms are much less painful
if the user register intent to use this functionality early on, for
example, before the process spawns the second thread.  Registering at
this time removes the need to interrupt each and every thread in that
process at the first expedited sys_membarrier() system call.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-19 22:13:40 -04:00
Dave Airlie
282dc8322a Merge tag 'drm-intel-next-2017-10-12' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Last batch of drm/i915 features for v4.15:

- transparent huge pages support (Matthew)
- uapi: I915_PARAM_HAS_SCHEDULER into a capability bitmask (Chris)
- execlists: preemption (Chris)
- scheduler: user defined priorities (Chris)
- execlists optimization (Michał)
- plenty of display fixes (Imre)
- has_ipc fix (Rodrigo)
- platform features definition refactoring (Rodrigo)
- legacy cursor update fix (Maarten)
- fix vblank waits for cursor updates (Maarten)
- reprogram dmc firmware on resume, dmc state fix (Imre)
- remove use_mmio_flip module parameter (Maarten)
- wa fixes (Oscar)
- huc/guc firmware refacoring (Sagar, Michal)
- push encoder specific code to encoder hooks (Jani)
- DP MST fixes (Dhinakaran)
- eDP power sequencing fixes (Manasi)
- selftest updates (Chris, Matthew)
- mmu notifier cpu hotplug deadlock fix (Daniel)
- more VBT parser refactoring (Jani)
- max pipe refactoring (Mika Kahola)
- rc6/rps refactoring and separation (Sagar)
- userptr lockdep fix (Chris)
- tracepoint fixes and defunct tracepoint removal (Chris)
- use rcu instead of abusing stop_machine (Daniel)
- plenty of other fixes all around (Everyone)

* tag 'drm-intel-next-2017-10-12' of git://anongit.freedesktop.org/drm/drm-intel: (145 commits)
  drm/i915: Update DRIVER_DATE to 20171012
  drm/i915: Simplify intel_sanitize_enable_ppgtt
  drm/i915/userptr: Drop struct_mutex before cleanup
  drm/i915/dp: limit sink rates based on rate
  drm/i915/dp: centralize max source rate conditions more
  drm/i915: Allow PCH platforms fall back to BIOS LVDS mode
  drm/i915: Reuse normal state readout for LVDS/DVO fixed mode
  drm/i915: Use rcu instead of stop_machine in set_wedged
  drm/i915: Introduce separate status variable for RC6 and LLC ring frequency setup
  drm/i915: Create generic functions to control RC6, RPS
  drm/i915: Create generic function to setup LLC ring frequency table
  drm/i915: Rename intel_enable_rc6 to intel_rc6_enabled
  drm/i915: Name structure in dev_priv that contains RPS/RC6 state as "gt_pm"
  drm/i915: Move rps.hw_lock to dev_priv and s/hw_lock/pcu_lock
  drm/i915: Name i915_runtime_pm structure in dev_priv as "runtime_pm"
  drm/i915: Separate RPS and RC6 handling for CHV
  drm/i915: Separate RPS and RC6 handling for VLV
  drm/i915: Separate RPS and RC6 handling for BDW
  drm/i915: Remove superfluous IS_BDW checks and non-BDW changes from gen8_enable_rps
  drm/i915: Separate RPS and RC6 handling for gen6+
  ...
2017-10-20 10:56:10 +10:00
Dave Airlie
6585d4274b Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux into drm-next
Last set of features for 4.15.  Highlights:
- Add a bo flag to allow buffers to opt out of implicit sync
- Add ctx priority setting interface
- Lots more powerplay cleanups
- Start to plumb through vram lost infrastructure for gpu reset
- ttm support for huge pages
- misc cleanups and bug fixes

* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (73 commits)
  drm/amd/powerplay: Place the constant on the right side of the test
  drm/amd/powerplay: Remove useless variable
  drm/amd/powerplay: Don't cast kzalloc() return value
  drm/amdgpu: allow GTT overcommit during bind
  drm/amdgpu: linear validate first then bind to GART
  drm/amd/pp: Fix overflow when setup decf/pix/disp dpm table.
  drm/amd/pp: thermal control not enabled on vega10.
  drm/amdgpu: busywait KIQ register accessing (v4)
  drm/amdgpu: report more amdgpu_fence_info
  drm/amdgpu:don't check soft_reset for sriov
  drm/amdgpu:fix duplicated setting job's vram_lost
  drm/amdgpu:reduce wb to 512 slot
  drm/amdgpu: fix regresstion on SR-IOV gpu reset failed
  drm/amd/powerplay: Tidy up cz_dpm_powerup_vce()
  drm/amd/powerplay: Tidy up cz_dpm_powerdown_vce()
  drm/amd/powerplay: Tidy up cz_dpm_update_vce_dpm()
  drm/amd/powerplay: Tidy up cz_dpm_update_uvd_dpm()
  drm/amd/powerplay: Tidy up cz_dpm_powerup_uvd()
  drm/amd/powerplay: Tidy up cz_dpm_powerdown_uvd()
  drm/amd/powerplay: Tidy up cz_start_dpm()
  ...
2017-10-20 10:47:19 +10:00
Dongdong Liu
7c950b9e53 PCI/portdrv: Add #defines for AER and DPC Interrupt Message Number masks
In the AER case, the mask isn't strictly necessary because there are no
higher-order bits above the Interrupt Message Number, but using a #define
will make it possible to grep for it.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-10-19 18:02:01 -05:00
Christian König
1f7251b73e drm/amdgpu: add VRAM lost query
Allows userspace to figure out if VRAM was lost.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:27:05 -04:00
Andres Rodriguez
8bc4c256f4 drm/amdgpu: rename context priority levels
Don't leak implementation details about how each priority behaves to
usermode. This allows greater flexibility in the future.

Squash into c2636dc53a

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-19 15:26:48 -04:00
Boris Brezillon
b9f19259b8 drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl
This ioctl will allow us to purge inactive userspace buffers when the
system is running out of contiguous memory.

For now, the purge logic is rather dumb in that it does not try to
release only the amount of BO needed to meet the last CMA alloc request
but instead purges all objects placed in the purgeable pool as soon as
we experience a CMA allocation failure.

Note that the in-kernel BO cache is always purged before the purgeable
cache because those objects are known to be unused while objects marked
as purgeable by a userspace application/library might have to be
restored when they are marked back as unpurgeable, which can be
expensive.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20171019125748.3152-1-boris.brezillon@free-electrons.com
2017-10-19 10:34:49 -07:00
Will Deacon
085b30625e perf/core: Add PERF_AUX_FLAG_COLLISION to report colliding samples
The ARM SPE architecture permits an implementation to ignore a sample
if the sample is due to be taken whilst another sample is already being
produced. In this case, it is desirable to report the collision to
userspace, as they may want to lower the sample period.

This patch adds a PERF_AUX_FLAG_COLLISION flag, so that such events can
be relayed to userspace.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:31 +01:00
Jesper Dangaard Brouer
6710e11269 bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP
The 'cpumap' is primarily used as a backend map for XDP BPF helper
call bpf_redirect_map() and XDP_REDIRECT action, like 'devmap'.

This patch implement the main part of the map.  It is not connected to
the XDP redirect system yet, and no SKB allocation are done yet.

The main concern in this patch is to ensure the datapath can run
without any locking.  This adds complexity to the setup and tear-down
procedure, which assumptions are extra carefully documented in the
code comments.

V2:
 - make sure array isn't larger than NR_CPUS
 - make sure CPUs added is a valid possible CPU

V3: fix nitpicks from Jakub Kicinski <kubakici@wp.pl>

V5:
 - Restrict map allocation to root / CAP_SYS_ADMIN
 - WARN_ON_ONCE if queue is not empty on tear-down
 - Return -EPERM on memlock limit instead of -ENOMEM
 - Error code in __cpu_map_entry_alloc() also handle ptr_ring_cleanup()
 - Moved cpu_map_enqueue() to next patch

V6: all notice by Daniel Borkmann
 - Fix err return code in cpu_map_alloc() introduced in V5
 - Move cpu_possible() check after max_entries boundary check
 - Forbid usage initially in check_map_func_compatibility()

V7:
 - Fix alloc error path spotted by Daniel Borkmann
 - Did stress test adding+removing CPUs from the map concurrently
 - Fixed refcnt issue on cpu_map_entry, kthread started too soon
 - Make sure packets are flushed during tear-down, involved use of
   rcu_barrier() and kthread_run only exit after queue is empty
 - Fix alloc error path in __cpu_map_entry_alloc() for ptr_ring

V8:
 - Nitpicking comments and gramma by Edward Cree
 - Fix missing semi-colon introduced in V7 due to rebasing
 - Move struct bpf_cpu_map_entry members cpu+map_id to tracepoint patch

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:12:18 +01:00
Mauro Carvalho Chehab
61065fc3e3 Merge commit '3728e6a255b5' into patchwork
* commit '3728e6a255b5': (904 commits)
  Linux 4.14-rc5
  x86/microcode: Do the family check first
  locking/lockdep: Disable cross-release features for now
  x86/mm: Flush more aggressively in lazy TLB mode
  mm, swap: use page-cluster as max window of VMA based swap readahead
  mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE outside of lock
  kmemleak: clear stale pointers from task stacks
  fs/binfmt_misc.c: node could be NULL when evicting inode
  fs/mpage.c: fix mpage_writepage() for pages with buffers
  linux/kernel.h: add/correct kernel-doc notation
  tty: fall back to N_NULL if switching to N_TTY fails during hangup
  Revert "vmalloc: back off when the current task is killed"
  mm/cma.c: take __GFP_NOWARN into account in cma_alloc()
  scripts/kallsyms.c: ignore symbol type 'n'
  userfaultfd: selftest: exercise -EEXIST only in background transfer
  mm: only display online cpus of the numa node
  mm: remove unnecessary WARN_ONCE in page_vma_mapped_walk().
  mm/mempolicy: fix NUMA_INTERLEAVE_HIT counter
  include/linux/of.h: provide of_n_{addr,size}_cells wrappers for !CONFIG_OF
  mm/madvise.c: add description for MADV_WIPEONFORK and MADV_KEEPONFORK
  ...
2017-10-17 17:22:20 -07:00
Alexander Duyck
32302902ff mqprio: Reserve last 32 classid values for HW traffic classes and misc IDs
This patch makes a slight tweak to mqprio in order to bring the
classid values used back in line with what is used for mq. The general idea
is to reserve values :ffe0 - :ffef to identify hardware traffic classes
normally reported via dev->num_tc. By doing this we can maintain a
consistent behavior with mq for classid where :1 - :ffdf will represent a
physical qdisc mapped onto a Tx queue represented by classid - 1, and the
traffic classes will be mapped onto a known subset of classid values
reserved for our virtual qdiscs.

Note I reserved the range from :fff0 - :ffff since this way we might be
able to reuse these classid values with clsact and ingress which would mean
that for mq, mqprio, ingress, and clsact we should be able to maintain a
similar classid layout.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-16 20:53:23 +01:00
Nicolas Pitre
fd4f6f2a78 cramfs: implement uncompressed and arbitrary data block positioning
Two new capabilities are introduced here:

- The ability to store some blocks uncompressed.

- The ability to locate blocks anywhere.

Those capabilities can be used independently, but the combination
opens the possibility for execute-in-place (XIP) of program text segments
that must remain uncompressed, and in the MMU case, must have a specific
alignment.  It is even possible to still have the writable data segments
from the same file compressed as they have to be copied into RAM anyway.

This is achieved by giving special meanings to some unused block pointer
bits while remaining compatible with legacy cramfs images.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-10-15 00:47:22 -04:00
Dave Airlie
787e1b74b7 Merge branch 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux into drm-next
Most notable addition this time is the support for the GPU performance
counters by Christian. This has been in the making for some time and it
has matured a lot. Since this is adding UAPI, the corresponding WIP
userspace can be found at [1] mesa/libdrm repos. I expect that
Christian sends out the final userspace patches for this once you have
pulled the kernel bits.

Philipp optimized the probe path, so etnaviv gets out of the way for
systems that want to boot real quick.

I've done mostly cleanups, disentangling etnaviv from the IOMMU API,
with some MMUv1 optimizations on the way.

* 'etnaviv/next' of https://git.pengutronix.de/git/lst/linux: (36 commits)
  drm/etnaviv: remove unnecessary clock stabilization delay
  drm/etnaviv: reduce reset delay
  drm/etnaviv: remove unused function etnaviv_gem_new
  drm/etnaviv: remove stale comment
  drm/etnaviv: submit supports performance monitor requests
  drm/etnaviv: enable debug registers on demand
  drm/etnaviv: need to disable clock gating when doing profiling
  drm/etnaviv: add MC perf domain
  drm/etnaviv: add TX perf domain
  drm/etnaviv: add RA perf domain
  drm/etnaviv: add SE perf domain
  drm/etnaviv: add PA perf domain
  drm/etnaviv: add SH perf domain
  drm/etnaviv: add PE perf domain
  drm/etnaviv: add HI perf domain
  drm/etnaviv: use 'sync points' for performance monitor requests
  drm/etnaviv: clear alloced event
  drm/etnaviv: add 'sync point' support
  drm/etnaviv: add performance monitor request processing
  drm/etnaviv: copy pmrs from userspace
  ...
2017-10-14 09:39:56 +10:00
Amritha Nambiar
4e8b86c062 mqprio: Introduce new hardware offload mode and shaper in mqprio
The offload types currently supported in mqprio are 0 (no offload) and
1 (offload only TCs) by setting these values for the 'hw' option. If
offloads are supported by setting the 'hw' option to 1, the default
offload mode is 'dcb' where only the TC values are offloaded to the
device. This patch introduces a new hardware offload mode called
'channel' with 'hw' set to 1 in mqprio which makes full use of the
mqprio options, the TCs, the queue configurations and the QoS parameters
for the TCs. This is achieved through a new netlink attribute for the
'mode' option which takes values such as 'dcb' (default) and 'channel'.
The 'channel' mode also supports QoS attributes for traffic class such as
minimum and maximum values for bandwidth rate limits.

This patch enables configuring additional HW shaper attributes associated
with a traffic class. Currently the shaper for bandwidth rate limiting is
supported which takes options such as minimum and maximum bandwidth rates
and are offloaded to the hardware in the 'channel' mode. The min and max
limits for bandwidth rates are provided by the user along with the TCs
and the queue configurations when creating the mqprio qdisc. The interface
can be extended to support new HW shapers in future through the 'shaper'
attribute.

Introduces a new data structure 'tc_mqprio_qopt_offload' for offloading
mqprio queue options and use this to be shared between the kernel and
device driver. This contains a copy of the existing data structure
for mqprio queue options. This new data structure can be extended when
adding new attributes for traffic class such as mode, shaper, shaper
parameters (bandwidth rate limits). The existing data structure for mqprio
queue options will be shared between the kernel and userspace.

Example:
  queues 4@0 4@4 hw 1 mode channel shaper bw_rlimit\
  min_rate 1Gbit 2Gbit max_rate 4Gbit 5Gbit

To dump the bandwidth rates:

qdisc mqprio 804a: root  tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
             queues:(0:3) (4:7)
             mode:channel
             shaper:bw_rlimit   min_rate:1Gbit 2Gbit   max_rate:4Gbit 5Gbit

Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-13 13:23:35 -07:00
Jon Maloy
ae236fb208 tipc: receive group membership events via member socket
Like with any other service, group members' availability can be
subscribed for by connecting to be topology server. However, because
the events arrive via a different socket than the member socket, there
is a real risk that membership events my arrive out of synch with the
actual JOIN/LEAVE action. I.e., it is possible to receive the first
messages from a new member before the corresponding JOIN event arrives,
just as it is possible to receive the last messages from a leaving
member after the LEAVE event has already been received.

Since each member socket is internally also subscribing for membership
events, we now fix this problem by passing those events on to the user
via the member socket. We leverage the already present member synch-
ronization protocol to guarantee correct message/event order. An event
is delivered to the user as an empty message where the two source
addresses identify the new/lost member. Furthermore, we set the MSG_OOB
bit in the message flags to mark it as an event. If the event is an
indication about a member loss we also set the MSG_EOR bit, so it can
be distinguished from a member addition event.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-13 08:46:00 -07:00
Jon Maloy
75da2163db tipc: introduce communication groups
As a preparation for introducing flow control for multicast and datagram
messaging we need a more strictly defined framework than we have now. A
socket must be able keep track of exactly how many and which other
sockets it is allowed to communicate with at any moment, and keep the
necessary state for those.

We therefore introduce a new concept we have named Communication Group.
Sockets can join a group via a new setsockopt() call TIPC_GROUP_JOIN.
The call takes four parameters: 'type' serves as group identifier,
'instance' serves as an logical member identifier, and 'scope' indicates
the visibility of the group (node/cluster/zone). Finally, 'flags' makes
it possible to set certain properties for the member. For now, there is
only one flag, indicating if the creator of the socket wants to receive
a copy of broadcast or multicast messages it is sending via the socket,
and if wants to be eligible as destination for its own anycasts.

A group is closed, i.e., sockets which have not joined a group will
not be able to send messages to or receive messages from members of
the group, and vice versa.

Any member of a group can send multicast ('group broadcast') messages
to all group members, optionally including itself, using the primitive
send(). The messages are received via the recvmsg() primitive. A socket
can only be member of one group at a time.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-13 08:46:00 -07:00
Florian Fainelli
ad2d116c52 sched: tc_mirred: Remove whitespaces
This file contains unnecessary whitespaces as newlines, remove them,
found by looking at what struct tc_mirred looks like.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-12 12:24:03 -07:00
Dave Airlie
c5c7bc71a0 2nd batch of v4.15 features:
- lib/scatterlist updates, use for userptr allocations (Tvrtko)
 - Fixed point wrapper cleanup (Mahesh)
 - Gen9+ transition watermarks, watermark optimization and fixes (Mahesh)
 - Display IPC (Isochronous Priority Control) support (Mahesh)
 - GEM workaround fixes (Oscar)
 - GVT: PCI config sanitize series (Changbin)
 - GVT: Workload submission error handling series (Fred)
 - PSR fixes and refactoring (Rodrigo)
 - HWSP based optimizations (Chris)
 - Private PAT management (Zhi)
 - IRQ handling fixes and refactoring (Ville)
 - Module parameter refactoring and variable name clash fix (Michal)
 - Execlist refactoring, incomplete request unwinding on reset (Chris)
 - GuC scheduling improvements (Michal)
 - OA updates (Lionel)
 - Coffeelake out of alpha support (Rodrigo)
 - seqno fixes (Chris)
 - Execlist refactoring (Mika)
 - DP and DP MST cleanups (Dhinakaran)
 - Cannonlake slice/sublice config (Ben)
 - Numerous fixes all around (Everyone)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFWWmW3ewYy4RJOWc05gHnSar7m8FAlnOMGwACgkQ05gHnSar
 7m8UABAAgtpuUdm7IFsQwogtEn7JDbS3Q2LfwpyW2kAkTEdDqIBqAFl27PSTMhCj
 Yk5Q1/uG/cYqx9zdO7mLoCM77YTATlCZT4zTOulRZxhFlKMmSyvFDFE3D99LmLUT
 +pbywT7CCh3J+/Xrjf4S8WKFFsz0DRc6ycXwUim1gNJHURWzlG0A4gJhz8MfnuZF
 JmUsO68o8AL5FY6WmWNOF4vEtnwJqvRS0nUzd1UNqmz8sEGu6jSnrzUy9WpdOzoH
 k0PeNBqTaq8QVvlll1pyF4Xn9czKH6XlNXMA/bnVgCODs/A4E1r6rhnDadVZTI57
 KWePlVXeSuXFvnseINPajQnV8BvXwmcBi0eBBllFMOoo9izXhRz7CAGW+vsZvdsr
 Xxcmo8Qrbv+qBsbkX0TSKmZBFT7eBqUn7Pf+WPZhRG2/DAyKdbhcWJi5BfvPQze4
 73bL60XXQlXSacsgkzOIaGlhH1l7DoFdKnG1gWvdLr25xgbcq0uHHGRtwfK1QFT3
 /3DUOJgzd7uwcTRKkBCL4YkeQSRL4ilRuJhJIOebaElz/KBXSU1lyQQz26dSr6iO
 33WdSYjE09+02kXtsjHFIgFFOuTM8Rl38wuI82DbNiH0rh7houMFKfDUjuOgJh2e
 PuHheX9D/SILFCwCwmDvtS4wAmOe72nnA7zW9rV5VWWINPV75x8=
 =qpkh
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2017-09-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

2nd batch of v4.15 features:

- lib/scatterlist updates, use for userptr allocations (Tvrtko)
- Fixed point wrapper cleanup (Mahesh)
- Gen9+ transition watermarks, watermark optimization and fixes (Mahesh)
- Display IPC (Isochronous Priority Control) support (Mahesh)
- GEM workaround fixes (Oscar)
- GVT: PCI config sanitize series (Changbin)
- GVT: Workload submission error handling series (Fred)
- PSR fixes and refactoring (Rodrigo)
- HWSP based optimizations (Chris)
- Private PAT management (Zhi)
- IRQ handling fixes and refactoring (Ville)
- Module parameter refactoring and variable name clash fix (Michal)
- Execlist refactoring, incomplete request unwinding on reset (Chris)
- GuC scheduling improvements (Michal)
- OA updates (Lionel)
- Coffeelake out of alpha support (Rodrigo)
- seqno fixes (Chris)
- Execlist refactoring (Mika)
- DP and DP MST cleanups (Dhinakaran)
- Cannonlake slice/sublice config (Ben)
- Numerous fixes all around (Everyone)

* tag 'drm-intel-next-2017-09-29' of git://anongit.freedesktop.org/drm/drm-intel: (168 commits)
  drm/i915: Update DRIVER_DATE to 20170929
  drm/i915: Use memset64() to prefill the GTT page
  drm/i915: Also discard second CRC on gen8+ platforms.
  drm/i915/psr: Set frames before SU entry for psr2
  drm/dp: Add defines for latency in sink
  drm/i915: Allow optimized platform checks
  drm/i915: Avoid using dev_priv->info.gen directly.
  i915: Use %pS printk format for direct addresses
  drm/i915/execlists: Notify context-out for lost requests
  drm/i915/cnl: Add support slice/subslice/eu configs
  drm/i915: Compact device info access by a small re-ordering
  drm/i915: Add IS_PLATFORM macro
  drm/i915/selftests: Try to recover from a wedged GPU during reset tests
  drm/i915/huc: Reorganize HuC authentication
  drm/i915: Fix default values of some modparams
  drm/i915: Extend I915_PARAMS_FOR_EACH with default member value
  drm/i915: Make I915_PARAMS_FOR_EACH macro more flexible
  drm/i915: Enable scanline read based on frame timestamps
  drm/i915/execlists: Microoptimise execlists_cancel_port_request()
  drm/i915: Don't rmw PIPESTAT enable bits
  ...
2017-10-12 10:20:03 +10:00
Bjorn Andersson
da7653f0fa net: qrtr: Add control packet definition to uapi
The QMUX protocol specification defines structure of the special control
packet messages being sent between handlers of the control port.

Add these to the uapi header, as this structure and the associated types
are shared between the kernel and all userspace handlers of control
messages.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-11 15:28:38 -07:00
Bjorn Andersson
28978713c5 net: qrtr: Move constants to header file
The constants are used by both the name server and clients, so clarify
their value and move them to the uapi header.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-11 15:28:38 -07:00
David S. Miller
df2fd38a08 Work continues in various areas:
* port authorized event for 4-way-HS offload (Avi)
  * enable MFP optional for such devices (Emmanuel)
  * Kees's timer setup patch for mac80211 mesh
    (the part that isn't trivially scripted)
  * improve VLAN vs. TXQ handling (myself)
  * load regulatory database as firmware file (myself)
  * with various other small improvements and cleanups
 
 I merged net-next once in the meantime to allow Kees's
 timer setup patch to go in.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEExu3sM/nZ1eRSfR9Ha3t4Rpy0AB0FAlneDzEACgkQa3t4Rpy0
 AB3EHBAAhQana6YiMx0Ag4ANGlll3xnxFCZlkmlBoJ/EwKgQhPonylHntuvtkXf6
 kZRsOr4uA+wpN/opHLGfMJzat9uxztHVo2sT4rxVnvZq4DYcB/JdlhTMLZDsdDgm
 kHRpUEKh/+2FAgq2A4VEUpVb+Mtg0dq8iJJXFw89xb3Sw5UhNA6ljWQZ4zpXuI0P
 xOB8Z52LqAcMNnspP+L2TRpanu2ETLcl4Laj+cMl1Yiut2GHkclXUoGvbZ1al5SO
 CYqpjVKk67ENLJMrmhQ7DVzj0rpwlV+Eh756RU9DhamPAWbxqWLWJgfuGBskRXnI
 GneCUQkLZ5j1kUJjvQdXBv1UmpkCG4/3yITZX8kL3UR+AbhSCqzVQDo7it5hsWEf
 XTNAlhdTDhSn7OQQ6XOxvWeydAiaaz671bhPuIvKEo9D/+7Uv0PxHmvu8QqUm0xH
 Wvyh0LYRrblDz7fgEkaFctjJKYKnwviQ9O2LGx98C8NVam+Qyti2MlLA4AO5E+it
 ky97W3Dh5ftjQhFD0Ip9P4+BO/9hvNELlCRWUXI197n6B0/KH7FWX1eqw/vpnKc4
 w7VB/V59mB8zMmZ1QUdwT1/Ru+MD++6ds93STttZvH/0P3H0dDRGuxUK4m32YHiX
 s97uSBAbBMy2UH6b8HyxjVMGWvmW3KRakBID1zv2NRSIXtyfWj4=
 =gW8q
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2017-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Work continues in various areas:
 * port authorized event for 4-way-HS offload (Avi)
 * enable MFP optional for such devices (Emmanuel)
 * Kees's timer setup patch for mac80211 mesh
   (the part that isn't trivially scripted)
 * improve VLAN vs. TXQ handling (myself)
 * load regulatory database as firmware file (myself)
 * with various other small improvements and cleanups

I merged net-next once in the meantime to allow Kees's
timer setup patch to go in.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-11 10:15:01 -07:00
Mauro Carvalho Chehab
259a41d9ae media: dvb_frontend: fix return values for FE_SET_PROPERTY
There are several problems with regards to the return of
FE_SET_PROPERTY. The original idea were to return per-property
return codes via tvp->result field, and to return an updated
set of values.

However, that never worked. What's actually implemented is:

- the FE_SET_PROPERTY implementation doesn't call .get_frontend
  callback in order to get the actual parameters after return;

- the tvp->result field is only filled if there's no error.
  So, it is always filled with zero;

- FE_SET_PROPERTY doesn't call memdup_user() nor any other
  copy_to_user() function. So, any changes to the properties
  will be lost;

- FE_SET_PROPERTY is declared as a write-only ioctl (IOW).

While we could fix the above, it could cause regressions.

So, let's just assume what the code really does, updating
the documentation accordingly and removing the logic that
would update the discarded tvp->result.

Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-10-11 12:58:40 -04:00
Johannes Berg
1ea4ff3e9f cfg80211: support reloading regulatory database
If the regulatory database is loaded, and then updated, it may
be necessary to reload it. Add an nl80211 command to do this.

Note that this just reloads the database, it doesn't re-apply
the rules from it immediately.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-10-11 13:04:15 +02:00
Eric Garver
b8226962b1 openvswitch: add ct_clear action
This adds a ct_clear action for clearing conntrack state. ct_clear is
currently implemented in OVS userspace, but is not backed by an action
in the kernel datapath. This is useful for flows that may modify a
packet tuple after a ct lookup has already occurred.

Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10 16:38:34 -07:00
Steve Grubb
de8cd83e91 audit: Record fanotify access control decisions
The fanotify interface allows user space daemons to make access
control decisions. Under common criteria requirements, we need to
optionally record decisions based on policy. This patch adds a bit mask,
FAN_AUDIT, that a user space daemon can 'or' into the response decision
which will tell the kernel that it made a decision and record it.

It would be used something like this in user space code:

  response.response = FAN_DENY | FAN_AUDIT;
  write(fd, &response, sizeof(struct fanotify_response));

When the syscall ends, the audit system will record the decision as a
AUDIT_FANOTIFY auxiliary record to denote that the reason this event
occurred is the result of an access control decision from fanotify
rather than DAC or MAC policy.

A sample event looks like this:

type=PATH msg=audit(1504310584.332:290): item=0 name="./evil-ls"
inode=1319561 dev=fc:03 mode=0100755 ouid=1000 ogid=1000 rdev=00:00
obj=unconfined_u:object_r:user_home_t:s0 nametype=NORMAL
type=CWD msg=audit(1504310584.332:290): cwd="/home/sgrubb"
type=SYSCALL msg=audit(1504310584.332:290): arch=c000003e syscall=2
success=no exit=-1 a0=32cb3fca90 a1=0 a2=43 a3=8 items=1 ppid=901
pid=959 auid=1000 uid=1000 gid=1000 euid=1000 suid=1000
fsuid=1000 egid=1000 sgid=1000 fsgid=1000 tty=pts1 ses=3 comm="bash"
exe="/usr/bin/bash" subj=unconfined_u:unconfined_r:unconfined_t:
s0-s0:c0.c1023 key=(null)
type=FANOTIFY msg=audit(1504310584.332:290): resp=2

Prior to using the audit flag, the developer needs to call
fanotify_init or'ing in FAN_ENABLE_AUDIT to ensure that the kernel
supports auditing. The calling process must also have the CAP_AUDIT_WRITE
capability.

Signed-off-by: sgrubb <sgrubb@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2017-10-10 13:18:06 +02:00
Christian Gmeiner
05916bed11 drm/etnaviv: add uapi for perfmon feature
Sadly we can not read any registers via command stream so we need
to extend the drm_etnaviv_gem_submit struct with performance monitor
requests. Those requests gets process before or after the actual
submitted command stream.

The Vivante kernel driver has a special ioctl to read all perfmon
registers at once and return it.

Changes from v1 -> v2:
- use a 16 bit value for signals
- fix padding issues

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-10 11:45:41 +02:00
Christian Gmeiner
9e2c2e2730 drm/etnaviv: add infrastructure to query perf counter
Make it possible that userspace can query all performance domains and
its signals. This information is needed to sample those signals via
submit ioctl.

At the moment no performance domain is available.

Changes from v1 -> v2:
- use a 16 bit value for signals
- fix padding issues
- add id member to domain and signal struct

Changes v4 -> v5
- provide for each pipe an own set of pm domains

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-10 11:45:41 +02:00
William Tu
ceaa001a17 openvswitch: Add erspan tunnel support.
Add erspan netlink interface for OVS.

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 20:45:50 -07:00
David S. Miller
d93fa2ba64 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-10-09 20:11:09 -07:00
Andres Rodriguez
52c6a62c64 drm/amdgpu: add interface for editing a foreign process's priority v3
The AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE ioctls are used to set
the priority of a different process in the current system.

When a request is dropped, the process's contexts will be
restored to the priority specified at context creation time.

A request can be dropped by setting the override priority to
AMDGPU_CTX_PRIORITY_UNSET.

An fd is used to identify the remote process. This is simpler than
passing a pid number, which is vulnerable to re-use, etc.

This functionality is limited to DRM_MASTER since abuse of this
interface can have a negative impact on the system's performance.

v2: removed unused output structure
v3: change refcounted interface for a regular set operation

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09 16:30:24 -04:00
Andres Rodriguez
f3d19bf80d drm/amdgpu: introduce AMDGPU_CTX_PRIORITY_UNSET
Use _INVALID to identify bad parameters and _UNSET to represent the
lack of interest in a specific value.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09 16:30:23 -04:00
Andres Rodriguez
c2636dc53a drm/amdgpu: add parameter to allocate high priority contexts v11
Add a new context creation parameter to express a global context priority.

The priority ranking in descending order is as follows:
 * AMDGPU_CTX_PRIORITY_HIGH_HW
 * AMDGPU_CTX_PRIORITY_HIGH_SW
 * AMDGPU_CTX_PRIORITY_NORMAL
 * AMDGPU_CTX_PRIORITY_LOW_SW
 * AMDGPU_CTX_PRIORITY_LOW_HW

The driver will attempt to schedule work to the hardware according to
the priorities. No latency or throughput guarantees are provided by
this patch.

This interface intends to service the EGL_IMG_context_priority
extension, and vulkan equivalents.

Setting a priority above NORMAL requires CAP_SYS_NICE or DRM_MASTER.

v2: Instead of using flags, repurpose __pad
v3: Swap enum values of _NORMAL _HIGH for backwards compatibility
v4: Validate usermode priority and store it
v5: Move priority validation into amdgpu_ctx_ioctl(), headline reword
v6: add UAPI note regarding priorities requiring CAP_SYS_ADMIN
v7: remove ctx->priority
v8: added AMDGPU_CTX_PRIORITY_LOW, s/CAP_SYS_ADMIN/CAP_SYS_NICE
v9: change the priority parameter to __s32
v10: split priorities into _SW and _HW
v11: Allow DRM_MASTER without CAP_SYS_NICE

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09 16:30:20 -04:00
Andres Rodriguez
177ae09b5d drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
Introduce a flag to signal that access to a BO will be synchronized
through an external mechanism.

Currently all buffers shared between contexts are subject to implicit
synchronization. However, this is only required for protocols that
currently don't support an explicit synchronization mechanism (DRI2/3).

This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC, so that
users can specify when it is safe to disable implicit sync.

v2: only disable explicit sync in amdgpu_cs_ioctl

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-09 16:30:19 -04:00
David S. Miller
fb60bccc06 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:

1) Fix packet drops due to incorrect ECN handling in IPVS, from Vadim
   Fedorenko.

2) Fix splat with mark restoration in xt_socket with non-full-sock,
   patch from Subash Abhinov Kasiviswanathan.

3) ipset bogusly bails out when adding IPv4 range containing more than
   2^31 addresses, from Jozsef Kadlecsik.

4) Incorrect pernet unregistration order in ipset, from Florian Westphal.

5) Races between dump and swap in ipset results in BUG_ON splats, from
   Ross Lagerwall.

6) Fix chain renames in nf_tables, from JingPiao Chen.

7) Fix race in pernet codepath with ebtables table registration, from
   Artem Savkov.

8) Memory leak in error path in set name allocation in nf_tables, patch
   from Arvind Yadav.

9) Don't dump chain counters if they are not available, this fixes a
   crash when listing the ruleset.

10) Fix out of bound memory read in strlcpy() in x_tables compat code,
    from Eric Dumazet.

11) Make sure we only process TCP packets in SYNPROXY hooks, patch from
    Lin Zhang.

12) Cannot load rules incrementally anymore after xt_bpf with pinned
    objects, added in revision 1. From Shmulik Ladkani.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-09 10:39:52 -07:00
Shmulik Ladkani
98589a0998 netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'
Commit 2c16d60332 ("netfilter: xt_bpf: support ebpf") introduced
support for attaching an eBPF object by an fd, with the
'bpf_mt_check_v1' ABI expecting the '.fd' to be specified upon each
IPT_SO_SET_REPLACE call.

However this breaks subsequent iptables calls:

 # iptables -A INPUT -m bpf --object-pinned /sys/fs/bpf/xxx -j ACCEPT
 # iptables -A INPUT -s 5.6.7.8 -j ACCEPT
 iptables: Invalid argument. Run `dmesg' for more information.

That's because iptables works by loading existing rules using
IPT_SO_GET_ENTRIES to userspace, then issuing IPT_SO_SET_REPLACE with
the replacement set.

However, the loaded 'xt_bpf_info_v1' has an arbitrary '.fd' number
(from the initial "iptables -m bpf" invocation) - so when 2nd invocation
occurs, userspace passes a bogus fd number, which leads to
'bpf_mt_check_v1' to fail.

One suggested solution [1] was to hack iptables userspace, to perform a
"entries fixup" immediatley after IPT_SO_GET_ENTRIES, by opening a new,
process-local fd per every 'xt_bpf_info_v1' entry seen.

However, in [2] both Pablo Neira Ayuso and Willem de Bruijn suggested to
depricate the xt_bpf_info_v1 ABI dealing with pinned ebpf objects.

This fix changes the XT_BPF_MODE_FD_PINNED behavior to ignore the given
'.fd' and instead perform an in-kernel lookup for the bpf object given
the provided '.path'.

It also defines an alias for the XT_BPF_MODE_FD_PINNED mode, named
XT_BPF_MODE_PATH_PINNED, to better reflect the fact that the user is
expected to provide the path of the pinned object.

Existing XT_BPF_MODE_FD_ELF behavior (non-pinned fd mode) is preserved.

References: [1] https://marc.info/?l=netfilter-devel&m=150564724607440&w=2
            [2] https://marc.info/?l=netfilter-devel&m=150575727129880&w=2

Reported-by: Rafael Buchbinder <rafi@rbk.ms>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-10-09 15:18:04 +02:00
Greg Kroah-Hartman
9424e8b1fe Merge 4.14-rc4 into tty-next
We want the tty/serial fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-09 09:05:05 +02:00
Roopa Prabhu
821f1b21ca bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood
This patch adds a new bridge port flag BR_NEIGH_SUPPRESS to
suppress arp and nd flood on bridge ports. It implements
rfc7432, section 10.
https://tools.ietf.org/html/rfc7432#section-10
for ethernet VPN deployments. It is similar to the existing
BR_PROXYARP* flags but has a few semantic differences to conform
to EVPN standard. Unlike the existing flags, this new flag suppresses
flood of all neigh discovery packets (arp and nd) to tunnel ports.
Supports both vlan filtering and non-vlan filtering bridges.

In case of EVPN, it is mainly used to avoid flooding
of arp and nd packets to tunnel ports like vxlan.

This patch adds netlink and sysfs support to set this bridge port
flag.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-08 21:12:04 -07:00
Dave Airlie
bb7a9c8d71 Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux into drm-next
More new stuff for 4.15. Highlights:
- Add clock query interface for raven
- Add new FENCE_TO_HANDLE ioctl
- UVD video encode ring support on polaris
- transparent huge page DMA support
- deadlock fixes
- compute pipe lru tweaks
- powerplay cleanups and regression fixes
- fix duplicate symbol issue with radeon and amdgpu
- misc bug fixes

* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (72 commits)
  drm/radeon/dp: make radeon_dp_get_dp_link_config static
  drm/radeon: move ci_send_msg_to_smc to where it's used
  drm/amd/sched: fix deadlock caused by unsignaled fences of deleted jobs
  drm/amd/sched: NULL out the s_fence field after run_job
  drm/amd/sched: move adding finish callback to amd_sched_job_begin
  drm/amd/sched: fix an outdated comment
  drm/amd/sched: rename amd_sched_entity_pop_job
  drm/amdgpu: minor coding style fix
  drm/ttm: add transparent huge page support for DMA allocations v2
  drm/ttm: add support for different pool sizes
  drm/ttm: remove unsued options from ttm_mem_global_alloc_page
  drm/amdgpu: add uvd enc irq
  drm/amdgpu: add uvd enc ib test
  drm/amdgpu: add uvd enc ring test
  drm/amdgpu: add uvd enc vm functions (v2)
  drm/amdgpu: add uvd enc into run queue
  drm/amdgpu: add uvd enc rings
  drm/amdgpu: add new uvd enc ring methods
  drm/amdgpu: add uvd enc command in header
  drm/amdgpu: add uvd enc registers in header
  ...
2017-10-09 11:00:16 +10:00
Martin KaFai Lau
067cae4777 bpf: Use char in prog and map name
Instead of u8, use char for prog and map name.  It can avoid the
userspace tool getting compiler's signess warning.  The
bpf_prog_aux, bpf_map, bpf_attr, bpf_prog_info and
bpf_map_info are changed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:29:39 +01:00
Yonghong Song
4bebdc7a85 bpf: add helper bpf_perf_prog_read_value
This patch adds helper bpf_perf_prog_read_cvalue for perf event based bpf
programs, to read event counter and enabled/running time.
The enabled/running time is accumulated since the perf event open.

The typical use case for perf event based bpf program is to attach itself
to a single event. In such cases, if it is desirable to get scaling factor
between two bpf invocations, users can can save the time values in a map,
and use the value from the map and the current value to calculate
the scaling factor.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:05:57 +01:00
Yonghong Song
908432ca84 bpf: add helper bpf_perf_event_read_value for perf event array map
Hardware pmu counters are limited resources. When there are more
pmu based perf events opened than available counters, kernel will
multiplex these events so each event gets certain percentage
(but not 100%) of the pmu time. In case that multiplexing happens,
the number of samples or counter value will not reflect the
case compared to no multiplexing. This makes comparison between
different runs difficult.

Typically, the number of samples or counter value should be
normalized before comparing to other experiments. The typical
normalization is done like:
  normalized_num_samples = num_samples * time_enabled / time_running
  normalized_counter_value = counter_value * time_enabled / time_running
where time_enabled is the time enabled for event and time_running is
the time running for event since last normalization.

This patch adds helper bpf_perf_event_read_value for kprobed based perf
event array map, to read perf counter and enabled/running time.
The enabled/running time is accumulated since the perf event open.
To achieve scaling factor between two bpf invocations, users
can can use cpu_id as the key (which is typical for perf array usage model)
to remember the previous value and do the calculation inside the
bpf program.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 23:05:57 +01:00
Amine Kherbouche
bdc476413d ip_tunnel: add mpls over gre support
This commit introduces the MPLSoGRE support (RFC 4023), using ip tunnel
API by simply adding ipgre_tunnel_encap_(add|del)_mpls_ops() and the new
tunnel type TUNNEL_ENCAP_MPLS.

Signed-off-by: Amine Kherbouche <amine.kherbouche@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-07 21:38:31 +01:00
Marek Olšák
7ca24cf2d2 drm/amdgpu: add FENCE_TO_HANDLE ioctl that returns syncobj or sync_file
for being able to convert an amdgpu fence into one of the handles.
Mesa will use this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-06 16:47:56 -04:00
Joonas Lahtinen
822a4b6732 drm/i915: Don't use BIT() in UAPI section
Lets not introduce BIT() macro requirement for UAPI for now.

Fixes: 3fd3a6ffe2 ("drm/i915: Simplify i915_reg_read_ioctl")
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171006104559.17312-1-joonas.lahtinen@linux.intel.com
2017-10-06 14:08:55 +03:00
Johannes Berg
753d179ad0 Merge remote-tracking branch 'net-next/master' into mac80211-next
Merging this brings in the timer_setup() change, which allows
me to apply Kees's mac80211 changes for it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-10-06 11:46:55 +02:00
Stefan Hajnoczi
413a4317ac VSOCK: add sock_diag interface
This patch adds the sock_diag interface for querying sockets from
userspace.  Tools like ss(8) and netstat(8) can use this interface to
list open sockets.

The userspace ABI is defined in <linux/vm_sockets_diag.h> and includes
netlink request and response structs.  The request can query sockets
based on their sk_state (e.g. listening sockets only) and the response
contains socket information fields including the local/remote addresses,
inode number, etc.

This patch does not dump VMCI pending sockets because I have only tested
the virtio transport, which does not use pending sockets.  Support can
be added later by extending vsock_diag_dump() if needed by VMCI users.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05 18:44:17 -07:00
David S. Miller
53954cf8c5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Just simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-05 18:19:22 -07:00
Linus Torvalds
076264ada9 - A stable fix for the alignment of the event number reported at the
end of the 'DM_LIST_DEVICES' ioctl.
 
 - A couple stable fixes for the DM crypt target.
 
 - A DM raid health status reporting fix.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZ1pR1AAoJEMUj8QotnQNa48kIAJ+HTqeNjVhspxqKyJHPl78W
 3N/B11dWJ/CQ4xN7tbpC2gmsbnBBHE8RFTJzk3xQo7yoKsD0muqH35n0XA7X2A29
 i7DoYro/7F6ZuPlgzhzcCjA7eTugR4vcp5dTFYoIQG0DaOKAkN/+gJTVjNDjpRR5
 oGljZhKTeS4UNJTv/+ZjSMuAPycZq8LKRMOn/EgqT9MD4cIQ9VHN2qGc8jQt0Xrb
 m58URvAoFesGnSjZcypk+JG2SbUfJ4WB3Db7+A+X7lu2219FIroFhNHMk9obYhXG
 mkrhEnAsVsq/paPhCY4gdXWmSe7RNiAeSJeWhUSrNfjUACf1GF+l4CgBeBWIX+0=
 =V40h
 -----END PGP SIGNATURE-----

Merge tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - a stable fix for the alignment of the event number reported at the
   end of the 'DM_LIST_DEVICES' ioctl.

 - a couple stable fixes for the DM crypt target.

 - a DM raid health status reporting fix.

* tag 'for-4.14/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm raid: fix incorrect status output at the end of a "recover" process
  dm crypt: reject sector_size feature if device length is not aligned to it
  dm crypt: fix memory leak in crypt_ctr_cipher_old()
  dm ioctl: fix alignment of event number in the device list
2017-10-05 15:17:40 -07:00
Linus Torvalds
9a431ef962 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Check iwlwifi 9000 reorder buffer out-of-space condition properly,
    from Sara Sharon.

 2) Fix RCU splat in qualcomm rmnet driver, from Subash Abhinov
    Kasiviswanathan.

 3) Fix session and tunnel release races in l2tp, from Guillaume Nault
    and Sabrina Dubroca.

 4) Fix endian bug in sctp_diag_dump(), from Dan Carpenter.

 5) Several mlx5 driver fixes from the Mellanox folks (max flow counters
    cap check, invalid memory access in IPoIB support, etc.)

 6) tun_get_user() should bail if skb->len is zero, from Alexander
    Potapenko.

 7) Fix RCU lookups in inetpeer, from Eric Dumazet.

 8) Fix locking in packet_do_bund().

 9) Handle cb->start() error properly in netlink dump code, from Jason
    A. Donenfeld.

10) Handle multicast properly in UDP socket early demux code. From Paolo
    Abeni.

11) Several erspan bug fixes in ip_gre, from Xin Long.

12) Fix use-after-free in socket filter code, in order to handle the
    fact that listener lock is no longer taken during the three-way TCP
    handshake. From Eric Dumazet.

13) Fix infoleak in RTM_GETSTATS, from Nikolay Aleksandrov.

14) Fix tail call generation in x86-64 BPF JIT, from Alexei Starovoitov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (77 commits)
  net: 8021q: skip packets if the vlan is down
  bpf: fix bpf_tail_call() x64 JIT
  net: stmmac: dwmac-rk: Add RK3128 GMAC support
  rndis_host: support Novatel Verizon USB730L
  net: rtnetlink: fix info leak in RTM_GETSTATS call
  socket, bpf: fix possible use after free
  mlxsw: spectrum_router: Track RIF of IPIP next hops
  mlxsw: spectrum_router: Move VRF refcounting
  net: hns3: Fix an error handling path in 'hclge_rss_init_hw()'
  net: mvpp2: Fix clock resource by adding an optional bus clock
  r8152: add Linksys USB3GIGV1 id
  l2tp: fix l2tp_eth module loading
  ip_gre: erspan device should keep dst
  ip_gre: set tunnel hlen properly in erspan_tunnel_init
  ip_gre: check packet length and mtu correctly in erspan_xmit
  ip_gre: get key from session_id correctly in erspan_rcv
  tipc: use only positive error codes in messages
  ppp: fix __percpu annotation
  udp: perform source validation for mcast early demux
  IPv4: early demux can return an error code
  ...
2017-10-05 08:40:09 -07:00
Nicolas Dichtel
6621dd29eb dev: advertise the new nsid when the netns iface changes
x-netns interfaces are bound to two netns: the link netns and the upper
netns. Usually, this kind of interfaces is created in the link netns and
then moved to the upper netns. At the end, the interface is visible only
in the upper netns. The link nsid is advertised via netlink in the upper
netns, thus the user always knows where is the link part.

There is no such mechanism in the link netns. When the interface is moved
to another netns, the user cannot "follow" it.
This patch adds a new netlink attribute which helps to follow an interface
which moves to another netns. When the interface is unregistered, the new
nsid is advertised. If the interface is a x-netns interface (ie
rtnl_link_ops->get_link_net is defined), the nsid is allocated if needed.

CC: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 18:04:41 -07:00
Alexei Starovoitov
468e2f64d2 bpf: introduce BPF_PROG_QUERY command
introduce BPF_PROG_QUERY command to retrieve a set of either
attached programs to given cgroup or a set of effective programs
that will execute for events within a cgroup

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
for cgroup bits
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 16:05:05 -07:00
Alexei Starovoitov
324bda9e6c bpf: multi program support for cgroup+bpf
introduce BPF_F_ALLOW_MULTI flag that can be used to attach multiple
bpf programs to a cgroup.

The difference between three possible flags for BPF_PROG_ATTACH command:
- NONE(default): No further bpf programs allowed in the subtree.
- BPF_F_ALLOW_OVERRIDE: If a sub-cgroup installs some bpf program,
  the program in this cgroup yields to sub-cgroup program.
- BPF_F_ALLOW_MULTI: If a sub-cgroup installs some bpf program,
  that cgroup program gets run in addition to the program in this cgroup.

NONE and BPF_F_ALLOW_OVERRIDE existed before. This patch doesn't
change their behavior. It only clarifies the semantics in relation
to new flag.

Only one program is allowed to be attached to a cgroup with
NONE or BPF_F_ALLOW_OVERRIDE flag.
Multiple programs are allowed to be attached to a cgroup with
BPF_F_ALLOW_MULTI flag. They are executed in FIFO order
(those that were attached first, run first)
The programs of sub-cgroup are executed first, then programs of
this cgroup and then programs of parent cgroup.
All eligible programs are executed regardless of return code from
earlier programs.

To allow efficient execution of multiple programs attached to a cgroup
and to avoid penalizing cgroups without any programs attached
introduce 'struct bpf_prog_array' which is RCU protected array
of pointers to bpf programs.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
for cgroup bits
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-04 16:05:05 -07:00
Chris Wilson
ac14fbd460 drm/i915/scheduler: Support user-defined priorities
Use a priority stored in the context as the initial value when
submitting a request. This allows us to change the default priority on a
per-context basis, allowing different contexts to be favoured with GPU
time at the expense of lower importance work. The user can adjust the
context's priority via I915_CONTEXT_PARAM_PRIORITY, with more positive
values being higher priority (they will be serviced earlier, after their
dependencies have been resolved). Any prerequisite work for an execbuf
will have its priority raised to match the new request as required.

Normal users can specify any value in the range of -1023 to 0 [default],
i.e. they can reduce the priority of their workloads (and temporarily
boost it back to normal if so desired).

Privileged users can specify any value in the range of -1023 to 1023,
[default is 0], i.e. they can raise their priority above all overs and
so potentially starve the system.

Note that the existing schedulers are not fair, nor load balancing, the
execution is strictly by priority on a first-come, first-served basis,
and the driver may choose to boost some requests above the range
available to users.

This priority was originally based around nice(2), but evolved to allow
clients to adjust their priority within a small range, and allow for a
privileged high priority range.

For example, this can be used to implement EGL_IMG_context_priority
https://www.khronos.org/registry/egl/extensions/IMG/EGL_IMG_context_priority.txt

	EGL_CONTEXT_PRIORITY_LEVEL_IMG determines the priority level of
        the context to be created. This attribute is a hint, as an
        implementation may not support multiple contexts at some
        priority levels and system policy may limit access to high
        priority contexts to appropriate system privilege level. The
        default value for EGL_CONTEXT_PRIORITY_LEVEL_IMG is
        EGL_CONTEXT_PRIORITY_MEDIUM_IMG."

so we can map

	PRIORITY_HIGH -> 1023 [privileged, will failback to 0]
	PRIORITY_MED -> 0 [default]
	PRIORITY_LOW -> -1023

They also map onto the priorities used by VkQueue (and a VkQueue is
essentially a timeline, our i915_gem_context under full-ppgtt).

v2: s/CAP_SYS_ADMIN/CAP_SYS_NICE/
v3: Report min/max user priorities as defines in the uapi, and rebase
internal priorities on the exposed values.

Testcase: igt/gem_exec_schedule
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-9-chris@chris-wilson.co.uk
2017-10-04 17:52:46 +01:00
Chris Wilson
bf64e0b00e drm/i915: Expand I915_PARAM_HAS_SCHEDULER into a capability bitmask
In the next few patches, we wish to enable different features for the
scheduler, some which may subtlety change ABI (e.g. allow requests to be
reordered under different circumstances). So we need to make sure
userspace is cognizant of the changes (if they care), by which we employ
the usual method of a GETPARAM. We already have an
I915_PARAM_HAS_SCHEDULER (which notes the existing ability to reorder
requests to avoid bubbles), and now we wish to extend that to be a
bitmask to describe the different capabilities implemented.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-7-chris@chris-wilson.co.uk
2017-10-04 17:52:46 +01:00
Ed Blake
6263368c5b serial: Add define for max baud rate divisor
Add a define for the maximum baud rate divisor, to improve code
readability.

Signed-off-by: Ed Blake <ed.blake@sondrel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-04 10:17:27 +02:00
Marcelo Ricardo Leitner
ac1ed8b82c sctp: introduce round robin stream scheduler
This patch introduces RFC Draft ndata section 3.2 Priority Based
Scheduler (SCTP_SS_RR).

Works by maintaining a list of enqueued streams and tracking the last
one used to send data. When the datamsg is done, it switches to the next
stream.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Marcelo Ricardo Leitner
637784ade2 sctp: introduce priority based stream scheduler
This patch introduces RFC Draft ndata section 3.4 Priority Based
Scheduler (SCTP_SS_PRIO).

It works by having a struct sctp_stream_priority for each priority
configured. This struct is then enlisted on a queue ordered per priority
if, and only if, there is a stream with data queued, so that dequeueing
is very straightforward: either finish current datamsg or simply dequeue
from the highest priority queued, which is the next stream pointed, and
that's it.

If there are multiple streams assigned with the same priority and with
data queued, it will do round robin amongst them while respecting
datamsgs boundaries (when not using idata chunks), to be reasonably
fair.

We intentionally don't maintain a list of priorities nor a list of all
streams with the same priority to save memory. The first would mean at
least 2 other pointers per priority (which, for 1000 priorities, that
can mean 16kB) and the second would also mean 2 other pointers but per
stream. As SCTP supports up to 65535 streams on a given asoc, that's
1MB. This impacts when giving a priority to some stream, as we have to
find out if the new priority is already being used and if we can free
the old one, and also when tearing down.

The new fields in struct sctp_stream_out_ext and sctp_stream are added
under a union because that memory is to be shared with other schedulers.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Marcelo Ricardo Leitner
0ccdf3c7fd sctp: add sockopt to get/set stream scheduler parameters
As defined per RFC Draft ndata Section 4.3.3, named as
SCTP_STREAM_SCHEDULER_VALUE.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Marcelo Ricardo Leitner
13aa8770fe sctp: add sockopt to get/set stream scheduler
As defined per RFC Draft ndata Section 4.3.2, named as
SCTP_STREAM_SCHEDULER.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Marcelo Ricardo Leitner
5bbbbe32a4 sctp: introduce stream scheduler foundations
This patch introduces the hooks necessary to do stream scheduling, as
per RFC Draft ndata.  It also introduces the first scheduler, which is
what we do today but now factored out: first come first served (FCFS).

With stream scheduling now we have to track which chunk was enqueued on
which stream and be able to select another other than the in front of
the main outqueue. So we introduce a list on sctp_stream_out_ext
structure for this purpose.

We reuse sctp_chunk->transmitted_list space for the list above, as the
chunk cannot belong to the two lists at the same time. By using the
union in there, we can have distinct names for these moments.

sctp_sched_ops are the operations expected to be implemented by each
scheduler. The dequeueing is a bit particular to this implementation but
it is to match how we dequeue packets today. We first dequeue and then
check if it fits the packet and if not, we requeue it at head. Thus why
we don't have a peek operation but have dequeue_done instead, which is
called once the chunk can be safely considered as transmitted.

The check removed from sctp_outq_flush is now performed by
sctp_stream_outq_migrate, which is only called during assoc setup.
(sctp_sendmsg() also checks for it)

The only operation that is foreseen but not yet added here is a way to
signalize that a new packet is starting or that the packet is done, for
round robin scheduler per packet, but is intentionally left to the
patch that actually implements it.

Support for I-DATA chunks, also described in this RFC, with user message
interleaving is straightforward as it just requires the schedulers to
probe for the feature and ignore datamsg boundaries when dequeueing.

See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-13
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:27:29 -07:00
Alexei Starovoitov
90caccdd8c bpf: fix bpf_tail_call() x64 JIT
- bpf prog_array just like all other types of bpf array accepts 32-bit index.
  Clarify that in the comment.
- fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes
- tighten corresponding check in the interpreter to stay consistent

The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag
in commit 96eabe7a40 in 4.14. Before that the map_flags would stay zero and
though JIT code is wrong it will check bounds correctly.
Hence two fixes tags. All other JITs don't have this problem.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Fixes: 96eabe7a40 ("bpf: Allow selecting numa node during map creation")
Fixes: b52f00e6a7 ("x86: bpf_jit: implement bpf_tail_call() helper")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-03 16:04:44 -07:00
Linus Torvalds
887c8ba753 USB fixes for 4.14-rc4
Here are a number of USB fixes for 4.14-rc4 to resolved reported issue.
 
 There's a bunch of stuff in here based on the great work Andrey
 Konovalov is doing in fuzzing the USB stack.  Lots of bug fixes when
 dealing with corrupted USB descriptors that we've never seen in "normal"
 operation, but is now ensuring the stack is much more hardened overall.
 
 There's also the usual XHCI and gadget driver fixes as well, and a build
 error fix, and a few other minor things, full details in the shortlog.
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN/yw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yl6pQCdGY+nPJhzj9EIeFj5QUpSuS4b1pYAoKrbNn+V
 CMpg4iG1oXUtVL8jBbKa
 =fVpl
 -----END PGP SIGNATURE-----

Merge tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are a number of USB fixes for 4.14-rc4 to resolved reported
  issues.

  There's a bunch of stuff in here based on the great work Andrey
  Konovalov is doing in fuzzing the USB stack. Lots of bug fixes when
  dealing with corrupted USB descriptors that we've never seen in
  "normal" operation, but is now ensuring the stack is much more
  hardened overall.

  There's also the usual XHCI and gadget driver fixes as well, and a
  build error fix, and a few other minor things, full details in the
  shortlog.

  All of these have been in linux-next with no reported issues"

* tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (38 commits)
  usb: dwc3: of-simple: Add compatible for Spreadtrum SC9860 platform
  usb: gadget: udc: atmel: set vbus irqflags explicitly
  usb: gadget: ffs: handle I/O completion in-order
  usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction
  usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe
  usb: gadget: udc: renesas_usb3: Fix return value of usb3_write_pipe()
  usb: gadget: udc: renesas_usb3: fix Pn_RAMMAP.Pn_MPKT value
  usb: gadget: udc: renesas_usb3: fix for no-data control transfer
  USB: dummy-hcd: Fix erroneous synchronization change
  USB: dummy-hcd: fix infinite-loop resubmission bug
  USB: dummy-hcd: fix connection failures (wrong speed)
  USB: cdc-wdm: ignore -EPIPE from GetEncapsulatedResponse
  USB: devio: Don't corrupt user memory
  USB: devio: Prevent integer overflow in proc_do_submiturb()
  USB: g_mass_storage: Fix deadlock when driver is unbound
  USB: gadgetfs: Fix crash caused by inadequate synchronization
  USB: gadgetfs: fix copy_to_user while holding spinlock
  USB: uas: fix bug in handling of alternate settings
  usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives
  usb-storage: fix bogus hardware error messages for ATA pass-thru devices
  ...
2017-10-03 09:25:40 -07:00
Dave Airlie
ebec44a245 Linux 4.14-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZ0WQ6AAoJEHm+PkMAQRiGuloH/3sF4qfBhPuJo8OTf0uCtQ18
 4Ux9zZbm81df/Jjz0exAp1Jqk+TvdIS3OXPWcKilvbUBP16hQcsxFTnI/5QF+YcN
 87aNr+OCMJzOBK4suN1yhzO46NYHeIizdB0PTZVL1Zsto69Tt31D8VJmgH6oBxAw
 Isb/nAkOr31dZ9PI5UEExTIanUt6EywVb0UswA+2rNl3h1UkeasQCpMpK2n6HBhU
 kVD7sxEd/CN0MmfhB0HrySSam/BeSpOtzoU9bemOwrU2uu9+5+2rqMe7Gsdj4nX6
 3Kk+7FQNktlrhxCZIFN/+CdusOUuDd8r/75d7DnsRK5YvSb0sZzJkfD3Nba68Ms=
 =7J2+
 -----END PGP SIGNATURE-----

BackMerge tag 'v4.14-rc3' into drm-next

Linux 4.14-rc3

Requested by Daniel for the tracing build fix in fixes.
2017-10-03 09:35:04 +10:00
Avraham Stern
503c1fb98b cfg80211/nl80211: add a port authorized event
Add an event that indicates that a connection is authorized
(i.e. the 4 way handshake was performed by the driver). This event
should be sent by the driver after sending a connect/roamed event.

This is useful for networks that require 802.1X authentication.
In cases that the driver supports 4 way handshake offload, but the
802.1X authentication is managed by user space, the driver needs to
inform user space right after the 802.11 association was completed
so user space can initialize its 802.1X state machine etc.
However, it is also possible that the AP will choose to skip the
802.1X authentication (e.g. when PMKSA caching is used) and proceed
with the 4 way handshake immediately. In this case the driver needs
to inform user space that 802.1X authentication is no longer required
(e.g. to prevent user space from disconnecting since it did not get
any EAPOLs from the AP).

This is also useful for roaming, in which case it is possible that
the driver used the Fast Transition protocol so 802.1X is not
required.

Since there will now be a dedicated notification indicating that the
connection is authorized, the authorized flag can be removed from the
roamed event. Drivers can send the new port authorized event right
after sending the roamed event to indicate the new AP is already
authorized. This therefore reserves the old PORT_AUTHORIZED attribute.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-10-02 14:08:27 +02:00
Maciej Żenczykowski
84e14fe353 net-ipv6: add support for sockopt(SOL_IPV6, IPV6_FREEBIND)
So far we've been relying on sockopt(SOL_IP, IP_FREEBIND) being usable
even on IPv6 sockets.

However, it turns out it is perfectly reasonable to want to set freebind
on an AF_INET6 SOCK_RAW socket - but there is no way to set any SOL_IP
socket option on such a socket (they're all blindly errored out).

One use case for this is to allow spoofing src ip on a raw socket
via sendmsg cmsg.

Tested:
  built, and booted
  # python
  >>> import socket
  >>> SOL_IP = socket.SOL_IP
  >>> SOL_IPV6 = socket.IPPROTO_IPV6
  >>> IP_FREEBIND = 15
  >>> IPV6_FREEBIND = 78
  >>> s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, 0)
  >>> s.getsockopt(SOL_IP, IP_FREEBIND)
  0
  >>> s.getsockopt(SOL_IPV6, IPV6_FREEBIND)
  0
  >>> s.setsockopt(SOL_IPV6, IPV6_FREEBIND, 1)
  >>> s.getsockopt(SOL_IP, IP_FREEBIND)
  1
  >>> s.getsockopt(SOL_IPV6, IPV6_FREEBIND)
  1

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-30 05:30:52 +01:00
Mauro Carvalho Chehab
cf09e3c904 Linux 4.14-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZyEIXAAoJEHm+PkMAQRiGNYYH/0VsoYYN4WNYtfetKcHWT27G
 FZYu/yr+MpybcHhB74eiYEGK2qwR6OV1l9H4WVfESrbnT84Dbb7QdDssD2YCD4Lw
 oFMtr8qQmfHBxlRHIlk4UNSrcy+xketMNvuHjPklKqtrLutVQ4D9HR0yLOVKoxyR
 hoxV6F6OnDCekhPjcBQZ+xDYuzTnF8fCqk1szvv4X2DBbffNX9Fvr9eVnS4bgML6
 gfNiEplq4PiUXi3yPP2zRpdwRGDVpw8Wh++iroYgUFZ/LtVURI83GpeLyP0cs/W6
 MygH+joQovOzkJ26MO8/SATdjvXn7NjBcjrDT61GrVhawJOkcnnQQvRpep3ih14=
 =wrF8
 -----END PGP SIGNATURE-----

Merge tag 'v4.14-rc2' into patchwork

Linux 4.14-rc2

* tag 'v4.14-rc2': (12066 commits)
  Linux 4.14-rc2
  tpm: ibmvtpm: simplify crq initialization and document crq format
  tpm: replace msleep() with  usleep_range() in TPM 1.2/2.0 generic drivers
  Documentation: tpm: add powered-while-suspended binding documentation
  tpm: tpm_crb: constify acpi_device_id.
  tpm: vtpm: constify vio_device_id
  security: fix description of values returned by cap_inode_need_killpriv
  x86/asm: Fix inline asm call constraints for Clang
  objtool: Handle another GCC stack pointer adjustment bug
  inet: fix improper empty comparison
  net: use inet6_rcv_saddr to compare sockets
  net: set tb->fast_sk_family
  net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
  MAINTAINERS: update git tree locations for ieee802154 subsystem
  SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags
  SMB3: handle new statx fields
  arch: remove unused *_segments() macros/functions
  parisc: Unbreak bootloader due to gcc-7 optimizations
  parisc: Reintroduce option to gzip-compress the kernel
  apparmor: fix apparmorfs DAC access permissions
  ...
2017-09-29 05:24:10 -04:00
Martin KaFai Lau
ad5b177bd7 bpf: Add map_name to bpf_map_info
This patch allows userspace to specify a name for a map
during BPF_MAP_CREATE.

The map's name can later be exported to user space
via BPF_OBJ_GET_INFO_BY_FD.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29 06:17:05 +01:00
Martin KaFai Lau
cb4d2b3f03 bpf: Add name, load_time, uid and map_ids to bpf_prog_info
The patch adds name and load_time to struct bpf_prog_aux.  They
are also exported to bpf_prog_info.

The bpf_prog's name is passed by userspace during BPF_PROG_LOAD.
The kernel only stores the first (BPF_PROG_NAME_LEN - 1) bytes
and the name stored in the kernel is always \0 terminated.

The kernel will reject name that contains characters other than
isalnum() and '_'.  It will also reject name that is not null
terminated.

The existing 'user->uid' of the bpf_prog_aux is also exported to
the bpf_prog_info as created_by_uid.

The existing 'used_maps' of the bpf_prog_aux is exported to
the newly added members 'nr_map_ids' and 'map_ids' of
the bpf_prog_info.  On the input, nr_map_ids tells how
big the userspace's map_ids buffer is.  On the output,
nr_map_ids tells the exact user_map_cnt and it will only
copy up to the userspace's map_ids buffer is allowed.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29 06:17:05 +01:00
Nikolay Aleksandrov
5af48b59f3 net: bridge: add per-port group_fwd_mask with less restrictions
We need to be able to transparently forward most link-local frames via
tunnels (e.g. vxlan, qinq). Currently the bridge's group_fwd_mask has a
mask which restricts the forwarding of STP and LACP, but we need to be able
to forward these over tunnels and control that forwarding on a per-port
basis thus add a new per-port group_fwd_mask option which only disallows
mac pause frames to be forwarded (they're always dropped anyway).
The patch does not change the current default situation - all of the others
are still restricted unless configured for forwarding.
We have successfully tested this patch with LACP and STP forwarding over
VxLAN and qinq tunnels.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-29 06:02:55 +01:00
Jani Nikula
32f35b8634 Merge drm-upstream/drm-next into drm-intel-next-queued
Need MST sideband message transaction to power up/down nodes.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-09-28 15:56:49 +03:00
Alice Frosi
262832bc5a s390/ptrace: add runtime instrumention register get/set
Add runtime instrumention register get and set which allows to read
and modify the runtime instrumention control block.

Signed-off-by: Alice Frosi <alice@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-28 07:29:41 +02:00
Dave Airlie
754270c7c5 Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux into drm-next
First feature pull for 4.15.  Highlights:
- Per VM BO support
- Lots of powerplay cleanups
- Powerplay support for CI
- pasid mgr for kfd
- interrupt infrastructure for recoverable page faults
- SR-IOV fixes
- initial GPU reset for vega10
- prime mmap support
- ttm page table debugging improvements
- lots of bug fixes

* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (232 commits)
  drm/amdgpu: clarify license in amdgpu_trace_points.c
  drm/amdgpu: Add gem_prime_mmap support
  drm/amd/powerplay: delete dead code in smumgr
  drm/amd/powerplay: delete SMUM_FIELD_MASK
  drm/amd/powerplay: delete SMUM_WAIT_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_READ_FIELD
  drm/amd/powerplay: delete SMUM_SET_FIELD
  drm/amd/powerplay: delete SMUM_READ_VFPF_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_WRITE_VFPF_INDIRECT_FIELD
  drm/amd/powerplay: delete SMUM_WRITE_FIELD
  drm/amd/powerplay: delete SMU_WRITE_INDIRECT_FIELD
  drm/amd/powerplay: move macros to hwmgr.h
  drm/amd/powerplay: move PHM_WAIT_VFPF_INDIRECT_FIELD to hwmgr.h
  drm/amd/powerplay: move SMUM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL to hwmgr.h
  drm/amd/powerplay: move SMUM_WAIT_INDIRECT_FIELD_UNEQUAL to hwmgr.h
  drm/amd/powerplay: add new helper functions in hwmgr.h
  drm/amd/powerplay: use SMU_IND_INDEX/DATA_11 pair
  drm/amd/powerplay: refine powerplay code.
  drm/amd/powerplay: delete dead code in hwmgr.h
  drm/amd/powerplay: refine interface in struct pp_smumgr_func
  ...
2017-09-28 08:37:02 +10:00
Dave Airlie
9afafdbfbf Merge tag 'drm-intel-next-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel into drm-next
Getting started with v4.15 features:

- Cannonlake workarounds (Rodrigo, Oscar)
- Infoframe refactoring and fixes to enable infoframes for DP (Ville)
- VBT definition updates (Jani)
- Sparse warning fixes (Ville, Chris)
- Crtc state usage fixes and cleanups (Ville)
- DP vswing, pre-emph and buffer translation refactoring and fixes (Rodrigo)
- Prevent IPS from interfering with CRC capture (Ville, Marta)
- Enable Mesa to advertise ARB_timer_query (Nanley)
- Refactor GT number into intel_device_info (Lionel)
- Avoid eDP DP AUX CH timeouts harder (Manasi)
- CDCLK check improvements (Ville)
- Restore GPU clock boost on missed pageflip vblanks (Chris)
- Fence register reservation API for vGPU (Changbin)
- First batch of CCS fixes (Ville)
- Finally, numerous GEM fixes, cleanups and improvements (Chris)

* tag 'drm-intel-next-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel: (100 commits)
  drm/i915: Update DRIVER_DATE to 20170907
  drm/i915/cnl: WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod)
  drm/i915: Lift has-pinned-pages assert to caller of ____i915_gem_object_get_pages
  drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk
  drm/i915/cnl: Allow the reg_read ioctl to read the RCS TIMESTAMP register
  drm/i915: Move device_info.has_snoop into the static tables
  drm/i915: Disable MI_STORE_DATA_IMM for i915g/i915gm
  drm/i915: Re-enable GTT following a device reset
  drm/i915/cnp: Wa 1181: Fix Backlight issue
  drm/i915: Annotate user relocs with __user
  drm/i915: Constify load detect mode
  drm/i915/perf: Remove __user from u64 in drm_i915_perf_oa_config
  drm/i915: Silence sparse by using gfp_t
  drm/i915: io unmap functions want __iomem
  drm/i915: Add __rcu to radix tree slot pointer
  drm/i915: Wake up the device for the fbdev setup
  drm/i915: Add interface to reserve fence registers for vGPU
  drm/i915: Use correct path to trace include
  drm/i915: Fix the missing PPAT cache attributes on CNL
  drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder
  ...
2017-09-28 07:12:44 +10:00
Dave Airlie
29baa82aa5 Merge tag 'drm-misc-next-2017-09-20' of git://anongit.freedesktop.org/git/drm-misc into drm-next
UAPI Changes:

Cross-subsystem Changes:

Core Changes:
- DP SDP defines (Ville)
- polish for scdc helpers (Thierry Reding)
- fix lifetimes for connector/plane state across crtc changes (Maarten
  Lankhorst).
- sparse fixes (Ville+Thierry)
- make legacy kms ioctls all interruptible (Maarten)
- push edid override into the edid helpers (out of probe helpers)
  (Jani)
- DP ESI defines for link status (DK)

Driver Changes:
- drm-panel is now in drm-misc!
- minor panel-simple cleanups/refactoring by various folks
- drm_bridge_add cleanup (Inki Dae)
- constify a few i2c_device_id structs (Arvind Yadav)
- More patches from Noralf's fb/gem helper cleanup
- bridge/synopsis: reset fix (Philippe Cornu)
- fix tracepoint include handling in drivers (Thierry)
- rockchip: lvds support (Sandy Huang)
- move sun4i into drm-misc fold (Maxime Ripard)
- sun4i: refactor driver load + support TCON backend/layer muxing
  (Chen-Yu Tsai)
- pl111: support more pl11x variants (Linus Walleij)
- bridge/adv7511: robustify probing/edid handling (Lars-Petersen
  Clausen)

New hw support:
- S6E63J0X03 panel (Hoegeun Kwon)
- OTM8009A panel (Philippe CORNU)
- Seiko 43WVF1G panel (Marco Franchi)
- tve200 driver (Linus Walleij)

Plus assorted of tiny patches all over, including our first outreachy
patches from applicants for the winter round!

* tag 'drm-misc-next-2017-09-20' of git://anongit.freedesktop.org/git/drm-misc: (101 commits)
  drm: add backwards compatibility support for drm_kms_helper.edid_firmware
  drm: handle override and firmware EDID at drm_do_get_edid() level
  drm/dp: DPCD register defines for link status within ESI field
  drm/rockchip: Replace dev_* with DRM_DEV_*
  drm/tinydrm: Drop driver registered message
  drm/gem-fb-helper: Use debug message on gem lookup failure
  drm/imx: Use drm_gem_fb_create() and drm_gem_fb_prepare_fb()
  drm/bridge: adv7511: Constify HDMI CODEC platform data
  drm/bridge: adv7511: Enable connector polling when no interrupt is specified
  drm/bridge: adv7511: Remove private copy of the EDID
  drm/bridge: adv7511: Properly update EDID when no EDID was found
  drm/crtc: Convert setcrtc ioctl locking to interruptible.
  drm/atomic: Convert pageflip ioctl locking to interruptible.
  drm/legacy: Convert setplane ioctl locking to interruptible.
  drm/legacy: Convert cursor ioctl locking to interruptible.
  drm/atomic: Convert atomic ioctl locking to interruptible.
  drm/atomic: Prepare drm_modeset_lock infrastructure for interruptible waiting, v2.
  drm/tve200: Clean up panel bridging
  drm/doc: Update todo.rst
  drm/dp/mst: Sideband message transaction to power up/down nodes
  ...
2017-09-28 05:46:15 +10:00
Daniel Borkmann
de8f3a83b0 bpf: add meta pointer for direct access
This work enables generic transfer of metadata from XDP into skb. The
basic idea is that we can make use of the fact that the resulting skb
must be linear and already comes with a larger headroom for supporting
bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
for adjusting a new pointer called xdp->data_meta. Thus, the packet has
a flexible and programmable room for meta data, followed by the actual
packet data. struct xdp_buff is therefore laid out that we first point
to data_hard_start, then data_meta directly prepended to data followed
by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
account whether we have meta data already prepended and if so, memmove()s
this along with the given offset provided there's enough room.

xdp->data_meta is optional and programs are not required to use it. The
rationale is that when we process the packet in XDP (e.g. as DoS filter),
we can push further meta data along with it for the XDP_PASS case, and
give the guarantee that a clsact ingress BPF program on the same device
can pick this up for further post-processing. Since we work with skb
there, we can also set skb->mark, skb->priority or other skb meta data
out of BPF, thus having this scratch space generic and programmable
allows for more flexibility than defining a direct 1:1 transfer of
potentially new XDP members into skb (it's also more efficient as we
don't need to initialize/handle each of such new members). The facility
also works together with GRO aggregation. The scratch space at the head
of the packet can be multiple of 4 byte up to 32 byte large. Drivers not
yet supporting xdp->data_meta can simply be set up with xdp->data_meta
as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
such that the subsequent match against xdp->data for later access is
guaranteed to fail.

The verifier treats xdp->data_meta/xdp->data the same way as we treat
xdp->data/xdp->data_end pointer comparisons. The requirement for doing
the compare against xdp->data is that it hasn't been modified from it's
original address we got from ctx access. It may have a range marking
already from prior successful xdp->data/xdp->data_end pointer comparisons
though.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 13:36:44 -07:00
Petar Penkov
90e33d4594 tun: enable napi_gro_frags() for TUN/TAP driver
Add a TUN/TAP receive mode that exercises the napi_gro_frags()
interface. This mode is available only in TAP mode, as the interface
expects packets with Ethernet headers.

Furthermore, packets follow the layout of the iovec_iter that was
received. The first iovec is the linear data, and every one after the
first is a fragment. If there are more fragments than the max number,
drop the packet. Additionally, invoke eth_get_headlen() to exercise flow
dissector code and to verify that the header resides in the linear data.

The napi_gro_frags() mode requires setting the IFF_NAPI_FRAGS option.
This is imposed because this mode is intended for testing via tools like
syzkaller and packetdrill, and the increased flexibility it provides can
introduce security vulnerabilities. This flag is accepted only if the
device is in TAP mode and has the IFF_NAPI flag set as well. This is
done because both of these are explicit requirements for correct
operation in this mode.

Signed-off-by: Petar Penkov <peterpenkov96@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: davem@davemloft.net
Cc: ppenkov@stanford.edu
Acked-by: Mahesh Bandewar <maheshb@google,com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 20:16:13 -07:00
Petar Penkov
943170998b tun: enable NAPI for TUN/TAP driver
Changes TUN driver to use napi_gro_receive() upon receiving packets
rather than netif_rx_ni(). Adds flag IFF_NAPI that enables these
changes and operation is not affected if the flag is disabled.  SKBs
are constructed upon packet arrival and are queued to be processed
later.

The new path was evaluated with a benchmark with the following setup:
Open two tap devices and a receiver thread that reads in a loop for
each device. Start one sender thread and pin all threads to different
CPUs. Send 1M minimum UDP packets to each device and measure sending
time for each of the sending methods:
	napi_gro_receive():	4.90s
	netif_rx_ni():		4.90s
	netif_receive_skb():	7.20s

Signed-off-by: Petar Penkov <peterpenkov96@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: davem@davemloft.net
Cc: ppenkov@stanford.edu
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-25 20:16:13 -07:00
Leon Romanovsky
78b1beb099 IB/core: Fix typo in the name of the tag-matching cap struct
The tag matching functionality is implemented by mlx5 driver
by extending XRQ, however this internal kernel information was
exposed to user space applications with *xrq* name instead of *tm*.

This patch renames *xrq* to *tm* to handle that.

Fixes: 8d50505ada ("IB/uverbs: Expose XRQ capabilities")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-25 11:47:23 -04:00
Mikulas Patocka
62e082430e dm ioctl: fix alignment of event number in the device list
The size of struct dm_name_list is different on 32-bit and 64-bit
kernels (so "(nl + 1)" differs between 32-bit and 64-bit kernels).

This mismatch caused some harmless difference in padding when using 32-bit
or 64-bit kernel. Commit 23d70c5e52 ("dm ioctl: report event number in
DM_LIST_DEVICES") added reporting event number in the output of
DM_LIST_DEVICES_CMD. This difference in padding makes it impossible for
userspace to determine the location of the event number (the location
would be different when running on 32-bit and 64-bit kernels).

Fix the padding by using offsetof(struct dm_name_list, name) instead of
sizeof(struct dm_name_list) to determine the location of entries.

Also, the ioctl version number is incremented to 37 so that userspace
can use the version number to determine that the event number is present
and correctly located.

In addition, a global event is now raised when a DM device is created,
removed, renamed or when table is swapped, so that the user can monitor
for device changes.

Reported-by: Eugene Syromiatnikov <esyr@redhat.com>
Fixes: 23d70c5e52 ("dm ioctl: report event number in DM_LIST_DEVICES")
Cc: stable@vger.kernel.org # 4.13
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-09-25 11:18:29 -04:00
Linus Torvalds
71aa60f67f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix NAPI poll list corruption in enic driver, from Christian
    Lamparter.

 2) Fix route use after free, from Eric Dumazet.

 3) Fix regression in reuseaddr handling, from Josef Bacik.

 4) Assert the size of control messages in compat handling since we copy
    it in from userspace twice. From Meng Xu.

 5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
    from Ursula Braun.

 6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.

 7) Don't use ARRAY_SIZE on spinlock array which might have zero
    entries, from Geert Uytterhoeven.

 8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov
    Kasiviswanathan.

 9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin
    Long.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
  inet: fix improper empty comparison
  net: use inet6_rcv_saddr to compare sockets
  net: set tb->fast_sk_family
  net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
  MAINTAINERS: update git tree locations for ieee802154 subsystem
  net: prevent dst uses after free
  net: phy: Fix truncation of large IRQ numbers in phy_attached_print()
  net/smc: no close wait in case of process shut down
  net/smc: introduce a delay
  net/smc: terminate link group if out-of-sync is received
  net/smc: longer delay for client link group removal
  net/smc: adapt send request completion notification
  net/smc: adjust net_device refcount
  net/smc: take RCU read lock for routing cache lookup
  net/smc: add receive timeout check
  net/smc: add missing dev_put
  net: stmmac: Cocci spatch "of_table"
  lan78xx: Use default values loaded from EEPROM/OTP after reset
  lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
  lan78xx: Fix for eeprom read/write when device auto suspend
  ...
2017-09-23 05:41:27 -10:00
Hans Verkuil
333ef6bd10 media: cec: add CEC_EVENT_PIN_HPD_LOW/HIGH events
Add support for two new low-level events: PIN_HPD_LOW and PIN_HPD_HIGH.

This is specifically meant for use with the upcoming cec-gpio driver
and makes it possible to trace when the HPD pin changes. Some HDMI
sinks do strange things with the HPD and this makes it easy to debug
this.

Note that this also moves the initialization of a devnode mutex and
list to the allocate_adapter function: if the HPD is high, then as
soon as the HPD interrupt is created an interrupt occurs and
cec_queue_pin_hpd_event() is called which requires that the devnode
mutex and list are initialized.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-09-23 07:43:04 -04:00
Linus Torvalds
c0a3a64e72 Major additions:
- sysctl and seccomp operation to discover available actions. (tyhicks)
 - new per-filter configurable logging infrastructure and sysctl. (tyhicks)
 - SECCOMP_RET_LOG to log allowed syscalls. (tyhicks)
 - SECCOMP_RET_KILL_PROCESS as the new strictest possible action.
 - self-tests for new behaviors.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJZxVbTAAoJEIly9N/cbcAmvIAQALR9aVQQXjma4lLhZxwTsLtG
 rJm8t/o4y/2aBV8vzpFbMPT5gfN/PAkHJpCoxVPssx0k4PH2M7HjpnR6E1OC+erg
 RNom3uNdNqZeFlDpdX1qriYiCTB9p6rHe0DPwgG9iGqgDxsJ+G3W+x1sMZ1C+A0M
 shxA3fwt+Qpivo8Zq44xjMFjK+Zeor9V3yPc51QoZktWHlM16ID3HvHVnUtzqAUb
 nTWF6ZlmZlJ/lp4Dq8/55lytVcXPo240G3H0Odai+SNFakK6p5UO//BRBV209bmb
 05jpAOH6uym1sxVz00TQXCtDqOEzs2mQgomtTSShHg8SrLFX7nFkEFtAVA6tEri2
 FqDYce9KX7ZtOYiq83C7pnpAFCouc0z31dQl9USHiAiexXklwBIX+OsVv98omWGi
 pW43uLE2ovY0cpOsN50xI4mnxiGh6MhFcdbor2VLRJwLIFSw3XjjgNCCLyK4AJxs
 N514252qi70c9cWyAHYDLy077yTVxu3JUlsVQKtRTMfoFUq6bX1jPXVXE8qkVrui
 bc/Ay54pPrUwM854IpQ9ZBOuMfs6I5opocGIsBvMaND45U4o2B0ANCsxhuZ0zEtM
 E55DhK5OgjukNemQmlWK2foDckYdtkJXCj2yMBNQady0Uynr2BWZ6VDBP7vFcnRB
 UihRlFZRZleu8383uHsc
 =sKeC
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
 "Major additions:

   - sysctl and seccomp operation to discover available actions
     (tyhicks)

   - new per-filter configurable logging infrastructure and sysctl
     (tyhicks)

   - SECCOMP_RET_LOG to log allowed syscalls (tyhicks)

   - SECCOMP_RET_KILL_PROCESS as the new strictest possible action

   - self-tests for new behaviors"

[ This is the seccomp part of the security pull request during the merge
  window that was nixed due to unrelated problems   - Linus ]

* tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  samples: Unrename SECCOMP_RET_KILL
  selftests/seccomp: Test thread vs process killing
  seccomp: Implement SECCOMP_RET_KILL_PROCESS action
  seccomp: Introduce SECCOMP_RET_KILL_PROCESS
  seccomp: Rename SECCOMP_RET_KILL to SECCOMP_RET_KILL_THREAD
  seccomp: Action to log before allowing
  seccomp: Filter flag to log all actions except SECCOMP_RET_ALLOW
  seccomp: Selftest for detection of filter flag support
  seccomp: Sysctl to configure actions that are allowed to be logged
  seccomp: Operation for checking if an action is available
  seccomp: Sysctl to display available actions
  seccomp: Provide matching filter for introspection
  selftests/seccomp: Refactor RET_ERRNO tests
  selftests/seccomp: Add simple seccomp overhead benchmark
  selftests/seccomp: Add tests for basic ptrace actions
2017-09-22 16:16:41 -10:00
Florian Fainelli
19cab88726 net: ethtool: Add back transceiver type
Commit 3f1ac7a700 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
deprecated the ethtool_cmd::transceiver field, which was fine in
premise, except that the PHY library was actually using it to report the
type of transceiver: internal or external.

Use the first word of the reserved field to put this __u8 transceiver
field back in. It is made read-only, and we don't expect the
ETHTOOL_xLINKSETTINGS API to be doing anything with this anyway, so this
is mostly for the legacy path where we do:

ethtool_get_settings()
-> dev->ethtool_ops->get_link_ksettings()
   -> convert_link_ksettings_to_legacy_settings()

to have no information loss compared to the legacy get_settings API.

Fixes: 3f1ac7a700 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-21 15:20:40 -07:00
Emmanuel Grumbach
65026002d6 nl80211: add an option to allow MFP without requiring it
The user space can now allow the kernel to associate to an AP that
requires MFP or that doesn't have MFP enabled in the same
NL80211_CMD_CONNECT command, by using a new NL80211_MFP_OPTIONAL flag.
The driver / firmware will decide whether to use it or not.

Include a feature bit to advertise support for NL80211_MFP_OPTIONAL.
This allows new user space to run on old kernels and know that it
cannot use the new attribute if it isn't supported.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-09-21 11:42:03 +02:00
Roee Zamir
2d23d0736e nl80211: add OCE scan and capability flags
Add Optimized Connectivity Experience (OCE) scan and capability flags.
Some of them unique to OCE and some are stand alone.
And add scan flags to enable/disable them.

Signed-off-by: Roee Zamir <roee.zamir@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-09-21 11:41:59 +02:00
Greg Kroah-Hartman
bd7a3fe770 USB: fix out-of-bounds in usb_set_configuration
Andrey Konovalov reported a possible out-of-bounds problem for a USB interface
association descriptor.  He writes:
	It seems there's no proper size check of a USB_DT_INTERFACE_ASSOCIATION
	descriptor. It's only checked that the size is >= 2 in
	usb_parse_configuration(), so find_iad() might do out-of-bounds access
	to intf_assoc->bInterfaceCount.

And he's right, we don't check for crazy descriptors of this type very well, so
resolve this problem.  Yet another issue found by syzkaller...

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-19 17:27:16 +02:00
Lionel Landwerlin
ee427e2595 uapi/drm/i915: document field usage of drm_i915_perf_oa_config
Document the expected length of buffers config pointers (tuple of u32
values).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170918114241.30105-1-lionel.g.landwerlin@intel.com
2017-09-18 14:46:21 +01:00
Dave Airlie
134dd2e616 Merge tag 'drm-amdkfd-next-2017-09-02' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes
some trivial amdkfd cleanups

* tag 'drm-amdkfd-next-2017-09-02' of git://people.freedesktop.org/~gabbayo/linux:
  drm/amdkfd: pass queue's mqd when destroying mqd
  drm/amdkfd: remove memset before memcpy
  uapi linux/kfd_ioctl.h: only use __u32 and __u64
2017-09-18 16:29:47 +10:00
Linus Torvalds
e7cdb60fd2 Merge branch 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull zstd support from Chris Mason:
 "Nick Terrell's patch series to add zstd support to the kernel has been
  floating around for a while. After talking with Dave Sterba, Herbert
  and Phillip, we decided to send the whole thing in as one pull
  request.

  zstd is a big win in speed over zlib and in compression ratio over
  lzo, and the compression team here at FB has gotten great results
  using it in production. Nick will continue to update the kernel side
  with new improvements from the open source zstd userland code.

  Nick has a number of benchmarks for the main zstd code in his lib/zstd
  commit:

      I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB
      of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel
      Core i7 processor, 16 GB of RAM, and a SSD. I benchmarked using
      `silesia.tar` [3], which is 211,988,480 B large. Run the following
      commands for the benchmark:

        sudo modprobe zstd_compress_test
        sudo mknod zstd_compress_test c 245 0
        sudo cp silesia.tar zstd_compress_test

      The time is reported by the time of the userland `cp`.
      The MB/s is computed with

        1,536,217,008 B / time(buffer size, hash)

      which includes the time to copy from userland.
      The Adjusted MB/s is computed with

        1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).

      The memory reported is the amount of memory the compressor
      requests.

        | Method   | Size (B) | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
        |----------|----------|----------|-------|---------|----------|----------|
        | none     | 11988480 |    0.100 |     1 | 2119.88 |        - |        - |
        | zstd -1  | 73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
        | zstd -3  | 66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
        | zstd -5  | 65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
        | zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
        | zstd -15 | 58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
        | zstd -19 | 54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
        | zlib -1  | 77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
        | zlib -3  | 72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
        | zlib -6  | 68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
        | zlib -9  | 67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |

      I benchmarked zstd decompression using the same method on the same
      machine. The benchmark file is located in the upstream zstd repo
      under `contrib/linux-kernel/zstd_decompress_test.c` [4]. The
      memory reported is the amount of memory required to decompress
      data compressed with the given compression level. If you know the
      maximum size of your input, you can reduce the memory usage of
      decompression irrespective of the compression level.

        | Method   | Time (s) | MB/s    | Adjusted MB/s | Memory (MB) |
        |----------|----------|---------|---------------|-------------|
        | none     |    0.025 | 8479.54 |             - |           - |
        | zstd -1  |    0.358 |  592.15 |        636.60 |        0.84 |
        | zstd -3  |    0.396 |  535.32 |        571.40 |        1.46 |
        | zstd -5  |    0.396 |  535.32 |        571.40 |        1.46 |
        | zstd -10 |    0.374 |  566.81 |        607.42 |        2.51 |
        | zstd -15 |    0.379 |  559.34 |        598.84 |        4.61 |
        | zstd -19 |    0.412 |  514.54 |        547.77 |        8.80 |
        | zlib -1  |    0.940 |  225.52 |        231.68 |        0.04 |
        | zlib -3  |    0.883 |  240.08 |        247.07 |        0.04 |
        | zlib -6  |    0.844 |  251.17 |        258.84 |        0.04 |
        | zlib -9  |    0.837 |  253.27 |        287.64 |        0.04 |

  I ran a long series of tests and benchmarks on the btrfs side and the
  gains are very similar to the core benchmarks Nick ran"

* 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  squashfs: Add zstd support
  btrfs: Add zstd support
  lib: Add zstd modules
  lib: Add xxhash module
2017-09-14 17:30:49 -07:00
Joonas Lahtinen
3fd3a6ffe2 drm/i915: Simplify i915_reg_read_ioctl
Convert to use the freshly available made INTEL_GEN_MASK for easier
grepping and improve function readability and clarify the UABI
documentation.

No functional changes.

v2:
- Lift GEM_BUG_ONs and use is_power_of_2 (Chris)
- Retain -EINVAL on bad flags behavior (Chris)

v3:
- Extract flags with 'entry->size - 1' (Chris)

v4:
- Add GEM_BUG_ON on for flags vs entry offset (Chris)

v5:
- Use 'u16' to match 'dev_priv' (Ville)

v6:
- Fix checkpatch.pl errors

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170913115255.13851-2-joonas.lahtinen@linux.intel.com
2017-09-14 11:15:52 +03:00
Baruch Siach
a2b4a79b88 spi: uapi: spidev: add missing ioctl header
The SPI_IOC_MESSAGE() macro references _IOC_SIZEBITS. Add linux/ioctl.h
to make sure this macro is defined. This fixes the following build
failure of lcdproc with the musl libc:

In file included from .../sysroot/usr/include/sys/ioctl.h:7:0,
                 from hd44780-spi.c:31:
hd44780-spi.c: In function 'spi_transfer':
hd44780-spi.c:89:24: error: '_IOC_SIZEBITS' undeclared (first use in this function)
  status = ioctl(p->fd, SPI_IOC_MESSAGE(1), &xfer);
                        ^

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
2017-09-13 09:43:11 -07:00
Linus Torvalds
dd198ce714 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "Life has been busy and I have not gotten half as much done this round
  as I would have liked. I delayed it so that a minor conflict
  resolution with the mips tree could spend a little time in linux-next
  before I sent this pull request.

  This includes two long delayed user namespace changes from Kirill
  Tkhai. It also includes a very useful change from Serge Hallyn that
  allows the security capability attribute to be used inside of user
  namespaces. The practical effect of this is people can now untar
  tarballs and install rpms in user namespaces. It had been suggested to
  generalize this and encode some of the namespace information
  information in the xattr name. Upon close inspection that makes the
  things that should be hard easy and the things that should be easy
  more expensive.

  Then there is my bugfix/cleanup for signal injection that removes the
  magic encoding of the siginfo union member from the kernel internal
  si_code. The mips folks reported the case where I had used FPE_FIXME
  me is impossible so I have remove FPE_FIXME from mips, while at the
  same time including a return statement in that case to keep gcc from
  complaining about unitialized variables.

  I almost finished the work to get make copy_siginfo_to_user a trivial
  copy to user. The code is available at:

     git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git neuter-copy_siginfo_to_user-v3

  But I did not have time/energy to get the code posted and reviewed
  before the merge window opened.

  I was able to see that the security excuse for just copying fields
  that we know are initialized doesn't work in practice there are buggy
  initializations that don't initialize the proper fields in siginfo. So
  we still sometimes copy unitialized data to userspace"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  Introduce v3 namespaced file capabilities
  mips/signal: In force_fcr31_sig return in the impossible case
  signal: Remove kernel interal si_code magic
  fcntl: Don't use ambiguous SIG_POLL si_codes
  prctl: Allow local CAP_SYS_ADMIN changing exe_file
  security: Use user_namespace::level to avoid redundant iterations in cap_capable()
  userns,pidns: Verify the userns for new pid namespaces
  signal/testing: Don't look for __SI_FAULT in userspace
  signal/mips: Document a conflict with SI_USER with SIGFPE
  signal/sparc: Document a conflict with SI_USER with SIGFPE
  signal/ia64: Document a conflict with SI_USER with SIGFPE
  signal/alpha: Document a conflict with SI_USER for SIGTRAP
2017-09-11 18:34:47 -07:00
Linus Torvalds
ae46654bcf ARM: SoC driver updates for v4.14
This branch contains platform-related driver updates for ARM and ARM64.
 
 Among them:
 
  - Reset driver updates:
   + New API for dealing with arrays of resets
   + Make unimplemented {de,}assert return success on shared resets
   + MSDKv1 driver
   + Removal of obsolete Gemini reset driver
   + Misc updates for sunxi and Uniphier
 
  - SoC drivers:
   + Platform SoC driver registration on Tegra
   + Shuffle of Qualcomm drivers into a submenu
   + Allwinner A64 support for SRAM
   + Renesas R-Car R3 support
   + Power domains for Rockchip RK3366
 
  - Misc updates and smaller fixes for TEE and memory driver subsystems
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZtdt7AAoJEIwa5zzehBx3EboP/jR2T9lrMavXR1zL48L14yJb
 S+fiJlrX1Kr42UF4PQvsfs+uTqOLmycrPVFkMb6IwoUPzQ9UCOSoiMzYm2b7ZPvt
 uIesHhdM3/xun6wKfieN8GmNA1yDVynTxo0TTYDw5ha7I6s2GHgw0GSFzy3wm0Qg
 KzerAO3gzf3L5XsKR0cai3IXNjHO9ubpFG1ReR09da28nPElP8ggWg0KnqdO76Ch
 BGpFj78EC875ZNqwHgnspUqgGDJnBjig3m/uA4FWA0G9Jl38tCyKTZfUR7cEraoV
 kyCgBlR/UrI8eXVTyEy5k5iTsQ3A1VhX4rGjyH+5NZHTs1yWr4+RDND/qeGl9tSo
 VASuOtH6Rc3vdUDpHPBNAFNQH8fwwDoKf96dvN1tiffsx6LSKb//NyOfkXzKOtR6
 CP5raYfX4YktLtHq0XVTZ/6r3XmLcTHzElR/dCFpQOFcTOYii0pWtfcWouahbZ1w
 dhoBX/dbNq37MfzrxtHN2VTIEHpn2GU7u+ZGkp2ArokD58BAft/M3Xee1cDnF75g
 ZDwe5eNFT8aBZKaY7zwG8cdxiw9kACAivDRwW+zgpfUr39c+d0+QmVfnfJh4EGXK
 Ri6yr2EfBWK6jw3cwkdSyyt7iSzIkB+RiuuD1MjpYhWzAvoDpzqkXYukFGbpXnuy
 vUFHNuP1ocUsRtCs8mm1
 =yBzS
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Olof Johansson:
 "This branch contains platform-related driver updates for ARM and ARM64.

  Among them:

   - Reset driver updates:
     + New API for dealing with arrays of resets
     + Make unimplemented {de,}assert return success on shared resets
     + MSDKv1 driver
     + Removal of obsolete Gemini reset driver
     + Misc updates for sunxi and Uniphier

   - SoC drivers:
     + Platform SoC driver registration on Tegra
     + Shuffle of Qualcomm drivers into a submenu
     + Allwinner A64 support for SRAM
     + Renesas R-Car R3 support
     + Power domains for Rockchip RK3366

   - Misc updates and smaller fixes for TEE and memory driver
     subsystems"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (54 commits)
  firmware: arm_scpi: fix endianness of dev_id in struct dev_pstate_set
  soc/tegra: fuse: Add missing semi-colon
  soc/tegra: Restrict SoC device registration to Tegra
  drivers: soc: sunxi: add support for A64 and its SRAM C
  drivers: soc: sunxi: add support for remapping func value to reg value
  drivers: soc: sunxi: fix error processing on base address when claiming
  dt-bindings: add binding for Allwinner A64 SRAM controller and SRAM C
  bus: sunxi-rsb: Enable by default for ARM64
  soc/tegra: Register SoC device
  firmware: tegra: set drvdata earlier
  memory: Convert to using %pOF instead of full_name
  soc: Convert to using %pOF instead of full_name
  bus: Convert to using %pOF instead of full_name
  firmware: Convert to using %pOF instead of full_name
  soc: mediatek: add SCPSYS power domain driver for MediaTek MT7622 SoC
  soc: mediatek: add header files required for MT7622 SCPSYS dt-binding
  soc: mediatek: reduce code duplication of scpsys_probe across all SoCs
  dt-bindings: soc: update the binding document for SCPSYS on MediaTek MT7622 SoC
  reset: uniphier: add analog amplifiers reset control
  reset: uniphier: add video input subsystem reset control
  ...
2017-09-10 20:40:00 -07:00
Linus Torvalds
126e76ffbf Merge branch 'for-4.14/block-postmerge' of git://git.kernel.dk/linux-block
Pull followup block layer updates from Jens Axboe:
 "I ended up splitting the main pull request for this series into two,
  mainly because of clashes between NVMe fixes that went into 4.13 after
  the for-4.14 branches were split off. This pull request is mostly
  NVMe, but not exclusively. In detail, it contains:

   - Two pull request for NVMe changes from Christoph. Nothing new on
     the feature front, basically just fixes all over the map for the
     core bits, transport, rdma, etc.

   - Series from Bart, cleaning up various bits in the BFQ scheduler.

   - Series of bcache fixes, which has been lingering for a release or
     two. Coly sent this in, but patches from various people in this
     area.

   - Set of patches for BFQ from Paolo himself, updating both
     documentation and fixing some corner cases in performance.

   - Series from Omar, attempting to now get the 4k loop support
     correct. Our confidence level is higher this time.

   - Series from Shaohua for loop as well, improving O_DIRECT
     performance and fixing a use-after-free"

* 'for-4.14/block-postmerge' of git://git.kernel.dk/linux-block: (74 commits)
  bcache: initialize dirty stripes in flash_dev_run()
  loop: set physical block size to logical block size
  bcache: fix bch_hprint crash and improve output
  bcache: Update continue_at() documentation
  bcache: silence static checker warning
  bcache: fix for gc and write-back race
  bcache: increase the number of open buckets
  bcache: Correct return value for sysfs attach errors
  bcache: correct cache_dirty_target in __update_writeback_rate()
  bcache: gc does not work when triggering by manual command
  bcache: Don't reinvent the wheel but use existing llist API
  bcache: do not subtract sectors_to_gc for bypassed IO
  bcache: fix sequential large write IO bypass
  bcache: Fix leak of bdev reference
  block/loop: remove unused field
  block/loop: fix use after free
  bfq: Use icq_to_bic() consistently
  bfq: Suppress compiler warnings about comparisons
  bfq: Check kstrtoul() return value
  bfq: Declare local functions static
  ...
2017-09-09 12:49:01 -07:00
Linus Torvalds
fbd01410e8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "The iwlwifi firmware compat fix is in here as well as some other
  stuff:

  1) Fix request socket leak introduced by BPF deadlock fix, from Eric
     Dumazet.

  2) Fix VLAN handling with TXQs in mac80211, from Johannes Berg.

  3) Missing __qdisc_drop conversions in prio and qfq schedulers, from
     Gao Feng.

  4) Use after free in netlink nlk groups handling, from Xin Long.

  5) Handle MTU update properly in ipv6 gre tunnels, from Xin Long.

  6) Fix leak of ipv6 fib tables on netns teardown, from Sabrina Dubroca
     with follow-on fix from Eric Dumazet.

  7) Need RCU and preemption disabled during generic XDP data patch,
     from John Fastabend"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
  bpf: make error reporting in bpf_warn_invalid_xdp_action more clear
  Revert "mdio_bus: Remove unneeded gpiod NULL check"
  bpf: devmap, use cond_resched instead of cpu_relax
  bpf: add support for sockmap detach programs
  net: rcu lock and preempt disable missing around generic xdp
  bpf: don't select potentially stale ri->map from buggy xdp progs
  net: tulip: Constify tulip_tbl
  net: ethernet: ti: netcp_core: no need in netif_napi_del
  davicom: Display proper debug level up to 6
  net: phy: sfp: rename dt properties to match the binding
  dt-binding: net: sfp binding documentation
  dt-bindings: add SFF vendor prefix
  dt-bindings: net: don't confuse with generic PHY property
  ip6_tunnel: fix setting hop_limit value for ipv6 tunnel
  ip_tunnel: fix setting ttl and tos value in collect_md mode
  ipv6: fix typo in fib6_net_exit()
  tcp: fix a request socket leak
  sctp: fix missing wake ups in some situations
  netfilter: xt_hashlimit: fix build error caused by 64bit division
  netfilter: xt_hashlimit: alloc hashtable with right size
  ...
2017-09-09 11:05:20 -07:00
Linus Torvalds
fbf4432ff7 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - most of the rest of MM

 - a small number of misc things

 - lib/ updates

 - checkpatch

 - autofs updates

 - ipc/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (126 commits)
  ipc: optimize semget/shmget/msgget for lots of keys
  ipc/sem: play nicer with large nsops allocations
  ipc/sem: drop sem_checkid helper
  ipc: convert kern_ipc_perm.refcount from atomic_t to refcount_t
  ipc: convert sem_undo_list.refcnt from atomic_t to refcount_t
  ipc: convert ipc_namespace.count from atomic_t to refcount_t
  kcov: support compat processes
  sh: defconfig: cleanup from old Kconfig options
  mn10300: defconfig: cleanup from old Kconfig options
  m32r: defconfig: cleanup from old Kconfig options
  drivers/pps: use surrounding "if PPS" to remove numerous dependency checks
  drivers/pps: aesthetic tweaks to PPS-related content
  cpumask: make cpumask_next() out-of-line
  kmod: move #ifdef CONFIG_MODULES wrapper to Makefile
  kmod: split off umh headers into its own file
  MAINTAINERS: clarify kmod is just a kernel module loader
  kmod: split out umh code into its own file
  test_kmod: flip INT checks to be consistent
  test_kmod: remove paranoid UINT_MAX check on uint range processing
  vfat: deduplicate hex2bin()
  ...
2017-09-09 10:30:07 -07:00
Daniel Borkmann
9beb8bedb0 bpf: make error reporting in bpf_warn_invalid_xdp_action more clear
Differ between illegal XDP action code and just driver
unsupported one to provide better feedback when we throw
a one-time warning here. Reason is that with 814abfabef
("xdp: add bpf_redirect helper function") not all drivers
support the new XDP return code yet and thus they will
fall into their 'default' case when checking for return
codes after program return, which then triggers a
bpf_warn_invalid_xdp_action() stating that the return
code is illegal, but from XDP perspective it's not.

I decided not to place something like a XDP_ACT_MAX define
into uapi i) given we don't have this either for all other
program types, ii) future action codes could have further
encoding there, which would render such define unsuitable
and we wouldn't be able to rip it out again, and iii) we
rarely add new action codes.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-08 21:13:09 -07:00
Robert P. J. Day
a2d8180301 drivers/pps: aesthetic tweaks to PPS-related content
Collection of aesthetic adjustments to various PPS-related files,
directories and Documentation, some quite minor just for the sake of
consistency, including:

 * Updated example of pps device tree node (courtesy Rodolfo G.)
 * "PPS-API" -> "PPS API"
 * "pps_source_info_s" -> "pps_source_info"
 * "ktimer driver" -> "pps-ktimer driver"
 * "ppstest /dev/pps0" -> "ppstest /dev/pps1" to match example
 * Add missing PPS-related entries to MAINTAINERS file
 * Other trivialities

Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1708261048220.8106@localhost.localdomain
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:51 -07:00
Tomohiro Kusumi
1f28c5d055 autofs: remove unused AUTOFS_IOC_EXPIRE_DIRECT/INDIRECT
These are not used by either kernel or userspace, although
AUTOFS_IOC_EXPIRE_DIRECT once seems to have been used by userspace in
around 2006-2008, which was technically just an alias of the existing
ioctl AUTOFS_IOC_EXPIRE_MULTI.

ioctls for autofs are already complicated enough that they could be
removed unless these are staying here to be able to compile userspace code
of certain period of time from a decade ago.

Edit: raven@themaw.net
Yes, this is indeed very old and anything that still uses must
be updated becuase it will be using broken functionality.
End edit: raven@themaw.net

Link: http://lkml.kernel.org/r/150285067347.4670.11494624644273072003.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:50 -07:00
Ian Kent
3dd8f7c3b7 autofs: make dev ioctl version and ismountpoint user accessible
Some of the autofs miscellaneous device ioctls need to be accessable to
user space applications without CAP_SYS_ADMIN to get information about
autofs mounts.

Link: http://lkml.kernel.org/r/150216642517.11652.2338933266137331637.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Colin Walters <walters@redhat.com>
Cc: Ondrej Holy <oholy@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 18:26:50 -07:00
Linus Torvalds
0d519f2d1e pci-v4.14-changes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJZsr8cAAoJEFmIoMA60/r8lXYQAKViYIRMJDD4n3NhjMeLOsnJ
 vwaBmWlLRjSFIEpag5kMjS1RJE17qAvmkBZnDvSNZ6cT28INkkZnVM2IW96WECVq
 64MIvDijVPcvqGuWePCfWdDiSXApiDWwJuw55BOhmvV996wGy0gYgzpPY+1g0Knh
 XzH9IOzDL79hZleLfsxX0MLV6FGBVtOsr0jvQ04k4IgEMIxEDTlbw85rnrvzQUtc
 0Vj2koaxWIESZsq7G/wiZb2n6ekaFdXO/VlVvvhmTSDLCBaJ63Hb/gfOhwMuVkS6
 B3cVprNrCT0dSzWmU4ZXf+wpOyDpBexlemW/OR/6CQUkC6AUS6kQ5si1X44dbGmJ
 nBPh414tdlm/6V4h/A3UFPOajSGa/ZWZ/uQZPfvKs1R6WfjUerWVBfUpAzPbgjam
 c/mhJ19HYT1J7vFBfhekBMeY2Px3JgSJ9rNsrFl48ynAALaX5GEwdpo4aqBfscKz
 4/f9fU4ysumopvCEuKD2SsJvsPKd5gMQGGtvAhXM1TxvAoQ5V4cc99qEetAPXXPf
 h2EqWm4ph7YP4a+n/OZBjzluHCmZJn1CntH5+//6wpUk6HnmzsftGELuO9n12cLE
 GGkreI3T9ctV1eOkzVVa0l0QTE1X/VLyEyKCtb9obXsDaG4Ud7uKQoZgB19DwyTJ
 EG76ridTolUFVV+wzJD9
 =9cLP
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:

 - add enhanced Downstream Port Containment support, which prints more
   details about Root Port Programmed I/O errors (Dongdong Liu)

 - add Layerscape ls1088a and ls2088a support (Hou Zhiqiang)

 - add MediaTek MT2712 and MT7622 support (Ryder Lee)

 - add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang)

 - add Qualcom IPQ8074 support (Varadarajan Narayanan)

 - add R-Car r8a7743/5 device tree support (Biju Das)

 - add Rockchip per-lane PHY support for better power management (Shawn
   Lin)

 - fix IRQ mapping for hot-added devices by replacing the
   pci_fixup_irqs() boot-time design with a host bridge hook called at
   probe-time (Lorenzo Pieralisi, Matthew Minter)

 - fix race when enabling two devices that results in upstream bridge
   not being enabled correctly (Srinath Mannam)

 - fix pciehp power fault infinite loop (Keith Busch)

 - fix SHPC bridge MSI hotplug events by enabling bus mastering
   (Aleksandr Bezzubikov)

 - fix a VFIO issue by correcting PCIe capability sizes (Alex
   Williamson)

 - fix an INTD issue on Xilinx and possibly other drivers by unifying
   INTx IRQ domain support (Paul Burton)

 - avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg
   Roedel)

 - allow APM X-Gene device assignment to guests by adding an ACS quirk
   (Feng Kan)

 - fix driver crashes by disabling Extended Tags on Broadcom HT2100
   (Extended Tags support is required for PCIe Receivers but not
   Requesters, and we now enable them by default when Requesters support
   them) (Sinan Kaya)

 - fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs
   use the real Requester ID (not a phantom RID) (Robin Murphy)

 - prevent assignment of Intel VMD children to guests (which may be
   supported eventually, but isn't yet) by not associating an IOMMU with
   them (Jon Derrick)

 - fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott
   Bauer)

 - fix a Function-Level Reset issue with Intel 750 NVMe by waiting
   longer (up to 60sec instead of 1sec) for device to become ready
   (Sinan Kaya)

 - fix a Function-Level Reset issue on iProc Stingray by working around
   hardware defects in the CRS implementation (Oza Pawandeep)

 - fix an issue with Intel NVMe P3700 after an iProc reset by adding a
   delay during shutdown (Oza Pawandeep)

 - fix a Microsoft Hyper-V lockdep issue by polling instead of blocking
   in compose_msi_msg() (Stephen Hemminger)

 - fix a wireless LAN driver timeout by clearing DesignWare MSI
   interrupt status after it is handled, not before (Faiz Abbas)

 - fix DesignWare ATU enable checking (Jisheng Zhang)

 - reduce Layerscape dependencies on the bootloader by doing more
   initialization in the driver (Hou Zhiqiang)

 - improve Intel VMD performance allowing allocation of more IRQ vectors
   than present CPUs (Keith Busch)

 - improve endpoint framework support for initial DMA mask, different
   BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon
   Vijay Abraham I, Stan Drozd)

 - rework CRS support to add periodic messages while we poll during
   enumeration and after Function-Level Reset and prepare for possible
   other uses of CRS (Sinan Kaya)

 - clean up Root Port AER handling by removing unnecessary code and
   moving error handler methods to struct pcie_port_service_driver
   (Christoph Hellwig)

 - clean up error handling paths in various drivers (Bjorn Andersson,
   Fabio Estevam, Gustavo A. R. Silva, Harunobu Kurokawa, Jeffy Chen,
   Lorenzo Pieralisi, Sergei Shtylyov)

 - clean up SR-IOV resource handling by disabling VF decoding before
   updating the corresponding resource structs (Gavin Shan)

 - clean up DesignWare-based drivers by unifying quirks to update Class
   Code and Interrupt Pin and related handling of write-protected
   registers (Hou Zhiqiang)

 - clean up by adding empty generic pcibios_align_resource() and
   pcibios_fixup_bus() and removing empty arch-specific implementations
   (Palmer Dabbelt)

 - request exclusive reset control for several drivers to allow cleanup
   elsewhere (Philipp Zabel)

 - constify various structures (Arvind Yadav, Bhumika Goyal)

 - convert from full_name() to %pOF (Rob Herring)

 - remove unused variables from iProc, HiSi, Altera, Keystone (Shawn
   Lin)

* tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits)
  PCI: xgene: Clean up whitespace
  PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset
  PCI: xgene: Fix platform_get_irq() error handling
  PCI: xilinx-nwl: Fix platform_get_irq() error handling
  PCI: rockchip: Fix platform_get_irq() error handling
  PCI: altera: Fix platform_get_irq() error handling
  PCI: spear13xx: Fix platform_get_irq() error handling
  PCI: artpec6: Fix platform_get_irq() error handling
  PCI: armada8k: Fix platform_get_irq() error handling
  PCI: dra7xx: Fix platform_get_irq() error handling
  PCI: exynos: Fix platform_get_irq() error handling
  PCI: iproc: Clean up whitespace
  PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP
  PCI: iproc: Add 500ms delay during device shutdown
  PCI: Fix typos and whitespace errors
  PCI: Remove unused "res" variable from pci_resource_io()
  PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag()
  PCI/AER: Reformat AER register definitions
  iommu/vt-d: Prevent VMD child devices from being remapping targets
  x86/PCI: Use is_vmd() rather than relying on the domain number
  ...
2017-09-08 15:47:43 -07:00
Linus Torvalds
0756b7fbb6 First batch of KVM changes for 4.14
Common:
  - improve heuristic for boosting preempted spinlocks by ignoring VCPUs
    in user mode
 
 ARM:
  - fix for decoding external abort types from guests
 
  - added support for migrating the active priority of interrupts when
    running a GICv2 guest on a GICv3 host
 
  - minor cleanup
 
 PPC:
  - expose storage keys to userspace
 
  - merge powerpc/topic/ppc-kvm branch that contains
    find_linux_pte_or_hugepte and POWER9 thread management cleanup
 
  - merge kvm-ppc-fixes with a fix that missed 4.13 because of vacations
 
  - fixes
 
 s390:
  - merge of topic branch tlb-flushing from the s390 tree to get the
    no-dat base features
 
  - merge of kvm/master to avoid conflicts with additional sthyi fixes
 
  - wire up the no-dat enhancements in KVM
 
  - multiple epoch facility (z14 feature)
 
  - Configuration z/Architecture Mode
 
  - more sthyi fixes
 
  - gdb server range checking fix
 
  - small code cleanups
 
 x86:
  - emulate Hyper-V TSC frequency MSRs
 
  - add nested INVPCID
 
  - emulate EPTP switching VMFUNC
 
  - support Virtual GIF
 
  - support 5 level page tables
 
  - speedup nested VM exits by packing byte operations
 
  - speedup MMIO by using hardware provided physical address
 
  - a lot of fixes and cleanups, especially nested
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJZspE1AAoJEED/6hsPKofoDcMIALT11n+LKV50QGwQdg2W1GOt
 aChbgnj/Kegit3hQlDhVNb8kmdZEOZzSL81Lh0VPEr7zXU8QiWn2snbizDPv8sde
 MpHhcZYZZ0YrpoiZKjl8yiwcu88OWGn2qtJ7OpuTS5hvEGAfxMncp0AMZho6fnz/
 ySTwJ9GK2MTgBw39OAzCeDOeoYn4NKYMwjJGqBXRhNX8PG/1wmfqv0vPrd6wfg31
 KJ58BumavwJjr8YbQ1xELm9rpQrAmaayIsG0R1dEUqCbt5a1+t2gt4h2uY7tWcIv
 ACt2bIze7eF3xA+OpRs+eT+yemiH3t9btIVmhCfzUpnQ+V5Z55VMSwASLtTuJRQ=
 =R8Ry
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.14-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "First batch of KVM changes for 4.14

  Common:
   - improve heuristic for boosting preempted spinlocks by ignoring
     VCPUs in user mode

  ARM:
   - fix for decoding external abort types from guests

   - added support for migrating the active priority of interrupts when
     running a GICv2 guest on a GICv3 host

   - minor cleanup

  PPC:
   - expose storage keys to userspace

   - merge kvm-ppc-fixes with a fix that missed 4.13 because of
     vacations

   - fixes

  s390:
   - merge of kvm/master to avoid conflicts with additional sthyi fixes

   - wire up the no-dat enhancements in KVM

   - multiple epoch facility (z14 feature)

   - Configuration z/Architecture Mode

   - more sthyi fixes

   - gdb server range checking fix

   - small code cleanups

  x86:
   - emulate Hyper-V TSC frequency MSRs

   - add nested INVPCID

   - emulate EPTP switching VMFUNC

   - support Virtual GIF

   - support 5 level page tables

   - speedup nested VM exits by packing byte operations

   - speedup MMIO by using hardware provided physical address

   - a lot of fixes and cleanups, especially nested"

* tag 'kvm-4.14-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
  KVM: arm/arm64: Support uaccess of GICC_APRn
  KVM: arm/arm64: Extract GICv3 max APRn index calculation
  KVM: arm/arm64: vITS: Drop its_ite->lpi field
  KVM: arm/arm64: vgic: constify seq_operations and file_operations
  KVM: arm/arm64: Fix guest external abort matching
  KVM: PPC: Book3S HV: Fix memory leak in kvm_vm_ioctl_get_htab_fd
  KVM: s390: vsie: cleanup mcck reinjection
  KVM: s390: use WARN_ON_ONCE only for checking
  KVM: s390: guestdbg: fix range check
  KVM: PPC: Book3S HV: Report storage key support to userspace
  KVM: PPC: Book3S HV: Fix case where HDEC is treated as 32-bit on POWER9
  KVM: PPC: Book3S HV: Fix invalid use of register expression
  KVM: PPC: Book3S HV: Fix H_REGISTER_VPA VPA size validation
  KVM: PPC: Book3S HV: Fix setting of storage key in H_ENTER
  KVM: PPC: e500mc: Fix a NULL dereference
  KVM: PPC: e500: Fix some NULL dereferences on error
  KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list
  KVM: s390: we are always in czam mode
  KVM: s390: expose no-DAT to guest and migration support
  KVM: s390: sthyi: remove invalid guest write access
  ...
2017-09-08 15:18:36 -07:00
Radim Krčmář
5f54c8b2d4 Merge branch 'kvm-ppc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
This fix was intended for 4.13, but didn't get in because both
maintainers were on vacation.

Paul Mackerras:
 "It adds mutual exclusion between list_add_rcu and list_del_rcu calls
  on the kvm->arch.spapr_tce_tables list.  Without this, userspace could
  potentially trigger corruption of the list and cause a host crash or
  worse."
2017-09-08 14:40:43 +02:00
Linus Torvalds
460352c2f1 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF, reiserfs, quota, fsnotify cleanups from Jan Kara:
 "Several UDF, reiserfs, quota and fsnotify cleanups.

  Note that there is also a patch updating MAINTAINERS entry for
  notification subsystem to point to me as a maintainer since current
  entries are stale"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: make dnotify_fsnotify_ops const
  isofs: Delete an unnecessary variable initialisation in isofs_read_inode()
  isofs: Adjust four checks for null pointers
  isofs: Delete an error message for a failed memory allocation in isofs_read_inode()
  quota_v2: Delete an error message for a failed memory allocation in v2_read_file_info()
  fs-udf: Delete an error message for a failed memory allocation in two functions
  fs-udf: Improve six size determinations
  fs-udf: Adjust two checks for null pointers
  reiserfs: fix spelling mistake: "tranasction" -> "transaction"
  MAINTAINERS: Update entries for notification subsystem
  uapi/linux/quota.h: Do not include linux/errno.h
2017-09-07 14:53:17 -07:00
Linus Torvalds
c0da4fa0d1 media updates for v4.14-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZsSBoAAoJEAhfPr2O5OEVDc4QAJZSuVYmyLgvtmPxhyqgCvkz
 I0DmWM4ZtK2VT/xJ/AA23z8IiLKi2+pDC0Xx6/aIiA665cyl3oPUdkKIaHW9Z6+A
 fV8gSFkmGkluQb9mP/KdHYI2oSeEv2ivCa1kfaApYcoBa904z8uU++z15Iu5p/+m
 fjpc2vnc9rax0Vuwmgv7p1CL4j4e/ja0siCSCGbu2ad50KqP4ytnBooNPQOQt89D
 L+Av5MeGml/CTUUnAFjWfSmQ72Ht8GhoBBKc6wGoq9x3GTckDDTqy8BAqGt4UQnu
 fR0mb71zuSVmTjxRe7tc/74m3ReaeSHzQeHJhjdQslvNmV3RVQgk/6CCsmqNEegr
 rbC3glQCM+gp5YywCjRL6DCPsoqvjexLtPQjMZIGYxgSYQUyXGOxilgmj9+73761
 6aOl0nqdgN+vlWzaSeDF9EQxRsc+cCq/Po8/xuPE/Pzs6zTQwU+6b+ADLf9jCyDP
 LTC49wOj24SoWiTlG1FTct2ogZ3h5wNPWlurBtmyiFJn+43RpsH5IW9wLilCjeiE
 6JeCWEIBglCCq/TVCzETKNSaixDL6/lMQ9uRdCpIO4VLyoS6S9pZASNPBmQ1h7h/
 oTjYDeWirIthNOccstbBoJQYSX62CqAIW3wq5ME6PAgM+ioiLXLYk0fV3yBKoBNW
 Z0SBeTcuPxWmfzuxMtik
 =fNM2
 -----END PGP SIGNATURE-----

Merge tag 'media/v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:
 "Brazil's Independence Day pull request :-)

  This is one of the biggest media pull requests, with 625 patches
  affecting almost all parts of media (RC, DVB, V4L2, CEC, docs).

  This contains:

   - A lot of new drivers:
     * DVB frontends: mxl5xx, stv0910, stv6111;
     * camera flash: as3645a led driver;
     * HDMI receiver: adv748X;
     * camera sensor: Omnivision 6650 5M driver (ov6650);
     * HDMI CEC: ao-cec meson driver;
     * V4L2: Qualcom camss driver;
     * Remote controller: gpio-ir-tx, pwm-ir-tx and zx-irdec drivers.

   - The DDbridge DVB driver got a massive update, with makes it in sync
     with modern hardware from that vendor;

   - There's an important milestone on this series: the DVB
     documentation was written in 2003, but only started to be updated
     in 2007. It also used to contain several gaps from the time it was
     kept out of tree, mentioning error codes and device nodes that
     never existed upstream. On this series, it received a massive
     update: all non-deprecated digital TV APIs are now in sync with the
     current implementation;

   - Some DVB APIs that aren't used by any upstream driver got removed;

   - Other parts of the media documentation algo got updated, fixing
     some bugs on its PDF output and making it compatible with Sphinx
     version 1.6.

     As the number of hacks required to build PDF output reduced, I hope
     we'll have less troubles as newer versions of our documentation
     toolchain are released (famous last words);

   - As usual, lots of driver cleanups and improvements"

* tag 'media/v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (624 commits)
  media: leds: as3645a: add V4L2_FLASH_LED_CLASS dependency
  media: get rid of removed DMX_GET_CAPS and DMX_SET_SOURCE leftovers
  media: Revert "[media] v4l: async: make v4l2 coexist with devicetree nodes in a dt overlay"
  media: staging: atomisp: sh_css_calloc shall return a pointer to the allocated space
  media: Revert "[media] lirc_dev: remove superfluous get/put_device() calls"
  media: add qcom_camss.rst to v4l-drivers rst file
  media: dvb headers: make checkpatch happier
  media: dvb uapi: move frontend legacy API to another part of the book
  media: pixfmt-srggb12p.rst: better format the table for PDF output
  media: docs-rst: media: Don't use \small for V4L2_PIX_FMT_SRGGB10 documentation
  media: index.rst: don't write "Contents:" on PDF output
  media: pixfmt*.rst: replace a two dots by a comma
  media: vidioc-g-fmt.rst: adjust table format
  media: vivid.rst: add a blank line to correct ReST format
  media: v4l2 uapi book: get rid of driver programming's chapter
  media: format.rst: use the right markup for important notes
  media: docs-rst: cardlists: change their format to flat-tables
  media: em28xx-cardlist.rst: update to reflect last changes
  media: v4l2-event.rst: adjust table to fit on PDF output
  media: docs: don't show ToC for each part on PDF output
  ...
2017-09-07 12:53:14 -07:00
Linus Torvalds
d969443064 sound updates for 4.14-rc1
We have touched quite a lot of files but with fewer changes at this
 cycle; as you can see, most of changes are trivial fixes, especially
 constification patches.  Among the massive attacks by constification
 gangs, we had a few core changes (mostly for ASoC core), as well the
 fixes and the updates by major vendors.  Some highlights are below:
 
 ALSA core:
 - Fix possible races in control API user-TLV codes
 - Small cleanup of PCM core
 
 ASoC:
 - Continued work for componentization; still half-baked, but we're
   certainly progressing
 - Use of devres for jack detection GPIOs, rather as a cleanup
 - Jack detection support for Qualcomm MSM8916
 - Support for Allwinner H3, Cirrus Logic CS43130, Intel Kabylake
   systems with RT5663, Realtek RT274, TI TLV320AIC32x6 and Wolfson
   WM8523
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEECxfAB4MH3rD5mfB6bDGAVD0pKaQFAlmuvBsOHHRpd2FpQHN1
 c2UuZGUACgkQbDGAVD0pKaQl2Q/+I7gcTwlNqjqKL0NZx2cX37nHmfsEcEkWXzx5
 +GFx7LJQf1sLlpxA8YWFThY24/59lH711guC1q8OGwRFdwl9X01+b/vScpvrxnzH
 a0wBTFuvQhr70brc33Y3q3mPhspC5QIUOJtzULG6MiLgzlTBZqLLu2uKV4fSz36I
 pGhi9OB7gMyUL4PFXk7xkcCNxzdOiptWWwFJgh39Yd+9yNP3/U0y3hHz6uzDuQf1
 tUIxDhwl22w8gCiYYNbBn8pcSGH+KT4BMqa5+/+qlV5exNa+Es8DozCQ+lis579+
 dnVc/rpzDJPotmWYdWJsuAmkURvtgzizWSill31bRSoydhfClw6yfl6EoGfZdP50
 3SES87YhTCAPCWfMFSUBtG3PZHgKQcp+QqWCrPaxmVIjqFDS+OxYxtkfVqGRd1u3
 Vv9VCbXhocsdMxA4FlTdyoQPCM9DxcHIdqCmYyjDs55eSy/dO4ND005eunnZawNh
 8kbQcP4873PbDdCssXEeWsbvDSkaeVEa5/taLpalfG7bIO9zHX7ese/0zIgljaF3
 0zK3beWIaIxIeZ473+/gwmzFSpUeIEFJW6YJDkT7CDrRuT1C6fzhaAkz7J+BdzZd
 et0Cqr9tQg0Xk9SsXV1rq1h90e9yQcG0wdalkGrxEVDVtnTRmL5v4AiyX8wzpTAp
 MbuniQo=
 =/Rv6
 -----END PGP SIGNATURE-----

Merge tag 'sound-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "We have touched quite a lot of files but with fewer changes at this
  cycle; as you can see, most of changes are trivial fixes, especially
  constification patches.

  Among the massive attacks by constification gangs, we had a few core
  changes (mostly for ASoC core), as well the fixes and the updates by
  major vendors.

  Some highlights:

  ALSA core:

   - Fix possible races in control API user-TLV codes

   - Small cleanup of PCM core

  ASoC:

   - Continued work for componentization; still half-baked, but we're
     certainly progressing

   - Use of devres for jack detection GPIOs, rather as a cleanup

   - Jack detection support for Qualcomm MSM8916

   - Support for Allwinner H3, Cirrus Logic CS43130, Intel Kabylake
     systems with RT5663, Realtek RT274, TI TLV320AIC32x6 and Wolfson
     WM8523"

* tag 'sound-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (512 commits)
  ALSA: hda/ca0132 - Fix memory leak at error path
  ALSA: hda: Fix forget to free resource in error handling code path in hda_codec_driver_probe
  ASoC: cs43130: Fix unused compiler warnings for PM runtime
  ASoC: cs43130: Fix possible Oops with invalid dev_id
  ASoC: cs43130: fix spelling mistake: "irq_occurrance" -> "irq_occurrence"
  ALSA: atmel: Remove leftovers of AVR32 removal
  ALSA: atmel: convert AC97c driver to GPIO descriptor API
  ALSA: hda/realtek - Enable jack detection function for Intel ALC700
  ALSA: hda: Fix regression of hdmi eld control created based on invalid pcm
  ASoC: Intel: Skylake: Add IPC to configure the copier secondary pins
  ASoC: add missing compile rule for max98371
  ASoC: add missing compile rule for sirf-audio-codec
  ASoC: add missing compile rule for max98371
  ASoC: cs43130: Add devicetree bindings for CS43130
  ASoC: cs43130: Add support for CS43130 codec
  ASoC: make clock direction configurable in asoc-simple
  ALSA: ctxfi: Remove null check before kfree
  ASoC: max98927: Changed device property read function
  ASoC: max98927: Modified DAPM widget and map to enable/disable VI sense path
  ASoC: max98927: Added PM suspend and resume function
  ...
2017-09-07 12:44:53 -07:00
Linus Torvalds
3645e6d0dc Merge tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull MD updates from Shaohua Li:
 "This update mainly fixes bugs:

   - Make raid5 ppl support several ppl from Pawel

   - Several raid5-cache bug fixes from Song

   - Bitmap fixes from Neil and Me

   - One raid1/10 regression fix since 4.12 from Me

   - Other small fixes and cleanup"

* tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/bitmap: disable bitmap_resize for file-backed bitmaps.
  raid5-ppl: Recovery support for multiple partial parity logs
  md: Runtime support for multiple ppls
  md/raid0: attach correct cgroup info in bio
  lib/raid6: align AVX512 constants to 512 bits, not bytes
  raid5: remove raid5_build_block
  md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_show
  md: replace seq_release_private with seq_release
  md: notify about new spare disk in the container
  md/raid1/10: reset bio allocated from mempool
  md/raid5: release/flush io in raid5_do_work()
  md/bitmap: copy correct data for bitmap super
2017-09-07 12:41:48 -07:00
Linus Torvalds
a0725ab0c7 Merge branch 'for-4.14/block' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
 "This is the first pull request for 4.14, containing most of the code
  changes. It's a quiet series this round, which I think we needed after
  the churn of the last few series. This contains:

   - Fix for a registration race in loop, from Anton Volkov.

   - Overflow complaint fix from Arnd for DAC960.

   - Series of drbd changes from the usual suspects.

   - Conversion of the stec/skd driver to blk-mq. From Bart.

   - A few BFQ improvements/fixes from Paolo.

   - CFQ improvement from Ritesh, allowing idling for group idle.

   - A few fixes found by Dan's smatch, courtesy of Dan.

   - A warning fixup for a race between changing the IO scheduler and
     device remova. From David Jeffery.

   - A few nbd fixes from Josef.

   - Support for cgroup info in blktrace, from Shaohua.

   - Also from Shaohua, new features in the null_blk driver to allow it
     to actually hold data, among other things.

   - Various corner cases and error handling fixes from Weiping Zhang.

   - Improvements to the IO stats tracking for blk-mq from me. Can
     drastically improve performance for fast devices and/or big
     machines.

   - Series from Christoph removing bi_bdev as being needed for IO
     submission, in preparation for nvme multipathing code.

   - Series from Bart, including various cleanups and fixes for switch
     fall through case complaints"

* 'for-4.14/block' of git://git.kernel.dk/linux-block: (162 commits)
  kernfs: checking for IS_ERR() instead of NULL
  drbd: remove BIOSET_NEED_RESCUER flag from drbd_{md_,}io_bio_set
  drbd: Fix allyesconfig build, fix recent commit
  drbd: switch from kmalloc() to kmalloc_array()
  drbd: abort drbd_start_resync if there is no connection
  drbd: move global variables to drbd namespace and make some static
  drbd: rename "usermode_helper" to "drbd_usermode_helper"
  drbd: fix race between handshake and admin disconnect/down
  drbd: fix potential deadlock when trying to detach during handshake
  drbd: A single dot should be put into a sequence.
  drbd: fix rmmod cleanup, remove _all_ debugfs entries
  drbd: Use setup_timer() instead of init_timer() to simplify the code.
  drbd: fix potential get_ldev/put_ldev refcount imbalance during attach
  drbd: new disk-option disable-write-same
  drbd: Fix resource role for newly created resources in events2
  drbd: mark symbols static where possible
  drbd: Send P_NEG_ACK upon write error in protocol != C
  drbd: add explicit plugging when submitting batches
  drbd: change list_for_each_safe to while(list_first_entry_or_null)
  drbd: introduce drbd_recv_header_maybe_unplug
  ...
2017-09-07 11:59:42 -07:00
Bjorn Helgaas
33db87de6a Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Fix PCIe capability sizes
  PCI: Convert to using %pOF instead of full_name()
  PCI: Constify endpoint pci_epf_type device_type
  PCI: Constify bin_attribute structures
  PCI: Constify hotplug pci_device_id structures
  PCI: Constify hotplug attribute_group structures
  PCI: Constify label attribute_group structures
  PCI: Constify sysfs attribute_group structures
2017-09-07 13:24:16 -05:00
Bjorn Helgaas
18f20670e0 Merge branch 'pci/dpc' into next
* pci/dpc:
  PCI/DPC: Add local struct device pointers
  PCI/DPC: Add eDPC support
2017-09-07 13:24:13 -05:00
Linus Torvalds
d34fc1adf0 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - various misc bits

 - DAX updates

 - OCFS2

 - most of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (119 commits)
  mm,fork: introduce MADV_WIPEONFORK
  x86,mpx: make mpx depend on x86-64 to free up VMA flag
  mm: add /proc/pid/smaps_rollup
  mm: hugetlb: clear target sub-page last when clearing huge page
  mm: oom: let oom_reap_task and exit_mmap run concurrently
  swap: choose swap device according to numa node
  mm: replace TIF_MEMDIE checks by tsk_is_oom_victim
  mm, oom: do not rely on TIF_MEMDIE for memory reserves access
  z3fold: use per-cpu unbuddied lists
  mm, swap: don't use VMA based swap readahead if HDD is used as swap
  mm, swap: add sysfs interface for VMA based swap readahead
  mm, swap: VMA based swap readahead
  mm, swap: fix swap readahead marking
  mm, swap: add swap readahead hit statistics
  mm/vmalloc.c: don't reinvent the wheel but use existing llist API
  mm/vmstat.c: fix wrong comment
  selftests/memfd: add memfd_create hugetlbfs selftest
  mm/shmem: add hugetlbfs support to memfd_create()
  mm, devm_memremap_pages: use multi-order radix for ZONE_DEVICE lookups
  mm/vmalloc.c: halve the number of comparisons performed in pcpu_get_vm_areas()
  ...
2017-09-06 20:49:49 -07:00
Rik van Riel
d2cd9ede6e mm,fork: introduce MADV_WIPEONFORK
Introduce MADV_WIPEONFORK semantics, which result in a VMA being empty
in the child process after fork.  This differs from MADV_DONTFORK in one
important way.

If a child process accesses memory that was MADV_WIPEONFORK, it will get
zeroes.  The address ranges are still valid, they are just empty.

If a child process accesses memory that was MADV_DONTFORK, it will get a
segmentation fault, since those address ranges are no longer valid in
the child after fork.

Since MADV_DONTFORK also seems to be used to allow very large programs
to fork in systems with strict memory overcommit restrictions, changing
the semantics of MADV_DONTFORK might break existing programs.

MADV_WIPEONFORK only works on private, anonymous VMAs.

The use case is libraries that store or cache information, and want to
know that they need to regenerate it in the child process after fork.

Examples of this would be:
 - systemd/pulseaudio API checks (fail after fork) (replacing a getpid
   check, which is too slow without a PID cache)
 - PKCS#11 API reinitialization check (mandated by specification)
 - glibc's upcoming PRNG (reseed after fork)
 - OpenSSL PRNG (reseed after fork)

The security benefits of a forking server having a re-inialized PRNG in
every child process are pretty obvious.  However, due to libraries
having all kinds of internal state, and programs getting compiled with
many different versions of each library, it is unreasonable to expect
calling programs to re-initialize everything manually after fork.

A further complication is the proliferation of clone flags, programs
bypassing glibc's functions to call clone directly, and programs calling
unshare, causing the glibc pthread_atfork hook to not get called.

It would be better to have the kernel take care of this automatically.

The patch also adds MADV_KEEPONFORK, to undo the effects of a prior
MADV_WIPEONFORK.

This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:

    https://man.openbsd.org/minherit.2

[akpm@linux-foundation.org: numerically order arch/parisc/include/uapi/asm/mman.h #defines]
Link: http://lkml.kernel.org/r/20170811212829.29186-3-riel@redhat.com
Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Florian Weimer <fweimer@redhat.com>
Reported-by: Colm MacCártaigh <colm@allcosts.net>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Cc: <linux-api@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:30 -07:00
Mike Kravetz
749df87bd7 mm/shmem: add hugetlbfs support to memfd_create()
This patch came out of discussions in this e-mail thread:
  http://lkml.kernel.org/r/1499357846-7481-1-git-send-email-mike.kravetz%40oracle.com

The Oracle JVM team is developing a new garbage collection model.  This
new model requires multiple mappings of the same anonymous memory.  One
straight forward way to accomplish this is with memfd_create.  They can
use the returned fd to create multiple mappings of the same memory.

The JVM today has an option to use (static hugetlb) huge pages.  If this
option is specified, they would like to use the same garbage collection
model requiring multiple mappings to the same memory.  Using hugetlbfs,
it is possible to explicitly mount a filesystem and specify file paths
in order to get an fd that can be used for multiple mappings.  However,
this introduces additional system admin work and coordination.

Ideally they would like to get a hugetlbfs fd without requiring explicit
mounting of a filesystem.  Today, mmap and shmget can make use of
hugetlbfs without explicitly mounting a filesystem.  The patch adds this
functionality to memfd_create.

Add a new flag MFD_HUGETLB to memfd_create() that will specify the file
to be created resides in the hugetlbfs filesystem.  This is the generic
hugetlbfs filesystem not associated with any specific mount point.  As
with other system calls that request hugetlbfs backed pages, there is
the ability to encode huge page size in the flag arguments.

hugetlbfs does not support sealing operations, therefore specifying
MFD_ALLOW_SEALING with MFD_HUGETLB will result in EINVAL.

Of course, the memfd_man page would need updating if this type of
functionality moves forward.

Link: http://lkml.kernel.org/r/1502149672-7759-2-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:29 -07:00
Andrea Arcangeli
a36985d31a userfaultfd: provide pid in userfault msg - add feat union
No ABI change, but this will make it more explicit to software that ptid
is only available if requested by passing UFFD_FEATURE_THREAD_ID to
UFFDIO_API.  The fact it's a union will also self document it shouldn't
be taken for granted there's a tpid there.

Link: http://lkml.kernel.org/r/20170802165145.22628-7-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Alexey Perevalov <a.perevalov@samsung.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:29 -07:00
Alexey Perevalov
9d4ac93482 userfaultfd: provide pid in userfault msg
It could be useful for calculating downtime during postcopy live
migration per vCPU.  Side observer or application itself will be
informed about proper task's sleep during userfaultfd processing.

Process's thread id is being provided when user requeste it by setting
UFFD_FEATURE_THREAD_ID bit into uffdio_api.features.

Link: http://lkml.kernel.org/r/20170802165145.22628-6-aarcange@redhat.com
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:29 -07:00
Prakash Sangappa
2d6d6f5a09 mm: userfaultfd: add feature to request for a signal delivery
In some cases, userfaultfd mechanism should just deliver a SIGBUS signal
to the faulting process, instead of the page-fault event.  Dealing with
page-fault event using a monitor thread can be an overhead in these
cases.  For example applications like the database could use the
signaling mechanism for robustness purpose.

Database uses hugetlbfs for performance reason.  Files on hugetlbfs
filesystem are created and huge pages allocated using fallocate() API.
Pages are deallocated/freed using fallocate() hole punching support.
These files are mmapped and accessed by many processes as shared memory.
The database keeps track of which offsets in the hugetlbfs file have
pages allocated.

Any access to mapped address over holes in the file, which can occur due
to bugs in the application, is considered invalid and expect the process
to simply receive a SIGBUS.  However, currently when a hole in the file
is accessed via the mapped address, kernel/mm attempts to automatically
allocate a page at page fault time, resulting in implicitly filling the
hole in the file.  This may not be the desired behavior for applications
like the database that want to explicitly manage page allocations of
hugetlbfs files.

Using userfaultfd mechanism with this support to get a signal, database
application can prevent pages from being allocated implicitly when
processes access mapped address over holes in the file.

This patch adds UFFD_FEATURE_SIGBUS feature to userfaultfd mechnism to
request for a SIGBUS signal.

See following for previous discussion about the database requirement
leading to this proposal as suggested by Andrea.

http://www.spinics.net/lists/linux-mm/msg129224.html

Link: http://lkml.kernel.org/r/1501552446-748335-2-git-send-email-prakash.sangappa@oracle.com
Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:29 -07:00
Mike Kravetz
4da243ac1c mm: shm: use new hugetlb size encoding definitions
Use the common definitions from hugetlb_encode.h header file for
encoding hugetlb size definitions in shmget system call flags.

In addition, move these definitions from the internal (kernel) to user
(uapi) header file.

Link: http://lkml.kernel.org/r/1501527386-10736-4-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:28 -07:00
Mike Kravetz
aafd4562df mm: arch: consolidate mmap hugetlb size encodings
A non-default huge page size can be encoded in the flags argument of the
mmap system call.  The definitions for these encodings are in arch
specific header files.  However, all architectures use the same values.

Consolidate all the definitions in the primary user header file
(uapi/linux/mman.h).  Include definitions for all known huge page sizes.
Use the generic encoding definitions in hugetlb_encode.h as the basis
for these definitions.

Link: http://lkml.kernel.org/r/1501527386-10736-3-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:28 -07:00
Mike Kravetz
e652f69459 mm: hugetlb: define system call hugetlb size encodings in single file
Patch series "Consolidate system call hugetlb page size encodings".

These patches are the result of discussions in
https://lkml.org/lkml/2017/3/8/548.  The following changes are made in the
patch set:

1) Put all the log2 encoded huge page size definitions in a common
   header file.  The idea is have a set of definitions that can be use as
   the basis for system call specific definitions such as MAP_HUGE_* and
   SHM_HUGE_*.

2) Remove MAP_HUGE_* definitions in arch specific files.  All these
   definitions are the same.  Consolidate all definitions in the primary
   user header file (uapi/linux/mman.h).

3) Remove SHM_HUGE_* definitions intended for user space from kernel
   header file, and add to user (uapi/linux/shm.h) header file.  Add
   definitions for all known huge page size encodings as in mmap.

This patch (of 3):

If hugetlb pages are requested in mmap or shmget system calls, a huge
page size other than default can be requested.  This is accomplished by
encoding the log2 of the huge page size in the upper bits of the flag
argument.  asm-generic and arch specific headers all define the same
values for these encodings.

Put common definitions in a single header file.  The primary uapi header
files for mmap and shm will use these definitions as a basis for
definitions specific to those system calls.

Link: http://lkml.kernel.org/r/1501527386-10736-2-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-06 17:27:28 -07:00
Linus Torvalds
aae3dbb477 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Support ipv6 checksum offload in sunvnet driver, from Shannon
    Nelson.

 2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
    Dumazet.

 3) Allow generic XDP to work on virtual devices, from John Fastabend.

 4) Add bpf device maps and XDP_REDIRECT, which can be used to build
    arbitrary switching frameworks using XDP. From John Fastabend.

 5) Remove UFO offloads from the tree, gave us little other than bugs.

 6) Remove the IPSEC flow cache, from Florian Westphal.

 7) Support ipv6 route offload in mlxsw driver.

 8) Support VF representors in bnxt_en, from Sathya Perla.

 9) Add support for forward error correction modes to ethtool, from
    Vidya Sagar Ravipati.

10) Add time filter for packet scheduler action dumping, from Jamal Hadi
    Salim.

11) Extend the zerocopy sendmsg() used by virtio and tap to regular
    sockets via MSG_ZEROCOPY. From Willem de Bruijn.

12) Significantly rework value tracking in the BPF verifier, from Edward
    Cree.

13) Add new jump instructions to eBPF, from Daniel Borkmann.

14) Rework rtnetlink plumbing so that operations can be run without
    taking the RTNL semaphore. From Florian Westphal.

15) Support XDP in tap driver, from Jason Wang.

16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.

17) Add Huawei hinic ethernet driver.

18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
    Delalande.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
  i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
  i40e: avoid NVM acquire deadlock during NVM update
  drivers: net: xgene: Remove return statement from void function
  drivers: net: xgene: Configure tx/rx delay for ACPI
  drivers: net: xgene: Read tx/rx delay for ACPI
  rocker: fix kcalloc parameter order
  rds: Fix non-atomic operation on shared flag variable
  net: sched: don't use GFP_KERNEL under spin lock
  vhost_net: correctly check tx avail during rx busy polling
  net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
  rxrpc: Make service connection lookup always check for retry
  net: stmmac: Delete dead code for MDIO registration
  gianfar: Fix Tx flow control deactivation
  cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
  cxgb4: Fix pause frame count in t4_get_port_stats
  cxgb4: fix memory leak
  tun: rename generic_xdp to skb_xdp
  tun: reserve extra headroom only when XDP is set
  net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
  net: dsa: bcm_sf2: Advertise number of egress queues
  ...
2017-09-06 14:45:08 -07:00
Linus Torvalds
c7f396f12f dlm for 4.14
This set includes a bunch of minor code cleanups that
 have accumulated, probably from code analyzers people
 like to run.  There is one nice fix that avoids some
 socket leaks by switching to use sock_create_lite().
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZrtP1AAoJEDgbc8f8gGmqH3YQALZouj0tzatxHfWlGcMHoufL
 M2wmQragG4qOI1w9vNKmo6GctQ+Teqholnt2gRHputxNLzPUXPzNgAR1/O7O8741
 TCB/fhR16KqMMP4bTa2GQ73WVFhohkh8xSvtcWkhCqC+Ti/qx2FN7LZ7Mxn6Muje
 IC7E+Oy2Xr64lUb1CsfpXXel8vs+ujoMIAZiU4P/PgCzYX5FvaFWJ9VCwgYzfIuN
 zj2O1txau5xW2fZmD5GRmgWY/g5wCPcPxwCdZacqrL7yNiU1wsrhYFds0AiGSPJC
 D/wMX9a0GN28L+zW0eLEVI+lIk8f5Az+DOrw5UUFNwDd4ejDWaS0dtMNThYu6VvD
 x6+JZhgZHcj3Df/s4PMZvPkCx+8ZeRGK9RK+jlkEVfO8aIE39gi6mC+EuTJmZe/m
 PAB7O2OG0FTUPoY+t/5wKaz1g6qSHQ2fQZb8rAMoUFWwFJWXp3q7/tlZN4dlwIDI
 2yp9UN09ug3tICcne/gvmJ5x8lVN3Eh6XHkbO1qedsv45SYKdOwPmvyp2XTZooJK
 kg827Z+deRmvTfX3gzEsEO1caabtYDOrZ23RHJxqViNdZbMn3Tifc2pBZCIJHfmu
 4My+Midl38Ch9SUx30ePwjyJ9+Ptsm7KiSOIvrtRZV/1bPEqRP03suxEZvmCA/z4
 elMj1gKykj/GHXZ+cHlX
 =lGzS
 -----END PGP SIGNATURE-----

Merge tag 'dlm-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:
 "This set includes a bunch of minor code cleanups that have
  accumulated, probably from code analyzers people like to run. There is
  one nice fix that avoids some socket leaks by switching to use
  sock_create_lite()"

* tag 'dlm-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: use sock_create_lite inside tcp_accept_from_sock
  uapi linux/dlm_netlink.h: include linux/dlmconstants.h
  dlm: avoid double-free on error path in dlm_device_{register,unregister}
  dlm: constify kset_uevent_ops structure
  dlm: print log message when cluster name is not set
  dlm: Delete an unnecessary variable initialisation in dlm_ls_start()
  dlm: Improve a size determination in two functions
  dlm: Use kcalloc() in two functions
  dlm: Use kmalloc_array() in make_member_array()
  dlm: Delete an error message for a failed memory allocation in dlm_recover_waiters_pre()
  dlm: Improve a size determination in dlm_recover_waiters_pre()
  dlm: Use kcalloc() in dlm_scan_waiters()
  dlm: Improve a size determination in table_seq_start()
  dlm: Add spaces for better code readability
  dlm: Replace six seq_puts() calls by seq_putc()
  dlm: Make dismatch error message more clear
  dlm: Fix kernel memory disclosure
2017-09-06 13:39:23 -07:00
Linus Torvalds
5791577963 Updates for 4.14:
- Write unmount record for a ro mount to avoid unnecessary log replay
 - Clean up orphaned inodes when mounting fs readonly
 - Resubmit inode log items when buffer writeback fails to avoid umount hang
 - Fix log recovery corruption problems when log headers wrap around the end
 - Avoid infinite loop searching for free inodes when inode counters are wrong
 - Evict inodes involved with log redo so that we don't leak them later
 - Fix a potential race between reclaim and inode cluster freeing
 - Refactor the inode joining code w.r.t. transaction rolling & deferred ops
 - Fix a bug where the log doesn't properly deal with dirty buffers that
   are about to become ordered buffers
 - Fix the extent swap code to deal with making dirty buffers ordered properly
 - Consolidate page fault handlers
 - Refactor the incore extent manipulation functions to use the iext
   abstractions instead of directly modifying with extent data
 - Disable crashy chattr +/-x until we fix it
 - Don't allow us to set S_DAX for v2 inodes
 - Various cleanups
 - Clarify some documentation
 - Fix a problem where fsync and a log commit race to send the disk a
   flush command, resulting in a small window where power fail data loss
   could occur
 - Simplify some rmap operations in the fcollapse code
 - Fix some use-after-free problems in async writeback
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJZrEAQAAoJEPh/dxk0SrTrxKEP/3y8sLWdy4fUdPpVkwZteXwc
 zGyYaLrmKRc5i6abBNtLCZoRJGfRdvVyPrhQ1q3mt8H//xuURgqgFFyjj3wAdsLf
 sDejIHhdsc8/VcuLLtCW3rEYg58hJ89hW7d1InCP0tvqWmljh9svhzXebtwUvNNF
 /2fHIUXUiAxLbgjv/N2i/smlLl0zdx6C2x1TlJmfwer0UMTAnlmbFWxCqmtUZwSl
 QSuGgn1wo3dkId9aFoNwQmSCFeYcxQlpaInJEzUiVQOA4dbphXHO9Bsx0eOkpDuz
 39waaX0fld8LEfIQGmUQ995UkAwfk/asjgDSApyXdkMayNWhi0KpRl1zXgCb8BbL
 m7vYJhIfJ399+jbNPe1+htn3I16AmpvAai9MNJidFclWwqFEuQEnxZccdtTIAiRv
 XuYiq9hN2NOwlwPUYfrZxfx34fdocRyHmGVs3i7P3/qPWd5Hx6+FpQTOngciS7MN
 6xnM8PbnrLadw3ooMDEKgWsN805BQALiwzDRggoAXG1Pm2SqFnLD/dAR4c7R3nR8
 vvYlfGHnd38aMlW73IALkkGJqZy/bHPFhrbvpjXyIG6SYwCjrWrO0chM0O8MCRrF
 MIW3rM5hYIE8aCkpJ2mxvcQalmSAlSPVKlmgvSK4S1Sz4kcywxskNhch8uNkb5uy
 WUHhrJz+wBjdjrDOU3aL
 =jBdo
 -----END PGP SIGNATURE-----

Merge tag 'xfs-4.14-merge-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull XFS updates from Darrick Wong:
 "Here are the changes for xfs for 4.14. Most of these are cleanups and
  fixes for bad behavior, as we're mostly focusing on improving
  reliablity this cycle (read: there's potentially a lot of stuff on the
  horizon for 4.15 so better to spend a few weeks killing other bugs
  now).

  Summary:

   - Write unmount record for a ro mount to avoid unnecessary log replay

   - Clean up orphaned inodes when mounting fs readonly

   - Resubmit inode log items when buffer writeback fails to avoid
     umount hang

   - Fix log recovery corruption problems when log headers wrap around
     the end

   - Avoid infinite loop searching for free inodes when inode counters
     are wrong

   - Evict inodes involved with log redo so that we don't leak them
     later

   - Fix a potential race between reclaim and inode cluster freeing

   - Refactor the inode joining code w.r.t. transaction rolling &
     deferred ops

   - Fix a bug where the log doesn't properly deal with dirty buffers
     that are about to become ordered buffers

   - Fix the extent swap code to deal with making dirty buffers ordered
     properly

   - Consolidate page fault handlers

   - Refactor the incore extent manipulation functions to use the iext
     abstractions instead of directly modifying with extent data

   - Disable crashy chattr +/-x until we fix it

   - Don't allow us to set S_DAX for v2 inodes

   - Various cleanups

   - Clarify some documentation

   - Fix a problem where fsync and a log commit race to send the disk a
     flush command, resulting in a small window where power fail data
     loss could occur

   - Simplify some rmap operations in the fcollapse code

   - Fix some use-after-free problems in async writeback"

* tag 'xfs-4.14-merge-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (44 commits)
  xfs: use kmem_free to free return value of kmem_zalloc
  xfs: open code end_buffer_async_write in xfs_finish_page_writeback
  xfs: don't set v3 xflags for v2 inodes
  xfs: fix compiler warnings
  fsmap: fix documentation of FMR_OF_LAST
  xfs: simplify the rmap code in xfs_bmse_merge
  xfs: remove unused flags arg from xfs_file_iomap_begin_delay
  xfs: fix incorrect log_flushed on fsync
  xfs: disable per-inode DAX flag
  xfs: replace xfs_qm_get_rtblks with a direct call to xfs_bmap_count_leaves
  xfs: rewrite xfs_bmap_count_leaves using xfs_iext_get_extent
  xfs: use xfs_iext_*_extent helpers in xfs_bmap_split_extent_at
  xfs: use xfs_iext_*_extent helpers in xfs_bmap_shift_extents
  xfs: move some code around inside xfs_bmap_shift_extents
  xfs: use xfs_iext_get_extent in xfs_bmap_first_unused
  xfs: switch xfs_bmap_local_to_extents to use xfs_iext_insert
  xfs: add a xfs_iext_update_extent helper
  xfs: consolidate the various page fault handlers
  iomap: return VM_FAULT_* codes from iomap_page_mkwrite
  xfs: relog dirty buffers during swapext bmbt owner change
  ...
2017-09-06 12:19:23 -07:00