Commit Graph

534114 Commits

Author SHA1 Message Date
Stas Sergeev
4cba5c2103 of_mdio: add new DT property 'managed' to specify the PHY management type
Currently the PHY management type is selected by the MAC driver arbitrary.
The decision is based on the presence of the "fixed-link" node and on a
will of the driver's authors.
This caused a regression recently, when mvneta driver suddenly started
to use the in-band status for auto-negotiation on fixed links.
It appears the auto-negotiation may not work when expected by the MAC driver.
Sebastien Rannou explains:
<< Yes, I confirm that my HW does not generate an in-band status. AFAIK, it's
a PHY that aggregates 4xSGMIIs to 1xQSGMII ; the MAC side of the PHY (with
inband status) is connected to the switch through QSGMII, and in this context
we are on the media side of the PHY. >>
https://lkml.org/lkml/2015/7/10/206

This patch introduces the new string property 'managed' that allows
the user to set the management type explicitly.
The supported values are:
"auto" - default. Uses either MDIO or nothing, depending on the presence
of the fixed-link node
"in-band-status" - use in-band status

Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>

CC: Rob Herring <robh+dt@kernel.org>
CC: Pawel Moll <pawel.moll@arm.com>
CC: Mark Rutland <mark.rutland@arm.com>
CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
CC: Kumar Gala <galak@codeaurora.org>
CC: Florian Fainelli <f.fainelli@gmail.com>
CC: Grant Likely <grant.likely@linaro.org>
CC: devicetree@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:12:55 -07:00
Stas Sergeev
868a4215be net: phy: fixed_phy: handle link-down case
fixed_phy_register() currently hardcodes the fixed PHY link to 1, and
expects to find a "speed" parameter to provide correct information
towards the fixed PHY consumer.

In a subsequent change, where we allow "managed" (e.g: (RS)GMII in-band
status auto-negotiation) fixed PHYs, none of these parameters can be
provided since they will be auto-negotiated, hence, we just provide a
zero-initialized fixed_phy_status to fixed_phy_register() which makes it
fail when we call fixed_phy_update_regs() since status.speed = 0 which
makes us hit the "default" label and error out.

Without this change, we would also see potentially inconsistent
speed/duplex parameters for fixed PHYs when the link is DOWN.

CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
[florian: add more background to why this is correct and desirable]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:12:55 -07:00
Florian Fainelli
d2eac98f7d net: dsa: bcm_sf2: Do not override speed settings
The SF2 driver currently overrides speed settings for its port
configured using a fixed PHY, this is both unnecessary and incorrect,
because we keep feedback to the hardware parameters that we read from
the PHY device, which in the case of a fixed PHY cannot possibly change
speed.

This is a required change to allow the fixed PHY code to allow
registering a PHY with a link configured as DOWN by default and avoid
some sort of circular dependency where we require the link_update
callback to run to program the hardware, and we then utilize the fixed
PHY parameters to program the hardware with the same settings.

Fixes: 246d7f773c ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:12:55 -07:00
Linus Torvalds
d725e66c06 Revert "fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()"
This reverts commit a2673b6e04.

Kinglong Mee reports a memory leak with that patch, and Jan Kara confirms:

 "Thanks for report! You are right that my patch introduces a race
  between fsnotify kthread and fsnotify_destroy_group() which can result
  in leaking inotify event on group destruction.

  I haven't yet decided whether the right fix is not to queue events for
  dying notification group (as that is pointless anyway) or whether we
  should just fix the original problem differently...  Whenever I look
  at fsnotify code mark handling I get lost in the maze of locks, lists,
  and subtle differences between how different notification systems
  handle notification marks :( I'll think about it over night"

and after thinking about it, Jan says:

 "OK, I have looked into the code some more and I found another
  relatively simple way of fixing the original oops.  It will be IMHO
  better than trying to fixup this issue which has more potential for
  breakage.  I'll ask Linus to revert the fsnotify fix he already merged
  and send a new fix"

Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Requested-by: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-21 16:06:53 -07:00
David S. Miller
0bccece592 ath9k:
* fix device ID check for AR956x
 
 iwlwifi:
 
 * bug fixes specific for 8000 series
 * fix a crash in time events
 * fix a crash in PCIe transport
 * fix BT Coex code that prevented association on certain
   devices (3160).
 * revert the new RBD allocation model because it introduced
   a bug when running on weak VM setups.
 * new device IDs
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJVrQ65AAoJEG4XJFUm622bArgH/jlGm44aPLVTtTfc3Qi/yH1m
 pVZ+F6Z4FhFM8Ln/skL/PIWPbxmcwMQ9IYiDI+1y0obr5RaNGZbh5EBwLcNQzAII
 L9aO7vGGQRHewJj3LAY4ovkT7xYT6Kra4iZuXrozeq8CJN2/0l4Yv2uPkwPtszIf
 Gp1QGgCbvUzaPdIIevx4bMyLcC5h58y7Thg2+kxSSo/VFJxGh2DFFnLuJx5RVS1D
 r7fBWH5BzUPP1sh84Gt+0IpjyoxqpWiI//Wqg2Hkt6zdis3fixDvK8Wm08EewZdj
 Wf63HgOzQeL9vE6IHg3WuUiR8QOn51+oqDWCtbrRemBsywZ9rc4vesMAREjcw2Q=
 =ygb2
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-for-davem-2015-07-20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
ath9k:

* fix device ID check for AR956x

iwlwifi:

* bug fixes specific for 8000 series
* fix a crash in time events
* fix a crash in PCIe transport
* fix BT Coex code that prevented association on certain
  devices (3160).
* revert the new RBD allocation model because it introduced
  a bug when running on weak VM setups.
* new device IDs
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:06:39 -07:00
Mathias Krause
e181a54304 net: #ifdefify sk_classid member of struct sock
The sk_classid member is only required when CONFIG_CGROUP_NET_CLASSID is
enabled. #ifdefify it to reduce the size of struct sock on 32 bit
systems, at least.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 16:04:30 -07:00
Linus Torvalds
71ebd1af09 Pin control fixes for the v4.2 series:
- Some dead defines dropped from the Samsung driver, was
   targeted for -rc2 but got delayed
 - Drop the strict mode from abx500, this was too strict
 - Fix the R-Car sparse IRQs code to work as intended
 - Fix the IRQ code for the pinctrl-single GPIO backend to not
   enforce threaded IRQs
 - Clear the latched events/IRQs for the Broadcom BCM2835
   driver
 - Fix up debugfs for the Freescale imx1 driver
 - Fix a typo bug in the Schmitt Trigger setup in the LPC18xx
   driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVrpAVAAoJEEEQszewGV1zeq8P/1hIdJfHYpVb5whr4Cxq2JFh
 RHKFCBGI75JDj+K7dBjJkflBxnb158rFA7QxEumEFp2VnWFUzlFJeirGDM9KArXO
 Wxsp+Lm9oO8U7T1dUXhsEZJTmVNXSiNcYbuaYOkxtuVn4YlVSS/XB3T8dcXPzKRG
 3BHuKnOA5qpcvM9FaA1O1UiPwR/wc/SrtX38+c1Wt0dXJO+Tgj9PtiiK6iUQHskZ
 rbsxXZEBTP2mcmBBXNtMXbAh9qnL88uG44zSEv1nTDr/jHVYftIVnTdQ07ICT3S9
 mCKEloeZuvHPIkttZ9Ddlj5Jf5PbaqvJllSHhE9FPGEjkOgAtfNdf0zN+Zbqhj0F
 aZAHtknYRsOXFDKAHJckUvXlumFrOSd/8vDIeaVwC807Lz190syBdgUbKVBtzZYf
 r7+HC1y3XIyLk2M2ZiQLwaYJPr5DJqxNgxMm7Wg/E0mmwScPhvMhrYKNJQvSu2f2
 hE/l0XigFxaY7JYAj49ltjaCOKXy02IMGTcT7MAYS9mSWeI8XFI+xPN2ZjiUkQLS
 4nLG4oC9FfCndcAEYf4f/86L9F1k+5ysH+DsEbkB6aCjz1D3Lijb+IzoRJTH9CE9
 jRyQbhtaC3kPJb7Ucsr4RBVCLOevu8E6xiBp0mdeeSc9a2mHZcrE1IVTU033oNOp
 GDhPSA4vZApj0YJqZdTw
 =pDEn
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Here are some overly ripe pin control fixes for the v4.2 series.

  They got delayed because of various crap commits and having to clean
  and rinse the patch stack a few times.  Now they are however looking
  good.

   - some dead defines dropped from the Samsung driver, was targeted for
     -rc2 but got delayed
   - drop the strict mode from abx500, this was too strict
   - fix the R-Car sparse IRQs code to work as intended
   - fix the IRQ code for the pinctrl-single GPIO backend to not enforce
     threaded IRQs
   - clear the latched events/IRQs for the Broadcom BCM2835 driver
   - fix up debugfs for the Freescale imx1 driver
   - fix a typo bug in the Schmitt Trigger setup in the LPC18xx driver"

* tag 'pinctrl-v4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: lpc18xx: fix schmitt trigger setup
  Subject: pinctrl: imx1-core: Fix debug output in .pin_config_set callback
  pinctrl: bcm2835: Clear the event latch register when disabling interrupts
  pinctrl: single: ensure pcs irq will not be forced threaded
  sh-pfc: fix sparse GPIOs for R-Car SoCs
  pinctrl: abx500: remove strict mode
  pinctrl: samsung: Remove old unused defines
2015-07-21 15:27:27 -07:00
Linus Torvalds
8426fb302c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fix from Jan Kara:
 "A fix for UDF corruption when certain disk-format feature is enabled"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Don't corrupt unalloc spacetable when writing it
2015-07-21 15:18:06 -07:00
Linus Torvalds
1ad474de93 He Kuang noticed that the sample code using the trace_event helper
function __get_dynamic_array_len() is broken. This only changes the
 sample code, and I'm pushing this now instead of later because I don't
 want others using the broken code as an example when using it for real.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVrpUnAAoJEEjnJuOKh9ld9TgIAM9lgZ9KsAcfWaYHotr6hd7r
 cbpAN5L30H5iuwC6rN+gWe4roiIn9csIgR+LHcloBrjmRujQXY4FegeuMCwNZiQO
 +J6SXuGxweZ+9kloMSw3RvQw8rp1hIIUwbvkybHNbFJq/w4m4nOraEhrVxxELCGd
 iNoOFULVjEUyHoHtttsHzaSnxl3p2TXjHxk4RlZUL3kcxlbmeG2zsQhZF8F0Gw9/
 /Q/fMzMGctl8Xj9SWwy6FIyF9DkqXDhSR0adzw/Hd03n1pMj9+YtBKNtnRAbt42E
 V+y9hjaiulUuWD7U7q1FVQp1w3ksCX8E3fNooVKeUjA2Zkrbu1EYqpskwXpAtiw=
 =Z2MY
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.2-rc2-fix2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing sample code fix from Steven Rostedt:
 "He Kuang noticed that the sample code using the trace_event helper
  function __get_dynamic_array_len() is broken.

  This only changes the sample code, and I'm pushing this now instead of
  later because I don't want others using the broken code as an example
  when using it for real"

* tag 'trace-v4.2-rc2-fix2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix sample output of dynamic arrays
2015-07-21 14:42:40 -07:00
Mugunthan V N
1e353cddcf drivers: net: cpsw: remove tx event processing in rx napi poll
With commit c03abd8463 ("net: ethernet: cpsw: don't requests IRQs
we don't use") common isr and napi are separated into separate tx isr
and rx isr/napi, but still in rx napi tx events are handled. So removing
the tx event handling in rx napi.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:41:24 -07:00
David S. Miller
e69724f32e Merge branch 'lwtunnel'
Thomas Graf says:

====================
Lightweight & flow based encapsulation

This series combines the work previously posted by Roopa, Robert and
myself. It's according to what we discussed at NFWS. The motivation
of this series is to:

 * Consolidate code between OVS and the rest of the kernel and get
   rid of OVS vports and instead represent them as pure net_devices.
 * Introduce a lightweight tunneling mechanism which enables flow
   based encapsulation to improve scalability on both RX and TX.
 * Do the above in an encapsulation unspecific way so that the
   encapsulation type is eventually abstracted away from the user.
 * Use the same forwarding decision for both native forwarding and
   encapsulation thus allowing to switch between native IPv6 and
   UDP encapsulation based on endpoint without requiring additional
   logic

The fundamental changes introduces in this series are:
 * A new RTA_ENCAP Netlink attribute for routes carrying encapsulation
   instructions. Depending on the specified type, the instructions
   apply to UDP encapsulations, MPLS and possible other in the future.
 * Depending on the encapsulation type, the output function of the
   dst is directly overwritten or the dst merely attaches metadata and
   relies on a subsequent net_device to apply it to the packet. The
   latter is typically used if an inner and outer IP header exist which
   require two subsequent routing lookups to be performed.
 * A new metadata_dst structure which can be attached to skbs to
   carry metadata in between subsystems. This new metadata transport
   is used to provide a single interface for VXLAN, routing and OVS
   to communicate through metadata.

The OVS interfaces remain as-is but will transparently create a real
VXLAN net_device in the background. iproute2 is extended with a new
use cases:

  VXLAN:
  ip route add 40.1.1.1/32 encap vxlan id 10 dst 50.1.1.2 dev vxlan0

  MPLS:
  ip route add 10.1.1.0/30 encap mpls 200 via inet 10.1.1.1 dev swp1

Performance implications:
  The additional memory allocation in the receive path should have
  performance implications although it is not observable in standard
  throughput tests if GRO is properly done. The correct net_device
  model outweights the additional cost of the allocation. Furthermore,
  this implication can be relaxed by reintroducing a direct unqueued
  path from a software device to a consumer like bridge or OVS if
  needed.

    $ netperf  -t TCP_STREAM -H 15.1.1.201
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
    15.1.1.201 (15.1.1.201) port 0 AF_INET : demo
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

     87380  16384  16384    10.00    9118.17

Changes since v1:
 * Properly initialize tun_id as reported by Julian
 * Drop dupliate netif_keep_dst() as reported by Alexei
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:07 -07:00
Thomas Graf
614732eaa1 openvswitch: Use regular VXLAN net_device device
This gets rid of all OVS specific VXLAN code in the receive and
transmit path by using a VXLAN net_device to represent the vport.
Only a small shim layer remains which takes care of handling the
VXLAN specific OVS Netlink configuration.

Unexports vxlan_sock_add(), vxlan_sock_release(), vxlan_xmit_skb()
since they are no longer needed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:07 -07:00
Thomas Graf
c9db965c52 openvswitch: Abstract vport name through ovs_vport_name()
This allows to get rid of the get_name() vport ops later on.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:07 -07:00
Thomas Graf
be4ace6e6b openvswitch: Move dev pointer into vport itself
This is the first step in representing all OVS vports as regular
struct net_devices. Move the net_device pointer into the vport
structure itself to get rid of struct vport_netdev.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:07 -07:00
Thomas Graf
34ae932a40 openvswitch: Make tunnel set action attach a metadata dst
Utilize the new metadata dst to attach encapsulation instructions to
the skb. The existing egress_tun_info via the OVS_CB() is left in
place until all tunnel vports have been converted to the new method.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:06 -07:00
Thomas Graf
0dfbdf4102 vxlan: Factor out device configuration
This factors out the device configuration out of the RTNL newlink
API which allows for in-kernel creation of VXLAN net_devices.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:06 -07:00
Thomas Graf
e7030878fc fib: Add fib rule match on tunnel id
This add the ability to select a routing table based on the tunnel
id which allows to maintain separate routing tables for each virtual
tunnel network.

ip rule add from all tunnel-id 100 lookup 100
ip rule add from all tunnel-id 200 lookup 200

A new static key controls the collection of metadata at tunnel level
upon demand.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:06 -07:00
Thomas Graf
3093fbe7ff route: Per route IP tunnel metadata via lightweight tunnel
This introduces a new IP tunnel lightweight tunnel type which allows
to specify IP tunnel instructions per route. Only IPv4 is supported
at this point.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:06 -07:00
Thomas Graf
1b7179d3ad route: Extend flow representation with tunnel key
Add a new flowi_tunnel structure which is a subset of ip_tunnel_key to
allow routes to match on tunnel metadata. For now, the tunnel id is
added to flowi_tunnel which allows for routes to be bound to specific
virtual tunnels.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:06 -07:00
Thomas Graf
ee122c79d4 vxlan: Flow based tunneling
Allows putting a VXLAN device into a new flow-based mode in which
skbs with a ip_tunnel_info dst metadata attached will be encapsulated
according to the instructions stored in there with the VXLAN device
defaults taken into consideration.

Similar on the receive side, if the VXLAN_F_COLLECT_METADATA flag is
set, the packet processing will populate a ip_tunnel_info struct for
each packet received and attach it to the skb using the new metadata
dst.  The metadata structure will contain the outer header and tunnel
header fields which have been stripped off. Layers further up in the
stack such as routing, tc or netfitler can later match on these fields
and perform forwarding. It is the responsibility of upper layers to
ensure that the flag is set if the metadata is needed. The flag limits
the additional cost of metadata collecting based on demand.

This prepares the VXLAN device to be steered by the routing and other
subsystems which allows to support encapsulation for a large number
of tunnel endpoints and tunnel ids through a single net_device which
improves the scalability.

It also allows for OVS to leverage this mode which in turn allows for
the removal of the OVS specific VXLAN code.

Because the skb is currently scrubed in vxlan_rcv(), the attachment of
the new dst metadata is postponed until after scrubing which requires
the temporary addition of a new member to vxlan_metadata. This member
is removed again in a later commit after the indirect VXLAN receive API
has been removed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:06 -07:00
Thomas Graf
0accfc268f arp: Inherit metadata dst when creating ARP requests
If output device wants to see the dst, inherit the dst of the
original skb and pass it on to generate the ARP request.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:05 -07:00
Thomas Graf
f38a9eb1f7 dst: Metadata destinations
Introduces a new dst_metadata which enables to carry per packet metadata
between forwarding and processing elements via the skb->dst pointer.

The structure is set up to be a union. Thus, each separate type of
metadata requires its own dst instance. If demand arises to carry
multiple types of metadata concurrently, metadata dst entries can be
made stackable.

The metadata dst entry is refcnt'ed as expected for now but a non
reference counted use is possible if the reference is forced before
queueing the skb.

In order to allow allocating dsts with variable length, the existing
dst_alloc() is split into a dst_alloc() and dst_init() function. The
existing dst_init() function to initialize the subsystem is being
renamed to dst_subsys_init() to make it clear what is what.

The check before ip_route_input() is changed to ignore metadata dsts
and drop the dst inside the routing function thus allowing to interpret
metadata in a later commit.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:05 -07:00
Thomas Graf
773a69d64b icmp: Don't leak original dst into ip_route_input()
ip_route_input() unconditionally overwrites the dst. Hide the original
dst attached to the skb by calling skb_dst_set(skb, NULL) prior to
ip_route_input().

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:05 -07:00
Thomas Graf
1d8fff9073 ip_tunnel: Make ovs_tunnel_info and ovs_key_ipv4_tunnel generic
Rename the tunnel metadata data structures currently internal to
OVS and make them generic for use by all IP tunnels.

Both structures are kernel internal and will stay that way. Their
members are exposed to user space through individual Netlink
attributes by OVS. It will therefore be possible to extend/modify
these structures without affecting user ABI.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:05 -07:00
Roopa Prabhu
e3e4712ec0 mpls: ip tunnel support
This implementation uses lwtunnel infrastructure to register
hooks for mpls tunnel encaps.

It picks cues from iptunnel_encaps infrastructure and previous
mpls iptunnel RFC patches from Eric W. Biederman and Robert Shearman

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:05 -07:00
Roopa Prabhu
face0188e3 mpls: export mpls functions for use by mpls iptunnels
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:04 -07:00
Roopa Prabhu
74a0f2fe8e ipv6: rt6_info output redirect to tunnel output
This is similar to ipv4 redirect of dst output to lwtunnel
output function for encapsulation and xmit.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:04 -07:00
Roopa Prabhu
8602a62502 ipv4: redirect dst output to lwtunnel output
For input routes with tunnel encap state this patch redirects
dst output functions to lwtunnel_output which later resolves to
the corresponding lwtunnel output function.

This has been tested to work with mpls ip tunnels.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:04 -07:00
Roopa Prabhu
ffce41962e lwtunnel: support dst output redirect function
This patch introduces lwtunnel_output function to call corresponding
lwtunnels output function to xmit the packet.

It adds two variants lwtunnel_output and lwtunnel_output6 for ipv4 and
ipv6 respectively today. But this is subject to change when lwtstate will
reside in dst or dst_metadata (as per upstream discussions).

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:04 -07:00
Roopa Prabhu
19e42e4515 ipv6: support for fib route lwtunnel encap attributes
This patch adds support in ipv6 fib functions to parse Netlink
RTA encap attributes and attach encap state data to rt6_info.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:04 -07:00
Roopa Prabhu
571e722676 ipv4: support for fib route lwtunnel encap attributes
This patch adds support in ipv4 fib functions to parse user
provided encap attributes and attach encap state data to fib_nh
and rtable.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:03 -07:00
Roopa Prabhu
499a242568 lwtunnel: infrastructure for handling light weight tunnels like mpls
Provides infrastructure to parse/dump/store encap information for
light weight tunnels like mpls. Encap information for such tunnels
is associated with fib routes.

This infrastructure is based on previous suggestions from
Eric Biederman to follow the xfrm infrastructure.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:03 -07:00
Roopa Prabhu
a0d9a8604f rtnetlink: introduce new RTA_ENCAP_TYPE and RTA_ENCAP attributes
This patch introduces two new RTA attributes to attach encap
data to fib routes.

Example iproute2 command to attach mpls encap data to ipv4 routes

$ip route add 10.1.1.0/30 encap mpls 200 via inet 10.1.1.1 dev swp1

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:39:03 -07:00
Edward Hyunkoo Jee
0848f6428b inet: frags: fix defragmented packet's IP header for af_packet
When ip_frag_queue() computes positions, it assumes that the passed
sk_buff does not contain L2 headers.

However, when PACKET_FANOUT_FLAG_DEFRAG is used, IP reassembly
functions can be called on outgoing packets that contain L2 headers.

Also, IPv4 checksum is not corrected after reassembly.

Fixes: 7736d33f42 ("packet: Add pre-defragmentation support for ipv4 fanouts.")
Signed-off-by: Edward Hyunkoo Jee <edjee@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 10:29:23 -07:00
Pieter Hollants
2070c48cf2 qmi_wwan: Add support for Dell Wireless 5809e 4G Modem
Added the USB IDs 0x413c:0x81b1 for the "Dell Wireless 5809e Gobi(TM) 4G
LTE Mobile Broadband Card", a Dell-branded Sierra Wireless EM7305 LTE
card in M.2 form factor, used eg. in Dell's Latitude E7540 Notebook
series.

Signed-off-by: Pieter Hollants <pieter@hollants.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:31:11 -07:00
Simon Guinot
a84e328941 net: mvneta: fix refilling for Rx DMA buffers
With the actual code, if a memory allocation error happens while
refilling a Rx descriptor, then the original Rx buffer is both passed
to the networking stack (in a SKB) and let in the Rx ring. This leads
to various kernel oops and crashes.

As a fix, this patch moves Rx descriptor refilling ahead of building
SKB with the associated Rx buffer. In case of a memory allocation
failure, data is dropped and the original DMA buffer is put back into
the Rx ring.

Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: <stable@vger.kernel.org> # v3.8+
Tested-by: Yoann Sculo <yoann@sculo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:30:02 -07:00
Jakub Wilk
0c199a903d xfrm: Fix a typo
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:28:36 -07:00
Joachim Eastwood
a7a6268590 stmmac: fix setting of driver data in stmmac_dvr_probe
Commit 803f8fc462 ("stmmac: move driver data setting into
stmmac_dvr_probe") mistakenly set priv and not priv->dev as
driver data. This meant that the remove, resume and suspend
callbacks that fetched and tried to use this data would most
likely explode. Fix the issue by using the correct variable.

Fixes: 803f8fc462 ("stmmac: move driver data setting into stmmac_dvr_probe")
Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:26:45 -07:00
David S. Miller
053c26f3f9 Merge branch 'sch_panic'
Daniel Borkmann says:

====================
Couple of classifier fixes

This fixes a couple of panics in the form of (analogous for
cls_flow{,er}):

[  912.759276] BUG: unable to handle kernel NULL pointer dereference at (null)
[  912.759373] IP: [<ffffffffa09d4d6d>] cls_bpf_change+0x23d/0x268 [cls_bpf]
[  912.759441] PGD 8783c067 PUD 5f684067 PMD 0
[  912.759491] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[  912.759543] Modules linked in: cls_bpf(E) act_gact [...]
[  912.772734] CPU: 3 PID: 10489 Comm: tc Tainted: G        W   E   4.2.0-rc2+ #73
[  912.775004] Hardware name: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, BIOS MBA51.88Z.00EF.B02.1211271028 11/27/2012
[  912.777327] task: ffff88025eaa8000 ti: ffff88005f734000 task.ti: ffff88005f734000
[  912.779662] RIP: 0010:[<ffffffffa09d4d6d>]  [<ffffffffa09d4d6d>] cls_bpf_change+0x23d/0x268 [cls_bpf]
[  912.781991] RSP: 0018:ffff88005f7379c8  EFLAGS: 00010286
[  912.784183] RAX: ffff880201d64e48 RBX: 0000000000000000 RCX: ffff880201d64e40
[  912.786402] RDX: 0000000000000000 RSI: ffffffffa09d51c0 RDI: ffffffffa09d51a6
[  912.788625] RBP: ffff88005f737a68 R08: 0000000000000000 R09: 0000000000000000
[  912.790854] R10: 0000000000000001 R11: 0000000000000001 R12: ffff880078ab5a80
[  912.793082] R13: ffff880232b31570 R14: ffff88005f737ae0 R15: ffff8801e215d1d0
[  912.795181] FS:  00007f3c0c80d740(0000) GS:ffff880265400000(0000) knlGS:0000000000000000
[  912.797281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  912.799402] CR2: 0000000000000000 CR3: 000000005460f000 CR4: 00000000001407e0
[  912.799403] Stack:
[  912.799407]  ffffffff00000000 ffff88023ea18000 000000005f737a08 0000000000000000
[  912.799415]  ffffffff81f06140 ffff880201d64e40 0000000000000000 ffff88023ea1804c
[  912.799418]  0000000000000000 ffff88023ea18044 ffff88023ea18030 ffff88023ea18038
[  912.799418] Call Trace:
[  912.799437]  [<ffffffff816d5685>] tc_ctl_tfilter+0x335/0x910
[  912.799443]  [<ffffffff813622a8>] ? security_capable+0x48/0x60
[  912.799448]  [<ffffffff816b90e5>] rtnetlink_rcv_msg+0x95/0x240
[  912.799454]  [<ffffffff810f612d>] ? trace_hardirqs_on+0xd/0x10
[  912.799456]  [<ffffffff816b902f>] ? rtnetlink_rcv+0x1f/0x40
[  912.799459]  [<ffffffff816b902f>] ? rtnetlink_rcv+0x1f/0x40
[  912.799461]  [<ffffffff816b9050>] ? rtnetlink_rcv+0x40/0x40
[  912.799464]  [<ffffffff816df38f>] netlink_rcv_skb+0xaf/0xc0
[  912.799467]  [<ffffffff816b903e>] rtnetlink_rcv+0x2e/0x40
[  912.799469]  [<ffffffff816deaef>] netlink_unicast+0xef/0x1b0
[  912.799471]  [<ffffffff816defa0>] netlink_sendmsg+0x3f0/0x620
[  912.799476]  [<ffffffff81687028>] sock_sendmsg+0x38/0x50
[  912.799479]  [<ffffffff81687938>] ___sys_sendmsg+0x288/0x290
[  912.799482]  [<ffffffff810f7852>] ? __lock_acquire+0x572/0x2050
[  912.799488]  [<ffffffff810265db>] ? native_sched_clock+0x2b/0x90
[  912.799493]  [<ffffffff8116135f>] ? __audit_syscall_entry+0xaf/0x100
[  912.799497]  [<ffffffff8116135f>] ? __audit_syscall_entry+0xaf/0x100
[  912.799501]  [<ffffffff8112aa19>] ? current_kernel_time+0x69/0xd0
[  912.799505]  [<ffffffff81266f16>] ? __fget_light+0x66/0x90
[  912.799508]  [<ffffffff81688812>] __sys_sendmsg+0x42/0x80
[  912.799510]  [<ffffffff81688862>] SyS_sendmsg+0x12/0x20
[  912.799515]  [<ffffffff817f9a6e>] entry_SYSCALL_64_fastpath+0x12/0x76
[  912.799540] Code: 4d 88 49 8b 57 08 48 89 51 08 49 8b 57 10 48 89 c8 48 83 c0 08 48
                     89 51 10 48 8b 51 10 48 c7 c6 c0 51 9d a0 48 c7 c7 a6 51 9d a0 <48>
                     89 02 48 8b 51 08 48 89 42 08 48 b8 00 02 20 00 00 00 ad de
[  912.799544] RIP  [<ffffffffa09d4d6d>] cls_bpf_change+0x23d/0x268 [cls_bpf]
[  912.799544]  RSP <ffff88005f7379c8>
[  912.799545] CR2: 0000000000000000
[  912.807380] ---[ end trace a6440067cfdc7c29 ]---

I've split them into 3 patches, so they can be backported easier
when needed.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:25:03 -07:00
Daniel Borkmann
32b2f4b196 sched: cls_flow: fix panic on filter replace
The following test case causes a NULL pointer dereference in cls_flow:

  tc filter add dev foo parent 1: handle 0x1 flow hash keys dst action ok
  tc filter replace dev foo parent 1: pref 49152 handle 0x1 \
            flow hash keys mark action drop

To be more precise, actually two different panics are fixed, the first
occurs because tcf_exts_init() is not called on the newly allocated
filter when we do a replace. And the second panic uncovered after that
happens since the arguments of list_replace_rcu() are swapped, the old
element needs to be the first argument and the new element the second.

Fixes: 70da9f0bf9 ("net: sched: cls_flow use RCU")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:25:03 -07:00
Daniel Borkmann
ff3532f265 sched: cls_flower: fix panic on filter replace
The following test case causes a NULL pointer dereference in cls_flower:

  tc filter add dev foo parent 1: flower eth_type ipv4 action ok flowid 1:1
  tc filter replace dev foo parent 1: pref 49152 handle 0x1 \
            flower eth_type ipv6 action ok flowid 1:1

The problem is that commit 77b9900ef5 ("tc: introduce Flower classifier")
accidentally swapped the arguments of list_replace_rcu(), the old
element needs to be the first argument and the new element the second.

Fixes: 77b9900ef5 ("tc: introduce Flower classifier")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:25:03 -07:00
Daniel Borkmann
f6bfc46da6 sched: cls_bpf: fix panic on filter replace
The following test case causes a NULL pointer dereference in cls_bpf:

  FOO="1,6 0 0 4294967295,"
  tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok
  tc filter replace dev foo parent 1: pref 49152 handle 0x1 \
            bpf bytecode "$FOO" flowid 1:1 action drop

The problem is that commit 1f947bf151 ("net: sched: rcu'ify cls_bpf")
accidentally swapped the arguments of list_replace_rcu(), the old
element needs to be the first argument and the new element the second.

Fixes: 1f947bf151 ("net: sched: rcu'ify cls_bpf")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:25:02 -07:00
David S. Miller
d64e78c9e7 Merge branch 'cxgb4-dcb'
Anish Bhatt says:

====================
cxgb4 DCB updates

The following patchset covers changes to work better with  the userspace
tools cgdcbxd and cgrulesengd and improves firmware support for
host-managed mode.

Also exports traffic class information that was previously not being
exported via dcbnl_ops and unfifies how app selector information is passed
to firmware.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:23:24 -07:00
Anish Bhatt
397665dab5 cxgb4 : Fill DCB priority in vlan control headers
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:23:23 -07:00
Anish Bhatt
8d6541b7bc cxgb4 : Fill in number of DCB traffic classes supported
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:23:23 -07:00
Anish Bhatt
a85c2eb311 cxgb4 : Allow firmware DCB info to be queried in host state
Since finally DCB traffic management is still handled by firmware,
allow firmware to be fully programmed and queried even in host
managed state for the cases where this was previously rejected.

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:23:23 -07:00
Anish Bhatt
a44e7b7311 cxgb4 : Only pass app selector of 0 or 3 to firmware
This keeps app format passed to firmware the same irrespective
of DCBx version in use.

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:23:23 -07:00
Marcelo Ricardo Leitner
b52effd263 sctp: fix cut and paste issue in comment
Cookie ACK is always received by the association initiator, so fix the
comment to avoid confusion.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:21:32 -07:00
David S. Miller
57816cbcb8 Merge branch 'sctp-src-addr'
Marcelo Ricardo Leitner says:

====================
sctp: fix src address selection if using secondary address

This series improves the way SCTP chooses its src address so that the
choosen one will always belong to the interface being used for output.

v1->v2:
 - split out the refactoring from the fix itself
 - Doing a full reverse routing as in v1 is not necessary. Only looking
   for the interface that has the address and comparing its number is
   enough.
====================

Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:20:18 -07:00
Marcelo Ricardo Leitner
0ca50d12fe sctp: fix src address selection if using secondary addresses
In short, sctp is likely to incorrectly choose src address if socket is
bound to secondary addresses. This patch fixes it by adding a new check
that checks if such src address belongs to the interface that routing
identified as output.

This is enough to avoid rp_filter drops on remote peer.

Details:

Currently, sctp will do a routing attempt without specifying the src
address and compare the returned value (preferred source) with the
addresses that the socket is bound to. When using secondary addresses,
this will not match.

Then it will try specifying each of the addresses that the socket is
bound to and re-routing, checking if that address is valid as src for
that dst. Thing is, this check alone is weak:

# ip r l
192.168.100.0/24 dev eth1  proto kernel  scope link  src 192.168.100.149
192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.147

# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
       valid_lft 2160sec preferred_lft 2160sec
    inet 192.168.122.148/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe15:186a/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
       valid_lft 2162sec preferred_lft 2162sec
    inet 192.168.100.148/24 scope global secondary eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb3:9146/64 scope link
       valid_lft forever preferred_lft forever
4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5054:ff:fe05:47ee/64 scope link
       valid_lft forever preferred_lft forever

# ip r g 192.168.100.193 from 192.168.122.148
192.168.100.193 from 192.168.122.148 dev eth1
    cache

Even if you specify an interface:

# ip r g 192.168.100.193 from 192.168.122.148 oif eth1
192.168.100.193 from 192.168.122.148 dev eth1
    cache

Although this would be valid, peers using rp_filter will drop such
packets as their src doesn't match the routes for that interface.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-07-21 00:20:17 -07:00