Commit Graph

519203 Commits

Author SHA1 Message Date
Pablo Neira Ayuso
55917a21d0 netfilter: x_tables: add context to know if extension runs from nft_compat
Currently, we have four xtables extensions that cannot be used from the
xt over nft compat layer. The problem is that they need real access to
the full blown xt_entry to validate that the rule comes with the right
dependencies. This check was introduced to overcome the lack of
sufficient userspace dependency validation in iptables.

To resolve this problem, this patch introduces a new field to the
xt_tgchk_param structure that tell us if the extension is run from
nft_compat context.

The three affected extensions are:

1) CLUSTERIP, this target has been superseded by xt_cluster. So just
   bail out by returning -EINVAL.

2) TCPMSS. Relax the checking when used from nft_compat. If used with
   the wrong configuration, it will corrupt !syn packets by adding TCP
   MSS option.

3) ebt_stp. Relax the check to make sure it uses the reserved
   destination MAC address for STP.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Tested-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
2015-05-15 20:14:07 +02:00
Zhang Chunyu
12b7ed29bd netfilter: xt_MARK: Add ARP support
Add arpt_MARK to xt_mark.

The corresponding userspace update is available at:

http://git.netfilter.org/arptables/commit/?id=4bb2f8340783fd3a3f70aa6f8807428a280f8474

Signed-off-by: Zhang Chunyu <zhangcy@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-14 13:00:27 +02:00
Denys Vlasenko
a3b1c1eb50 netfilter: ipset: deinline ip_set_put_extensions()
On x86 allyesconfig build:
The function compiles to 489 bytes of machine code.
It has 25 callsites.

    text    data       bss       dec     hex filename
82441375 22255384 20627456 125324215 7784bb7 vmlinux.before
82434909 22255384 20627456 125317749 7783275 vmlinux

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: David S. Miller <davem@davemloft.net>
CC: Jan Engelhardt <jengelh@medozas.de>
CC: Jiri Pirko <jpirko@redhat.com>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: netfilter-devel@vger.kernel.org
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-14 12:51:19 +02:00
Florian Westphal
a9fcc6a41d netfilter: bridge: free nf_bridge info on xmit
nf_bridge information is only needed for -m physdev, so we can always free
it after POST_ROUTING.  This has the advantage that allocation and free will
typically happen on the same cpu.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-14 12:43:49 +02:00
Florian Westphal
7fb48c5bc3 netfilter: bridge: neigh_head and physoutdev can't be used at same time
The neigh_header is only needed when we detect DNAT after prerouting
and neigh cache didn't have a mac address for us.

The output port has not been chosen yet so we can re-use the storage
area, bringing struct size down to 32 bytes on x86_64.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-14 12:43:48 +02:00
Jozsef Kadlecsik
a9756e6f63 netfilter: ipset: Use better include files in xt_set.c
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 18:21:13 +02:00
Sergey Popovich
1823fb79e5 netfilter: ipset: Improve preprocessor macros checks
Check if mandatory MTYPE, HTYPE and HOST_MASK macros
defined.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 18:21:13 +02:00
Sergey Popovich
58cc06daea netfilter: ipset: Fix hashing for ipv6 sets
HKEY_DATALEN remains defined after first inclusion
of ip_set_hash_gen.h, so it is incorrectly reused
for IPv6 code.

Undefine HKEY_DATALEN in ip_set_hash_gen.h at the end.

Also remove some useless defines of HKEY_DATALEN in
ip_set_hash_{ip{,mark,port},netiface}.c as ip_set_hash_gen.h
defines it correctly for such set types anyway.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 18:21:12 +02:00
Sergey Popovich
275e2bc0f2 netfilter: ipset: Fix ext_*() macros
So pointers returned by these macros could be
referenced with -> directly.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 18:21:07 +02:00
Sergey Popovich
037261866c netfilter: ipset: Check for comment netlink attribute length
Ensure userspace supplies string not longer than
IPSET_MAX_COMMENT_SIZE.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:47 +02:00
Sergey Popovich
728a7e6903 netfilter: ipset: Return bool values instead of int
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:47 +02:00
Sergey Popovich
cabfd139aa netfilter: ipset: Use HOST_MASK literal to represent host address CIDR len
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:47 +02:00
Sergey Popovich
d25472e470 netfilter: ipset: Check IPSET_ATTR_PORT only once
We do not need to check tb[IPSET_ATTR_PORT] != NULL before
retrieving port, as this attribute is known to exist due to
ip_set_attr_netorder() returning true only when attribute
exists and it is in network byte order.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:46 +02:00
Sergey Popovich
8e55d2e590 netfilter: ipset: Return ipset error instead of bool
Statement ret = func1() || func2() returns 0 when both func1()
and func2() return 0, or 1 if func1() or func2() returns non-zero.

However in our case func1() and func2() returns error code on
failure, so it seems good to propagate such error codes, rather
than returning 1 in case of failure.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:46 +02:00
Sergey Popovich
43ef29c91a netfilter: ipset: Preprocessor directices cleanup
* Undefine mtype_data_reset_elem before defining.

 * Remove duplicated mtype_gc_init undefine, move
   mtype_gc_init define closer to mtype_gc define.

 * Use htype instead of HTYPE in IPSET_TOKEN(HTYPE, _create)().

 * Remove PF definition from sets: no more used.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:46 +02:00
Sergey Popovich
2b67d6e01d netfilter: ipset: No need to make nomatch bitfield
We do not store cidr packed with no match, so there is no
need to make nomatch bitfield.

This simplifies mtype_data_reset_flags() a bit.

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:45 +02:00
Sergey Popovich
caed0ed35b netfilter: ipset: Properly calculate extensions offsets and total length
Offsets and total length returned by the ip_set_elem_len()
calculated incorrectly as initial set element length (i.e.
len parameter) is used multiple times in offset calculations,
also affecting set element total length.

Use initial set element length as start offset, do not add aligned
extension offset to the offset. Return offset as total length of
the set element.

This reduces memory requirements on per element basic for the
hash:* type of sets.

For example output from 'ipset -terse list test-1' on 64-bit PC,
where test-1 is generated via following script:

  #!/bin/bash

  set_name='test-1'

  ipset create "$set_name" hash:net family inet \
              timeout 10800 counters comment \
              hashsize 65536 maxelem 65536

  declare -i o3 o4
  fmt="add $set_name 192.168.%u.%u\n"

  for ((o3 = 0; o3 < 256; o3++)); do
      for ((o4 = 0; o4 < 256; o4++)); do
          printf "$fmt" $o3 $o4
      done
  done |ipset -exist restore

BEFORE this patch is applied

  # ipset -terse list test-1
  Name: test-1
  Type: hash:net
  Revision: 6
  Header: family inet hashsize 65536 maxelem 65536
timeout 10800 counters comment
  Size in memory: 26348440

and AFTER applying patch

  # ipset -terse list test-1
  Name: test-1
  Type: hash:net
  Revision: 6
  Header: family inet hashsize 65536 maxelem 65536
timeout 10800 counters comment
  Size in memory: 7706392
  References: 0

Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:45 +02:00
Alexander Drozdov
3e4e8d126c netfilter: ipset: make ip_set_get_ip*_port to use skb_network_offset
All the ipset functions respect skb->network_header value,
except for ip_set_get_ip4_port() & ip_set_get_ip6_port(). The
functions should use skb_network_offset() to get the transport
header offset.

Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:45 +02:00
Jozsef Kadlecsik
22496f098b netfilter: ipset: Give a better name to a macro in ip_set_core.c
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:44 +02:00
Jozsef Kadlecsik
2006aa4a8c netfilter: ipset: Fix sparse warning
"warning: cast to restricted __be32" warnings are fixed

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13 13:25:44 +02:00
Ying Xue
9449c3cd90 net: make skb_dst_pop routine static
As xfrm_output_one() is the only caller of skb_dst_pop(), we should
make skb_dst_pop() localized.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:19:49 -04:00
Michael Holzheu
cffc642d93 test_bpf: add 173 new testcases for eBPF
add an exhaustive set of eBPF tests bringing total to:
test_bpf: Summary: 233 PASSED, 0 FAILED, [0/226 JIT'ed]

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:15:25 -04:00
Brenden Blanco
b88c06e36d samples/bpf: fix in-source build of samples with clang
in-source build of 'make samples/bpf/' was incorrectly
using default compiler instead of invoking clang/llvm.
out-of-source build was ok.

Fixes: a80857822b ("samples: bpf: trivial eBPF program in C")
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:15:25 -04:00
Hariprasad Shenai
1ecc7b7a59 cxgb4/cxgb4vf: Cleanup macros, add comments and add new MACROS
Cleanup few MACROS left out in t4_hw.h to be consistent with the
existing ones. Also replace few hardcoded values with MACROS. Also
update comments for some code

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:11:40 -04:00
KY Srinivasan
82fa3c776e hv_netvsc: Use the xmit_more skb flag to optimize signaling the host
Based on the information given to this driver (via the xmit_more skb flag),
we can defer signaling the host if more packets are on the way. This will help
make the host more efficient since it can potentially process a larger batch of
packets. Implement this optimization.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:10:43 -04:00
Alexei Starovoitov
9eea922264 pktgen: fix packet generation
pkt_gen->last_ok was not set properly, so after the first burst
pktgen instead of allocating new packet, will reuse old one, advance
eth_type_trans further, which would mean the stack will be seeing very
short bogus packets.

Fixes: 62f64aed62 ("pktgen: introduce xmit_mode '<start_xmit|netif_receive>'")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:09:52 -04:00
David S. Miller
147ef3e218 Merge branch 'systemport-irq-coalesce'
Florian Fainelli says:

====================
net: systemport: interrupt coalescing support

This patch series adds support for RX & TX interrupt coalescing in the
systemport driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:08:47 -04:00
Florian Fainelli
d0634868d3 net: systemport: Implement RX coalescing control knobs
Similarly to the TX path, allow the RX path to be configured with both
'rx-frames' and 'rx-usecs' coalescing parameters.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:08:46 -04:00
Florian Fainelli
b1a15e8643 net: systemport: Implement TX coalescing control knobs
Add the ability to configure both 'tx-frames' which controls how many frames
are doing to trigger a single interrupt and 'tx-usecs' which dictates how long
to wait before an interrupt should be services.

Since our timer resolution is close to 8.192 us, we round up to the nearest
value the 'tx-usecs' timeout value.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:08:46 -04:00
Denys Vlasenko
a2029240e5 net: deinline netif_tx_stop_all_queues(), remove WARN_ON in netif_tx_stop_queue()
These functions compile to 60 bytes of machine code each.
With this .config: http://busybox.net/~vda/kernel_config
there are 617 calls of netif_tx_stop_queue()
and 49 calls of netif_tx_stop_all_queues() in vmlinux.

To fix this, remove WARN_ON in netif_tx_stop_queue()
as suggested by davem, and deinline netif_tx_stop_all_queues().

Change in code size is about 20k:

   text      data      bss       dec     hex filename
82426986 22255416 20627456 125309858 77813a2 vmlinux.before
82406248 22255416 20627456 125289120 777c2a0 vmlinux

gcc-4.7.2 still creates deinlined version of netif_tx_stop_queue
sometimes:

$ nm --size-sort vmlinux | grep netif_tx_stop_queue | wc -l
190

ffffffff81b558a8 <netif_tx_stop_queue>:
ffffffff81b558a8:       55                      push   %rbp
ffffffff81b558a9:       48 89 e5                mov    %rsp,%rbp
ffffffff81b558ac:       f0 80 8f e0 01 00 00    lock orb $0x1,0x1e0(%rdi)
ffffffff81b558b3:       01
ffffffff81b558b4:       5d                      pop    %rbp
ffffffff81b558b5:       c3                      retq

This needs additional fixing.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Alexei Starovoitov <alexei.starovoitov@gmail.com>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Joe Perches <joe@perches.com>
CC: David S. Miller <davem@davemloft.net>
CC: Jiri Pirko <jpirko@redhat.com>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: netfilter-devel@vger.kernel.org
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:05:35 -04:00
Justin Cormack
b508208339 macvtap add missing ioctls - fix wrapping
The macvtap driver tries to emulate all the ioctls supported by a normal
tun/tap driver, however it was missing the generic SIOCGIFHWADDR and
SIOCSIFHWADDR ioctls to get and set the mac address that are supported
by tun/tap. This patch adds these.

Signed-off-by: Justin Cormack <justin@netbsd.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 23:01:01 -04:00
David S. Miller
a62b70ddd1 Merge branch 'switchdev_spring_cleanup'
Scott Feldman says:

====================
switchdev: spring cleanup

v7:

Address review comments:

 - [Jiri] split the br_setlink and br_dellink reverts into their own patches
 - [Jiri] some parameter cleanup of rocker's memory allocators
 - [Jiri] pass trans mode as formal parameter rather than hanging off of
     rocker_port.

v6:

Address review comments:

 - [Jiri] split a couple of patches into one-logical-change per patch
 - [Joe Perches] revert checkpatch -f changes for wrapped lines with long
     symbols.

v5:

Address review comments:

 - [Jiri] include Jiri's s/swdev/switchdev rename patches up front.
 - [Jiri] squash some patches.  Now setlink/dellink/getlink patches are in
     three parts: new implementation, convert drivers to new, delete old impl.
 - [Jiri] some minor variable renames
 - [Jiri] use BUG_ON rather than WARN when COMMIT phase fails when PREPARE
     phase said it was safe to come into the water.
 - [Simon] rocker: fix a few transaction prepare-commit cases that were wrong.
     This was the bulk of the changes in v5.

v4:

Well, it was a lot of work, but now prepare-commit transaction model is how
davem advises: if prepare fails, abort the transaction.  The driver must do
resource reservations up front in prepare phase and return those resources if
aborting.  Commit phase would use reserved resources.  The good news is the
driver code (for rocker) now handles resource allocation failures better by not
leaving partially device or driver states.  This is a side-effect of the
prepare phase where state isn't modified; only validation of inputs and
resource reservations happen in the prepare phase.  Since we're supporting
setting attrs and add objs across lower devs in the stacked case, we need to
hold rtnl_lock (or ensure rtnl_lock is held) so lower devs don't move on us
during the prepare-commit transaction.  DSA driver code skips the prepare phase
and goes straight for the commit phase since no up-front allocations are done
and no device failures (that could be detected in the prepare phase) can
happen.

Remove NETIF_F_HW_SWITCH_OFFLOAD from rocker and the swdev_attr_set/get
wrappers.  DSA doesn't set NETIF_F_HW_SWITCH_OFFLOAD, so it can't be in
swdev_attr_set/get.  rocker doesn't need it; or rather can't support
NETIF_F_HW_SWITCH_OFFLOAD being set/cleared at run-time after the device
port is already up and offloading L2/L3.  NETIF_F_HW_SWITCH_OFFLOAD is still
left as a feature flag for drivers that can use it.

Drop the renaming patch for netdev_switch_notifier.  Other renames are a
result of moving to the attr get/set or obj add/del model.  Everything
but the netdev_switch_notifier is still prefixed with "swdev_".

v3:

Move to two-phase prepare-commit transaction model for attr set and obj add.
Driver gets a change in prepare phase to NACK transaction if lack of resources
or support in device.

v2:

Address review comments:

 - [Jiri] squash a few related patches
 - [Roopa] don't remove NETIF_F_HW_SWITCH_OFFLOAD
 - [Roopa] address VLAN setlink/dellink
 - [Ronen] print warning is attr set revert fails

Not address:

 - Using something other than "swdev_" prefix
 - Vendor extentions

The patch set grew a bit to not only support port attr get/set but also add
support for port obj add/del.  Example of port objs are VLAN, FDB entries, and
FIB entries.  The VLAN support now allows the swdev driver to get VLAN ranges
and flags like PVID and "untagged".  Sridhar will be adding FDB obj support
in follow-on patch.

v1:

The main theme of this patch set is to cleanup swdev in preparation for
new features or fixes to be added soon.  We have a pretty good idea now how
to handle stacked drivers in swdev, but there where some loose ends.  For
example, if a set failed in the middle of walking the lower devs, we would
leave the system in an undefined state...there was no way to recover back to
the previous state.  Speaking of sets, also recognize a pattern that most
swdev API accesses are gets or sets of port attributes, so go ahead and make
port attr get/set the central swdev API, and convert everything that is
set-ish/get-ish to this new API.

Features/fixes that should follow from this cleanup:

 - solve the duplicate pkt forwarding issue
 - get/set bridge attrs, like ageing_time, from/to device
 - get/set more bridge port attrs from/to device

There are some rename cleanups tagging along at the end, to give swdev
consistent naming.

And finally, some much needed updates to the switchdev.txt documentation to
hopefully capture the state-of-the-art of swdev.  Hopefully, we can do a better
job keeping this document up-to-date.

Tested with rocker, of course, to make sure nothing functional broke.  There
are a couple minor tweaks to DSA code for getting switch ID and setting STP
updates to use new API, but not expecting amy breakage there.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:56 -04:00
Scott Feldman
4ceec22d6d switchdev: bring documentation up-to-date
Much need updated of switchdev documentation to cover what's been
implmented to-date.  There are some XXX comments in the text for
unimplemented or broken items.  I'd like to keep these in there (poor-man's
TODO list) and update the document once each issue is resolved.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:56 -04:00
Scott Feldman
4725ceb9b7 rocker: make checkpatch -f clean
Well almost clean: ignore the CHECKs for space after cast operator and some
longer-than-80 char cases where for readability it's better to keep as-is.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:56 -04:00
Scott Feldman
7889cbee83 switchdev: remove NETIF_F_HW_SWITCH_OFFLOAD feature flag
Roopa said remove the feature flag for this series and she'll work on
bringing it back if needed at a later date.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman
58c2cb16b1 switchdev: convert fib_ipv4_add/del over to switchdev_port_obj_add/del
The IPv4 FIB ops convert nicely to the switchdev objs and we're left with
only four switchdev ops: port get/set and port add/del.  Other objs will
follow, such as FDB.  So go ahead and convert IPv4 FIB over to switchdev
obj for consistency, anticipating more objs to come.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman
85fdb95672 switchdev: cut over to new switchdev_port_bridge_getlink
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman
8793d0a664 switchdev: add new switchdev_port_bridge_getlink
Like bridge_setlink, add switchdev wrapper to handle bridge_getlink and
call into port driver to get port attrs.  For now, only BR_LEARNING and
BR_LEARNING_SYNC are returned.  To add more, we'll probably want to break
away from ndo_dflt_bridge_getlink() and build the netlink skb directly in
the switchdev code.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman
8508025c59 bridge: revert br_dellink change back to original
This is revert of:

commit 68e331c785 ("bridge: offload bridge port attributes to switch asic
if feature flag set")

Restore br_dellink back to original and don't call into SELF port driver.
rtnetlink.c:bridge_dellink() already does a call into port driver for SELF.

bridge vlan add/del cmd defaults to MASTER.  From man page for bridge vlan
add/del cmd:

       self   the vlan is configured on the specified physical device.
              Required if the device is the bridge device.

       master the vlan is configured on the software bridge (default).

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman
87a5dae59e switchdev: remove unused switchdev_port_bridge_dellink
Now we can remove old wrappers for dellink.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman
54ba5a0bbc switchdev: cut over to new switchdev_port_bridge_dellink
Rocker, bonding and team and switch over to the new
switchdev_port_bridge_dellink to avoid duplicating code in each driver.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman
5c34e02214 switchdev: add new switchdev_port_bridge_dellink
Same change as setlink.  Provide the wrapper op for SELF ndo_bridge_dellink
and call into the switchdev driver to delete afspec VLANs.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:55 -04:00
Scott Feldman
41c498b935 bridge: restore br_setlink back to original
This is revert of:

commit 68e331c785 ("bridge: offload bridge port attributes to switch asic
if feature flag set")

Restore br_setlink back to original and don't call into SELF port driver.
rtnetlink.c:bridge_setlink() already does a call into port driver for SELF.

bridge set link cmd defaults to MASTER.  From man page for bridge link set
cmd:

       self   link setting is configured on specified physical device

       master link setting is configured on the software bridge (default)

The link setting has two values: the device-side value and the software
bridge-side value.  These are independent and settable using the bridge
link set cmd by specifying some combination of [master] | [self].
Furthermore, the device-side and bridge-side settings have their own
initial value, viewable from bridge -d link show cmd.

Restoring br_setlink back to original makes rocker (the only in-kernel user
of SELF link settings) work as first implement: two-sided values.

It's true that when both MASTER and SELF are specified from the command,
two netlink notifications are generated, one for each side of the settings.
The user-space app can distiquish between the two notifications by
observing the MASTER or SELF flag.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:54 -04:00
Scott Feldman
e71f220b34 switchdev: remove old switchdev_port_bridge_setlink
New attr-based bridge_setlink can recurse lower devs and recover on err, so
remove old wrapper (including ndo_dflt_switchdev_port_bridge_setlink).

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:54 -04:00
Scott Feldman
fc8f40d864 switchdev: cut over to new switchdev_port_bridge_setlink
Rocker, bonding, and team can now use the switchdev bridge setlink to parse
raw netlink; no need to duplicate this code in each driver.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:54 -04:00
Scott Feldman
47f8328bb1 switchdev: add new switchdev bridge setlink
Add new switchdev_port_bridge_setlink that can be used by drivers
implementing .ndo_bridge_setlink to set switchdev bridge attributes.
Basically turn the raw rtnl_bridge_setlink netlink into switchdev attr
sets.  Proper netlink attr policy checking is done on the protinfo part of
the netlink msg.

Currently, for protinfo, only bridge port attrs BR_LEARNING and
BR_LEARNING_SYNC are parsed and passed to port driver.

For afspec, VLAN objs are passed so switchdev driver can set VLANs assigned
to SELF.  To illustrate with iproute2 cmd, we have:

	bridge vlan add vid 10 dev sw1p1 self master

To add VLAN 10 to port sw1p1 for both the bridge (master) and the device
(self).

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:54 -04:00
Scott Feldman
6004c86718 switchdev: add bridge port flags attr
rocker: use switchdev get/set attr for bridge port flags

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:54 -04:00
Scott Feldman
9228ad26ab rocker: use switchdev add/del obj for bridge port vlans
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:54 -04:00
Scott Feldman
6fc3016da7 switchdev: add port vlan obj
VLAN obj has flags (PVID and untagged) as well as start and end vid ranges.
The switchdev driver can optimize programing the device using the ranges.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:53 -04:00
Scott Feldman
491d0f1533 switchdev: introduce switchdev add/del obj ops
Like switchdev attr get/set, add new switchdev obj add/del.  switchdev objs
will be things like VLANs or FIB entries, so add/del fits better for
objects than get/set used for attributes.

Use same two-phase prepare-commit transaction model as in attr set.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-12 18:43:53 -04:00