Commit Graph

1819 Commits

Author SHA1 Message Date
Pablo Neira Ayuso
24477e5741 uapi: ebtables: don't include linux/if.h
linux/if.h creates conflicts in userspace with net/if.h

By using it here we force userspace to use linux/if.h while
net/if.h may be needed.

Note that:

include/linux/netfilter_ipv4/ip_tables.h and
include/linux/netfilter_ipv6/ip6_tables.h

don't include linux/if.h and they also refer to IFNAMSIZ, so they are
expecting userspace to include use net/if.h from the client program.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 20:08:38 +02:00
Patrick McHardy
7d7402642e netfilter: nf_tables: variable sized set element keys / data
This patch changes sets to support variable sized set element keys / data
up to 64 bytes each by using variable sized set extensions. This allows
to use concatenations with bigger data items suchs as IPv6 addresses.

As a side effect, small keys/data now don't require the full 16 bytes
of struct nft_data anymore but just the space they need.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:31 +02:00
Patrick McHardy
49499c3e6e netfilter: nf_tables: switch registers to 32 bit addressing
Switch the nf_tables registers from 128 bit addressing to 32 bit
addressing to support so called concatenations, where multiple values
can be concatenated over multiple registers for O(1) exact matches of
multiple dimensions using sets.

The old register values are mapped to areas of 128 bits for compatibility.
When dumping register numbers, values are expressed using the old values
if they refer to the beginning of a 128 bit area for compatibility.

To support concatenations, register loads of less than a full 32 bit
value need to be padded. This mainly affects the payload and exthdr
expressions, which both unconditionally zero the last word before
copying the data.

Userspace fully passes the testsuite using both old and new register
addressing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:29 +02:00
David S. Miller
e60a9de49c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-04-11

This series contains updates to iflink, ixgbe and ixgbevf.

The entire set of changes come from Vlad Zolotarov to ultimately add
the ethtool ops to VF driver to allow querying the RSS indirection table
and RSS random key.

Currently we support only 82599 and x540 devices.  On those devices, VFs
share the RSS redirection table and hash key with a PF.  Letting the VF
query this information may introduce some security risks, therefore this
feature will be disabled by default.

The new netdev op allows a system administrator to change the default
behaviour with "ip link set" command.  The relevant iproute2 patch has
already been sent and awaits for this series upstream.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:36:57 -04:00
WANG Cong
7a6c8c34e5 fou: implement FOU_CMD_GET
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:25:13 -04:00
Vlad Zolotarov
01a3d79681 if_link: Add an additional parameter to ifla_vf_info for RSS querying
Add configuration setting for drivers to allow/block an RSS Redirection
Table and a Hash Key querying for discrete VFs.

On some devices VF share the mentioned above information with PF and
querying it may adduce a theoretical security risk. We want to let a
system administrator to decide if he/she wants to take this risk or not.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-10 21:57:22 -07:00
Pablo Neira Ayuso
aadd51aa71 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Resolve conflicts between 5888b93 ("Merge branch 'nf-hook-compress'") and
Florian Westphal br_netfilter works.

Conflicts:
        net/bridge/br_netfilter.c

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 18:30:21 +02:00
Patrick McHardy
68e942e88a netfilter: nf_tables: support optional userdata for set elements
Add an userdata set extension and allow the user to attach arbitrary
data to set elements. This is intended to hold TLV encoded data like
comments or DNS annotations that have no meaning to the kernel.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy
22fe54d5fe netfilter: nf_tables: add support for dynamic set updates
Add a new "dynset" expression for dynamic set updates.

A new set op ->update() is added which, for non existant elements,
invokes an initialization callback and inserts the new element.
For both new or existing elements the extenstion pointer is returned
to the caller to optionally perform timer updates or other actions.

Element removal is not supported so far, however that seems to be a
rather exotic need and can be added later on.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Nicolas Dichtel
9a9634545c netns: notify netns id events
With this patch, netns ids that are created and deleted are advertised into the
group RTNLGRP_NSID.

Because callers of rtnl_net_notifyid() already know the id of the peer, there is
no need to call __peernet2id() in rtnl_net_fill().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 17:29:41 -04:00
David S. Miller
c85d6975ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mellanox/mlx4/cmd.c
	net/core/fib_rules.c
	net/ipv4/fib_frontend.c

The fib_rules.c and fib_frontend.c conflicts were locking adjustments
in 'net' overlapping addition and removal of code in 'net-next'.

The mlx4 conflict was a bug fix in 'net' happening in the same
place a constant was being replaced with a more suitable macro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 22:34:15 -04:00
Alexei Starovoitov
91bc4822c3 tc: bpf: add checksum helpers
Commit 608cd71a9c ("tc: bpf: generalize pedit action") has added the
possibility to mangle packet data to BPF programs in the tc pipeline.
This patch adds two helpers bpf_l3_csum_replace() and bpf_l4_csum_replace()
for fixing up the protocol checksums after the packet mangling.

It also adds 'flags' argument to bpf_skb_store_bytes() helper to avoid
unnecessary checksum recomputations when BPF programs adjusting l3/l4
checksums and documents all three helpers in uapi header.

Moreover, a sample program is added to show how BPF programs can make use
of the mangle and csum helpers.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 16:42:35 -04:00
Linus Torvalds
1cced5015b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
 "A fix for ALPS driver for issue introduced in the latest update and a
  tweak for yet another Lenovo box in Synaptics.

  There will be more ALPS tweaks coming.."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: define INPUT_PROP_ACCELEROMETER behavior
  Input: synaptics - fix min-max quirk value for E440
  Input: synaptics - add quirk for Thinkpad E440
  Input: ALPS - fix max coordinates for v5 and v7 protocols
  Input: add MT_TOOL_PALM
2015-04-03 14:58:48 -07:00
Daniel Borkmann
bcad571824 ebpf: add skb->priority to offset map for usage in {cls, act}_bpf
This adds the ability to read out the skb->priority from an eBPF
program, so that it can be taken into account from a tc filter
or action for the use-case where the priority is not being used
to directly override the filter classification in a qdisc, but
to tag traffic otherwise for the classifier; the priority can be
assigned from various places incl. user space, in future we may
also mangle it from an eBPF program.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 14:59:15 -04:00
David S. Miller
9f0d34bc34 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/asix_common.c
	drivers/net/usb/sr9800.c
	drivers/net/usb/usbnet.c
	include/linux/usb/usbnet.h
	net/ipv4/tcp_ipv4.c
	net/ipv6/tcp_ipv6.c

The TCP conflicts were overlapping changes.  In 'net' we added a
READ_ONCE() to the socket cached RX route read, whilst in 'net-next'
Eric Dumazet touched the surrounding code dealing with how mini
sockets are handled.

With USB, it's a case of the same bug fix first going into net-next
and then I cherry picked it back into net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:16:53 -04:00
Linus Torvalds
1e848913f0 Merge branch 'for-4.0' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
 "Two main issues:

   - We found that turning on pNFS by default (when it's configured at
     build time) was too aggressive, so we want to switch the default
     before the 4.0 release.

   - Recent client changes to increase open parallelism uncovered a
     serious bug lurking in the server's open code.

  Also fix a krb5/selinux regression.

  The rest is mainly smaller pNFS fixes"

* 'for-4.0' of git://linux-nfs.org/~bfields/linux:
  sunrpc: make debugfs file creation failure non-fatal
  nfsd: require an explicit option to enable pNFS
  NFSD: Fix bad update of layout in nfsd4_return_file_layout
  NFSD: Take care the return value from nfsd4_encode_stateid
  NFSD: Printk blocklayout length and offset as format 0x%llx
  nfsd: return correct lockowner when there is a race on hash insert
  nfsd: return correct openowner when there is a race to put one in the hash
  NFSD: Put exports after nfsd4_layout_verify fail
  NFSD: Error out when register_shrinker() fail
  NFSD: Take care the return value from nfsd4_decode_stateid
  NFSD: Check layout type when returning client layouts
  NFSD: restore trace event lost in mismerge
2015-04-01 09:45:47 -07:00
Oliver Hartkopp
a5581ef4c2 can: introduce new raw socket option to join the given CAN filters
The CAN_RAW socket can set multiple CAN identifier specific filters that lead
to multiple filters in the af_can.c filter processing. These filters are
indenpendent from each other which leads to logical OR'ed filters when applied.

This socket option joines the given CAN filters in the way that only CAN frames
are passed to user space that matched *all* given CAN filters. The semantic for
the applied filters is therefore changed to a logical AND.

This is useful especially when the filterset is a combination of filters where
the CAN_INV_FILTER flag is set in order to notch single CAN IDs or CAN ID
ranges from the incoming traffic.

As the raw_rcv() function is executed from NET_RX softirq the introduced
variables are implemented as per-CPU variables to avoid extensive locking at
CAN frame reception time.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-04-01 11:28:22 +02:00
Patrick McHardy
c3e1b005ed netfilter: nf_tables: add set element timeout support
Add API support for set element timeouts. Elements can have a individual
timeout value specified, overriding the sets' default.

Two new extension types are used for timeouts - the timeout value and
the expiration time. The timeout value only exists if it differs from
the default value.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-01 11:17:28 +02:00
Patrick McHardy
761da2935d netfilter: nf_tables: add set timeout API support
Add set timeout support to the netlink API. Sets with timeout support
enabled can have a default timeout value and garbage collection interval
specified.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-01 11:17:28 +02:00
David S. Miller
7b6249bba9 Lots of updates for net-next; along with the usual flurry
of small fixes, cleanups and internal features we have:
  * VHT support for TDLS and IBSS (conditional on drivers though)
  * first TX performance improvements (the biggest will come later)
  * many suspend/resume (race) fixes
  * name_assign_type support from Tom Gundersen
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJVGWlNAAoJEDBSmw7B7bqr5s8P/R9+Q6y4Ixice9dDOJYynl/d
 dMEUEfCBWUyDaQD1bNQIED2mc0sM5+Ax8OVIVx9fdrLGPxaISBqDJKE1USoTNZzm
 q+U3dM4Q45SfQSsgaY4FtxTlPWPtUKsNMXY/CxLR+IqVeA3+30rX+hv1f3SAqBj0
 68IwW/uUIHb71IZ+hz2Mwudt4JVW8KRg9VlZ0UY6EEvC4m5QD2YkwQQo/hJ2WF+/
 wAJbku02L/Vy4J7E6hNcKYWXokht4fVYphjl/1ZDd/+8L8SUv9mC88n1Jzxa428p
 1PmbtwzbpOrtTcC2BCPDA04IyfMc7k9DlLKw/h2KLPbHZXheD9eVmo/Am5vz+uH6
 926f+FK339SzoJnZ5wBBDiZ8W8TLYNc8ImxtcxjnrtGfr1CKiuh23P1CWyOlKJCO
 BYFJqkCOqWOHYnN0embaj7JqM/LmQI5ZoBZHZhD2KQXIXpTsjjIMPfJvip5D+tsV
 +iXIlQwdeK6rbjxdonBgn7n57XIeSVMAYeyDCbzIShfibjHbUZPn+RsZCtv8RWv8
 EaZu8PerU5ZDKwdX940+lWrtf/TMDJBYQpAIBRuiZK4DTNWCt3BrDlvb1FXGgA+X
 vQJnr32vjJ/pLDxNLHQlkKWC4I/CYtG47OgcJN9AQXrig1zApd+C29zy3aqch3ea
 wxV9dFfheTqZFjtZfSsH
 =O/cf
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2015-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Lots of updates for net-next; along with the usual flurry
of small fixes, cleanups and internal features we have:
 * VHT support for TDLS and IBSS (conditional on drivers though)
 * first TX performance improvements (the biggest will come later)
 * many suspend/resume (race) fixes
 * name_assign_type support from Tom Gundersen
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 16:39:04 -04:00
Jiri Benc
15e318bdc6 xfrm: simplify xfrm_address_t use
In many places, the a6 field is typecasted to struct in6_addr. As the
fields are in union anyway, just add in6_addr type to the union and
get rid of the typecasting.

Modifying the uapi header is okay, the union has still the same size.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-31 13:58:35 -04:00
Christoph Hellwig
f3f03330de nfsd: require an explicit option to enable pNFS
Turns out sending out layouts to any client is a bad idea if they
can't get at the storage device, so require explicit admin action
to enable pNFS.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-30 16:05:26 -04:00
Luciano Coelho
3a323d4e17 nl80211: small clarification of the sched_scan delay attribute
Just clarify that the delay is only before the first cycle.

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-03-30 10:25:35 +02:00
Alexei Starovoitov
608cd71a9c tc: bpf: generalize pedit action
existing TC action 'pedit' can munge any bits of the packet.
Generalize it for use in bpf programs attached as cls_bpf and act_bpf via
bpf_skb_store_bytes() helper function.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29 13:26:54 -07:00
Michal Sekletar
27cd545247 filter: introduce SKF_AD_VLAN_TPID BPF extension
If vlan offloading takes place then vlan header is removed from frame
and its contents, both vlan_tci and vlan_proto, is available to user
space via TPACKET interface. However, only vlan_tci can be used in BPF
filters.

This commit introduces a new BPF extension. It makes possible to load
the value of vlan_proto (vlan TPID) to register A. Support for classic
BPF and eBPF is being added, analogous to skb->protocol.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jiri Pirko <jpirko@redhat.com>

Signed-off-by: Michal Sekletar <msekleta@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-24 15:25:15 -04:00
Hannes Frederic Sowa
64236f3f3d ipv6: introduce IFA_F_STABLE_PRIVACY flag
We need to mark appropriate addresses so we can do retries in case their
DAD failed.

Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 22:12:09 -04:00
Hannes Frederic Sowa
622c81d57b ipv6: generation of stable privacy addresses for link-local and autoconf
This patch implements the stable privacy address generation for
link-local and autoconf addresses as specified in RFC7217.

  RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key)

is the RID (random identifier). As the hash function F we chose one
round of sha1. Prefix will be either the link-local prefix or the
router advertised one. As Net_Iface we use the MAC address of the
device. DAD_Counter and secret_key are implemented as specified.

We don't use Network_ID, as it couples the code too closely to other
subsystems. It is specified as optional in the RFC.

As Net_Iface we only use the MAC address: we simply have no stable
identifier in the kernel we could possibly use: because this code might
run very early, we cannot depend on names, as they might be changed by
user space early on during the boot process.

A new address generation mode is introduced,
IN6_ADDR_GEN_MODE_STABLE_PRIVACY. With iproute2 one can switch back to
none or eui64 address configuration mode although the stable_secret is
already set.

We refuse writes to ipv6/conf/all/stable_secret but only allow
ipv6/conf/default/stable_secret and the interface specific file to be
written to. The default stable_secret is used as the parameter for the
namespace, the interface specific can overwrite the secret, e.g. when
switching a network configuration from one system to another while
inheriting the secret.

Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 22:12:08 -04:00
Hannes Frederic Sowa
3d1bec9932 ipv6: introduce secret_stable to ipv6_devconf
This patch implements the procfs logic for the stable_address knob:
The secret is formatted as an ipv6 address and will be stored per
interface and per namespace. We track initialized flag and return EIO
errors until the secret is set.

We don't inherit the secret to newly created namespaces.

Cc: Erik Kline <ek@google.com>
Cc: Fernando Gont <fgont@si6networks.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 22:12:08 -04:00
Alexander Drozdov
682f048bd4 af_packet: pass checksum validation status to the user
Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the
af_packet user that at least the transport header checksum
has been already validated.

For now, the flag may be set for incoming packets only.

Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 22:01:28 -04:00
YOSHIFUJI Hideaki/吉藤英明
8da86466b8 net: neighbour: Add mcast_resolicit to configure the number of multicast resolicitations in PROBE state.
We send unicast neighbor (ARP or NDP) solicitations ucast_probes
times in PROBE state.  Zhu Yanjun reported that some implementation
does not reply against them and the entry will become FAILED, which
is undesirable.

We had been dealt with such nodes by sending multicast probes mcast_
solicit times after unicast probes in PROBE state.  In 2003, I made
a change not to send them to improve compatibility with IPv6 NDP.

Let's introduce per-protocol per-interface sysctl knob "mcast_
reprobe" to configure the number of multicast (re)solicitation for
reconfirmation in PROBE state.  The default is 0, since we have
been doing so for 10+ years.

Reported-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
CC: Ulf Samuelsson <ulf.samuelsson@ericsson.com>
Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20 21:47:40 -04:00
Daniel Borkmann
a8cb5f556b act_bpf: add initial eBPF support for actions
This work extends the "classic" BPF programmable tc action by extending
its scope also to native eBPF code!

Together with commit e2e9b6541d ("cls_bpf: add initial eBPF support
for programmable classifiers") this adds the facility to implement fully
flexible classifier and actions for tc that can be implemented in a C
subset in user space, "safely" loaded into the kernel, and being run in
native speed when JITed.

Also, since eBPF maps can be shared between eBPF programs, it offers the
possibility that cls_bpf and act_bpf can share data 1) between themselves
and 2) between user space applications. That means that, f.e. customized
runtime statistics can be collected in user space, but also more importantly
classifier and action behaviour could be altered based on map input from
the user space application.

For the remaining details on the workflow and integration, see the cls_bpf
commit e2e9b6541d. Preliminary iproute2 part can be found under [1].

  [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf-act

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20 19:10:44 -04:00
Daniel Borkmann
94caee8c31 ebpf: add sched_act_type and map it to sk_filter's verifier ops
In order to prepare eBPF support for tc action, we need to add
sched_act_type, so that the eBPF verifier is aware of what helper
function act_bpf may use, that it can load skb data and read out
currently available skb fields.

This is bascially analogous to 96be4325f4 ("ebpf: add sched_cls_type
and map it to sk_filter's verifier ops").

BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_SCHED_ACT need to be
separate since both will have a different set of functionality in
future (classifier vs action), thus we won't run into ABI troubles
when the point in time comes to diverge functionality from the
classifier.

The future plan for act_bpf would be that it will be able to write
into skb->data and alter selected fields mirrored in struct __sk_buff.

For an initial support, it's sufficient to map it to sk_filter_ops.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20 19:10:44 -04:00
David S. Miller
0fa74a4be4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/emulex/benet/be_main.c
	net/core/sysctl_net_core.c
	net/ipv4/inet_diag.c

The be_main.c conflict resolution was really tricky.  The conflict
hunks generated by GIT were very unhelpful, to say the least.  It
split functions in half and moved them around, when the real actual
conflict only existed solely inside of one function, that being
be_map_pci_bars().

So instead, to resolve this, I checked out be_main.c from the top
of net-next, then I applied the be_main.c changes from 'net' since
the last time I merged.  And this worked beautifully.

The inet_diag.c and sysctl_net_core.c conflicts were simple
overlapping changes, and were easily to resolve.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20 18:51:09 -04:00
Charlie Mooney
a736775db6 Input: add MT_TOOL_PALM
Currently there are only two "tools" that can be specified by a multi-touch
driver: MT_TOOL_FINGER and MT_TOOL_PEN. In working with Elan (The touch
vendor) and discussing their next-gen devices it seems that it will be
useful to have more tools so that their devices can give the upper layers
of the stack hints as to what is touching the sensor.

In particular they have new experimental firmware that can better
differentiate between palms vs fingertips and would like to plumb a patch
so that we can use their hints in higher-level gesture soft- ware.  The
firmware on the device can reasonably do a better job of palm detection
because it has access to all of the raw sensor readings as opposed to just
the width/pressure/etc that are exposed by the driver.  As such, the
firmware can characterize what a palm looks like in much finer-grained
detail and this change would allow such a device to share its findings with
the kernel.

Signed-off-by: Charlie Mooney <charliemooney@chromium.org>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2015-03-20 09:45:28 -07:00
Jörg Thalheim
af615762e9 bridge: add ageing_time, stp_state, priority over netlink
Signed-off-by: Jörg Thalheim <joerg@higgsboson.tk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 23:21:06 -04:00
David Ahern
db24a9044e net: add support for phys_port_name
Similar to port id allow netdevices to specify port names and export
the name via sysfs. Drivers can implement the netdevice operation to
assist udev in having sane default names for the devices using the
rule:

$ cat /etc/udev/rules.d/80-net-setup-link.rules
SUBSYSTEM=="net", ACTION=="add", ATTR{phys_port_name}!="",
NAME="$attr{phys_port_name}"

Use of phys_name versus phys_id was suggested-by Jiri Pirko.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18 22:30:35 -04:00
Alexei Starovoitov
c249739579 bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields
as a follow on to patch 70006af955 ("bpf: allow eBPF access skb fields")
this patch allows 'protocol' and 'vlan_tci' fields to be accessible
from extended BPF programs.

The usage of 'protocol', 'vlan_present' and 'vlan_tci' fields is the same as
corresponding SKF_AD_PROTOCOL, SKF_AD_VLAN_TAG_PRESENT and SKF_AD_VLAN_TAG
accesses in classic BPF.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-17 15:06:31 -04:00
Linus Torvalds
4d272f90a7 Not entirely surprising: the ongoing QEMU work on virtio 1.0 has revealed
more minor issues with our virtio 1.0 drivers just introduced in the
 kernel.
 
 (I would normally use my fixes branch for this, but there were a batch of them...)
 
 Thanks,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVB4gHAAoJENkgDmzRrbjx+BAP/0Z8TPfP5v2XQ0YlLPFpKMQ0
 DdsL9s1Se74e0RUJ6fvpVZaVgBxFq1+xi2wiSOhlOyIn0fXlO8+eYK886QX13cYO
 CUAvuwbQhGHgj+1DfQ+7pJlQe905VvHVxWIVZU1dq+PEI181kOQQE9lOHhsKsYvQ
 IxEAX3/avYALAV29FN+PvGZcmY5fgeJf58RkQb5h3XXVaBlI175HhiGm3izgqyKe
 qmZqEDuUnsdlxT5rvJnb0kg/VfRJ2MdkpIcpjpqO4DK3lY+x90LibMmnGLdLkigS
 cEfjSXPmJKNug+IoxxQuDH6zRsWqTLnwI4gU/FqbOkN12Ovt5k4F+FrFCuXD7GdW
 tCiBblkQjQ8xS+OP5slXwYKE3a4q4ih6u+9/STNlSVlG1mqZxxmK56WD00CvBpvR
 CDyTO4yHUV9rjDIhD/r8guFXsPwaaiZxKiGP1k+nnel0E9dMmZf0dE8xqHpTrl0Z
 8OAv3TgJFqhmfecFCBxj28e++dl7KvhiAGRKiZvHYkoxWZmJ5EFkw5E8FUOlJQNS
 2apGPbBEyeEQho7emzb0l9vvAu+0jJGEObxvA9wUdEcXg2kbDmGIpidZfN6xBemJ
 5WuMoGJh9UnYeImtGyXINTuQXagXdzt5bB/IkVmUYMcvsGty5lKIH1G87Q/qV4BE
 YCbKn/3J+G2TmF3+m8AH
 =8QwR
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio fixes from Rusty Russell:
 "Not entirely surprising: the ongoing QEMU work on virtio 1.0 has
  revealed more minor issues with our virtio 1.0 drivers just introduced
  in the kernel.

  (I would normally use my fixes branch for this, but there were a batch
  of them...)"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio_mmio: fix access width for mmio
  uapi/virtio_scsi: allow overriding CDB/SENSE size
  virtio_mmio: generation support
  virtio_rpmsg: set DRIVER_OK before using device
  9p/trans_virtio: fix hot-unplug
  virtio-balloon: do not call blocking ops when !TASK_RUNNING
  virtio_blk: fix comment for virtio 1.0
  virtio_blk: typo fix
  virtio_balloon: set DRIVER_OK before using device
  virtio_console: avoid config access from irq
  virtio_console: init work unconditionally
2015-03-17 10:36:01 -07:00
Alexei Starovoitov
9bac3d6d54 bpf: allow extended BPF programs access skb fields
introduce user accessible mirror of in-kernel 'struct sk_buff':
struct __sk_buff {
    __u32 len;
    __u32 pkt_type;
    __u32 mark;
    __u32 queue_mapping;
};

bpf programs can do:

int bpf_prog(struct __sk_buff *skb)
{
    __u32 var = skb->pkt_type;

which will be compiled to bpf assembler as:

dst_reg = *(u32 *)(src_reg + 4) // 4 == offsetof(struct __sk_buff, pkt_type)

bpf verifier will check validity of access and will convert it to:

dst_reg = *(u8 *)(src_reg + offsetof(struct sk_buff, __pkt_type_offset))
dst_reg &= 7

since skb->pkt_type is a bitfield.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15 22:02:28 -04:00
Daniel Borkmann
c04167ce2c ebpf: add helper for obtaining current processor id
This patch adds the possibility to obtain raw_smp_processor_id() in
eBPF. Currently, this is only possible in classic BPF where commit
da2033c282 ("filter: add SKF_AD_RXHASH and SKF_AD_CPU") has added
facilities for this.

Perhaps most importantly, this would also allow us to track per CPU
statistics with eBPF maps, or to implement a poor-man's per CPU data
structure through eBPF maps.

Example function proto-type looks like:

  u32 (*smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id;

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15 21:57:25 -04:00
Daniel Borkmann
03e69b508b ebpf: add prandom helper for packet sampling
This work is similar to commit 4cd3675ebf ("filter: added BPF
random opcode") and adds a possibility for packet sampling in eBPF.

Currently, this is only possible in classic BPF and useful to
combine sampling with f.e. packet sockets, possible also with tc.

Example function proto-type looks like:

  u32 (*prandom_u32)(void) = (void *)BPF_FUNC_get_prandom_u32;

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-15 21:57:25 -04:00
Michael S. Tsirkin
a4994b810d uapi/virtio_scsi: allow overriding CDB/SENSE size
QEMU wants to use virtio scsi structures with
a different VIRTIO_SCSI_CDB_SIZE/VIRTIO_SCSI_SENSE_SIZE,
let's add ifdefs to allow overriding them.

Keep the old defines under new names:
VIRTIO_SCSI_CDB_DEFAULT_SIZE/VIRTIO_SCSI_SENSE_DEFAULT_SIZE,
since that's what these values really are:
defaults for cdb/sense size fields.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-13 15:55:43 +10:30
Lubomir Rintel
c78ba6d64c ipv6: expose RFC4191 route preference via rtnetlink
This makes it possible to retain the route preference when RAs are handled in
userspace.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-11 23:28:09 -04:00
David S. Miller
3cef5c5b0b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/cadence/macb.c

Overlapping changes in macb driver, mostly fixes and cleanups
in 'net' overlapping with the integration of at91_ether into
macb in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-09 23:38:02 -04:00
Michael S. Tsirkin
0fa2a56437 virtio_blk: fix comment for virtio 1.0
Fix up comment to match virtio 1.0 logic:
virtio_blk_outhdr isn't the first elements anymore,
the only requirement is that it comes first in
the s/g list.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-10 11:48:29 +10:30
Michael S. Tsirkin
7e41a9def0 virtio_blk: typo fix
Now that QEmu reuses linux virtio headers, we noticed
a typo in the exported virtio block header. Fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-03-10 11:48:29 +10:30
David S. Miller
5428aef811 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next
tree. Basically, improvements for the packet rejection infrastructure,
deprecation of CLUSTERIP, cleanups for nf_tables and some untangling for
br_netfilter. More specifically they are:

1) Send packet to reset flow if checksum is valid, from Florian Westphal.

2) Fix nf_tables reject bridge from the input chain, also from Florian.

3) Deprecate the CLUSTERIP target, the cluster match supersedes it in
   functionality and it's known to have problems.

4) A couple of cleanups for nf_tables rule tracing infrastructure, from
   Patrick McHardy.

5) Another cleanup to place transaction declarations at the bottom of
   nf_tables.h, also from Patrick.

6) Consolidate Kconfig dependencies wrt. NF_TABLES.

7) Limit table names to 32 bytes in nf_tables.

8) mac header copying in bridge netfilter is already required when
   calling ip_fragment(), from Florian Westphal.

9) move nf_bridge_update_protocol() to br_netfilter.c, also from
   Florian.

10) Small refactor in br_netfilter in the transmission path, again from
    Florian.

11) Move br_nf_pre_routing_finish_bridge_slow() to br_netfilter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-09 15:58:21 -04:00
Linus Torvalds
bbbce516bb TTY/Serial fixes for 4.0-rc3
Here are some tty and serial driver fixes for 4.0-rc3.
 
 Along with the atime fix that you know about, here are some other serial
 driver bugfixes as well.  Most notable is a wait_until_sent bugfix that
 was traced back to being around since before 2.6.12 that Johan has fixed
 up.
 
 All have been in linux-next successfully.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlT8RCYACgkQMUfUDdst+yk62QCgycxS4giC2hyRver3dyvaNR6g
 zYYAn2w0uRndW+AqP4Tls54isRz6owpF
 =gA2k
 -----END PGP SIGNATURE-----

Merge tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are some tty and serial driver fixes for 4.0-rc3.

  Along with the atime fix that you know about, here are some other
  serial driver bugfixes as well.  Most notable is a wait_until_sent
  bugfix that was traced back to being around since before 2.6.12 that
  Johan has fixed up.

  All have been in linux-next successfully"

* tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  TTY: fix tty_wait_until_sent maximum timeout
  TTY: fix tty_wait_until_sent on 64-bit machines
  USB: serial: fix infinite wait_until_sent timeout
  TTY: bfin_jtag_comm: remove incorrect wait_until_sent operation
  net: irda: fix wait_until_sent poll timeout
  serial: uapi: Declare all userspace-visible io types
  serial: core: Fix iotype userspace breakage
  serial: sprd: Fix missing spin_unlock in sprd_handle_irq()
  console: Fix console name size mismatch
  tty: fix up atime/mtime mess, take four
  serial: 8250_dw: Fix get_mctrl behaviour
  serial:8250:8250_pci: delete unneeded quirk entries
  serial:8250:8250_pci: fix redundant entry report for WCH_CH352_2S
  Change email address for 8250_pci
  serial: 8250: Revert "tty: serial: 8250_core: read only RX if there is something in the FIFO"
  Revert "tty/serial: of_serial: add DT alias ID handling"
2015-03-08 12:25:40 -07:00
Shani Michaeli
c93682477b net/dcb: Add IEEE QCN attribute
As specified in 802.1Qau spec. Add this optional attribute to the
DCB netlink layer. To allow for application to use the new attribute,
NIC drivers should implement and register the  callbacks ieee_getqcn,
ieee_setqcn and ieee_getqcnstats.

The QCN attribute holds a set of parameters for management, and
a set of statistics to provide informative data on Congestion-Control
defined by this spec.

Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-06 21:50:02 -05:00
Peter Hurley
647f162b8e serial: uapi: Declare all userspace-visible io types
ioctl(TIOCGSERIAL|TIOCSSERIAL) report and can change the port->iotype.
UART drivers use the UPIO_* definitions, but the uapi header defines
parallel values and userspace uses these parallel values for ioctls;
thus the userspace values are definitive.

Define UPIO_* iotypes in terms of the uapi defines, SERIAL_IO_*;
extend the uapi defines to include all values in use by the serial
core.

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-07 03:39:55 +01:00