Commit Graph

435231 Commits

Author SHA1 Message Date
Andrey Vagin
ee214d54bf netfilter: nf_conntrack: initialize net.ct.generation
[  251.920788] INFO: trying to register non-static key.
[  251.921386] the code is fine but needs lockdep annotation.
[  251.921386] turning off the locking correctness validator.
[  251.921386] CPU: 2 PID: 15715 Comm: socket_listen Not tainted 3.14.0+ #294
[  251.921386] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  251.921386]  0000000000000000 000000009d18c210 ffff880075f039b8 ffffffff816b7ecd
[  251.921386]  ffffffff822c3b10 ffff880075f039c8 ffffffff816b36f4 ffff880075f03aa0
[  251.921386]  ffffffff810c65ff ffffffff810c4a85 00000000fffffe01 ffffffffa0075172
[  251.921386] Call Trace:
[  251.921386]  [<ffffffff816b7ecd>] dump_stack+0x45/0x56
[  251.921386]  [<ffffffff816b36f4>] register_lock_class.part.24+0x38/0x3c
[  251.921386]  [<ffffffff810c65ff>] __lock_acquire+0x168f/0x1b40
[  251.921386]  [<ffffffff810c4a85>] ? trace_hardirqs_on_caller+0x105/0x1d0
[  251.921386]  [<ffffffffa0075172>] ? nf_nat_setup_info+0x252/0x3a0 [nf_nat]
[  251.921386]  [<ffffffff816c1215>] ? _raw_spin_unlock_bh+0x35/0x40
[  251.921386]  [<ffffffffa0075172>] ? nf_nat_setup_info+0x252/0x3a0 [nf_nat]
[  251.921386]  [<ffffffff810c7272>] lock_acquire+0xa2/0x120
[  251.921386]  [<ffffffffa008ab90>] ? ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
[  251.921386]  [<ffffffffa0055989>] __nf_conntrack_confirm+0x129/0x410 [nf_conntrack]
[  251.921386]  [<ffffffffa008ab90>] ? ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
[  251.921386]  [<ffffffffa008ab90>] ipv4_confirm+0x90/0xf0 [nf_conntrack_ipv4]
[  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
[  251.921386]  [<ffffffff815d8c5a>] nf_iterate+0xaa/0xc0
[  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
[  251.921386]  [<ffffffff815d8d14>] nf_hook_slow+0xa4/0x190
[  251.921386]  [<ffffffff815e7b00>] ? ip_fragment+0x9f0/0x9f0
[  251.921386]  [<ffffffff815e98f2>] ip_output+0x92/0x100
[  251.921386]  [<ffffffff815e8df9>] ip_local_out+0x29/0x90
[  251.921386]  [<ffffffff815e9240>] ip_queue_xmit+0x170/0x4c0
[  251.921386]  [<ffffffff815e90d5>] ? ip_queue_xmit+0x5/0x4c0
[  251.921386]  [<ffffffff81601208>] tcp_transmit_skb+0x498/0x960
[  251.921386]  [<ffffffff81602d82>] tcp_connect+0x812/0x960
[  251.921386]  [<ffffffff810e3dc5>] ? ktime_get_real+0x25/0x70
[  251.921386]  [<ffffffff8159ea2a>] ? secure_tcp_sequence_number+0x6a/0xc0
[  251.921386]  [<ffffffff81606f57>] tcp_v4_connect+0x317/0x470
[  251.921386]  [<ffffffff8161f645>] __inet_stream_connect+0xb5/0x330
[  251.921386]  [<ffffffff8158dfc3>] ? lock_sock_nested+0x33/0xa0
[  251.921386]  [<ffffffff810c4b5d>] ? trace_hardirqs_on+0xd/0x10
[  251.921386]  [<ffffffff81078885>] ? __local_bh_enable_ip+0x75/0xe0
[  251.921386]  [<ffffffff8161f8f8>] inet_stream_connect+0x38/0x50
[  251.921386]  [<ffffffff8158b157>] SYSC_connect+0xe7/0x120
[  251.921386]  [<ffffffff810e3789>] ? current_kernel_time+0x69/0xd0
[  251.921386]  [<ffffffff810c4a85>] ? trace_hardirqs_on_caller+0x105/0x1d0
[  251.921386]  [<ffffffff810c4b5d>] ? trace_hardirqs_on+0xd/0x10
[  251.921386]  [<ffffffff8158c36e>] SyS_connect+0xe/0x10
[  251.921386]  [<ffffffff816caf69>] system_call_fastpath+0x16/0x1b
[  312.014104] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=60003 jiffies, g=42359, c=42358, q=333)
[  312.015097] INFO: Stall ended before state dump start

Fixes: 93bb0ceb75 ("netfilter: conntrack: remove central spinlock nf_conntrack_lock")
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-14 10:35:28 +02:00
Andrey Vagin
8142b227ef netfilter: nf_conntrack: flush net_gre->keymap_list only from gre helper
nf_ct_gre_keymap_flush() removes a nf_ct_gre_keymap object from
net_gre->keymap_list and frees the object. But it doesn't clean
a reference on this object from ct_pptp_info->keymap[dir].
Then nf_ct_gre_keymap_destroy() may release the same object again.

So nf_ct_gre_keymap_flush() can be called only when we are sure that
when nf_ct_gre_keymap_destroy will not be called.

nf_ct_gre_keymap is created by nf_ct_gre_keymap_add() and the right way
to destroy it is to call nf_ct_gre_keymap_destroy().

This patch marks nf_ct_gre_keymap_flush() as static, so this patch can
break compilation of third party modules, which use
nf_ct_gre_keymap_flush. I'm not sure this is the right way to deprecate
this function.

[  226.540793] general protection fault: 0000 [#1] SMP
[  226.541750] Modules linked in: nf_nat_pptp nf_nat_proto_gre
nf_conntrack_pptp nf_conntrack_proto_gre ip_gre ip_tunnel gre
ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc xt_nat
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
nf_conntrack veth tun bridge stp llc ppdev microcode joydev pcspkr
serio_raw virtio_console virtio_balloon floppy parport_pc parport
pvpanic i2c_piix4 virtio_net drm_kms_helper ttm ata_generic virtio_pci
virtio_ring virtio drm i2c_core pata_acpi [last unloaded: ip_tunnel]
[  226.541776] CPU: 0 PID: 49 Comm: kworker/u4:2 Not tainted 3.14.0-rc8+ #101
[  226.541776] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  226.541776] Workqueue: netns cleanup_net
[  226.541776] task: ffff8800371e0000 ti: ffff88003730c000 task.ti: ffff88003730c000
[  226.541776] RIP: 0010:[<ffffffff81389ba9>]  [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0
[  226.541776] RSP: 0018:ffff88003730dbd0  EFLAGS: 00010a83
[  226.541776] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800374e6c40 RCX: dead000000200200
[  226.541776] RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800371e07d0 RDI: ffff8800374e6c40
[  226.541776] RBP: ffff88003730dbd0 R08: 0000000000000000 R09: 0000000000000000
[  226.541776] R10: 0000000000000001 R11: ffff88003730d92e R12: 0000000000000002
[  226.541776] R13: ffff88007a4c42d0 R14: ffff88007aef0000 R15: ffff880036cf0018
[  226.541776] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[  226.541776] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  226.541776] CR2: 00007f07f643f7d0 CR3: 0000000036fd2000 CR4: 00000000000006f0
[  226.541776] Stack:
[  226.541776]  ffff88003730dbe8 ffffffff81389c5d ffff8800374ffbe4 ffff88003730dc28
[  226.541776]  ffffffffa0162a43 ffffffffa01627c5 ffff88007a4c42d0 ffff88007aef0000
[  226.541776]  ffffffffa01651c0 ffff88007a4c45e0 ffff88007aef0000 ffff88003730dc40
[  226.541776] Call Trace:
[  226.541776]  [<ffffffff81389c5d>] list_del+0xd/0x30
[  226.541776]  [<ffffffffa0162a43>] nf_ct_gre_keymap_destroy+0x283/0x2d0 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa01627c5>] ? nf_ct_gre_keymap_destroy+0x5/0x2d0 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa0162ab7>] gre_destroy+0x27/0x70 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffffa0117de3>] destroy_conntrack+0x83/0x200 [nf_conntrack]
[  226.541776]  [<ffffffffa0117d87>] ? destroy_conntrack+0x27/0x200 [nf_conntrack]
[  226.541776]  [<ffffffffa0117d60>] ? nf_conntrack_hash_check_insert+0x2e0/0x2e0 [nf_conntrack]
[  226.541776]  [<ffffffff81630142>] nf_conntrack_destroy+0x72/0x180
[  226.541776]  [<ffffffff816300d5>] ? nf_conntrack_destroy+0x5/0x180
[  226.541776]  [<ffffffffa011ef80>] ? kill_l3proto+0x20/0x20 [nf_conntrack]
[  226.541776]  [<ffffffffa011847e>] nf_ct_iterate_cleanup+0x14e/0x170 [nf_conntrack]
[  226.541776]  [<ffffffffa011f74b>] nf_ct_l4proto_pernet_unregister+0x5b/0x90 [nf_conntrack]
[  226.541776]  [<ffffffffa0162409>] proto_gre_net_exit+0x19/0x30 [nf_conntrack_proto_gre]
[  226.541776]  [<ffffffff815edf89>] ops_exit_list.isra.1+0x39/0x60
[  226.541776]  [<ffffffff815eecc0>] cleanup_net+0x100/0x1d0
[  226.541776]  [<ffffffff810a608a>] process_one_work+0x1ea/0x4f0
[  226.541776]  [<ffffffff810a6028>] ? process_one_work+0x188/0x4f0
[  226.541776]  [<ffffffff810a64ab>] worker_thread+0x11b/0x3a0
[  226.541776]  [<ffffffff810a6390>] ? process_one_work+0x4f0/0x4f0
[  226.541776]  [<ffffffff810af42d>] kthread+0xed/0x110
[  226.541776]  [<ffffffff8173d4dc>] ? _raw_spin_unlock_irq+0x2c/0x40
[  226.541776]  [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200
[  226.541776]  [<ffffffff8174747c>] ret_from_fork+0x7c/0xb0
[  226.541776]  [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200
[  226.541776] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de
48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48
39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89
42 08
[  226.541776] RIP  [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0
[  226.541776]  RSP <ffff88003730dbd0>
[  226.612193] ---[ end trace 985ae23ddfcc357c ]---

Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-08 10:56:12 +02:00
Veaceslav Falico
6859e7df6d netdev: remove potentially harmful checks
Currently we're checking a variable for != NULL after actually
dereferencing it, in netdev_lower_get_next_private*().

It's counter-intuitive at best, and can lead to faulty usage (as it implies
that the variable can be NULL), so fix it by removing the useless checks.

Reported-by: Daniel Borkmann <dborkman@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: stephen hemminger <stephen@networkplumber.org>
CC: Jerry Chu <hkchu@google.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 15:52:07 -04:00
Daniel Borkmann
6f25cd47dc pktgen: fix xmit test for BQL enabled devices
Similarly as in commit 8e2f1a63f2 ("packet: fix packet_direct_xmit
for BQL enabled drivers"), we test for __QUEUE_STATE_STACK_XOFF bit
in pktgen's xmit, which would not fully fill the device's TX ring for
BQL drivers that use netdev_tx_sent_queue(). Fix is to use, similarly
as we do in packet sockets, netif_xmit_frozen_or_drv_stopped() test.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 15:20:44 -04:00
Gilles Chanteperdrix
c293fb785b net/at91_ether: avoid NULL pointer dereference
The at91_ether driver calls macb_mii_init passing a 'struct macb'
structure whose tx_clk member is initialized to 0. However,
macb_handle_link_change() expects tx_clk to be the result of
a call to clk_get, and so IS_ERR(tx_clk) to be true if the clock
is invalid. This causes an oops when booting Linux 3.14 on the
csb637 board. The following changes avoids this.

Signed-off-by: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 15:10:17 -04:00
Geert Uytterhoeven
065d7e3956 tipc: Let tipc_release() return 0
net/tipc/socket.c: In function ‘tipc_release’:
net/tipc/socket.c:352: warning: ‘res’ is used uninitialized in this function

Introduced by commit 24be34b5a0 ("tipc:
eliminate upcall function pointers between port and socket"), which
removed the sole initializer of "res".

Just return 0 to fix it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 15:08:17 -04:00
Alexander Aring
39d7f32008 at86rf230: fix MAX_CSMA_RETRIES parameter
This patch fix a copy&paste failure for setting the MAX_CSMA_RETRIES
value of the at86rf212 chip which was introduced by commit
f2fdd67c6b ("ieee802154: enable
smart transmitter features of RF212")

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 14:59:26 -04:00
Jean Sacren
6c6a985556 mac802154: fix duplicate #include headers
The commit e6278d9200 ("mac802154: use header operations to
create/parse headers") included the header

		net/ieee802154_netdev.h

which had been included by the commit b70ab2e87f ("ieee802154:
enforce consistent endianness in the 802.15.4 stack"). Fix this
duplicate #include by deleting the latter one as the required header
has already been in place.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Cc: linux-zigbee-devel@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 13:18:44 -04:00
Jean Sacren
c28c7a6acc sxgbe: fix duplicate #include headers
The commit 1edb9ca69e ("net: sxgbe: add basic framework for
Samsung 10Gb ethernet driver") added support for Samsung 10Gb
ethernet driver(sxgbe) with a minor issue of including linux/io.h
header twice in sxgbe_dma.c file. Fix the duplicate #include by
deleting the top one so that all the rest good #include headers
would be preserved in the alphabetical order.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Cc: Byungho An <bh74.an@samsung.com>
Cc: Girish K S <ks.giri@samsung.com>
Cc: Siva Reddy Kallam <siva.kallam@samsung.com>
Cc: Vipul Pandya <vipul.pandya@samsung.com>
Acked-by: Byungho An <bh74.an@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 13:18:44 -04:00
Daniel Borkmann
5f9fde5f79 net: filter: be more defensive on div/mod by X==0
The old interpreter behaviour was that we returned with 0
whenever we found a division by 0 would take place. In the new
interpreter we would currently just skip that instead and
continue execution.

It's true that a value of 0 as return might not be appropriate
in all cases, but current users (socket filters -> drop
packet, seccomp -> SECCOMP_RET_KILL, cls_bpf -> unclassified,
etc) seem fine with that behaviour. Better this than undefined
BPF program behaviour as it's expected that A contains the
result of the division. In future, as more use cases open up,
we could further adapt this return value to our needs, if
necessary.

So reintroduce return of 0 for division by 0 as in the old
interpreter. Also in case of K which is guaranteed to be 32bit
wide, sk_chk_filter() already takes care of preventing division
by 0 invoked through K, so we can generally spare us these tests.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Reviewed-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-07 12:54:39 -04:00
David S. Miller
d80e773f16 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter fixes for your net tree, they
are:

* Use 16-bits offset and length fields instead of 8-bits in the conntrack
  extension to avoid an overflow when many conntrack extension are used,
  from Andrey Vagin.

* Allow to use cgroup match from LOCAL_IN, there is no apparent reason
  for not allowing this, from Alexey Perevalov.

* Fix build of the connlimit match after recent changes to let it scale
  up that result in a divide by zero compilation error in UP, from
  Florian Westphal.

* Move the lock out of the structure connlimit_data to avoid a false
  sharing spotted by Eric Dumazet and Jesper D. Brouer, this needed as
  part of the recent connlimit scalability improvements, also from
  Florian Westphal.

* Add missing module aliases in xt_osf to fix loading of rules using
  this match, from Kirill Tkhai.

* Restrict set names in nf_tables to 15 characters instead of silently
  trimming them off, from me.

* Fix wrong format in nf_tables request module call for chain types,
  spotted by Florian Westphal, patch from me.

* Fix crash in xtables when it fails to copy the counters back to userspace
  after having replaced the table already.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-06 11:29:59 -04:00
Thomas Graf
c58dd2dd44 netfilter: Can't fail and free after table replacement
All xtables variants suffer from the defect that the copy_to_user()
to copy the counters to user memory may fail after the table has
already been exchanged and thus exposed. Return an error at this
point will result in freeing the already exposed table. Any
subsequent packet processing will result in a kernel panic.

We can't copy the counters before exposing the new tables as we
want provide the counter state after the old table has been
unhooked. Therefore convert this into a silent error.

Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-05 17:46:22 +02:00
Zoltan Kiss
00aefceb2f xen-netback: Trivial format string fix
There is a "%" after pending_idx instead of ":".

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-04 10:49:53 -04:00
Sachin Kamat
65e71ff342 net: bcmgenet: Remove unnecessary version.h inclusion
version.h inclusion is not necessary as detected by versioncheck.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-04 10:02:23 -04:00
Laurent Pinchart
a50988a11d net: smc911x: Remove unused local variable
The ioaddr local variable is assigned to but never used in the
smc911x_rx_dma_irq() function, remove it.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-04 10:02:21 -04:00
zheng.li
7db8df0279 bonding: Inactive slaves should keep inactive flag's value
bond_open is not setting the inactive flag correctly for some modes (alb and
tlb), resulting in error behavior if the bond has been administratively set
down and then back up. This effect should not occur when slaves are added while
the bond is up; it's something that only happens after a down/up bounce of the
bond.

For example, in bond tlb or alb mode, domu send some ARP request which go out
from dom0 bond's active slave, then the ARP broadcast request packets go back to
inactive slave from switch, because the inactive slave's inactive flag is zero,
kernel will receive the packets and pass them to bridge that cause dom0's bridge
map domu's MAC address to port of bond, bridge should map domu's MAC to port of
vif.

Signed-off-by: Zheng Li <zheng.x.li@oracle.com>
Signed-off-by: Jay Vosburgh <j.vosburgh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-04 10:02:20 -04:00
Pablo Neira Ayuso
2fec6bb6f4 netfilter: nf_tables: fix wrong format in request_module()
The intended format in request_module is %.*s instead of %*.s.

Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:50 +02:00
Pablo Neira Ayuso
a9bdd83656 netfilter: nf_tables: set names cannot be larger than 15 bytes
Currently, nf_tables trims off the set name if it exceeeds 15
bytes, so explicitly reject set names that are too large.

Reported-by: Giuseppe Longo <giuseppelng@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:44 +02:00
Andrey Vagin
223b02d923 netfilter: nf_conntrack: reserve two bytes for nf_ct_ext->len
"len" contains sizeof(nf_ct_ext) and size of extensions. In a worst
case it can contain all extensions. Bellow you can find sizes for all
types of extensions. Their sum is definitely bigger than 256.

nf_ct_ext_types[0]->len = 24
nf_ct_ext_types[1]->len = 32
nf_ct_ext_types[2]->len = 24
nf_ct_ext_types[3]->len = 32
nf_ct_ext_types[4]->len = 152
nf_ct_ext_types[5]->len = 2
nf_ct_ext_types[6]->len = 16
nf_ct_ext_types[7]->len = 8

I have seen "len" up to 280 and my host has crashes w/o this patch.

The right way to fix this problem is reducing the size of the ecache
extension (4) and Florian is going to do this, but these changes will
be quite large to be appropriate for a stable tree.

Fixes: 5b423f6a40 (netfilter: nf_conntrack: fix racy timer handling with reliable)
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:31 +02:00
Kirill Tkhai
b8ddd9eac8 netfilter: Add {ipt,ip6t}_osf aliases for xt_osf
There are no these aliases, so kernel can not request appropriate
match table:

$ iptables -I INPUT -p tcp -m osf --genre Windows --ttl 2 -j DROP
iptables: No chain/target/match by that name.

setsockopt() requests ipt_osf module, which is not present. Add
the aliases.

Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:22 +02:00
Alexey Perevalov
a00e76349f netfilter: x_tables: allow to use cgroup match for LOCAL_IN nf hooks
This simple modification allows iptables to work with INPUT chain
in combination with cgroup module. It could be useful for counting
ingress traffic per cgroup with nfacct netfilter module. There
were no problems to count the egress traffic that way formerly.

It's possible to get classified sk_buff after PREROUTING, due to
socket lookup being done in early_demux (tcp_v4_early_demux). Also
it works for udp as well.

Trivial usage example, assuming we're in the same shell every step
and we have enough permissions:

1) Classic net_cls cgroup initialization:

  mkdir /sys/fs/cgroup/net_cls
  mount -t cgroup -o net_cls net_cls /sys/fs/cgroup/net_cls

2) Set up cgroup for interesting application:

  mkdir /sys/fs/cgroup/net_cls/wget
  echo 1 > /sys/fs/cgroup/net_cls/wget/net_cls.classid
  echo $BASHPID > /sys/fs/cgroup/net_cls/wget/cgroup.procs

3) Create kernel counters:

  nfacct add wget-cgroup-in
  iptables -A INPUT -m cgroup ! --cgroup 1 -m nfacct --nfacct-name wget-cgroup-in

  nfacct add wget-cgroup-out
  iptables -A OUTPUT -m cgroup ! --cgroup 1 -m nfacct --nfacct-name wget-cgroup-out

4) Network usage:

  wget https://www.kernel.org/pub/linux/kernel/v3.x/testing/linux-3.14-rc6.tar.xz

5) Check results:

  nfacct list

Cgroup approach is being used for the DataUsage (counting & blocking
traffic) feature for Samsung's modification of the Tizen OS.

Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:17 +02:00
Florian Westphal
e00b437b3d netfilter: connlimit: move lock array out of struct connlimit_data
Eric points out that the locks can be global.
Moreover, both Jesper and Eric note that using only 32 locks increases
false sharing as only two cache lines are used.

This increases locks to 256 (16 cache lines assuming 64byte cacheline and
4 bytes per spinlock).

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:13 +02:00
Florian Westphal
e5ac6eafba netfilter: connlimit: fix UP build
cannot use ARRAY_SIZE() if spinlock_t is empty struct.

Fixes: 1442e7507d ("netfilter: connlimit: use keyed locks")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:07 +02:00
Eric Dumazet
e33d0ba804 net-gro: reset skb->truesize in napi_reuse_skb()
Recycling skb always had been very tough...

This time it appears GRO layer can accumulate skb->truesize
adjustments made by drivers when they attach a fragment to skb.

skb_gro_receive() can only subtract from skb->truesize the used part
of a fragment.

I spotted this problem seeing TcpExtPruneCalled and
TcpExtTCPRcvCollapsed that were unexpected with a recent kernel, where
TCP receive window should be sized properly to accept traffic coming
from a driver not overshooting skb->truesize.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 16:17:52 -04:00
Philipp Zabel
240a12d58e net: Micrel KSZ8864RMN 4-port managed switch support
This patch adds support for the Micrel KSZ8864RMN switch to the spi_ks8995
driver. The KSZ8864RMN switch has a wider 256-byte register space.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 16:15:29 -04:00
Erik Hugne
a5e7ac5ce1 tipc: fix regression bug where node events are not being generated
Commit 5902385a24 ("tipc: obsolete
the remote management feature") introduces a regression where node
topology events are not being generated because the publication
that triggers this: {0, <z.c.n>, <z.c.n>} is no longer available.
This will break applications that rely on node events to discover
when nodes join/leave a cluster.

We fix this by advertising the node publication when TIPC enters
networking mode, and withdraws it upon shutdown.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 16:03:57 -04:00
françois romieu
d9bd646168 sxgbe: fix driver probe error path and driver removal leaks
sxgbe_drv_probe:  mdio and priv->hw leaks
sxgbe_drv_remove: clk and priv->hw leaks

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Byungho An <bh74.an@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:36:29 -04:00
françois romieu
56c8b193ea sxgbe: use common NET_VENDOR_FOO style.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Acked-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:36:27 -04:00
Jiri Pirko
d0290214de net: add busy_poll device feature
Currently there is no way how to find out if a device supports busy
polling. So add a feature and make it dependent on ndo_busy_poll
existence.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:31:34 -04:00
Daniel Borkmann
8e2f1a63f2 packet: fix packet_direct_xmit for BQL enabled drivers
Currently, in packet_direct_xmit() we test the assigned netdevice queue
for netif_xmit_frozen_or_stopped() before doing an ndo_start_xmit().

This can have the side-effect that BQL enabled drivers which make use
of netdev_tx_sent_queue() internally, set __QUEUE_STATE_STACK_XOFF from
within the stack and would not fully fill the device's TX ring from
packet sockets with PACKET_QDISC_BYPASS enabled.

Instead, use a test without BQL bit so that bursts can be absorbed
into the NICs TX ring. Fix and code suggested by Eric Dumazet, thanks!

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:29:12 -04:00
Daniel Borkmann
0f97ede45e packet: report tx_dropped in packet_direct_xmit
Since commit 015f0688f5 ("net: net: add a core netdev->tx_dropped
counter"), we can now account for TX drops from within the core
stack instead of drivers.

Therefore, fix packet_direct_xmit() and increase drop count when we
encounter a problem before driver's xmit function was called (we do
not want to doubly account for it).

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:29:11 -04:00
Zoltan Kiss
bdab82759b xen-netback: Grant copy the header instead of map and memcpy
An old inefficiency of the TX path that we are grant mapping the first slot,
and then copy the header part to the linear area. Instead, doing a grant copy
for that header straight on is more reasonable. Especially because there are
ongoing efforts to make Xen avoiding TLB flush after unmap when the page were
not touched in Dom0. In the original way the memcpy ruined that.
The key changes:
- the vif has a tx_copy_ops array again
- xenvif_tx_build_gops sets up the grant copy operations
- we don't have to figure out whether the header and first frag are on the same
  grant mapped page or not
Note, we only grant copy PKT_PROT_LEN bytes from the first slot, the rest (if
any) will be on the first frag, which is grant mapped. If the first slot is
smaller than PKT_PROT_LEN, then we grant copy that, and later __pskb_pull_tail
will pull more from the frags (if any)

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:22:40 -04:00
Zoltan Kiss
9074ce2493 xen-netback: Rename map ops
Rename identifiers to state explicitly that they refer to map ops.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:22:38 -04:00
Josh Boyer
acdd32be6d net: qlcnic: include irq.h for irq definitions
The qlcnic driver fails to build on ARM with errors like:

In file included from drivers/net/ethernet/qlogic/qlcnic/qlcnic.h:36:0,
                 from drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c:8:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h:585:1: error: unknown type name 'irqreturn_t'
 irqreturn_t qlcnic_83xx_clear_legacy_intr(struct qlcnic_adapter *);
 ^

Nothing in the driver is explicitly including the irq definitions, so we
add an include of linux/irq.h to pick them up.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:04:36 -04:00
Josh Boyer
fef1f07cbf net: enic: include irq.h for irqreturn_t definitions
The enic driver fails to build on ARM with:

In file included from drivers/net/ethernet/cisco/enic/enic_res.c:40:0:
drivers/net/ethernet/cisco/enic/enic.h:48:2: error: expected specifier-qualifier-list before 'irqreturn_t'
  irqreturn_t (*isr)(int, void *);
  ^

Nothing in the driver is explicitly including the irq definitions, so we add
an include of linux/irq.h to pick them up.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:04:34 -04:00
Josh Boyer
df1efc2d30 net: bnx2x: include irq.h for irqreturn_t definitions
The bnx2x driver fails to build on ARM with:

In file included from drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c:28:0:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h:243:1: error: unknown type name 'irqreturn_t'
 irqreturn_t bnx2x_msix_sp_int(int irq, void *dev_instance);
 ^
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h:251:1: error: unknown type name 'irqreturn_t'
 irqreturn_t bnx2x_interrupt(int irq, void *dev_instance);
 ^

Nothing in bnx2x_link.c or bnx2x_cmn.h is explicitly including the irq
definitions, so we add an include of linux/irq.h to pick them up.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:04:32 -04:00
YOSHIFUJI Hideaki / 吉藤英明
77bc6bed71 isdnloop: Validate NUL-terminated strings from user.
Return -EINVAL unless all of user-given strings are correctly
NUL-terminated.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 11:26:24 -04:00
Alexei Starovoitov
79eb9d28c9 net: ti: fix CPTS driver build on arm
fix build errors:
drivers/net/ethernet/ti/cpts.c:266:12: error: 'ETH_HLEN' undeclared (first use in this function)
drivers/net/ethernet/ti/cpts.c:276:23: error: 'VLAN_HLEN' undeclared (first use in this function)

Fixes: 408eccce32 ("net: ptp: move PTP classifier in its own file")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Suggested-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 11:24:51 -04:00
Mike Rapoport
5933a7bbb5 net: vxlan: fix crash when interface is created with no group
If the vxlan interface is created without explicit group definition,
there are corner cases which may cause kernel panic.

For instance, in the following scenario:

node A:
$ ip link add dev vxlan42  address 2c:c2:60:00:10:20 type vxlan id 42
$ ip addr add dev vxlan42 10.0.0.1/24
$ ip link set up dev vxlan42
$ arp -i vxlan42 -s 10.0.0.2 2c:c2:60:00:01:02
$ bridge fdb add dev vxlan42 to 2c:c2:60:00:01:02 dst <IPv4 address>
$ ping 10.0.0.2

node B:
$ ip link add dev vxlan42 address 2c:c2:60:00:01:02 type vxlan id 42
$ ip addr add dev vxlan42 10.0.0.2/24
$ ip link set up dev vxlan42
$ arp -i vxlan42 -s 10.0.0.1 2c:c2:60:00:10:20

node B crashes:

 vxlan42: 2c:c2:60:00:10:20 migrated from 4011:eca4:c0a8:6466:c0a8:6415:8e09:2118 to (invalid address)
 vxlan42: 2c:c2:60:00:10:20 migrated from 4011:eca4:c0a8:6466:c0a8:6415:8e09:2118 to (invalid address)
 BUG: unable to handle kernel NULL pointer dereference at 0000000000000046
 IP: [<ffffffff8143c459>] ip6_route_output+0x58/0x82
 PGD 7bd89067 PUD 7bd4e067 PMD 0
 Oops: 0000 [#1] SMP
 Modules linked in:
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.14.0-rc8-hvx-xen-00019-g97a5221-dirty #154
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 task: ffff88007c774f50 ti: ffff88007c79c000 task.ti: ffff88007c79c000
 RIP: 0010:[<ffffffff8143c459>]  [<ffffffff8143c459>] ip6_route_output+0x58/0x82
 RSP: 0018:ffff88007fd03668  EFLAGS: 00010282
 RAX: 0000000000000000 RBX: ffffffff8186a000 RCX: 0000000000000040
 RDX: 0000000000000000 RSI: ffff88007b0e4a80 RDI: ffff88007fd03754
 RBP: ffff88007fd03688 R08: ffff88007b0e4a80 R09: 0000000000000000
 R10: 0200000a0100000a R11: 0001002200000000 R12: ffff88007fd03740
 R13: ffff88007b0e4a80 R14: ffff88007b0e4a80 R15: ffff88007bba0c50
 FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000046 CR3: 000000007bb60000 CR4: 00000000000006e0
 Stack:
  0000000000000000 ffff88007fd037a0 ffffffff8186a000 ffff88007fd03740
  ffff88007fd036c8 ffffffff814320bb 0000000000006e49 ffff88007b8b7360
  ffff88007bdbf200 ffff88007bcbc000 ffff88007b8b7000 ffff88007b8b7360
 Call Trace:
  <IRQ>
  [<ffffffff814320bb>] ip6_dst_lookup_tail+0x2d/0xa4
  [<ffffffff814322a5>] ip6_dst_lookup+0x10/0x12
  [<ffffffff81323b4e>] vxlan_xmit_one+0x32a/0x68c
  [<ffffffff814a325a>] ? _raw_spin_unlock_irqrestore+0x12/0x14
  [<ffffffff8104c551>] ? lock_timer_base.isra.23+0x26/0x4b
  [<ffffffff8132451a>] vxlan_xmit+0x66a/0x6a8
  [<ffffffff8141a365>] ? ipt_do_table+0x35f/0x37e
  [<ffffffff81204ba2>] ? selinux_ip_postroute+0x41/0x26e
  [<ffffffff8139d0c1>] dev_hard_start_xmit+0x2ce/0x3ce
  [<ffffffff8139d491>] __dev_queue_xmit+0x2d0/0x392
  [<ffffffff813b380f>] ? eth_header+0x28/0xb5
  [<ffffffff8139d569>] dev_queue_xmit+0xb/0xd
  [<ffffffff813a5aa6>] neigh_resolve_output+0x134/0x152
  [<ffffffff813db741>] ip_finish_output2+0x236/0x299
  [<ffffffff813dc074>] ip_finish_output+0x98/0x9d
  [<ffffffff813dc749>] ip_output+0x62/0x67
  [<ffffffff813da9f2>] dst_output+0xf/0x11
  [<ffffffff813dc11c>] ip_local_out+0x1b/0x1f
  [<ffffffff813dcf1b>] ip_send_skb+0x11/0x37
  [<ffffffff813dcf70>] ip_push_pending_frames+0x2f/0x33
  [<ffffffff813ff732>] icmp_push_reply+0x106/0x115
  [<ffffffff813ff9e4>] icmp_reply+0x142/0x164
  [<ffffffff813ffb3b>] icmp_echo.part.16+0x46/0x48
  [<ffffffff813c1d30>] ? nf_iterate+0x43/0x80
  [<ffffffff813d8037>] ? xfrm4_policy_check.constprop.11+0x52/0x52
  [<ffffffff813ffb62>] icmp_echo+0x25/0x27
  [<ffffffff814005f7>] icmp_rcv+0x1d2/0x20a
  [<ffffffff813d8037>] ? xfrm4_policy_check.constprop.11+0x52/0x52
  [<ffffffff813d810d>] ip_local_deliver_finish+0xd6/0x14f
  [<ffffffff813d8037>] ? xfrm4_policy_check.constprop.11+0x52/0x52
  [<ffffffff813d7fde>] NF_HOOK.constprop.10+0x4c/0x53
  [<ffffffff813d82bf>] ip_local_deliver+0x4a/0x4f
  [<ffffffff813d7f7b>] ip_rcv_finish+0x253/0x26a
  [<ffffffff813d7d28>] ? inet_add_protocol+0x3e/0x3e
  [<ffffffff813d7fde>] NF_HOOK.constprop.10+0x4c/0x53
  [<ffffffff813d856a>] ip_rcv+0x2a6/0x2ec
  [<ffffffff8139a9a0>] __netif_receive_skb_core+0x43e/0x478
  [<ffffffff812a346f>] ? virtqueue_poll+0x16/0x27
  [<ffffffff8139aa2f>] __netif_receive_skb+0x55/0x5a
  [<ffffffff8139aaaa>] process_backlog+0x76/0x12f
  [<ffffffff8139add8>] net_rx_action+0xa2/0x1ab
  [<ffffffff81047847>] __do_softirq+0xca/0x1d1
  [<ffffffff81047ace>] irq_exit+0x3e/0x85
  [<ffffffff8100b98b>] do_IRQ+0xa9/0xc4
  [<ffffffff814a37ad>] common_interrupt+0x6d/0x6d
  <EOI>
  [<ffffffff810378db>] ? native_safe_halt+0x6/0x8
  [<ffffffff810110c7>] default_idle+0x9/0xd
  [<ffffffff81011694>] arch_cpu_idle+0x13/0x1c
  [<ffffffff8107480d>] cpu_startup_entry+0xbc/0x137
  [<ffffffff8102e741>] start_secondary+0x1a0/0x1a5
 Code: 24 14 e8 f1 e5 01 00 31 d2 a8 32 0f 95 c2 49 8b 44 24 2c 49 0b 44 24 24 74 05 83 ca 04 eb 1c 4d 85 ed 74 17 49 8b 85 a8 02 00 00 <66> 8b 40 46 66 c1 e8 07 83 e0 07 c1 e0 03 09 c2 4c 89 e6 48 89
 RIP  [<ffffffff8143c459>] ip6_route_output+0x58/0x82
  RSP <ffff88007fd03668>
 CR2: 0000000000000046
 ---[ end trace 4612329caab37efd ]---

When vxlan interface is created without explicit group definition, the
default_dst protocol family is initialiazed to AF_UNSPEC and the driver
assumes IPv4 configuration. On the other side, the default_dst protocol
family is used to differentiate between IPv4 and IPv6 cases and, since,
AF_UNSPEC != AF_INET, the processing takes the IPv6 path.

Making the IPv4 assumption explicit by settting default_dst protocol
family to AF_INET4 and preventing mixing of IPv4 and IPv6 addresses in
snooped fdb entries fixes the corner case crashes.

Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 11:16:47 -04:00
Linus Torvalds
cd6362befe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Here is my initial pull request for the networking subsystem during
  this merge window:

   1) Support for ESN in AH (RFC 4302) from Fan Du.

   2) Add full kernel doc for ethtool command structures, from Ben
      Hutchings.

   3) Add BCM7xxx PHY driver, from Florian Fainelli.

   4) Export computed TCP rate information in netlink socket dumps, from
      Eric Dumazet.

   5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
      Dichtel.

   6) Convert many drivers to pci_enable_msix_range(), from Alexander
      Gordeev.

   7) Record SKB timestamps more efficiently, from Eric Dumazet.

   8) Switch to microsecond resolution for TCP round trip times, also
      from Eric Dumazet.

   9) Clean up and fix 6lowpan fragmentation handling by making use of
      the existing inet_frag api for it's implementation.

  10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.

  11) Auto size SKB lengths when composing netlink messages based upon
      past message sizes used, from Eric Dumazet.

  12) qdisc dumps can take a long time, add a cond_resched(), From Eric
      Dumazet.

  13) Sanitize netpoll core and drivers wrt.  SKB handling semantics.
      Get rid of never-used-in-tree netpoll RX handling.  From Eric W
      Biederman.

  14) Support inter-address-family and namespace changing in VTI tunnel
      driver(s).  From Steffen Klassert.

  15) Add Altera TSE driver, from Vince Bridgers.

  16) Optimizing csum_replace2() so that it doesn't adjust the checksum
      by checksumming the entire header, from Eric Dumazet.

  17) Expand BPF internal implementation for faster interpreting, more
      direct translations into JIT'd code, and much cleaner uses of BPF
      filtering in non-socket ocntexts.  From Daniel Borkmann and Alexei
      Starovoitov"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
  netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
  net: Add a test to see if a skb is freeable in irq context
  qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
  net: ptp: move PTP classifier in its own file
  net: sxgbe: make "core_ops" static
  net: sxgbe: fix logical vs bitwise operation
  net: sxgbe: sxgbe_mdio_register() frees the bus
  Call efx_set_channels() before efx->type->dimension_resources()
  xen-netback: disable rogue vif in kthread context
  net/mlx4: Set proper build dependancy with vxlan
  be2net: fix build dependency on VxLAN
  mac802154: make csma/cca parameters per-wpan
  mac802154: allow only one WPAN to be up at any given time
  net: filter: minor: fix kdoc in __sk_run_filter
  netlink: don't compare the nul-termination in nla_strcmp
  can: c_can: Avoid led toggling for every packet.
  can: c_can: Simplify TX interrupt cleanup
  can: c_can: Store dlc private
  can: c_can: Reduce register access
  can: c_can: Make the code readable
  ...
2014-04-02 20:53:45 -07:00
Linus Torvalds
0f1b1e6d73 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
 - substantial cleanup of the generic and transport layers, in the
   direction of an ultimate goal of making struct hid_device completely
   transport independent, by Benjamin Tissoires
 - cp2112 driver from David Barksdale
 - a lot of fixes and new hardware support (Dualshock 4) to hid-sony
   driver, by Frank Praznik
 - support for Win 8.1 multitouch protocol by Andrew Duggan
 - other smaller fixes / device ID additions

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (75 commits)
  HID: sony: fix force feedback mismerge
  HID: sony: Set the quriks flag for Bluetooth controllers
  HID: sony: Fix Sixaxis cable state detection
  HID: uhid: Add UHID_CREATE2 + UHID_INPUT2
  HID: hyperv: fix _raw_request() prototype
  HID: hyperv: Implement a stub raw_request() entry point
  HID: hid-sensor-hub: fix sleeping function called from invalid context
  HID: multitouch: add support for Win 8.1 multitouch touchpads
  HID: remove hid_output_raw_report transport implementations
  HID: sony: do not rely on hid_output_raw_report
  HID: cp2112: remove the last hid_output_raw_report() call
  HID: cp2112: remove various hid_out_raw_report calls
  HID: multitouch: add support of other generic collections in hid-mt
  HID: multitouch: remove pen special handling
  HID: multitouch: remove registered devices with default behavior
  HID: hidp: Add a comment that some devices depend on the current behavior of uniq
  HID: sony: Prevent duplicate controller connections.
  HID: sony: Perform a boundry check on the sixaxis battery level index.
  HID: sony: Fix work queue issues
  HID: sony: Fix multi-line comment styling
  ...
2014-04-02 16:24:28 -07:00
Linus Torvalds
159d8133d0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual rocket science -- mostly documentation and comment updates"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  sparse: fix comment
  doc: fix double words
  isdn: capi: fix "CAPI_VERSION" comment
  doc: DocBook: Fix typos in xml and template file
  Bluetooth: add module name for btwilink
  driver core: unexport static function create_syslog_header
  mmc: core: typo fix in printk specifier
  ARM: spear: clean up editing mistake
  net-sysfs: fix comment typo 'CONFIG_SYFS'
  doc: Insert MODULE_ in module-signing macros
  Documentation: update URL to hfsplus Technote 1150
  gpio: update path to documentation
  ixgbe: Fix format string in ixgbe_fcoe.
  Kconfig: Remove useless "default N" lines
  user_namespace.c: Remove duplicated word in comment
  CREDITS: fix formatting
  treewide: Fix typo in Documentation/DocBook
  mm: Fix warning on make htmldocs caused by slab.c
  ata: ata-samsung_cf: cleanup in header file
  idr: remove unused prototype of idr_free()
2014-04-02 16:23:38 -07:00
Linus Torvalds
05bf58ca4b Merge branch 'sched-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sched/idle changes from Ingo Molnar:
 "More idle code reorganization, to prepare for more integration.

  (Sent separately because it depended on pending timer work, which is
  now upstream)"

* 'sched-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/idle: Add more comments to the code
  sched/idle: Move idle conditions in cpuidle_idle main function
  sched/idle: Reorganize the idle loop
  cpuidle/idle: Move the cpuidle_idle_call function to idle.c
  idle/cpuidle: Split cpuidle_idle_call main function into smaller functions
2014-04-02 16:22:27 -07:00
Oleg Nesterov
d23082257d pid_namespace: pidns_get() should check task_active_pid_ns() != NULL
pidns_get()->get_pid_ns() can hit ns == NULL. This task_struct can't
go away, but task_active_pid_ns(task) is NULL if release_task(task)
was already called. Alternatively we could change get_pid_ns(ns) to
check ns != NULL, but it seems that other callers are fine.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eric W. Biederman ebiederm@xmission.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-02 16:20:21 -07:00
Linus Torvalds
7cbb39d4d4 Merge tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
 "PPC and ARM do not have much going on this time.  Most of the cool
  stuff, instead, is in s390 and (after a few releases) x86.

  ARM has some caching fixes and PPC has transactional memory support in
  guests.  MIPS has some fixes, with more probably coming in 3.16 as
  QEMU will soon get support for MIPS KVM.

  For x86 there are optimizations for debug registers, which trigger on
  some Windows games, and other important fixes for Windows guests.  We
  now expose to the guest Broadwell instruction set extensions and also
  Intel MPX.  There's also a fix/workaround for OS X guests, nested
  virtualization features (preemption timer), and a couple kvmclock
  refinements.

  For s390, the main news is asynchronous page faults, together with
  improvements to IRQs (floating irqs and adapter irqs) that speed up
  virtio devices"

* tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits)
  KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
  KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
  KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
  KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
  KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
  KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
  KVM: PPC: Book3S HV: Add transactional memory support
  KVM: Specify byte order for KVM_EXIT_MMIO
  KVM: vmx: fix MPX detection
  KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
  KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
  KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
  KVM: s390: clear local interrupts at cpu initial reset
  KVM: s390: Fix possible memory leak in SIGP functions
  KVM: s390: fix calculation of idle_mask array size
  KVM: s390: randomize sca address
  KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP
  KVM: Bump KVM_MAX_IRQ_ROUTES for s390
  KVM: s390: irq routing for adapter interrupts.
  KVM: s390: adapter interrupt sources
  ...
2014-04-02 14:50:10 -07:00
Linus Torvalds
64056a9425 Nothing exciting: virtio-blk users might see a bit of a boost from the
doubling of the default queue length though.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJTOirvAAoJENkgDmzRrbjxpqAP/3114WOubf4MNoi24eXNkU2x
 ZC++orq408A72g8dB7A0XAhatKc8Ay0bDP5hOZNAgTHm2hjYxmhpC9UlKEzzElpW
 yjwR0wYBXA0RoJuZqCp9MNgOtkU54QoQZ0c4EZblagslUZBmKQPUDE7XdgkaqO6o
 A3azfCjAFDu523Azep9Npj1sk+H+VH3OIXyMGY+uyHBw1a2rnmIhn4lQCQ+pX+YO
 3wpqxEragpjBAizs1CAAB9wWm2O8zVACxkoUXVYI8Tu60n99lwr9Abxlc0oHSIig
 E7kBnyzQVHfkDrPXR3EdwTi3Hwd/BaOiW4dPvQ3IJKvPOzoiS4H3IpJCnCV5PfRb
 VHl3q//SzPQ+GXH7WH2Fhb9JxoLxBRzFcy3kdIR1wBHYahAOiQjcLgapOO5mVq3X
 PJy9CDs2L9rjbtxvQWtnl62V3JFGw+ZdhhG/BjeC5Who/aSh/mDoss7/qdfrxKJx
 z5IWYSlJw7ighOuF0dPdCKAX9WiWqENvga31Q2svrH4Hxlx6eGumEmX+YQw0iRAf
 ddMYA+1qLT4myPTN0nORFM+T/mkZHNkNMCr0qylRFH0j6hSiDxwWqQG0eXA661By
 W6nIkW++sj2Vkk4knMGCXSyMmy9Nrv+1R+8unQJCXixYevotP5JEY0DoCQwlGuuq
 xa0UR+2q9Htnbytu8S0K
 =AoMS
 -----END PGP SIGNATURE-----

Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull virtio updates from Rusty Russell:
 "Nothing exciting: virtio-blk users might see a bit of a boost from the
  doubling of the default queue length though"

* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  virtio-blk: base queue-depth on virtqueue ringsize or module param
  Revert a02bbb1ccf: MAINTAINERS: add virtio-dev ML for virtio
  virtio: fail adding buffer on broken queues.
  virtio-rng: don't crash if virtqueue is broken.
  virtio_balloon: don't crash if virtqueue is broken.
  virtio_blk: don't crash, report error if virtqueue is broken.
  virtio_net: don't crash if virtqueue is broken.
  virtio_balloon: don't softlockup on huge balloon changes.
  virtio: Use pci_enable_msix_exact() instead of pci_enable_msix()
  MAINTAINERS: virtio-dev is subscribers only
  tools/virtio: add a missing )
  tools/virtio: fix missing kmemleak_ignore symbol
  tools/virtio: update internal copies of headers
2014-04-02 14:43:17 -07:00
Linus Torvalds
7474043eff Merge branch 'for-3.15' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping updates from Marek Szyprowski:
 "This contains extension for more efficient handling of io address
  space for dma-mapping subsystem for ARM architecture"

* 'for-3.15' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  arm: dma-mapping: remove order parameter from arm_iommu_create_mapping()
  arm: dma-mapping: Add support to extend DMA IOMMU mappings
2014-04-02 14:34:25 -07:00
Linus Torvalds
b9f2b21a32 Devicetree changes for v3.15
Updates to devicetree core code. This branch contains the following notable changes:
 * Add reserved memory binding
 * Make struct device_node a kobject and remove legacy /proc/device-tree
 * ePAPR conformance fixes
 * Update in-kernel DTC copy to version v1.4.0
 * Preparation changes for dynamic device tree overlays
 * minor bug fixes and documentation changes
 
 The most significant change in this branch is the conversion of struct
 device_node to be a kobject that is exposed via sysfs and removal of the
 old /proc/device-tree code. This simplifies the device tree handling
 code and tightens up the lifecycle on device tree nodes.
 
 [updated: added fix for dangling select PROC_DEVICETREE]
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJTOyNwAAoJEMWQL496c2LNZY0QAIreUrpo3/hKRau61EDPXkOA
 UFRyPUHD0k/dNXWWDbTfvKH/nAfzdVwejhePqEWiODiFOFkq7JyQlMKPA+CZuZj0
 ygN4215A1yj/hDf6JRD5Zn4WGpawDt9InlbZSps6P5dd8voV5t5dz6uzz+Y7uqaK
 CAjTDlBSmxEen5vRHiHQgKv74au/+b9yfSURjPQVWg46+wl3WJwjsdzerphm4unW
 tpEr8zkIsm51mqqAx4penIuiovh7+L2J5v4BFeg8o+kaZEuZpVxLHJPOuBd5hdom
 zeqEIj3AqHTh5suYIHe4aAbZ2wMP3kYGgkPGwfWLnwLyULxalcCtGZeaCi9nwTFj
 Fdj+7f17ocrt5mif0f5Deufi1LqJsDjhY6G9p7HuV7Y9hsMILpJIUoGENPji+TWj
 BA4L45eaPmNYdKJytEtFD7F2WnXeHZ6fDtYho/39DWW+Bt16IFX85T199irhxGG4
 byN6LRaahk2UeycSXkQHAlWOQHqzBcJJAkQLN2iahzyYRr9Dy+VI2E9clm53m49O
 YQYcONdUlMYrtfRwJpbB9XHM0HgZUvg0LT5z/iHQs9uJtoo33Oj+zxFixyZLQ9Dq
 qyLqQWEpV9gFLAo9tpf56gffkLiJRsHkX4UJ6oTtj4DY1WWU9H81jjCvv/7flzp/
 8ZyyZzANQf1DZ9kqO2v+
 =lyA5
 -----END PGP SIGNATURE-----

Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux

Pull devicetree changes from Grant Likely:
 "Updates to devicetree core code.  This branch contains the following
  notable changes:

   - add reserved memory binding
   - make struct device_node a kobject and remove legacy
     /proc/device-tree
   - ePAPR conformance fixes
   - update in-kernel DTC copy to version v1.4.0
   - preparatory changes for dynamic device tree overlays
   - minor bug fixes and documentation changes

  The most significant change in this branch is the conversion of struct
  device_node to be a kobject that is exposed via sysfs and removal of
  the old /proc/device-tree code.  This simplifies the device tree
  handling code and tightens up the lifecycle on device tree nodes.

  [updated: added fix for dangling select PROC_DEVICETREE]"

* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux: (29 commits)
  dt: Remove dangling "select PROC_DEVICETREE"
  of: Add support for ePAPR "stdout-path" property
  of: device_node kobject lifecycle fixes
  of: only scan for reserved mem when fdt present
  powerpc: add support for reserved memory defined by device tree
  arm64: add support for reserved memory defined by device tree
  of: add missing major vendors
  of: add vendor prefix for SMSC
  of: remove /proc/device-tree
  of/selftest: Add self tests for manipulation of properties
  of: Make device nodes kobjects so they show up in sysfs
  arm: add support for reserved memory defined by device tree
  drivers: of: add support for custom reserved memory drivers
  drivers: of: add initialization code for dynamic reserved memory
  drivers: of: add initialization code for static reserved memory
  of: document bindings for reserved-memory nodes
  Revert "of: fix of_update_property()"
  kbuild: dtbs_install: new make target
  ARM: mvebu: Allows to get the SoC ID even without PCI enabled
  of: Allows to use the PCI translator without the PCI core
  ...
2014-04-02 14:27:15 -07:00
Linus Torvalds
70f6c08757 More ACPI and power management updates for 3.15-rc1
- Remaining changes from upstream ACPICA release 20140214 that introduce
    code to automatically serialize the execution of methods creating any
    named objects which really cannot be executed in parallel with each
    other anyway (previously ACPICA attempted to address that by aborting
    methods upon conflict detection, but that wasn't reliable enough and
    led to other issues).  From Bob Moore and Lv Zheng.
 
  - intel_pstate fix to use del_timer_sync() instead of del_timer() in
    the exit path before freeing the timer structure from Dirk Brandewie
    (original patch from Thomas Gleixner).
 
  - cpufreq fix related to system resume from Viresh Kumar.
 
  - Serialization of frequency transitions in cpufreq that involve
    PRECHANGE and POSTCHANGE notifications to avoid ordering issues
    resulting from race conditions.  From Srivatsa S Bhat and Viresh Kumar.
 
  - Revert of an ACPI processor driver change that was based on a specific
    interpretation of the ACPI spec which may not be correct (the relevant
    part of the spec appears to be incomplete).  From Hanjun Guo.
 
  - Runtime PM core cleanups and documentation updates from Geert Uytterhoeven.
 
  - PNP core cleanup from Michael Opdenacker.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJTO1+vAAoJEILEb/54YlRxHYgP/RB18RLcwSIPMTWoZPo5t+pd
 IGtHkG5xzCBZXiqL9OJLm+dH1V5w+wZVXh2865ZDiqq4CZYZWD6RUQnx5q0rSVR5
 54PYzx2I0i8ApPxRYTTmnb2NHUPedp3l0YSRC0gt73Q/6o9TcmOMtcn5pfTyCvbB
 m3am3mpKKxRD+vYCADjjUtuj4NQ62z9DjM5iJIql7Pj4kAJVgSxP8nsfHY6EwNaT
 m9mnNCA6Zemh89luM1W2vw69ZoZwLAbXIXJYCNy3khT13SYO2SCNhX/tlY7ncCvv
 P+9gawJb6Wio7pVHqRR0Lesc8J29uzivEeS8WqZ3R0P0HoTP6z5a5R+b06ecGQjF
 OWvj7wURjZE4t7qEtIOHmwIyCRE4Nxly90r5upj9kKVBaczz/LbDeCVfKU/Y2Vu6
 PPxmjRwjO4S8FqLihwiXCSYVf3pxBrDKgjjofM7f2CiO8D41C4KhgowbUqyUSCgw
 VKXU6UQbzVigfrGXsdqIJiTnEMmbOvrPy6PaVh27NlwXX3sg1SwYcoegsW+ee7m1
 jJdl1TRI27pl7NPgTkLpf5K7n6mkDsou8Y+PcQhFa6FNTn/k8gp/RfOHpLHaNTjL
 15Aswkm70Ojeae+Ahx8ZgrWXF7iu+uBX7KakeUVQJg/PFjXIspx+c/SrGzh7ZLa1
 aOqoKfFY7zDke4AV3eH/
 =EfZ8
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more ACPI and power management updates from Rafael Wysocki:
 "These are commits that were not quite ready when I sent the original
  pull request for 3.15-rc1 several days ago, but they have spent some
  time in linux-next since then and appear to be good to go.  All of
  them are fixes and cleanups.

  Specifics:

   - Remaining changes from upstream ACPICA release 20140214 that
     introduce code to automatically serialize the execution of methods
     creating any named objects which really cannot be executed in
     parallel with each other anyway (previously ACPICA attempted to
     address that by aborting methods upon conflict detection, but that
     wasn't reliable enough and led to other issues).  From Bob Moore
     and Lv Zheng.

   - intel_pstate fix to use del_timer_sync() instead of del_timer() in
     the exit path before freeing the timer structure from Dirk
     Brandewie (original patch from Thomas Gleixner).

   - cpufreq fix related to system resume from Viresh Kumar.

   - Serialization of frequency transitions in cpufreq that involve
     PRECHANGE and POSTCHANGE notifications to avoid ordering issues
     resulting from race conditions.  From Srivatsa S Bhat and Viresh
     Kumar.

   - Revert of an ACPI processor driver change that was based on a
     specific interpretation of the ACPI spec which may not be correct
     (the relevant part of the spec appears to be incomplete).  From
     Hanjun Guo.

   - Runtime PM core cleanups and documentation updates from Geert
     Uytterhoeven.

   - PNP core cleanup from Michael Opdenacker"

* tag 'pm+acpi-3.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Make cpufreq_notify_transition & cpufreq_notify_post_transition static
  cpufreq: Convert existing drivers to use cpufreq_freq_transition_{begin|end}
  cpufreq: Make sure frequency transitions are serialized
  intel_pstate: Use del_timer_sync in intel_pstate_cpu_stop
  cpufreq: resume drivers before enabling governors
  PM / Runtime: Spelling s/competing/completing/
  PM / Runtime: s/foo_process_requests/foo_process_next_request/
  PM / Runtime: GENERIC_SUBSYS_PM_OPS is gone
  PM / Runtime: Correct documented return values for generic PM callbacks
  PM / Runtime: Split line longer than 80 characters
  PM / Runtime: dev_pm_info.runtime_error is signed
  Revert "ACPI / processor: Make it possible to get APIC ID via GIC"
  ACPICA: Enable auto-serialization as a default kernel behavior.
  ACPICA: Ignore sync_level for methods that have been auto-serialized.
  ACPICA: Add additional named objects for the auto-serialize method scan.
  ACPICA: Add auto-serialization support for ill-behaved control methods.
  ACPICA: Remove global option to serialize all control methods.
  PNP: remove deprecated IRQF_DISABLED
2014-04-02 14:10:21 -07:00
Linus Torvalds
e6d9bfc638 Merge branch 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc non-virtualized cpuidle from Ben Herrenschmidt:
 "This is the branch I mentioned in my other pull request which contains
  our improved cpuidle support for the "powernv" platform
  (non-virtualized).

  It adds support for the "fast sleep" feature of the processor which
  provides higher power savings than our usual "nap" mode but at the
  cost of losing the timers while asleep, and thus exploits the new
  timer broadcast framework to work around that limitation.

  It's based on a tip timer tree that you seem to have already merged"

* 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  cpuidle/powernv: Parse device tree to setup idle states
  cpuidle/powernv: Add "Fast-Sleep" CPU idle state
  powerpc/powernv: Add OPAL call to resync timebase on wakeup
  powerpc/powernv: Add context management for Fast Sleep
  powerpc: Split timer_interrupt() into timer handling and interrupt handling routines
  powerpc: Implement tick broadcast IPI as a fixed IPI message
  powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message
2014-04-02 13:47:29 -07:00