This closes a race where an ipq6hashfn() caller could get a hash value
and race with the cycling of the random seed. By the time they got to
the read_lock they'd have a stale hash value and might not find
previous fragments of their datagram.
This matches the previous patch to IPv4.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] vfs: add splice_write and splice_read to documentation
[PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
[PATCH] splice: warning fix
[PATCH] another round of fs/pipe.c cleanups
[PATCH] splice: comment styles
[PATCH] splice: add Ingo as addition copyright holder
[PATCH] splice: unlikely() optimizations
[PATCH] splice: speedups and optimizations
[PATCH] pipe.c/fifo.c code cleanups
[PATCH] get rid of the PIPE_*() macros
[PATCH] splice: speedup __generic_file_splice_read
[PATCH] splice: add direct fd <-> fd splicing support
[PATCH] splice: add optional input and output offsets
[PATCH] introduce a "kernel-internal pipe object" abstraction
[PATCH] splice: be smarter about calling do_page_cache_readahead()
[PATCH] splice: optimize the splice buffer mapping
[PATCH] splice: cleanup __generic_file_splice_read()
[PATCH] splice: only call wake_up_interruptible() when we really have to
[PATCH] splice: potential !page dereference
[PATCH] splice: mark the io page as accessed
We're using svc_take_page here to get another page for the tail in case one
wasn't already allocated. But there isn't always guaranteed to be another
page available.
Also fix a typo that made us check the tail buffer for space when we meant to
be checking the head buffer.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs. This is inefficient and
possibly buggy.
We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.
This patch replaces for_each_cpu with for_each_possible_cpu under /net
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Andrew Morton <akpm@osdl.org>
net/socket.c:148: warning: initialization from incompatible pointer type
extern declarations in .c files! Bad boy.
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
Deinline a few functions which produce 200+ bytes of code.
Size Uses Wasted Name and definition
===== ==== ====== ================================================
429 3 818 __inet6_lookup include/net/inet6_hashtables.h
404 2 384 __inet6_lookup_established include/net/inet6_hashtables.h
206 3 372 __inet6_hash include/net/inet6_hashtables.h
Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise we could compute an inaccurate hash due to the
random seed changing.
Noticed by Zach Brown and patch is based upon some feedback
from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Thomas de Grenier de Latour <degrenier@easyconnect.fr>
On Sun, 9 Apr 2006 21:56:59 +0400,
Sergey Vlasov <vsu@altlinux.ru> wrote:
> However, show_address() does not output anything unless
> dev->reg_state == NETREG_REGISTERED - and this state is set by
> netdev_run_todo() only after netdev_register_sysfs() returns, so in
> the meantime (while netdev_register_sysfs() is busy adding the
> "statistics" attribute group) some process may see an empty "address"
> attribute.
I've tried the attached patch, suggested by Sergey Vlasov on
hotplug-devel@, and as far as i can test it works just fine.
Signed-off-by: David S. Miller <davem@davemloft.net>
Can't build with CONFIG_NETFILTER=y/m on IA64, there's a missing
#include in net/ipv6/netfilter.c
net/ipv6/netfilter.c: In function `nf_ip6_checksum':
net/ipv6/netfilter.c:92: warning: implicit declaration of function
`csum_ipv6_magic'
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename policer specific _generic_ methods to be specific to
_act_police_
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Besides removing lots of duplicate code, all converted users benefit
from improved HW checksum error handling. Tested with and without HW
checksums in almost all combinations.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add checksum operation which takes care of verifying the checksum and
dealing with HW checksum errors and avoids multiple checksum
operations by setting ip_summed to CHECKSUM_UNNECESSARY after
successful verification.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the queue rerouter intrastructure to a generic usable
infrastructure for address family specific operations as a base for
some cleanups.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When NAT is built as a module, ip_conntrack_netlink can not be linked
statically.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
default_rrq_ttl is used when no TTL is included in the RRQ.
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move prototypes of NAT callbacks to ip_conntrack_h323.h. Because the
use of typedefs as arguments, some header files need to be moved as
well.
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix section mismatch warnings caused by netfilter's init_or_cleanup
functions used in many places by splitting the init from the cleanup
parts.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up hook registration by makeing use of the new mass registration and
unregistration helpers.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes GRE and SIT to generate port unreachable instead of
protocol unreachable errors when we can't find a matching tunnel for a
packet.
This removes the ambiguity as to whether the error is caused by no
tunnel being found or by the lack of support for the given tunnel
type.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
this fixes coverity bug id #1068.
hci_send_sco() frees skb if (skb->len > hdev->sco_mtu).
Since it returns a negative error value only in this case, we
can directly return here.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes an off-by-21-or-49 error ;-) spotted by the Coverity
checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves the sending of ICMP messages when there are no IPv4/IPv6
tunnels present to tunnel4/tunnel6 respectively. Please note that for now
if xfrm4_tunnel/xfrm6_tunnel is loaded then no ICMP messages will ever be
sent. This is similar to how we handle AH/ESP/IPCOMP.
This move fixes the bug where we always send an ICMP message when there is
no ip6_tunnel device present for a given packet even if it is later handled
by IPsec. It also causes ICMP messages to be sent when no IPIP tunnel is
present.
I've decided to use the "port unreachable" ICMP message over the current
value of "address unreachable" (and "protocol unreachable" by GRE) because
it is not ambiguous unlike the other ones which can be triggered by other
conditions. There seems to be no standard specifying what value must be
used so this change should be OK. In fact we should change GRE to use
this value as well.
Incidentally, this patch also fixes a fairly serious bug in xfrm6_tunnel
where we don't check whether the embedded IPv6 header is present before
dereferencing it for the inside source address.
This patch is inspired by a previous patch by Hugo Santos <hsantos@av.it.pt>.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The conntrack code doesn't do re-fragmentation of defragmented packets
anymore but relies on fragmentation in the IP layer. Purely bridged
packets don't pass through the IP layer, so the bridge netfilter code
needs to take care of fragmentation itself.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Seems like leaf (end-nodes) has been freed by __tnode_free_rcu and not
by __leaf_free_rcu. This fixes the problem. Only tnode_free is now
used which checks for appropriate node type. free_leaf can be removed.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to dereference x->encap before dereferencing it for encap_type.
If it's absent then the encap_type is zero.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andi Kleen was right, fput() on sock->file will end up calling
sock_release() if necessary. So here is the rest of his version
of the fix for these leaks.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch extends current iptables compatibility layer in order to get
32bit iptables to work on 64bit kernel. Current layer is insufficient due
to alignment checks both in kernel and user space tools.
Patch is for current net-2.6.17 with addition of move of ipt_entry_{match|
target} definitions to xt_entry_{match|target}.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes expectation notifier unregistration on module unload to
use ip_conntrack_expect_unregister_notifier(). This bug causes a soft
lockup at the first expectation created after a rmmod ; insmod of this
module.
Should go into -stable as well.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When ctnetlink was ported from ip_conntrack to nf_conntrack two #ifdef's
for connmark support were left unchanged and this code was never
compiled.
Problem noticed by Daniel De Graaf.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This unifies ipt_multiport and ip6t_multiport to xt_multiport.
As a result, this addes support for inversion and port range match
to IPv6 packets.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This unifies ipt_esp and ip6t_esp to xt_esp. Please note that now
a user program needs to specify IPPROTO_ESP as protocol to use esp match
with IPv6. This means that ip6tables requires '-p esp' like iptables.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This regression was added by commit:
39d8c1b6fb
("Do not lose accepted socket when -ENFILE/-EMFILE.")
This is based upon a patch from Andi Kleen.
Thanks to Adrian Bridgett for narrowing down a good test case, and to
Andi Kleen and Andrew Morton for eyeballing this code.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the *_decap_state structures which were previously
used to share state between input/post_input. This is no longer
needed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the decap_state argument from the xfrm input hook.
Previously this function allowed the input hook to share state with
the post_input hook. The latter has since been removed.
The only purpose for it now is to check the encap type. However, it
is easier and better to move the encap type check to the generic
xfrm_rcv function. This allows us to get rid of the decap state
argument altogether.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Put a comment in there explaining why we double the setsockopt()
caller's SO_RCVBUF. People keep wondering.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).
From the splice.c comments:
"splice": joining two ropes together by interweaving their strands.
This is the "extended pipe" functionality, where a pipe is used as
an arbitrary in-memory buffer. Think of a pipe as a small kernel
buffer that you can use to transfer data from one end to the other.
The traditional unix read/write is extended with a "splice()" operation
that transfers data buffers to or from a pipe buffer.
Named by Larry McVoy, original implementation from Linus, extended by
Jens to support splicing to files and fixing the initial implementation
bugs.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On a allyesconfig'ured kernel:
Size Uses Wasted Name and definition
===== ==== ====== ================================================
95 162 12075 netif_wake_queue include/linux/netdevice.h
129 86 9265 dev_kfree_skb_any include/linux/netdevice.h
127 56 5885 netif_device_attach include/linux/netdevice.h
73 86 4505 dev_kfree_skb_irq include/linux/netdevice.h
46 60 1534 netif_device_detach include/linux/netdevice.h
119 16 1485 __netif_rx_schedule include/linux/netdevice.h
143 5 492 netif_rx_schedule include/linux/netdevice.h
81 7 366 netif_schedule include/linux/netdevice.h
netif_wake_queue is big because __netif_schedule is a big inline:
static inline void __netif_schedule(struct net_device *dev)
{
if (!test_and_set_bit(__LINK_STATE_SCHED, &dev->state)) {
unsigned long flags;
struct softnet_data *sd;
local_irq_save(flags);
sd = &__get_cpu_var(softnet_data);
dev->next_sched = sd->output_queue;
sd->output_queue = dev;
raise_softirq_irqoff(NET_TX_SOFTIRQ);
local_irq_restore(flags);
}
}
static inline void netif_wake_queue(struct net_device *dev)
{
#ifdef CONFIG_NETPOLL_TRAP
if (netpoll_trap())
return;
#endif
if (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state))
__netif_schedule(dev);
}
By de-inlining __netif_schedule we are saving a lot of text
at each callsite of netif_wake_queue and netif_schedule.
__netif_rx_schedule is also big, and it makes more sense to keep
both of them out of line.
Patch also deinlines dev_kfree_skb_any. We can deinline dev_kfree_skb_irq
instead... oh well.
netif_device_attach/detach are not hot paths, we can deinline them too.
Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Randy Dunlap <rdunlap@xenotime.net>
Use NULL instead of 0 for pointers.
Fix these sparse warnings:
net/dccp/feat.c:207:20: warning: Using plain integer as NULL pointer
net/dccp/feat.c:325:21: warning: Using plain integer as NULL pointer
net/dccp/feat.c:526:20: warning: Using plain integer as NULL pointer
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Patrick Caulfield <patrick@tykepenguin.com>
This patch fixes a bug in the reference counting for the default
DECnet device.
If the device is changed, then the new device had its refcount
decremented rather than the old one!
Signed-off-by: David S. Miller <davem@davemloft.net>
Every netfilter module uses `init' for its module_init() function and
`fini' or `cleanup' for its module_exit() function.
Problem is, this creates uninformative initcall_debug output and makes
ctags rather useless.
So go through and rename them all to $(filename)_init and
$(filename)_fini.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>