linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-04 11:16:48 +07:00

Author	SHA1	Message	Date
Venkat Yekkirala	beb8d13bed	[MLSXFRM]: Add flow labeling This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:27 -07:00
Herbert Xu	e4d5b79c66	[CRYPTO] users: Use crypto_comp and crypto_has_* This patch converts all users to use the new crypto_comp type and the crypto_has_* functions. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-09-21 11:46:22 +10:00
Herbert Xu	07d4ee583e	[IPSEC]: Use HMAC template and hash interface This patch converts IPsec to use the new HMAC template. The names of existing simple digest algorithms may still be used to refer to their HMAC composites. The same structure can be used by other MACs such as AES-XCBC-MAC. This patch also switches from the digest interface to hash. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-21 11:46:18 +10:00
Herbert Xu	6b7326c849	[IPSEC] ESP: Use block ciphers where applicable This patch converts IPSec/ESP to use the new block cipher type where applicable. Similar to the HMAC conversion, existing algorithm names have been kept for compatibility. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-09-21 11:46:14 +10:00
Remi Denis-Courmont	d0ee011f72	[IPV6]: Accept -1 for IPV6_TCLASS This patch should add support for -1 as "default" IPv6 traffic class, as specified in IETF RFC3542 §6.5. Within the kernel, it seems tclass < 0 is already handled, but setsockopt, getsockopt and recvmsg calls won't accept it from userland. Signed-off-by: Remi Denis-Courmont <rdenis@simphalempin.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:08 -07:00
YOSHIFUJI Hideaki	e012d51cbc	[IPV6]: Fix tclass setting for raw sockets. np->cork.tclass is used only in cork'ed context. Otherwise, np->tclass should be used. Bug#7096 reported by Remi Denis-Courmont <rdenis@simphalempin.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:07 -07:00
YOSHIFUJI Hideaki	99c7bc0133	[IPV6]: Fix kernel OOPs when setting sticky socket options. Bug noticed by Remi Denis-Courmont <rdenis@simphalempin.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-31 14:52:17 -07:00
Keir Fraser	57f5f544f5	[IPV6]: ipv6_add_addr should install dstentry earlier ipv6_add_addr allocates a struct inet6_ifaddr and a dstentry, but it doesn't install the dstentry in ifa->rt until after it releases the addrconf_hash_lock. This means other CPUs will be able to see the new address while it hasn't been initialized completely yet. One possible fix would be to grab the ifp->lock spinlock when creating the address struct; a simpler fix is to just move the assignment. Acked-by: jbeulich@novell.com Acked-by: okir@suse.de Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-29 21:22:18 -07:00
Lv Liangying	76d0cc1b64	[IPV6]: SNMPv2 "ipv6IfStatsInAddrErrors" counter error When I tested Linux kernel 2.6.17.7 about statistics "ipv6IfStatsInAddrErrors", found that this counter couldn't increase correctly. The criteria is RFC2465: ipv6IfStatsInAddrErrors OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of input datagrams discarded because the IPv6 address in their IPv6 header's destination field was not a valid address to be received at this entity. This count includes invalid addresses (e.g., ::0) and unsupported addresses (e.g., addresses with unallocated prefixes). For entities which are not IPv6 routers and therefore do not forward datagrams, this counter includes datagrams discarded because the destination address was not a local address." ::= { ipv6IfStatsEntry 5 } When I send packet to host with destination that is either invalid address(::0) or unsupported addresses(1::1), the Linux kernel just discard the packet, and the counter doesn't increase(in the function ip6_pkt_discard). Signed-off-by: Lv Liangying <lvly@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-29 21:22:15 -07:00
Stephen Hemminger	59eed279c5	[IPV6]: Segmentation offload not set correctly on TCP children TCP over IPV6 would incorrectly inherit the GSO settings. This would cause kernel to send Tcp Segmentation Offload packets for IPV6 data to devices that can't handle it. It caused the sky2 driver to lock http://bugzilla.kernel.org/show_bug.cgi?id=7050 and the e1000 would generate bogus packets. I can't blame the hardware for gagging if the upper layers feed it garbage. This was a new bug in 2.6.18 introduced with GSO support. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-26 18:42:01 -07:00
David L Stevens	acd6e00b8e	[MCAST]: Fix filter leak on device removal. This fixes source filter leakage when a device is removed and a process leaves the group thereafter. This also includes corresponding fixes for IPv6 multicast source filters on device removal. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 16:29:57 -07:00
Ingo Molnar	640c41c77a	[IPV6] lockdep: annotate __icmpv6_socket Split off __icmpv6_socket's sk->sk_dst_lock class, because it gets used from softirqs, which is safe for __icmpv6_sockets (because they never get directly used via userspace syscalls), but unsafe for normal sockets. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 16:29:48 -07:00
Herbert Xu	e9fa4f7bd2	[INET]: Use pskb_trim_unique when trimming paged unique skbs The IPv4/IPv6 datagram output path was using skb_trim to trim paged packets because they know that the packet has not been cloned yet (since the packet hasn't been given to anything else in the system). This broke because skb_trim no longer allows paged packets to be trimmed. Paged packets must be given to one of the pskb_trim functions instead. This patch adds a new pskb_trim_unique function to cover the IPv4/IPv6 datagram output path scenario and replaces the corresponding skb_trim calls with it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 20:12:58 -07:00
Patrick McHardy	0eff66e625	[NETFILTER]: {arp,ip,ip6}_tables: proper error recovery in init path Neither of {arp,ip,ip6}_tables cleans up behind itself when something goes wrong during initialization. Noticed by Rennie deGraaf <degraaf@cpsc.ucalgary.ca> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 18:57:28 -07:00
Herbert Xu	06aebfb7fa	[IPV6]: The ifa lock is a BH lock The ifa lock is expected to be taken in BH context (by addrconf timers) so we must disable BH when accessing it from user context. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-09 16:52:04 -07:00
Wei Dong	dafee49085	[IPV6]: SNMPv2 "ipv6IfStatsOutFragCreates" counter error When I tested linux kernel 2.6.17.7 about statistics "ipv6IfStatsOutFragCreates", and found that it couldn't increase correctly. The criteria is RFC 2465: ipv6IfStatsOutFragCreates OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of output datagram fragments that have been generated as a result of fragmentation at this output interface." ::= { ipv6IfStatsEntry 15 } I think there are two issues in Linux kernel. 1st: RFC2465 specifies the counter is "The number of output datagram fragments...". I think increasing this counter after output a fragment successfully is better. And it should not be increased even though a fragment is created but failed to output. 2nd: If we send a big ICMP/ICMPv6 echo request to a host, and receive ICMP/ICMPv6 echo reply consisted of some fragments. As we know that in Linux kernel first fragmentation occurs in ICMP layer(maybe saying transport layer is better), but this is not the "real" fragmentation,just do some "pre-fragment" -- allocate space for data, and form a frag_list, etc. The "real" fragmentation happens in IP layer -- set offset and MF flag and so on. So I think in "fast path" for ip_fragment/ip6_fragment, if we send a fragment which "pre-fragment" by upper layer we should also increase "ipv6IfStatsOutFragCreates". Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:41:21 -07:00
Wei Dong	32c524d1c4	[IPV6]: SNMPv2 "ipv6IfStatsInHdrErrors" counter error When I tested Linux kernel 2.6.17.7 about statistics "ipv6IfStatsInHdrErrors", found that this counter couldn't increase correctly. The criteria is RFC2465: ipv6IfStatsInHdrErrors OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of input datagrams discarded due to errors in their IPv6 headers, including version number mismatch, other format errors, hop count exceeded, errors discovered in processing their IPv6 options, etc." ::= { ipv6IfStatsEntry 2 } When I send TTL=0 and TTL=1 a packet to a router which need to be forwarded, router just sends an ICMPv6 message to tell the sender that TIME_EXCEED and HOPLIMITS, but no increments for this counter(in the function ip6_forward). Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:39:57 -07:00
Tom Tucker	8d71740c56	[NET]: Core net changes to generate netevents Generate netevents for: - neighbour changes - routing redirects - pmtu changes Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:21 -07:00
Wei Yongjun	3687b1dc6f	[TCP]: SNMPv2 tcpAttemptFails counter error Refer to RFC2012, tcpAttemptFails is defined as following: tcpAttemptFails OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of times TCP connections have made a direct transition to the CLOSED state from either the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state." ::= { tcp 7 } When I lookup into RFC793, I found that the state change should occur under the following conditions: 1. SYN-SENT -> CLOSED a) Received ACK,RST segment when SYN-SENT state. 2. SYN-RCVD -> CLOSED b) Received SYN segment when SYN-RCVD state(came from LISTEN). c) Received RST segment when SYN-RCVD state(came from SYN-SENT). d) Received SYN segment when SYN-RCVD state(came from SYN-SENT). 3. SYN-RCVD -> LISTEN e) Received RST segment when SYN-RCVD state(came from LISTEN). In my test, those direct state transition can not be counted to tcpAttemptFails. Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:19 -07:00
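The transitions that the RFC 2012 text quoted in the tcpAttemptFails commit above counts as failed attempts can be written as a small predicate. This is only an illustrative sketch: the enum and function names are hypothetical, and they are neither the kernel's own TCP state constants nor the code changed by that commit.

```c
#include <stdbool.h>

/* Hypothetical state names for illustration; not the kernel's TCP_* constants. */
enum tcp_state_sketch {
    ST_CLOSED,
    ST_LISTEN,
    ST_SYN_SENT,
    ST_SYN_RCVD,
    ST_ESTABLISHED
};

/*
 * Returns true when a direct transition from -> to matches the RFC 2012
 * definition of tcpAttemptFails quoted in the commit message:
 *   SYN-SENT or SYN-RCVD -> CLOSED, or SYN-RCVD -> LISTEN.
 */
static bool counts_as_attempt_fail(enum tcp_state_sketch from,
                                   enum tcp_state_sketch to)
{
    if (to == ST_CLOSED)
        return from == ST_SYN_SENT || from == ST_SYN_RCVD;
    if (to == ST_LISTEN)
        return from == ST_SYN_RCVD;
    return false;
}
```

A counter following this definition would be incremented once for every transition where the predicate returns true, and left alone otherwise.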
Herbert Xu	497c615aba	[IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls The current users of ip6_dst_lookup can be divided into two classes: 1) The caller holds no locks and is in user-context (UDP). 2) The caller does not want to lookup the dst cache at all. The second class covers everyone except UDP because most people do the cache lookup directly before calling ip6_dst_lookup. This patch adds ip6_sk_dst_lookup for the first class. Similarly ip6_dst_store users can be divided into those that need to take the socket dst lock and those that don't. This patch adds __ip6_dst_store for those (everyone except UDP/datagram) that don't need an extra lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:14 -07:00
Patrick McHardy	679e898a47	[XFRM]: Fix protocol field value for outgoing IPv6 GSO packets Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:13 -07:00
Noriaki TAKAMIYA	081bba5b3a	[IPV6] ADDRCONF: NLM_F_REPLACE support for RTM_NEWADDR Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:12 -07:00
Noriaki TAKAMIYA	6c22382805	[IPV6] ADDRCONF: Support get operation of single address Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:11 -07:00
YOSHIFUJI Hideaki	8f27ebb982	[IPV6] ADDRCONF: Do not verify an address with infinity lifetime We also do not try regenerating new temporary address corresponding to an address with infinite preferred lifetime. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:10 -07:00
Noriaki TAKAMIYA	0778769d39	[IPV6] ADDRCONF: Allow user-space to specify address lifetime Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:09 -07:00
YOSHIFUJI Hideaki	643162258e	[IPV6] ADDRCONF: Check payload length for IFA_LOCAL attribute in RTM_{ADD,DEL}MSG message Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:08 -07:00
Tetsuo Handa	f59fc7f30b	[IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg(). From: Tetsuo Handa <from-linux-kernel@i-love.sakura.ne.jp> The recvmsg() for raw socket seems to return random u16 value from the kernel stack memory since port field is not initialized. But I'm not sure this patch is correct. Does raw socket return any information stored in port field? [ BSD defines RAW IP recvmsg to return a sin_port value of zero. This is described in Stevens' TCP/IP Illustrated Volume 2 on page 1055, which is discussing the BSD rip_input() implementation. ] Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 17:05:35 -07:00
Guillaume Chazarain	6b7fdc3ae1	[IPV6]: Clean skb cb on IPv6 input. Clear the accumulated junk in IP6CB when starting to handle an IPV6 packet. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 23:44:44 -07:00
David S. Miller	a922ba5510	[IPV6] xfrm6_tunnel: Delete debugging code. It doesn't compile, and it's dubious in several regards: 1) is enabled by non-Kconfig controlled CONFIG_* value (noted by Randy Dunlap) 2) XFRM6_TUNNEL_SPI_MAGIC is defined after its first use 3) the debugging messages print object pointer addresses which have no meaning without context So let's just get rid of it. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 13:49:06 -07:00
Panagiotis Issaris	0da974f4f3	[NET]: Conversions from kmalloc+memset to k(z|c)alloc. Signed-off-by: Panagiotis Issaris <takis@issaris.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-21 14:51:30 -07:00
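The conversion named in the commit title above replaces an allocation followed by an explicit memset with a single zeroing allocator. The following is a user-space analogue, offered only as a sketch: `malloc` and `calloc` stand in for the kernel's `kmalloc` and `kzalloc`/`kcalloc`, and `struct entry_sketch` and both function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure used only for this illustration. */
struct entry_sketch {
    int id;
    struct entry_sketch *next;
};

/* Before: allocate, then zero by hand (analogous to kmalloc + memset). */
static struct entry_sketch *alloc_entry_before(void)
{
    struct entry_sketch *e = malloc(sizeof(*e));

    if (e)
        memset(e, 0, sizeof(*e));
    return e;
}

/* After: one call that returns zeroed memory (analogous to kzalloc). */
static struct entry_sketch *alloc_entry_after(void)
{
    return calloc(1, sizeof(struct entry_sketch));
}
```

Both functions hand back a zero-filled object or NULL; the second form is shorter and cannot forget the memset.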
Herbert Xu	5d9c5a3292	[IPV4]: Get rid of redundant IPCB->opts initialisation Now that we always zero the IPCB->opts in ip_rcv, it is no longer necessary to do so before calling netif_rx for tunneled packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-21 14:29:53 -07:00
Herbert Xu	da952315c9	[IPCOMP]: Fix truesize after decompression The truesize check has uncovered the fact that we forgot to update truesize after pskb_expand_head. Unfortunately pskb_expand_head can't update it for us because it's used in all sorts of different contexts, some of which would not allow truesize to be updated by itself. So the solution for now is to simply update it in IPComp. This patch also changes skb_put to __skb_put since we've just expanded tailroom by exactly that amount so we know it's there (but gcc does not). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:58:55 -07:00
YOSHIFUJI Hideaki	8a6ce0c083	[IPV6]: Use ipv6_addr_src_scope for link address sorting. In the source address selection, the address must be sorted from global to node-local. But, ifp->scope is different from the scope for source address selection. 2001::1 fe80::1 ::1 ifp->scope 0 0x02 0x01 ipv6_addr_src_scope(&ifp->addr) 0x0e 0x02 0x01 So, we need to use ipv6_addr_src_scope(&ifp->addr) for sorting. And, for backward compatibility, addresses should be sorted from new one to old one. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:58:53 -07:00
Brian Haley	e55ffac601	[IPV6]: order addresses by scope If IPv6 addresses are ordered by scope, then ipv6_dev_get_saddr() can break-out of the device addr_list for() loop when the candidate source address scope is less than the destination address scope. Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:58:37 -07:00
Herbert Xu	a430a43d08	[NET] gso: Fix up GSO packets with broken checksums Certain subsystems in the stack (e.g., netfilter) can break the partial checksum on GSO packets. Until they're fixed, this patch allows this to work by recomputing the partial checksums through the GSO mechanism. Once they've all been converted to update the partial checksum instead of clearing it, this workaround can be removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-08 13:34:56 -07:00
Herbert Xu	89114afd43	[NET] gso: Add skb_is_gso This patch adds the wrapper function skb_is_gso which can be used instead of directly testing skb_shinfo(skb)->gso_size. This makes things a little nicer and allows us to change the primary key for indicating whether an skb is GSO (if we ever want to do that). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-08 13:34:32 -07:00
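The wrapper described in the `skb_is_gso` commit above hides the test on `gso_size` behind one function, so the indicator can change later without touching callers. A minimal sketch of that idea follows; the structures are simplified, hypothetical stand-ins (the real `sk_buff` and shared-info types have many more fields), and the `_sketch` names are not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures, for illustration only. */
struct shared_info_sketch {
    unsigned short gso_size;   /* non-zero when the packet is a GSO packet */
};

struct skb_sketch {
    struct shared_info_sketch shinfo;
};

/* Callers ask "is this GSO?" instead of testing gso_size directly. */
static inline bool skb_is_gso_sketch(const struct skb_sketch *skb)
{
    assert(skb != NULL);
    return skb->shinfo.gso_size != 0;
}
```

If the indicator ever moved to a different field, only the body of the wrapper would need to change.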
Randy Dunlap	4bdbf6c033	[NET]: add+use poison defines Add and use poison defines in net/. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-03 19:47:27 -07:00
Michael Chan	6703931c54	[IPV6]: Fix ipv6 GSO payload length Fix ipv6 GSO payload length calculation. The ipv6 payload length excludes the ipv6 base header length and so must be subtracted. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-03 19:41:11 -07:00
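The rule stated in the payload-length commit above is that the IPv6 payload length field does not include the 40-byte base header. A small sketch of that arithmetic follows; the function and macro names are hypothetical, and the helper works on plain lengths rather than on a kernel `sk_buff`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_BASE_HDR_LEN_SKETCH 40u  /* fixed size of the IPv6 base header */

/*
 * Given the length of a packet starting at the IPv6 base header, return the
 * value for the payload length field (host byte order): everything after the
 * base header, including any extension headers.
 */
static uint16_t ipv6_payload_len_sketch(size_t packet_len)
{
    assert(packet_len >= IPV6_BASE_HDR_LEN_SKETCH);
    assert(packet_len - IPV6_BASE_HDR_LEN_SKETCH <= UINT16_MAX);
    return (uint16_t)(packet_len - IPV6_BASE_HDR_LEN_SKETCH);
}
```

For a 1500-byte packet this gives 1460; forgetting the subtraction would overstate the payload by exactly the base header size.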
Herbert Xu	bbcf467dab	[NET]: Verify gso_type too in gso_segment We don't want nasty Xen guests to pass a TCPv6 packet in with gso_type set to TCPv4 or even UDP (or a packet that's both TCP and UDP). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-03 19:38:35 -07:00
Linus Torvalds	e37a72de84	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV6]: Added GSO support for TCPv6 [NET]: Generalise TSO-specific bits from skb_setup_caps [IPV6]: Added GSO support for TCPv6 [IPV6]: Remove redundant length check on input [NETFILTER]: SCTP conntrack: fix crash triggered by packet without chunks [TG3]: Update version and reldate [TG3]: Add TSO workaround using GSO [TG3]: Turn on hw fix for ASF problems [TG3]: Add rx BD workaround [TG3]: Add tg3_netif_stop() in vlan functions [TCP]: Reset gso_segs if packet is dodgy	2006-06-30 15:40:17 -07:00
Herbert Xu	f83ef8c0b5	[IPV6]: Added GSO support for TCPv6 This patch adds GSO support for IPv6 and TCPv6. This is based on a patch by Ananda Raju <Ananda.Raju@neterion.com>. His original description is: This patch enables TSO over IPv6. Currently Linux network stacks restricts TSO over IPv6 by clearing of the NETIF_F_TSO bit from "dev->features". This patch will remove this restriction. This patch will introduce a new flag NETIF_F_TSO6 which will be used to check whether device supports TSO over IPv6. If device support TSO over IPv6 then we don't clear of NETIF_F_TSO and which will make the TCP layer to create TSO packets. Any device supporting TSO over IPv6 will set NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO. In case when user disables TSO using ethtool, NETIF_F_TSO will get cleared from "dev->features". So even if we have NETIF_F_TSO6 we don't get TSO packets created by TCP layer. SKB_GSO_TCPV4 renamed to SKB_GSO_TCP to make it generic GSO packet. SKB_GSO_UDPV4 renamed to SKB_GSO_UDP as UFO is not a IPv4 feature. UFO is supported over IPv6 also The following table shows there is significant improvement in throughput with normal frames and CPU usage for both normal and jumbo. -------------------------------------------------- | | 1500 | 9600 | | ------------------|-------------------| | | thru CPU | thru CPU | -------------------------------------------------- | TSO OFF | 2.00 5.5% id | 5.66 20.0% id | -------------------------------------------------- | TSO ON | 2.63 78.0 id | 5.67 39.0% id | -------------------------------------------------- Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:10 -07:00
Herbert Xu	adcfc7d0b4	[IPV6]: Added GSO support for TCPv6 This patch adds GSO support for IPv6 and TCPv6. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:06 -07:00
Herbert Xu	2889139a6a	[IPV6]: Remove redundant length check on input We don't need to check skb->len when we're just about to call pskb_may_pull since that checks it for us. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:04 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Sridhar Samudrala	47da8ee681	[TCP]: Export accept queue len of a TCP listening socket via rx_queue While debugging a TCP server hang issue, we noticed that currently there is no way for a user to get the acceptq backlog value for a TCP listen socket. All the standard networking utilities that display socket info like netstat, ss and /proc/net/tcp have 2 fields called rx_queue and tx_queue. These fields do not mean much for listening sockets. This patch uses one of these unused fields(rx_queue) to export the accept queue len for listening sockets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:57 -07:00
Darrel Goeddel	c7bdb545d2	[NETLINK]: Encapsulate eff_cap usage within security framework. This patch encapsulates the usage of eff_cap (in netlink_skb_params) within the security framework by extending security_netlink_recv to include a required capability parameter and converting all direct usage of eff_caps outside of the lsm modules to use the interface. It also updates the SELinux implementation of the security_netlink_send and security_netlink_recv hooks to take advantage of the sid in the netlink_skb_params struct. This also enables SELinux to perform auditing of netlink capability checks. Please apply, for 2.6.18 if possible. Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:55 -07:00
Patrick McHardy	da298d3a4f	[NETFILTER]: x_tables: fix xt_register_table error propagation When xt_register_table fails the error is not properly propagated back. Based on patch by Lepton Wu <ytht.net@gmail.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:40 -07:00
Ingo Molnar	34af946a22	[PATCH] spin/rwlock init cleanups locking init cleanups: - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK() - convert rwlocks in a similar manner this patch was generated automatically. Motivation: - cleanliness - lockdep needs control of lock initialization, which the open-coded variants do not give - it's also useful for -rt and for lock debugging in general Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:39 -07:00
Herbert Xu	09b8f7a93e	[IPSEC]: Handle GSO packets This patch segments GSO packets received by the IPsec stack. This can happen when a NIC driver injects GSO packets into the stack which are then forwarded to another host. The primary application of this is going to be Xen where its backend driver may inject GSO packets into dom0. Of course this also can be used by other virtualisation schemes such as VMWare or UML since the tap device could be modified to inject GSO packets received through splice. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:38 -07:00
Herbert Xu	7967168cef	[NET]: Merge TSO/UFO fields in sk_buff Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not going to scale if we add any more segmentation methods (e.g., DCCP). So let's merge them. They were used to tell the protocol of a packet. This function has been subsumed by the new gso_type field. This is essentially a set of netdev feature bits (shifted by 16 bits) that are required to process a specific skb. As such it's easy to tell whether a given device can process a GSO skb: you just have to AND the gso_type field and the netdev's features field. I've made gso_type a conjunction. The idea is that you have a base type (e.g., SKB_GSO_TCPV4) that can be modified further to support new features. For example, if we add a hardware TSO type that supports ECN, they would declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO packets would be SKB_GSO_TCPV4. This means that only the CWR packets need to be emulated in software. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:29 -07:00
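The capability check described in the gso_type commit above (shift the type bits by 16 and AND them with the device's feature bits) can be sketched as follows. The bit values and all `_SKETCH` names are assumptions chosen for illustration; they are not the kernel's actual `SKB_GSO_*` or `NETIF_F_*` definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define GSO_SHIFT_SKETCH      16
#define GSO_TCPV4_SKETCH      (1u << 0)
#define GSO_TCPV4_ECN_SKETCH  (1u << 1)

/*
 * A device can process a GSO packet when every feature bit required by the
 * packet's gso_type (shifted into the feature-bit range) is set in the
 * device's feature mask.
 */
static bool dev_can_handle_gso_sketch(uint32_t dev_features, uint32_t gso_type)
{
    uint32_t required = gso_type << GSO_SHIFT_SKETCH;

    return (dev_features & required) == required;
}
```

With this layout, a device that advertises only the base TCPv4 bit accepts plain TSO packets, while packets that also carry the ECN modifier fail the check and would need software segmentation.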
YOSHIFUJI Hideaki	5e2707fa3a	[IPV6] ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY We need to update hiscore.rule even if we don't enable CONFIG_IPV6_PRIVACY, because we have more less significant rule; longest match. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:24 -07:00
Łukasz Stelmach	102128e3a2	[IPV6]: Fix source address selection. Two additional labels (RFC 3484, sec. 10.3) for IPv6 addresses are defined to make a distinction between global unicast addresses and Unique Local Addresses (fc00::/7, RFC 4193) and Teredo (2001::/32, RFC 4380). It is necessary to avoid attempts of connection that would either fail (eg. fec0:: to 2001:feed::) or be sub-optimal (2001:0:: to 2001:feed::). Signed-off-by: Łukasz Stelmach <stlman@poczta.fm> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:22 -07:00
YOSHIFUJI Hideaki	c5396a31b2	[IPV6]: Sum real space for RTAs. This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications. Issue pointed out by Patrick McHardy <kaber@trash.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 22:48:48 -07:00
Herbert Xu	b38dfee3d6	[NET]: skb_trim audit I found a few more spots where pskb_trim_rcsum could be used but were not. This patch changes them to use it. Also, sk_filter can get paged skb data. Therefore we must use pskb_trim instead of skb_trim. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:20 -07:00
Herbert Xu	364c6badde	[NET]: Clean up skb_linearize The linearisation operation doesn't need to be super-optimised. So we can replace __skb_linearize with __pskb_pull_tail which does the same thing but is more general. Also, most users of skb_linearize end up testing whether the skb is linear or not so it helps to make skb_linearize do just that. Some callers of skb_linearize also use it to copy cloned data, so it's useful to have a new function skb_linearize_cow to copy the data if it's either non-linear or cloned. Last but not least, I've removed the gfp argument since nobody uses it anymore. If it's ever needed we can easily add it back. Misc bugs fixed by this patch: * via-velocity error handling (also, no SG => no frags) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:16 -07:00
James Morris	984bc16cc9	[SECMARK]: Add secmark support to core networking. Add a secmark field to the skbuff structure, to allow security subsystems to place security markings on network packets. This is similar to the nfmark field, except is intended for implementing security policy, rather than networking policy. This patch was already acked in principle by Dave Miller. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:57 -07:00
Patrick McHardy	39a27a35c5	[NETFILTER]: conntrack: add sysctl to disable checksumming Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:57 -07:00
Patrick McHardy	6442f1cf89	[NETFILTER]: conntrack: don't call helpers for related ICMP messages None of the existing helpers expects to get called for related ICMP packets and some even drop them if they can't parse them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:55 -07:00
Herbert Xu	31a4ab9302	[IPSEC] proto: Move transport mode input path into xfrm_mode_transport Now that we have xfrm_mode objects we can move the transport mode specific input decapsulation code into xfrm_mode_transport. This removes duplicate code as well as unnecessary header movement in case of tunnel mode SAs since we will discard the original IP header immediately. This also fixes a minor bug for transport-mode ESP where the IP payload length is set to the correct value minus the header length (with extension headers for IPv6). Of course the other neat thing is that we no longer have to allocate temporary buffers to hold the IP headers for ESP and IPComp. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:41 -07:00
Herbert Xu	b59f45d0b2	[IPSEC] xfrm: Abstract out encapsulation modes This patch adds the structure xfrm_mode. It is meant to represent the operations carried out by transport/tunnel modes. By doing this we allow additional encapsulation modes to be added without clogging up the xfrm_input/xfrm_output paths. Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and BEET modes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:39 -07:00
Herbert Xu	546be2405b	[IPSEC] xfrm: Undo afinfo lock proliferation The number of locks used to manage afinfo structures can easily be reduced down to one each for policy and state respectively. This is based on the observation that the write locks are only held by module insertion/removal which are very rare events so there is no need to further differentiate between the insertion of modules like ipv6 versus esp6. The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look suspicious at first. However, after you realise that nobody ever takes the corresponding write lock you'll feel better :) As far as I can gather it's an attempt to guard against the removal of the corresponding modules. Since neither module can be unloaded at all we can leave it to whoever fixes up IPv6 unloading :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:37 -07:00
Chris Leech	1a2449a87b	[I/OAT]: TCP recv offload to I/OAT Locks down user pages and sets up for DMA in tcp_recvmsg, then calls dma_async_try_early_copy in tcp_v4_do_rcv Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:25:56 -07:00
YOSHIFUJI Hideaki	4d0c591166	[IPV6] ROUTE: Don't try less preferred routes for on-link routes. In addition to the real on-link routes, NONEXTHOP routes should be considered on-link. Problem reported by Meelis Roos <mroos@linux.ee>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-26 13:23:41 -07:00
Alexey Dobriyan	4195f81453	[NET]: Fix "ntohl(ntohs" bugs Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-22 16:53:22 -07:00
Solar Designer	2c8ac66bb2	[NETFILTER]: Fix do_add_counters race, possible oops or info leak (CVE-2006-0039) Solar Designer found a race condition in do_add_counters(). The beginning of paddc is supposed to be the same as tmp which was sanity-checked above, but it might not be the same in reality. In case the integer overflow and/or the race condition are triggered, paddc->num_counters might not match the allocation size for paddc. If the check below (t->private->number != paddc->num_counters) nevertheless passes (perhaps this requires the race condition to be triggered), IPT_ENTRY_ITERATE() would read kernel memory beyond the allocation size, potentially causing an oops or leaking sensitive data (e.g., passwords from host system or from another VPS) via counter increments. This requires CAP_NET_ADMIN. Signed-off-by: Solar Designer <solar@openwall.com> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-19 02:16:52 -07:00
Philip Craig	5c170a09d9	[NETFILTER]: fix format specifier for netfilter log targets The prefix argument for nf_log_packet is a format specifier, so don't pass the user defined string directly to it. Signed-off-by: Philip Craig <philipc@snapgear.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-19 02:15:47 -07:00
Alexey Dobriyan	d8fd0a7316	[IPV6]: Endian fix in net/ipv6/netfilter/ip6t_eui64.c:match(). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-16 15:24:41 -07:00
Alexey Kuznetsov	b0013fd47b	[IPV6]: skb leakage in inet6_csk_xmit inet6_csk_xmit does not free skb when routing fails. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-10 13:24:38 -07:00
YOSHIFUJI Hideaki	c302e6d54e	[IPV6]: Fix race in route selection. We eliminated rt6_dflt_lock (to protect default router pointer) at 2.6.17-rc1, and introduced rt6_select() for general router selection. The function is called in the context of rt6_lock read-lock held, but this means we have some race conditions when we do round-robin. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-29 18:33:22 -07:00
Patrick McHardy	e4a79ef811	[NETFILTER]: ip6_tables: remove broken comefrom debugging The introduction of x_tables broke comefrom debugging, remove it from ip6_tables as well (ip_tables already got removed). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-24 17:27:32 -07:00
YOSHIFUJI Hideaki	b809739a1b	[IPV6]: Clean up hop-by-hop options handler. - Removed unused argument (nhoff) for ipv6_parse_hopopts(). - Make ipv6_parse_hopopts() to align with other extension header handlers. - Removed pointless assignment (hdr), which is not used afterwards. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:53 -07:00
YOSHIFUJI Hideaki	e5d25a9088	[IPV6] XFRM: Fix decoding session with preceding extension header(s). We did not correctly decode session with preceding extension header(s). This was because we had already pulled preceding headers, skb->nh.raw + 40 + 1 - skb->data was minus, and pskb_may_pull() failed. We now have IP6CB(skb)->nhoff and skb->h.raw, and we can start parsing / decoding upper layer protocol from current position. Tracked down by Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> and tested by Kazunori Miyazawa <kazunori@miyazawa.org>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:52 -07:00
YOSHIFUJI Hideaki	e3cae904d7	[IPV6] XFRM: Don't use old copy of pointer after pskb_may_pull(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:51 -07:00
YOSHIFUJI Hideaki	ec6700958a	[IPV6]: Ensure to have hop-by-hop options in our header of &sk_buff. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:50 -07:00
Zach Brown	f6596f9d2b	[IPv6] reassembly: Always compute hash under the fragment lock. This closes a race where an ipq6hashfn() caller could get a hash value and race with the cycling of the random seed. By the time they got to the read_lock they'd have a stale hash value and might not find previous fragments of their datagram. This matches the previous patch to IPv4. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-11 17:21:05 -07:00
KAMEZAWA Hiroyuki	6f91204225	[PATCH] for_each_possible_cpu: network codes for_each_cpu() actually iterates across all possible CPUs. We've had mistakes in the past where people were using for_each_cpu() where they should have been iterating across only online or present CPUs. This is inefficient and possibly buggy. We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the future. This patch replaces for_each_cpu with for_each_possible_cpu under /net Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:31 -07:00
Denis Vlasenko	b1a7ffcb7a	[IPV6]: Deinline few large functions in inet6 code Deinline a few functions which produce 200+ bytes of code. Size Uses Wasted Name and definition ===== ==== ====== ================================================ 429 3 818 __inet6_lookup include/net/inet6_hashtables.h 404 2 384 __inet6_lookup_established include/net/inet6_hashtables.h 206 3 372 __inet6_hash include/net/inet6_hashtables.h Signed-off-by: Denis Vlasenko <vda@ilport.com.ua> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:48:59 -07:00
Brian Haley	503e4faad1	[NETFILTER]: Fix build with CONFIG_NETFILTER=y/m on IA64 Can't build with CONFIG_NETFILTER=y/m on IA64, there's a missing #include in net/ipv6/netfilter.c net/ipv6/netfilter.c: In function `nf_ip6_checksum': net/ipv6/netfilter.c:92: warning: implicit declaration of function `csum_ipv6_magic' Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:49 -07:00
Patrick McHardy	96f6bf82ea	[NETFILTER]: Convert conntrack/ipt_REJECT to new checksumming functions Besides removing lots of duplicate code, all converted users benefit from improved HW checksum error handling. Tested with and without HW checksums in almost all combinations. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:42 -07:00
Patrick McHardy	422c346fad	[NETFILTER]: Add address family specific checksum helpers Add checksum operation which takes care of verifying the checksum and dealing with HW checksum errors and avoids multiple checksum operations by setting ip_summed to CHECKSUM_UNNECESSARY after successful verification. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:41 -07:00
Patrick McHardy	bce8032ef3	[NETFILTER]: Introduce infrastructure for address family specific operations Change the queue rerouter infrastructure to a generically usable infrastructure for address family specific operations as a base for some cleanups. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:40 -07:00
Patrick McHardy	32292a7ff1	[NETFILTER]: Fix section mismatch warnings Fix section mismatch warnings caused by netfilter's init_or_cleanup functions used in many places by splitting the init from the cleanup parts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:34 -07:00
Patrick McHardy	964ddaa10d	[NETFILTER]: Clean up hook registration Clean up hook registration by making use of the new mass registration and unregistration helpers. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:33 -07:00
Herbert Xu	45af08be6d	[INET]: Use port unreachable instead of proto for tunnels This patch changes GRE and SIT to generate port unreachable instead of protocol unreachable errors when we can't find a matching tunnel for a packet. This removes the ambiguity as to whether the error is caused by no tunnel being found or by the lack of support for the given tunnel type. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:29 -07:00
Herbert Xu	50fba2aa7c	[INET]: Move no-tunnel ICMP error to tunnel4/tunnel6 This patch moves the sending of ICMP messages when there are no IPv4/IPv6 tunnels present to tunnel4/tunnel6 respectively. Please note that for now if xfrm4_tunnel/xfrm6_tunnel is loaded then no ICMP messages will ever be sent. This is similar to how we handle AH/ESP/IPCOMP. This move fixes the bug where we always send an ICMP message when there is no ip6_tunnel device present for a given packet even if it is later handled by IPsec. It also causes ICMP messages to be sent when no IPIP tunnel is present. I've decided to use the "port unreachable" ICMP message over the current value of "address unreachable" (and "protocol unreachable" by GRE) because it is not ambiguous unlike the other ones which can be triggered by other conditions. There seems to be no standard specifying what value must be used so this change should be OK. In fact we should change GRE to use this value as well. Incidentally, this patch also fixes a fairly serious bug in xfrm6_tunnel where we don't check whether the embedded IPv6 header is present before dereferencing it for the inside source address. This patch is inspired by a previous patch by Hugo Santos <hsantos@av.it.pt>. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:25 -07:00
Yasuyuki Kozakai	a89ecb6a2e	[NETFILTER]: x_tables: unify IPv4/IPv6 multiport match This unifies ipt_multiport and ip6t_multiport to xt_multiport. As a result, this adds support for inversion and port range match to IPv6 packets. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 02:22:54 -08:00
Yasuyuki Kozakai	dc5ab2faec	[NETFILTER]: x_tables: unify IPv4/IPv6 esp match This unifies ipt_esp and ip6t_esp to xt_esp. Please note that now a user program needs to specify IPPROTO_ESP as protocol to use esp match with IPv6. This means that ip6tables requires '-p esp' like iptables. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 02:22:30 -08:00
Herbert Xu	dbe5b4aaaf	[IPSEC]: Kill unused decap state structure This patch removes the *_decap_state structures which were previously used to share state between input/post_input. This is no longer needed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 00:54:16 -08:00
Herbert Xu	e695633e21	[IPSEC]: Kill unused decap state argument This patch removes the decap_state argument from the xfrm input hook. Previously this function allowed the input hook to share state with the post_input hook. The latter has since been removed. The only purpose for it now is to check the encap type. However, it is easier and better to move the encap type check to the generic xfrm_rcv function. This allows us to get rid of the decap state argument altogether. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 00:52:46 -08:00
Andrew Morton	65b4b4e81a	[NETFILTER]: Rename init functions. Every netfilter module uses `init' for its module_init() function and `fini' or `cleanup' for its module_exit() function. Problem is, this creates uninformative initcall_debug output and makes ctags rather useless. So go through and rename them all to $(filename)_init and $(filename)_fini. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-28 17:02:48 -08:00
Herbert Xu	d2acc3479c	[INET]: Introduce tunnel4/tunnel6 Basically this patch moves the generic tunnel protocol stuff out of xfrm4_tunnel/xfrm6_tunnel and moves it into the new files of tunnel4.c and tunnel6 respectively. The reason for this is that the problem that Hugo uncovered is only the tip of the iceberg. The real problem is that when we removed the dependency of ipip on xfrm4_tunnel we didn't really consider the module case at all. For instance, as it is it's possible to build both ipip and xfrm4_tunnel as modules and if the latter is loaded then ipip simply won't load. After considering the alternatives I've decided that the best way out of this is to restore the dependency of ipip on the non-xfrm-specific part of xfrm4_tunnel. This is acceptable IMHO because the intention of the removal was really to be able to use ipip without the xfrm subsystem. This is still preserved by this patch. So now both ipip/xfrm4_tunnel depend on the new tunnel4.c which handles the arbitration between the two. The order of processing is determined by a simple integer which ensures that ipip gets processed before xfrm4_tunnel. The situation for ICMP handling is a little bit more complicated since we may not have enough information to determine who it's for. It's not a big deal at the moment since the xfrm ICMP handlers are basically no-ops. In future we can deal with this when we look at ICMP caching in general. The user-visible change to this is the removal of the TUNNEL Kconfig prompts. This makes sense because it can only be used through IPCOMP as it stands. The addition of the new modules shouldn't introduce any problems since module dependency will cause them to be loaded. Oh and I also turned some unnecessary pskb's in IPv6 related to this patch to skb's. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-28 17:02:46 -08:00
Linus Torvalds	fdccffc6b7	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: drop duplicate assignment in request_sock [IPSEC]: Fix tunnel error handling in ipcomp6	2006-03-27 08:47:29 -08:00
Alan Stern	e041c68341	[PATCH] Notifier chain update: API changes The kernel's implementation of notifier chains is unsafe. There is no protection against entries being added to or removed from a chain while the chain is in use. The issues were discussed in this thread: http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2 We noticed that notifier chains in the kernel fall into two basic usage classes: "Blocking" chains are always called from a process context and the callout routines are allowed to sleep; "Atomic" chains can be called from an atomic context and the callout routines are not allowed to sleep. We decided to codify this distinction and make it part of the API. Therefore this set of patches introduces three new, parallel APIs: one for blocking notifiers, one for atomic notifiers, and one for "raw" notifiers (which is really just the old API under a new name). New kinds of data structures are used for the heads of the chains, and new routines are defined for registration, unregistration, and calling a chain. The three APIs are explained in include/linux/notifier.h and their implementation is in kernel/sys.c. With atomic and blocking chains, the implementation guarantees that the chain links will not be corrupted and that chain callers will not get messed up by entries being added or removed. For raw chains the implementation provides no guarantees at all; users of this API must provide their own protections. (The idea was that situations may come up where the assumptions of the atomic and blocking APIs are not appropriate, so it should be possible for users to handle these things in their own way.) There are some limitations, which should not be too hard to live with. For atomic/blocking chains, registration and unregistration must always be done in a process context since the chain is protected by a mutex/rwsem. Also, a callout routine for a non-raw chain must not try to register or unregister entries on its own chain. 
(This did happen in a couple of places and the code had to be changed to avoid it.) Since atomic chains may be called from within an NMI handler, they cannot use spinlocks for synchronization. Instead we use RCU. The overhead falls almost entirely in the unregister routine, which is okay since unregistration is much less frequent than calling a chain. Here is the list of chains that we adjusted and their classifications. None of them use the raw API, so for the moment it is only a placeholder. ATOMIC CHAINS ------------- arch/i386/kernel/traps.c: i386die_chain arch/ia64/kernel/traps.c: ia64die_chain arch/powerpc/kernel/traps.c: powerpc_die_chain arch/sparc64/kernel/traps.c: sparc64die_chain arch/x86_64/kernel/traps.c: die_chain drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list kernel/panic.c: panic_notifier_list kernel/profile.c: task_free_notifier net/bluetooth/hci_core.c: hci_notifier net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain net/ipv6/addrconf.c: inet6addr_chain net/netfilter/nf_conntrack_core.c: nf_conntrack_chain net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain net/netlink/af_netlink.c: netlink_chain BLOCKING CHAINS --------------- arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain arch/s390/kernel/process.c: idle_chain arch/x86_64/kernel/process.c idle_notifier drivers/base/memory.c: memory_chain drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list drivers/macintosh/adb.c: adb_client_list drivers/macintosh/via-pmu.c sleep_notifier_list drivers/macintosh/via-pmu68k.c sleep_notifier_list drivers/macintosh/windfarm_core.c wf_client_list drivers/usb/core/notify.c usb_notifier_list drivers/video/fbmem.c fb_notifier_list kernel/cpu.c cpu_chain kernel/module.c module_notify_list kernel/profile.c munmap_notifier kernel/profile.c task_exit_notifier kernel/sys.c reboot_notifier_list 
net/core/dev.c netdev_chain net/decnet/dn_dev.c: dnaddr_chain net/ipv4/devinet.c: inetaddr_chain It's possible that some of these classifications are wrong. If they are, please let us know or submit a patch to fix them. Note that any chain that gets called very frequently should be atomic, because the rwsem read-locking used for blocking chains is very likely to incur cache misses on SMP systems. (However, if the chain's callout routines may sleep then the chain cannot be atomic.) The patch set was written by Alan Stern and Chandra Seetharaman, incorporating material written by Keith Owens and suggestions from Paul McKenney and Andrew Morton. [jes@sgi.com: restructure the notifier chain initialization macros] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-27 08:44:50 -08:00
Herbert Xu	6abaaaae6d	[IPSEC]: Fix tunnel error handling in ipcomp6 The error handling in ipcomp6_tunnel_create is broken in two ways: 1) If we fail to allocate an SPI (this should never happen in practice since there are plenty of 32-bit SPI values for us to use), we will still go ahead and create the SA. 2) When xfrm_init_state fails, we first of all may trigger the BUG_TRAP in __xfrm_state_destroy because we didn't set the state to DEAD. More importantly we end up returning the freed state as if we succeeded! This patch fixes them both. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-26 17:37:54 -08:00
Patrick McHardy	b30bd282cb	[IPV6]: ip6_xmit: remove unnecessary NULL ptr check The sk argument to ip6_xmit is never NULL nowadays since the skb->priority assignment expects a valid socket. Coverity #354 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-23 01:17:25 -08:00
Pablo Neira Ayuso	b9f78f9fca	[NETFILTER]: nf_conntrack: support for layer 3 protocol load on demand x_tables matches and targets that require nf_conntrack_ipv[4|6] to work don't have enough information to load these modules on demand. This patch introduces the following changes to solve this issue: o nf_ct_l3proto_try_module_get: try to load the layer 3 connection tracker module and increase the refcount. o nf_ct_l3proto_module_put: drop the refcount of the module. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-22 13:56:08 -08:00
Pablo Neira Ayuso	a45049c51c	[NETFILTER]: x_tables: set the protocol family in x_tables targets/matches Set the family field in xt_[matches|targets] registered. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-22 13:55:40 -08:00
Patrick McHardy	443da0d527	[NETFILTER]: Fix ip6tables breakage from {get,set}sockopt compat layer do_ipv6_getsockopt returns -EINVAL for unknown options, not -ENOPROTOOPT as do_ipv6_setsockopt. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-22 13:53:20 -08:00
Ingo Oeser	322f74a432	[IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2 Here are some possible (and trivial) cleanups. - use kzalloc() where possible - invert allocation failure test like if (object) { /* Rest of function here */ } to if (object == NULL) return NULL; /* Rest of function here */ Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 23:01:47 -08:00
Ingo Oeser	0c600eda4b	[IPV6]: Nearly complete kzalloc cleanup for net/ipv6 Stupidly use kzalloc() instead of kmalloc()/memset() everywhere where this is possible in net/ipv6/*.c . Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 23:01:32 -08:00
Ingo Oeser	78c784c47a	[IPV6]: Cleanup of net/ipv6/reassembly.c Two minor cleanups: 1. Using kzalloc() in frag_alloc_queue() saves the memset() in ipv6_frag_create(). 2. Invert sense of if-statements to streamline code. Inverts the comment, too. Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 23:01:17 -08:00
Arnaldo Carvalho de Melo	543d9cfeec	[NET]: Indentation & other cleanups related to compat_[gs]etsockopt cset No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs to just after the function exported, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:48:35 -08:00
Dmitry Mishin	3fdadf7d27	[NET]: {get|set}sockopt compatibility layer This patch extends {get|set}sockopt compatibility layer in order to move protocol specific parts to their place and avoid huge universal net/compat.c file in the future. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:45:21 -08:00
Dave Jones	c750360938	[IPV6]: remove useless test in ip6_append_data We've already dereferenced 'np' a dozen times at this point, so it's safe to say it's not null. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:44:52 -08:00
Ingo Molnar	57b47a53ec	[NET]: sem2mutex part 2 Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:35:41 -08:00
Arjan van de Ven	4a3e2f711a	[NET] sem2mutex: net/ Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:33:17 -08:00
Arnaldo Carvalho de Melo	c4d9390941	[ICSK]: Introduce inet_csk_ctl_sock_create Consolidating open coded sequences in tcp and dccp, v4 and v6. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:01:03 -08:00
David S. Miller	d76e60a5b5	[IPV6]: Fix some code/comment formatting in ip6_dst_output(). Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:35:50 -08:00
Jamal Hadi Salim	9500e8a81f	[IPSEC]: Sync series - fast path Fast path sequence updates that will generate ipsec async events Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:15:29 -08:00
Patrick McHardy	c4b8851392	[NETFILTER]: x_tables: replace IPv4/IPv6 policy match by address family independent version Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:03:40 -08:00
Patrick McHardy	f2ffd9eeda	[NETFILTER]: Move ip6_masked_addrcmp to include/net/ipv6.h Replace netfilter's ip6_masked_addrcmp by a more efficient version in include/net/ipv6.h to make it usable without module dependencies. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:03:16 -08:00
Patrick McHardy	c498673474	[NETFILTER]: x_tables: add xt_{match,target} arguments to match/target functions Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:02:56 -08:00
Patrick McHardy	1c524830d0	[NETFILTER]: x_tables: pass registered match/target data to match/target functions This allows to make decisions based on the revision (and address family with a follow-up patch) at runtime. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:02:15 -08:00
Patrick McHardy	7f9397138e	[NETFILTER]: Convert ip6_tables matches/targets to centralized error checking Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:01:43 -08:00
Patrick McHardy	3cdc7c953e	[NETFILTER]: Change {ip,ip6,arp}_tables to use centralized error checking Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:00:36 -08:00
Yasuyuki Kozakai	6ea46c9c12	[NETFILTER]: nf_conntrack: use ipv6_addr_equal in nf_ct_reasm Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:58:44 -08:00
Harald Welte	dc808fe28d	[NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn' This patch moves all helper related data fields of 'struct nf_conn' into a separate structure 'struct nf_conn_help'. This new structure is only present in conntrack entries for which we actually have a helper loaded. Also, this patch cleans up the nf_conntrack 'features' mechanism to resemble what the original idea was: Just glue the feature-specific data structures at the end of 'struct nf_conn', and explicitly re-calculate the pointer to it when needed rather than keeping pointers around. Saves 20 bytes per conntrack on my x86_64 box. A non-helped conntrack is 276 bytes. We still need to save another 20 bytes in order to fit into the target of 256 bytes. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:56:32 -08:00
John Heffner	5d424d5a67	[TCP]: MTU probing Implementation of packetization layer path mtu discovery for TCP, based on the internet-draft currently found at <http://www.ietf.org/internet-drafts/draft-ietf-pmtud-method-05.txt>. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:53:41 -08:00
Jesper Juhl	2b191befe2	[IPCOMP6]: don't check vfree() argument for NULL. vfree does its own NULL checking, so checking a pointer before handing it to vfree is pointless. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:46:29 -08:00
YOSHIFUJI Hideaki	e843b9e1be	[IPV6]: ROUTE: Ensure to accept redirects from nexthop for the target. It is possible to get redirects from nexthop of "more-specific" routes. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:07:49 -08:00
YOSHIFUJI Hideaki	09c884d4c3	[IPV6]: ROUTE: Add accept_ra_rt_info_max_plen sysctl. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:07:03 -08:00
YOSHIFUJI Hideaki	e317da9622	[IPV6]: ROUTE: Flag RTF_DEFAULT for Route Information for ::/0. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:06:42 -08:00
YOSHIFUJI Hideaki	70ceb4f539	[IPV6]: ROUTE: Add experimental support for Route Information Option in RA (RFC4191). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:06:24 -08:00
YOSHIFUJI Hideaki	52e1635631	[IPV6]: ROUTE: Add router_probe_interval sysctl. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:05:47 -08:00
YOSHIFUJI Hideaki	930d6ff2e2	[IPV6]: ROUTE: Add accept_ra_rtr_pref sysctl. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:05:30 -08:00
YOSHIFUJI Hideaki	270972554c	[IPV6]: ROUTE: Add Router Reachability Probing (RFC4191). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:05:13 -08:00
YOSHIFUJI Hideaki	ebacaaa0fd	[IPV6]: ROUTE: Add support for Router Preference (RFC4191). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:04:53 -08:00
YOSHIFUJI Hideaki	8238dd0698	[IPV6]: ROUTE: Handle finding the next best route in reachability in BACKTRACK(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:04:35 -08:00
YOSHIFUJI Hideaki	bb133964e0	[IPV6]: ROUTE: Try finding the next best route. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:01:43 -08:00
YOSHIFUJI Hideaki	1ddef044ed	[IPV6]: ROUTE: Clean up rt6_select() code path in ip6_route_{input,output}(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:01:24 -08:00
YOSHIFUJI Hideaki	118f8c1654	[IPV6]: ROUTE: Try selecting better route for non-default routes as well. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:01:06 -08:00
YOSHIFUJI Hideaki	045927ff84	[IPV6]: ROUTE: More strict check for default routers in rt6_get_dflt_router(). Check RTF_ADDRCONF\|RTF_DEFAULT in rt6_get_dflt_router(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:00:48 -08:00
YOSHIFUJI Hideaki	554cfb7ee5	[IPV6]: ROUTE: Eliminate lock for default route pointer. And prepare for more advanced router selection. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:00:26 -08:00
YOSHIFUJI Hideaki	519fbd8715	[IPV6]: ROUTE: Clean-up cow'ing in ip6_route_{input,output}(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:00:05 -08:00
YOSHIFUJI Hideaki	e40cf3533c	[IPV6]: ROUTE: Convert rt6_cow() to rt6_alloc_cow(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:59:27 -08:00
YOSHIFUJI Hideaki	fb9de91ea8	[IPV6]: ROUTE: Clean up reference counting / unlocking for returning object. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:59:08 -08:00
YOSHIFUJI Hideaki	d5315b500b	[IPV6]: ROUTE: Unify two code paths for pmtu disc. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:58:48 -08:00
YOSHIFUJI Hideaki	299d993908	[IPV6]: ROUTE: Add rt6_alloc_clone() for cloning route allocation. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:58:32 -08:00
YOSHIFUJI Hideaki	76f9edd17d	[IPV6]: ROUTE: Copy u.dst.error for RTF_REJECT routes when cloning. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:56:50 -08:00
YOSHIFUJI Hideaki	a1e783634a	[IPV6]: ROUTE: Set appropriate information before inserting a route. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:56:32 -08:00
YOSHIFUJI Hideaki	95a9a5ba02	[IPV6]: ROUTE: Split up rt6_cow() for future changes. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:55:51 -08:00
YOSHIFUJI Hideaki	c4fd30eb18	[IPV6]: ADDRCONF: Add accept_ra_pinfo sysctl. This controls whether we accept Prefix Information in RAs. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:55:26 -08:00
YOSHIFUJI Hideaki	65f5c7c114	[IPV6]: ROUTE: Add accept_ra_defrtr sysctl. This controls whether we accept default router information in RAs. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:55:08 -08:00
YOSHIFUJI Hideaki	073a8e0e15	[IPV6]: ADDRCONF: Split up ipv6_generate_eui64() by device type. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:54:49 -08:00
YOSHIFUJI Hideaki	955189efb4	[IPV6]: ADDRCONF: Use our standard algorithm for randomized ifid. RFC 3041 describes an algorithm to generate random interface identifier. In RFC 3041bis, it is allowed to use different algorithm than one described in RFC 3041. So, let's use our standard pseudo random algorithm to simplify our implementation. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:54:09 -08:00
YOSHIFUJI Hideaki	74a3a0ed90	[IPV6]: TUNNEL6: Don't try to add multicast route twice. Since addrconf_add_dev() has already called addrconf_add_mroute() to add a route for the multicast prefix, there's no point in calling it again in addrconf_ip6_tnl_config(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:51:48 -08:00
Herbert Xu	3759fa9c55	[TCP]: Fix zero port problem in IPv6 When we link a socket into the hash table, we need to make sure that we set the num/port fields so that it shows up with a non-zero port value in proc/netlink and on the wire. This code and comment is copied over from the IPv4 stack as is. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-03-13 14:26:12 -08:00
Patrick McHardy	baa829d892	[IPV4/6]: Fix UFO error propagation When ufo_append_data fails err is uninitialized, but returned back. Strangely gcc doesn't notice it. Coverity #901 and #902 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-12 20:39:40 -08:00
Patrick McHardy	f8dc01f543	[XFRM]: Fix leak in ah6_input tmp_hdr is not freed when ipv6_clear_mutable_options fails. Coverity #650 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-12 20:39:37 -08:00
Brian Haley	0d27b42739	[IPV6]: fix ipv6_saddr_score struct element The scope element in the ipv6_saddr_score struct used in ipv6_dev_get_saddr() is an unsigned integer, but __ipv6_addr_src_scope() returns a signed integer (and can return -1). Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-11 18:50:14 -08:00
Thomas Graf	850a9a4e3c	[NETFILTER] ip_queue: Fix wrong skb->len == nlmsg_len assumption The size of the skb carrying the netlink message is not equivalent to the length of the actual netlink message due to padding. ip_queue matches the length of the payload against the original packet size to determine if packet mangling is desired; due to the above wrong assumption, arbitrary packets may not be mangled depending on their original size. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-07 14:56:12 -08:00
Patrick McHardy	bafac2a512	[NETFILTER]: Restore {ipt,ip6t,ebt}_LOG compatibility The nfnetlink_log infrastructure changes broke compatibility of the LOG targets. They currently use whatever log backend was registered first, which means that if ipt_ULOG was loaded first, no messages will be printed to the ring buffer anymore. Restore compatibility by using the old log functions by default and only use the nf_log backend if the user explicitly said so. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-27 13:04:17 -08:00
YOSHIFUJI Hideaki	d91675f9c7	[IPV6]: Do not ignore IPV6_MTU socket option. Based on patch by Hoerdt Mickael <hoerdt@clarinet.u-strasbg.fr>. Signed-off-by: YOSHIFUJI Hideaki <yosufuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-24 13:18:33 -08:00
Hugo Santos	0c0888908d	[IPV6] ip6_tunnel: release cached dst on change of tunnel params The included patch fixes ip6_tunnel to release the cached dst entry when the tunnel parameters (such as tunnel endpoints) are changed so they are used immediately for the next encapsulated packets. Signed-off-by: Hugo Santos <hsantos@av.it.pt> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-24 13:16:25 -08:00
Al Viro	cc6cdac0cf	[PATCH] missing ntohs() in ip6_tunnel ->payload_len is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-02-18 16:02:18 -05:00
Yasuyuki Kozakai	763ecff187	[NETFILTER]: nf_conntrack: attach conntrack to locally generated ICMPv6 error Locally generated ICMPv6 errors should be associated with the conntrack of the original packet. Since the conntrack entry may not be in the hash tables (for the first packet), it must be manually attached. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-15 15:24:15 -08:00
Yasuyuki Kozakai	08857fa745	[NETFILTER]: nf_conntrack: attach conntrack to TCP RST generated by ip6t_REJECT TCP RSTs generated by the REJECT target should be associated with the conntrack of the original TCP packet. Since the conntrack entry is usually not in the hash tables, it must be manually attached. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-15 15:23:28 -08:00
Nicolas DICHTEL	6d3e85ecf2	[IPV6] Don't store dst_entry for RAW socket Signed-off-by: Nicolas DICHTEL <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-13 15:56:13 -08:00
Kristian Slavov	9908104935	[IPV6]: Address autoconfiguration does not work after device down/up cycle If you set network interface down and up again, the IPv6 address autoconfiguration does not work. 'ip addr' shows that the link-local address is in tentative state. We don't even react to periodical router advertisements. During NETDEV_DOWN we clear IF_READY, and we don't set it back in NETDEV_UP. While starting to perform DAD on the link-local address, we notice that the device is not in IF_READY, and we abort autoconfiguration process (which would eventually send router solicitations). Acked-by: Juha-Matti Tapio <jmtapio@verkkotelakka.net> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-08 16:13:28 -08:00
Al Viro	e80e28b6b6	[PATCH] net/ipv6/mcast.c NULL noise removal Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-02-07 20:58:56 -05:00
Al Viro	1b8623545b	[PATCH] remove bogus asm/bug.h includes. A bunch of asm/bug.h includes are both not needed (since it will get pulled anyway) and bogus (since they are done too early). Removed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-02-07 20:56:35 -05:00
Linus Torvalds	98bd0c07b6	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2006-02-05 11:10:29 -08:00
Eric Dumazet	88a2a4ac6b	[PATCH] percpu data: only iterate over possible CPUs percpu_data blindly allocates bootmem memory to store NR_CPUS instances of cpudata, instead of allocating memory only for possible cpus. As a preparation for changing that, we need to convert various 0 -> NR_CPUS loops to use for_each_cpu(). (The above only applies to users of asm-generic/percpu.h. powerpc has gone it alone and is presently only allocating memory for present CPUs, so it's currently corrupting memory). Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@steeleye.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <axboe@suse.de> Cc: Anton Blanchard <anton@samba.org> Acked-by: William Irwin <wli@holomorphy.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-05 11:06:51 -08:00
Patrick McHardy	0047c65a60	[NETFILTER]: Prepare {ipt,ip6t}_policy match for x_tables unification The IPv4 and IPv6 version of the policy match are identical besides address comparison and the data structure used for userspace communication. Unify the data structures to break compatibility now (before it is released), so we can port it to x_tables in 2.6.17. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-04 23:51:28 -08:00
Patrick McHardy	878c41ce57	[NETFILTER]: Fix ip6t_policy address matching Fix two bugs in ip6t_policy address matching: - misorder arguments to ip6_masked_addrcmp, mask must be the second argument - inversion incorrectly applied to the entire expression instead of just the address comparison Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-04 23:51:27 -08:00
Patrick McHardy	e55f1bc5dc	[NETFILTER]: Check policy length in policy match strict mode Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-04 23:51:26 -08:00
Kirill Korotaev	ee4bb818ae	[NETFILTER]: Fix possible overflow in netfilters do_replace() netfilter's do_replace() can overflow on addition within SMP_ALIGN() and/or on multiplication by NR_CPUS, resulting in a buffer overflow on the copy_from_user(). In practice, the overflow on addition is triggerable on all systems, whereas the multiplication one might require much physical memory to be present due to the check above. Either is sufficient to overwrite arbitrary amounts of kernel memory. I really hate adding the same check to all 4 versions of do_replace(), but the code is duplicate... Found by Solar Designer during security audit of OpenVZ.org Signed-Off-By: Kirill Korotaev <dev@openvz.org> Signed-Off-By: Solar Designer <solar@openwall.com> Signed-off-by: Patrck McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-04 23:51:25 -08:00
Herbert Xu	6f4b6ec1cf	[IPV6]: Fix illegal dst locking in softirq context. On Tue, Jan 31, 2006 at 10:24:32PM +0100, Ingo Molnar wrote: > > [<c04de9e8>] _write_lock+0x8/0x10 > [<c0499015>] inet6_destroy_sock+0x25/0x100 > [<c04b8672>] tcp_v6_destroy_sock+0x12/0x20 > [<c046bbda>] inet_csk_destroy_sock+0x4a/0x150 > [<c047625c>] tcp_rcv_state_process+0xd4c/0xdd0 > [<c047d8e9>] tcp_v4_do_rcv+0xa9/0x340 > [<c047eabb>] tcp_v4_rcv+0x8eb/0x9d0 OK this is definitely broken. We should never touch the dst lock in softirq context. Since inet6_destroy_sock may be called from that context due to the asynchronous nature of sockets, we can't take the lock there. In fact this sk_dst_reset is totally redundant since all IPv6 sockets use inet_sock_destruct as their socket destructor which always cleans up the dst anyway. So the solution is to simply remove the call. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-02 17:01:13 -08:00
Herbert Xu	4641e7a334	[IPV6]: Don't hold extra ref count in ipv6_ifa_notify Currently the logic in ipv6_ifa_notify is to hold an extra reference count for addrconf dst's that get added to the routing table. Thus, when addrconf dst entries are taken out of the routing table, we need to drop that dst. However, addrconf dst entries may be removed from the routing table by means other than __ipv6_ifa_notify. So we're faced with the choice of either fixing up all places where addrconf dst entries are removed, or dropping the extra reference count altogether. I chose the latter because the ifp itself always holds a dst reference count of 1 while it's alive. This is dropped just before we kfree the ifp object. Therefore we know that in __ipv6_ifa_notify we will always hold that count. This bug was found by Eric W. Biederman. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-02 16:55:45 -08:00
Eric W. Biederman	78b910429e	[IPV6] tcp_v6_send_synack: release the destination This patch fixes dst reference counting in tcp_v6_send_synack. Analysis: Currently tcp_v6_send_synack is never called with a dst entry so dst always comes in as NULL. ip6_dst_lookup calls ip6_route_output which calls dst_hold before it returns the dst entry. Neither xfrm_lookup nor tcp_make_synack consume the dst entry so we still have a dst_entry with a bumped reference count at the end of this function. Therefore we need to call dst_release just before we return just like tcp_v4_send_synack does. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-31 17:51:44 -08:00
David L Stevens	7add2a4398	[IPV6] MLDv2: fix change records when transitioning to/from inactive The following patch fixes these problems in MLDv2: 1) Add/remove "delete" records for sending change reports when addition of a filter results in that filter transitioning to/from inactive. [same as recent IPv4 IGMPv3 fix] 2) Remove 2 redundant "group_type" checks (can't be IPV6_ADDR_ANY within that loop, so checks are always true) 3) change an is_in() "return 0" to "return type == MLD2_MODE_IS_INCLUDE". It should always be "0" to get here, but it improves code locality to not assume it, and if some race allowed otherwise, doing the check would return the correct result. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-24 13:06:39 -08:00
Yasuyuki Kozakai	f0daaa654a	[NETFILTER] ip6tables: whitespace and indent cosmetic cleanup Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 02:39:39 -08:00
Yasuyuki Kozakai	6dd42af790	[NETFILTER] Makefile cleanup These are replaced with x_tables matches and no longer exist. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 02:38:56 -08:00
Benoit Boissinot	ccc91324a1	[NETFILTER] ip[6]t_policy: Fix compilation warnings ip[6]t_policy argument conversion slipped when merging with x_tables Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 02:26:34 -08:00
YOSHIFUJI Hideaki	9343e79a7b	[IPV6]: Preserve procfs IPV6 address output format Procfs always output IPV6 addresses without the colon characters, and we cannot change that. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 02:10:53 -08:00
Patrick McHardy	ee51b1b6ce	[XFRM]: IPsec tunnel wildcard address support When the source address of a tunnel is given as 0.0.0.0 do a routing lookup to get the real source address for the destination and fill that into the acquire message. This allows to specify policies like this: spdadd 172.16.128.13/32 172.16.0.0/20 any -P out ipsec esp/tunnel/0.0.0.0-x.x.x.x/require; spdadd 172.16.0.0/20 172.16.128.13/32 any -P in ipsec esp/tunnel/x.x.x.x-0.0.0.0/require; Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-13 14:34:36 -08:00
Joe Perches	46b86a2da0	[NET]: Use NIP6_FMT in kernel.h There are errors and inconsistency in the display of NIP6 strings. ie: net/ipv6/ip6_flowlabel.c There are errors and inconsistency in the display of NIPQUAD strings too. ie: net/netfilter/nf_conntrack_ftp.c This patch: adds NIP6_FMT to kernel.h changes all code to use NIP6_FMT fixes net/ipv6/ip6_flowlabel.c adds NIPQUAD_FMT to kernel.h fixes net/netfilter/nf_conntrack_ftp.c changes a few uses of "%u.%u.%u.%u" to NIPQUAD_FMT for symmetry to NIP6_FMT Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-13 14:29:07 -08:00
Harald Welte	2e4e6a17af	[NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables This monster-patch tries to do the best job for unifying the data structures and backend interfaces for the three evil clones ip_tables, ip6_tables and arp_tables. In an ideal world we would never have allowed this kind of copy+paste programming... but well, our world isn't (yet?) ideal. o introduce a new x_tables module o {ip,arp,ip6}_tables depend on this x_tables module o registration functions for tables, matches and targets are only wrappers around x_tables provided functions o all matches/targets that are used from ip_tables and ip6_tables are now implemented as xt_FOOBAR.c files and provide module aliases to ipt_FOOBAR and ip6t_FOOBAR o header files for xt_matches are in include/linux/netfilter/, include/linux/netfilter_{ipv4,ipv6} contains compatibility wrappers around the xt_FOOBAR.h headers Based on this patchset we're going to further unify the code, gradually getting rid of all the layer 3 specific assumptions. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-12 14:06:43 -08:00
Randy Dunlap	4fc268d24c	[PATCH] capable/capability.h (net/) net: Use <linux/capability.h> where capable() is used. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 18:42:14 -08:00
Kris Katterjohn	8b3a70058b	[NET]: Remove more unneeded typecasts on *malloc() This removes more unneeded casts on the return value for kmalloc(), sock_kmalloc(), and vmalloc(). Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-11 16:32:14 -08:00
David Woodhouse	ae0f7d5f83	[IPV6]: Avoid calling ip6_xmit() with NULL sk The ip6_xmit() function now assumes that its sk argument is non-NULL, which isn't currently true when TCPv6 code is sending RST or ACK packets. This fixes that code to use a socket of its own for sending such packets, as TCPv4 does. (Thanks Andi for the pointer). Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-11 16:32:13 -08:00
David S. Miller	82bf7e97ac	[NET]: Some more missing include/etherdevice.h includes For compare_ether_addr() Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-11 16:32:11 -08:00
David S. Miller	5bf887f2ff	[IPV6]: Fix modular build with netfilter enabled. Also, drop __exit marker from ipv6_netfilter_fini() as this can be invoked from inet6_init() error handling paths. Based upon a report from Stephen Hemminger. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-10 21:02:21 -08:00
Patrick McHardy	babbdb1a18	[NETFILTER]: Fix timeout sysctls on big-endian 64bit architectures The connection tracking timeout variables are unsigned long, but proc_dointvec_jiffies is used with sizeof(unsigned int) in the sysctl tables. Since there is no proc_doulongvec_jiffies function, change the timeout variables to unsigned int. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-10 12:54:35 -08:00
Patrick McHardy	bb94aa169e	[NETFILTER]: net/ipv[46]/netfilter.c cleanups Don't wrap entire file in #ifdef CONFIG_NETFILTER, remove a few unnecessary includes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-10 12:54:29 -08:00
Kris Katterjohn	d3f4a687f6	[NET]: Change memcmp(,,ETH_ALEN) to compare_ether_addr() This changes some memcmp(one,two,ETH_ALEN) to compare_ether_addr(one,two). Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-10 12:54:28 -08:00
Patrick McHardy	a2c2064f7f	[IPV6]: Set skb->priority in ip6_output.c Set skb->priority = sk->sk_priority as in raw.c and IPv4. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-09 14:16:31 -08:00
Patrick McHardy	2941a48631	[NET]: Convert net/{ipv4,ipv6,sched} to netdev_priv Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-09 14:16:03 -08:00
Pekka Enberg	f9f7500521	[PATCH] slab: remove unused align parameter from alloc_percpu __alloc_percpu and alloc_percpu both take an 'align' argument which is completely ignored. snmp6_mib_init() in net/ipv6/af_inet6.c attempts to use it, but it will be ignored. Therefore, remove the 'align' argument and fixup the lone caller. Signed-off-by: Matthew Dobson <colpatch@us.ibm.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:39 -08:00
Adrian Bunk	9f5336e218	[IPV6]: small cleanups This patch contains the following cleanups: - addrconf.c: make addrconf_dad_stop() static - inet6_connection_sock.c should #include <net/inet6_connection_sock.h> for getting the prototypes of its global functions Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 13:24:25 -08:00
Patrick McHardy	e16a8f0b8c	[NETFILTER]: Add ipt_policy/ip6t_policy matches Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:38 -08:00
Patrick McHardy	3e3850e989	[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder ip_route_me_harder doesn't use the port numbers of the xfrm lookup and uses ip_route_input for non-local addresses which doesn't do a xfrm lookup, ip6_route_me_harder doesn't do a xfrm lookup at all. Use xfrm_decode_session and do the lookup manually, make sure both only do the lookup if the packet hasn't been transformed already. Making sure the lookup only happens once needs a new field in the IP6CB, which exceeds the size of skb->cb. The size of skb->cb is increased to 48b. Apparently the IPv6 mobile extensions need some more room anyway. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:33 -08:00
Patrick McHardy	8cdfab8a43	[IPV4]: reset IPCB flags when necessary Reset IPSKB_XFRM_TUNNEL_SIZE flags in ipip and ip_gre hard_start_xmit function before the packet reenters IP. This is necessary so the encapsulated packets are checked not to be oversized in xfrm4_output.c again. Reset all flags in sit when a packet changes its address family. Also remove some obsolete IPSKB flags. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:32 -08:00
Patrick McHardy	b05e106698	[IPV4/6]: Netfilter IPsec input hooks When the innermost transform uses transport mode the decapsulated packet is not visible to netfilter. Pass the packet through the PRE_ROUTING and LOCAL_IN hooks again before handing it to upper layer protocols to make netfilter-visibility symmetrical to the output path. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:31 -08:00
Patrick McHardy	951dbc8ac7	[IPV6]: Move nextheader offset to the IP6CB Move nextheader offset to the IP6CB to make it possible to pass a packet to ip6_input_finish multiple times and have it skip already parsed headers. As a nice side effect this gets rid of the manual hopopts skipping in ip6_input_finish. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:29 -08:00
Patrick McHardy	16a6677fdf	[XFRM]: Netfilter IPsec output hooks Call netfilter hooks before IPsec transforms. Packets visit the FORWARD/LOCAL_OUT and POST_ROUTING hook before the first encapsulation and the LOCAL_OUT and POST_ROUTING hook before each following tunnel mode transform. Patch from Herbert Xu <herbert@gondor.apana.org.au>: Move the loop from dst_output into xfrm4_output/xfrm6_output since they're the only ones who need it. xfrm{4,6}_output_one() processes the first SA and all subsequent transport mode SAs and is called in a loop that calls the netfilter hooks between each two calls. In order to avoid the tail call issue, I've added the inline function nf_hook which is nf_hook_slow plus the empty list check. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:28 -08:00
Kris Katterjohn	46f25dffba	[NET]: Change 1500 to ETH_DATA_LEN in some files These patches add the header linux/if_ether.h and change 1500 to ETH_DATA_LEN in some files. Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 16:48:56 -08:00
Patrick McHardy	22dea562bb	[NETFILTER]: Export ip6_masked_addrcmp, don't pass IPv6 addresses on stack Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:21:34 -08:00
Patrick McHardy	b777e0ce74	[NETFILTER]: make ipv6_find_hdr() find transport protocol header The original ipv6_find_hdr() finds the specified header in IPv6 packets. This makes it possible to get transport header so that we can kill similar loop in ip6_match_packet(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:21:16 -08:00
Pablo Neira Ayuso	c1d10adb4a	[NETFILTER]: Add ctnetlink port for nf_conntrack Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:19:05 -08:00
YOSHIFUJI Hideaki	181a46a56e	[NETFILTER]: Use macro for spinlock_t/rwlock_t initializations/definition. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-04 13:56:54 -08:00
YOSHIFUJI Hideaki	196433c5b7	[IPV6]: Use macro for rwlock_t initialization. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-04 13:56:31 -08:00
Christoph Hellwig	b5e5fa5e09	[NET]: Add a dev_ioctl() fallback to sock_ioctl() Currently all network protocols need to call dev_ioctl as the default fallback in their ioctl implementations. This patch adds a fallback to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD. This way all the protocol ioctl handlers can be simplified and we don't need to export dev_ioctl. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 14:18:33 -08:00
Arnaldo Carvalho de Melo	14c850212e	[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h To help in reducing the number of include dependencies, several files were touched as they were getting needed headers indirectly for stuff they use. Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had linux/dccp.h include twice. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:21 -08:00
Eric Dumazet	90ddc4f047	[NET]: move struct proto_ops to const I noticed that some of 'struct proto_ops' used in the kernel may share a cache line used by locks or other heavily modified data. (default linker alignment is 32 bytes, and L1_CACHE_LINE is 64 or 128 at least) This patch makes sure a 'struct proto_ops' can be declared as const, so that all cpus can share all parts of it without false sharing. This is not mandatory : a driver can still use a read/write structure if it needs to (and eventually a __read_mostly) I made a global substitute to change all existing occurrences to make them const. This should reduce the possibility of false sharing on SMP, and speedup some socket system calls. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:15 -08:00
Arnaldo Carvalho de Melo	d83d8461f9	[IP_SOCKGLUE]: Remove most of the tcp specific calls As DCCP needs to be called in the same spots. Now we have a member in inet_sock (is_icsk), set at sock creation time from struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and DCCP) to see if a struct sock instance is a inet_connection_sock for places like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if sk_type was SOCK_STREAM, that is insufficient because we now use the same code for DCCP, that has sk_type SOCK_DCCP. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:58 -08:00
Arnaldo Carvalho de Melo	d8313f5ca2	[INET6]: Generalise tcp_v6_hash_connect Renaming it to inet6_hash_connect, making it possible to ditch dccp_v6_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo	6d6ee43e0b	[TWSK]: Introduce struct timewait_sock_ops So that we can share several timewait sockets related functions and make the timewait mini sockets infrastructure closer to the request mini sockets one. Next changesets will take advantage of this, moving more code out of TCP and DCCP v4 and v6 to common infrastructure. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:54 -08:00
Arnaldo Carvalho de Melo	399c07def6	[IPV6]: Export ipv6_opt_accepted It was already non-TCP specific, will be used by DCCPv6. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:51 -08:00
Arnaldo Carvalho de Melo	3cf3dc6c2e	[IPV6]: Export some symbols for DCCPv6 Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:48 -08:00
Arnaldo Carvalho de Melo	0fa1a53e1f	[IPV6]: Introduce inet6_timewait_sock Out of tcp6_timewait_sock, that now is just an aggregation of inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct inet_timewait_sock, that is common to the IPv6 transport protocols that use timewait sockets, like DCCP and TCP. tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic code to find the IPv6 area in a timewait sock. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:47 -08:00
Arnaldo Carvalho de Melo	b9750ce13c	[IPV6]: Generalise some functions Using sk->sk_protocol instead of IPPROTO_TCP. Will be used by DCCPv6 in the next changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:46 -08:00
Herbert Xu	3305b80c21	[IP]: Simplify and consolidate MSG_PEEK error handling When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled it is left on the socket receive queue. This means that when we detect a checksum error we have to be careful when trying to free the packet as someone could have dequeued it in the time being. Currently this delicate logic is duplicated three times between UDPv4, UDPv6 and RAWv6. This patch moves them into one place and simplifies the code somewhat. This is based on a suggestion by Eric Dumazet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:41 -08:00
Arnaldo Carvalho de Melo	8292a17a39	[ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops And move it to struct inet_connection_sock. DCCP will use it in the upcoming changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:38 -08:00
Arnaldo Carvalho de Melo	ca304b6104	[IPV6]: Introduce inet6_rsk() And inet6_rsk_offset in inet_request_sock, for the same reasons as inet_sock's pinfo6 member. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:37 -08:00
Arnaldo Carvalho de Melo	8129765ac0	[IPV6]: Generalise tcp_v6_search_req & tcp_v6_synq_add More work is needed tho to introduce inet6_request_sock from tcp6_request_sock, in the same layout considerations as ipv6_pinfo in inet_sock, next changeset will do that. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:36 -08:00
Arnaldo Carvalho de Melo	90b19d3169	[IPV6]: Generalise __tcp_v6_hash, renaming it to __inet6_hash Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:33 -08:00
Arnaldo Carvalho de Melo	971af18bbf	[IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:33 -08:00
Eric Dumazet	3183606469	[NETFILTER] ip_tables: NUMA-aware allocation Part of a performance problem with ip_tables is that memory allocation is not NUMA aware, but 'only' SMP aware (ie each CPU normally touch separate cache lines) Even with small iptables rules, the cost of this misplacement can be high on common workloads. Instead of using one vmalloc() area (located in the node of the iptables process), we now allocate an area for each possible CPU, using vmalloc_node() so that memory should be allocated in the CPU's node if possible. Port to arp_tables and ip6_tables by Harald Welte. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:29 -08:00
David L Stevens	5ab4a6c81e	[IPV6] mcast: Fix multiple issues in MLDv2 reports. The below "jumbo" patch fixes the following problems in MLDv2. 1) Add necessary "ntohs" to recent "pskb_may_pull" check [breaks all nonzero source queries on little-endian (!)] 2) Add locking to source filter list [resend of prior patch] 3) fix "mld_marksources()" to a) send nothing when all queried sources are excluded b) send full exclude report when source queried sources are not excluded c) don't schedule a timer when there's nothing to report NOTE: RFC 3810 specifies the source list should be saved and each source reported individually as an IS_IN. This is an obvious DOS path, requiring the host to store and then multicast as many sources as are queried (e.g., millions...). This alternative sends a full, relevant report that's limited to number of sources present on the machine. 4) fix "add_grec()" to send empty-source records when it should The original check doesn't account for a non-empty source list with all sources inactive; the new code keeps that short-circuit case, and also generates the group header with an empty list if needed. 5) fix mca_crcount decrement to be after add_grec(), which needs its original value These issues (other than item #1 ;-) ) were all found by Yan Zheng, much thanks! Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-27 14:03:00 -08:00
YOSHIFUJI Hideaki	6732badee0	[IPV6]: Fix addrconf dead lock. We need to release idev->lock before we call addrconf_dad_stop(). It calls ipv6_addr_del(), which will hold idev->lock. Bug spotted by Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-27 13:35:15 -08:00
David L Stevens	6f4353d891	[IPV6]: Increase default MLD_MAX_MSF to 64. The existing default of 10 is just way too low. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-26 17:03:46 -08:00
Hiroyuki YAMAMORI	291d809ba5	[IPV6]: Fix Temporary Address Generation From: Hiroyuki YAMAMORI <h-yamamo@db3.so-net.ne.jp> Since regen_count is stored in the public address, we need to reset it when we start renewing temporary address. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-23 11:24:05 -08:00
YOSHIFUJI Hideaki	3dd3bf8357	[IPV6]: Fix dead lock. We need to release ifp->lock before we call addrconf_dad_stop(), which will hold ifp->lock. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-23 11:23:21 -08:00
David S. Miller	e6469297d4	Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+git+ipv6-fix-20051221a	2005-12-22 07:41:27 -08:00
Kristian Slavov	1d1428045c	[IPV6]: Fix address deletion If you add more than one IPv6 address belonging to the same prefix and delete the address that was last added, routing table entry for that prefix is also deleted. Tested on 2.6.14.4 To reproduce: ip addr add 3ffe::1/64 dev eth0 ip addr add 3ffe::2/64 dev eth0 /* wait DAD */ sleep 1 ip addr del 3ffe::2/64 dev eth0 ip -6 route (route to 3ffe::/64 should be gone) In ipv6_del_addr(), if ifa == ifp, we set ifa->if_next to NULL, and later assign ifap = &ifa->if_next, effectively terminating the for-loop. This prevents us from checking if there are other addresses using the same prefix that are valid, and thus resulting in deletion of the prefix. This applies only if the first entry in idev->addr_list is the address to be deleted. Signed-off-by: Kristian Slavov <kristian.slavov@nomadiclab.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-21 18:47:24 -08:00
YOSHIFUJI Hideaki	6b3ae80a63	[IPV6]: Don't select a tentative address as a source address. A tentative address is not considered "assigned to an interface" in the traditional sense (RFC2462 Section 4). Don't try to select such an address for the source address. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:58:01 +09:00
YOSHIFUJI Hideaki	c5e33bddd3	[IPV6]: Run DAD when the link becomes ready. If the link was not available when the interface was created, run DAD for pending tentative addresses when the link becomes ready. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:57:44 +09:00
YOSHIFUJI Hideaki	3c21edbd11	[IPV6]: Defer IPv6 device initialization until the link becomes ready. NETDEV_UP might be sent even if the link attached to the interface was not ready. DAD does not make sense in such case, so we won't do so. After interface Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:57:24 +09:00
YOSHIFUJI Hideaki	8de3351e6e	[IPV6]: Try not to send icmp to anycast address. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:57:06 +09:00
YOSHIFUJI Hideaki	58c4fb86ea	[IPV6]: Flag RTF_ANYCAST for anycast routes. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:56:42 +09:00
Patrick McHardy	9e999993c7	[XFRM]: Handle DCCP in xfrm{4,6}_decode_session Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-19 14:03:46 -08:00
YOSHIFUJI Hideaki	3dd4bc68fa	[IPV6]: Fix route lifetime. The route expiration time is stored in rt6i_expires in jiffies. The argument of rt6_route_add() for adding a route is not the expiration time in jiffies nor in clock_t, but the lifetime (or time left before expiration) in clock_t. Because of the confusion, we sometimes saw several strange errors (FAILs) in TAHI IPv6 Ready Logo Phase-2 Self Test. The symptoms were analyzed by Mitsuru Chinen <CHINEN@jp.ibm.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-19 14:02:45 -08:00
Patrick McHardy	31cb5bd4dc	[NETFILTER]: Fix incorrect dependency for IP6_NF_TARGET_NFQUEUE IP6_NF_TARGET_NFQUEUE depends on IP6_NF_IPTABLES, not IP_NF_IPTABLES. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-19 13:53:26 -08:00
David S. Miller	a1493d9cd1	[IPV6] addrconf: Do not print device pointer in privacy log message. Noticed by Andi Kleen, it is pointless to emit the device structure pointer in the kernel logs like this. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-13 22:59:36 -08:00
Arnaldo Carvalho de Melo	ecc51b6d5c	[TCPv6]: Fix skb leak Spotted by Francois Romieu, thanks! Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-12 14:38:10 -08:00
Kazunori MIYAZAWA	73d4f84fd0	[IPv6] IPsec: fix pmtu calculation of esp It is a simple bug which uses the wrong member. This bug does not seriously affect ordinary use of IPsec. But it is important to pass IPv6 ready logo phase-2 conformance test of IPsec SGW. Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-08 23:11:42 -08:00
Yasuyuki Kozakai	f16c910724	[NETFILTER]: nf_conntrack: Fix missing check for ICMPv6 type This makes nf_conntrack_icmpv6 check that ICMPv6 type isn't < 128 to avoid accessing out of array valid_new[] and invmap[]. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-05 13:32:50 -08:00
YOSHIFUJI Hideaki	af1afe8662	[IPV6]: Load protocol module dynamically. [ Modified to match inet_create() bug fix by Herbert Xu -DaveM ] Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-02 20:56:57 -08:00
David Stevens	24c6927505	[IGMP]: workaround for IGMP v1/v2 bug From: David Stevens <dlstevens@us.ibm.com> As explained at: http://www.cs.ucsb.edu/~krishna/igmp_dos/ With IGMP version 1 and 2 it is possible to inject a unicast report to a client which will make it ignore multicast reports sent later by the router. The fix is to only accept the report if it was sent to a multicast or unicast address. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-02 20:32:59 -08:00
Adrian Bunk	34a0b3cdc0	[IPV6]: make two functions static This patch makes two needlessly global functions static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-29 16:28:56 -08:00
Arjan van de Ven	9b5b5cff9a	[NET]: Add const markers to various variables. the patch below marks various variables const in net/; the goal is to move them to the .rodata section so that they can't false-share cachelines with things that get written to, as well as potentially helping gcc a bit with optimisations. (these were found using a gcc patch to warn about such variables) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-29 16:21:38 -08:00
YOSHIFUJI Hideaki	220bbd7483	[IPV6]: Implement appropriate dummy rule 4 in ipv6_dev_get_saddr(). Ensure to update hiscore.rule in dummy rule 4 in ipv6_dev_get_saddr(). Pointed out by Yan Zheng <yanzheng@21cn.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-28 22:27:11 -08:00
David S. Miller	1ef43204f4	Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+advapi-fix/	2005-11-20 20:52:16 -08:00
Yan Zheng	5d5780df23	[IPV6]: Acquire addrconf_hash_lock for read in addrconf_verify(...) addrconf_verify(...) only traverses the address hash table when addrconf_hash_lock is held for writing, and it may hold addrconf_hash_lock for a long time. So I think it's better to acquire addrconf_hash_lock for reading instead of writing. Signed-off-by: Yan Zheng <yanzheng@21cn.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-20 13:42:20 -08:00
YOSHIFUJI Hideaki	df9890c31a	[IPV6]: Fix sending extension headers before and including routing header. Based on suggestion from Masahide Nakamura <nakam@linux-ipv6.org>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-11-20 12:23:18 +09:00
Ville Nuorvala	a305989386	[IPV6]: Fix calculation of AH length during filling ancillary data. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-11-20 12:21:59 +09:00
YOSHIFUJI Hideaki	8b8aa4b5a6	[IPV6]: Fix memory management error during setting up new advapi sockopts. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-11-20 12:18:17 +09:00
David S. Miller	9e147a1cfc	[IPV6]: Fib dump really needs GFP_ATOMIC. Revert: `8225ccbaf0` Based upon a report by Yan Zheng. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-17 16:52:51 -08:00
Yasuyuki Kozakai	e7c8a41e81	[IPV4,IPV6]: replace handmade list with hlist in IPv{4,6} reassembly Both of ipq and frag_queue have next and *prev, and they can be replaced with hlist. Thanks Arnaldo Carvalho de Melo for the suggestion. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-16 12:55:37 -08:00
Luiz Capitulino	cb422c464b	[IPV6]: Fixes sparse warning in ipv6/ipv6_sockglue.c The patch below fixes the following sparse warning: net/ipv6/ipv6_sockglue.c:291:13: warning: Using plain integer as NULL pointer Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 21:43:36 -08:00
Yan Zheng	12da2a435c	[IPV6]: small fix for ipv6_dev_get_saddr(...) The "score.rule++" doesn't make any sense to me. According to the code above, I think it should be "hiscore.rule++;". Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 21:42:46 -08:00
Yasuyuki Kozakai	302fe1758d	[NETFILTER] fix leak of fragment queue at unloading nf_conntrack_ipv6 This patch makes nf_conntrack_ipv6 free all IPv6 fragment queues at module unloading time. Also introduce a BUG_ON if we ever again have leaks in the memory accounting. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 15:28:45 -08:00
Yasuyuki Kozakai	1ba430bc3e	[NETFILTER] nf_conntrack: fix possibility of infinite loop while evicting nf_ct_frag6_queue This synchronizes nf_ct_reasm with ipv6 reassembly, and fixes a possibility of an infinite loop if CPUs evict and create nf_ct_frag6_queue in parallel. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 15:28:18 -08:00
Yasuyuki Kozakai	7686a02c0e	[NETFILTER]: fix type of sysctl variables in nf_conntrack_ipv6 These variables should be unsigned. This fixes sysctl handler for nf_ct_frag6_{low,high}_thresh. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 15:27:43 -08:00
Yasuyuki Kozakai	9bdf87d90b	[NETFILTER]: cleanup IPv6 Netfilter Kconfig This removes linux 2.4 configs in comments as TODO lists. And this also move the entry of nf_conntrack to top like IPv4 Netfilter Kconfig. Based on original patch by Krzysztof Piotr Oledzki <ole@ans.pl>. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 15:26:58 -08:00
Thomas Graf	8225ccbaf0	[IPV6]: Fix unnecessary GFP_ATOMIC allocation in fib6 dump Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-12 12:15:16 -08:00
Herbert Xu	efacfbcb6c	[IPV6]: Fix rtnetlink dump infinite loop The recent change to netlink dump "done" callback handling broke IPv6 which played dirty tricks with the "done" callback. This causes an infinite loop during a dump. The following patch fixes it. This bug was reported by Jeff Garzik. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-12 12:12:05 -08:00
David S. Miller	8eb5591052	[IPV6]: Fix inet6_init missing unregister. Based mostly upon a patch from Olaf Kirch <okir@suse.de> When initialization fails in inet6_init(), we should unregister the PF_INET6 socket ops. Also, check sock_register()'s return value for errors. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-11 15:05:47 -08:00
Herbert Xu	fb286bb299	[NET]: Detect hardware rx checksum faults correctly Here is the patch that introduces the generic skb_checksum_complete which also checks for hardware RX checksum faults. If that happens, it'll call netdev_rx_csum_fault which currently prints out a stack trace with the device name. In future it can turn off RX checksum. I've converted every spot under net/ that does RX checksum checks to use skb_checksum_complete or __skb_checksum_complete with the exceptions of: * Those places where checksums are done bit by bit. These will call netdev_rx_csum_fault directly. * The following have not been completely checked/converted: ipmr ip_vs netfilter dccp This patch is based on patches and suggestions from Stephen Hemminger and David S. Miller. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-10 13:01:24 -08:00
Thomas Graf	a8f74b2288	[NETLINK]: Make netlink_callback->done() optional Most netlink families make no use of the done() callback, making it optional gets rid of all unnecessary dummy implementations. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-10 02:26:40 +01:00
Yasuyuki Kozakai	9fb9cbb108	[NETFILTER]: Add nf_conntrack subsystem. The existing connection tracking subsystem in netfilter can only handle ipv4. There were basically two choices present to add connection tracking support for ipv6. We could either duplicate all of the ipv4 connection tracking code into an ipv6 counterpart, or (the choice taken by these patches) we could design a generic layer that could handle both ipv4 and ipv6 and thus requiring only one sub-protocol (TCP, UDP, etc.) connection tracking helper module to be written. In fact nf_conntrack is capable of working with any layer 3 protocol. The existing ipv4 specific conntrack code could also not deal with the peculiarities of doing connection tracking on ipv6, which is also cured here. For example, these issues include: 1) ICMPv6 handling, which is used for neighbour discovery in ipv6 thus some messages such as these should not participate in connection tracking since effectively they are like ARP messages 2) fragmentation must be handled differently in ipv6, because the simplistic "defrag, connection track and NAT, refrag" (which the existing ipv4 connection tracking does) approach simply isn't feasible in ipv6 3) ipv6 extension header parsing must occur at the correct spots before and after connection tracking decisions, and there were no provisions for this in the existing connection tracking design 4) ipv6 has no need for stateful NAT The ipv4 specific conntrack layer is kept around, until all of the ipv4 specific conntrack helpers are ported over to nf_conntrack and it is feature complete. Once that occurs, the old conntrack stuff will get placed into the feature-removal-schedule and we will fully kill it off 6 months later. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-09 16:38:16 -08:00
Ken-ichirou MATSUZAWA	9f0ede52a0	[IPV6]: ip6ip6_lock is not unlocked in error path. From: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-09 13:08:29 -08:00
Peter Chubb	44fd0261d3	[IPV6]: Fix fallout from CONFIG_IPV6_PRIVACY Trying to build today's 2.6.14+git snapshot gives undefined references to use_tempaddr Looks like an ifdef got left out. Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-09 13:05:47 -08:00
Jesper Juhl	a51482bde2	[NET]: kfree cleanup From: Jesper Juhl <jesper.juhl@gmail.com> This is the net/ part of the big kfree cleanup patch. Remove pointless checks for NULL prior to calling kfree() in net/. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br> Acked-by: Marcel Holtmann <marcel@holtmann.org> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Andrew Morton <akpm@osdl.org>	2005-11-08 09:41:34 -08:00
YOSHIFUJI Hideaki	072047e4de	[IPV6]: RFC3484 compliant source address selection Choose more appropriate source address; e.g. - outgoing interface - non-deprecated - scope - matching label Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-08 09:38:30 -08:00
YOSHIFUJI Hideaki	b1cacb6820	[IPV6]: Make ipv6_addr_type() more generic so that we can use it for source address selection. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-08 09:38:12 -08:00
YOSHIFUJI Hideaki	971f359ddc	[IPV6]: Put addr_diff() into common header for future use. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-08 09:37:56 -08:00
Stephen Hemminger	6df716340d	[TCP/DCCP]: Randomize port selection This patch randomizes the port selected on bind() for connections to help with possible security attacks. It should also be faster in most cases because there is no need for a global lock. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-05 21:23:15 -02:00
Yan Zheng	979ad66312	[IPV6]: inet6_ifinfo_notify should use RTM_DELLINK in addrconf_ifdown Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-03 01:03:05 -02:00
Yan Zheng	8713dbf057	[MCAST]: ip[6]_mc_add_src should be called when number of sources is zero And filter mode is exclude. Further explanation by David Stevens: Multicast source filters aren't widely used yet, and that's really the only feature that's affected if an application actually exercises this bug, as far as I can tell. An ordinary filter-less multicast join should still work, and only forwarded multicast traffic making use of filters and doing empty-source filters with the MSFILTER ioctl would be at risk of not getting multicast traffic forwarded to them because the reports generated would not be based on the correct counts. Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-02 21:03:57 -02:00
Yan Zheng	97300b5fdf	[MCAST] IPv6: Check packet size when process Multicast Signed-off-by: Yan Zheng <yanzheng@21cn.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-31 22:52:03 -02:00
Yan Zheng	9d17f21893	[IPV6]: Fix behavior of ip6_route_input() for link local address I find that linux will reply echo request destined to an address which belongs to an interface other than the one from which the request received. This behavior doesn't make sense for link local address. YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> said: Please note that sender does need to setup neighbor entry by hand to reproduce this bug. (Link-local address on eth1 is not visible on eth0, from the point of view of neighbor discovery in IPv6.) +--------+ +--------+ | sender | | router | +---+----+ +-+----+-+ |eth0 eth0| |eth1 -----+----------------------+- -+-------------- Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Andrew Morton <akpm@osdl.org> (forwarded) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-31 16:54:05 -02:00
Harald Welte	6b7d31fcdd	[NETFILTER]: Add "revision" support to arp_tables and ip6_tables Like ip_tables already has it for some time, this adds support for having multiple revisions for each match/target. We steal one byte from the name in order to accommodate an 8-bit version number. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-31 16:36:08 -02:00
David Hardeman	378f058cc4	[PATCH] Use sg_set_buf/sg_init_one where applicable This patch uses sg_set_buf/sg_init_one in some places where it was duplicated. Signed-off-by: David Hardeman <david@2gen.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-10-30 11:19:43 +11:00
Yan Zheng	f12baeab9d	[MCAST] IPv6: Fix algorithm to compute Querier's Query Interval 5.1.3. Maximum Response Code The Maximum Response Code field specifies the maximum time allowed before sending a responding Report. The actual time allowed, called the Maximum Response Delay, is represented in units of milliseconds, and is derived from the Maximum Response Code as follows: If Maximum Response Code < 32768, Maximum Response Delay = Maximum Response Code If Maximum Response Code >=32768, Maximum Response Code represents a floating-point value as follows: 0 1 2 3 4 5 6 7 8 9 A B C D E F +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |1| exp | mant | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Maximum Response Delay = (mant | 0x1000) << (exp+3) 5.1.9. QQIC (Querier's Query Interval Code) The Querier's Query Interval Code field specifies the [Query Interval] used by the Querier. The actual interval, called the Querier's Query Interval (QQI), is represented in units of seconds, and is derived from the Querier's Query Interval Code as follows: If QQIC < 128, QQI = QQIC If QQIC >= 128, QQIC represents a floating-point value as follows: 0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ |1| exp | mant | +-+-+-+-+-+-+-+-+ QQI = (mant | 0x10) << (exp + 3) -- rfc3810 #define MLDV2_QQIC(value) MLDV2_EXP(0x80, 4, 3, value) #define MLDV2_MRC(value) MLDV2_EXP(0x8000, 12, 3, value) The above macros are defined in mcast.c, but 1 << 4 == 0x10 and 1 << 12 == 0x1000, so the result computed by the original macros is larger. Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-28 16:35:18 -02:00
Ananda Raju	e89e9cf539	[IPv4/IPv6]: UFO Scatter-gather approach Attached is kernel patch for UDP Fragmentation Offload (UFO) feature. 1. This patch incorporates the review comments by Jeff Garzik. 2. Renamed USO as UFO (UDP Fragmentation Offload) 3. udp sendfile support with UFO This patch uses scatter-gather feature of skb to generate large UDP datagram. Below is a "how-to" on changes required in network device driver to use the UFO interface. UDP Fragmentation Offload (UFO) Interface: ------------------------------------------- UFO is a feature wherein the Linux kernel network stack will offload the IP fragmentation functionality of large UDP datagram to hardware. This will reduce the overhead of stack in fragmenting the large UDP datagram to MTU sized packets 1) Drivers indicate their capability of UFO using dev->features |= NETIF_F_UFO | NETIF_F_HW_CSUM | NETIF_F_SG NETIF_F_HW_CSUM is required for UFO over ipv6. 2) UFO packet will be submitted for transmission using driver xmit routine. UFO packet will have a non-zero value for "skb_shinfo(skb)->ufo_size" skb_shinfo(skb)->ufo_size will indicate the length of data part in each IP fragment going out of the adapter after IP fragmentation by hardware. skb->data will contain MAC/IP/UDP header and skb_shinfo(skb)->frags[] contains the data payload. The skb->ip_summed will be set to CHECKSUM_HW indicating that hardware has to do checksum calculation. Hardware should compute the UDP checksum of complete datagram and also ip header checksum of each fragmented IP packet. For IPV6 the UFO provides the fragment identification-id in skb_shinfo(skb)->ip6_frag_id. The adapter should use this ID for generating IPv6 fragments. Signed-off-by: Ananda Raju <ananda.raju@neterion.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (forwarded) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-28 16:30:00 -02:00
John Hawkes	670c02c2bf	[NET]: Wider use of for_each_*cpu() In 'net' change the explicit use of for-loops and NR_CPUS into the general for_each_cpu() or for_each_online_cpu() constructs, as appropriate. This widens the scope of potential future optimizations of the general constructs, as well as takes advantage of the existing optimizations of first_cpu() and next_cpu(), which is advantageous when the true CPU count is much smaller than NR_CPUS. Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-25 23:54:01 -02:00
Yan Zheng	4ea6a8046b	[IPV6]: Fix refcnt of struct ip6_flowlabel Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-25 21:17:52 -02:00
Andrew Morton	e6850cce8f	[NETFILTER]: Fix ip6_table.c build with NETFILTER_DEBUG enabled. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-15 16:15:38 -07:00
David S. Miller	c8923c6b85	[NETFILTER]: Fix OOPSes on machines with discontiguous cpu numbering. Original patch by Harald Welte, with feedback from Herbert Xu and testing by Sébastien Bernard. EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus are numbered linearly. That is not necessarily true. This patch fixes that up by calculating the largest possible cpu number, and allocating enough per-cpu structure space given that. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-13 14:41:23 -07:00
Herbert Xu	d4875b049b	[IPSEC] Fix block size/MTU bugs in ESP This patch fixes the following bugs in ESP: * Fix transport mode MTU overestimate. This means that the inner MTU is smaller than it needs be. Worse yet, given an input MTU which is a multiple of 4 it will always produce an estimate which is not a multiple of 4. For example, given a standard ESP/3DES/MD5 transform and an MTU of 1500, the resulting MTU for transport mode is 1462 when it should be 1464. The reason for this is because IP header lengths are always a multiple of 4 for IPv4 and 8 for IPv6. * Ensure that the block size is at least 4. This is required by RFC2406 and corresponds to what the esp_output function does. At the moment this only affects crypto_null as its block size is 1. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 21:11:34 -07:00
Herbert Xu	a02a64223e	[IPSEC]: Use ALIGN macro in ESP This patch uses the macro ALIGN in all the applicable spots for ESP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 21:11:08 -07:00
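The two ESP commits above both come down to rounding lengths up to a block boundary. A minimal sketch follows, assuming a power-of-two alignment; `ALIGN_UP`, `esp_blksize` and `esp_padded_len` are hypothetical names chosen for illustration, and rounding the cipher block size up to a multiple of 4 is an assumption about how the "at least 4" rule could be enforced, not a copy of the kernel code.

```c
#include <stddef.h>

/* Round x up to the next multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/*
 * Effective padding block size (hypothetical helper): the cipher block
 * size rounded up to a multiple of 4, so a block size of 1 (as with a
 * null cipher) becomes 4.
 */
static size_t esp_blksize(size_t cipher_blksize)
{
    return ALIGN_UP(cipher_blksize, 4);
}

/*
 * Encrypted length for a given payload: the payload plus the two ESP
 * trailer bytes (pad length, next header), padded to the block size.
 */
static size_t esp_padded_len(size_t payload_len, size_t cipher_blksize)
{
    return ALIGN_UP(payload_len + 2, esp_blksize(cipher_blksize));
}
```

With an 8-byte block cipher, a 10-byte payload grows to 16 bytes; with a block size of 1, the same payload grows to 12 bytes because the effective block size is raised to 4.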
YOSHIFUJI Hideaki	140e26fcd5	[IPV6]: Fix NS handling for proxy/anycast address Timer set up by pneigh_enqueue() ended up calling ndisc_rcv() via pndisc_redo(), which clears LOCALLY_ENQUEUED flag in NEIGH_CB(skb) and NS was queued again. Let's call ndisc_recv_ns() directly to avoid the loop. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-05 12:11:41 -07:00
Yan Zheng	fab10fe37a	[MCAST] ipv6: Fix address size in grec_size Signed-Off-By: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-05 12:08:13 -07:00
YOSHIFUJI Hideaki	87bf9c97b4	[IPV6]: Fix infinite loop in udp_v6_get_port(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-04 13:00:39 -07:00
Herbert Xu	e5ed639913	[IPV4]: Replace __in_dev_get with __in_dev_get_rcu/rtnl The following patch renames __in_dev_get() to __in_dev_get_rtnl() and introduces __in_dev_get_rcu() to cover the second case. 1) RCU with refcnt should use in_dev_get(). 2) RCU without refcnt should use __in_dev_get_rcu(). 3) All others must hold RTNL and use __in_dev_get_rtnl(). There is one exception in net/ipv4/route.c which is in fact a pre-existing race condition. I've marked it as such so that we remember to fix it. This patch is based on suggestions and prior work by Suzanne Wood and Paul McKenney. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:35:55 -07:00
David S. Miller	a5e7c210fe	[IPV6]: Fix leak added by udp connect dst caching fix. Based upon a patch from Mitsuru KANDA <mk@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:21:58 -07:00
Yan Zheng	f36d6ab182	[IPV6]: Fix ipv6 fragment ID selection at slow path Signed-Off-By: Yan Zheng <yanzheng@21cn.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:19:15 -07:00
Eric Dumazet	81c3d5470e	[INET]: speedup inet (tcp/dccp) lookups Arnaldo and I agreed it could be applied now, because I have other pending patches depending on this one (Thank you Arnaldo) (The other important patch moves skc_refcnt in a separate cache line, so that the SMP/NUMA performance doesn't suffer from cache line ping pongs) 1) First some performance data : -------------------------------- tcp_v4_rcv() wastes a lot of time in __inet_lookup_established() The most time critical code is : sk_for_each(sk, node, &head->chain) { if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif)) goto hit; /* You sunk my battleship! */ } The sk_for_each() does use prefetch() hints but only the beginning of "struct sock" is prefetched. As INET_MATCH first comparison uses inet_sk(__sk)->daddr, which is far away from the beginning of "struct sock", it has to bring a cold cache line into the CPU cache. Each iteration has to use at least 2 cache lines. This can be problematic if some chains are very long. 2) The goal ----------- The idea I had is to change things so that INET_MATCH() may return FALSE in 99% of cases only using the data already in the CPU cache, using one cache line per iteration. 3) Description of the patch --------------------------- Adds a new 'unsigned int skc_hash' field in 'struct sock_common', filling a 32 bits hole on 64 bits platform. struct sock_common { unsigned short skc_family; volatile unsigned char skc_state; unsigned char skc_reuse; int skc_bound_dev_if; struct hlist_node skc_node; struct hlist_node skc_bind_node; atomic_t skc_refcnt; + unsigned int skc_hash; struct proto *skc_prot; }; Store in this 32 bits field the full hash, not masked by (ehash_size - 1) Using this full hash as the first comparison done in INET_MATCH permits us to immediately skip the element without touching a second cache line in case of a miss.
Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to sk_hash and tw_hash) already contains the slot number if we mask with (ehash_size - 1) File include/net/inet_hashtables.h 64 bits platforms : #define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) && \ ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie)) && \ ((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports)) && \ (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif)))) 32bits platforms: #define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) && \ (inet_sk(__sk)->daddr == (__saddr)) && \ (inet_sk(__sk)->rcv_saddr == (__daddr)) && \ (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif)))) - Adds a prefetch(head->chain.first) in __inet_lookup_established()/__tcp_v4_check_established() and __inet6_lookup_established()/__tcp_v6_check_established() and __dccp_v4_check_established() to bring into cache the first element of the list, before the {read|write}_lock(&head->lock); Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:13:38 -07:00
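The idea in the commit above, comparing a stored full hash before touching any other field, can be shown with a small userspace sketch. Everything here is hypothetical (`struct conn`, `conn_match`, `conn_lookup`); it mirrors the technique, not the kernel's `struct sock` layout or `INET_MATCH` macro, and it assumes the table size is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chained hash-table entry. */
struct conn {
    struct conn *next;   /* chain link, touched on every iteration */
    uint32_t hash;       /* full, unmasked hash, kept next to the link */
    /* Fields below are assumed to live in a colder cache line. */
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Compare the cheap full hash first; a mismatch rejects without
 * reading the address/port fields at all. */
static int conn_match(const struct conn *c, uint32_t hash,
                      uint32_t saddr, uint32_t daddr,
                      uint16_t sport, uint16_t dport)
{
    return c->hash == hash &&
           c->saddr == saddr && c->daddr == daddr &&
           c->sport == sport && c->dport == dport;
}

/* The slot number is the full hash masked with (table_size - 1), so no
 * separate slot field needs to be stored in the entry. */
static struct conn *conn_lookup(struct conn *const *table,
                                uint32_t table_size, uint32_t hash,
                                uint32_t saddr, uint32_t daddr,
                                uint16_t sport, uint16_t dport)
{
    struct conn *c;

    for (c = table[hash & (table_size - 1)]; c != NULL; c = c->next)
        if (conn_match(c, hash, saddr, daddr, sport, dport))
            return c;
    return NULL;
}
```

Two entries that collide in the same slot still differ in their full hashes most of the time, so a long chain is usually walked using only the link and hash fields.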
Herbert Xu	325ed82393	[NET]: Fix packet timestamping. I've found the problem in general. It affects any 64-bit architecture. The problem occurs when you change the system time. Suppose that when you boot your system clock is forward by a day. This gets recorded down in skb_tv_base. You then wind the clock back by a day. From that point onwards the offset will be negative which essentially overflows the 32-bit variables they're stored in. In fact, why don't we just store the real time stamp in those 32-bit variables? After all, we're not going to overflow for quite a while yet. When we do overflow, we'll need a better solution of course. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 13:57:23 -07:00
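The overflow described in the commit above can be reproduced with ordinary integer arithmetic. The sketch below is an assumption-laden model, not the kernel code: the `ts_*` helpers are hypothetical, time is reduced to whole seconds, and a 64-bit signed value stands in for the wider time type of a 64-bit architecture.

```c
#include <stdint.h>

/* Offset scheme: store (now - base) in an unsigned 32-bit field. */
static uint32_t ts_store_offset(int64_t now_sec, int64_t base_sec)
{
    return (uint32_t)(now_sec - base_sec);  /* negative values wrap */
}

/* Rebuild the timestamp in 64-bit arithmetic, as a 64-bit host would. */
static int64_t ts_load_offset(uint32_t off, int64_t base_sec)
{
    return base_sec + (int64_t)off;
}

/* Alternative from the commit message: store the real time directly. */
static uint32_t ts_store_abs(int64_t now_sec)
{
    return (uint32_t)now_sec;
}

static int64_t ts_load_abs(uint32_t v)
{
    return (int64_t)v;
}
```

If the base was recorded while the clock ran a day ahead and the clock is then wound back, the offset is -86400, which wraps to 2^32 - 86400, so the rebuilt timestamp lands 2^32 seconds in the future. Storing the absolute value round-trips correctly until the 32-bit field itself overflows.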
Herbert Xu	c62dba9011	[IPV6]: Fix [Bug 5306] Oops on IPv6 route lookup > Steps to reproduce: > 1. Boot Linux, do NOT setup any IPv6 routes > 2. ip route get 2001::1 (or any unroutable address) Well caught. We never set rt6i_idev on ip6_null_entry. This patch should make the problem go away. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-26 15:10:16 -07:00
Harald Welte	d67b24c40f	[NETFILTER]: Fix ip[6]t_NFQUEUE Kconfig dependency We have to introduce a separate Kconfig menu entry for the NFQUEUE targets. They cannot "just" depend on nfnetlink_queue, since nfnetlink_queue could be linked into the kernel, whereas iptables can be a module. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-24 16:52:03 -07:00
Linus Torvalds	875bd5ab01	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2005-09-19 18:46:11 -07:00
Mark J Cox	6d1cfe3f17	[PATCH] raw_sendmsg DoS on 2.6 Fix unchecked __get_user that could be tricked into generating a memory read on an arbitrary address. The result of the read is not returned directly but you may be able to divine some information about it, or use the read to cause a crash on some architectures by reading hardware state. CAN-2004-2492. Fix from Al Viro, ack from Dave Miller. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-19 18:45:42 -07:00
Yasuyuki Kozakai	e674d0f38d	[NETFILTER] ip6tables: remove duplicate code Some IPv6 matches have very similar loops to find IPv6 extension header and we can unify them. This patch introduces ipv6_find_hdr() to do it. I just checked that it can find the target headers in the packet which has dst,hbh,rt,frag,ah,esp headers. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-19 15:34:40 -07:00
Mitsuru KANDA	987905ded3	[IPV6]: Check connect(2) status for IPv6 UDP socket (Re: xfrm_lookup) I think we should cache the per-socket route(dst_entry) only when the IPv6 UDP socket is connect(2)'ed. (which is same as IPv4 UDP send behavior) Signed-off-by: Mitsuru KANDA <mk@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-18 00:30:08 -07:00
David L Stevens	40796c5e8f	[IPV6]: Fix per-socket multicast filtering in sk_reuse case per-socket multicast filters were not being applied to all sockets in the case of an exact-match bound address, due to an over-exuberant "return" in the look-up code. Fix below. IPv4 does not have this problem. Thanks to Hoerdt Mickael for reporting the bug. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-14 21:10:20 -07:00
Denis Lukianov	de9daad90e	[MCAST]: Fix MCAST_EXCLUDE line dupes This patch fixes line dupes at /ipv4/igmp.c and /ipv6/mcast.c in the 2.6 kernel, where MCAST_EXCLUDE is mistakenly used instead of MCAST_INCLUDE. Signed-off-by: Denis Lukianov <denis@voxelsoft.com> Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-14 20:53:42 -07:00
Brian Haley	e6df439b89	[IPV6]: Bring Type 0 routing header in-line with rfc3542. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-10 00:15:06 -07:00
Ingo Molnar	8d06afab73	[PATCH] timer initialization cleanup: DEFINE_TIMER Clean up timer initialization by introducing DEFINE_TIMER a'la DEFINE_SPINLOCK. Build and boot-tested on x86. A similar patch has been in the -RT tree for some time. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 14:03:48 -07:00
Dipankar Sarma	b835996f62	[PATCH] files: lock-free fd look-up With the use of RCU in files structure, the look-up of files using fds can now be lock-free. The lookup is protected by rcu_read_lock()/rcu_read_unlock(). This patch changes the readers to use lock-free lookup. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Ravikiran Thirumalai <kiran_th@gmail.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 13:57:55 -07:00
Patrick McHardy	e104411b82	[XFRM]: Always release dst_entry on error in xfrm_lookup Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-08 15:11:55 -07:00
Patrick McHardy	a57ebc90f1	[IPV6]: Don't redo xfrm_lookup for cached dst entries The xfrm lookup is already done when the dst entry is looked up first and stored in the cache. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-08 14:27:47 -07:00
David S. Miller	2e66fc4116	Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6-git-rfc3542	2005-09-08 12:59:43 -07:00
Stephen Hemminger	42ca89c18b	[IPV6]: Need to use pskb_trim_rcsum(). Fix pskb_trim usage in ipv6. Only the udp one is really a bug, other places are just doing equivalent code. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-08 12:57:43 -07:00
YOSHIFUJI Hideaki	41a1f8ea4f	[IPV6]: Support IPV6_{RECV,}TCLASS socket options / ancillary data. Based on patch from David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-09-08 10:19:03 +09:00
YOSHIFUJI Hideaki	333fad5364	[IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542). Support several new socket options / ancillary data: IPV6_RECVPKTINFO, IPV6_PKTINFO, IPV6_RECVHOPOPTS, IPV6_HOPOPTS, IPV6_RECVDSTOPTS, IPV6_DSTOPTS, IPV6_RTHDRDSTOPTS, IPV6_RECVRTHDR, IPV6_RTHDR, IPV6_RECVHOPOPTS, IPV6_HOPOPTS Old semantics are preserved as IPV6_2292xxxx so that we can maintain backward compatibility. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-09-08 09:59:17 +09:00
YOSHIFUJI Hideaki	2dac4b96b9	[IPV6]: Repair Incoming Interface Handling for Raw Socket. Due to changes to enforce checking interface bindings, sockets did not see loopback packets bound for our local address on our interface. e.g.) When we ping6 fe80::1%eth0, skb->dev points loopback_dev while IP6CB(skb)->iif indicates eth0. This patch fixes the issue by using appropriate incoming interface, in the sense of scoping architecture. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-01 17:44:49 -07:00
Jesper Juhl	573dbd9596	[CRYPTO]: crypto_free_tfm() callers no longer need to check for NULL Since the patch to add a NULL short-circuit to crypto_free_tfm() went in, there's no longer any need for callers of that function to check for NULL. This patch removes the redundant NULL checks and also a few similar checks for NULL before calls to kfree() that I ran into while doing the crypto_free_tfm bits. I've successfully compile tested this patch, and a kernel with the patch applied boots and runs just fine. When I posted the patch to LKML (and other lists/people on Cc) it drew the following comments : J. Bruce Fields commented "I've no problem with the auth_gss or nfsv4 bits.--b." Sridhar Samudrala said "sctp change looks fine." Herbert Xu signed off on the patch. So, I guess this is ready to be dropped into -mm and eventually mainline. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-01 17:44:29 -07:00
Harald Welte	0ac4f893f2	[NETFILTER6]: Add new ip6tables HOPLIMIT target This target allows users to modify the hoplimit header field of the IPv6 header. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:13:29 -07:00
Eric Dumazet	ba89966c19	[NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointers This patch puts mostly read only data in the right section (read_mostly), to help sharing of these data between CPUS without memory ping pongs. On one of my production machines, tcp_statistics was sitting in a heavily modified cache line, so every SNMP update had to force a reload. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:11:18 -07:00
Patrick McHardy	05465343bf	[NETFILTER]: Add goto target Originally written by Henrik Nordstrom <hno@marasystems.com>, taken from netfilter patch-o-matic and added ip6_tables support. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:18 -07:00
Patrick McHardy	764d8a9f24	[NETFILTER]: Add IPv6 REJECT target Originally written by Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>, taken from netfilter patch-o-matic and fixed up to work with current kernels. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:12 -07:00
Arnaldo Carvalho de Melo	20380731bc	[NET]: Fix sparse warnings Of this type, mostly: CHECK net/ipv6/netfilter.c net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static? net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static? Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:01:32 -07:00
Patrick McHardy	066286071d	[NETLINK]: Add "groups" argument to netlink_kernel_create Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:01:11 -07:00
Patrick McHardy	ac6d439d20	[NETLINK]: Convert netlink users to use group numbers instead of bitmasks Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:00:54 -07:00
Christoph Hellwig	34b4a4a624	[NETFILTER]: Remove tasklist_lock abuse in ipt{,6}owner Rip out cmd/sid/pid matching since it's unfixably broken and stands in the way of locking changes to tasklist_lock. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:59:07 -07:00
Patrick McHardy	a61bbcf28a	[NET]: Store skb->timestamp as offset to a base timestamp Reduces skb size by 8 bytes on 64-bit. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:58:24 -07:00
Arnaldo Carvalho de Melo	5324a040cc	[INET6_HASHTABLES]: Move inet6_lookup functions to net/ipv6/inet6_hashtables.c Doing this we allow tcp_diag to support IPV6 even if tcp_diag is compiled statically and IPV6 is compiled as a module, removing the previous restriction while not building any IPV6 code if it is not selected. Now to work on the tcpdiag_register infrastructure and then to rename the whole thing to inetdiag, reflecting its by then completely generic nature. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:57:29 -07:00
Arnaldo Carvalho de Melo	505cbfc577	[IPV6]: Generalise the tcp_v6_lookup routines In the same way as was done with the v4 counterparts, this will be moved to inet6_hashtables.c. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:57:24 -07:00
Arnaldo Carvalho de Melo	6687e988d9	[ICSK]: Move TCP congestion avoidance members to icsk This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(), minimal renaming/moving done in this changeset to ease review. Most of it is just changes of struct tcp_sock * to struct sock * parameters. With this we move to a state closer to two interesting goals: 1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used for any INET transport protocol that has struct inet_hashinfo and are derived from struct inet_connection_sock. Keeps the userspace API, that will just not display DCCP sockets, while newer versions of tools can support DCCP. 2. INET generic transport pluggable Congestion Avoidance infrastructure, using the current TCP CA infrastructure with DCCP. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:56:18 -07:00
Patrick McHardy	64ce207306	[NET]: Make NETDEBUG pure printk wrappers Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:56:08 -07:00
Arnaldo Carvalho de Melo	295ff7edb8	[TIMEWAIT]: Introduce inet_timewait_death_row That groups all of the tables and variables associated to the TCP timewait scheduling/recycling/killing code, that now can be isolated from the TCP specific code and used by other transport protocols, such as DCCP. Next changeset will move this code to net/ipv4/inet_timewait_sock.c Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:55:48 -07:00
Harald Welte	bbd86b9fc4	[NETFILTER]: add /proc/net/netfilter interface to nf_queue This patch adds a /proc/net/netfilter/nf_queue file, similar to the recently-added /proc/net/netfilter/nf_log. It indicates which queue handler is registered to which protocol family. This is useful since there are now multiple queue handlers in the tree (ip[6]_queue, nfnetlink_queue). Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:51:18 -07:00
Harald Welte	210a9ebef2	[NETFILTER]: ip{6}_queue: prevent unregistration race with nfnetlink_queue Since nfnetlink_queue can override ip{6}_queue as queue handlers, we can no longer blindly unregister whoever is registered for PF_INET[6], but only unregister ourselves. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:51:08 -07:00
Arnaldo Carvalho de Melo	0a5578cf8e	[ICSK]: Generalise tcp_listen_{start,stop} This also moved inet_iif from tcp to inet_hashtables.h, as it is needed by the inet_lookup callers, perhaps this needs a bit of polishing, but for now seems fine. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:49:24 -07:00
Arnaldo Carvalho de Melo	463c84b97f	[NET]: Introduce inet_connection_sock This creates struct inet_connection_sock, moving members out of struct tcp_sock that are shareable with other INET connection oriented protocols, such as DCCP, that in my private tree already uses most of these members. The functions that operate on these members were renamed, using a inet_csk_ prefix while not being moved yet to a new file, so as to ease the review of these changes. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:43:19 -07:00
Arnaldo Carvalho de Melo	8feaf0c0a5	[INET]: Generalise tcp_tw_bucket, aka TIME_WAIT sockets This paves the way to generalise the rest of the sock ID lookup routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro kernels (where IPv6 is always built as a module): [root@qemu ~]# grep tw_sock /proc/slabinfo tw_sock_TCPv6 0 0 128 31 1 tw_sock_TCP 0 0 96 41 1 [root@qemu ~]# Now if a protocol wants to use the TIME_WAIT generic infrastructure it only has to set the sk_prot->twsk_obj_size field with the size of its inet_timewait_sock derived sock and proto_register will create sk_prot->twsk_slab, for now it's only for INET sockets, but we can introduce timewait_sock later if some non INET transport protocol wants to use this stuff. Next changesets will take advantage of this new infrastructure to generalise even more TCP code. [acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size /tmp/before.size: 188646 11764 5068 205478 322a6 net/ipv4/built-in.o /tmp/after.size: 188144 11764 5068 204976 320b0 net/ipv4/built-in.o [acme@toy net-2.6.14]$ Tested with both IPv4 & IPv6 (::1 (localhost) & ::ffff:172.20.0.1 (qemu host)). Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:42:13 -07:00
Arnaldo Carvalho de Melo	c752f0739f	[TCP]: Move the tcp sock states to net/tcp_states.h Lots of places just need the states, not even linux/tcp.h, where this enum was, needs it. This speeds up development of the refactorings as less sources are rebuilt when things get moved from net/tcp.h. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:41:54 -07:00
Arnaldo Carvalho de Melo	f3f05f7046	[INET]: Generalise the tcp_listen_ lock routines Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:41:49 -07:00
Arnaldo Carvalho de Melo	6e04e02165	[INET]: Move tcp_port_rover to inet_hashinfo Also expose all of the tcp_hashinfo members, i.e. killing those tcp_ehash, etc macros, this will more clearly expose already generic functions and some that need just a bit of work to become generic, as we'll see in the upcoming changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:41:44 -07:00
Arnaldo Carvalho de Melo	2d8c4ce519	[INET]: Generalise tcp_bind_hash & tcp_inherit_port This required moving tcp_bucket_cachep to inet_hashinfo. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:40:29 -07:00
Arnaldo Carvalho de Melo	a55ebcc4c4	[INET]: Move bind_hash from tcp_sk to inet_sk This should really be in a inet_connection_sock, but I'm leaving it for a later optimization, when some more fields common to INET transport protocols now in tcp_sk or inet_sk will be chunked out into inet_connection_sock, for now it's better to concentrate on getting the changes in the core merged to leave the DCCP tree with only DCCP specific code. Next changesets will take advantage of this move to generalise things like tcp_bind_hash, tcp_put_port, tcp_inherit_port, making the latter receive a inet_hashinfo parameter, and even __tcp_tw_hashdance, etc in the future, when tcp_tw_bucket gets transformed into the struct timewait_sock hierarchy. tcp_destroy_sock also is eligible as soon as tcp_orphan_count gets moved to sk_prot. A cascade of incremental changes will ultimately make the tcp_lookup functions be fully generic. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:38:48 -07:00
Arnaldo Carvalho de Melo	0f7ff9274e	[INET]: Just rename the TCP hashtable functions/structs to inet_ This is to break down the complexity of the series of patches, making it very clear that this one just does: 1. renames tcp_ prefixed hashtable functions and data structures that were already mostly generic to inet_ to share it with DCCP and other INET transport protocols. 2. Removes not used functions (__tb_head & tb_head) 3. Removes some leftover prototypes in the headers (tcp_bucket_unlock & tcp_v4_build_header) Next changesets will move tcp_sk(sk)->bind_hash to inet_sock so that we can make functions such as tcp_inherit_port, __tcp_inherit_port, tcp_v4_get_port, __tcp_put_port, generic and get others like tcp_destroy_sock closer to generic (tcp_orphan_count will go to sk->sk_prot to allow this). Eventually most of these functions will be used passing the transport protocol inet_hashinfo structure. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:38:32 -07:00
Harald Welte	608c8e4f7b	[NETFILTER]: Extend netfilter logging API This patch is in preparation to nfnetlink_log: - loggers now have to register struct nf_logger instead of nf_logfn - nf_log_unregister() replaced by nf_log_unregister_pf() and nf_log_unregister_logger() - add comment to ip[6]t_LOG.h to assure nobody redefines flags - add /proc/net/netfilter/nf_log to tell user which logger is currently registered for which address family - if user has configured logging, but no logging backend (logger) is available, always spit a message to syslog, not just the first time. - split ip[6]t_LOG.c into two parts: Backend: Always try to register as logger for the respective address family Frontend: Always log via nf_log_packet() API - modify all users of nf_log_packet() to accommodate additional argument Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:38:07 -07:00
Arnaldo Carvalho de Melo	32519f11d3	[INET]: Introduce inet_sk_rebuild_header From tcp_v4_rebuild_header, that already was pretty generic, I only needed to use sk->sk_protocol instead of the hardcoded IPPROTO_TCP and establish the requirement that INET transport layer protocols that want to use this function map TCP_SYN_SENT to its equivalent state. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:37:55 -07:00
Arnaldo Carvalho de Melo	e6848976b7	[NET]: Cleanup INET_REFCNT_DEBUG code Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:37:29 -07:00
Patrick McHardy	d13964f449	[IPV4/6]: Check if packet was actually delivered to a raw socket to decide whether to send an ICMP unreachable Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:37:22 -07:00
Andrew McDonald	0bd1b59b15	[IPV6]: Check interface bindings on IPv6 raw socket reception Take account of whether a socket is bound to a particular device when selecting an IPv6 raw socket to receive a packet. Also perform this check when receiving IPv6 packets with router alert options. Signed-off-by: Andrew McDonald <andrew@mcdonald.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:37:06 -07:00
Harald Welte	7af4cc3fa1	[NETFILTER]: Add "nfnetlink_queue" netfilter queue handler over nfnetlink - Add new nfnetlink_queue module - Add new ipt_NFQUEUE and ip6t_NFQUEUE modules to access queue numbers 1-65535 - Mark ip_queue and ip6_queue Kconfig options as OBSOLETE - Update feature-removal-schedule to remove ip[6]_queue in December Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:36:56 -07:00
Harald Welte	0ab43f8499	[NETFILTER]: Core changes required by upcoming nfnetlink_queue code - split netfilter verdict in 16bit verdict and 16bit queue number - add 'queuenum' argument to nf_queue_outfn_t and its users ip[6]_queue - move NFNL_SUBSYS_ definitions from enum to #define - introduce autoloading for nfnetlink subsystem modules - add MODULE_ALIAS_NFNL_SUBSYS macro - add nf_unregister_queue_handlers() to unregister all handlers for a given nf_queue_outfn_t - add more verbose DEBUGP macro definition to nfnetlink.c - make nfnetlink_subsys_register fail if subsys already exists - add some more comments and debug statements to nfnetlink.c Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:36:49 -07:00
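Splitting one 32-bit value into a 16-bit verdict and a 16-bit queue number, as the commit above describes, can be sketched as follows. The names (`VERDICT_BITS`, `verdict_pack` and so on) are hypothetical, and placing the verdict in the low half and the queue number in the high half is an assumption made for this illustration; the commit message does not state which half holds which.

```c
#include <stdint.h>

#define VERDICT_BITS 16
#define VERDICT_MASK 0x0000ffffu   /* low half: verdict code (assumed) */

/* Combine a verdict code and a queue number into one 32-bit value. */
static uint32_t verdict_pack(uint16_t verdict, uint16_t queuenum)
{
    return ((uint32_t)queuenum << VERDICT_BITS) | verdict;
}

/* Extract the verdict code from the low 16 bits. */
static uint16_t verdict_code(uint32_t v)
{
    return (uint16_t)(v & VERDICT_MASK);
}

/* Extract the queue number from the high 16 bits. */
static uint16_t verdict_queue(uint32_t v)
{
    return (uint16_t)(v >> VERDICT_BITS);
}
```

One consequence of this layout is that callers that only ever produced plain verdict codes keep working unchanged, since a value with no queue number set has zero in the upper half.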
Harald Welte	2cc7d57309	[NETFILTER]: Move reroute-after-queue code up to the nf_queue layer. The rerouting functionality is required by the core, therefore it has to be implemented by the core and not in individual queue handlers. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:36:19 -07:00
Harald Welte	4fdb3bb723	[NETLINK]: Add properly module refcounting for kernel netlink sockets. - Remove bogus code for compiling netlink as module - Add module refcounting support for modules implementing a netlink protocol - Add support for autoloading modules that implement a netlink protocol as soon as someone opens a socket for that protocol Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:35:08 -07:00
Harald Welte	020b4c12db	[NETFILTER]: Move ipv4 specific code from net/core/netfilter.c to net/ipv4/netfilter.c Netfilter cleanup - Move ipv4 code from net/core/netfilter.c to net/ipv4/netfilter.c - Move ipv6 netfilter code from net/ipv6/ip6_output.c to net/ipv6/netfilter.c Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:35:01 -07:00
Harald Welte	089af26c70	[NETFILTER]: Rename skb_ip_make_writable() to skb_make_writable() There is nothing IPv4-specific in it. In fact, it was already used by IPv6, too... Upcoming nfnetlink_queue code will use it for any kind of packet. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:34:40 -07:00
David S. Miller	f2ccd8fa06	[NET]: Kill skb->real_dev Bonding just wants the device before the skb_bond() decapsulation occurs, so simply pass that original device into packet_type->func() as an argument. It remains to be seen whether we can use this same exact thing to get rid of skb->input_dev as well. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:32:25 -07:00
Harald Welte	6869c4d8e0	[NETFILTER]: reduce netfilter sk_buff enlargement As discussed at netconf'05, we're trying to save every bit in sk_buff. The patch below makes sk_buff 8 bytes smaller. I did some basic testing on my notebook and it seems to work. The only real in-tree user of nfcache was IPVS, which only needs a single bit. Unfortunately I couldn't find another free bit in sk_buff to stuff that bit into, so I introduced a separate field for it. Maybe the IPVS guys can resolve that to further save space. Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and alike are only 6 values defined), but unfortunately the bluetooth code overloads pkt_type :( The conntrack-event-api (out-of-tree) uses nfcache, but Rusty just came up with a way to do it without any skb fields, so it's safe to remove it. - remove all never-implemented 'nfcache' code - don't have ipvs code abuse 'nfcache' field. It currently gets its own compile-conditional skb->ipvs_property field. IPVS maintainers can decide to move this bit elsewhere, but nfcache needs to die. - remove skb->nfcache field to save 4 bytes - move skb->nfctinfo into three unused bits to save further 4 bytes Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:31:04 -07:00
David S. Miller	d5d283751e	[TCP]: Document non-trivial locking path in tcp_v{4,6}_get_port(). This trips up a lot of folks reading this code. Put an unlikely() around the port-exhaustion test for good measure. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-23 10:49:54 -07:00
Patrick McHardy	66a79a19a7	[NETFILTER]: Fix HW checksum handling in ip_queue/ip6_queue The checksum needs to be filled in on output, after mangling a packet ip_summed needs to be reset. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-23 10:10:35 -07:00

711 Commits