linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-03 11:26:46 +07:00

Author	SHA1	Message	Date
Herbert Xu	aa5d62cc87	[IPSEC]: Move type and mode map into xfrm_state.c The type and mode maps are only used by SAs, not policies. So it makes sense to move them from xfrm_policy.c into xfrm_state.c. This also allows us to mark xfrm_get_type/xfrm_put_type/xfrm_get_mode/xfrm_put_mode as static. The only other change I've made in the move is to get rid of the casts on the request_module call for types. They're unnecessary because C will promote them to ints anyway. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:31:12 -07:00
Herbert Xu	440725000c	[IPSEC]: Fix length check in xfrm_parse_spi Currently xfrm_parse_spi requires there to be 16 bytes for AH and ESP. In contrived cases there may not actually be 16 bytes there since the respective header sizes are less than that (8 and 12 currently). This patch changes the test to use the actual header length instead of 16. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:30:34 -07:00
Denis V. Lunev	cd40b7d398	[NET]: make netlink user -> kernel interface synchronious This patch make processing netlink user -> kernel messages synchronious. This change was inspired by the talk with Alexey Kuznetsov about current netlink messages processing. He says that he was badly wrong when introduced asynchronious user -> kernel communication. The call netlink_unicast is the only path to send message to the kernel netlink socket. But, unfortunately, it is also used to send data to the user. Before this change the user message has been attached to the socket queue and sk->sk_data_ready was called. The process has been blocked until all pending messages were processed. The bad thing is that this processing may occur in the arbitrary process context. This patch changes nlk->data_ready callback to get 1 skb and force packet processing right in the netlink_unicast. Kernel -> user path in netlink_unicast remains untouched. EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock drop, but the process remains in the cycle until the message will be fully processed. So, there is no need to use this kludges now. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:15:29 -07:00
Herbert Xu	b7c6538cd8	[IPSEC]: Move state lock into x->type->output This patch releases the lock on the state before calling x->type->output. It also adds the lock to the spots where they're currently needed. Most of those places (all except mip6) are expected to disappear with async crypto. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:03 -07:00
Herbert Xu	050f009e16	[IPSEC]: Lock state when copying non-atomic fields to user-space This patch adds locking so that when we're copying non-atomic fields such as life-time or coaddr to user-space we don't get a partial result. For af_key I've changed every instance of pfkey_xfrm_state2msg apart from expiration notification to include the keys and life-times. This is in-line with XFRM behaviour. The actual cases affected are: * pfkey_getspi: No change as we don't have any keys to copy. * key_notify_sa: + ADD/UPD: This wouldn't work otherwise. + DEL: It can't hurt. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:02 -07:00
Herbert Xu	68325d3b12	[XFRM] user: Move attribute copying code into copy_to_user_state_extra Here's a good example of code duplication leading to code rot. The notification patch did its own netlink message creation for xfrm states. It duplicated code that was already in dump_one_state. Guess what, the next time (and the time after) when someone updated dump_one_state the notification path got zilch. This patch moves that code from dump_one_state to copy_to_user_state_extra and uses it in xfrm_notify_sa too. Unfortunately whoever updates this still needs to update xfrm_sa_len since the notification path wants to know the exact size for allocation. At least I've added a comment saying so and if someone still forgest, we'll have a WARN_ON telling us so. I also changed the security size calculation to use xfrm_user_sec_ctx since that's what we actually put into the skb. However it makes no practical difference since it has the same size as xfrm_sec_ctx. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:02 -07:00
Herbert Xu	658b219e93	[IPSEC]: Move common code into xfrm_alloc_spi This patch moves some common code that conceptually belongs to the xfrm core from af_key/xfrm_user into xfrm_alloc_spi. In particular, the spin lock on the state is now taken inside xfrm_alloc_spi. Previously it also protected the construction of the response PF_KEY/XFRM messages to user-space. This is inconsistent as other identical constructions are not protected by the state lock. This is bad because they in fact should be protected but only in certain spots (so as not to hold the lock for too long which may cause packet drops). The SPI byte order conversion has also been moved. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:01 -07:00
Herbert Xu	75ba28c633	[IPSEC]: Remove gratuitous km wake-up events on ACQUIRE There is no point in waking people up when creating/updating larval states because they'll just go back to sleep again as larval states by definition cannot be found by xfrm_state_find. We should only wake them up when the larvals mature or die. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:01 -07:00
Herbert Xu	007f0211a8	[IPSEC]: Store IPv6 nh pointer in mac_header on output Current the x->mode->output functions store the IPv6 nh pointer in the skb network header. This is inconvenient because the network header then has to be fixed up before the packet can leave the IPsec stack. The mac header field is unused on output so we can use that to store this instead. This patch does that and removes the network header fix-up in xfrm_output. It also uses ipv6_hdr where appropriate in the x->type->output functions. There is also a minor clean-up in esp4 to make it use the same code as esp6 to help any subsequent effort to merge the two. Lastly it kills two redundant skb_set_* statements in BEET that were simply copied over from transport mode. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:00 -07:00
Herbert Xu	1ecafede83	[IPSEC]: Remove bogus ref count in xfrm_secpath_reject Constructs of the form xfrm_state_hold(x); foo(x); xfrm_state_put(x); tend to be broken because foo is either synchronous where this is totally unnecessary or if foo is asynchronous then the reference count is in the wrong spot. In the case of xfrm_secpath_reject, the function is synchronous and therefore we should just kill the reference count. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:59 -07:00
Herbert Xu	45b17f48ea	[IPSEC]: Move RO-specific output code into xfrm6_mode_ro.c The lastused update check in xfrm_output can be done just as well in the mode output function which is specific to RO. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:56 -07:00
Herbert Xu	cdf7e668d4	[IPSEC]: Unexport xfrm_replay_notify Now that the only callers of xfrm_replay_notify are in xfrm, we can remove the export. This patch also removes xfrm_aevent_doreplay since it's now called in just one spot. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:55 -07:00
Herbert Xu	436a0a4022	[IPSEC]: Move output replay code into xfrm_output The replay counter is one of only two remaining things in the output code that requires a lock on the xfrm state (the other being the crypto). This patch moves it into the generic xfrm_output so we can remove the lock from the transforms themselves. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:54 -07:00
Herbert Xu	83815dea47	[IPSEC]: Move xfrm_state_check into xfrm_output.c The functions xfrm_state_check and xfrm_state_check_space are only used by the output code in xfrm_output.c so we can move them over. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:54 -07:00
Herbert Xu	406ef77c89	[IPSEC]: Move common output code to xfrm_output Most of the code in xfrm4_output_one and xfrm6_output_one are identical so this patch moves them into a common xfrm_output function which will live in net/xfrm. In fact this would seem to fix a bug as on IPv4 we never reset the network header after a transform which may upset netfilter later on. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:53 -07:00
Eric W. Biederman	2774c7aba6	[NET]: Make the loopback device per network namespace. This patch makes loopback_dev per network namespace. Adding code to create a different loopback device for each network namespace and adding the code to free a loopback device when a network namespace exits. This patch modifies all users the loopback_dev so they access it as init_net.loopback_dev, keeping all of the code compiling and working. A later pass will be needed to update the users to use something other than the initial network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:49 -07:00
Daniel Lezcano	de3cb747ff	[NET]: Dynamically allocate the loopback device, part 1. This patch replaces all occurences to the static variable loopback_dev to a pointer loopback_dev. That provides the mindless, trivial, uninteressting change part for the dynamic allocation for the loopback. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Acked-By: Kirill Korotaev <dev@sw.ru> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:14 -07:00
Herbert Xu	0cfad07555	[NETLINK]: Avoid pointer in netlink_run_queue I was looking at Patrick's fix to inet_diag and it occured to me that we're using a pointer argument to return values unnecessarily in netlink_run_queue. Changing it to return the value will allow the compiler to generate better code since the value won't have to be memory-backed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:24 -07:00
Eric W. Biederman	b4b510290b	[NET]: Support multiple network namespaces with netlink Each netlink socket will live in exactly one network namespace, this includes the controlling kernel sockets. This patch updates all of the existing netlink protocols to only support the initial network namespace. Request by clients in other namespaces will get -ECONREFUSED. As they would if the kernel did not have the support for that netlink protocol compiled in. As each netlink protocol is updated to be multiple network namespace safe it can register multiple kernel sockets to acquire a presence in the rest of the network namespaces. The implementation in af_netlink is a simple filter implementation at hash table insertion and hash table look up time. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:09 -07:00
Eric W. Biederman	e9dc865340	[NET]: Make device event notification network namespace safe Every user of the network device notifiers is either a protocol stack or a pseudo device. If a protocol stack that does not have support for multiple network namespaces receives an event for a device that is not in the initial network namespace it quite possibly can get confused and do the wrong thing. To avoid problems until all of the protocol stacks are converted this patch modifies all netdev event handlers to ignore events on devices that are not in the initial network namespace. As the rest of the code is made network namespace aware these checks can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:09 -07:00
Joy Latten	ab5f5e8b14	[XFRM]: xfrm audit calls This patch modifies the current ipsec audit layer by breaking it up into purpose driven audit calls. So far, the only audit calls made are when add/delete an SA/policy. It had been discussed to give each key manager it's own calls to do this, but I found there to be much redundnacy since they did the exact same things, except for how they got auid and sid, so I combined them. The below audit calls can be made by any key manager. Hopefully, this is ok. Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:02 -07:00
Thomas Graf	f7944fb191	[XFRM] policy: Replace magic number with XFRM_POLICY_OUT Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:34 -07:00
Thomas Graf	fd21150a0f	[XFRM] netlink: Inline attach_encap_tmpl(), attach_sec_ctx(), and attach_one_addr() These functions are only used once and are a lot easier to understand if inlined directly into the function. Fixes by Masahide NAKAMURA. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:26 -07:00
Thomas Graf	15901a2746	[XFRM] netlink: Remove dependency on rtnetlink Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:25 -07:00
Thomas Graf	5424f32e48	[XFRM] netlink: Use nlattr instead of rtattr Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:25 -07:00
Thomas Graf	35a7aa08bf	[XFRM] netlink: Rename attribute array from xfrma[] to attrs[] Increases readability a lot. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:24 -07:00
Thomas Graf	fab448991d	[XFRM] netlink: Enhance indexing of the attribute array nlmsg_parse() puts attributes at array[type] so the indexing method can be simpilfied by removing the obscuring "- 1". Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:23 -07:00
Thomas Graf	cf5cb79f69	[XFRM] netlink: Establish an attribute policy Adds a policy defining the minimal payload lengths for all the attributes allowing for most attribute validation checks to be removed from in the middle of the code path. Makes updates more consistent as many format errors are recognised earlier, before any changes have been attempted. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:23 -07:00
Thomas Graf	a7bd9a45c8	[XFRM] netlink: Use nlmsg_parse() to parse attributes Uses nlmsg_parse() to parse the attributes. This actually changes behaviour as unknown attributes (type > MAXTYPE) no longer cause an error. Instead unknown attributes will be ignored henceforth to keep older kernels compatible with more recent userspace tools. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:22 -07:00
Thomas Graf	7deb226490	[XFRM] netlink: Use nlmsg_new() and type-safe size calculation helpers Moves all complex message size calculation into own inlined helper functions and makes use of the type-safe netlink interface. Using nlmsg_new() simplifies the calculation itself as it takes care of the netlink header length by itself. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:22 -07:00
Thomas Graf	cfbfd45a8c	[XFRM] netlink: Clear up some of the CONFIG_XFRM_SUB_POLICY ifdef mess Moves all of the SUB_POLICY ifdefs related to the attribute size calculation into a function. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:21 -07:00
Thomas Graf	c26445acbc	[XFRM] netlink: Move algorithm length calculation to its own function Adds alg_len() to calculate the properly padded length of an algorithm attribute to simplify the code. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:21 -07:00
Thomas Graf	c0144beaec	[XFRM] netlink: Use nla_put()/NLA_PUT() variantes Also makes use of copy_sec_ctx() in another place and removes duplicated code. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:20 -07:00
Thomas Graf	082a1ad573	[XFRM] netlink: Use nlmsg_broadcast() and nlmsg_unicast() This simplifies successful return codes from >0 to 0. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:20 -07:00
Thomas Graf	7b67c8575f	[XFRM] netlink: Use nlmsg_data() instead of NLMSG_DATA() Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:19 -07:00
Thomas Graf	9825069d09	[XFRM] netlink: Use nlmsg_end() and nlmsg_cancel() Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:18 -07:00
Thomas Graf	79b8b7f4ab	[XFRM] netlink: Use nlmsg_put() instead of NLMSG_PUT() Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:18 -07:00
Jesper Juhl	b5890d8ba4	[XFRM]: Clean up duplicate includes in net/xfrm/ This patch cleans up duplicate includes in net/xfrm/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-08-13 22:52:08 -07:00
Paul Moore	e6e0871cce	Net/Security: fix memory leaks from security_secid_to_secctx() The security_secid_to_secctx() function returns memory that must be freed by a call to security_release_secctx() which was not always happening. This patch fixes two of these problems (all that I could find in the kernel source at present). Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2007-08-02 11:52:26 -04:00
Joakim Koskela	48b8d78315	[XFRM]: State selection update to use inner addresses. This patch modifies the xfrm state selection logic to use the inner addresses where the outer have been (incorrectly) used. This is required for beet mode in general and interfamily setups in both tunnel and beet mode. Signed-off-by: Joakim Koskela <jookos@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Diego Beltrami <diego.beltrami@gmail.com> Signed-off-by: Miika Komu <miika@iki.fi> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-31 02:28:33 -07:00
Herbert Xu	196b003620	[IPSEC]: Ensure that state inner family is set Similar to the issue we had with template families which specified the inner families of policies, we need to set the inner families of states as the main xfrm user Openswan leaves it as zero. af_key is unaffected because the inner family is set by it and not the KM. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-31 02:28:32 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
YOSHIFUJI Hideaki	7dc12d6dd6	[NET] XFRM: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-07-19 10:45:15 +09:00
Patrick McHardy	bd0bf0765e	[XFRM]: Fix crash introduced by struct dst_entry reordering XFRM expects xfrm_dst->u.next to be same pointer as dst->next, which was broken by the dst_entry reordering in commit 1e19e02c~, causing an oops in xfrm_bundle_ok when walking the bundle upwards. Kill xfrm_dst->u.next and change the only user to use dst->next instead. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-18 01:55:52 -07:00
Jamal Hadi Salim	628529b6ee	[XFRM] Introduce standalone SAD lookup This allows other in-kernel functions to do SAD lookups. The only known user at the moment is pktgen. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:35 -07:00
Patrick McHardy	281216177a	[XFRM]: Fix MTU calculation for non-ESP SAs My IPsec MTU optimization patch introduced a regression in MTU calculation for non-ESP SAs, the SA's header_len needs to be subtracted from the MTU if the transform doesn't provide a ->get_mtu() function. Reported-and-tested-by: Marco Berizzi <pupilla@hotmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-18 22:30:15 -07:00
Joy Latten	4aa2e62c45	xfrm: Add security check before flushing SAD/SPD Currently we check for permission before deleting entries from SAD and SPD, (see security_xfrm_policy_delete() security_xfrm_state_delete()) However we are not checking for authorization when flushing the SPD and the SAD completely. It was perhaps missed in the original security hooks patch. This patch adds a security check when flushing entries from the SAD and SPD. It runs the entire database and checks each entry for a denial. If the process attempting the flush is unable to remove all of the entries a denial is logged the the flush function returns an error without removing anything. This is particularly useful when a process may need to create or delete its own xfrm entries used for things like labeled networking but that same process should not be able to delete other entries or flush the entire database. Signed-off-by: Joy Latten<latten@austin.ibm.com> Signed-off-by: Eric Paris <eparis@parisplace.org> Signed-off-by: James Morris <jmorris@namei.org>	2007-06-07 13:42:46 -07:00
David S. Miller	aad0e0b9b6	[XFRM]: xfrm_larval_drop sysctl should be __read_mostly. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:24 -07:00
David S. Miller	01e67d08fa	[XFRM]: Allow XFRM_ACQ_EXPIRES to be tunable via sysctl. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:23 -07:00
David S. Miller	14e50e57ae	[XFRM]: Allow packet drops during larval state resolution. The current IPSEC rule resolution behavior we have does not work for a lot of people, even though technically it's an improvement from the -EAGAIN buisness we had before. Right now we'll block until the key manager resolves the route. That works for simple cases, but many folks would rather packets get silently dropped until the key manager resolves the IPSEC rules. We can't tell these folks to "set the socket non-blocking" because they don't have control over the non-block setting of things like the sockets used to resolve DNS deep inside of the resolver libraries in libc. With that in mind I coded up the patch below with some help from Herbert Xu which provides packet-drop behavior during larval state resolution, controllable via sysctl and off by default. This lays the framework to either: 1) Make this default at some point or... 2) Move this logic into xfrm{4,6}_policy.c and implement the ARP-like resolution queue we've all been dreaming of. The idea would be to queue packets to the policy, then once the larval state is resolved by the key manager we re-resolve the route and push the packets out. The packets would timeout if the rule didn't get resolved in a certain amount of time. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-24 18:17:54 -07:00
Herbert Xu	26b8e51e98	[IPSEC]: Fix warnings with casting int to pointer This patch adds some casts to shut up the warnings introduced by my last patch that added a common interator function for xfrm algorightms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-22 16:12:26 -07:00
Herbert Xu	c92b3a2f1f	[IPSEC] pfkey: Load specific algorithm in pfkey_add rather than all This is a natural extension of the changeset [XFRM]: Probe selected algorithm only. which only removed the probe call for xfrm_user. This patch does exactly the same thing for af_key. In other words, we load the algorithm requested by the user rather than everything when adding xfrm states in af_key. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-19 14:21:18 -07:00
Herbert Xu	6253db055e	[IPSEC]: Don't warn if high-order hash resize fails Multi-page allocations are always likely to fail. Since such failures are expected and non-critical in xfrm_hash_alloc, we shouldn't warn about them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-14 02:19:11 -07:00
Herbert Xu	b5505c6e10	[IPSEC]: Check validity of direction in xfrm_policy_byid The function xfrm_policy_byid takes a dir argument but finds the policy using the index instead. We only use the dir argument to update the policy count for that direction. Since the user can supply any value for dir, this can corrupt our policy count. I know this is the problem because a few days ago I was deleting policies by hand using indicies and accidentally typed in the wrong direction. It still deleted the policy and at the time I thought that was cool. In retrospect it isn't such a good idea :) I decided against letting it delete the policy anyway just in case we ever remove the connection between indicies and direction. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-14 02:15:47 -07:00
Jamal Hadi Salim	5a6d34162f	[XFRM] SPD info TLV aggregation Aggregate the SPD info TLVs. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-04 12:55:39 -07:00
Jamal Hadi Salim	af11e31609	[XFRM] SAD info TLV aggregationx Aggregate the SAD info TLVs. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-04 12:55:13 -07:00
Masahide NAKAMURA	157bfc2502	[XFRM]: Restrict upper layer information by bundle. On MIPv6 usage, XFRM sub policy is enabled. When main (IPsec) and sub (MIPv6) policy selectors have the same address set but different upper layer information (i.e. protocol number and its ports or type/code), multiple bundle should be created. However, currently we have issue to use the same bundle created for the first time with all flows covered by the case. It is useful for the bundle to have the upper layer information to be restructured correctly if it does not match with the flow. 1. Bundle was created by two policies Selector from another policy is added to xfrm_dst. If the flow does not match the selector, it goes to slow path to restructure new bundle by single policy. 2. Bundle was created by one policy Flow cache is added to xfrm_dst as originated one. If the flow does not match the cache, it goes to slow path to try searching another policy. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-30 00:58:09 -07:00
Jamal Hadi Salim	ecfd6b1837	[XFRM]: Export SPD info With this patch you can use iproute2 in user space to efficiently see how many policies exist in different directions. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-28 21:20:32 -07:00
David S. Miller	1a028e5072	[NET]: Revert sk_buff walker cleanups. This reverts `eefa390628` The simplification made in that change works with the assumption that the 'offset' parameter to these functions is always positive or zero, which is not true. It can be and often is negative in order to access SKB header values in front of skb->data. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-27 15:21:23 -07:00
Jamal Hadi Salim	566ec03448	[XFRM]: Missing bits to SAD info. This brings the SAD info in sync with net-2.6.22/net-2.6 Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-26 14:12:15 -07:00
Jean Delvare	eefa390628	[NET]: Clean up sk_buff walkers. I noticed recently that, in skb_checksum(), "offset" and "start" are essentially the same thing and have the same value throughout the function, despite being computed differently. Using a single variable allows some cleanups and makes the skb_checksum() function smaller, more readable, and presumably marginally faster. We appear to have many other "sk_buff walker" functions built on the exact same model, so the cleanup applies to them, too. Here is a list of the functions I found to be affected: net/appletalk/ddp.c:atalk_sum_skb() net/core/datagram.c:skb_copy_datagram_iovec() net/core/datagram.c:skb_copy_and_csum_datagram() net/core/skbuff.c:skb_copy_bits() net/core/skbuff.c:skb_store_bits() net/core/skbuff.c:skb_checksum() net/core/skbuff.c:skb_copy_and_csum_bit() net/core/user_dma.c:dma_skb_copy_datagram_iovec() net/xfrm/xfrm_algo.c:skb_icv_walk() net/xfrm/xfrm_algo.c:skb_to_sgvec() OTOH, I admit I'm a bit surprised, the cleanup is rather obvious so I'm really wondering if I am missing something. Can anyone please comment on this? Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-26 00:44:22 -07:00
Jamal Hadi Salim	28d8909bc7	[XFRM]: Export SAD info. On a system with a lot of SAs, counting SAD entries chews useful CPU time since you need to dump the whole SAD to user space; i.e something like ip xfrm state ls \| grep -i src \| wc -l I have seen taking literally minutes on a 40K SAs when the system is swapping. With this patch, some of the SAD info (that was already being tracked) is exposed to user space. i.e you do: ip xfrm state count And you get the count; you can also pass -s to the command line and get the hash info. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-26 00:10:29 -07:00
Stephen Hemminger	3ff50b7997	[NET]: cleanup extra semicolons Spring cleaning time... There seems to be a lot of places in the network code that have extra bogus semicolons after conditionals. Most commonly is a bogus semicolon after: switch() { } Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:24 -07:00
Patrick McHardy	af65bdfce9	[NETLINK]: Switch cb_lock spinlock to mutex and allow to override it Switch cb_lock to mutex and allow netlink kernel users to override it with a subsystem specific mutex for consistent locking in dump callbacks. All netlink_dump_start users have been audited not to rely on any side-effects of the previously used spinlock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:03 -07:00
Patrick McHardy	c5c2523893	[XFRM]: Optimize MTU calculation Replace the probing based MTU estimation, which usually takes 2-3 iterations to find a fitting value and may underestimate the MTU, by an exact calculation. Also fix underestimation of the XFRM trailer_len, which causes unnecessary reallocations. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:38 -07:00
David Howells	716ea3a7aa	[NET]: Move generic skbuff stuff from XFRM code to generic code Move generic skbuff stuff from XFRM code to generic code so that AF_RXRPC can use it too. The kdoc comments I've attached to the functions needs to be checked by whoever wrote them as I had to make some guesses about the workings of these functions. Signed-off-By: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:33 -07:00
Thomas Graf	c702e8047f	[NETLINK]: Directly return -EINTR from netlink_dump_start() Now that all users of netlink_dump_start() use netlink_run_queue() to process the receive queue, it is possible to return -EINTR from netlink_dump_start() directly, therefore simplying the callers. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:33 -07:00
Thomas Graf	1d00a4eb42	[NETLINK]: Remove error pointer from netlink message handler The error pointer argument in netlink message handlers is used to signal the special case where processing has to be interrupted because a dump was started but no error happened. Instead it is simpler and more clear to return -EINTR and have netlink_run_queue() deal with getting the queue right. nfnetlink passed on this error pointer to its subsystem handlers but only uses it to signal the start of a netlink dump. Therefore it can be removed there as well. This patch also cleans up the error handling in the affected message handlers to be consistent since it had to be touched anyway. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:30 -07:00
Thomas Graf	45e7ae7f71	[NETLINK]: Ignore control messages directly in netlink_run_queue() Changes netlink_rcv_skb() to skip netlink controll messages and don't pass them on to the message handler. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:29 -07:00
Thomas Graf	d35b685640	[NETLINK]: Ignore !NLM_F_REQUEST messages directly in netlink_run_queue() netlink_rcv_skb() is changed to skip messages which don't have the NLM_F_REQUEST bit to avoid every netlink family having to perform this check on their own. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:29 -07:00
Arnaldo Carvalho de Melo	dc5fc579b9	[NETLINK]: Use nlmsg_trim() where appropriate Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:37 -07:00
Arnaldo Carvalho de Melo	27a884dc3c	[SK_BUFF]: Convert skb->tail to sk_buff_data_t So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes on 64bit architectures, allowing us to combine the 4 bytes hole left by the layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4 64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN... :-) Many calculations that previously required that skb->{transport,network, mac}_header be first converted to a pointer now can be done directly, being meaningful as offsets or pointers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:28 -07:00
Arnaldo Carvalho de Melo	9c70220b73	[SK_BUFF]: Introduce skb_transport_header(skb) For the places where we need a pointer to the transport header, it is still legal to touch skb->h.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:31 -07:00
James Morris	9d729f72dc	[NET]: Convert xtime.tv_sec to get_seconds() Where appropriate, convert references to xtime.tv_sec to the get_seconds() helper function. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:32 -07:00
Joy Latten	661697f728	[IPSEC] XFRM_USER: kernel panic when large security contexts in ACQUIRE When sending a security context of 50+ characters in an ACQUIRE message, following kernel panic occurred. kernel BUG in xfrm_send_acquire at net/xfrm/xfrm_user.c:1781! cpu 0x3: Vector: 700 (Program Check) at [c0000000421bb2e0] pc: c00000000033b074: .xfrm_send_acquire+0x240/0x2c8 lr: c00000000033b014: .xfrm_send_acquire+0x1e0/0x2c8 sp: c0000000421bb560 msr: 8000000000029032 current = 0xc00000000fce8f00 paca = 0xc000000000464b00 pid = 2303, comm = ping kernel BUG in xfrm_send_acquire at net/xfrm/xfrm_user.c:1781! enter ? for help 3:mon> t [c0000000421bb650] c00000000033538c .km_query+0x6c/0xec [c0000000421bb6f0] c000000000337374 .xfrm_state_find+0x7f4/0xb88 [c0000000421bb7f0] c000000000332350 .xfrm_tmpl_resolve+0xc4/0x21c [c0000000421bb8d0] c0000000003326e8 .xfrm_lookup+0x1a0/0x5b0 [c0000000421bba00] c0000000002e6ea0 .ip_route_output_flow+0x88/0xb4 [c0000000421bbaa0] c0000000003106d8 .ip4_datagram_connect+0x218/0x374 [c0000000421bbbd0] c00000000031bc00 .inet_dgram_connect+0xac/0xd4 [c0000000421bbc60] c0000000002b11ac .sys_connect+0xd8/0x120 [c0000000421bbd90] c0000000002d38d0 .compat_sys_socketcall+0xdc/0x214 [c0000000421bbe30] c00000000000869c syscall_exit+0x0/0x40 --- Exception: c00 (System Call) at 0000000007f0ca9c SP (fc0ef8f0) is in userspace We are using size of security context from xfrm_policy to determine how much space to alloc skb and then putting security context from xfrm_state into skb. Should have been using size of security context from xfrm_state to alloc skb. Following fix does that Signed-off-by: Joy Latten <latten@austin.ibm.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-13 16:14:35 -07:00
Herbert Xu	4c4d51a731	[IPSEC]: Reject packets within replay window but outside the bit mask Up until this point we've accepted replay window settings greater than 32 but our bit mask can only accomodate 32 packets. Thus any packet with a sequence number within the window but outside the bit mask would be accepted. This patch causes those packets to be rejected instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-05 00:07:39 -07:00
Dave Jones	b6f99a2119	[NET]: fix up misplaced inlines. Turning up the warnings on gcc makes it emit warnings about the placement of 'inline' in function declarations. Here's everything that was under net/ Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-22 12:27:49 -07:00
Joy Latten	961995582e	[XFRM]: ipsecv6 needs a space when printing audit record. This patch adds a space between printing of the src and dst ipv6 addresses. Otherwise, audit or other test tools may fail to process the audit record properly because they cannot find the dst address. Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-20 00:09:47 -07:00
Joy Latten	75e252d981	[XFRM]: Fix missing protocol comparison of larval SAs. I noticed that in xfrm_state_add we look for the larval SA in a few places without checking for protocol match. So when using both AH and ESP, whichever one gets added first, deletes the larval SA. It seems AH always gets added first and ESP is always the larval SA's protocol since the xfrm->tmpl has it first. Thus causing the additional km_query() Adding the check eliminates accidental double SA creation. Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-12 17:14:07 -07:00
Eric Paris	16bec31db7	[IPSEC]: xfrm audit hook misplaced in pfkey_delete and xfrm_del_sa Inside pfkey_delete and xfrm_del_sa the audit hooks were not called if there was any permission/security failures in attempting to do the del operation (such as permission denied from security_xfrm_state_delete). This patch moves the audit hook to the exit path such that all failures (and successes) will actually get audited. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Venkat Yekkirala <vyekkirala@trustedcs.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-07 16:08:11 -08:00
Eric Paris	ef41aaa0b7	[IPSEC]: xfrm_policy delete security check misplaced The security hooks to check permissions to remove an xfrm_policy were actually done after the policy was removed. Since the unlinking and deletion are done in xfrm_policy_by* functions this moves the hooks inside those 2 functions. There we have all the information needed to do the security check and it can be done before the deletion. Since auditing requires the result of that security check err has to be passed back and forth from the xfrm_policy_by* functions. This patch also fixes a bug where a deletion that failed the security check could cause improper accounting on the xfrm_policy (xfrm_get_policy didn't have a put on the exit path for the hold taken by xfrm_policy_by) It also fixes the return code when no policy is found in xfrm_add_pol_expire. In old code (at least back in the 2.6.18 days) err wasn't used before the return when no policy is found and so the initialization would cause err to be ENOENT. But since err has since been used above when we don't get a policy back from the xfrm_policy_by function we would always return 0 instead of the intended ENOENT. Also fixed some white space damage in the same area. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Venkat Yekkirala <vyekkirala@trustedcs.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-07 16:08:09 -08:00
Patrick McHardy	b08d5840d2	[NET]: Fix kfree(skb) Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-28 09:42:14 -08:00
David S. Miller	3a765aa528	[XFRM] xfrm_user: Fix return values of xfrm_add_sa_expire. As noted by Kent Yoder, this function will always return an error. Make sure it returns zero on success. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-28 09:41:57 -08:00
Kazunori MIYAZAWA	928ba4169d	[IPSEC]: Fix the address family to refer encap_family Fix the address family to refer encap_family when comparing with a kernel generated xfrm_state Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-13 12:57:16 -08:00
David S. Miller	13fcfbb067	[XFRM]: Fix OOPSes in xfrm_audit_log(). Make sure that this function is called correctly, and add BUG() checking to ensure the arguments are sane. Based upon a patch by Joy Latten. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 13:53:54 -08:00
YOSHIFUJI Hideaki	a716c1197d	[NET] XFRM: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:24 -08:00
David S. Miller	9783e1df7a	Merge branch 'HEAD' of master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6 Conflicts: crypto/Kconfig	2007-02-08 15:25:18 -08:00
David S. Miller	e610e679dd	[XFRM]: xfrm_migrate() needs exporting to modules. Needed by xfrm_user and af_key. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 13:29:15 -08:00
Shinta Sugimoto	f6ed0ec0ee	[PFKEYV2]: CONFIG_NET_KEY_MIGRATE option Add CONFIG_NET_KEY_MIGRATE option which makes it possible for user application to send or receive MIGRATE message to/from PF_KEY socket. Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 13:15:05 -08:00
Shinta Sugimoto	d0473655c8	[XFRM]: CONFIG_XFRM_MIGRATE option Add CONFIG_XFRM_MIGRATE option which makes it possible for for user application to send or receive MIGRATE message to/from netlink socket. Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 13:13:07 -08:00
Shinta Sugimoto	5c79de6e79	[XFRM]: User interface for handling XFRM_MSG_MIGRATE Add user interface for handling XFRM_MSG_MIGRATE. The message is issued by user application. When kernel receives the message, procedure of updating XFRM databases will take place. Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 13:12:32 -08:00
Shinta Sugimoto	80c9abaabf	[XFRM]: Extension for dynamic update of endpoint address(es) Extend the XFRM framework so that endpoint address(es) in the XFRM databases could be dynamically updated according to a request (MIGRATE message) from user application. Target XFRM policy is first identified by the selector in the MIGRATE message. Next, the endpoint addresses of the matching templates and XFRM states are updated according to the MIGRATE message. Signed-off-by: Shinta Sugimoto <shinta.sugimoto@ericsson.com> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 13:11:42 -08:00
Miika Komu	cdca72652a	[IPSEC]: exporting xfrm_state_afinfo This patch exports xfrm_state_afinfo. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:00 -08:00
Noriaki TAKAMIYA	6a0dc8d733	[IPSEC]: added the entry of Camellia cipher algorithm to ealg_list[] This patch adds the entry of Camellia cipher algorithm to ealg_list[]. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-02-07 09:21:05 +11:00
Herbert Xu	a6c7ab55dd	[IPSEC]: Policy list disorder The recent hashing introduced an off-by-one bug in policy list insertion. Instead of adding after the last entry with a lesser or equal priority, we're adding after the successor of that entry. This patch fixes this and also adds a warning if we detect a duplicate entry in the policy list. This should never happen due to this if clause. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-23 20:25:51 -08:00
Christoph Hellwig	22e7005023	[XFRM_USER]: avoid pointless void casts All ->doit handlers want a struct rtattr , so pass down the right type. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-03 18:38:13 -08:00
Martin Willi	b836267aa7	[XFRM]: Algorithm lookup using .compat name Installing an IPsec SA using old algorithm names (.compat) does not work if the algorithm is not already loaded. When not using the PF_KEY interface, algorithms are not preloaded in xfrm_probe_algs() and installing a IPsec SA fails. Signed-off-by: Martin Willi <martin@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-31 14:06:51 -08:00
Linus Torvalds	2685b267bc	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (48 commits) [NETFILTER]: Fix non-ANSI func. decl. [TG3]: Identify Serdes devices more clearly. [TG3]: Use msleep. [TG3]: Use netif_msg_*. [TG3]: Allow partial speed advertisement. [TG3]: Add TG3_FLG2_IS_NIC flag. [TG3]: Add 5787F device ID. [TG3]: Fix Phy loopback. [WANROUTER]: Kill kmalloc debugging code. [TCP] inet_twdr_hangman: Delete unnecessary memory barrier(). [NET]: Memory barrier cleanups [IPSEC]: Fix inetpeer leak in ipv4 xfrm dst entries. audit: disable ipsec auditing when CONFIG_AUDITSYSCALL=n audit: Add auditing to ipsec [IRDA] irlan: Fix compile warning when CONFIG_PROC_FS=n [IrDA]: Incorrect TTP header reservation [IrDA]: PXA FIR code device model conversion [GENETLINK]: Fix misplaced command flags. [NETLIK]: Add a pointer to the Generic Netlink wiki page. [IPV6] RAW: Don't release unlocked sock. ...	2006-12-07 09:05:15 -08:00
Christoph Lameter	e18b890bb0	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name ".c" -o -name ".h"\|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:25 -08:00
Christoph Lameter	54e6ecb239	[PATCH] slab: remove SLAB_ATOMIC SLAB_ATOMIC is an alias of GFP_ATOMIC Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:24 -08:00
Joy Latten	c9204d9ca7	audit: disable ipsec auditing when CONFIG_AUDITSYSCALL=n Disables auditing in ipsec when CONFIG_AUDITSYSCALL is disabled in the kernel. Also includes a bug fix for xfrm_state.c as a result of original ipsec audit patch. Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-06 20:14:23 -08:00
Joy Latten	161a09e737	audit: Add auditing to ipsec An audit message occurs when an ipsec SA or ipsec policy is created/deleted. Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-06 20:14:22 -08:00
Kazunori MIYAZAWA	7cf4c1a5fd	[IPSEC]: Add support for AES-XCBC-MAC The glue of xfrm. Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-12-06 18:38:51 -08:00
Jamal Hadi Salim	94b9bb5480	[XFRM] Optimize SA dumping Same comments as in "[XFRM] Optimize policy dumping" The numbers are (20K SAs):	2006-12-06 18:38:45 -08:00
Jamal Hadi Salim	baf5d743d1	[XFRM] Optimize policy dumping This change optimizes the dumping of Security policies. 1) Before this change .. speedopolis:~# time ./ip xf pol real 0m22.274s user 0m0.000s sys 0m22.269s 2) Turn off sub-policies speedopolis:~# ./ip xf pol real 0m13.496s user 0m0.000s sys 0m13.493s i suppose the above is to be expected 3) With this change .. speedopolis:~# time ./ip x policy real 0m7.901s user 0m0.008s sys 0m7.896s	2006-12-06 18:38:44 -08:00
David Howells	9db7372445	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/ata/libata-scsi.c include/linux/libata.h Futher merge of Linus's head and compilation fixups. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-12-05 17:01:28 +00:00
David Howells	4c1ac1b491	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/infiniband/core/iwcm.c drivers/net/chelsio/cxgb2.c drivers/net/wireless/bcm43xx/bcm43xx_main.c drivers/net/wireless/prism54/islpci_eth.c drivers/usb/core/hub.h drivers/usb/input/hid-core.c net/core/netpoll.c Fix up merge failures with Linus's head and fix new compilation failures. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-12-05 14:37:56 +00:00
David S. Miller	b4ad86bf52	[XFRM] xfrm_user: Better validation of user templates. Since we never checked the ->family value of templates before, many applications simply leave it at zero. Detect this and fix it up to be the pol->family value. Also, do not clobber xp->family while reading in templates, that is not necessary. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-03 19:19:26 -08:00
Jamal Hadi Salim	2b5f6dcce5	[XFRM]: Fix aevent structuring to be more complete. aevents can not uniquely identify an SA. We break the ABI with this patch, but consensus is that since it is not yet utilized by any (known) application then it is fine (better do it now than later). Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 22:22:25 -08:00
Miika Komu	8511d01d7c	[IPSEC]: Add netlink interface for the encapsulation family. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:31:49 -08:00
Miika Komu	76b3f055f3	[IPSEC]: Add encapsulation family. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:31:48 -08:00
Jamal Hadi Salim	b798a9ede2	[XFRM]: Convert a few __u8 to proper u8 Caught by the EyeBalls(tm) of Thomas Graf Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:50 -08:00
Jamal Hadi Salim	0c51f53c57	[XFRM]: Make flush notifier prettier when subpolicy used Might as well make flush notifier prettier when subpolicy used Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:49 -08:00
Thomas Graf	4e9b826935	[NETLINK]: Remove unused dst_pid field in netlink_skb_parms The destination PID is passed directly to netlink_unicast() respectively netlink_multicast(). Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:43 -08:00
Arnaldo Carvalho de Melo	cdbc6dae5c	[XFRM]: Use kmemdup where appropriate Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:30:22 -08:00
Jamal Hadi Salim	1459bb36b1	[XFRM]: Make copy_to_user_policy_type take a type Make copy_to_user_policy_type take a type instead a policy and fix its users to pass the type Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:26:14 -08:00
Andrew Morton	776810217a	[XFRM]: uninline xfrm_selector_match() Six callsites, huge. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:36 -08:00
Venkat Yekkirala	67f83cbf08	SELinux: Fix SA selection semantics Fix the selection of an SA for an outgoing packet to be at the same context as the originating socket/flow. This eliminates the SELinux policy's ability to use/sendto SAs with contexts other than the socket's. With this patch applied, the SELinux policy will require one or more of the following for a socket to be able to communicate with/without SAs: 1. To enable a socket to communicate without using labeled-IPSec SAs: allow socket_t unlabeled_t:association { sendto recvfrom } 2. To enable a socket to communicate with labeled-IPSec SAs: allow socket_t self:association { sendto }; allow socket_t peer_sa_t:association { recvfrom }; Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-12-02 21:21:34 -08:00
Al Viro	5d36b1803d	[XFRM]: annotate ->new_mapping() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:18 -08:00
Masahide NAKAMURA	9abbffee86	[XFRM] STATE: Fix to respond error to get operation if no matching entry exists. When application uses XFRM_MSG_GETSA to get state entry through netlink socket and kernel has no matching one, the application expects reply message with error status by kernel. Kernel doesn't send the message back in the case of Mobile IPv6 route optimization protocols (i.e. routing header or destination options header). This is caused by incorrect return code "0" from net/xfrm/xfrm_user.c(xfrm_user_state_lookup) and it makes kernel skip to acknowledge at net/netlink/af_netlink.c(netlink_rcv_skb). This patch fix to reply ESRCH to application. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: TAKAMIYA Noriaki <takamiya@po.ntts.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-25 15:16:52 -08:00
David Howells	c4028958b6	WorkStruct: make allyesconfig Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-11-22 14:57:56 +00:00
Jamal Hadi Salim	785fd8b8a5	[XFRM]: nlmsg length not computed correctly in the presence of subpolicies I actually dont have a test case for these; i just found them by inspection. Refer to patch "[XFRM]: Sub-policies broke policy events" for more info Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-21 16:16:35 -08:00
Jamal Hadi Salim	334f3d45d3	[XFRM]: Sub-policies broke policy events XFRM policy events are broken when sub-policy feature is turned on. A simple test to verify this: run ip xfrm mon on one window and add then delete a policy on another window .. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-21 16:16:34 -08:00
David S. Miller	54489c14c0	[XFRM] xfrm_user: Fix unaligned accesses. Use memcpy() to move xfrm_address_t objects in and out of netlink messages. The vast majority of xfrm_user was doing this properly, except for copy_from_user_state() and copy_to_user_state(). Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-30 15:24:35 -08:00
Patrick McHardy	2fab22f2d3	[XFRM]: Fix xfrm_state accounting xfrm_state_num needs to be increased for XFRM_STATE_ACQ states created by xfrm_state_find() to prevent the counter from going negative when the state is destroyed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-24 15:34:00 -07:00
David S. Miller	918049f013	[XFRM]: Fix xfrm_state_num going negative. Missing counter bump when hashing in a new ACQ xfrm_state. Now that we have two spots to do the hash grow check, break it out into a helper function. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:18 -07:00
Venkat Yekkirala	3bccfbc7a7	IPsec: fix handling of errors for socket policies This treats the security errors encountered in the case of socket policy matching, the same as how these are treated in the case of main/sub policies, which is to return a full lookup failure. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-10-11 23:59:39 -07:00
Venkat Yekkirala	5b368e61c2	IPsec: correct semantics for SELinux policy matching Currently when an IPSec policy rule doesn't specify a security context, it is assumed to be "unlabeled" by SELinux, and so the IPSec policy rule fails to match to a flow that it would otherwise match to, unless one has explicitly added an SELinux policy rule allowing the flow to "polmatch" to the "unlabeled" IPSec policy rules. In the absence of such an explicitly added SELinux policy rule, the IPSec policy rule fails to match and so the packet(s) flow in clear text without the otherwise applicable xfrm(s) applied. The above SELinux behavior violates the SELinux security notion of "deny by default" which should actually translate to "encrypt by default" in the above case. This was first reported by Evgeniy Polyakov and the way James Morris was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. With this patch applied, SELinux "polmatching" of flows Vs. IPSec policy rules will only come into play when there's a explicit context specified for the IPSec policy rule (which also means there's corresponding SELinux policy allowing appropriate domains/flows to polmatch to this context). Secondly, when a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return errors other than access denied, such as -EINVAL. We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The solution for this is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). This patch: Fix the selinux side of things. This makes sure SELinux polmatching of flow contexts to IPSec policy rules comes into play only when an explicit context is associated with the IPSec policy rule. Also, this no longer defaults the context of a socket policy to the context of the socket since the "no explicit context" case is now handled properly. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-10-11 23:59:37 -07:00
James Morris	134b0fc544	IPsec: propagate security module errors up from flow_cache_lookup When a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return an access denied permission (or other error). We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The way I was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. The first SYNACK would be blocked, because of an uncached lookup via flow_cache_lookup(), which would fail to resolve an xfrm policy because the SELinux policy is checked at that point via the resolver. However, retransmitted SYNACKs would then find a cached flow entry when calling into flow_cache_lookup() with a null xfrm policy, which is interpreted by xfrm_lookup() as the packet not having any associated policy and similarly to the first case, allowing it to pass without transformation. The solution presented here is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). Signed-off-by: James Morris <jmorris@namei.org>	2006-10-11 23:59:34 -07:00
Diego Beltrami	0a69452cb4	[XFRM]: BEET mode This patch introduces the BEET mode (Bound End-to-End Tunnel) with as specified by the ietf draft at the following link: http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-06.txt The patch provides only single family support (i.e. inner family = outer family). Signed-off-by: Diego Beltrami <diego.beltrami@gmail.com> Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Abhinav Pathak <abhinav.pathak@hiit.fi> Signed-off-by: Jeff Ahrenholz <ahrenholz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-04 00:31:09 -07:00
David S. Miller	ae8c05779a	[XFRM]: Clearing xfrm_policy_count[] to zero during flush is incorrect. When we flush policies, we do a type match so we might not actually delete all policies matching a certain direction. So keep track of how many policies we actually kill and subtract that number from xfrm_policy_count[dir] at the end. Based upon a patch by Masahide NAKAMURA. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-04 00:31:02 -07:00
Masahide NAKAMURA	667bbcb6c0	[XFRM] STATE: Use destination address for src hash. Src hash is introduced for Mobile IPv6 route optimization usage. On current kenrel code it is calculated with source address only. It results we uses the same hash value for outbound state (when the node has only one address for Mobile IPv6). This patch use also destination address as peer information for src hash to be dispersed. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-04 00:31:02 -07:00
Masahide NAKAMURA	7b4dc3600e	[XFRM]: Do not add a state whose SPI is zero to the SPI hash. SPI=0 is used for acquired IPsec SA and MIPv6 RO state. Such state should not be added to the SPI hash because we do not care about it on deleting path. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-28 18:02:49 -07:00
Al Viro	8122adf06e	[XFRM]: xfrm_spi_hash() annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:44 -07:00
Al Viro	61f4627b2f	[XFRM]: xfrm_replay_advance() annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:41 -07:00
Al Viro	a252cc2371	[XFRM]: xrfm_replay_check() annotations seq argument is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:40 -07:00
Al Viro	6067b2baba	[XFRM]: xfrm_parse_spi() annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:39 -07:00
Al Viro	a94cfd1974	[XFRM]: xfrm_state_lookup() annotations spi argument of xfrm_state_lookup() is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:37 -07:00
Al Viro	26977b4ed7	[XFRM]: xfrm_alloc_spi() annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:36 -07:00
Patrick McHardy	a1e59abf82	[XFRM]: Fix wildcard as tunnel source Hashing SAs by source address breaks templates with wildcards as tunnel source since the source address used for hashing/lookup is still 0/0. Move source address lookup to xfrm_tmpl_resolve_one() so we can use the real address in the lookup. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:06 -07:00
James Morris	d1d9facfd1	[XFRM]: remove xerr_idxp from __xfrm_policy_check() It seems that during the MIPv6 respin, some code which was originally conditionally compiled around CONFIG_XFRM_ADVANCED was accidently left in after the config option was removed. This patch removes an extraneous pointer (xerr_idxp) which is no longer needed. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:49 -07:00
Masahide NAKAMURA	a9917c0665	[XFRM] STATE: Fix flusing with hash mask. This is a minor fix about transformation state flushing for net-2.6.19. Please apply it. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:45 -07:00
Alexey Dobriyan	e5d679f339	[NET]: Use SLAB_PANIC Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:19 -07:00
David S. Miller	acba48e1a3	[XFRM]: Respect priority in policy lookups. Even if we find an exact match in the hash table, we must inspect the inexact list to look for a match with a better priority. Noticed by Masahide NAKAMURA <nakam@linux-ipv6.org>. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:05 -07:00
David S. Miller	44e36b42a8	[XFRM]: Extract common hashing code into xfrm_hash.[ch] Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:49 -07:00
David S. Miller	2518c7c2b3	[XFRM]: Hash policies when non-prefixed. This idea is from Alexey Kuznetsov. It is common for policies to be non-prefixed. And for that case we can optimize lookups, insert, etc. quite a bit. For each direction, we have a dynamically sized policy hash table for non-prefixed policies. We also have a hash table on policy->index. For prefixed policies, we have a list per-direction which we will consult on lookups when a non-prefix hashtable lookup fails. This still isn't as efficient as I would like it. There are four immediate problems: 1) Lots of excessive refcounting, which can be fixed just like xfrm_state was 2) We do 2 hash probes on insert, one to look for dups and one to allocate a unique policy->index. Althought I wonder how much this matters since xfrm_state inserts do up to 3 hash probes and that seems to perform fine. 3) xfrm_policy_insert() is very complex because of the priority ordering and entry replacement logic. 4) Lots of counter bumping, in addition to policy refcounts, in the form of xfrm_policy_count[]. This is merely used to let code path(s) know that some IPSEC rules exist. So this count is indexed per-direction, maybe that is overkill. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:48 -07:00
David S. Miller	c1969f294e	[XFRM]: Hash xfrm_state objects by source address too. The source address is always non-prefixed so we should use it to help give entropy to the bydst hash. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:47 -07:00
David S. Miller	a47f0ce05a	[XFRM]: Kill excessive refcounting of xfrm_state objects. The refcounting done for timers and hash table insertions are just wasted cycles. We can eliminate all of this refcounting because: 1) The implicit refcount when the xfrm_state object is active will always be held while the object is in the hash tables. We never kfree() the xfrm_state until long after we've made sure that it has been unhashed. 2) Timers are even easier. Once we mark that x->km.state as anything other than XFRM_STATE_VALID (__xfrm_state_delete sets it to XFRM_STATE_DEAD), any timer that fires will do nothing and return without rearming the timer. Therefore we can defer the del_timer calls until when the object is about to be freed up during GC. We have to use del_timer_sync() and defer it to GC because we can't do a del_timer_sync() while holding x->lock which all callers of __xfrm_state_delete hold. This makes SA changes even more light-weight. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:47 -07:00
David S. Miller	1c09539975	[XFRM]: Purge dst references to deleted SAs passively. Just let GC and other normal mechanisms take care of getting rid of DST cache references to deleted xfrm_state objects instead of walking all the policy bundles. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:46 -07:00
David S. Miller	c7f5ea3a4d	[XFRM]: Do not flush all bundles on SA insert. Instead, simply set all potentially aliasing existing xfrm_state objects to have the current generation counter value. This will make routes get relooked up the next time an existing route mentioning these aliased xfrm_state objects gets used, via xfrm_dst_check(). Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:45 -07:00

1 2 3 4 5 ...

360 Commits