1. After sock_register() returns, it's possible to create sockets,
even if module still not initialized fully (blame generic module code
for that!)
2. Consequently, pfkey_create() can be called with pfkey_net_id still not
initialized which will BUG_ON in net_generic():
kernel BUG at include/net/netns/generic.h:43!
3. During netns shutdown, netns ops should be unregistered after
key manager unregistered because key manager calls can be triggered
from xfrm_user module:
general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
pfkey_broadcast+0x111/0x210 [af_key]
pfkey_send_notify+0x16a/0x300 [af_key]
km_state_notify+0x41/0x70
xfrm_flush_sa+0x75/0x90 [xfrm_user]
4. Unregister netns ops after socket ops just in case and for symmetry.
Reported by Luca Tettamanti.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Tested-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use atomic_inc_return() in get_acqseq() to avoid taking a spinlock
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__net_init/__net_exit are apparently not going away, so use them
to full extent.
In some cases __net_init was removed, because it was called from
__net_exit code.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 4447bb33f0 ("xfrm: Store aalg in
xfrm_state with a user specified truncation length") breaks
installation of authentication algorithms via PF_KEY, as the state
specific truncation length is not installed with the algorithms
default truncation length. This patch initializes state properly to
the default if installed via PF_KEY.
Signed-off-by: Martin Willi <martin@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Take advantage of the new pernet automatic storage management,
and stop using compatibility network namespace functions.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The generic __sock_create function has a kern argument which allows the
security system to make decisions based on if a socket is being created by
the kernel or by userspace. This patch passes that flag to the
net_proto_family specific create function, so it can do the same thing.
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a new socket level option to report number of queue overflows
Recently I augmented the AF_PACKET protocol to report the number of frames lost
on the socket receive queue between any two enqueued frames. This value was
exported via a SOL_PACKET level cmsg. AFter I completed that work it was
requested that this feature be generalized so that any datagram oriented socket
could make use of this option. As such I've created this patch, It creates a
new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a
SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue
overflowed between any two given frames. It also augments the AF_PACKET
protocol to take advantage of this new feature (as it previously did not touch
sk->sk_drops, which this patch uses to record the overflow count). Tested
successfully by me.
Notes:
1) Unlike my previous patch, this patch simply records the sk_drops value, which
is not a number of drops between packets, but rather a total number of drops.
Deltas must be computed in user space.
2) While this patch currently works with datagram oriented protocols, it will
also be accepted by non-datagram oriented protocols. I'm not sure if thats
agreeable to everyone, but my argument in favor of doing so is that, for those
protocols which aren't applicable to this option, sk_drops will always be zero,
and reporting no drops on a receive queue that isn't used for those
non-participating protocols seems reasonable to me. This also saves us having
to code in a per-protocol opt in mechanism.
3) This applies cleanly to net-next assuming that commit
977750076d (my af packet cmsg patch) is reverted
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All usages of structure net_proto_ops should be declared const.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All instances of file_operations should be const.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 2b85a34e91
(net: No more expensive sock_hold()/sock_put() on each tx)
changed initial sk_wmem_alloc value.
We need to take into account this offset when reporting
sk_wmem_alloc to user, in PROC_FS files or various
ioctls (SIOCOUTQ/TIOCOUTQ)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove some pointless conditionals before kfree_skb().
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently encap_oa is left uninitialized, so it contains garbage data which
is visible to userland via Netlink. Initialize it by zeroing it out.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* interaction with userspace -- take netns from userspace socket.
* in ->notify hook take netns either from SA or explicitly passed --
we don't know if SA/SPD flush is coming.
* stub policy migration with init_net for now.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* netns boilerplate
* keep per-netns socket list
* keep per-netns number of sockets
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SA and SPD flush are executed with NULL SA and SPD respectively, for
these cases pass netns explicitly from userspace socket.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add netns parameter to xfrm_policy_bysel_ctx(), xfrm_policy_byidx().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Again, to avoid complications with passing netns when not necessary.
Again, ->xp_net is set-once field, once set it never changes.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disallow spurious wakeups in __xfrm_lookup().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid unnecessary complications with passing netns around.
* set once, very early after allocating
* once set, never changes
For a while create every xfrm_state in init_net.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xfrm_policy_destroy() will oops if not dead policy is passed to it.
On error path in pfkey_compile_policy() exactly this happens.
Oopsable for CAP_NET_ADMIN owners.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When deleting an SPD entry using SADB_X_SPDDELETE, c.data.byid is not
initialized to zero in pfkey_spddelete(). Thus, key_notify_policy()
responds with a PF_KEY message of type SADB_X_SPDDELETE2 instead of
SADB_X_SPDDELETE.
Signed-off-by: Tobias Brunner <tobias.brunner@strongswan.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provides implementation of the enhancements of XFRM/PF_KEY MIGRATE mechanism
specified in draft-ebalard-mext-pfkey-enhanced-migrate-00. Defines associated
PF_KEY SADB_X_EXT_KMADDRESS extension and XFRM/netlink XFRMA_KMADDRESS
attribute.
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu came up with the idea and the original patch to make
xfrm_state dump list contain also dumpers:
As it is we go to extraordinary lengths to ensure that states
don't go away while dumpers go to sleep. It's much easier if
we just put the dumpers themselves on the list since they can't
go away while they're going.
I've also changed the order of addition on new states to prevent
a never-ending dump.
Timo Teräs improved the patch to apply cleanly to latest tree,
modified iteration code to be more readable by using a common
struct for entries in the list, implemented the same idea for
xfrm_policy dumping and moved the af_key specific "last" entry
caching to af_key.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a xfrm_{state,policy}_walk leak if pfkey socket is closed while
dumping is on-going.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removes legacy reinvent-the-wheel type thing. The generic
machinery integrates much better to automated debugging aids
such as kerneloops.org (and others), and is unambiguous due to
better naming. Non-intuively BUG_TRAP() is actually equal to
WARN_ON() rather than BUG_ON() though some might actually be
promoted to BUG_ON() but I left that to future.
I could make at least one BUILD_BUG_ON conversion.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
When pfkey has no km listeners, it still does a lot of work
before finding out there aint nobody out there.
If a tree falls in a forest and no one is around to hear it, does it make
a sound? In this case it makes a lot of noise:
With this short-circuit adding 10s of thousands of SAs using
netlink improves performance by ~10%.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
This propagates the xfrm_user fix made in commit
bcf0dda8d2 ("[XFRM]: xfrm_user: fix
selector family initialization")
Based upon a bug report from, and tested by, Alan Swanson.
Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously I added sessionid output to all audit messages where it was
available but we still didn't know the sessionid of the sender of
netlink messages. This patch adds that information to netlink messages
so we can audit who sent netlink messages.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
net/key/af_key.c: In function ‘pfkey_spddelete’:
net/key/af_key.c:2359: warning: ‘pol_ctx’ may be used uninitialized in
this function
When CONFIG_SECURITY_NETWORK_XFRM isn't set,
security_xfrm_policy_alloc() is an inline that doesn't set pol_ctx, so
this seemed like the easiest fix short of using *uninitialized_var(pol_ctx).
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it stands it's impossible to use any authentication algorithms
with an ID above 31 portably. It just happens to work on x86 but
fails miserably on ppc64.
The reason is that we're using a bit mask to check the algorithm
ID but the mask is only 32 bits wide.
After looking at how this is used in the field, I have concluded
that in the long term we should phase out state matching by IDs
because this is made superfluous by the reqid feature. For current
applications, the best solution IMHO is to allow all algorithms when
the bit masks are all ~0.
The following patch does exactly that.
This bug was identified by IBM when testing on the ppc64 platform
using the NULL authentication algorithm which has an ID of 251.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The xfrm_get_policy() and xfrm_add_pol_expire() put some rather large structs
on the stack to work around the LSM API. This patch attempts to fix that
problem by changing the LSM API to require only the relevant "security"
pointers instead of the entire SPD entry; we do this for all of the
security_xfrm_policy*() functions to keep things consistent.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stop dumping of entries when af_key socket receive queue is getting
full and continue it later when there is more room again.
This fixes dumping of large databases. Currently the entries not
fitting into the receive queue are just dropped (including the
end-of-dump message) which can confuse applications.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>