Commit Graph

2570 Commits

Author SHA1 Message Date
KOVACS Krisztian
a3116ac5c2 tcp: Port redirection support for TCP
Current TCP code relies on the local port of the listening socket
being the same as the destination address of the incoming
connection. Port redirection used by many transparent proxying
techniques obviously breaks this, so we have to store the original
destination port address.

This patch extends struct inet_request_sock and stores the incoming
destination port value there. It also modifies the handshake code to
use that value as the source port when sending reply packets.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:46:49 -07:00
KOVACS Krisztian
86b08d867d ipv4: Make Netfilter's ip_route_me_harder() non-local address compatible
Netfilter's ip_route_me_harder() tries to re-route packets either
generated or re-routed by Netfilter. This patch changes
ip_route_me_harder() to handle packets from non-locally-bound sockets
with IP_TRANSPARENT set as local and to set the appropriate flowi
flags when re-doing the routing lookup.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:44:42 -07:00
KOVACS Krisztian
88ef4a5a78 tcp: Handle TCP SYN+ACK/ACK/RST transparency
The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
incoming packets. The non-local source address check on output bites
us again, as replies for transparently redirected traffic won't have a
chance to leave the node.

This patch selectively sets the FLOWI_FLAG_ANYSRC flag when doing the
route lookup for those replies. Transparent replies are enabled if the
listening socket has the transparent socket flag set.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:41:00 -07:00
KOVACS Krisztian
79876874ce ipv4: Conditionally enable transparent flow flag when connecting
Set FLOWI_FLAG_ANYSRC in flowi->flags if the socket has the
transparent socket option set. This way we selectively enable certain
connections with non-local source addresses to be routed.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:35:39 -07:00
KOVACS Krisztian
1668e010cb ipv4: Make inet_sock.h independent of route.h
inet_iif() in inet_sock.h requires route.h. Since users of inet_iif()
usually require other route.h functionality anyway this patch moves
inet_iif() to route.h.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:33:10 -07:00
KOVACS Krisztian
f5715aea45 ipv4: Implement IP_TRANSPARENT socket option
This patch introduces the IP_TRANSPARENT socket option: enabling that
will make the IPv4 routing omit the non-local source address check on
output. Setting IP_TRANSPARENT requires NET_ADMIN capability.

Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:30:02 -07:00
Julian Anastasov
a210d01ae3 ipv4: Loosen source address check on IPv4 output
ip_route_output() contains a check to make sure that no flows with
non-local source IP addresses are routed. This obviously makes using
such addresses impossible.

This patch introduces a flowi flag which makes omitting this check
possible. The new flag provides a way of handling transparent and
non-transparent connections differently.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:28:28 -07:00
Herbert Xu
12a169e7d8 ipsec: Put dumpers on the dump list
Herbert Xu came up with the idea and the original patch to make
xfrm_state dump list contain also dumpers:

As it is we go to extraordinary lengths to ensure that states
don't go away while dumpers go to sleep.  It's much easier if
we just put the dumpers themselves on the list since they can't
go away while they're going.

I've also changed the order of addition on new states to prevent
a never-ending dump.

Timo Teräs improved the patch to apply cleanly to latest tree,
modified iteration code to be more readable by using a common
struct for entries in the list, implemented the same idea for
xfrm_policy dumping and moved the af_key specific "last" entry
caching to af_key.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 07:03:24 -07:00
David S. Miller
b262e60309 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/ath9k/core.c
	drivers/net/wireless/ath9k/main.c
	net/core/dev.c
2008-10-01 06:12:56 -07:00
Ilpo Järvinen
93c8b90f01 ipv6: almost identical frag hashing funcs combined
$ diff-funcs ip6qhashfn reassembly.c netfilter/nf_conntrack_reasm.c
 --- reassembly.c:ip6qhashfn()
 +++ netfilter/nf_conntrack_reasm.c:ip6qhashfn()
@@ -1,5 +1,5 @@
-static unsigned int ip6qhashfn(__be32 id, struct in6_addr *saddr,
-			       struct in6_addr *daddr)
+static unsigned int ip6qhashfn(__be32 id, const struct in6_addr *saddr,
+			       const struct in6_addr *daddr)
 {
 	u32 a, b, c;

@@ -9,7 +9,7 @@

 	a += JHASH_GOLDEN_RATIO;
 	b += JHASH_GOLDEN_RATIO;
-	c += ip6_frags.rnd;
+	c += nf_frags.rnd;
 	__jhash_mix(a, b, c);

 	a += (__force u32)saddr->s6_addr32[3];

And codiff xx.o.old xx.o.new:

net/ipv6/netfilter/nf_conntrack_reasm.c:
  ip6qhashfn         | -512
  nf_hashfn          |   +6
  nf_ct_frag6_gather |  +36
 3 functions changed, 42 bytes added, 512 bytes removed, diff: -470
net/ipv6/reassembly.c:
  ip6qhashfn    | -512
  ip6_hashfn    |   +7
  ipv6_frag_rcv |  +89
 3 functions changed, 96 bytes added, 512 bytes removed, diff: -416

net/ipv6/reassembly.c:
  inet6_hash_frag | +510
 1 function changed, 510 bytes added, diff: +510

Total: -376

Compile tested.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01 02:48:31 -07:00
John W. Linville
55ad175fb6 ieee80211.h: remove superfluous ETH_P_PAE definition
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-30 14:07:23 -04:00
Wei Yongjun
ba0166708e sctp: Fix kernel panic while process protocol violation parameter
Since call to function sctp_sf_abort_violation() need paramter 'arg' with
'struct sctp_chunk' type, it will read the chunk type and chunk length from
the chunk_hdr member of chunk. But call to sctp_sf_violation_paramlen()
always with 'struct sctp_paramhdr' type's parameter, it will be passed to
sctp_sf_abort_violation(). This may cause kernel panic.

   sctp_sf_violation_paramlen()
     |-- sctp_sf_abort_violation()
        |-- sctp_make_abort_violation()

This patch fixed this problem. This patch also fix two place which called
sctp_sf_violation_paramlen() with wrong paramter type.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-30 05:32:24 -07:00
Tejun Heo
72029fe85d 9p: implement proper trans module refcounting and unregistration
9p trans modules aren't refcounted nor were they unregistered
properly.  Fix it.

* Add 9p_trans_module->owner and reference the module on each trans
  instance creation and put it on destruction.

* Protect v9fs_trans_list with a spinlock.  This isn't strictly
  necessary as the list is manipulated only during module loading /
  unloading but it's a good idea to make the API safe.

* Unregister trans modules when the corresponding module is being
  unloaded.

* While at it, kill unnecessary EXPORT_SYMBOL on p9_trans_fd_init().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-09-24 16:22:23 -05:00
Johannes Berg
4b7679a561 mac80211: clean up rate control API
Long awaited, hard work. This patch totally cleans up the rate control
API to remove the requirement to include internal headers outside of
net/mac80211/.

There's one internal use in the PID algorithm left for mesh networking,
we'll have to figure out a way to clean that one up and decide how to
do the peer link evaluation, possibly independent of the rate control
algorithm or via new API.

Additionally, ath9k is left using the cross-inclusion hack for now, we
will add new API where necessary to make this work properly, but right
now I'm not expert enough to do it. It's still off better than before.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-24 16:18:03 -04:00
Johannes Berg
60719ffd72 cfg80211: show interface type
This patch makes cfg80211 show the interface in the nl80211
information about a specific interface. API users are required
to keep the type updated (everything else is fairly complicated)
but you will get a warning if you fail to keep it updated.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-24 16:18:00 -04:00
Johannes Berg
e07aa3783e cfg80211: fix code ordering in header file
Luis added the regulatory hint stuff to this file without
observing that __ieee80211_get_channel and ieee80211_get_channel
really belong together.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-24 16:17:59 -04:00
Jarek Poplawski
f4ab543201 pkt_sched: Remove the tx queue state check in qdisc_run()
The current check wrongly uses the state of one (currently the first)
tx queue for all tx queues in case of non-default qdiscs. This check
mainly prevented requeuing loop with __netif_schedule(), but now it's
controlled inside __qdisc_run(), while dequeuing. The wrongness of
this check was first noticed by Herbert Xu.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-23 01:05:56 -07:00
David S. Miller
cd07a8ea0d tcp: Use SKB queue handling interfaces instead of by-hand versions.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-23 00:50:13 -07:00
David S. Miller
d258b4914b tcp: Use skb_queue_is_last() instead of by-hand version.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-23 00:34:37 -07:00
David S. Miller
242f8bfefe pkt_sched: Make qdisc->gso_skb a list.
The idea is that we can use this to get rid of
->requeue().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 22:15:30 -07:00
David S. Miller
3d09274cc9 sctp: Use skb_queue_walk_safe() and skb_queue_split_tail_init().
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 22:14:36 -07:00
Remi Denis-Courmont
be0c52bfed Phonet: emit errors when a packet cannot be delivered locally
When there is no listener socket for a received packet, send an error
back to the sender.

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:09:13 -07:00
Remi Denis-Courmont
87ab4e20b4 Phonet: proc interface for port range
Phonet endpoints are bound to individual ports.
This provides a /proc/sys/net/phonet (or sysctl) interface for
selecting the range of automatically allocated ports (much like the
ip_local_port_range with IPv4).

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:08:39 -07:00
Remi Denis-Courmont
107d0d9b8d Phonet: Phonet datagram transport protocol
This provides the basic SOCK_DGRAM transport protocol for Phonet.

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:05:57 -07:00
Remi Denis-Courmont
ba113a94b7 Phonet: common socket glue
This provides the socket API for the Phonet protocols family.

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:05:19 -07:00
Remi Denis-Courmont
8fb397406f Phonet: Netlink interface
This provides support for configuring Phonet addresses, notifying
Phonet configuration changes, and dumping the configuration.

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:04:30 -07:00
Remi Denis-Courmont
f8ff60283d Phonet: network device and address handling
This provides support for adding Phonet addresses to and removing
Phonet addresses from network devices.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:03:44 -07:00
Remi Denis-Courmont
4b07b3f69a Phonet: PF_PHONET protocol family support
This is the basis for the Phonet protocol families, and introduces
the ETH_P_PHONET packet type and the PF_PHONET socket family.

Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 20:02:10 -07:00
Herbert Xu
5c1824587f ipsec: Fix xfrm_state_walk race
As discovered by Timo Teräs, the currently xfrm_state_walk scheme
is racy because if a second dump finishes before the first, we
may free xfrm states that the first dump would walk over later.

This patch fixes this by storing the dumps in a list in order
to calculate the correct completion counter which cures this
problem.

I've expanded netlink_cb in order to accomodate the extra state
related to this.  It shouldn't be a big deal since netlink_cb
is kmalloced for each dump and we're just increasing it by 4 or
8 bytes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-22 19:48:19 -07:00
David S. Miller
43f59c8939 net: Remove __skb_insert() calls outside of skbuff internals.
This minor cleanup simplifies later changes which will convert
struct sk_buff and friends over to using struct list_head.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-21 21:28:51 -07:00
Ilpo Järvinen
ef9da47c7c tcp: don't clear retransmit_skb_hint when not necessary
Most importantly avoid doing it with cumulative ACK. Not clearing
means that we no longer need n^2 processing in resolution of each
fast recovery.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-20 21:25:15 -07:00
Ilpo Järvinen
0e1c54c2a4 tcp: reorganize retransmit code loops
Both loops are quite similar, so they can be combined
with little effort. As a result, forward_skb_hint becomes
obsolete as well.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-20 21:24:21 -07:00
Ilpo Järvinen
006f582c73 tcp: convert retransmit_cnt_hint to seqno
Main benefit in this is that we can then freely point
the retransmit_skb_hint to anywhere we want to because
there's no longer need to know what would be the count
changes involve, and since this is really used only as a
terminator, unnecessary work is one time walk at most,
and if some retransmissions are necessary after that
point later on, the walk is not full waste of time
anyway.

Since retransmit_high must be kept valid, all lost
markers must ensure that.

Now I also have learned how those "holes" in the
rexmittable skbs can appear, mtu probe does them. So
I removed the misleading comment as well.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-20 21:20:20 -07:00
Ilpo Järvinen
64edc2736e tcp: Partial hint clearing has again become meaningless
Ie., the difference between partial and all clearing doesn't
exists anymore since the SACK optimizations got dropped by
an sacktag rewrite.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-20 21:18:32 -07:00
Johannes Berg
25d834e162 mac80211: fix virtual interfaces vs. injection
Currently, virtual interface pointers passed to drivers might be
from monitor interfaces and as such completely uninitialised
because we do not tell the driver about monitor interfaces when
those are created. Instead of passing them, we should therefore
indicate to the driver that there is no information; do that by
passing a NULL value and adjust drivers to cope with it.

As a result, some mac80211 API functions also need to cope with
a NULL vif pointer so drivers can still call them unconditionally.

Also, when injecting frames we really don't want to pass NULL all
the time, if we know we are the source address of a frame and have
a local interface for that address, we can to use that interface.
This also helps with processing the frame correctly for that
interface which will help the 802.11w implementation. It's not
entirely correct for VLANs or WDS interfaces because there the MAC
address isn't unique, but it's already a lot better than what we
do now.

Finally, when injecting without a matching local interface, don't
assign sequence numbers at all.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:25 -04:00
Johannes Berg
687c7c0807 mac80211: share sta_info->ht_info
Rate control algorithms may need access to a station's
HT capabilities, so share the ht_info struct in the
public station API.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:24 -04:00
Johannes Berg
323ce79a9c mac80211: share sta->supp_rates
As more preparation for a saner rate control algorithm API,
share the supported rates bitmap in the public API.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:24 -04:00
Johannes Berg
17741cdc26 mac80211: share STA information with driver
This patch changes mac80211 to share some more data about
stations with drivers. Should help iwlwifi and ath9k when
 they get around to updating, and might also help with
implementing rate control algorithms without internals.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Sujith Manoharan <Sujith.Manoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:23 -04:00
Johannes Berg
05c914fe33 mac80211: use nl80211 interface types
There's really no reason for mac80211 to be using its
own interface type defines. Use the nl80211 types and
simplify the configuration code a bit: there's no need
to translate them any more now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:23 -04:00
Johannes Berg
96dd22ac06 mac80211: inform driver of basic rateset
Drivers need to know the basic rateset to be able to configure
the ACK/CTS programming in hardware correctly.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:22 -04:00
Johannes Berg
5bc75728fd mac80211: fix scan vs. interface removal race
When we remove an interface, we can currently end up having
a pointer to it left in local->scan_sdata after it has been
set down, and then with a hardware scan the scan completion
can try to access it which is a bug. Alternatively, a scan
that started as a hardware scan may terminate as though it
was a software scan, if the timing is just right.

On SMP systems, software scan also has a similar problem,
just canceling the delayed work and setting a flag isn't
enough since it may be running concurrently; in this case
we would also never restore state of other interfaces.

This patch hopefully fixes the problems by always invoking
ieee80211_scan_completed or requiring it to be invoked by
the driver, I suspect the drivers that have ->hw_scan() are
buggy. The bug will not manifest itself unless you remove
the interface while hw-scanning which will also turn off
the hw, and then add a new interface which will be unusable
until you scan once.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:20 -04:00
Luis R. Rodriguez
b2e1b30290 cfg80211: Add new wireless regulatory infrastructure
This adds the new wireless regulatory infrastructure. The
main motiviation behind this was to centralize regulatory
code as each driver was implementing their own regulatory solution,
and to replace the initial centralized code we have where:

* only 3 regulatory domains are supported: US, JP and EU
* regulatory domains can only be changed through module parameter
* all rules were built statically in the kernel

We now have support for regulatory domains for many countries
and regulatory domains are now queried through a userspace agent
through udev allowing distributions to update regulatory rules
without updating the kernel.

Each driver can regulatory_hint() a regulatory domain
based on either their EEPROM mapped regulatory domain value to a
respective ISO/IEC 3166-1 country code or pass an internally built
regulatory domain. We also add support to let the user set the
regulatory domain through userspace in case of faulty EEPROMs to
further help compliance.

Support for world roaming will be added soon for cards capable of
this.

For more information see:

http://wireless.kernel.org/en/developers/Regulatory/CRDA

For now we leave an option to enable the old module parameter,
ieee80211_regdom, and to build the 3 old regdomains statically
(US, JP and EU). This option is CONFIG_WIRELESS_OLD_REGULATORY.
These old static definitions and the module parameter is being
scheduled for removal for 2.6.29. Note that if you use this
you won't make use of a world regulatory domain as its pointless.
If you leave this option enabled and if CRDA is present and you
use US or JP we will try to ask CRDA to update us a regulatory
domain for us.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-15 16:48:19 -04:00
Alexander Duyck
ca9b0e27e0 pkt_action: add new action skbedit
This new action will have the ability to change the priority and/or
queue_mapping fields on an sk_buff.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-12 16:30:20 -07:00
Vegard Nossum
1045b03e07 netlink: fix overrun in attribute iteration
kmemcheck reported this:

  kmemcheck: Caught 16-bit read from uninitialized memory (f6c1ba30)
  0500110001508abf050010000500000002017300140000006f72672e66726565
   i i i i i i i i i i i i i u u u u u u u u u u u u u u u u u u u
                                   ^

  Pid: 3462, comm: wpa_supplicant Not tainted (2.6.27-rc3-00054-g6397ab9-dirty #13)
  EIP: 0060:[<c05de64a>] EFLAGS: 00010296 CPU: 0
  EIP is at nla_parse+0x5a/0xf0
  EAX: 00000008 EBX: fffffffd ECX: c06f16c0 EDX: 00000005
  ESI: 00000010 EDI: f6c1ba30 EBP: f6367c6c ESP: c0a11e88
   DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
  CR0: 8005003b CR2: f781cc84 CR3: 3632f000 CR4: 000006d0
  DR0: c0ead9bc DR1: 00000000 DR2: 00000000 DR3: 00000000
  DR6: ffff4ff0 DR7: 00000400
   [<c05d4b23>] rtnl_setlink+0x63/0x130
   [<c05d5f75>] rtnetlink_rcv_msg+0x165/0x200
   [<c05ddf66>] netlink_rcv_skb+0x76/0xa0
   [<c05d5dfe>] rtnetlink_rcv+0x1e/0x30
   [<c05dda21>] netlink_unicast+0x281/0x290
   [<c05ddbe9>] netlink_sendmsg+0x1b9/0x2b0
   [<c05beef2>] sock_sendmsg+0xd2/0x100
   [<c05bf945>] sys_sendto+0xa5/0xd0
   [<c05bf9a6>] sys_send+0x36/0x40
   [<c05c03d6>] sys_socketcall+0x1e6/0x2c0
   [<c020353b>] sysenter_do_call+0x12/0x3f
   [<ffffffff>] 0xffffffff

This is the line in nla_ok():

  /**
   * nla_ok - check if the netlink attribute fits into the remaining bytes
   * @nla: netlink attribute
   * @remaining: number of bytes remaining in attribute stream
   */
  static inline int nla_ok(const struct nlattr *nla, int remaining)
  {
          return remaining >= sizeof(*nla) &&
                 nla->nla_len >= sizeof(*nla) &&
                 nla->nla_len <= remaining;
  }

It turns out that remaining can become negative due to alignment in
nla_next(). But GCC promotes "remaining" to unsigned in the test
against sizeof(*nla) above. Therefore the test succeeds, and the
nla_for_each_attr() may access memory outside the received buffer.

A short example illustrating this point is here:

  #include <stdio.h>

  main(void)
  {
          printf("%d\n", -1 >= sizeof(int));
  }

...which prints "1".

This patch adds a cast in front of the sizeof so that GCC will make
a signed comparison and fix the illegal memory dereference. With the
patch applied, there is no kmemcheck report.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-11 19:05:29 -07:00
Johannes Berg
fe3fa82731 mac80211: make conf_tx non-atomic
The conf_tx callback currently needs to be atomic, this requirement
is just because it can be called from scanning. This rearranges it
slightly to only update while not scanning (which is fine, we'll be
getting beacons when associated) and thus removes the atomic
requirement.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-11 15:53:34 -04:00
Herbert Xu
abb81c4f3c ipsec: Use RCU-like construct for saved state within a walk
Now that we save states within a walk we need synchronisation
so that the list the saved state is on doesn't disappear from
under us.

As it stands this is done by keeping the state on the list which
is bad because it gets in the way of the management of the state
life-cycle.

An alternative is to make our own pseudo-RCU system where we use
counters to indicate which state can't be freed immediately as
it may be referenced by an ongoing walk when that resumes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-09 19:58:29 -07:00
David S. Miller
dacc62dbf5 Merge branch 'lvs-next-2.6' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-2.6 2008-09-09 19:51:04 -07:00
David S. Miller
47abf28d5b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2008-09-09 19:28:03 -07:00
Simon Horman
c051a0a2c9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 into lvs-next-2.6 2008-09-10 09:14:52 +10:00
Gerrit Renker
410e27a49b This reverts "Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp"
as it accentally contained the wrong set of patches. These will be
submitted separately.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-09 13:27:22 +02:00
David S. Miller
fd9ec7d31f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2008-09-09 02:11:11 -07:00
Marcel Holtmann
e7c29cb16c [Bluetooth] Reject L2CAP connections on an insecure ACL link
The Security Mode 4 of the Bluetooth 2.1 specification has strict
authentication and encryption requirements. It is the initiators job
to create a secure ACL link. However in case of malicious devices, the
acceptor has to make sure that the ACL is encrypted before allowing
any kind of L2CAP connection. The only exception here is the PSM 1 for
the service discovery protocol, because that is allowed to run on an
insecure ACL link.

Previously it was enough to reject a L2CAP connection during the
connection setup phase, but with Bluetooth 2.1 it is forbidden to
do any L2CAP protocol exchange on an insecure link (except SDP).

The new hci_conn_check_link_mode() function can be used to check the
integrity of an ACL link. This functions also takes care of the cases
where Security Mode 4 is disabled or one of the devices is based on
an older specification.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-09-09 07:19:20 +02:00
Marcel Holtmann
09ab6f4c23 [Bluetooth] Enforce correct authentication requirements
With the introduction of Security Mode 4 and Simple Pairing from the
Bluetooth 2.1 specification it became mandatory that the initiator
requires authentication and encryption before any L2CAP channel can
be established. The only exception here is PSM 1 for the service
discovery protocol (SDP). It is meant to be used without any encryption
since it contains only public information. This is how Bluetooth 2.0
and before handle connections on PSM 1.

For Bluetooth 2.1 devices the pairing procedure differentiates between
no bonding, general bonding and dedicated bonding. The L2CAP layer
wrongly uses always general bonding when creating new connections, but it
should not do this for SDP connections. In this case the authentication
requirement should be no bonding and the just-works model should be used,
but in case of non-SDP connection it is required to use general bonding.

If the new connection requires man-in-the-middle (MITM) protection, it
also first wrongly creates an unauthenticated link key and then later on
requests an upgrade to an authenticated link key to provide full MITM
protection. With Simple Pairing the link key generation is an expensive
operation (compared to Bluetooth 2.0 and before) and doing this twice
during a connection setup causes a noticeable delay when establishing
a new connection. This should be avoided to not regress from the expected
Bluetooth 2.0 connection times. The authentication requirements are known
up-front and so enforce them.

To fulfill these requirements the hci_connect() function has been extended
with an authentication requirement parameter that will be stored inside
the connection information and can be retrieved by userspace at any
time. This allows the correct IO capabilities exchange and results in
the expected behavior.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-09-09 07:19:20 +02:00
David S. Miller
0a68a20cc3 Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp
Conflicts:

	net/dccp/input.c
	net/dccp/options.c
2008-09-08 17:28:59 -07:00
David S. Miller
17dce5dfe3 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	net/mac80211/mlme.c
2008-09-08 16:59:05 -07:00
Sven Wegener
e9c0ce232e ipvs: Embed user stats structure into kernel stats structure
Instead of duplicating the fields, integrate a user stats structure into
the kernel stats structure. This is more robust when the members are
changed, because they are now automatically kept in sync.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Reviewed-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-09 09:53:08 +10:00
Sven Wegener
2206a3f5b7 ipvs: Restrict connection table size via Kconfig
Instead of checking the value in include/net/ip_vs.h, we can just
restrict the range in our Kconfig file. This will prevent values outside
of the range early.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Reviewed-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-09 09:50:55 +10:00
Daniel Lezcano
d315492b1a netns : fix kernel panic in timewait socket destruction
How to reproduce ?
 - create a network namespace
 - use tcp protocol and get timewait socket
 - exit the network namespace
 - after a moment (when the timewait socket is destroyed), the kernel
   panics.

# BUG: unable to handle kernel NULL pointer dereference at
0000000000000007
IP: [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
PGD 119985067 PUD 11c5c0067 PMD 0
Oops: 0000 [1] SMP
CPU 1
Modules linked in: ipv6 button battery ac loop dm_mod tg3 libphy ext3 jbd
edd fan thermal processor thermal_sys sg sata_svw libata dock serverworks
sd_mod scsi_mod ide_disk ide_core [last unloaded: freq_table]
Pid: 0, comm: swapper Not tainted 2.6.27-rc2 #3
RIP: 0010:[<ffffffff821e394d>] [<ffffffff821e394d>]
inet_twdr_do_twkill_work+0x6e/0xb8
RSP: 0018:ffff88011ff7fed0 EFLAGS: 00010246
RAX: ffffffffffffffff RBX: ffffffff82339420 RCX: ffff88011ff7ff30
RDX: 0000000000000001 RSI: ffff88011a4d03c0 RDI: ffff88011ac2fc00
RBP: ffffffff823392e0 R08: 0000000000000000 R09: ffff88002802a200
R10: ffff8800a5c4b000 R11: ffffffff823e4080 R12: ffff88011ac2fc00
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
FS: 0000000041cbd940(0000) GS:ffff8800bff839c0(0000)
knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000007 CR3: 00000000bd87c000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff8800bff9e000, task
ffff88011ff76690)
Stack: ffffffff823392e0 0000000000000100 ffffffff821e3a3a
0000000000000008
0000000000000000 ffffffff821e3a61 ffff8800bff7c000 ffffffff8203c7e7
ffff88011ff7ff10 ffff88011ff7ff10 0000000000000021 ffffffff82351108
Call Trace:
<IRQ> [<ffffffff821e3a3a>] ? inet_twdr_hangman+0x0/0x9e
[<ffffffff821e3a61>] ? inet_twdr_hangman+0x27/0x9e
[<ffffffff8203c7e7>] ? run_timer_softirq+0x12c/0x193
[<ffffffff820390d1>] ? __do_softirq+0x5e/0xcd
[<ffffffff8200d08c>] ? call_softirq+0x1c/0x28
[<ffffffff8200e611>] ? do_softirq+0x2c/0x68
[<ffffffff8201a055>] ? smp_apic_timer_interrupt+0x8e/0xa9
[<ffffffff8200cad6>] ? apic_timer_interrupt+0x66/0x70
<EOI> [<ffffffff82011f4c>] ? default_idle+0x27/0x3b
[<ffffffff8200abbd>] ? cpu_idle+0x5f/0x7d


Code: e8 01 00 00 4c 89 e7 41 ff c5 e8 8d fd ff ff 49 8b 44 24 38 4c 89 e7
65 8b 14 25 24 00 00 00 89 d2 48 8b 80 e8 00 00 00 48 f7 d0 <48> 8b 04 d0
48 ff 40 58 e8 fc fc ff ff 48 89 df e8 c0 5f 04 00
RIP [<ffffffff821e394d>] inet_twdr_do_twkill_work+0x6e/0xb8
RSP <ffff88011ff7fed0>
CR2: 0000000000000007

This patch provides a function to purge all timewait sockets related
to a network namespace. The timewait sockets life cycle is not tied with
the network namespace, that means the timewait sockets stay alive while
the network namespace dies. The timewait sockets are for avoiding to
receive a duplicate packet from the network, if the network namespace is
freed, the network stack is removed, so no chance to receive any packets
from the outside world. Furthermore, having a pending destruction timer
on these sockets with a network namespace freed is not safe and will lead
to an oops if the timer callback which try to access data belonging to 
the namespace like for example in:
	inet_twdr_do_twkill_work
		-> NET_INC_STATS_BH(twsk_net(tw), LINUX_MIB_TIMEWAITED);

Purging the timewait sockets at the network namespace destruction will:
 1) speed up memory freeing for the namespace
 2) fix kernel panic on asynchronous timewait destruction

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Denis V. Lunev <den@openvz.org>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-08 13:17:27 -07:00
Luis R. Rodriguez
f59ac04816 cfg80211: keep track of supported interface modes
It is obviously good for userspace to know up front which
interface modes a given piece of hardware might support (even
if adding such an interface might fail later because of
concurrency issues), so let's make cfg80211 aware of that.
For good measure, disallow adding interfaces in all other
modes so drivers don't forget to announce support for one mode
when they add it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Stephen Blackheath <tramp.enshrine.stephen@blacksapphire.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-09-05 16:17:42 -04:00
Julius Volz
cfc78c5a09 IPVS: Adjust various debug outputs to use new macros
Adjust various debug outputs to use the new *_BUF macro variants for
correct output of v4/v6 addresses.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:12 +10:00
Julius Volz
7937df1564 IPVS: Convert real server lookup functions
Convert functions for looking up destinations (real servers) to support
IPv6 services/dests.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:10 +10:00
Julius Volz
b3cdd2a738 IPVS: Add and bind IPv6 xmit functions
Add xmit functions for IPv6. Also add the already needed __ip_vs_get_out_rt_v6()
to ip_vs_core.c. Bind the new xmit functions to v6 connections.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:08 +10:00
Julius Volz
28364a59f3 IPVS: Extend functions for getting/creating connections
Extend functions for getting/creating connections and connection
templates for IPv6 support and fix the callers.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:08 +10:00
Julius Volz
0bbdd42b7e IPVS: Extend protocol DNAT/SNAT and state handlers
Extend protocol DNAT/SNAT and state handlers to work with IPv6. Also
change/introduce new checksumming helper functions for this.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:07 +10:00
Julius Volz
51ef348b14 IPVS: Add 'af' args to protocol handler functions
Add 'af' arguments to conn_schedule(), conn_in_get(), conn_out_get() and
csum_check() function pointers in struct ip_vs_protocol. Extend the
respective functions for TCP, UDP, AH and ESP and adjust the callers.

The changes in the callers need to be somewhat extensive, since they now
need to pass a filled out struct ip_vs_iphdr * to the modified functions
instead of a struct iphdr *.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:06 +10:00
Julius Volz
b14198f6c1 IPVS: Add IPv6 support flag to schedulers
Add 'supports_ipv6' flag to struct ip_vs_scheduler to indicate whether a
scheduler supports IPv6. Set the flag to 1 in schedulers that work with
IPv6, 0 otherwise. This flag is checked in a later patch while trying to
add a service with a specific scheduler. Adjust debug in v6-supporting
schedulers to work with both address families.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:06 +10:00
Julius Volz
3c2e0505d2 IPVS: Add v6 support to ip_vs_service_get()
Add support for selecting services based on their address family to
ip_vs_service_get() and adjust the callers.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:05 +10:00
Julius Volz
c860c6b147 IPVS: Add internal versions of sockopt interface structs
Add extended internal versions of struct ip_vs_service_user and struct
ip_vs_dest_user (the originals can't be modified as they are part
of the old sockopt interface). Adjust ip_vs_ctl.c to work with the new
data structures and add some minor AF-awareness.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:04 +10:00
Julius Volz
c842a3ada9 IPVS: Add debug macros for v4 and v6 address output
Add some debugging macros that allow conditional output of either v4 or v6
addresses, depending on an 'af' parameter. This is done by creating a
temporary string buffer in an outer debug macro and writing addresses'
string representations into it from another macro which can only be used
when inside the outer one.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:04 +10:00
Julius Volz
64aae3cb9f IPVS: Add general v4/v6 helper functions / data structures
Add a struct ip_vs_iphdr for easier handling of common v4 and v6 header
fields in the same code path. ip_vs_fill_iphdr() helps to fill this struct
from an IPv4 or IPv6 header. Add further helper functions for copying and
comparing addresses.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:03 +10:00
Julius Volz
e7ade46a53 IPVS: Change IPVS data structures to support IPv6 addresses
Introduce new 'af' fields into IPVS data structures for specifying an
entry's address family. Convert IP addresses to be of type union
nf_inet_addr.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-09-05 11:17:03 +10:00
Gerrit Renker
6224877b2c tcp/dccp: Consolidate common code for RFC 3390 conversion
This patch consolidates the code common to TCP and CCID-2:
 * TCP uses RFC 3390 in a packet-oriented manner (tcp_input.c) and
 * CCID-2 uses RFC 3390 in packet-oriented manner (RFC 4341).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2008-09-04 07:45:39 +02:00
Thomas Graf
2c10b32bf5 netlink: Remove compat API for nested attributes
Removes all _nested_compat() functions from the API. The prio qdisc
no longer requires them and netem has its own format anyway. Their
existance is only confusing.

Resend: Also remove the wrapper macro.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-09-02 17:30:27 -07:00
David S. Miller
b171e19ed0 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	net/mac80211/mlme.c
2008-08-29 23:06:00 -07:00
David S. Miller
143b11c03c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-08-29 14:02:13 -07:00
Jouni Malinen
36aedc903e mac80211/cfg80211: HT capabilities for NEW_STA
Allow userspace (e.g., hostapd) to set HT capabilities for associated
STAs. This is based on a patch from Zhu Yi <yi.zhu@intel.com> (only
the NL80211_ATTR_HT_CAPABILITY for NEW_STA part is included here).

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-29 16:24:09 -04:00
Jouni Malinen
9f1ba9062e mac80211/cfg80211: Add BSS configuration options for AP mode
This change adds a new cfg80211 command, NL80211_CMD_SET_BSS, to allow
AP mode BSS parameters to be changed from user space (e.g., hostapd).
The drivers using mac80211 are expected to be modified with separate
changes to use the new BSS info parameter for short slot time in the
bss_info_changed() handler.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-29 16:23:55 -04:00
Alexey Dobriyan
af01d53746 net: more #ifdef CONFIG_COMPAT
All users of struct proto::compat_[gs]etsockopt and
struct inet_connection_sock_af_ops::compat_[gs]etsockopt are under
#ifdef already, so use it in structure definition too.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-28 02:53:51 -07:00
Jarek Poplawski
fe439dd09d pkt_sched: Fix sch_tree_lock()
Use new qdisc_root_sleeping_lock() instead of qdisc_root_lock() as
sch_tree_lock() because this lock could be used while dev is
deactivated, but we never need to use this with noop_qdisc as a root.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27 02:27:10 -07:00
Jarek Poplawski
f6f9b93f16 pkt_sched: Fix gen_estimator locks
While passing a qdisc root lock to gen_new_estimator() and
gen_replace_estimator() dev could be deactivated or even before
grafting proper root qdisc as qdisc_sleeping (e.g. qdisc_create), so
using qdisc_root_lock() is not enough. This patch adds
qdisc_root_sleeping_lock() for this, plus additional checks, where
necessary.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-27 02:25:17 -07:00
Simon Horman
7fd1067851 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-2.6 into lvs-next-2.6 2008-08-27 15:11:37 +10:00
Harvey Harrison
6b644e524b mac80211: remove ieee80211_get_hdrlen
All users have been moved over to the version taking a le16 frame control
rather than a cpu-endian value.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-22 16:29:54 -04:00
Bruno Randolf
b4f28bbb9b mac80211: add rx status flag for short preamble
and use it for the radiotap header

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-22 16:29:50 -04:00
Tomas Winkler
92ab853549 mac80211: add ieee80211_queue_stopped)
This patch adds ieee80211_queue_stopped that let drivers to query
queue status

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-22 16:29:50 -04:00
Jarek Poplawski
f6e0b239a2 pkt_sched: Fix qdisc list locking
Since some qdiscs call qdisc_tree_decrease_qlen() (so qdisc_lookup())
without rtnl_lock(), adding and deleting from a qdisc list needs
additional locking. This patch adds global spinlock qdisc_list_lock
and wrapper functions for modifying the list. It is considered as a
temporary solution until hfsc_dequeue(), netem_dequeue() and
tbf_dequeue() (or qdisc_tree_decrease_qlen()) are redone.

With feedback from Herbert Xu and David S. Miller.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-22 03:31:39 -07:00
Jarek Poplawski
2540e0511e pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() race
dev_deactivate() can skip rescheduling of a qdisc by qdisc_watchdog()
or other timer calling netif_schedule() after dev_queue_deactivate().
We prevent this checking aliveness before scheduling the timer. Since
during deactivation the root qdisc is available only as qdisc_sleeping
additional accessor qdisc_root_sleeping() is created.

With feedback from Herbert Xu <herbert@gondor.apana.org.au>

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-21 05:11:14 -07:00
Simon Horman
3f087668c4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-08-19 17:36:22 +10:00
David S. Miller
8e0f36ec37 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-08-18 21:15:44 -07:00
Luis R. Rodriguez
546c80c91f mac80211: remove kdoc references to IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE
IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE was made unnecessary in
the recent revamp on beacon configuration.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-18 11:05:14 -04:00
David S. Miller
1e0d5a5747 pkt_sched: No longer destroy qdiscs from RCU.
We can now kill them synchronously with all of the
previous dev_deactivate() cures.

This makes netdev destruction and shutdown saner as
the qdiscs hold references to the device.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-17 22:31:26 -07:00
David S. Miller
a9312ae893 pkt_sched: Add 'deactivated' state.
This new state lets dev_deactivate() mark a qdisc as having been
deactivated.

dev_queue_xmit() and ing_filter() check for this bit and do not
try to process the qdisc if the bit is set.

dev_deactivate() polls the qdisc after setting the bit, waiting
for both __QDISC_STATE_RUNNING and __QDISC_STATE_SCHED to clear.

This isn't perfect yet, but subsequent changesets will make it so.
This part is just one piece of the puzzle.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-17 21:51:03 -07:00
Simon Horman
51df190139 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-08-16 14:44:17 +10:00
Sven Wegener
a919cf4b6b ipvs: Create init functions for estimator code
Commit 8ab19ea36c ("ipvs: Fix possible deadlock
in estimator code") fixed a deadlock condition, but that condition can only
happen during unload of IPVS, because during normal operation there is at least
our global stats structure in the estimator list. The mod_timer() and
del_timer_sync() calls are actually initialization and cleanup code in
disguise. Let's make it explicit and move them to their own init and cleanup
function.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
2008-08-15 09:26:15 +10:00
Brian Haley
191cd58250 netns: Add network namespace argument to rt6_fill_node() and ipv6_dev_get_saddr()
ipv6_dev_get_saddr() blindly de-references dst_dev to get the network
namespace, but some callers might pass NULL.  Change callers to pass a
namespace pointer instead.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-14 15:33:21 -07:00
Rami Rosen
6bf90b2bf4 ipv6: Kill unused ip6_prohibit_entry and ip6_blk_hole_entry declarations.
This patch removes ip6_prohibit_entry and ip6_blk_hole_entry
declarations from include/net/ip6_route.h as they are unused.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 02:35:39 -07:00
Rami Rosen
83ac794f15 ipv6: ip6_route.h cleanup.
This patch removes rt6_lock declaration from include/net/ip6_route.h
as it is unused.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 02:34:39 -07:00
David S. Miller
83f36f3f35 pkt_sched: Add queue stopped test back to qdisc_run().
Based upon a bug report by Andrew Gallatin on netdev
with subject "CPU utilization increased in 2.6.27rc"

In commit 37437bb2e1
("pkt_sched: Schedule qdiscs instead of netdev_queue.")
the test of the queue being stopped was erroneously
removed from qdisc_run().

When the TX queue of the device fills up, this omission
causes lots of extraneous useless work to be queued up
to softirq context, where we'll just return immediately
because the device is still stuffed up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-13 02:13:34 -07:00
Sven Wegener
3a14a313f9 ipvs: Embed estimator object into stats object
There's no reason for dynamically allocating an estimator object for every
stats object. Directly embed an estimator object into every stats object and
switch to using the kernel-provided list implementation. This makes the code
much simpler and faster, as we do not need to traverse the list of all
estimators to find the one belonging to a stats object. There's no need to use
an rwlock, as we only have one reader. Also reorder the members of the
estimator structure slightly to avoid padding overhead. This can't be done
with the stats object as the members are currently copied to our user space
object via memcpy() and changing it would break ABI.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 14:00:43 +02:00
Sven Wegener
5587da55fb ipvs: Mark net_vs_ctl_path const
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 11:46:27 +02:00
Sven Wegener
afdd614071 ipvs: Use ARRAY_SIZE()
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
2008-08-11 11:45:48 +02:00
David S. Miller
32bb93b02d Merge branch 'upstream-davem' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2008-08-07 02:10:27 -07:00
Jeff Garzik
3859069bc3 Merge branch 'for-jeff' of git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux-2.6 into tmp 2008-08-07 04:05:46 -04:00
Gui Jianfeng
6edafaaf6f tcp: Fix kernel panic when calling tcp_v(4/6)_md5_do_lookup
If the following packet flow happen, kernel will panic.
MathineA			MathineB
		SYN
	---------------------->    
        	SYN+ACK
	<----------------------
		ACK(bad seq)
	---------------------->
When a bad seq ACK is received, tcp_v4_md5_do_lookup(skb->sk, ip_hdr(skb)->daddr))
is finally called by tcp_v4_reqsk_send_ack(), but the first parameter(skb->sk) is 
NULL at that moment, so kernel panic happens.
This patch fixes this bug.

OOPS output is as following:
[  302.812793] IP: [<c05cfaa6>] tcp_v4_md5_do_lookup+0x12/0x42
[  302.817075] Oops: 0000 [#1] SMP 
[  302.819815] Modules linked in: ipv6 loop dm_multipath rtc_cmos rtc_core rtc_lib pcspkr pcnet32 mii i2c_piix4 parport_pc i2c_core parport ac button ata_piix libata dm_mod mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod crc_t10dif ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
[  302.849946] 
[  302.851198] Pid: 0, comm: swapper Not tainted (2.6.27-rc1-guijf #5)
[  302.855184] EIP: 0060:[<c05cfaa6>] EFLAGS: 00010296 CPU: 0
[  302.858296] EIP is at tcp_v4_md5_do_lookup+0x12/0x42
[  302.861027] EAX: 0000001e EBX: 00000000 ECX: 00000046 EDX: 00000046
[  302.864867] ESI: ceb69e00 EDI: 1467a8c0 EBP: cf75f180 ESP: c0792e54
[  302.868333]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[  302.871287] Process swapper (pid: 0, ti=c0792000 task=c0712340 task.ti=c0746000)
[  302.875592] Stack: c06f413a 00000000 cf75f180 ceb69e00 00000000 c05d0d86 000016d0 ceac5400 
[  302.883275]        c05d28f8 000016d0 ceb69e00 ceb69e20 681bf6e3 00001000 00000000 0a67a8c0 
[  302.890971]        ceac5400 c04250a3 c06f413a c0792eb0 c0792edc cf59a620 cf59a620 cf59a634 
[  302.900140] Call Trace:
[  302.902392]  [<c05d0d86>] tcp_v4_reqsk_send_ack+0x17/0x35
[  302.907060]  [<c05d28f8>] tcp_check_req+0x156/0x372
[  302.910082]  [<c04250a3>] printk+0x14/0x18
[  302.912868]  [<c05d0aa1>] tcp_v4_do_rcv+0x1d3/0x2bf
[  302.917423]  [<c05d26be>] tcp_v4_rcv+0x563/0x5b9
[  302.920453]  [<c05bb20f>] ip_local_deliver_finish+0xe8/0x183
[  302.923865]  [<c05bb10a>] ip_rcv_finish+0x286/0x2a3
[  302.928569]  [<c059e438>] dev_alloc_skb+0x11/0x25
[  302.931563]  [<c05a211f>] netif_receive_skb+0x2d6/0x33a
[  302.934914]  [<d0917941>] pcnet32_poll+0x333/0x680 [pcnet32]
[  302.938735]  [<c05a3b48>] net_rx_action+0x5c/0xfe
[  302.941792]  [<c042856b>] __do_softirq+0x5d/0xc1
[  302.944788]  [<c042850e>] __do_softirq+0x0/0xc1
[  302.948999]  [<c040564b>] do_softirq+0x55/0x88
[  302.951870]  [<c04501b1>] handle_fasteoi_irq+0x0/0xa4
[  302.954986]  [<c04284da>] irq_exit+0x35/0x69
[  302.959081]  [<c0405717>] do_IRQ+0x99/0xae
[  302.961896]  [<c040422b>] common_interrupt+0x23/0x28
[  302.966279]  [<c040819d>] default_idle+0x2a/0x3d
[  302.969212]  [<c0402552>] cpu_idle+0xb2/0xd2
[  302.972169]  =======================
[  302.974274] Code: fc ff 84 d2 0f 84 df fd ff ff e9 34 fe ff ff 83 c4 0c 5b 5e 5f 5d c3 90 90 57 89 d7 56 53 89 c3 50 68 3a 41 6f c0 e8 e9 55 e5 ff <8b> 93 9c 04 00 00 58 85 d2 59 74 1e 8b 72 10 31 db 31 c9 85 f6 
[  303.011610] EIP: [<c05cfaa6>] tcp_v4_md5_do_lookup+0x12/0x42 SS:ESP 0068:c0792e54
[  303.018360] Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-06 23:50:04 -07:00
David S. Miller
33e334950a Merge branch 'no-ath9k' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-08-05 01:28:35 -07:00
Rami Rosen
95c3e8bfcd ipv4: remove unused field in struct flowi (include/net/flow.h).
This patch removes an unused field (flags) from struct flowi; it seems
that this "flags" field was used once in the past for multipath
routing with FLOWI_FLAG_MULTIPATHOLDROUTE flag (which does no longer
exist); however, the "flags" field of struct flowi is not used
anymore.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-05 01:19:50 -07:00
David S. Miller
cc6533e98a net: Kill plain NET_XMIT_BYPASS.
dst_input() was doing something completely absurd, looping
on skb->dst->input() if NET_XMIT_BYPASS was seen, but these
functions never return such an error.

And as a result plain ole' NET_XMIT_BYPASS has no more
references and can be completely killed off.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04 23:04:08 -07:00
Jarek Poplawski
c27f339af9 net_sched: Add qdisc __NET_XMIT_BYPASS flag
Patrick McHardy <kaber@trash.net> noticed that it would be nice to
handle NET_XMIT_BYPASS by NET_XMIT_SUCCESS with an internal qdisc flag
__NET_XMIT_BYPASS and to remove the mapping from dev_queue_xmit().

David Miller <davem@davemloft.net> spotted a serious bug in the first
version of this patch.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04 22:39:11 -07:00
Jarek Poplawski
378a2f090f net_sched: Add qdisc __NET_XMIT_STOLEN flag
Patrick McHardy <kaber@trash.net> noticed:
"The other problem that affects all qdiscs supporting actions is
TC_ACT_QUEUED/TC_ACT_STOLEN getting mapped to NET_XMIT_SUCCESS
even though the packet is not queued, corrupting upper qdiscs'
qlen counters."

and later explained:
"The reason why it translates it at all seems to be to not increase
the drops counter. Within a single qdisc this could be avoided by
other means easily, upper qdiscs would still increase the counter
when we return anything besides NET_XMIT_SUCCESS though.

This means we need a new NET_XMIT return value to indicate this to
the upper qdiscs. So I'd suggest to introduce NET_XMIT_STOLEN,
return that to upper qdiscs and translate it to NET_XMIT_SUCCESS
in dev_queue_xmit, similar to NET_XMIT_BYPASS."

David Miller <davem@davemloft.net> noticed:
"Maybe these NET_XMIT_* values being passed around should be a set of
bits. They could be composed of base meanings, combined with specific
attributes.

So you could say "NET_XMIT_DROP | __NET_XMIT_NO_DROP_COUNT"

The attributes get masked out by the top-level ->enqueue() caller,
such that the base meanings are the only thing that make their
way up into the stack. If it's only about communication within the
qdisc tree, let's simply code it that way."

This patch is trying to realize these ideas.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04 22:31:03 -07:00
Tomas Winkler
ea95bba41e mac80211: make listen_interval be limited by low level driver
This patch makes possible for a driver to specify maximal listen interval
The possibility for user to configure listen interval is not implemented
yet, currently the maximum provided by the driver or 1 is used.
Mac80211 uses config handler to set listen interval for to the driver.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-04 15:09:07 -04:00
Emmanuel Grumbach
98f7dfd86c mac80211: pass dtim_period to low level driver
This patch adds the dtim_period in ieee80211_bss_conf, this allows the low
level driver to know the dtim_period, and to plan power save accordingly.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-08-04 15:09:07 -04:00
Herbert Xu
f880374c2f sctp: Drop ipfargok in sctp_xmit function
The ipfragok flag controls whether the packet may be fragmented
either on the local host on beyond.  The latter is only valid on
IPv4.

In fact, we never want to do the latter even on IPv4 when PMTU is
enabled.  This is because even though we can't fragment packets
within SCTP due to the prtocol's inherent faults, we can still
fragment it at IP layer.  By setting the DF bit we will improve
the PMTU process.

RFC 2960 only says that we SHOULD clear the DF bit in this case,
so we're compliant even if we set the DF bit.  In fact RFC 4960
no longer has this statement.

Once we make this change, we only need to control the local
fragmentation.  There is already a bit in the skb which controls
that, local_df.  So this patch sets that instead of using the
ipfragok argument.

The only complication is that there isn't a struct sock object
per transport, so for IPv4 we have to resort to changing the
pmtudisc field for every packet.  This should be safe though
as the protocol is single-threaded.

Note that after this patch we can remove ipfragok from the rest
of the stack too.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-03 21:15:08 -07:00
David S. Miller
7e43f1128d pkt_sched: Make sure RTNL is held in qdisc_root_lock().
It is the only legal environment in which this can be
used.

Add some commentary explaining the situation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-02 23:27:37 -07:00
Julius Volz
bc4768eb08 ipvs: Move userspace definitions to include/linux/ip_vs.h
Current versions of ipvsadm include "/usr/src/linux/include/net/ip_vs.h"
directly. This file also contains kernel-only definitions. Normally, public
definitions should live in include/linux, so this patch moves the
definitions shared with userspace to a new file, "include/linux/ip_vs.h".

This also removes the unused NFC_IPVS_PROPERTY bitmask, which was once
used to point into skb->nfcache.

To make old ipvsadms still compile with this, the old header file includes
the new one.

Thanks to Dave Miller and Horms for noting/adding the missing Kbuild entry
for the new header file.

Signed-off-by: Julius Volz <juliusv@google.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-31 20:45:24 -07:00
Johannes Berg
d0f0980414 mac80211: partially fix skb->cb use
This patch fixes mac80211 to not use the skb->cb over the queue step
from virtual interfaces to the master. The patch also, for now,
disables aggregation because that would still require requeuing,
will fix that in a separate patch. There are two other places (software
requeue and powersaving stations) where requeue can happen, but that is
not currently used by any drivers/not possible to use respectively.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-29 16:55:08 -04:00
Johannes Berg
605a0bd66d mac80211: remove IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE flag
I forgot this in the previous patch that made it unused.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-29 16:36:24 -04:00
Al Viro
6f9f489a4e net: missing bits of net-namespace / sysctl
Piss-poor sysctl registration API strikes again, film at 11...
What we really need is _pathname_ required to be present in
already registered table, so that kernel could warn about bad
order.  That's the next target for sysctl stuff (and generally
saner and more explicit order of initialization of ipv[46]
internals wouldn't hurt either).

For the time being, here are full fixups required by ..._rotable()
stuff; we make per-net sysctl sets descendents of "ro" one and
make sure that sufficient skeleton is there before we start registering
per-net sysctls.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-27 04:40:51 -07:00
Linus Torvalds
4836e30078 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (39 commits)
  [PATCH] fix RLIM_NOFILE handling
  [PATCH] get rid of corner case in dup3() entirely
  [PATCH] remove remaining namei_{32,64}.h crap
  [PATCH] get rid of indirect users of namei.h
  [PATCH] get rid of __user_path_lookup_open
  [PATCH] f_count may wrap around
  [PATCH] dup3 fix
  [PATCH] don't pass nameidata to __ncp_lookup_validate()
  [PATCH] don't pass nameidata to gfs2_lookupi()
  [PATCH] new (local) helper: user_path_parent()
  [PATCH] sanitize __user_walk_fd() et.al.
  [PATCH] preparation to __user_walk_fd cleanup
  [PATCH] kill nameidata passing to permission(), rename to inode_permission()
  [PATCH] take noexec checks to very few callers that care
  Re: [PATCH 3/6] vfs: open_exec cleanup
  [patch 4/4] vfs: immutable inode checking cleanup
  [patch 3/4] fat: dont call notify_change
  [patch 2/4] vfs: utimes cleanup
  [patch 1/4] vfs: utimes: move owner check into inode_change_ok()
  [PATCH] vfs: use kstrdup() and check failing allocation
  ...
2008-07-26 20:23:44 -07:00
Linus Torvalds
2284284281 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  netns: fix ip_rt_frag_needed rt_is_expired
  netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences
  netfilter: fix double-free and use-after free
  netfilter: arptables in netns for real
  netfilter: ip{,6}tables_security: fix future section mismatch
  selinux: use nf_register_hooks()
  netfilter: ebtables: use nf_register_hooks()
  Revert "pkt_sched: sch_sfq: dump a real number of flows"
  qeth: use dev->ml_priv instead of dev->priv
  syncookies: Make sure ECN is disabled
  net: drop unused BUG_TRAP()
  net: convert BUG_TRAP to generic WARN_ON
  drivers/net: convert BUG_TRAP to generic WARN_ON
2008-07-26 20:17:56 -07:00
Al Viro
516e0cc564 [PATCH] f_count may wrap around
make it atomic_long_t; while we are at it, get rid of useless checks in affs,
hfs and hpfs - ->open() always has it equal to 1, ->release() - to 0.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:40 -04:00
Al Viro
bd7b1533cd [PATCH] sysctl: make sure that /proc/sys/net/ipv4 appears before per-ns ones
Massage ipv4 initialization - make sure that net.ipv4 appears as
non-per-net-namespace before it shows up in per-net-namespace sysctls.
That's the only change outside of sysctl.c needed to get sane ordering
rules and data structures for sysctls (esp. for procfs side of that
mess).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:10 -04:00
Al Viro
734550921e [PATCH] beginning of sysctl cleanup - ctl_table_set
New object: set of sysctls [currently - root and per-net-ns].
Contains: pointer to parent set, list of tables and "should I see this set?"
method (->is_seen(set)).
Current lists of tables are subsumed by that; net-ns contains such a beast.
->lookup() for ctl_table_root returns pointer to ctl_table_set instead of
that to ->list of that ctl_table_set.

[folded compile fixes by rdd for configs without sysctl]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:08 -04:00
Ilpo Järvinen
547b792cac net: convert BUG_TRAP to generic WARN_ON
Removes legacy reinvent-the-wheel type thing. The generic
machinery integrates much better to automated debugging aids
such as kerneloops.org (and others), and is unambiguous due to
better naming. Non-intuively BUG_TRAP() is actually equal to
WARN_ON() rather than BUG_ON() though some might actually be
promoted to BUG_ON() but I left that to future.

I could make at least one BUILD_BUG_ON conversion.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-25 21:43:18 -07:00
Linus Torvalds
1ff8419871 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  ipsec: ipcomp - Decompress into frags if necessary
  ipsec: ipcomp - Merge IPComp implementations
  pkt_sched: Fix locking in shutdown_scheduler_queue()
2008-07-25 17:40:16 -07:00
Harvey Harrison
8b5ac31e27 include: use get/put_unaligned_* helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:26 -07:00
Herbert Xu
6fccab671f ipsec: ipcomp - Merge IPComp implementations
This patch merges the IPv4/IPv6 IPComp implementations since most
of the code is identical.  As a result future enhancements will no
longer need to be duplicated.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-25 02:54:40 -07:00
Krzysztof Hałasa
efa415840d Remove bogus variables from syncppp.[ch]
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
2008-07-23 23:00:31 +02:00
Stephen Hemminger
3d0f24a74e ipv6: icmp6_dst_gc return change
Change icmp6_dst_gc to return the one value the caller cares about rather
than using call by reference.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:35:50 -07:00
Stephen Hemminger
417f28bb34 netns: dont alloc ipv6 fib timer list
FIB timer list is a trivial size structure, avoid indirection and just
put it in existing ns.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:33:45 -07:00
Adrian Bunk
888c848ed3 ipv6: make struct ipv6_devconf static
struct ipv6_devconf can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:21:58 -07:00
Adrian Bunk
abd0b198ea sctp: make sctp_outq_flush() static
sctp_outq_flush() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:20:45 -07:00
Krzysztof Piotr Oledzki
584015727a netfilter: accounting rework: ct_extend + 64bit counters (v4)
Initially netfilter has had 64bit counters for conntrack-based accounting, but
it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are
still required, for example for "connbytes" extension. However, 64bit counters
waste a lot of memory and it was not possible to enable/disable it runtime.

This patch:
 - reimplements accounting with respect to the extension infrastructure,
 - makes one global version of seq_print_acct() instead of two seq_print_counters(),
 - makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n),
 - makes it possible to enable/disable it at runtime by sysctl or sysfs,
 - extends counters from 32bit to 64bit,
 - renames ip_conntrack_counter -> nf_conn_counter,
 - enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT),
 - set initial accounting enable state based on CONFIG_NF_CT_ACCT
 - removes buggy IPCT_COUNTER_FILLING event handling.

If accounting is enabled newly created connections get additional acct extend.
Old connections are not changed as it is not possible to add a ct_extend area
to confirmed conntrack. Accounting is performed for all connections with
acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct".

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 10:10:58 -07:00
Krzysztof Piotr Oledzki
07a7c1070e netlink: add NLA_PUT_BE64 macro
Add NLA_PUT_BE64 macro required for 64bit counters in netfilter

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 10:10:58 -07:00
David S. Miller
3a682fbd73 pkt_sched: Fix build with NET_SCHED disabled.
The stab bits can't be referenced uniless the full
packet scheduler layer is enabled.

Reported by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 18:13:01 -07:00
Jussi Kivilinna
175f9c1bba net_sched: Add size table for qdiscs
Add size table functions for qdiscs and calculate packet size in
qdisc_enqueue().

Based on patch by Patrick McHardy
 http://marc.info/?l=linux-netdev&m=115201979221729&w=2

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:47 -07:00
Jussi Kivilinna
0abf77e55a net_sched: Add accessor function for packet length for qdiscs
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:27 -07:00
Jussi Kivilinna
5f86173bdf net_sched: Add qdisc_enqueue wrapper
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:04 -07:00
YOSHIFUJI Hideaki
230b183921 net: Use standard structures for generic socket address structures.
Use sockaddr_storage{} for generic socket address storage
and ensures proper alignment.
Use sockaddr{} for pointers to omit several casts.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 22:35:47 -07:00
David S. Miller
407d819cf0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2008-07-19 00:30:39 -07:00
Denis V. Lunev
7abbcd6a4c ipv6: remove unused macros from net/ipv6.h
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:29:42 -07:00
Denis V. Lunev
725a8ff04a ipv6: remove unused parameter from ip6_ra_control
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:28:58 -07:00
Adam Langley
33ad798c92 tcp: options clean up
This should fix the following bugs:
  * Connections with MD5 signatures produce invalid packets whenever SACK
    options are included
  * MD5 signatures are counted twice in the MSS calculations

Behaviour changes:
  * A SYN with MD5 + SACK + TS elicits a SYNACK with MD5 + SACK

    This is because we can't fit any SACK blocks in a packet with MD5 + TS
    options. There was discussion about disabling SACK rather than TS in
    order to fit in better with old, buggy kernels, but that was deemed to
    be unnecessary.

  * SYNs with MD5 don't include a TS option

    See above.

Additionally, it removes a bunch of duplicated logic for calculating options,
which should help avoid these sort of issues in the future.

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:04:31 -07:00
Adam Langley
49a72dfb88 tcp: Fix MD5 signatures for non-linear skbs
Currently, the MD5 code assumes that the SKBs are linear and, in the case
that they aren't, happily goes off and hashes off the end of the SKB and
into random memory.

Reported by Stephen Hemminger in [1]. Advice thanks to Stephen and Evgeniy
Polyakov. Also includes a couple of missed route_caps from Stephen's patch
in [2].

[1] http://marc.info/?l=linux-netdev&m=121445989106145&w=2
[2] http://marc.info/?l=linux-netdev&m=121459157816964&w=2

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:01:42 -07:00
Harvey Harrison
336d3262df sctp: remove unnecessary byteshifting, calculate directly in big-endian
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:07:09 -07:00
Vlad Yasevich
7dab83de50 sctp: Support ipv6only AF_INET6 sockets.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:05:40 -07:00
Stephen Hemminger
c1e20f7c8b tcp: RTT metrics scaling
Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in
kernel units (jiffies) and this leaks out through the netlink API to
user space where the units for jiffies are unknown.

This patches changes the kernel to convert to/from milliseconds. This
changes the ABI, but milliseconds seemed like the most natural unit
for these parameters.  Values available via syscall in
/proc/net/rt_cache and netlink will be in milliseconds.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:02:15 -07:00
David S. Miller
72b25a913e pkt_sched: Get rid of u32_list.
The u32_list is just an indirect way of maintaining a reference
to a U32 node on a per-qdisc basis.

Just add an explicit node pointer for u32 to struct Qdisc an do
away with this global list.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 20:54:17 -07:00
Pavel Emelyanov
923c6586b0 mib: put icmpmsg statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:04:22 -07:00
Pavel Emelyanov
b60538a0d7 mib: put icmp statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:04:02 -07:00
Pavel Emelyanov
386019d351 mib: put udplite statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:45 -07:00
Pavel Emelyanov
2f275f91a4 mib: put udp statistics on struct net
Similar to... ouch, I repeat myself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:27 -07:00
Pavel Emelyanov
61a7e26028 mib: put net statistics on struct net
Similar to ip and tcp ones :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:08 -07:00
Pavel Emelyanov
a20f5799ca mib: put ip statistics on struct net
Similar to tcp one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:02:42 -07:00
Pavel Emelyanov
57ef42d59d mib: put tcp statistics on struct net
Proc temporary uses stats from init_net.

BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:02:08 -07:00
Pavel Emelyanov
852566f53c mib: add netns/mib.h file
The only structure declared within is the netns_mib, which will
carry all our mibs within. I didn't put the mibs in the existing
netns_xxx structures to make it possible to mark this one as
properly aligned and get in a separate "read-mostly" cache-line.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:01:24 -07:00
David S. Miller
93245dd6d3 pkt_sched: Don't used locked skb_queue_purge() in __qdisc_reset_queue()
We have to have exclusive access to the given qdisc anyways, so
doing even more locking is superfluous.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:32 -07:00
David S. Miller
8387400092 pkt_sched: Kill netdev_queue lock.
We can simply use the qdisc->q.lock for all of the
qdisc tree synchronization.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:30 -07:00
David S. Miller
c7e4f3bbb4 pkt_sched: Kill qdisc_lock_tree and qdisc_unlock_tree.
No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:29 -07:00
David S. Miller
78a5b30b73 pkt_sched: Rework {sch,tbf}_tree_lock().
Make sch_tree_lock() lock the qdisc's root.  All of the
users hold the RTNL semaphore and the root qdisc is not
changing.

Implement tbf_tree_{lock,unlock}() simply in terms of
sch_tree_{lock,unlock}().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:28 -07:00
David S. Miller
37437bb2e1 pkt_sched: Schedule qdiscs instead of netdev_queue.
When we have shared qdiscs, packets come out of the qdiscs
for multiple transmit queues.

Therefore it doesn't make any sense to schedule the transmit
queue when logically we cannot know ahead of time the TX
queue of the SKB that the qdisc->dequeue() will give us.

Just for sanity I added a BUG check to make sure we never
get into a state where the noop_qdisc is scheduled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:20 -07:00
David S. Miller
7698b4fcab pkt_sched: Add and use qdisc_root() and qdisc_root_lock().
When code wants to lock the qdisc tree state, the logic
operation it's doing is locking the top-level qdisc that
sits of the root of the netdev_queue.

Add qdisc_root_lock() to represent this and convert the
easiest cases.

In order for this to work out in all cases, we have to
hook up the noop_qdisc to a dummy netdev_queue.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:19 -07:00
David S. Miller
e2627c8c22 pkt_sched: Make QDISC_RUNNING a qdisc state.
Currently it is associated with a netdev_queue, but when we have
qdisc sharing that no longer makes any sense.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:18 -07:00
David S. Miller
d3b753db7c pkt_sched: Move gso_skb into Qdisc.
We liberate any dangling gso_skb during qdisc destruction.

It really only matters for the root qdisc.  But when qdiscs
can be shared by multiple netdev_queue objects, we can't
have the gso_skb in the netdev_queue any more.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:18 -07:00
David S. Miller
51cb6db0f5 mac80211: Reimplement WME using ->select_queue().
The only behavior change is that we do not drop packets under any
circumstances.  If that is absolutely needed, we could easily add it
back.

With cleanups and help from Johannes Berg.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:12 -07:00
David S. Miller
fd2ea0a79f net: Use queue aware tests throughout.
This effectively "flips the switch" by making the core networking
and multiqueue-aware drivers use the new TX multiqueue structures.

Non-multiqueue drivers need no changes.  The interfaces they use such
as netif_stop_queue() degenerate into an operation on TX queue zero.
So everything "just works" for them.

Code that really wants to do "X" to all TX queues now invokes a
routine that does so, such as netif_tx_wake_all_queues(),
netif_tx_stop_all_queues(), etc.

pktgen and netpoll required a little bit more surgery than the others.

In particular the pktgen changes, whilst functional, could be largely
improved.  The initial check in pktgen_xmit() will sometimes check the
wrong queue, which is mostly harmless.  The thing to do is probably to
invoke fill_packet() earlier.

The bulk of the netpoll changes is to make the code operate solely on
the TX queue indicated by by the SKB queue mapping.

Setting of the SKB queue mapping is entirely confined inside of
net/core/dev.c:dev_pick_tx().  If we end up needing any kind of
special semantics (drops, for example) it will be implemented here.

Finally, we now have a "real_num_tx_queues" which is where the driver
indicates how many TX queues are actually active.

With IGB changes from Jeff Kirsher.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:07 -07:00
David S. Miller
e8a0464cc9 netdev: Allocate multiple queues for TX.
alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.

Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces.  This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.

Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:00 -07:00
Neil Horman
9a6d276e85 core: add stat to track unresolved discards in neighbor cache
in __neigh_event_send, if we have a neighbour entry which is in
NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour
to the neighbours arp_queue, which is default capped to a length of 3
skbs.  If that queue exceeds its set length, it will drop an skb on
the queue to enqueue the newly arrived skb.  This results in a drop
for which we have no statistics incremented.  This patch adds an
unresolved_discards stat to /proc/net/stat/ndisc_cache to track these
lost frames.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:50:49 -07:00
Pavel Emelyanov
ed88098e25 mib: add net to NET_ADD_STATS_USER
Done with NET_XXX_STATS macros :)

To be continued...

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:32:45 -07:00
Pavel Emelyanov
f2bf415cfe mib: add net to NET_ADD_STATS_BH
This one is tricky. 

The thing is that this macro is only used when killing tw buckets, 
but since this killer is promiscuous wrt to which net each particular
tw belongs to, I have to use it only when NET_NS is off. When the net
namespaces are on, I use the INET_INC_STATS_BH for each bucket.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:32:25 -07:00
Pavel Emelyanov
6f67c817fc mib: add net to NET_INC_STATS_USER
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:31:39 -07:00
Pavel Emelyanov
de0744af1f mib: add net to NET_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:31:16 -07:00
Pavel Emelyanov
4e6734447d mib: add net to NET_INC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:30:14 -07:00
Pavel Emelyanov
5c52ba170f sock: add net to prot->enter_memory_pressure callback
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't
have where to get the net from.

I decided to add a sk argument, not the net itself, only to factor
all the required sock_net(sk) calls inside the enter_memory_pressure 
callback itself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:28:10 -07:00
Pavel Emelyanov
cf1100a7a4 mib: add net to TCP_ADD_STATS_USER
Now we're done with the TCP_XXX_STATS macros.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:27:38 -07:00
Pavel Emelyanov
74688e487a mib: add net to TCP_DEC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:46 -07:00
Pavel Emelyanov
63231bddf6 mib: add net to TCP_INC_STATS_BH
Same as before - the sock is always there to get the net from,
but there are also some places with the net already saved on 
the stack.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:25 -07:00
Pavel Emelyanov
81cc8a75d9 mib: add net to TCP_INC_STATS
Fortunately (almost) all the TCP code has a sock to get the net from :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:04 -07:00
Pavel Emelyanov
a9c19329ec tcp: add net to tcp_mib_init
This one sets TCP MIBs after zeroing them, and thus requires
the net.

The existing single caller can use init_net (temporarily).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:21:42 -07:00
Pavel Emelyanov
f10f84314d mib: drop unused TCP_XXX_STATS macros
TCP_INC_STATS_USER and TCP_ADD_STATS_BH are currently unused.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:21:20 -07:00
Pavel Emelyanov
c5346fe396 mib: add net to IP_ADD_STATS_BH
Very simple - only ip_evictor (fragments) requires such.
This patch ends up the IP_XXX_STATS patching.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:20:33 -07:00
Pavel Emelyanov
7c73a6faff mib: add net to IP_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:20:11 -07:00
Pavel Emelyanov
5e38e27044 mib: add net to IP_INC_STATS
All the callers already have either the net itself, or the place
where to get it from.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:19:49 -07:00
Pavel Emelyanov
c6f8f7e3bb mib: drop unused IP_INC_STATS_USER
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:19:26 -07:00
Pavel Emelyanov
f66ac03d49 mib: add struct net to ICMPMSGIN_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:31 -07:00
Pavel Emelyanov
903fc1964e mib: add struct net to ICMPMSGOUT_INC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:30 -07:00
Pavel Emelyanov
dcfc23cac1 mib: add struct net to ICMP_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:29 -07:00
Pavel Emelyanov
75c939bb4d mib: add struct net to ICMP_INC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:28 -07:00
Pavel Emelyanov
43589aa93c icmp: drop unused MIB accounting wrappers
There are ICMP_XXX_STATS that are not used in the kernel, so I remove
them, not to "just patch" them later. But if there's some sense in
keeping them, kick me - I will remake this set keeping them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:26 -07:00
Pavel Emelyanov
0388b00426 icmp: add struct net argument to icmp_out_count
This routine deals with ICMP statistics, but doesn't have a
struct net at hands, so add one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:13 -07:00
Allan Stephens
0ea522416b tipc: Remove unneeded parameter to tipc_createport_raw()
This patch eliminates an unneeded parameter when creating a low-level
TIPC port object.  Instead of returning both the pointer to the port
structure and the port's reference ID, it now returns only the pointer
since the port structure contains the reference ID as one of its fields.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 22:42:19 -07:00
David S. Miller
fc943b12e4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-07-14 20:40:34 -07:00
David S. Miller
4c88949800 netfilter: Let nf_ct_kill() callers know if del_timer() returned true.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 20:22:38 -07:00
Johannes Berg
f434b2d111 mac80211: fix struct ieee80211_tx_queue_params
Multiple issues:
 - there are no "default" values needed
 - cw_min/cw_max can be larger than documented
 - restructure to decrease size
 - use get_unaligned_le16

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-14 14:52:57 -04:00
Johannes Berg
f591fa5dbb mac80211: fix TX sequence numbers
This patch makes mac80211 assign proper sequence numbers to
QoS-data frames. It also removes the old sequence number code
because we noticed that only the driver or hardware can assign
sequence numbers to non-QoS-data and especially management
frames in a race-free manner because beacons aren't passed
through mac80211's TX path.

This patch also adds temporary code to the rt2x00 drivers to
not break them completely, that code will have to be reworked
for proper sequence numbers on beacons.

It also moves sequence number assignment down in the TX path
so no sequence numbers are assigned to frames that are dropped.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-14 14:52:57 -04:00
Johannes Berg
9d139c810a mac80211: revamp beacon configuration
This patch changes mac80211's beacon configuration handling
to never pass skbs to the driver directly but rather always
require the driver to use ieee80211_beacon_get(). Additionally,
it introduces "change flags" on the config_interface() call
to enable drivers to figure out what is changing. Finally, it
removes the beacon_update() driver callback in favour of
having IBSS beacon delivered by ieee80211_beacon_get() as well.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-14 14:30:07 -04:00
Samuel Ortiz
49292d5635 mac80211: power management wext hooks
This patch implements the power management routines wireless extensions
for mac80211.
For now we only support switching PS mode between on and off.

Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-14 14:30:06 -04:00
Marcel Holtmann
8b6b3da765 [Bluetooth] Store remote modem status for RFCOMM TTY
When switching a RFCOMM socket to a TTY, the remote modem status might
be needed later. Currently it is lost since the original configuration
is done via the socket interface. So store the modem status and reply
it when the socket has been converted to a TTY.

Signed-off-by: Denis Kenzior <denis.kenzior@trolltech.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:52 +02:00
Marcel Holtmann
3241ad820d [Bluetooth] Add timestamp support to L2CAP, RFCOMM and SCO
Enable the common timestamp functionality that the network subsystem
provides for L2CAP, RFCOMM and SCO sockets. It is possible to either
use SO_TIMESTAMP or the IOCTLs to retrieve the timestamp of the
current packet.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:50 +02:00
Marcel Holtmann
40be492fe4 [Bluetooth] Export details about authentication requirements
With the Simple Pairing support, the authentication requirements are
an explicit setting during the bonding process. Track and enforce the
requirements and allow higher layers like L2CAP and RFCOMM to increase
them if needed.

This patch introduces a new IOCTL that allows to query the current
authentication requirements. It is also possible to detect Simple
Pairing support in the kernel this way.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:50 +02:00
Marcel Holtmann
769be974d0 [Bluetooth] Use ACL config stage to retrieve remote features
The Bluetooth technology introduces new features on a regular basis
and for some of them it is important that the hardware on both sides
support them. For features like Simple Pairing it is important that
the host stacks on both sides have switched this feature on. To make
valid decisions, a config stage during ACL link establishment has been
introduced that retrieves remote features and if needed also the remote
extended features (known as remote host features) before signalling
this link as connected.

This change introduces full reference counting of incoming and outgoing
ACL links and the Bluetooth core will disconnect both if no owner of it
is present. To better handle interoperability during the pairing phase
the disconnect timeout for incoming connections has been increased to
10 seconds. This is five times more than for outgoing connections.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:49 +02:00
Marcel Holtmann
41a96212b3 [Bluetooth] Track status of remote Simple Pairing mode
The Simple Pairing process can only be used if both sides have the
support enabled in the host stack. The current Bluetooth specification
has three ways to detect this support.

If an Extended Inquiry Result has been sent during inquiry then it
is safe to assume that Simple Pairing is enabled. It is not allowed
to enable Extended Inquiry without Simple Pairing. During the remote
name request phase a notification with the remote host supported
features will be sent to indicate Simple Pairing support. Also the
second page of the remote extended features can indicate support for
Simple Pairing.

For all three cases the value of remote Simple Pairing mode is stored
in the inquiry cache for later use.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:48 +02:00
Marcel Holtmann
333140b57f [Bluetooth] Track status of Simple Pairing mode
The Simple Pairing feature is optional and needs to be enabled by the
host stack first. The Linux kernel relies on the Bluetooth daemon to
either enable or disable it, but at any time it needs to know the
current state of the Simple Pairing mode. So track any changes made
by external entities and store the current mode in the HCI device
structure.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:48 +02:00
Marcel Holtmann
0493684ed2 [Bluetooth] Disable disconnect timer during Simple Pairing
During the Simple Pairing process the HCI disconnect timer must be
disabled. The way to do this is by holding a reference count of the
HCI connection. The Simple Pairing process on both sides starts with
an IO Capabilities Request and ends with Simple Pairing Complete.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:48 +02:00
Marcel Holtmann
e4e8e37c42 [Bluetooth] Make use of the default link policy settings
The Bluetooth specification supports the default link policy settings
on a per host controller basis. For every new connection the link
manager would then use these settings. It is better to use this instead
of bothering the controller on every connection setup to overwrite the
default settings.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:47 +02:00
Marcel Holtmann
a8746417e8 [Bluetooth] Track connection packet type changes
The connection packet type can be changed after the connection has been
established and thus needs to be properly tracked to ensure that the
host stack has always correct and valid information about it.

On incoming connections the Bluetooth core switches the supported packet
types to the configured list for this controller. However the usefulness
of this feature has been questioned a lot. The general consent is that
every Bluetooth host stack should enable as many packet types as the
hardware actually supports and leave the decision to the link manager
software running on the Bluetooth chip.

When running on Bluetooth 2.0 or later hardware, don't change the packet
type for incoming connections anymore. This hardware likely supports
Enhanced Data Rate and thus leave it completely up to the link manager
to pick the best packet type.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:46 +02:00
Marcel Holtmann
9719f8afce [Bluetooth] Disconnect when encryption gets disabled
The Bluetooth specification allows to enable or disable the encryption
of an ACL link at any time by either the peer or the remote device. If
a L2CAP or RFCOMM connection requested an encrypted link, they will now
disconnect that link if the encryption gets disabled. Higher protocols
that don't care about encryption (like SDP) are not affected.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:45 +02:00
Marcel Holtmann
77db198056 [Bluetooth] Enforce security for outgoing RFCOMM connections
Recent tests with various Bluetooth headsets have shown that some of
them don't enforce authentication and encryption when connecting. All
of them leave it up to the host stack to enforce it. Non of them should
allow unencrypted connections, but that is how it is. So in case the
link mode settings require authentication and/or encryption it will now
also be enforced on outgoing RFCOMM connections. Previously this was
only done for incoming connections.

This support has a small drawback from a protocol level point of view
since the host stack can't really tell with 100% certainty if a remote
side is already authenticated or not. So if both sides are configured
to enforce authentication it will be requested twice. Most Bluetooth
chips are caching this information and thus no extra authentication
procedure has to be triggered over-the-air, but it can happen.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2008-07-14 20:13:45 +02:00
David S. Miller
79d16385c7 netdev: Move atomic queue state bits into netdev_queue.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 23:14:46 -07:00
David S. Miller
eb6aafe3f8 pkt_sched: Make qdisc_run take a netdev_queue.
This allows us to use this calling convention all the way down into
qdisc_restart().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 23:12:38 -07:00
David S. Miller
052979499c pkt_sched: Add qdisc_tx_is_noop() helper and use in IPV6.
This indicates if the NOOP scheduler is what is active for TX on a
given device.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 23:01:27 -07:00
David S. Miller
6fa9864b53 net: Clean up explicit ->tx_queue references in link watch.
First, we add a qdisc_tx_changing() helper which returns true if the
qdisc attachment is in transition.

Second, we remove an assertion warning which is of limited value and
is hard to express precisely in a multiqueue environment.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 23:01:06 -07:00
David S. Miller
3e745dd695 pkt_sched: Add qdisc_all_tx_empty()
This is a helper function, currently used by IRDA.

This is being added so that we can contain and isolate as many
explicit ->tx_queue references in the tree as possible.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 23:00:25 -07:00
David S. Miller
5aa709954a pkt_sched: Add qdisc_reset_all_tx().
Isolate callers that want to simply reset all the TX qdiscs from the
details of TX queues.

Use this in the ISDN code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 22:59:10 -07:00
David S. Miller
68dfb42798 pkt_sched: Kill stats_lock member of struct Qdisc.
It is always equal to qdisc->dev_queue->lock

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 22:57:31 -07:00
David S. Miller
b0e1e6462d netdev: Move rest of qdisc state into struct netdev_queue
Now qdisc, qdisc_sleeping, and qdisc_list also live there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 17:42:10 -07:00
David S. Miller
5ce2d488fe pkt_sched: Remove 'dev' member of struct Qdisc.
It can be obtained via the netdev_queue.  So create a helper routine,
qdisc_dev(), to make the transformations nicer looking.

Now, qdisc_alloc() now no longer needs a net_device pointer argument.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 17:06:30 -07:00
David S. Miller
bb949fbd18 netdev: Create netdev_queue abstraction.
A netdev_queue is an entity managed by a qdisc.

Currently there is one RX and one TX queue, and a netdev_queue merely
contains a backpointer to the net_device.

The Qdisc struct is augmented with a netdev_queue pointer as well.

Eventually the 'dev' Qdisc member will go away and we will have the
resulting hierarchy:

	net_device --> netdev_queue --> Qdisc

Also, qdisc_alloc() and qdisc_create_dflt() now take a netdev_queue
pointer argument.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 16:55:56 -07:00
Randy Dunlap
6ef307bc56 mac80211: fix lots of kernel-doc
Fix more than 50 kernel-doc warnings in ieee80211/mac80211 kernel-doc notation.
Fix a few typos also.

Note: Some fields are marked as TBD and need to have their description
corrected.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-08 14:16:03 -04:00
Ron Rindjunsky
429a380571 mac80211: add block ack request capability
This patch adds block ack request capability

Signed-off-by: Ester Kummer <ester.kummer@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-08 10:21:34 -04:00
Pablo Neira Ayuso
b891c5a831 netfilter: nf_conntrack: add allocation flag to nf_conntrack_alloc
ctnetlink does not need to allocate the conntrack entries with GFP_ATOMIC
as its code is executed in user context.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08 02:35:55 -07:00
Patrick McHardy
fb0305ce1b net-sched: consolidate default fifo qdisc setup
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:40:21 -07:00
Patrick McHardy
6fe1c7a555 net-sched: add dynamically sized qdisc class hash helpers
Currently all qdiscs which allow to create classes uses a fixed sized hash
table with size 16 to hash the classes. This causes a large bottleneck
when using thousands of classes and unbound filters.

Add helpers for dynamically sized class hashes to fix this. The following
patches will convert the qdiscs to use them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 23:21:31 -07:00
David S. Miller
ea2aca084b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	Documentation/feature-removal-schedule.txt
	drivers/net/wan/hdlc_fr.c
	drivers/net/wireless/iwlwifi/iwl-4965.c
	drivers/net/wireless/iwlwifi/iwl3945-base.c
2008-07-05 23:08:07 -07:00
David S. Miller
f3032be921 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-07-05 21:41:53 -07:00
Patrick McHardy
70c03b49b8 vlan: Add GVRP support
Add GVRP support for dynamically registering VLANs with switches.

By default GVRP is disabled because we only support the applicant-only
participant model, which means it should not be enabled on vlans that
are members of a bridge. Since there is currently no way to cleanly
determine that, the user is responsible for enabling it.

The code is pretty small and low impact, its wrapped in a config
option though because it depends on the GARP implementation and
the STP core.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:26:57 -07:00
Patrick McHardy
eca9ebac65 net: Add GARP applicant-only participant
Add an implementation of the GARP (Generic Attribute Registration Protocol)
applicant-only participant. This will be used by the following patch to
add GVRP support to the VLAN code.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:26:13 -07:00
Patrick McHardy
a19800d704 net: Add STP demux layer
Add small STP demux layer for demuxing STP PDUs based on MAC address.
This is needed to run both GARP and STP in parallel (or even load the
modules) since both use LLC_SAP_BSPAN.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:25:39 -07:00
Pavel Emelyanov
ef28d1a20f MIB: add struct net to UDP6_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:19:40 -07:00
Pavel Emelyanov
235b9f7ac5 MIB: add struct net to UDP6_INC_STATS_USER
As simple as the patch #1 in this set.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:19:20 -07:00
Pavel Emelyanov
0283328e23 MIB: add struct net to UDP_INC_STATS_BH
Two special cases here - one is rxrpc - I put init_net there
explicitly, since we haven't touched this part yet. The second
place is in __udp4_lib_rcv - we already have a struct net there,
but I have to move its initialization above to make it ready
at the "drop" label.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:18:48 -07:00
Pavel Emelyanov
629ca23c33 MIB: add struct net to UDP_INC_STATS_USER
Nothing special - all the places already have a struct sock
at hands, so use the sock_net() net.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 21:18:07 -07:00
Denis V. Lunev
e84f84f276 netns: place rt_genid into struct net
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:04:32 -07:00
Denis V. Lunev
9f5e97e536 netns: make rt_secret_rebuild timer per namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:02:59 -07:00
Denis V. Lunev
39a23e7508 netns: register net.ipv4.route.flush in each namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:02:33 -07:00
Denis V. Lunev
ae299fc051 net: add fib_rules_ops to flush_cache method
This is required to pass namespace context into rt_cache_flush called from
->flush_cache.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:01:28 -07:00
Denis V. Lunev
76e6ebfb40 netns: add namespace parameter to rt_cache_flush
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-05 19:00:44 -07:00
Patrick McHardy
ff31ab56c0 net-sched: change tcf_destroy_chain() to clear start of filter list
Pass double tcf_proto pointers to tcf_destroy_chain() to make it
clear the start of the filter list for more consistency.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-01 19:52:38 -07:00
Tomas Winkler
06ff47bc95 mac80211: add spectrum capabilities
This patch add spectrum capability and required information
elements to association request providing AP has requested it and
it is supported by the driver

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 17:37:34 -04:00
Emmanuel Grumbach
23976efedd mac80211: don't accept WEP keys other than WEP40 and WEP104
This patch makes mac80211 refuse a WEP key whose length is not WEP40 nor
WEP104.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-30 15:43:53 -04:00
David S. Miller
28f49d8fec Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-06-28 22:57:58 -07:00
David S. Miller
1b63ba8a86 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/iwlwifi/iwl4965-base.c
2008-06-28 01:19:40 -07:00
Ivo van Doorn
428da76523 mac80211: Add RTNL warning for workqueue
The workqueue provided by mac80211 should not be used for
scheduled tasks that acquire the RTNL lock. This could be done
when the driver uses the function ieee80211_iterate_active_interfaces()
within the scheduled work. Such behavior will end in locking
dependencies problems when an interface is being removed.

This patch will add a notification about the RTNL locking and
the mac80211 workqueue to prevent driver developers from
blindly using it.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:20 -04:00
Luis R. Rodriguez
ffd7891dc9 mac80211: Let drivers have access to TKIP key offets for TX and RX MIC
Some drivers may want to to use the TKIP key offsets for TX and RX
MIC so lets move this out. Lets also clear up a bit how this is used
internally in mac80211.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 09:09:17 -04:00
John W. Linville
1839cea91e Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/wireless-2.6 2008-06-25 15:17:58 -04:00
Eric W. Biederman
b9f75f45a6 netns: Don't receive new packets in a dead network namespace.
Alexey Dobriyan <adobriyan@gmail.com> writes:
> Subject: ICMP sockets destruction vs ICMP packets oops

> After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
> icmp_reply() wants ICMP socket.
>
> Steps to reproduce:
>
> 	launch shell in new netns
> 	move real NIC to netns
> 	setup routing
> 	ping -i 0
> 	exit from shell
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
> PGD 17f3cd067 PUD 17f3ce067 PMD 0 
> Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU 0 
> Modules linked in: usblp usbcore
> Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
> RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
> RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
> RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
> RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
> R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
> FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
> Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
>  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
>  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
> Call Trace:
>  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
>  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
>  [<ffffffff803fd645>] icmp_echo+0x65/0x70
>  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
>  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
>  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
>  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
>  [<ffffffff803be57d>] process_backlog+0x9d/0x100
>  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
>  [<ffffffff80237be5>] __do_softirq+0x75/0x100
>  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
>  [<ffffffff8020f085>] do_softirq+0x65/0xa0
>  [<ffffffff80237af7>] irq_exit+0x97/0xa0
>  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
>  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
>  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
>  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
>  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
>  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
>  [<ffffffff8040f380>] ? rest_init+0x70/0x80
> Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
> 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
> 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
> RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
>  RSP <ffffffff8057fc30>
> CR2: 0000000000000000
> ---[ end trace ea161157b76b33e8 ]---
> Kernel panic - not syncing: Aiee, killing interrupt handler!

Receiving packets while we are cleaning up a network namespace is a
racy proposition. It is possible when the packet arrives that we have
removed some but not all of the state we need to fully process it.  We
have the choice of either playing wack-a-mole with the cleanup routines
or simply dropping packets when we don't have a network namespace to
handle them.

Since the check looks inexpensive in netif_receive_skb let's just
drop the incoming packets.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 22:16:51 -07:00
YOSHIFUJI Hideaki
f630e43a21 ipv6: Drop packets for loopback address from outside of the box.
[ Based upon original report and patch by Karsten Keil.  Karsten
  has verified that this fixes the TAHI test case "ICMPv6 test
  v6LC.5.1.2 Part F". -DaveM ]

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:33:57 -07:00
Vlad Yasevich
2e3216cd54 sctp: Follow security requirement of responding with 1 packet
RFC 4960, Section 11.4. Protection of Non-SCTP-Capable Hosts

When an SCTP stack receives a packet containing multiple control or
DATA chunks and the processing of the packet requires the sending of
multiple chunks in response, the sender of the response chunk(s) MUST
NOT send more than one packet.  If bundling is supported, multiple
response chunks that fit into a single packet MAY be bundled together
into one single response packet.  If bundling is not supported, then
the sender MUST NOT send more than one response chunk and MUST
discard all other responses.  Note that this rule does NOT apply to a
SACK chunk, since a SACK chunk is, in itself, a response to DATA and
a SACK does not require a response of more DATA.

We implement this by not servicing our outqueue until we reach the end
of the packet.  This enables maximum bundling.  We also identify
'response' chunks and make sure that we only send 1 packet when sending
such chunks.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:08:18 -07:00
David S. Miller
0344f1c66b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	net/mac80211/tx.c
2008-06-19 16:00:04 -07:00
David S. Miller
972692e0db net: Add sk_set_socket() helper.
In order to more easily grep for all things that set
sk->sk_socket, add sk_set_socket() helper inline function.

Suggested (although only half-seriously) by Evgeniy Polyakov.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 22:41:38 -07:00
Eric Dumazet
cb61cb9b8b udp: sk_drops handling
In commits 33c732c361 ([IPV4]: Add raw
drops counter) and a92aa318b4 ([IPV6]:
Add raw drops counter), Wang Chen added raw drops counter for
/proc/net/raw & /proc/net/raw6

This patch adds this capability to UDP sockets too (/proc/net/udp &
/proc/net/udp6).

This means that 'RcvbufErrors' errors found in /proc/net/snmp can be also
be examined for each udp socket.

# grep Udp: /proc/net/snmp
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 23971006 75 899420 16390693 146348 0

# cat /proc/net/udp
 sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt  ---
uid  timeout inode ref pointer drops
 75: 00000000:02CB 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2358 2 ffff81082a538c80 0
111: 00000000:006F 00000000:0000 07 00000000:00000000 00:00000000 00000000  ---
  0        0 2286 2 ffff81042dd35c80 146348

In this example, only port 111 (0x006F) was flooded by messages that
user program could not read fast enough. 146348 messages were lost.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 21:04:56 -07:00
Bernard Pidoux
fe2c802ab6 rose: improving AX25 routing frames via ROSE network
ROSE network is organized through nodes connected via hamradio or Internet.
AX25 packet radio frames sent to a remote ROSE address destination are routed
through these nodes.

Without the present patch, automatic routing mechanism did not work optimally
due to an improper parameter checking.

rose_get_neigh() function is called either by rose_connect() or by
rose_route_frame().

In the case of a call from rose_connect(), f0 timer is checked to find if a connection
is already pending. In that case it returns the address of the neighbour, or returns a NULL otherwise.

When called by rose_route_frame() the purpose was to route a packet AX25 frame
through an adjacent node given a destination rose address.
However, in that case, t0 timer checked does not indicate if the adjacent node
is actually connected even if the timer is not null. Thus, for each frame sent, the
function often tried to start a new connexion even if the adjacent node was already connected.

The patch adds a "new" parameter that is true when the function is called by
rose route_frame().
This instructs rose_get_neigh() to check node parameter "restarted". 
If restarted is true it means that the route to the destination address is opened via a neighbour
node already connected.
If "restarted" is false the function returns a NULL.
In that case the calling function will initiate a new connection as before.

This results in a fast routing of frames, from nodes to nodes, until
destination is reached, as originaly specified by ROSE protocole.

Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 17:08:32 -07:00
Patrick McHardy
68b80f1138 netfilter: nf_nat: fix RCU races
Fix three ct_extend/NAT extension related races:

- When cleaning up the extension area and removing it from the bysource hash,
  the nat->ct pointer must not be set to NULL since it may still be used in
  a RCU read side

- When replacing a NAT extension area in the bysource hash, the nat->ct
  pointer must be assigned before performing the replacement

- When reallocating extension storage in ct_extend, the old memory must
  not be freed immediately since it may still be used by a RCU read side

Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315
and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 15:51:47 -07:00
David S. Miller
338db08551 net: Kill SOCK_SLEEP_PRE and SOCK_SLEEP_POST, no users.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 01:09:00 -07:00
David S. Miller
8ce9c6ede1 sctp: Kill SCTP_SOCK_SLEEP_{PRE,POST}, unused.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 00:40:36 -07:00
David S. Miller
ccc580571c wext: Emit event stream entries correctly when compat.
Three major portions to this change:

1) Add IW_EV_COMPAT_LCP_LEN, IW_EV_COMPAT_POINT_OFF,
   and IW_EV_COMPAT_POINT_LEN helper defines.

2) Delete iw_stream_check_add_*(), they are unused.

3) Add iw_request_info argument to iwe_stream_add_*(), and use it to
   size the event and pointer lengths correctly depending upon whether
   IW_REQUEST_FLAG_COMPAT is set or not.

4) The mechanical transformations to the drivers and wireless stack
   bits to get the iw_request_info passed down into the routines
   modified in #3.  Also, explicit references to IW_EV_LCP_LEN are
   replaced with iwe_stream_lcp_len(info).

With a lot of help and bug fixes from Masakazu Mokuno.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:50:49 -07:00
David S. Miller
0f5cabba49 wext: Create IW_REQUEST_FLAG_COMPAT and set it as needed.
Now low-level WEXT ioctl handlers can do compat handling
when necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:34:49 -07:00
David S. Miller
87de87d5e4 wext: Dispatch and handle compat ioctls entirely in net/wireless/wext.c
Next we can kill the hacks in fs/compat_ioctl.c and also
dispatch compat ioctls down into the driver and 80211 protocol
helper layers in order to handle iw_point objects embedded in
stream replies which need to be translated.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 18:32:46 -07:00
Pavel Emelyanov
0b4419162a netns: introduce the net_hash_mix "salt" for hashes
There are many possible ways to add this "salt", thus I made this
patch to be the last in the series to change it if required.

Currently I propose to use the struct net pointer itself as this 
salt, but since this pointer is most often cache-line aligned, shift 
this right to eliminate the bits, that are most often zeroed.

After this, simply add this mix to prepared hashfn-s.

For CONFIG_NET_NS=n case this salt is 0 and no changes in hashfn
appear.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:14:11 -07:00
Pavel Emelyanov
33de014c63 inet6: add struct net argument to inet6_ehashfn
Same as for inet_hashfn, prepare its ipv6 incarnation.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:13:48 -07:00
Pavel Emelyanov
9f26b3add3 inet: add struct net argument to inet_ehashfn
Although this hash takes addresses into account, the ehash chains
can also be too long when, for instance, communications via lo occur.
So, prepare the inet_hashfn to take struct net into account.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:13:27 -07:00
Pavel Emelyanov
2086a65078 inet: add struct net argument to inet_lhashfn
Listening-on-one-port sockets in many namespaces produce long 
chains in the listening_hash-es, so prepare the inet_lhashfn to 
take struct net into account.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:13:08 -07:00
Pavel Emelyanov
7f635ab71e inet: add struct net argument to inet_bhashfn
Binding to some port in many namespaces may create too long
chains in bhash-es, so prepare the hashfn to take struct net
into account.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-16 17:12:49 -07:00
David S. Miller
942e7b102a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-06-14 17:15:39 -07:00
Brian Haley
7d06b2e053 net: change proto destroy method to return void
Change struct proto destroy function pointer to return void.  Noticed
by Al Viro.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-14 17:04:49 -07:00
Harvey Harrison
6693be7124 mac80211: add utility function to get header length
Take a __le16 directly rather than a host-endian value.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:13 -04:00
Harvey Harrison
c9c6950c14 mac80211: make ieee80211_get_hdrlen_from_skb return unsigned
Many callers already expect it to.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-14 12:18:12 -04:00
David S. Miller
4ae127d1b6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/smc911x.c
2008-06-13 20:52:39 -07:00
Richard Kennedy
875ec4333b udp: reorder udp_iter_state to remove padding on 64bit builds
reorder udp_iter_state to remove padding on 64bit builds

shrinks from 24 to 16 bytes, moving to a smaller slab when
CONFIG_NET_NS is undefined & seq_net_private = {}

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-13 03:03:17 -07:00
David S. Miller
ec0a196626 tcp: Revert 'process defer accept as established' changes.
This reverts two changesets, ec3c0982a2
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0adb
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").

This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.

Ilpo Järvinen first spotted the locking problems.  The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.

Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock.  The normal ordering is parent listening
socket --> child socket, but this code path would require the
reverse lock ordering.

Next is a problem noticed by Vitaliy Gusev, he noted:

----------------------------------------
>--- a/net/ipv4/tcp_timer.c
>+++ b/net/ipv4/tcp_timer.c
>@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
> 		goto death;
> 	}
>
>+	if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) {
>+		tcp_send_active_reset(sk, GFP_ATOMIC);
>+		goto death;

Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------

Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:

----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------

So revert this thing for now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-12 16:34:35 -07:00
David S. Miller
e6e30add6b Merge branch 'net-next-2.6-misc-20080612a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-next 2008-06-11 22:33:59 -07:00
Adrian Bunk
0b04082995 net: remove CVS keywords
This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-11 21:00:38 -07:00
YOSHIFUJI Hideaki
9501f97229 tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().
As we do for other socket/timewait-socket specific parameters,
let the callers pass appropriate arguments to
tcp_v{4,6}_do_calc_md5_hash().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 03:46:30 +09:00
YOSHIFUJI Hideaki
8d26d76dd4 tcp md5sig: Share most of hash calcucaltion bits between IPv4 and IPv6.
We can share most part of the hash calculation code because
the only difference between IPv4 and IPv6 is their pseudo headers.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:20 +09:00
YOSHIFUJI Hideaki
076fb72233 tcp md5sig: Remove redundant protocol argument.
Protocol is always TCP, so remove useless protocol argument.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:19 +09:00
YOSHIFUJI Hideaki
7d5d5525bd tcp md5sig: Share MD5 Signature option parser between IPv4 and IPv6.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:18 +09:00
Benjamin Thery
3de232554a ipv6 netns: Address labels per namespace
This pacth makes IPv6 address labels per network namespace.
It keeps the global label tables, ip6addrlbl_table, but
adds a 'net' member to each ip6addrlbl_entry.
This new member is taken into account when matching labels.

Changelog
=========
* v1: Initial version
* v2:
  * Minize the penalty when network namespaces are not configured:
      *  the 'net' member is added only if CONFIG_NET_NS is
         defined. This saves space when network namespaces are not
         configured.
      * 'net' value is retrieved with the inlined function
         ip6addrlbl_net() that always return &init_net when
         CONFIG_NET_NS is not defined.
  * 'net' member in ip6addrlbl_entry renamed to the less generic
    'lbl_net' name (helps code search).

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:15 +09:00
Rami Rosen
0399e5f07a ipv6 addrconf: Remove IFA_GLOBAL definition from include/net/if_inet6.h.
This patches removes IFA_GLOBAL definition from linux/include/net/if_inet6.h
as it is unused.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-12 02:38:13 +09:00
Arnaldo Carvalho de Melo
ce4a7d0d48 inet{6}_request_sock: Init ->opt and ->pktopts in the constructor
Wei Yongjun noticed that we may call reqsk_free on request sock objects where
the opt fields may not be initialized, fix it by introducing inet_reqsk_alloc
where we initialize ->opt to NULL and set ->pktopts to NULL in
inet6_reqsk_alloc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-10 12:39:35 -07:00
Rami Rosen
45d465bc23 ipv4: Remove unused declaration from include/net/tcp.h.
- The tcp_unhash() method in /include/net/tcp.h is no more needed, as the
unhash method in tcp_prot structure is now inet_unhash (instead of
tcp_unhash in the
past); see tcp_prot structure in net/ipv4/tcp_ipv4.c.

- So, this patch removes tcp_unhash() declaration from include/net/tcp.h

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-10 12:37:42 -07:00
David S. Miller
65b53e4cc9 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/tg3.c
	drivers/net/wireless/rt2x00/rt2x00dev.c
	net/mac80211/ieee80211_i.h
2008-06-10 02:22:26 -07:00
David S. Miller
788c0a5316 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-next-2.6
Conflicts:

	drivers/net/ps3_gelic_wireless.c
	drivers/net/wireless/libertas/main.c
2008-06-10 01:54:31 -07:00
Rami Rosen
7bcd978e8c netfilter: nf_conntrack: remove unnecessary function declaration
This patch removes nf_ct_ipv4_ct_gather_frags() method declaration from
include/net/netfilter/ipv4/nf_conntrack_ipv4.h, since it is unused in
the Linux kernel.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 16:00:22 -07:00
Fabian Hugelshofer
718d4ad98e netfilter: nf_conntrack: properly account terminating packets
Currently the last packet of a connection isn't accounted when its causing
abnormal termination.

Introduces nf_ct_kill_acct() which increments the accounting counters on
conntrack kill. The new function was necessary, because there are calls
to nf_ct_kill() which don't need accounting:

nf_conntrack_proto_tcp.c line ~847:
Kills ct and returns NF_REPEAT. We don't want to count twice.

nf_conntrack_proto_tcp.c line ~880:
Kills ct and returns NF_DROP. I think we don't want to count dropped
packets.

nf_conntrack_netlink.c line ~824:
As far as I can see ctnetlink_del_conntrack() is used to destroy a
conntrack on behalf of the user. There is an sk_buff, but I don't think
this is an actual packet. Incrementing counters here is therefore not
desired.

Signed-off-by: Fabian Hugelshofer <hugelshofer2006@gmx.ch>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:59:40 -07:00
Patrick McHardy
51091764f2 netfilter: nf_conntrack: add nf_ct_kill()
Encapsulate the common

	if (del_timer(&ct->timeout))
		ct->timeout.function((unsigned long)ct)

sequence in a new function.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:59:06 -07:00
James Morris
17e6e59f0a netfilter: ip6_tables: add ip6tables security table
This is a port of the IPv4 security table for IPv6.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:58:05 -07:00
James Morris
560ee653b6 netfilter: ip_tables: add iptables security table for mandatory access control rules
The following patch implements a new "security" table for iptables, so
that MAC (SELinux etc.) networking rules can be managed separately to
standard DAC rules.

This is to help with distro integration of the new secmark-based
network controls, per various previous discussions.

The need for a separate table arises from the fact that existing tools
and usage of iptables will likely clash with centralized MAC policy
management.

The SECMARK and CONNSECMARK targets will still be valid in the mangle
table to prevent breakage of existing users.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-09 15:57:24 -07:00
Vlad Yasevich
b9031d9d87 sctp: Fix ECN markings for IPv6
Commit e9df2e8fd8 ("[IPV6]: Use
appropriate sock tclass setting for routing lookup.") also changed the
way that ECN capable transports mark this capability in IPv6.  As a
result, SCTP was not marking ECN capablity because the traffic class
was never set.  This patch brings back the markings for IPv6 traffic.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:40:15 -07:00
Vlad Yasevich
62aeaff5cc sctp: Start T3-RTX timer when fast retransmitting lowest TSN
When we are trying to fast retransmit the lowest outstanding TSN, we
need to restart the T3-RTX timer, so that subsequent timeouts will
correctly tag all the packets necessary for retransmissions.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:39:11 -07:00
Vlad Yasevich
a646523481 sctp: Correctly implement Fast Recovery cwnd manipulations.
Correctly keep track of Fast Recovery state and do not reduce
congestion window multiple times during sucht state.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:38:43 -07:00
David S. Miller
aed5a833fb Merge branch 'net-2.6-misc-20080605a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-fix 2008-06-04 12:10:21 -07:00
Denis V. Lunev
36d926b94a [IPV6]: inet_sk(sk)->cork.opt leak
IPv6 UDP sockets wth IPv4 mapped address use udp_sendmsg to send the data
actually. In this case ip_flush_pending_frames should be called instead
of ip6_flush_pending_frames.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:38 +09:00
YOSHIFUJI Hideaki
91e1908f56 [IPV6] NETNS: Handle ancillary data in appropriate namespace.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:36 +09:00
YOSHIFUJI Hideaki
4bed72e4f5 [IPV6] ADDRCONF: Allow longer lifetime on 64bit archs.
- Allow longer lifetimes (>= 0x7fffffff/HZ) on 64bit archs
  by using unsigned long.
- Shadow this arithmetic overflow workaround by introducing
  helper functions: addrconf_timeout_fixup() and
  addrconf_finite_timeout().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:34 +09:00
YOSHIFUJI Hideaki
e51171019b [SCTP]: Fix NULL dereference of asoc.
Commit 7cbca67c07 ("[IPV6]: Support
Source Address Selection API (RFC5014)") introduced NULL dereference
of asoc to sctp_v6_get_saddr in net/sctp/ipv6.c.
Pointed out by Johann Felix Soden <johfel@users.sourceforge.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:30 +09:00
Thomas Graf
bc3ed28caa netlink: Improve returned error codes
Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and
nla_nest_cancel() void functions.

Return -EMSGSIZE instead of -1 if the provided message buffer is not
big enough.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:36:54 -07:00
Emmanuel Grumbach
9306102ea5 mac80211: allow disable FAT in specific configurations
This patch allows to disable FAT channel in specific configurations.

For example the configuration (8, +1), (primary channel 8, extension
channel 12) isn't permitted in U.S., but (8, -1), (primary channel 8,
extension channel 4) is. When FAT channel configuration is not
permitted, FAT channel should be reported as not supported in the
capabilities of the HT IE in association request. And sssociation is
performed on 20Mhz channel.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-03 15:00:26 -04:00
David S. Miller
43154d08d6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/cpmac.c
	net/mac80211/mlme.c
2008-05-25 23:26:10 -07:00
Thomas Graf
b9a2f2e450 netlink: Fix nla_parse_nested_compat() to call nla_parse() directly
The purpose of nla_parse_nested_compat() is to parse attributes which
contain a struct followed by a stream of nested attributes.  So far,
it called nla_parse_nested() to parse the stream of nested attributes
which was wrong, as nla_parse_nested() expects a container attribute
as data which holds the attribute stream.  It needs to call
nla_parse() directly while pointing at the next possible alignment
point after the struct in the beginning of the attribute.

With this patch, I can no longer reproduce the reported leftover
warnings.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-22 10:48:59 -07:00
Johannes Berg
e253008360 mac80211: use multi-queue master netdevice
This patch updates mac80211 and drivers to be multi-queue aware and
use that instead of the internal queue mapping. Also does a number
of cleanups in various pieces of the code that fall out and reduces
internal mac80211 state size.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-21 21:48:14 -04:00
Johannes Berg
eefce91a38 mac80211: dont allow fragmentation and requeuing on A-MPDU queues
There really is no reason for a driver to reject a frame on
an A-MPDU queue when it can stop that queue for any period
of time and is given frames one by one. Hence, disallow it
with a big warning and reduce mac80211-internal state.

Also add a warning when we try to fragment a frame destined
for an A-MPDU queue and drop it, the actual bug needs to be
fixed elsewhere but I'm not exactly sure how to yet.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-21 21:48:13 -04:00
Johannes Berg
e039fa4a41 mac80211: move TX info into skb->cb
This patch converts mac80211 and all drivers to have transmit
information and status in skb->cb rather than allocating extra
memory for it and copying all the data around. To make it fit,
a union is used where only data that is necessary for all steps
is kept outside of the union.

A number of fixes were done by Ivo, as well as the rt2x00 part
of this patch.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-21 21:48:11 -04:00
Johannes Berg
2e92e6f2c5 mac80211: use rate index in TX control
This patch modifies struct ieee80211_tx_control to give band
info and the rate index (instead of rate pointers) to drivers.
This mostly serves to reduce the TX control structure size to
make it fit into skb->cb so that the fragmentation code can
put it there and we can think about passing it to drivers that
way in the future.

The rt2x00 driver update was done by Ivo, thanks.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-21 21:48:09 -04:00
Johannes Berg
36d6825b91 mac80211: let drivers wake but not start queues
Having drivers start queues is just confusing, their ->start()
callback can block and do whatever is necessary, so let mac80211
start queues and have drivers wake queues when necessary (to get
packets flowing again right away.)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-21 21:48:08 -04:00
Pavel Emelyanov
3dca02af38 ip6tnl: Use on-device stats instead of private ones.
This tunnel uses its own private structure and requires separate
patch to switch from private stats to on-device ones.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 14:17:05 -07:00
Pavel Emelyanov
f56dd017c3 tunnels: Remove stat member from ip_tunnel struct.
All users already use on-device statistics, so this field can be
safely removed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 14:16:36 -07:00
David S. Miller
44dc19c829 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-05-19 16:29:40 -07:00
YOSHIFUJI Hideaki
0686caa35e ndisc: Add missing strategies for per-device retrans timer/reachable time settings.
Noticed from Al Viro <viro@ftp.linux.org.uk> via David Miller
<davem@davemloft.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-19 16:25:42 -07:00
Pavel Emelyanov
d62c612ef8 netns: Introduce sysctl root for read-only net sysctls.
This one stores all ctl-heads in one list and restricts the
permissions not give write access to non-init net namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-19 13:45:33 -07:00
Ivo van Doorn
2f561feb38 mac80211: Add RTNL version of ieee80211_iterate_active_interfaces
Since commit e38bad4766
	mac80211: make ieee80211_iterate_active_interfaces not need rtnl
rt2500usb and rt73usb broke down due to attempting register access
in atomic context (which is not possible for USB hardware).

This patch restores ieee80211_iterate_active_interfaces() to use RTNL lock,
and provides the non-RTNL version under a new name:
	ieee80211_iterate_active_interfaces_atomic()

So far only rt2x00 uses ieee80211_iterate_active_interfaces(), and those
drivers require the RTNL version of ieee80211_iterate_active_interfaces().
Since they already call that function directly, this patch will automatically
fix the USB rt2x00 drivers.

v2: Rename ieee80211_iterate_active_interfaces_rtnl

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-16 17:15:09 -04:00
David S. Miller
f42a44494b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-05-15 00:52:37 -07:00
David S. Miller
63fe46da9c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/iwlwifi/iwl-4965-rs.c
	drivers/net/wireless/rt2x00/rt61pci.c
2008-05-15 00:34:44 -07:00
Eric Van Hensbergen
887b3ece65 9p: fix error path during early mount
There was some cleanup issues during early mount which would trigger
a kernel bug for certain types of failure.  This patch reorganizes the
cleanup to get rid of the bad behavior.

This also merges the 9pnet and 9pnet_fd modules for the purpose of
configuration and initialization.  Keeping the fd transport separate
from the core 9pnet code seemed like a good idea at the time, but in
practice has caused more harm and confusion than good.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:27 -05:00
Eric Van Hensbergen
ee443996a3 9p: Documentation updates
The kernel-doc comments of much of the 9p system have been in disarray since
reorganization.  This patch fixes those problems, adds additional documentation
and a template book which collects the 9p information.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:25 -05:00
Bruno Randolf
566bfe5a8b mac80211: use hardware flags for signal/noise units
trying to clean up the signal/noise code. the previous code in mac80211 had
confusing names for the related variables, did not have much definition of
what units of signal and noise were provided and used implicit mechanisms from
the wireless extensions.

this patch introduces hardware capability flags to let the hardware specify
clearly if it can provide signal and noise level values and which units it can
provide. this also anticipates possible new units like RCPI in the future.

for signal:

  IEEE80211_HW_SIGNAL_UNSPEC - unspecified, unknown, hw specific
  IEEE80211_HW_SIGNAL_DB     - dB difference to unspecified reference point
  IEEE80211_HW_SIGNAL_DBM    - dBm, difference to 1mW

for noise we currently only have dBm:

  IEEE80211_HW_NOISE_DBM     - dBm, difference to 1mW

if IEEE80211_HW_SIGNAL_UNSPEC or IEEE80211_HW_SIGNAL_DB is used the driver has
to provide the maximum value (max_signal) it reports in order for applications
to make sense of the signal values.

i tried my best to find out for each driver what it can provide and update it
but i'm not sure (?) for some of them and used the more conservative guess in
doubt. this can be fixed easily after this patch has been merged by changing
the hardware flags of the driver.

DRIVER          SIGNAL    MAX	NOISE   QUAL
-----------------------------------------------------------------
adm8211         unspec(?) 100   n/a     missing
at76_usb        unspec(?) (?)   unused  missing
ath5k           dBm             dBm     percent rssi
b43legacy       dBm             dBm     percent jssi(?)
b43             dBm             dBm     percent jssi(?)
iwl-3945        dBm             dBm     percent snr+more
iwl-4965        dBm             dBm     percent snr+more
p54             unspec    127   n/a     missing
rt2x00          dBm	        n/a     percent rssi+tx/rx frame success
  rt2400        dBm             n/a
  rt2500pci     dBm             n/a
  rt2500usb     dBm             n/a
  rt61pci       dBm             n/a
  rt73usb       dBm             n/a
rtl8180         unspec(?) 65    n/a     (?)
rtl8187         unspec(?) 65    (?)     noise(?)
zd1211          dB(?)     100   n/a     percent

drivers/net/wireless/ath5k/base.c:      Changes-licensed-under: 3-Clause-BSD

Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-14 16:29:49 -04:00
Graf Yang
332223831e irda: Fix a misalign access issue. (v2)
Replace u16ho with put/get_unaligned functions

Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-13 23:25:57 -07:00
Allan Stephens
7ef43ebaa5 tipc: Fix race condition when creating socket or native port
This patch eliminates the (very remote) chance of a crash resulting
from a partially initialized socket or native port unexpectedly
receiving a message.  Now, during the creation of a socket or native
port, the underlying generic port's lock is not released until all
initialization required to handle incoming messages has been done.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 15:42:28 -07:00
David S. Miller
4951704b4e syncppp: Fix crashes.
The syncppp layer wants a mid-level netdev private pointer.

It was using netdev->priv but that only worked by accident,
and thus this scheme was broken when the device private
allocation strategy changed.

Add a proper mid-layer private pointer for uses like this,
update syncppp and all users, and remove the HDLC_PPP broken
tag from drivers/net/wan/Kconfig

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 03:29:11 -07:00
Neil Horman
20c2c1fd6c sctp: add sctp/remaddr table to complete RFC remote address table OID
Add support for RFC3873 remote address table OID.

      +--(5) sctpAssocRemAddrTable
      |   |
      |   |--(-) sctpAssocId (shared index)
      |   |
      |   +--(1) sctpAssocRemAddrType (index)
      .   |
      .   +--(2) sctpAssocRemAddr (index)
      .   |
          +--(3) sctpAssocRemAddrActive
          |
          +--(4) sctpAssocRemAddrHBActive
          |
          +--(5) sctpAssocRemAddrRTO
          |
          +--(6) sctpAssocRemAddrMaxPathRtx
          |
          +--(7) sctpAssocRemAddrRtx
          |
          +--(8) sctpAssocRemAddrStartTime

This patch places all the requsite data in /proc/net/sctp/remaddr.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-09 15:14:50 -07:00
Vlad Yasevich
88a0a948e7 sctp: Support the new specification of sctp_connectx()
The specification of sctp_connectx() has been changed to return
an association id.  We've added a new socket option that will
return the association id as the return value from the setsockopt()
call.  The library that implements sctp_connectx() interface will
implement both socket options.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-09 15:14:11 -07:00
Wei Yongjun
d364d9276b sctp: Bring SCTP_DELAYED_ACK socket option into API compliance
Brings delayed_ack socket option set/get into line with the latest ietf
socket extensions API draft, while maintaining backwards compatibility.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-09 15:13:26 -07:00
Johannes Berg
e100bb64bf mac80211: QoS related cleanups
This
 * makes the queue number passed to drivers a u16
   (as it will be with skb_get_queue_mapping)
 * removes the useless queue number defines
 * splits hw->queues into hw->queues/ampdu_queues
 * removes the debugfs files for per-queue counters
 * removes some dead QoS code
 * removes the beacon queue configuration for IBSS
   so that the drivers now never get a queue number
   bigger than (hw->queues + hw->ampdu_queues - 1)
   for tx and only in the range 0..hw->queues-1 for
   conf_tx.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-07 15:02:26 -04:00
Johannes Berg
36fc6757fe mac80211: remove queue info from ieee80211_tx_status
The queue info in struct ieee80211_tx_status is never used.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-07 15:02:26 -04:00
Johannes Berg
57ffc589a9 mac80211: clean up get_tx_stats callback
The callback takes a ieee80211_tx_queue_stats with a contained
array of ieee80211_tx_queue_stats_data, remove the former, rename
the latter to ieee80211_tx_queue_stats and make tx_stats() take
the array directly.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-07 15:02:26 -04:00
Adrian Bunk
7eafd25d95 remove ieee80211_wx_{get,set}_auth()
After the bcm43xx removal ieee80211_wx_{get,set}_auth() were no longer
used.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-07 15:02:14 -04:00
Adrian Bunk
c12cf21097 remove ieee80211_tx_frame()
After the softmac removal ieee80211_tx_frame() was no longer used.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-07 15:02:14 -04:00
Ivo van Doorn
c6adbd2158 mac80211: Add IEEE80211_KEY_FLAG_PAIRWISE
This adds a new flag to the ieee80211_key_conf structure.
This flag will inform the driver the key is pairwise rather then
a shared key.

This is important for drivers who support both types of keys,
and need to be informed which type of key this is. Alternative
would be drivers checking the address argument of set_key(),
but it will be safer when mac80211 is more explicit.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-07 15:02:11 -04:00
Ivo van Doorn
1c01442058 mac80211: Replace ieee80211_tx_control->key_idx with ieee80211_key_conf
The hw_key_idx inside the ieee80211_key_conf structure does
not provide all the information drivers might need to perform
hardware encryption.

This is in particular true for rt2x00 who needs to know the
key algorithm and whether it is a shared or pairwise key.

By passing the ieee80211_key_conf pointer it assures us that
drivers can make full use of all information that it should know
about a particular key.

Additionally this patch updates all drivers to grab the hw_key_idx from
the ieee80211_key_conf structure.

v2: Removed bogus u16 cast
v3: Add warning about ieee80211_tx_control pointers
v4: Update warning about ieee80211_tx_control pointers

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-07 15:02:11 -04:00
Satoru SATOH
0bbeafd011 ip: Make use of the inline function dst_metric_locked()
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-04 22:12:43 -07:00
Marcin Slusarz
41fef0ee7b xfrm: convert empty xfrm_audit_* macros to functions
it removes these warnings when CONFIG_AUDITSYSCALL is unset:

net/xfrm/xfrm_user.c: In function 'xfrm_add_sa':
net/xfrm/xfrm_user.c:412: warning: unused variable 'sid'
net/xfrm/xfrm_user.c:411: warning: unused variable 'sessionid'
net/xfrm/xfrm_user.c:410: warning: unused variable 'loginuid'
net/xfrm/xfrm_user.c: In function 'xfrm_del_sa':
net/xfrm/xfrm_user.c:485: warning: unused variable 'sid'
net/xfrm/xfrm_user.c:484: warning: unused variable 'sessionid'
net/xfrm/xfrm_user.c:483: warning: unused variable 'loginuid'
net/xfrm/xfrm_user.c: In function 'xfrm_add_policy':
net/xfrm/xfrm_user.c:1132: warning: unused variable 'sid'
net/xfrm/xfrm_user.c:1131: warning: unused variable 'sessionid'
net/xfrm/xfrm_user.c:1130: warning: unused variable 'loginuid'
net/xfrm/xfrm_user.c: In function 'xfrm_get_policy':
net/xfrm/xfrm_user.c:1382: warning: unused variable 'sid'
net/xfrm/xfrm_user.c:1381: warning: unused variable 'sessionid'
net/xfrm/xfrm_user.c:1380: warning: unused variable 'loginuid'
net/xfrm/xfrm_user.c: In function 'xfrm_add_pol_expire':
net/xfrm/xfrm_user.c:1620: warning: unused variable 'sid'
net/xfrm/xfrm_user.c:1619: warning: unused variable 'sessionid'
net/xfrm/xfrm_user.c:1618: warning: unused variable 'loginuid'
net/xfrm/xfrm_user.c: In function 'xfrm_add_sa_expire':
net/xfrm/xfrm_user.c:1658: warning: unused variable 'sid'
net/xfrm/xfrm_user.c:1657: warning: unused variable 'sessionid'
net/xfrm/xfrm_user.c:1656: warning: unused variable 'loginuid'

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-03 21:03:01 -07:00
Linus Torvalds
95dfec6ae1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (53 commits)
  tcp: Overflow bug in Vegas
  [IPv4] UFO: prevent generation of chained skb destined to UFO device
  iwlwifi: move the selects to the tristate drivers
  ipv4: annotate a few functions __init in ipconfig.c
  atm: ambassador: vcc_sf semaphore to mutex
  MAINTAINERS: The socketcan-core list is subscribers-only.
  netfilter: nf_conntrack: padding breaks conntrack hash on ARM
  ipv4: Update MTU to all related cache entries in ip_rt_frag_needed()
  sch_sfq: use del_timer_sync() in sfq_destroy()
  net: Add compat support for getsockopt (MCAST_MSFILTER)
  net: Several cleanups for the setsockopt compat support.
  ipvs: fix oops in backup for fwmark conn templates
  bridge: kernel panic when unloading bridge module
  bridge: fix error handling in br_add_if()
  netfilter: {nfnetlink,ip,ip6}_queue: fix skb_over_panic when enlarging packets
  netfilter: x_tables: fix net namespace leak when reading /proc/net/xxx_tables_names
  netfilter: xt_TCPOPTSTRIP: signed tcphoff for ipv6_skip_exthdr() retval
  tcp: Limit cwnd growth when deferring for GSO
  tcp: Allow send-limited cwnd to grow up to max_burst when gso disabled
  [netdrvr] gianfar: Determine TBIPA value dynamically
  ...
2008-04-30 08:45:48 -07:00
Linus Torvalds
9781db7b34 Merge branch 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] new predicate - AUDIT_FILETYPE
  [patch 2/2] Use find_task_by_vpid in audit code
  [patch 1/2] audit: let userspace fully control TTY input auditing
  [PATCH 2/2] audit: fix sparse shadowed variable warnings
  [PATCH 1/2] audit: move extern declarations to audit.h
  Audit: MAINTAINERS update
  Audit: increase the maximum length of the key field
  Audit: standardize string audit interfaces
  Audit: stop deadlock from signals under load
  Audit: save audit_backlog_limit audit messages in case auditd comes back
  Audit: collect sessionid in netlink messages
  Audit: end printk with newline
2008-04-29 11:41:22 -07:00
Philip Craig
443a70d50b netfilter: nf_conntrack: padding breaks conntrack hash on ARM
commit 0794935e "[NETFILTER]: nf_conntrack: optimize hash_conntrack()"
results in ARM platforms hashing uninitialised padding.  This padding
doesn't exist on other architectures.

Fix this by replacing NF_CT_TUPLE_U_BLANK() with memset() to ensure
everything is initialised.  There were only 4 bytes that
NF_CT_TUPLE_U_BLANK() wasn't clearing anyway (or 12 bytes on ARM).

Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-29 03:35:10 -07:00
Timo Teras
0010e46577 ipv4: Update MTU to all related cache entries in ip_rt_frag_needed()
Add struct net_device parameter to ip_rt_frag_needed() and update MTU to
cache entries where ifindex is specified. This is similar to what is
already done in ip_rt_redirect().

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-29 03:32:25 -07:00
David L Stevens
42908c69f6 net: Add compat support for getsockopt (MCAST_MSFILTER)
This patch adds support for getsockopt for MCAST_MSFILTER for
both IPv4 and IPv6. It depends on the previous setsockopt patch,
and uses the same method.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-29 03:23:22 -07:00
Julian Anastasov
2ad17defd5 ipvs: fix oops in backup for fwmark conn templates
Fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=10556
where conn templates with protocol=IPPROTO_IP can oops backup box.

        Result from ip_vs_proto_get() should be checked because
protocol value can be invalid or unsupported in backup. But
for valid message we should not fail for templates which use
IPPROTO_IP. Also, add checks to validate message limits and
connection state. Show state NONE for templates using IPPROTO_IP.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-29 03:21:23 -07:00
Linus Torvalds
77a50df2b1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  iwlwifi: Allow building iwl3945 without iwl4965.
  wireless: Fix compile error with wifi & leds
  tcp: Fix slab corruption with ipv6 and tcp6fuzz
  ipv4/ipv6 compat: Fix SSM applications on 64bit kernels.
  [IPSEC]: Use digest_null directly for auth
  sunrpc: fix missing kernel-doc
  can: Fix copy_from_user() results interpretation
  Revert "ipv6: Fix typo in net/ipv6/Kconfig"
  tipc: endianness annotations
  ipv6: result of csum_fold() is already 16bit, no need to cast
  [XFRM] AUDIT: Fix flowlabel text format ambibuity.
2008-04-28 09:44:11 -07:00
Eric Paris
2532386f48 Audit: collect sessionid in netlink messages
Previously I added sessionid output to all audit messages where it was
available but we still didn't know the sessionid of the sender of
netlink messages.  This patch adds that information to netlink messages
so we can audit who sent netlink messages.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-28 06:18:03 -04:00
David L Stevens
dae5029548 ipv4/ipv6 compat: Fix SSM applications on 64bit kernels.
Add support on 64-bit kernels for seting 32-bit compatible MCAST*
socket options.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-27 14:26:53 -07:00
Aurélien Charbon
f15364bd4c IPv6 support for NFS server export caches
This adds IPv6 support to the interfaces that are used to express nfsd
exports.  All addressed are stored internally as IPv6; backwards
compatibility is maintained using mapped addresses.

Thanks to Bruce Fields, Brian Haley, Neil Brown and Hideaki Joshifuji
for comments

Signed-off-by: Aurelien Charbon <aurelien.charbon@bull.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Brian Haley <brian.haley@hp.com>
Cc:  YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:36 -04:00
Herbert Xu
c5d18e984a [IPSEC]: Fix catch-22 with algorithm IDs above 31
As it stands it's impossible to use any authentication algorithms
with an ID above 31 portably.  It just happens to work on x86 but
fails miserably on ppc64.

The reason is that we're using a bit mask to check the algorithm
ID but the mask is only 32 bits wide.

After looking at how this is used in the field, I have concluded
that in the long term we should phase out state matching by IDs
because this is made superfluous by the reqid feature.  For current
applications, the best solution IMHO is to allow all algorithms when
the bit masks are all ~0.

The following patch does exactly that.

This bug was identified by IBM when testing on the ppc64 platform
using the NULL authentication algorithm which has an ID of 251.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-22 00:46:42 -07:00
Pavel Emelyanov
53083773dc [INET]: Uninline the __inet_inherit_port call.
This deblats ~200 bytes when ipv6 and dccp are 'y'.

Besides, this will ease compilation issues for patches
I'm working on to make inet hash tables more scalable 
wrt net namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-17 23:18:15 -07:00
Pavel Emelyanov
e56d8b8a2e [INET]: Drop the inet_inherit_port() call.
As I can see from the code, two places (tcp_v6_syn_recv_sock and
dccp_v6_request_recv_sock) that call this one already run with
BHs disabled, so it's safe to call __inet_inherit_port there.

Besides (in case I missed smth with code review) the calltrace
tcp_v6_syn_recv_sock
 `- tcp_v4_syn_recv_sock
     `- __inet_inherit_port
and the similar for DCCP are valid, but assumes BHs to be disabled.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-17 23:17:34 -07:00
Reinette Chatre
d18ef29f34 mac80211: no BSS changes to driver from beacons processed during scanning
There is no need to send BSS changes to driver from beacons processed
during scanning. We are more interested in beacons from an AP with which
we are associated - these will still be used to send updates to driver as
the beacons are received without scanning.

This change·removes the requirement that bss_info_changed needs to be atomic.
The beacons received during scanning are processed from a tasklet, but if we
do not call bss_info_changed for these beacons there is no need for it to be
atomic. This function (bss_info_changed) is called either from workqueue or
ioctl in all other instances.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-16 15:59:56 -04:00
John Heffner
dd9e0dda66 [TCP]: Increase the max_burst threshold from 3 to tp->reordering.
This change is necessary to allow cwnd to grow during persistent
reordering.  Cwnd moderation is applied when in the disorder state
and an ack that fills the hole comes in.  If the hole was greater
than 3 packets, but less than tp->reordering, cwnd will shrink when
it should not have.

Signed-off-by: John Heffner <jheffner@napa.(none)>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 02:29:56 -07:00
Denis V. Lunev
3661a91083 [NETNS]: Add netns refcnt debug to fib rules.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 02:01:56 -07:00
Denis V. Lunev
65a18ec58e [NETNS]: Add netns refcnt debug for kernel sockets.
Protocol control sockets and netlink kernel sockets should not prevent the
namespace stop request. They are initialized and disposed in a special way by
sk_change_net/sk_release_kernel.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 01:59:46 -07:00
Denis V. Lunev
5d1e4468a7 [NETNS]: Make netns refconting debug like a socket one.
Make release_net/hold_net noop for performance-hungry people. This is a debug
staff and should be used in the debug mode only.

Add check for net != NULL in hold/release calls. This will be required
later on.

[ Added minor simplifications suggested by Brian Haley. -DaveM ]

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 01:58:04 -07:00
Pavel Emelyanov
669f87baab [RTNL]: Introduce the rtnl_kill_links helper.
This one is responsible for calling ->dellink on each net
device found in net to help with vlan net_exit hook in the
nearest future.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:46:52 -07:00
Pavel Emelyanov
dec827d174 [NETNS]: The generic per-net pointers.
Add the elastic array of void * pointer to the struct net.
The access rules are simple:

 1. register the ops with register_pernet_gen_device to get
    the id of your private pointer
 2. call net_assign_generic() to put the private data on the
    struct net (most preferably this should be done in the
    ->init callback of the ops registered)
 3. do not store any private reference on the net_generic array;
 4. do not change this pointer while the net is alive;
 5. use the net_generic() to get the pointer.

When adding a new pointer, I copy the old array, replace it
with a new one and schedule the old for kfree after an RCU
grace period.

Since the net_generic explores the net->gen array inside rcu
read section and once set the net->gen->ptr[x] pointer never 
changes, this grants us a safe access to generic pointers.

Quoting Paul: "... RCU is protecting -only- the net_generic 
structure that net_generic() is traversing, and the [pointer]
returned by net_generic() is protected by a reference counter 
in the upper-level struct net."

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:36:08 -07:00
Pavel Emelyanov
c93cf61fd1 [NETNS]: The net-subsys IDs generator.
To make some per-net generic pointers, we need some way to address
them, i.e. - IDs. This is simple IDA-based IDs generator for pernet
subsystems.

Addressing questions about potential checkpoint/restart problems: 
these IDs are "lite-offsets" within the net structure and are by no 
means supposed to be exported to the userspace.

Since it will be used in the nearest future by devices only (tun,
vlan, tunnels, bridge, etc), I make it resemble the functionality
of register_pernet_device().

The new ids is stored in the *id pointer _before_ calling the init
callback to make this id available in this callback.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:35:23 -07:00
Adrian Bunk
7ef3abd210 [IRDA]: Remove irlan_eth_send_gratuitous_arp()
Even kernel 2.2.26 (sic) already contains the
  #undef CONFIG_IRLAN_SEND_GRATUITOUS_ARP
with the comment "but for some reason the machine crashes if you use DHCP".

Either someone finally looks into this or it's simply time to remove 
this dead code.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:29:24 -07:00
Allan Stephens
0c3141e910 [TIPC]: Overhaul of socket locking logic
This patch modifies TIPC's socket code to follow the same approach
used by other protocols.  This change eliminates the need for a
mutex in the TIPC-specific portion of the socket protocol data
structure -- in its place, the standard Linux socket backlog queue
and associated locking routines are utilized.  These changes fix
a long-standing receive queue bug on SMP systems, and also enable
individual read and write threads to utilize a socket without
unnecessarily interfering with each other.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:22:02 -07:00
David S. Miller
334f8b2afd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6.26 2008-04-14 03:50:43 -07:00
David S. Miller
df39e8ba56 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ehea/ehea_main.c
	drivers/net/wireless/iwlwifi/Kconfig
	drivers/net/wireless/rt2x00/rt61pci.c
	net/ipv4/inet_timewait_sock.c
	net/ipv6/raw.c
	net/mac80211/ieee80211_sta.c
2008-04-14 02:30:23 -07:00
Jan Engelhardt
3c9fba656a [NETFILTER]: nf_conntrack: replace NF_CT_DUMP_TUPLE macro indrection by function call
Directly call IPv4 and IPv6 variants where the address family is
easily known.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:54 +02:00
Jan Engelhardt
f2ea825f48 [NETFILTER]: nf_nat: use bool type in nf_nat_proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt
5f2b4c9006 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_tuple.h
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt
09f263cd39 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_l4proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt
8ce8439a31 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_l3proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Jan Engelhardt
9dbae79178 [NETFILTER]: Remove unused callbacks in nf_conntrack_l3proto
These functions are never called.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
5e8fbe2ac8 [NETFILTER]: nf_conntrack: add tuplehash l3num/protonum accessors
Add accessors for l3num and protonum and get rid of some overly long
expressions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
dd13b01036 [NETFILTER]: nf_nat: kill helper and seq_adjust hooks
Connection tracking helpers (specifically FTP) need to be called
before NAT sequence numbers adjustments are performed to be able
to compare them against previously seen ones. We've introduced
two new hooks around 2.6.11 to maintain this ordering when NAT
modules were changed to get called from conntrack helpers directly.

The cost of netfilter hooks is quite high and sequence number
adjustments are only rarely needed however. Add a RCU-protected
sequence number adjustment function pointer and call it from
IPv4 conntrack after calling the helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
55871d0479 [NETFILTER]: nf_conntrack_extend: warn on confirmed conntracks
New extensions may only be added to unconfirmed conntracks to avoid races
when reallocating the storage.

Also change NF_CT_ASSERT to use WARN_ON to get backtraces.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:51 +02:00
Patrick McHardy
8c87238b72 [NETFILTER]: nf_nat: don't add NAT extension for confirmed conntracks
Adding extensions to confirmed conntracks is not allowed to avoid races
on reallocation. Don't setup NAT for confirmed conntracks in case NAT
module is loaded late.

The has one side-effect, the connections existing before the NAT module
was loaded won't enter the bysource hash. The only case where this actually
makes a difference is in case of SNAT to a multirange where the IP before
NAT is also part of the range. Since old connections don't enter the
bysource hash the first new connection from the IP will have a new address
selected. This shouldn't matter at all.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:51 +02:00
Patrick McHardy
2bc780499a [NETFILTER]: nf_conntrack: add DCCP protocol support
Add DCCP conntrack helper. Thanks to Gerrit Renker <gerrit@erg.abdn.ac.uk>
for review and testing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:49 +02:00
Patrick McHardy
2d2d84c40e [NETFILTER]: nf_nat: remove unused name from struct nf_nat_protocol
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:48 +02:00
Patrick McHardy
535b57c7c1 [NETFILTER]: nf_nat: move NAT ctnetlink helpers to nf_nat_proto_common
Move to nf_nat_proto_common and rename to nf_nat_proto_... since they're
also used by protocols that don't have port numbers.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:47 +02:00
Patrick McHardy
937e0dfd87 [NETFILTER]: nf_nat: add helpers for common NAT protocol operations
Add generic ->in_range and ->unique_tuple ops to avoid duplicating them
again and again for future NAT modules and save a few bytes of text:

net/ipv4/netfilter/nf_nat_proto_tcp.c:
  tcp_in_range     |  -62 (removed)
  tcp_unique_tuple | -259 # 271 -> 12, # inlines: 1 -> 0, size inlines: 7 -> 0
 2 functions changed, 321 bytes removed

net/ipv4/netfilter/nf_nat_proto_udp.c:
  udp_in_range     |  -62 (removed)
  udp_unique_tuple | -259 # 271 -> 12, # inlines: 1 -> 0, size inlines: 7 -> 0
 2 functions changed, 321 bytes removed

net/ipv4/netfilter/nf_nat_proto_gre.c:
  gre_in_range |  -62 (removed)
 1 function changed, 62 bytes removed

vmlinux:
 5 functions changed, 704 bytes removed

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:46 +02:00
Gerrit Renker
7de6c03336 [SKB]: __skb_append = __skb_queue_after
This expresses __skb_append in terms of __skb_queue_after, exploiting that

  __skb_append(old, new, list) = __skb_queue_after(list, old, new).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 00:05:09 -07:00
YOSHIFUJI Hideaki
e9df2e8fd8 [IPV6]: Use appropriate sock tclass setting for routing lookup.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 23:40:51 -07:00
Pavel Emelyanov
0204774191 [NETNS][DCCPV6]: Move the dccp_v6_ctl_sk on the struct net.
And replace all its usage with init_net's socket.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:32:25 -07:00
Pavel Emelyanov
7b1cffa8c9 [NETNS][DCCPV4]: Move the dccp_v4_ctl_sk on the struct net.
And replace all its usage with init_net's socket.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:29:37 -07:00
Pavel Emelyanov
67019cc9ee [NETNS]: Add an empty netns_dccp structure on struct net.
According to the overall struct net design, it will be
filled with DCCP-related members.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:28:42 -07:00
Denis V. Lunev
5f4472c5a6 [TCP]: Remove owner from tcp_seq_afinfo.
Move it to tcp_seq_afinfo->seq_fops as should be.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:13:53 -07:00
Denis V. Lunev
68fcadd16c [TCP]: Place file operations directly into tcp_seq_afinfo.
No need to have separate never-used variable.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:13:30 -07:00
Denis V. Lunev
9427c4b36b [TCP]: Move seq_ops from tcp_iter_state to tcp_seq_afinfo.
No need to create seq_operations for each instance of 'netstat'.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:12:13 -07:00
Denis V. Lunev
a4146b1b2c [TCP]: Replace struct net on tcp_iter_state with seq_net_private.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-13 22:11:14 -07:00
David S. Miller
6fb9114e4b Merge branch 'net-2.6.26-misc-20080412b' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-04-12 19:19:46 -07:00
Paul Moore
00447872a6 NetLabel: Allow passing the LSM domain as a shared pointer
Smack doesn't have the need to create a private copy of the LSM "domain" when
setting NetLabel security attributes like SELinux, however, the current
NetLabel code requires a private copy of the LSM "domain".  This patches fixes
that by letting the LSM determine how it wants to pass the domain value.

 * NETLBL_SECATTR_DOMAIN_CPY
   The current behavior, NetLabel assumes that the domain value is a copy and
   frees it when done

 * NETLBL_SECATTR_DOMAIN
   New, Smack-friendly behavior, NetLabel assumes that the domain value is a
   reference to a string managed by the LSM and does not free it when done

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 19:06:42 -07:00
Vlad Yasevich
ab38fb04c9 [SCTP]: Fix compiler warning about const qualifiers
Fix 3 warnings about discarding const qualifiers:

net/sctp/ulpevent.c:862: warning: passing argument 1 of 'sctp_event2skb' discards qualifiers from pointer target type
net/sctp/sm_statefuns.c:4393: warning: passing argument 1 of 'SCTP_ASOC' discards qualifiers from pointer target type
net/sctp/socket.c:5874: warning: passing argument 1 of 'cmsg_nxthdr' discards qualifiers from pointer target type

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:40:06 -07:00
Gui Jianfeng
f4ad85ca3e [SCTP]: Fix protocol violation when receiving an error lenght INIT-ACK
When receiving an error length INIT-ACK during COOKIE-WAIT,
a 0-vtag ABORT will be responsed. This action violates the
protocol apparently. This patch achieves the following things.
1 If the INIT-ACK contains all the fixed parameters, use init-tag
  recorded from INIT-ACK as vtag.
2 If the INIT-ACK doesn't contain all the fixed parameters,
  just reflect its vtag.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-12 18:39:34 -07:00
YOSHIFUJI Hideaki
7f1eced8b0 [IPV6] MIP6: Use our standard definitions for paddings.
MIP6_OPT_PAD_X are actually for paddings in destination
option header.  Replace them with our standard IPV6_TLV_PADX.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:22 +09:00
YOSHIFUJI Hideaki
f3ee4010e8 [IPV6]: Define constants for link-local multicast addresses.
- Define link-local all-node / all-router multicast addresses.
- Remove ipv6_addr_all_nodes() and ipv6_addr_all_routers().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:19 +09:00
YOSHIFUJI Hideaki
9acd9f3ae9 [IPV6]: Make address arguments const.
- net/ipv6/addrconf.c:
	ipv6_get_ifaddr(), ipv6_dev_get_saddr()
- net/ipv6/mcast.c:
	ipv6_sock_mc_join(), ipv6_sock_mc_drop(),
	inet6_mc_check(),
	ipv6_dev_mc_inc(), __ipv6_dev_mc_dec(), ipv6_dev_mc_dec(),
	ipv6_chk_mcast_addr()
- net/ipv6/route.c:
	rt6_lookup(), icmp6_dst_alloc()
- net/ipv6/ip6_output.c:
	ip6_nd_hdr()
- net/ipv6/ndisc.c:
	ndisc_send_ns(), ndisc_send_rs(), ndisc_send_redirect(),
	ndisc_get_neigh(), __ndisc_send()

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:18 +09:00
YOSHIFUJI Hideaki
dfd982baff [IPV6] ADDRCONF: Uninline ipv6_isatap_eui64().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:17 +09:00
YOSHIFUJI Hideaki
3eb84f4929 [IPV6] ADDRCONF: Uninline ipv6_addr_hash().
The function is only used in net/ipv6/addrconf.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:15 +09:00
YOSHIFUJI Hideaki
fed85383ac [IPV6]: Use XOR and OR rather than mutiple ands for ipv6 address comparisons.
ipv6_addr_equal(), ipv6_addr_v4mapped(),
ipv6_addr_is_ll_all_{nodes,routers}(),
ipv6_masked_addr_cmp()

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-12 13:43:14 +09:00
Florian Westphal
4dfc281702 [Syncookies]: Add support for TCP options via timestamps.
Allow the use of SACK and window scaling when syncookies are used
and the client supports tcp timestamps. Options are encoded into
the timestamp sent in the syn-ack and restored from the timestamp
echo when the ack is received.

Based on earlier work by Glenn Griffin.
This patch avoids increasing the size of structs by encoding TCP
options into the least significant bits of the timestamp and
by not using any 'timestamp offset'.

The downside is that the timestamp sent in the packet after the synack
will increase by several seconds.

changes since v1:
 don't duplicate timestamp echo decoding function, put it into ipv4/syncookie.c
 and have ipv6/syncookies.c use it.
 Feedback from Glenn Griffin: fix line indented with spaces, kill redundant if ()

Reviewed-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 03:12:40 -07:00
Rami Rosen
5c06f510a2 [IPV6]: Remove unused declarations in include/net/ip6_route.h.
1) Standlaone ip6_null_entry is no longer needed as it is replaced by
   the ip6_null_entry member of ipv6 (instance of struct netns_ipv6) in
   struct net (as a result of Network Namespaces patches).


2) These 3 methods from this same header are not defined anywhere:
   ip6_rt_addr_add(), ip6_rt_addr_del(), rt6_sndmsg()

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 02:31:20 -07:00
Rami Rosen
3cccd60784 [IPV6] Remove three method declarations in include/net/ndisc.h.
This patch removes two unused method declarations in
include/net/ndisc.h: ndisc_forwarding_on(void) and
ndisc_forwarding_off(void);

Also igmp6_cleanup(void) appears twice in this header, so one
igmp6_cleanup(void) declaration is removed.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 02:01:21 -07:00
Stephen Hemminger
43db6d65e0 socket: sk_filter deinline
The sk_filter function is too big to be inlined. This saves 2296 bytes
of text on allyesconfig.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-10 01:43:09 -07:00
Mohamed Abbas
84363e6e07 mac80211: notify mac from low level driver (iwlwifi)
Add new API to MAC80211 to allow low level driver to
notify MAC with driver status.

Signed-off-by: Mohamed Abbas <mabbas@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 16:44:43 -04:00
Chr
fff7710937 mac80211: add station aid into ieee80211_tx_control
This patch is necessary for the upcoming Accesspoint patch for p54.

Signed-off-by: Christian Lamparter <chunkeey@web.de>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:57 -04:00
Tomas Winkler
21c0cbe760 mac80211: add association capabilty and timing info into bss_conf
This patch adds assocation capability, timestamp (tsf) and beacon interval
to bss_conf. This is required for successful assocation of iwlwifi drivers

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:56 -04:00
Tomas Winkler
38668c059f mac80211: eliminate conf_ht
This patch eliminates the use of conf_ht, replacing it with
bss_info_changed.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-08 15:05:56 -04:00
David S. Miller
8eefca4888 Merge branch 'net-2.6.26-isatap-20080403' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-04-08 02:33:36 -07:00
Ilpo Järvinen
882bebaaca [TCP]: tcp_simple_retransmit can cause S+L
This fixes Bugzilla #10384

tcp_simple_retransmit does L increment without any checking
whatsoever for overflowing S+L when Reno is in use.

The simplest scenario I can currently think of is rather
complex in practice (there might be some more straightforward
cases though). Ie., if mss is reduced during mtu probing, it
may end up marking everything lost and if some duplicate ACKs
arrived prior to that sacked_out will be non-zero as well,
leading to S+L > packets_out, tcp_clean_rtx_queue on the next
cumulative ACK or tcp_fastretrans_alert on the next duplicate
ACK will fix the S counter.

More straightforward (but questionable) solution would be to
just call tcp_reset_reno_sack() in tcp_simple_retransmit but
it would negatively impact the probe's retransmission, ie.,
the retransmissions would not occur if some duplicate ACKs
had arrived.

So I had to add reno sacked_out reseting to CA_Loss state
when the first cumulative ACK arrives (this stale sacked_out
might actually be the explanation for the reports of left_out
overflows in kernel prior to 2.6.23 and S+L overflow reports
of 2.6.24). However, this alone won't be enough to fix kernel
before 2.6.24 because it is building on top of the commit
1b6d427bb7 ([TCP]: Reduce sacked_out with reno when purging
write_queue) to keep the sacked_out from overflowing.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-07 22:33:07 -07:00
Denis V. Lunev
046ee90235 [NETNS]: Create tcp control socket in the each namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:31:33 -07:00
Denis V. Lunev
5677242f43 [NETNS]: Inet control socket should not hold a namespace.
This is a generic requirement, so make inet_ctl_sock_create namespace
aware and create a inet_ctl_sock_destroy wrapper around
sk_release_kernel.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:28:30 -07:00
Denis V. Lunev
eee4fe4ded [INET]: Let inet_ctl_sock_create return sock rather than socket.
All upper protocol layers are already use sock internally.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:27:58 -07:00
Denis V. Lunev
3d58b5fa8e [INET]: Rename inet_csk_ctl_sock_create to inet_ctl_sock_create.
This call is nothing common with INET connection sockets code. It
simply creates an unhashes kernel sockets for protocol messages.

Move the new call into af_inet.c after the rename.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 14:22:32 -07:00
Denis V. Lunev
a4aa834a91 [NETNS]: Declare init_net even without CONFIG_NET defined.
This does not look good, but there is no other choice. The compilation
without CONFIG_NET is broken and can not be fixed with ease.

After that there is no need for the following commits:
1567ca7eec
3edf8fa5cc
2d38f9a4f8

Revert them.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-03 13:04:33 -07:00
David S. Miller
e1ec1b8ccd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/s2io.c
2008-04-02 22:35:23 -07:00
YOSHIFUJI Hideaki
52eeeb8481 [IPV6]: Unify ip6_onlink() and ipip6_onlink().
Both are identical, let's create ipv6_chk_prefix() and use it
in both places.
2008-04-03 10:06:00 +09:00
YOSHIFUJI Hideaki
300aaeeaab [IPV6] SIT: Add SIOCGETPRL ioctl to get/dump PRL.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:06:00 +09:00
Templin, Fred L
fadf6bf060 [IPV6] SIT: Add PRL management for ISATAP.
This patch updates the Linux the Intra-Site Automatic Tunnel Addressing
Protocol (ISATAP) implementation. It places the ISATAP potential router
list (PRL) in the kernel and adds three new private ioctls for PRL
management.

[Add several changes of structure name, constant names etc. - yoshfuji]

Signed-off-by: Fred L. Templin <fred.l.templin@boeing.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-04-03 10:05:58 +09:00
Denis V. Lunev
c0f39322c3 [NETNS]: Do not include net/net_namespace.h from seq_file.h
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-02 00:10:28 -07:00
Denis V. Lunev
225c0a0107 [NETNS]: Merge ifdef CONFIG_NET in include/net/net_namespace.h.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-02 00:09:29 -07:00
Joonwoo Park
f83f1768f8 [LLC]: skb allocation size for responses
Allocate the skb for llc responses with the received packet size by
using the size adjustable llc_frame_alloc.
Don't allocate useless extra payload.
Cleanup magic numbers.

So, this fixes oops.
Reported by Jim Westfall:
kernel: skb_over_panic: text:c0541fc7 len:1000 put:997 head:c166ac00 data:c166ac2f tail:0xc166b017 end:0xc166ac80 dev:eth0
kernel: ------------[ cut here ]------------
kernel: kernel BUG at net/core/skbuff.c:95!

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 21:02:47 -07:00
Pavel Emelyanov
70ee115942 [SOCK][NETNS]: Add the percpu prot_inuse counter in the struct net.
Such an accounting would cost us two more dereferences to get the
percpu variable from the struct net, so I make sock_prot_inuse_get
and _add calls work differently depending on CONFIG_NET_NS - without
it old optimized routines are used.

The per-cpu counter for init_net is prepared in core_initcall, so
that even af_inet, that starts as fs_initcall, will already have the
init_net prepared.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 19:42:16 -07:00
Pavel Emelyanov
c29a0bc4df [SOCK][NETNS]: Add a struct net argument to sock_prot_inuse_add and _get.
This counter is about to become per-proto-and-per-net, so we'll need 
two arguments to determine which cell in this "table" to work with.

All the places, but proc already pass proper net to it - proc will be
tuned a bit later.

Some indentation with spaces in proc files is done to keep the file
coding style consistent.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 19:41:46 -07:00
Pavel Emelyanov
8efa6e93cb [NETNS]: Introduce a netns_core structure.
There's already some stuff on the struct net, that should better
be folded into netns_core structure. I'm making the per-proto inuse 
counter be per-net also, which is also a candidate for this, so 
introduce this structure and populate it a bit.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-31 19:41:14 -07:00
Denis V. Lunev
4ad96d39a2 [UDP]: Remove owner from udp_seq_afinfo.
Move it to udp_seq_afinfo->seq_fops as should be.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 18:25:53 -07:00
Denis V. Lunev
3ba9441bdf [UDP]: Place file operations directly into udp_seq_afinfo.
No need to have separate never-used variable.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 18:25:32 -07:00
Denis V. Lunev
dda61925f8 [UDP]: Move seq_ops from udp_iter_state to udp_seq_afinfo.
No need to create seq_operations for each instance of 'netstat'.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 18:24:26 -07:00
Denis V. Lunev
6f191efe48 [UDP]: Replace struct net on udp_iter_state with seq_net_private.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 18:23:33 -07:00
Pavel Emelyanov
bdcde3d71a [SOCK]: Drop inuse pcounter from struct proto (v2).
An uppercut - do not use the pcounter on struct proto.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:33 -07:00
Pavel Emelyanov
60e7663d46 [SOCK]: Drop per-proto inuse init and fre functions (v2).
Constructive part of the set is finished here. We have to remove the
pcounter, so start with its init and free functions.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:39:10 -07:00
Pavel Emelyanov
1338d466d9 [SOCK]: Introduce a percpu inuse counters array (v2).
And redirect sock_prot_inuse_add and _get to use one.

As far as the dereferences are concerned. Before the patch we made
1 dereference to proto->inuse.add call, the call itself and then
called the __get_cpu_var() on a static variable. After the patch we 
make a direct call, then one dereference to proto->inuse_idx and 
then the same __get_cpu_var() on a still static variable. So this 
patch doesn't seem to produce performance penalty on SMP.

This is not per-net yet, but I will deliberately make NET_NS=y case
separated from NET_NS=n one, since it'll cost us one-or-two more 
dereferences to get the struct net and the inuse counter.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:38:43 -07:00
Pavel Emelyanov
13ff3d6fa4 [SOCK]: Enumerate struct proto-s to facilitate percpu inuse accounting (v2).
The inuse counters are going to become a per-cpu array.  Introduce an
index for this array on the struct proto.

To handle the case of proto register-unregister-register loop the
bitmap is used. All its bits manipulations are protected with
proto_list_lock and a sanity check for the bitmap being exhausted is
also added.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:38:17 -07:00
Joe Perches
bc578a54f0 [NET]: Rename inet_frag.h identifiers COMPLETE, FIRST_IN, LAST_IN to INET_FRAG_*
On Fri, 2008-03-28 at 03:24 -0700, Andrew Morton wrote:
> they should all be renamed.

Done for include/net and net

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:35:27 -07:00
Joonwoo Park
a5a04819c5 [LLC]: station source mac address
kill unnecessary llc_station_mac_sa.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:28:36 -07:00
Rami Rosen
be2ce06b49 [IPV6]: Remove unused method declaration in include/net/addrconf.h.
This patches removes unused declaration of addrconf_forwarding_on() method
in include/net/addrconf.h.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-28 16:26:45 -07:00
David S. Miller
8e8e43843b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/usb/rndis_host.c
	drivers/net/wireless/b43/dma.c
	net/ipv6/ndisc.c
2008-03-27 18:48:56 -07:00
David S. Miller
ed85f2c3b2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-27 18:01:13 -07:00
Ilpo Järvinen
bc09dff198 [SCTP]: Remove sctp_add_cmd_sf wrapper bloat
With a was number of callsites sctp_add_cmd_sf wrapper bloats
kernel by some amount. Due to unlikely tracking allyesconfig,
with the initial result were around ~7kB (thus caught my
attention) while a non-debug config produced only ~2.3kB effect.

I (ij) proposed first a patch to uninline it but Vlad responded
with a patch that removed the only sctp_add_cmd call which is
wrapped by sctp_add_cmd_sf (I wasn't sure if I could do that).
I did minor cleanup to Vlad's patch.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:54:29 -07:00
Ilpo Järvinen
8d3308687f [NET]: uninline dst_release
Codiff stats (allyesconfig, v2.6.24-mm1):
-16420  187 funcs, 103 +, 16523 -, diff: -16420 --- dst_release

Without number of debug related CONFIGs (v2.6.25-rc2-mm1):
-7257  186 funcs, 70 +, 7327 -, diff: -7257 --- dst_release
dst_release                   |  +40

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:53:31 -07:00
Rami Rosen
4f95165d4b [IPV6]: Remove three unused method declarations in include/net/ipv6.h
This patch removes three unused method declarations in include/net/ipv6.h:
inet_getfrag_t(), ipv6_build_nfrag_opts() and ipv6_build_frag_opts().

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 17:39:19 -07:00
Denis V. Lunev
09382bac66 [PKT_SCHED]: Pass real namespace in net scheduler classifiers.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-27 16:53:37 -07:00
Johannes Berg
6c507cd040 cfg80211: don't export ieee80211_get_channel
This patch makes ieee80211_get_channel a static inline defined in
cfg80211's header file which simply calls __ieee80211_get_channel
to avoid symbol clashes with the ieee80211 code.

The problem was pointed out by David Miller, thanks!

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-27 16:03:20 -04:00
Benjamin Thery
60e8fbc4c5 [NETNS][IPV6] flowlabels - make flowlabels per namespace
This patch introduces a new member, fl_net, in struct ip6_flowlabel.
This allows to create labels with the same value in different namespaces.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:53:08 -07:00
Daniel Lezcano
6ab57e7e7f [NETNS][IPV6] anycast - handle several network namespace
Make use of the network namespace information to have this protocol to
handle several network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:52:32 -07:00
Herbert Xu
732c8bd590 [IPSEC]: Fix BEET output
The IPv6 BEET output function is incorrectly including the inner
header in the payload to be protected.  This causes a crash as
the packet doesn't actually have that many bytes for a second
header.

The IPv4 BEET output on the other hand is broken when it comes
to handling an inner IPv6 header since it always assumes an
inner IPv4 header.

This patch fixes both by making sure that neither BEET output
function touches the inner header at all.  All access is now
done through the protocol-independent cb structure.  Two new
attributes are added to make this work, the IP header length
and the IPv4 option length.  They're filled in by the inner
mode's output function.

Thanks to Joakim Koskela for finding this problem.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 16:51:09 -07:00
Pavel Emelyanov
68528f0998 [NETNS][ICMP]: Make ctl tables for ICMP sysctls per-net.
Add some flesh to ipv4_sysctl_init_net and ipv4_sysctl_exit_net,
i.e. copy the table, alter .data pointers and register it per-net.

Other ipv4_table's sysctls are now global, but this is going to
change once sysctl permissions patches migrate from -mm tree to 
mainline in 2.6.26 merge window :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 01:56:24 -07:00
Pavel Emelyanov
a24022e188 [NETNS][ICMP]: Move ICMP sysctls on struct net.
Initialization is moved to icmp_sk_init, all the places, that
refer to them use init_net for now.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 01:55:37 -07:00
Denis V. Lunev
f5aa23fd49 [NETNS]: Compilation warnings under CONFIG_NET_NS.
Recent commits from YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
have been introduced a several compilation warnings
'assignment discards qualifiers from pointer target type'
due to extra const modifier in the inline call parameters of
{dev|sock|twsk}_net_set.

Drop it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26 00:48:17 -07:00
Patrick McHardy
0d0ab0378d [NETFILTER]: nf_conntrack_sip: support multiple media channels
Add support for multiple media channels and use it to create
expectations for video streams when present.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:26:24 -07:00
Patrick McHardy
0f32a40fc9 [NETFILTER]: nf_conntrack_sip: create signalling expectations
Create expectations for incoming signalling connections when seeing
a REGISTER request. This is needed when the registrar uses a
different source port number for signalling messages and for receiving
incoming calls from other endpoints than the registrar.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:25:13 -07:00
Patrick McHardy
b8beedd25d [NETFILTER]: Add nf_inet_addr_cmp()
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:09:33 -07:00
Patrick McHardy
6002f266b3 [NETFILTER]: nf_conntrack: introduce expectation classes and policies
Introduce expectation classes and policies. An expectation class
is used to distinguish different types of expectations by the
same helper (for example audio/video/t.120). The expectation
policy is used to hold the maximum number of expectations and
the initial timeout for each class.

The individual classes are isolated from each other, which means
that for example an audio expectation will only evict other audio
expectations.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:09:15 -07:00
Patrick McHardy
359b9ab614 [NETFILTER]: nf_conntrack_expect: support inactive expectations
This is useful for the SIP helper and signalling expectations.
We don't want to create a full-blown expectation with a wildcard
as source based on a single UDP packet, but need to know the
final port anyways. With inactive expectations we can register
the expectation and reserve the tuple, but wait for confirmation
from the registrar before activating it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:08:37 -07:00
Patrick McHardy
1d9d752259 [NETFILTER]: nf_conntrack_expect: constify nf_ct_expect_init arguments
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:07:58 -07:00
Patrick McHardy
ef27559b70 [NETFILTER]: nf_conntrack: fix NF_CT_TUPLE_DUMP for IPv4
NF_CT_TUPLE_DUMP prints IPv4 addresses as IPv6, fix this and use printk
(guarded by #ifdef DEBUG) directly instead of pr_debug since the tuple
is usually printed at the end of line and we don't want to include a
log-level.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-25 20:07:38 -07:00
David S. Miller
dfe98e9214 Merge branch 'net-2.6.26-netns-20080326' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-dev 2008-03-25 19:43:59 -07:00
David S. Miller
f89e6e3834 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-25 17:20:03 -07:00
Johannes Berg
906c730a2d wireless: add wiphy channel freq to channel struct lookup helper
Add ieee80211_get_channel() which gets you a channel struct for a
specific wiphy if that channel is present in that wiphy.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:55 -04:00
Emmanuel Grumbach
9ae4fda332 mac80211: allows driver to request a Phase 1 RX key
This patch makes mac80211 able to send a phase1 key for TKIP
decryption.
This is needed for drivers that don't do the rekeying by themselves
(i.e. iwlwifi). Upon IV16 wrap around, the packet is decrypted in SW,
if decryption is ok, mac80211 calls to update_tkip_key  with a new
phase 1 RX key.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:53 -04:00
Emmanuel Grumbach
5d2cdcd4e8 mac80211: get a TKIP phase key from skb
This patch makes mac80211 able to compute a TKIP key from an skb.
The requested key can be a phase 1 or a phase 2 key.
This is useful for drivers who need to provide tkip key to their
HW to enable HW encryption.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-25 16:41:52 -04:00
YOSHIFUJI Hideaki
878628fbf2 [NET] NETNS: Omit namespace comparision without CONFIG_NET_NS.
Introduce an inline net_eq() to compare two namespaces.
Without CONFIG_NET_NS, since no namespace other than &init_net
exists, it is always 1.

We do not need to convert 1) inline vs inline and
2) inline vs &init_net comparisons.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:40:00 +09:00
YOSHIFUJI Hideaki
57da52c1e6 [NET] NETNS: Omit neigh_parms->net and pneigh_entry->net without CONFIG_NET_NS.
Introduce neigh_parms/pneigh_entry inlines: neigh_parms_net(), pneigh_net().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:58 +09:00
YOSHIFUJI Hideaki
3b1e0a655f [NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS.
Introduce per-sock inlines: sock_net(), sock_net_set()
and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:55 +09:00
YOSHIFUJI Hideaki
7cbca67c07 [IPV6]: Support Source Address Selection API (RFC5014).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:24:01 +09:00
YOSHIFUJI Hideaki
6b75d09081 [IPV6]: Optimize hop-limit determination.
Last part of hop-limit determination is always:
    hoplimit = dst_metric(dst, RTAX_HOPLIMIT);
    if (hoplimit < 0)
        hoplimit = ipv6_get_hoplimit(dst->dev).

Let's consolidate it as ip6_dst_hoplimit(dst).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:24:00 +09:00
YOSHIFUJI Hideaki
c8cdaf998d [IPV4,IPV6]: Share cork.rt between IPv4 and IPv6.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:59 +09:00
YOSHIFUJI Hideaki
9bb182a700 [XFRM] MIP6: Fix address keys for routing search.
Each MIPv6 XFRM state (DSTOPT/RH2) holds either destination or source
address to be mangled in the IPv6 header (that is "CoA").
On Inter-MN communication after both nodes binds each other,
they use route optimized traffic two MIPv6 states applied, and
both source and destination address in the IPv6 header
are replaced by the states respectively.
The packet format is correct, however, next-hop routing search
are not.
This patch fixes it by remembering address pairs for later states.

Based on patch from Masahide NAKAMURA <nakam@linux-ipv6.org>.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:57 +09:00
Denis V. Lunev
f145049a06 [NETNS]: Drop packets in the non-initial namespace on the per/protocol basis.
IP layer now can handle multiple namespaces normally. So, process such
packets normally and drop them only if the transport layer is not
aware about namespaces.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:33:00 -07:00
Denis V. Lunev
7a6adb92fe [NETNS]: Add namespace parameter to ip_cmsg_send.
Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:30:27 -07:00
Denis V. Lunev
f2c4802b3f [NETNS]: Add namespace parameter to ip_options_get(...).
Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:29:55 -07:00
Denis V. Lunev
0e6bd4a1c6 [NETNS]: Add namespace parameter to ip_options_compile.
ip_options_compile uses inet_addr_type which requires a namespace. The
packet argument is optional, so parameter is the only way to obtain
it. Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:29:23 -07:00
Kazunori MIYAZAWA
df9dcb4588 [IPSEC]: Fix inter address family IPsec tunnel handling.
Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:51:51 -07:00
Pavel Emelyanov
fa86d322d8 [NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3).
Proxy neighbors do not have any reference counting, so any caller
of pneigh_lookup (unless it's a netlink triggered add/del routine)
should _not_ perform any actions on the found proxy entry. 

There's one exception from this rule - the ipv6's ndisc_recv_ns() 
uses found entry to check the flags for NTF_ROUTER.

This creates a race between the ndisc and pneigh_delete - after 
the pneigh is returned to the caller, the nd_tbl.lock is dropped 
and the deleting procedure may proceed.

One of the fixes would be to add a reference counting, but this
problem exists for ndisc only. Besides such a patch would be too 
big for -rc4.

So I propose to introduce a __pneigh_lookup() which is supposed
to be called with the lock held and use it in ndisc code to check
the flags on alive pneigh entry.


Changes from v2:
As David noticed, Exported the __pneigh_lookup() to ipv6 module. 
The checkpatch generates a warning on it, since the EXPORT_SYMBOL 
does not follow the symbol itself, but in this file all the 
exports come at the end, so I decided no to break this harmony.

Changes from v1:
Fixed comments from YOSHIFUJI - indentation of prototype in header
and the pndisc_check_router() name - and a compilation fix, pointed
by Daniel - the is_routed was (falsely) considered as uninitialized
by gcc.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:48:59 -07:00
David S. Miller
06802a819a Merge branch 'master' of ../net-2.6/
Conflicts:

	net/ipv6/ndisc.c
2008-03-23 22:54:03 -07:00
Florian Westphal
80445cfb28 [SCTP]: Remove redundant wrapper functions.
sctp_datamsg_free and sctp_datamsg_track are just aliases for
sctp_datamsg_put and sctp_chunk_hold, respectively.

Saves 32 Bytes on x86.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:47:08 -07:00
Florian Westphal
2051f11fb8 [TCP]: Shrink syncookie_secret by 8 byte.
the first u32 copied from syncookie_secret is overwritten by the
minute-counter four lines below.  After adjusting the destination
address, the size of syncookie_secret can be reduced accordingly.

AFAICS, the only other user of syncookie_secret[] is the ipv6
syncookie support.  Because ipv6 syncookies only grab 44 bytes from
syncookie_secret[], this shouldn't affect them in any way.

With fixes from Glenn Griffin.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Glenn Griffin <ggriffin.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:21:28 -07:00
Joe Perches
7d164be8aa [NET]: include/net/route.h - remove duplicate include
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:03:56 -07:00
Pavel Emelyanov
fc8717baa8 [RAW]: Add raw_hashinfo member on struct proto.
Sorry for the patch sequence confusion :| but I found that the similar
thing can be done for raw sockets easily too late.

Expand the proto.h union with the raw_hashinfo member and use it in
raw_prot and rawv6_prot. This allows to drop the protocol specific
versions of hash and unhash callbacks.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:56:51 -07:00
Pavel Emelyanov
6ba5a3c52d [UDP]: Make full use of proto.h.udp_hash innovation.
After this we have only udp_lib_get_port to get the port and two 
stubs for ipv4 and ipv6. No difference in udp and udplite except
for initialized h.udp_hash member.

I tried to find a graceful way to drop the only difference between
udp_v4_get_port and udp_v6_get_port (i.e. the rcv_saddr comparison 
routine), but adding one more callback on the struct proto didn't 
appear such :( Maybe later.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:51:21 -07:00
Pavel Emelyanov
39d8cda76c [SOCK]: Add udp_hash member to struct proto.
Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to 
struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 
and -v6 protocols.

The result is not that exciting, but it removes some levels of
indirection in udpxxx_get_port and saves some space in code and text.

The first step is to union existing hashinfo and new udp_hash on the
struct proto and give a name to this union, since future initialization 
of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc 
warning about inability to initialize anonymous member this way.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:50:58 -07:00
Denis V. Lunev
ef722495c8 [IPV4]: Remove unused ip_options->is_data.
ip_options->is_data is assigned only and never checked. The structure is
not a part of kernel interface to the userspace. So, it is safe to remove
this field.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:35:29 -07:00
Patrick McManus
ec3c0982a2 [TCP]: TCP_DEFER_ACCEPT updates - process as established
Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrvies. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
  - established connection is now reset if it never makes it
   to the accept queue

 - diagnostic state of established matches with the packet traces
   showing completed handshake

 - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
   enforced with reasonable accuracy instead of rounding up to next
   exponential back-off of syn-ack retry.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 16:33:01 -07:00
Stephen Hemminger
4cd9029d25 socket: SOCK_DEBUG type checking
Use the inline trick (same as pr_debug) to get checking of debug
statements even if no code is generated.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:54:53 -07:00
David S. Miller
1233823b08 [SCTP]: Fix build warnings with IPV6 disabled.
Introduced by 270637abff
("[SCTP]: Fix a race between module load and protosw access")

Reported by Gabriel C:

In file included from net/sctp/sm_statetable.c:50:
include/net/sctp/sctp.h: In function 'sctp_v6_pf_init':
include/net/sctp/sctp.h:392: warning: 'return' with a value, in function returning void
In file included from net/sctp/sm_statefuns.c:62:
include/net/sctp/sctp.h: In function 'sctp_v6_pf_init':
include/net/sctp/sctp.h:392: warning: 'return' with a value, in function returning void
 ...

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:40:47 -07:00
Daniel Lezcano
6f8b13bcb3 [NETNS][IPV6] tcp6 - make proc per namespace
Make the proc for tcp6 to be per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:14:45 -07:00
Daniel Lezcano
0c96d8c50b [NETNS][IPV6] udp6 - make proc per namespace
The proc init/exit functions take a new network namespace parameter in
order to register/unregister /proc/net/udp6 for a namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:14:17 -07:00
Daniel Lezcano
f40c8174d3 [NETNS][IPV4] tcp - make proc handle the network namespaces
This patch, like udp proc, makes the proc functions to take care of
which namespace the socket belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:13:54 -07:00
Daniel Lezcano
a91275eff4 [NETNS][IPV6] udp - make proc handle the network namespace
This patch makes the common udp proc functions to take care of which
socket they should show taking into account the namespace it belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:11:58 -07:00
Peter P Waskiewicz Jr
82cc1a7a56 [NET]: Add per-connection option to set max TSO frame size
Update: My mailer ate one of Jarek's feedback mails...  Fixed the
parameter in netif_set_gso_max_size() to be u32, not u16.  Fixed the
whitespace issue due to a patch import botch.  Changed the types from
u32 to unsigned int to be more consistent with other variables in the
area.  Also brought the patch up to the latest net-2.6.26 tree.

Update: Made gso_max_size container 32 bits, not 16.  Moved the
location of gso_max_size within netdev to be less hotpath.  Made more
consistent names between the sock and netdev layers, and added a
define for the max GSO size.

Update: Respun for net-2.6.26 tree.

Update: changed max_gso_frame_size and sk_gso_max_size from signed to
unsigned - thanks Stephen!

This patch adds the ability for device drivers to control the size of
the TSO frames being sent to them, per TCP connection.  By setting the
netdevice's gso_max_size value, the socket layer will set the GSO
frame size based on that value.  This will propogate into the TCP
layer, and send TSO's of that size to the hardware.

This can be desirable to help tune the bursty nature of TSO on a
per-adapter basis, where one may have 1 GbE and 10 GbE devices
coexisting in a system, one running multiqueue and the other not, etc.

This can also be desirable for devices that cannot support full 64 KB
TSO's, but still want to benefit from some level of segmentation
offloading.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 03:43:19 -07:00
David S. Miller
a25606c845 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-03-21 03:42:24 -07:00
Vlad Yasevich
270637abff [SCTP]: Fix a race between module load and protosw access
There is a race is SCTP between the loading of the module
and the access by the socket layer to the protocol functions.
In particular, a list of addresss that SCTP maintains is
not initialized prior to the registration with the protosw.
Thus it is possible for a user application to gain access
to SCTP functions before everything has been initialized.
The problem shows up as odd crashes during connection
initializtion when we try to access the SCTP address list.

The solution is to refactor how we do registration and
initialize the lists prior to registering with the protosw.
Care must be taken since the address list initialization
depends on some other pieces of SCTP initialization.  Also
the clean-up in case of failure now also needs to be refactored.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:17:14 -07:00
David S. Miller
577f99c1d0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/rt2x00/rt2x00dev.c
	net/8021q/vlan_dev.c
2008-03-18 00:37:55 -07:00
Al Viro
8e3d716cce xfrm: ->eth_proto is __be16
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:49:16 -07:00
Joe Perches
068edceb7e include/net/ieee80211.h - remove duplicate include
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-13 19:32:31 -04:00
Adrian Bunk
7524d7d6de the scheduled ieee80211 softmac removal
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-13 16:02:31 -04:00
Zhang Yanmin
f1dd9c379c [NET]: Fix tbench regression in 2.6.25-rc1
Comparing with kernel 2.6.24, tbench result has regression with
2.6.25-rc1.

1) On 2 quad-core processor stoakley: 4%.
2) On 4 quad-core processor tigerton: more than 30%.

bisect located below patch.

b4ce92775c is first bad commit
commit b4ce92775c
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue Nov 13 21:33:32 2007 -0800

    [IPV6]: Move nfheader_len into rt6_info

    The dst member nfheader_len is only used by IPv6.  It's also currently
    creating a rather ugly alignment hole in struct dst.  Therefore this patch
    moves it from there into struct rt6_info.

Above patch changes the cache line alignment, especially member
__refcnt. I did a testing by adding 2 unsigned long pading before
lastuse, so the 3 members, lastuse/__refcnt/__use, are moved to next
cache line. The performance is recovered.

I created a patch to rearrange the members in struct dst_entry.

With Eric and Valdis Kletnieks's suggestion, I made finer arrangement.

1) Move tclassid under ops in case CONFIG_NET_CLS_ROUTE=y. So
   sizeof(dst_entry)=200 no matter if CONFIG_NET_CLS_ROUTE=y/n. I
   tested many patches on my 16-core tigerton by moving tclassid to
   different place. It looks like tclassid could also have impact on
   performance.  If moving tclassid before metrics, or just don't move
   tclassid, the performance isn't good. So I move it behind metrics.

2) Add comments before __refcnt.

On 16-core tigerton:

If CONFIG_NET_CLS_ROUTE=y, the result with below patch is about 18%
better than the one without the patch;

If CONFIG_NET_CLS_ROUTE=n, the result with below patch is about 30%
better than the one without the patch.

With 32bit 2.6.25-rc1 on 8-core stoakley, the new patch doesn't
introduce regression.

Thank Eric, Valdis, and David!

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-12 22:52:37 -07:00
David S. Miller
ba73d4c84a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-11 19:17:18 -07:00
Pekka Enberg
019f692ea7 [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
There's a horrible slab abuse in net/netfilter/nf_conntrack_extend.c
that can be replaced with a call to ksize().

Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:43:41 -07:00
Ron Rindjunsky
6c5ef8a705 mac80211: document IEEE80211_TXCTL_OFDM_HT
This patch clarifies the use of IEEE80211_TXCTL_OFDM_HT flag.

Can by united with patch "mac80211: adding mac80211_tx_control
flags and HT flags"

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-07 16:03:01 -05:00
Ron Rindjunsky
11f4b1cec9 mac80211: adding mac80211_tx_control_flags and HT flags
This patch makes enum from the defines previously dwelled inside
ieee80211_tx_control for better readability.
The patch also addes HT flags, for 802.11n drivers:
- IEEE80211_TXCTL_OFDM_HT: request low-level driver to use HT OFDM rates
- IEEE80211_TXCTL_GREEN_FIELD: use green field protection
- IEEE80211_TXCTL_DUP_DATA: duplicate data on both 20 Mhz channels
- IEEE80211_TXCTL_40_MHZ_WIDTH: send this frame in 40Mhz width
- IEEE80211_TXCTL_SHORT_GI: send this frame with short guard interval

Tx command can be a combination of any of these flags, along with
bitrate represented by ieee80211_rate. this will allow legacy drivers to
switch easily to any 11n rate representation.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
CC: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-07 16:02:59 -05:00
Daniel Lezcano
b8ad0cbc58 [NETNS][IPV6] mcast - handle several network namespace
This patch make use of the network namespace information at the right
places to handle the multicast for several network namespaces.  It
makes the socket control to be per namespace too.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:55 -08:00
Daniel Lezcano
93ec926b07 [NETNS][IPV6] tcp6 - make socket control per namespace
Instead of having a tcp6_socket global to all the namespace, there is
tcp6 socket control per namespace. That is consistent with which
namespace sent a RST and allows to pass the socket to the underlying
function to retrieve the network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:02 -08:00
Daniel Lezcano
1762f7e88e [NETNS][IPV6] ndisc - make socket control per namespace
Make ndisc socket control per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:15:34 -08:00
Pavel Emelyanov
e9720acd72 [NET]: Make /proc/net a symlink on /proc/self/net (v3)
Current /proc/net is done with so called "shadows", but current
implementation is broken and has little chances to get fixed.

The problem is that dentries subtree of /proc/net directory has
fancy revalidation rules to make processes living in different
net namespaces see different entries in /proc/net subtree, but
currently, tasks see in the /proc/net subdir the contents of any
other namespace, depending on who opened the file first.

The proposed fix is to turn /proc/net into a symlink, which points
to /proc/self/net, which in turn shows what previously was in
/proc/net - the network-related info, from the net namespace the
appropriate task lives in.

# ls -l /proc/net
lrwxrwxrwx  1 root root 8 Mar  5 15:17 /proc/net -> self/net

In other words - this behaves like /proc/mounts, but unlike
"mounts", "net" is not a file, but a directory.

Changes from v2:
* Fixed discrepancy of /proc/net nlink count and selinux labeling
  screwup pointed out by Stephen.

  To get the correct nlink count the ->getattr callback for /proc/net
  is overridden to read one from the net->proc_net entry.

  To make selinux still work the net->proc_net entry is initialized
  properly, i.e. with the "net" name and the proc_net parent.

Selinux fixes are
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>

Changes from v1:
* Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:08:40 -08:00
David S. Miller
db8dac20d5 [UDP]: Revert udplite and code split.
This reverts commit db1ed684f6 ("[IPV6]
UDP: Rename IPv6 UDP files."), commit
8be8af8fa4 ("[IPV4] UDP: Move
IPv4-specific bits to other file.") and commit
e898d4db27 ("[UDP]: Allow users to
configure UDP-Lite.").

First, udplite is of such small cost, and it is a core protocol just
like TCP and normal UDP are.

We spent enormous amounts of effort to make udplite share as much code
with core UDP as possible.  All of that work is less valuable if we're
just going to slap a config option on udplite support.

It is also causing build failures, as reported on linux-next, showing
that the changeset was not tested very well.  In fact, this is the
second build failure resulting from the udplite change.

Finally, the config options provided was a bool, instead of a modular
option.  Meaning the udplite code does not even get build tested
by allmodconfig builds, and furthermore the user is not presented
with a reasonable modular build option which is particularly needed
by distribution vendors.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 16:22:02 -08:00
Allan Stephens
0e0609bbd2 [TIPC]: Eliminate "sparse" symbol warnings
This patch eliminates warnings about undeclared symbols.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:06:06 -08:00
Allan Stephens
8c8696553a [TIPC]: Removal of message header option code
This patch removes code associated with optional, user-specified
fields of the TIPC message header.  Such fields were never
utilized by TIPC, and have now been removed from the protocol
specification.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:05:07 -08:00
Johannes Berg
dbbea6713d mac80211: add documentation book
Quite a while ago I started this book. The required kernel-doc
patches have since gone into the tree so it is now possible to
build the book in mainline.

The actual documentation is still rather incomplete and not all
things are linked into the book, but this enables us to edit
the documentation collaboratively, hopefully driver authors can
add documentation based on their experience with mac80211.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
902acc7896 mac80211: clean up mesh code
Various cleanups, reducing the #ifdef mess and other things.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Johannes Berg
6032f934c8 mac80211: add mesh interface type
This adds the mesh interface type.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo
2ec600d672 nl80211/cfg80211: support for mesh, sta dumping
Added support for mesh id and mesh path operation as well as
station structure dumping.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Tobias Klauser
04005dd9ae bluetooth: Make hci_sock_cleanup() return void
hci_sock_cleanup() always returns 0 and its return value isn't used
anywhere in the code.

Compile-tested with 'make allyesconfig && make net/bluetooth/bluetooth.ko'

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
2008-03-05 18:47:03 -08:00
Harvey Harrison
4eb329a5aa irda: replace __inline with inline
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-05 18:37:16 -08:00