Commit Graph

19313 Commits

Author SHA1 Message Date
Gustavo F. Padovan
62f3a2cfb1 Bluetooth: Fix another locking unbalance
l2cap_get_sock_by_scid was changed to not lock the socket anymore, but I
forgot to change all the users of this function.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-14 18:34:34 -03:00
Ralf Baechle
8849b720e9 NET: AX.25, NETROM, ROSE: Remove SOCK_DEBUG calls
Nobody alive seems to recall when they last were useful.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-14 00:20:07 -07:00
Gustavo F. Padovan
280f294f7b Bluetooth: Don't lock sock inside l2cap_get_sock_by_scid()
Fix an locking issue with the new l2cap_att_channel(). l2cap_att_channel()
was trying to lock a locked socket.

Reported-by: Anderson Lizardo <anderson.lizardo@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-13 19:01:22 -03:00
cozybit Inc
a3e6b12c02 mac80211: Allocate new mesh path and portal tables before taking locks
It is unnecessary to hold the path table resize lock while allocating a
new table.  Allocate first and take lock later.  This resolves a
soft-lockup:

[  293.385799] BUG: soft lockup - CPU#0 stuck for 61s! [kworker/u:3:744]
(...)
[  293.386049] Call Trace:
[  293.386049]  [<c119fd04>] do_raw_read_lock+0x26/0x29
[  293.386049]  [<c14b2982>] _raw_read_lock+0x8/0xa
[  293.386049]  [<c148c178>] mesh_path_add+0xb7/0x24e
[  293.386049]  [<c148b98d>] ? mesh_path_lookup+0x1b/0xa6
[  293.386049]  [<c148ded5>] hwmp_route_info_get+0x276/0x2fd
[  293.386049]  [<c148dfb6>] mesh_rx_path_sel_frame+0x5a/0x5d9
[  293.386049]  [<c102667d>] ? update_curr+0x1cf/0x1d7
[  293.386049]  [<c148b45a>] ieee80211_mesh_rx_queued_mgmt+0x60/0x67
[  293.386049]  [<c147c374>] ieee80211_iface_work+0x1f0/0x258
(...)

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-13 15:24:48 -04:00
Felix Fietkau
4114fa2146 mac80211: receive EAP frames from a station in an AP VLAN on the main AP
This makes it easier to handle moving stations to VLAN interfaces that are
part of a different bridge.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-13 15:21:52 -04:00
David S. Miller
3e8c806a08 Revert "tcp: disallow bind() to reuse addr/port"
This reverts commit c191a836a9.

It causes known regressions for programs that expect to be able to use
SO_REUSEADDR to shutdown a socket, then successfully rebind another
socket to the same ID.

Programs such as haproxy and amavisd expect this to work.

This should fix kernel bugzilla 32832.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-13 12:01:14 -07:00
amit salecha
8b5933c380 net: ethtool support to configure number of channels
Ethtool support to configure RX, TX and other channels. combined field
in struct ethtool_channels to reflect set of channel (RX, TX or other).
Other channel can be link interrupts, SR-IOV coordination etc.

ETHTOOL_GCHANNELS will report max and current number of RX channels,
max and current number of TX channels, max and current number of other channel
or max and current number of combined channel.

Number of channel can be modify upto max number of channel through
ETHTOOL_SCHANNELS command.

Ben Hutchings:
o define 'combined' and 'other' types.  Most multiqueue drivers pair up RX and TX
  queues so that most channels combine RX and TX work.
o Please could you use a kernel-doc comment to describe the structure.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-13 11:31:04 -07:00
Szymon Janc
e1ba1f1546 Bluetooth: Fix Out Of Band pairing when mgmt interface is disabled
Use kernel stored remote Out Of Band data only if management interface
is enabled. Otherwise HCI_OP_REMOTE_OOB_DATA_NEG_REPLY was sent to
controller even if remote Out Of Band data was present in bluetoothd.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-13 12:20:02 -03:00
Gustavo F. Padovan
9f69bda6aa Bluetooth: Add proper handling of received LE data
Despite it works, handling through l2cap_data_channel() is wrongs.
That function should handle only connection oriented data.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-13 12:20:02 -03:00
Gustavo F. Padovan
cd69a03af1 Bluetooth: Fix wrong comparison in listen()
We should check for the pi->scid there.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-13 12:20:00 -03:00
Gustavo F. Padovan
58d35f87ef Bluetooth: Move tx queue to struct l2cap_chan
tx_q is the queue used by ERTM mode.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-13 12:19:59 -03:00
Gustavo F. Padovan
c916fbe45c Bluetooth: Remove unneeded uninitialized_vars()
That was unnecessary use of it.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-13 12:19:58 -03:00
Gustavo F. Padovan
49208c9c7b Bluetooth: Remove some sk references from l2cap_core.c
Change some BT_DBG messages and consequently remove some struct sock
declarations.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-13 12:19:57 -03:00
Gustavo F. Padovan
39d5a3ee35 Bluetooth: Move SREJ list to struct l2cap_chan
As part of moving all the Channel related operation to struct l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-13 12:19:47 -03:00
Jozsef Kadlecsik
91eb7c08c6 netfilter: ipset: SCTP, UDPLITE support added
SCTP and UDPLITE port support added to the hash:*port* set types.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-13 13:51:38 +02:00
Jozsef Kadlecsik
eafbd3fde6 netfilter: ipset: set match and SET target fixes
The SET target with --del-set did not work due to using wrongly
the internal dimension of --add-set instead of --del-set.
Also, the checkentries did not release the set references when
returned an error. Bugs reported by Lennert Buytenhek.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-13 13:45:57 +02:00
Jozsef Kadlecsik
0e8a835aa5 netfilter: ipset: bitmap:ip,mac type requires "src" for MAC
Enforce that the second "src/dst" parameter of the set match and SET target
must be "src", because we have access to the source MAC only in the packet.
The previous behaviour, that the type required the second parameter
but actually ignored the value was counter-intuitive and confusing.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-13 13:43:23 +02:00
Wei Yongjun
9494c7c577 sctp: fix oops while removed transport still using as retran path
Since we can not update retran path to unconfirmed transports,
when we remove a peer, the retran path may not be update if the
other transports are all unconfirmed, and we will still using
the removed transport as the retran path. This may cause panic
if retrasnmit happen.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 19:33:51 -07:00
Vlad Yasevich
25f7bf7d0d sctp: fix oops when updating retransmit path with DEBUG on
commit fbdf501c93
  sctp: Do no select unconfirmed transports for retransmissions

Introduced the initial falt.

commit d598b166ce
  sctp: Make sure we always return valid retransmit path

Solved the problem, but forgot to change the DEBUG statement.
Thus it was still possible to dereference a NULL pointer.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 19:33:50 -07:00
Ben Hutchings
31d8b9e099 net: Disable NETIF_F_TSO_ECN when TSO is disabled
NETIF_F_TSO_ECN has no effect when TSO is disabled; this just means
that feature state will be accurately reported to user-space.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 19:29:45 -07:00
Ben Hutchings
ea2d36883c net: Disable all TSO features when SG is disabled
The feature flags NETIF_F_TSO and NETIF_F_TSO6 independently enable
TSO for IPv4 and IPv6 respectively.  However, the test in
netdev_fix_features() and its predecessor functions was never updated
to check for NETIF_F_TSO6, possibly because it was originally proposed
that TSO for IPv6 would be dependent on both feature flags.

Now that these feature flags can be changed independently from
user-space and we depend on netdev_fix_features() to fix invalid
feature combinations, it's important to disable them both if
scatter-gather is disabled.  Also disable NETIF_F_TSO_ECN so
user-space sees all TSO features as disabled.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 19:29:45 -07:00
David S. Miller
a7e7015888 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-04-12 16:16:02 -07:00
David S. Miller
095d3da610 9p: Kill set but unused variable in 9p_client_{read,write}() and p9_client_readdir()
Fixes the following warnings:

net/9p/client.c:1305:18: warning: variable ‘total’ set but not used [-Wunused-but-set-variable]
net/9p/client.c:1370:18: warning: variable ‘total’ set but not used [-Wunused-but-set-variable]
net/9p/client.c:1769:18: warning: variable ‘total’ set but not used [-Wunused-but-set-variable]

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 15:58:41 -07:00
David S. Miller
bfac3693c4 ieee802154: Remove hacked CFLAGS in net/ieee802154/Makefile
It adds -Wall (which the kernel carefully controls already) and of all
things -DDEBUG (which should be set by other means if desired, please
we have dynamic-debug these days).

Kill this noise.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 15:33:23 -07:00
Dave Jones
020318d0d2 irda: fix locking unbalance in irda_sendmsg
5b40964ead ("irda: Remove BKL instances
from af_irda.c") introduced a path where we have a locking unbalance.
If we pass invalid flags, we unlock a socket we never locked,
resulting in this...

=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
trinity/20101 is trying to release lock (sk_lock-AF_IRDA) at:
[<ffffffffa057f001>] irda_sendmsg+0x207/0x21d [irda]
but there are no more locks to release!

other info that might help us debug this:
no locks held by trinity/20101.

stack backtrace:
Pid: 20101, comm: trinity Not tainted 2.6.39-rc3+ #3
Call Trace:
 [<ffffffffa057f001>] ? irda_sendmsg+0x207/0x21d [irda]
 [<ffffffff81085041>] print_unlock_inbalance_bug+0xc7/0xd2
 [<ffffffffa057f001>] ? irda_sendmsg+0x207/0x21d [irda]
 [<ffffffff81086aca>] lock_release+0xcf/0x18e
 [<ffffffff813ed190>] release_sock+0x2d/0x155
 [<ffffffffa057f001>] irda_sendmsg+0x207/0x21d [irda]
 [<ffffffff813e9f8c>] __sock_sendmsg+0x69/0x75
 [<ffffffff813ea105>] sock_sendmsg+0xa1/0xb6
 [<ffffffff81100ca3>] ? might_fault+0x5c/0xac
 [<ffffffff81086b7c>] ? lock_release+0x181/0x18e
 [<ffffffff81100cec>] ? might_fault+0xa5/0xac
 [<ffffffff81100ca3>] ? might_fault+0x5c/0xac
 [<ffffffff81133b94>] ? fcheck_files+0xb9/0xf0
 [<ffffffff813f387a>] ? copy_from_user+0x2f/0x31
 [<ffffffff813f3b70>] ? verify_iovec+0x52/0xa6
 [<ffffffff813eb4e3>] sys_sendmsg+0x23a/0x2b8
 [<ffffffff81086b7c>] ? lock_release+0x181/0x18e
 [<ffffffff810773c6>] ? up_read+0x28/0x2c
 [<ffffffff814bec3d>] ? do_page_fault+0x360/0x3b4
 [<ffffffff81087043>] ? trace_hardirqs_on_caller+0x10b/0x12f
 [<ffffffff810458aa>] ? finish_task_switch+0xb2/0xe3
 [<ffffffff8104583e>] ? finish_task_switch+0x46/0xe3
 [<ffffffff8108364a>] ? trace_hardirqs_off_caller+0x33/0x90
 [<ffffffff814bbaf9>] ? retint_swapgs+0x13/0x1b
 [<ffffffff81087043>] ? trace_hardirqs_on_caller+0x10b/0x12f
 [<ffffffff810a9dd3>] ? audit_syscall_entry+0x11c/0x148
 [<ffffffff8125609e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff814c22c2>] system_call_fastpath+0x16/0x1b

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 15:29:54 -07:00
Michał Mirosław
872674858f net: add RTNL_ASSERT in __netdev_update_features()
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 14:36:07 -07:00
Jiri Pirko
bcc6d47903 net: vlan: make non-hw-accel rx path similar to hw-accel
Now there are 2 paths for rx vlan frames. When rx-vlan-hw-accel is
enabled, skb is untagged by NIC, vlan_tci is set and the skb gets into
vlan code in __netif_receive_skb - vlan_hwaccel_do_receive.

For non-rx-vlan-hw-accel however, tagged skb goes thru whole
__netif_receive_skb, it's untagged in ptype_base hander and reinjected

This incosistency is fixed by this patch. Vlan untagging happens early in
__netif_receive_skb so the rest of code (ptype_all handlers, rx_handlers)
see the skb like it was untagged by hw.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

v1->v2:
	remove "inline" from vlan_core.c functions
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 14:15:19 -07:00
Joakim Tjernlund
192910a6cc net: Do not wrap sysctl igmp_max_memberships in IP_MULTICAST
controlling igmp_max_membership is useful even when IP_MULTICAST
is off.
Quagga(an OSPF deamon) uses multicast addresses for all interfaces
using a single socket and hits igmp_max_membership limit when
there are 20 interfaces or more.
Always export sysctl igmp_max_memberships in proc, just like
igmp_max_msf

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 13:59:33 -07:00
Mohammed Shafi Shajakhan
ebe27c91af {mac|nl}80211: Add station connected time
Add station connected time in debugfs. This will be helpful to get a
measure of stability of the connection and for debugging stress issues

Cc: Senthilkumar Balasubramanian <Senthilkumar.Balasubramanian@Atheros.com>
Signed-off-by: Mohammed Shafi Shajakhan <mshajakhan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:58:47 -04:00
Eric Dumazet
66944e1c57 inetpeer: reduce stack usage
On 64bit arches, we use 752 bytes of stack when cleanup_once() is called
from inet_getpeer().

Lets share the avl stack to save ~376 bytes.

Before patch :

# objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl

0x000006c3 unlink_from_pool [inetpeer.o]:		376
0x00000721 unlink_from_pool [inetpeer.o]:		376
0x00000cb1 inet_getpeer [inetpeer.o]:			376
0x00000e6d inet_getpeer [inetpeer.o]:			376
0x0004 inet_initpeers [inetpeer.o]:			112
# size net/ipv4/inetpeer.o
   text	   data	    bss	    dec	    hex	filename
   5320	    432	     21	   5773	   168d	net/ipv4/inetpeer.o

After patch :

objdump -d net/ipv4/inetpeer.o | scripts/checkstack.pl
0x00000c11 inet_getpeer [inetpeer.o]:			376
0x00000dcd inet_getpeer [inetpeer.o]:			376
0x00000ab9 peer_check_expire [inetpeer.o]:		328
0x00000b7f peer_check_expire [inetpeer.o]:		328
0x0004 inet_initpeers [inetpeer.o]:			112
# size net/ipv4/inetpeer.o
   text	   data	    bss	    dec	    hex	filename
   5163	    432	     21	   5616	   15f0	net/ipv4/inetpeer.o

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Scot Doyle <lkml@scotdoyle.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Reviewed-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 13:58:33 -07:00
Javier Cardona
1570ca5927 mac80211: send notification on new peer candidate for our secure mesh
Also, advertise support for mesh authentication.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:58:24 -04:00
Javier Cardona
c93b5e717e nl80211: New notification to discover mesh peer candidates.
Notify userspace when a beacon/presp is received from a suitable mesh
peer candidate for whom no sta information exists.  Userspace can then
decide to create a sta info for the candidate.  If userspace is not
ready to authenticate the peer right away, it can create the sta info
with the authenticated flag unset and set it later.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:39 -04:00
Javier Cardona
96b78dff03 nl80211/mac80211: Perform PLINK_ACTION on new station
Modify the NEW_STATION command to accept PLINK_ACTIONS, in case
userspace wants to create stations and initiate a peer link right away
(for authenticated stations) or create a blocked station (for
debugging).

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:39 -04:00
Javier Cardona
53e805111b mac80211: ignore peer link requests from unauthenticated stations.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:38 -04:00
Javier Cardona
71839121a0 mac80211: Let user space receive and send mesh auth/deauth frames
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:38 -04:00
Javier Cardona
b39c48fac1 nl80211/mac80211: let userspace authenticate stations
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:38 -04:00
Javier Cardona
5cff5e01e8 mac80211: ignore peers if security is enabled for this mesh
Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:37 -04:00
Javier Cardona
15d5dda623 cfg80211/nl80211: Add userspace authentication flag to mesh setup
During mesh setup, use NL80211_MESH_SETUP_USERSPACE_AUTH flag to create
a secure mesh and route management frames to userspace.

Also, NL80211_CMD_GET_WIPHY now returns a flag NL80211_SUPPORT_MESH_AUTH
if the wiphy's mesh implementation supports routing of mesh auth frames
to userspace.  This is useful for forward compatibility between old
kernels and new userspace tools.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:37 -04:00
Javier Cardona
581a8b0fee nl80211: rename NL80211_MESH_SETUP_VENDOR_PATH_SEL_IE
To NL80211_MESH_SETUP_IE. This reflects our ability to insert any ie
into a mesh beacon, not simply path selection ies.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:37 -04:00
Vivek Natarajan
e8306f9894 mac80211: Check for queued frames before entering power save.
In a highly noisy environment, the tx rate of the driver drops and
the application slows down since it has not yet received ACKs for
the frames already queued in the hardware. Since this ACK may take
more than 100ms, stopping the dev queues for entering PS at this
stage breaks applications, WMM test cases in my testing.
If there are frames already pending in the tx queue, postponing the
PS logic helps to avoid redundant queue stops. When power save is
enabled by default and in a noisy environment, this API certainly
helps in improving the average throughput.

Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-12 16:57:34 -04:00
Allan, Bruce W
143780c656 ethtool: time to blink provided in seconds not jiffies
When blinking for a duration set by the user, the value specified is in
seconds but it is used as the number of jiffies in the timeout after which
the Physical ID indicator is deactivated.  Fix by converting the timeout
to seconds.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 13:47:33 -07:00
Eric Dumazet
f8e9881c2a bridge: reset IPCB in br_parse_ip_options
Commit 462fb2af97 (bridge : Sanitize skb before it enters the IP
stack), missed one IPCB init before calling ip_options_compile()

Thanks to Scot Doyle for his tests and bug reports.

Reported-by: Scot Doyle <lkml@scotdoyle.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Bandan Das <bandan.das@stratus.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jan Lübbe <jluebbe@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 13:39:14 -07:00
John W. Linville
252f4bf400 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	drivers/net/wireless/ath/ar9170/main.c
	drivers/net/wireless/ath/ar9170/phy.c
	drivers/net/wireless/zd1211rw/zd_rf_rf2959.c
2011-04-12 16:18:44 -04:00
David S. Miller
aa8673599f llc: Fix length check in llc_fixup_skb().
Fixes bugzilla #32872

The LLC stack pretends to support non-linear skbs but there is a
direct use of skb_tail_pointer() in llc_fixup_skb().

Use pskb_may_pull() to see if data_size bytes remain and can be
accessed linearly in the packet, instead of direct pointer checks.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-11 18:59:05 -07:00
Sjur Brændeland
39b9afbb4c caif: Add BUG_ON if dev_info is missing in packet
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-11 15:08:48 -07:00
Sjur Brændeland
4dd820c088 caif: Don't resend if dev_queue_xmit fails.
If CAIF Link Layer returns an error, we no longer try to re-build the
CAIF packet and resend it. Instead, we simply return any transmission
errors to the socket client.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-11 15:08:48 -07:00
Stephen Hemminger
73d6ac633c caif: code cleanup
Cleanup of new CAIF code.
  * make local functions static
  * remove code that is never used
  * expand get_caif_conf() since wrapper is no longer needed
  * make args to comparison functions const
  * rename connect_req_to_link_param to keep exported names
    consistent

Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-11 15:08:47 -07:00
David S. Miller
1c01a80cfe Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/smsc911x.c
2011-04-11 13:44:25 -07:00
Alexander Duyck
127fe533ae v3 ethtool: add ntuple flow specifier data to network flow classifier
This change is meant to add an ntuple data extensions to the rx network flow
classification specifiers.  The idea is to allow ntuple to be displayed via
the network flow classification interface.

The first patch had some left over stuff from the original flow extension
flags I had added.  That bit is removed in this patch.

The second had some left over comments that stated we ignored bits in the
masks when we actually match them.

This work is based on input from Ben Hutchings.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-11 13:20:49 -07:00
Alexander Duyck
5d9f11cf50 ethtool: prevent null pointer dereference with NTUPLE set but no set_rx_ntuple
This change is meant to prevent a possible null pointer dereference if
NETIF_F_NTUPLE is defined but the set_rx_ntuple function pointer is not.

The main motivation behind this patch is to eventually replace the ntuple
interfaces entirely with the network flow classifier interfaces.  This
allows the device drivers to maintain the ntuple check internally while
using the network flow classifier interface for setting up and displaying
rules.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-11 13:20:49 -07:00
Sjur Brændeland
4a9f65f630 caif: performance bugfix - allow radio stack to prioritize packets.
In the CAIF Payload message the Packet Type indication must be set to
    UNCLASSIFIED in order to allow packet prioritization in the modem's
    network stack. Otherwise TCP-Ack is not prioritized in the modems
    transmit queue.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-11 13:15:58 -07:00
Sjur Brændeland
0c184ed903 caif: Bugfix use for_each_safe when removing list nodes.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-11 13:15:57 -07:00
Linus Torvalds
c44eaf41a5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits)
  net: Add support for SMSC LAN9530, LAN9730 and LAN89530
  mlx4_en: Restoring RX buffer pointer in case of failure
  mlx4: Sensing link type at device initialization
  ipv4: Fix "Set rt->rt_iif more sanely on output routes."
  MAINTAINERS: add entry for Xen network backend
  be2net: Fix suspend/resume operation
  be2net: Rename some struct members for clarity
  pppoe: drop PPPOX_ZOMBIEs in pppoe_flush_dev
  dsa/mv88e6131: add support for mv88e6085 switch
  ipv6: Enable RFS sk_rxhash tracking for ipv6 sockets (v2)
  be2net: Fix a potential crash during shutdown.
  bna: Fix for handling firmware heartbeat failure
  can: mcp251x: Allow pass IRQ flags through platform data.
  smsc911x: fix mac_lock acquision before calling smsc911x_mac_read
  iwlwifi: accept EEPROM version 0x423 for iwl6000
  rt2x00: fix cancelling uninitialized work
  rtlwifi: Fix some warnings/bugs
  p54usb: IDs for two new devices
  wl12xx: fix potential buffer overflow in testmode nvs push
  zd1211rw: reset rx idle timer from tasklet
  ...
2011-04-11 07:27:24 -07:00
Michael Smith
990078afbf Disable rp_filter for IPsec packets
The reverse path filter interferes with IPsec subnet-to-subnet tunnels,
especially when the link to the IPsec peer is on an interface other than
the one hosting the default route.

With dynamic routing, where the peer might be reachable through eth0
today and eth1 tomorrow, it's difficult to keep rp_filter enabled unless
fake routes to the remote subnets are configured on the interface
currently used to reach the peer.

IPsec provides a much stronger anti-spoofing policy than rp_filter, so
this patch disables the rp_filter for packets with a security path.

Signed-off-by: Michael Smith <msmith@cbnco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-10 18:50:59 -07:00
Michael Smith
5c04c819a2 fib_validate_source(): pass sk_buff instead of mark
This makes sk_buff available for other use in fib_validate_source().

Signed-off-by: Michael Smith <msmith@cbnco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-10 18:50:59 -07:00
Linus Torvalds
94c8a984ae Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Change initial mount authflavor only when server returns NFS4ERR_WRONGSEC
  NFS: Fix a signed vs. unsigned secinfo bug
  Revert "net/sunrpc: Use static const char arrays"
2011-04-08 11:47:35 -07:00
Gustavo F. Padovan
2ead70b839 Bluetooth: Fix lockdep warning with skb list lock
This is a regression acctually, caused by the first patch series for
creating a formal strcut l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:28 -03:00
Gustavo F. Padovan
311bb895e3 Bluetooth: Move busy workqueue to struct l2cap_chan
As part of the moving channel stuff to l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:27 -03:00
Gustavo F. Padovan
f1c6775be6 Bluetooth: Move srej and busy queues to struct l2cap_chan
As part of the moving channel stuff to l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:27 -03:00
Gustavo F. Padovan
e92c8e70fa Bluetooth: Move ERTM timers to struct l2cap_chan
This also triggered a change in l2cap_send_disconn_req() parameters.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:27 -03:00
Gustavo F. Padovan
2c03a7a49e Bluetooth: Move remote info to struct l2cap_chan
As part of the moving channel stuff to l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:27 -03:00
Gustavo F. Padovan
6f61fd4759 Bluetooth: Move SDU related vars to struct l2cap_chan
As part of the moving channel stuff to l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:27 -03:00
Gustavo F. Padovan
6a026610ee Bluetooth: Move more ERTM stuff to struct l2cap_chan
As part of the moving channel stuff to l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:27 -03:00
Gustavo F. Padovan
42e5c8027b Bluetooth: Move of ERTM *_seq vars to struct l2cap_chan
As part of the moving channel to stuff to struct l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:26 -03:00
Gustavo F. Padovan
525cd1851b Bluetooth: Move conn_state to struct l2cap_chan
This is part of "moving things to l2cap_chan". As one the first move it
triggered a big number of changes in the funcions parameters, basically
changing the struct sock param to struct l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:26 -03:00
Gustavo F. Padovan
710f9b0a42 Bluetooth: clean up l2cap_sock_recvmsg()
Move some channel specific stuff to l2cap_core.c, this will make things
more clear.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:26 -03:00
Gustavo F. Padovan
73ffa904b7 Bluetooth: Move conf_{req,rsp} stuff to struct l2cap_chan
They are also l2cap_chan specific.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:26 -03:00
Gustavo F. Padovan
fc7f8a7ed4 Bluetooth: Move ident to struct l2cap_chan
ident is chan property, no need to reside on socket.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:26 -03:00
Gustavo F. Padovan
820ffdb3d2 Bluetooth: Remove struct del_list
As we use struct list_head to keep L2CAP channels list the workaround with
del_list is not needed anymore.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:26 -03:00
Gustavo F. Padovan
baa7e1fa6d Bluetooth: Use struct list_head for L2CAP channels list
Use a well known Kernel API is always a good idea than implement your own
list.
In the future we might use RCU on this list.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:25 -03:00
Gustavo F. Padovan
48454079c2 Bluetooth: Create struct l2cap_chan
struct l2cap_chan cames to create a clear separation between what
properties and data belongs to the L2CAP channel and what belongs to the
socket. By now we just fold the struct sock * in struct l2cap_chan as all
the channel info is struct l2cap_pinfo today.

In the next commits we will see a move of channel stuff to struct
l2cap_chan.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-07 18:06:25 -03:00
David S. Miller
c1e48efc70 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/benet/be_main.c
2011-04-07 14:05:23 -07:00
OGAWA Hirofumi
1b86a58f9d ipv4: Fix "Set rt->rt_iif more sanely on output routes."
Commit 1018b5c016 ("Set rt->rt_iif more
sanely on output routes.")  breaks rt_is_{output,input}_route.

This became the cause to return "IP_PKTINFO's ->ipi_ifindex == 0".

To fix it, this does:

1) Add "int rt_route_iif;" to struct rtable

2) For input routes, always set rt_route_iif to same value as rt_iif

3) For output routes, always set rt_route_iif to zero.  Set rt_iif
   as it is done currently.

4) Change rt_is_{output,input}_route() to test rt_route_iif

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-07 14:04:08 -07:00
John W. Linville
b37e3b6d64 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
Conflicts:
	drivers/net/wireless/rtlwifi/efuse.c
	drivers/net/wireless/rtlwifi/rtl8192c/fw_common.c
	net/bluetooth/mgmt.c
2011-04-07 16:45:40 -04:00
Luis R. Rodriguez
a90c7a313a cfg80211: add a timer for invalid user reg hints
We have no other option but to inform userspace that we
have queued up their regulatory hint request when we are
given one given that nl80211 operates atomically on user
requests. The best we can do is accept the request, and
add a delayed work item for processing failure and cancel it
if we succeeed. Upon failure we restore the regulatory
settings and ignore the user input.

This fixes this reported bug:

https://bugzilla.kernel.org/show_bug.cgi?id=28112

Reported-by: gregoryx.alagnou@intel.com
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-07 15:52:29 -04:00
Luis R. Rodriguez
146095557b cfg80211: fix regulatory restore upon user hints
When we restore regulatory settings its possible CRDA
will not reply because of a bogus user entry. In this
case the bogus entry will prevent any further processing
on cfg80211 for regulatory domains even if we restore
regulatory settings.

To prevent this we suck out all pending requests when
restoring regulatory settings and add them back into the
queue after we have queued up the reset work.

The impact of not having this applied is that a user
with privileges can issue a userspace regulatory hint
while we are disasocciating and this would prevent any
further processing of regulatory domains.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-07 15:52:29 -04:00
Mohammed Shafi Shajakhan
ffbd308dce mac80211: remove few obsolete flags
Signed-off-by: Mohammed Shafi Shajakhan <mshajakhan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-07 15:34:14 -04:00
Paul Stewart
f4263c9857 nl80211: Add BSS parameters to station
This allows user-space monitoring of BSS parameters for the associated
station.  This is useful for debugging and verifying that the paramaters
are as expected.

[Exactly the same as before but bundled into a single message]

Signed-off-by: Paul Stewart <pstew@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-07 15:34:12 -04:00
John W. Linville
f3b3e36f4e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-next-2.6 2011-04-07 15:30:53 -04:00
Linus Torvalds
42933bac11 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
  Fix common misspellings
2011-04-07 11:14:49 -07:00
David S. Miller
a25a32ab71 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-04-06 13:34:15 -07:00
Peter Korsgaard
ec80bfcb68 dsa/mv88e6131: add support for mv88e6085 switch
The mv88e6085 is identical to the mv88e6095, except that all ports are
10/100 Mb/s, so use the existing setup code except for the cpu/dsa speed
selection in _setup_port().

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-06 13:32:53 -07:00
Neil Horman
47482f132a ipv6: Enable RFS sk_rxhash tracking for ipv6 sockets (v2)
properly record sk_rxhash in ipv6 sockets (v2)

Noticed while working on another project that flows to sockets which I had open
on a test systems weren't getting steered properly when I had RFS enabled.
Looking more closely I found that:

1) The affected sockets were all ipv6
2) They weren't getting steered because sk->sk_rxhash was never set from the
incomming skbs on that socket.

This was occuring because there are several points in the IPv4 tcp and udp code
which save the rxhash value when a new connection is established.  Those calls
to sock_rps_save_rxhash were never added to the corresponding ipv6 code paths.
This patch adds those calls.  Tested by myself to properly enable RFS
functionalty on ipv6.

Change notes:
v2:
	Filtered UDP to only arm RFS on bound sockets (Eric Dumazet)

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-06 13:07:09 -07:00
Oliver Hartkopp
1ca050d909 can: convert protocol handling to RCU
This patch removes spin_locks at CAN socket creation time by using RCU.

Inspired by the discussion with Kurt van Dijck and Eric Dumazet the RCU code
was partly derived from af_phonet.c

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-06 12:35:51 -07:00
David S. Miller
4c844d97d2 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6 2011-04-06 12:27:34 -07:00
Trond Myklebust
0867659fa3 Revert "net/sunrpc: Use static const char arrays"
This reverts commit 411b5e0561.

Olga Kornievskaia reports:

Problem: linux client mounting linux server using rc4-hmac-md5
enctype. gssd fails with create a context after receiving a reply from
the server.

Diagnose: putting printout statements in the server kernel and
kerberos libraries revealed that client and server derived different
integrity keys.

Server kernel code was at fault due the the commit

[aglo@skydive linux-pnfs]$ git show 411b5e0561

Trond: The problem is that since it relies on virt_to_page(), you cannot
call sg_set_buf() for data in the const section.

Reported-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org	[2.6.36+]
2011-04-06 11:18:17 -07:00
David S. Miller
9d93059497 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-04-05 14:21:11 -07:00
Linus Torvalds
d14f5b810b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  ipv6: Don't pass invalid dst_entry pointer to dst_release().
  mlx4: fix kfree on error path in new_steering_entry()
  tcp: len check is unnecessarily devastating, change to WARN_ON
  sctp: malloc enough room for asconf-ack chunk
  sctp: fix auth_hmacs field's length of struct sctp_cookie
  net: Fix dev dev_ethtool_get_rx_csum() for forced NETIF_F_RXCSUM
  usbnet: use eth%d name for known ethernet devices
  starfire: clean up dma_addr_t size test
  iwlegacy: fix bugs in change_interface
  carl9170: Fix tx aggregation problems with some clients
  iwl3945: disable hw scan by default
  wireless: rt2x00: rt2800usb.c add and identify ids
  iwl3945: do not deprecate software scan
  mac80211: fix aggregation frame release during timeout
  cfg80211: fix BSS double-unlinking (continued)
  cfg80211:: fix possible NULL pointer dereference
  mac80211: fix possible NULL pointer dereference
  mac80211: fix NULL pointer dereference in ieee80211_key_alloc()
  ath9k: fix a chip wakeup related crash in ath9k_start
  mac80211: fix a crash in minstrel_ht in HT mode with no supported MCS rates
  ...
2011-04-05 12:26:57 -07:00
Alexey Dobriyan
db940cb0db Bluetooth: convert net/bluetooth/ to kstrtox
Convert from strict_strto*() interfaces to kstrto*() interfaces.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-05 13:21:11 -03:00
Gustavo F. Padovan
e63a15ec0f Bluetooth: Use GFP_KERNEL in user context
The allocation in mgmt_control() code are in user context and not locked
by any spinlock, so it's not recommended the use of GFP_ATOMIC there.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-05 13:03:10 -03:00
Gustavo F. Padovan
1322901da5 Bluetooth: Don't use spin_lock_bh in user context
spin_lock() and spin_unlock() are more apropiated for user context.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-05 12:58:40 -03:00
Szymon Janc
fada4ac339 Bluetooth: Use kthread API in cmtp
kernel_thread() is a low-level implementation detail and
EXPORT_SYMBOL(kernel_thread) is scheduled for removal.
Use the <linux/kthread.h> API instead.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-05 12:40:47 -03:00
Szymon Janc
f4d7cd4a4c Bluetooth: Use kthread API in bnep
kernel_thread() is a low-level implementation detail and
EXPORT_SYMBOL(kernel_thread) is scheduled for removal.
Use the <linux/kthread.h> API instead.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-05 12:40:47 -03:00
Szymon Janc
aabf6f897e Bluetooth: Use kthread API in hidp
kernel_thread() is a low-level implementation detail and
EXPORT_SYMBOL(kernel_thread) is scheduled for removal.
Use the <linux/kthread.h> API instead.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-05 12:40:46 -03:00
Ben Hutchings
68f512f21a ethtool: Change ETHTOOL_PHYS_ID implementation to allow dropping RTNL
The ethtool ETHTOOL_PHYS_ID command runs for an arbitrarily long
period of time, holding the RTNL lock.  This blocks routing updates,
device enumeration, and various important operations that one might
want to keep running while hunting for the flashing LED.

We need to drop the RTNL lock during this operation, but currently the
core implementation is a thin wrapper around a driver operation and
drivers may well depend upon holding the lock.

Define a new driver operation 'set_phys_id' with an argument that sets
the ID indicator on/off/inactive/active (the last optional, for any
driver or firmware that prefers to handle blinking asynchronously).
When this is defined, the ethtool core drops the lock while waiting
and only acquires it around calls to this operation.

Deprecate the 'phys_id' operation in favour of this.  It can be
removed once all in-tree drivers are converted.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2011-04-05 15:12:01 +01:00
Tom Herbert
c6e1a0d12c net: Allow no-cache copy from user on transmit
This patch uses __copy_from_user_nocache on transmit to bypass data
cache for a performance improvement.  skb_add_data_nocache and
skb_copy_to_page_nocache can be called by sendmsg functions to use
this feature, initial support is in tcp_sendmsg.  This functionality is
configurable per device using ethtool.

Presumably, this feature would only be useful when the driver does
not touch the data.  The feature is turned on by default if a device
indicates that it does some form of checksum offload; it is off by
default for devices that do no checksum offload or indicate no checksum
is necessary.  For the former case copy-checksum is probably done
anyway, in the latter case the device is likely loopback in which case
the no cache copy is probably not beneficial.

This patch was tested using 200 instances of netperf TCP_RR with
1400 byte request and one byte reply.  Platform is 16 core AMD x86.

No-cache copy disabled:
   672703 tps, 97.13% utilization
   50/90/99% latency:244.31 484.205 1028.41

No-cache copy enabled:
   702113 tps, 96.16% utilization,
   50/90/99% latency 238.56 467.56 956.955

Using 14000 byte request and response sizes demonstrate the
effects more dramatically:

No-cache copy disabled:
   79571 tps, 34.34 %utlization
   50/90/95% latency 1584.46 2319.59 5001.76

No-cache copy enabled:
   83856 tps, 34.81% utilization
   50/90/95% latency 2508.42 2622.62 2735.88

Note especially the effect on latency tail (95th percentile).

This seems to provide a nice performance improvement and is
consistent in the tests I ran.  Presumably, this would provide
the greatest benfits in the presence of an application workload
stressing the cache and a lot of transmit data happening.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 22:30:30 -07:00
Simon Horman
e3f6a652fd IPVS: combine consecutive #ifdef CONFIG_PROC_FS blocks
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-04-05 11:25:03 +09:00
stephen hemminger
14f98f258f bridge: range check STP parameters
Apply restrictions on STP parameters based 802.1D 1998 standard.
   * Fixes missing locking in set path cost ioctl
   * Uses common code for both ioctl and sysfs

This is based on an earlier patch Sasikanth V but with overhaul.

Note:
1. It does NOT enforce the restriction on the relationship max_age and
   forward delay or hello time because in existing implementation these are
   set as independant operations.

2. If STP is disabled, there is no restriction on forward delay

3. No restriction on holding time because users use Linux code to act
   as hub or be sticky.

4. Although standard allow 0-255, Linux only allows 0-63 for port priority
   because more bits are reserved for port number.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:29 -07:00
stephen hemminger
bb900b27a2 bridge: allow creating bridge devices with netlink
Add netlink device ops to allow creating bridge device via netlink.
This works in a manner similar to vlan, macvlan and bonding.

Example:
  # ip link add link dev br0 type bridge
  # ip link del dev br0

The change required rearranging initializtion code to deal with
being called by create link. Most of the initialization happens
in br_dev_setup, but allocation of stats is done in ndo_init callback
to deal with allocation failure. Sysfs setup has to wait until
after the network device kobject is registered.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:28 -07:00
stephen hemminger
36fd2b63e3 bridge: allow creating/deleting fdb entries via netlink
Use RTM_NEWNEIGH and RTM_DELNEIGH to allow updating of entries
in bridge forwarding table. This allows manipulating static entries
which is not possible with existing tools.

Example (using bridge extensions to iproute2)
   # br fdb add 00:02:03:04:05:06 dev eth0

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:28 -07:00
stephen hemminger
b078f0df67 bridge: add netlink notification on forward entry changes
This allows applications to query and monitor bridge forwarding
table in the same method used for neighbor table. The forward table
entries are returned in same structure format as used by the ioctl.
If more information is desired in future, the netlink method is
extensible.

Example (using bridge extensions to iproute2)
  # br monitor

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:27 -07:00
stephen hemminger
664de48bb6 bridge: split rcu and no-rcu cases of fdb lookup
In some cases, look up of forward database entry is done with RCU;
and for others no RCU is needed because of locking. Split the two
cases into two differnt loops (and take off inline).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:26 -07:00
stephen hemminger
7cd8861ab0 bridge: track last used time in forwarding table
Adds tracking the last used time in forwarding table.
Rename ageing_timer to updated to better describe it.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:26 -07:00
stephen hemminger
03e9b64b89 bridge: change arguments to fdb_create
Later patch provides ability to create non-local static entry.
To make this easier move the updating of the flag values to
after the code that creates entry.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:25 -07:00
Johan Hedberg
a88a9652d2 Bluetooth: Add mgmt_remote_name event
This patch adds a new remote_name event to the Management interface
which is sent every time the name of a remote device is resolved (over
BR/EDR).

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-04 18:47:38 -03:00
Johan Hedberg
e17acd40f6 Bluetooth: Add mgmt_device_found event
This patch adds a device_found event to the Management interface. For
now the event only maps to BR/EDR inquiry result HCI events, but in the
future the plan is to also use it for the LE device discovery process.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-04 18:47:06 -03:00
Gustavo F. Padovan
1e429f3842 Bluetooth: Remove gfp_mask param from hci_reassembly()
It is unnecessary, once we are always in interrupt context.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-04-04 18:25:14 -03:00
Johannes Berg
26d59535aa mac80211: clean up station cleanup timer
We currently run this timer exactly once when
a new mac80211 device is registered, but that
is completely pointless since it will have no
work to do at all. Therefore, remove that and
also simplify some code using the timer.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-04 16:20:07 -04:00
Felix Fietkau
5f9f1812b6 mac80211: remove the dependency on crypto_blkcipher
The only thing that using crypto_blkcipher with ecb does over just using
arc4 directly is wrapping the encrypt/decrypt function into a for loop,
looping over each individual character.
To be able to do this, it pulls in around 40 kb worth of unnecessary
kernel modules (at least on a MIPS embedded device).
Using arc4 directly not only eliminates those dependencies, it also makes
the code smaller.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-04 16:20:00 -04:00
Felix Fietkau
1ed76487ce mac80211: fix suppressing probe responses in ad-hoc mode
The commit "mac80211: reply to directed probes in IBSS" changed ad-hoc
specific code to respond to unicast probe requests, even if
drv_tx_last_beacon returns false, however due to confusion over the
meaning of the IEEE80211_RX_RA_MATCH flag, it also unconditionally
enabled responding to multicast probe requests.
Fix this by explicitly checking for a multicast destination address
instead.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-04 16:19:59 -04:00
Ben Greear
279daf64c0 wifi: Add hwflags to debugfs.
Aids debugging wifi behaviour.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-04 16:18:33 -04:00
Boris Ostrovsky
738faca343 ipv6: Don't pass invalid dst_entry pointer to dst_release().
Make sure dst_release() is not called with error pointer. This is
similar to commit 4910ac6c52 ("ipv4:
Don't ip_rt_put() an error pointer in RAW sockets.").

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 13:07:26 -07:00
Helmut Schaa
fcf8bd3ba5 mac80211: Fix duplicate frames on cooked monitor
Cleaning the ieee80211_rx_data.flags field here is wrong, instead the
flags should be valid accross processing the frame on different
interfaces. Fix this by removing the incorrect flags=0 assignment.

Introduced in commit 554891e63a
(mac80211: move packet flags into packet).

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-04-04 15:22:11 -04:00
stephen hemminger
0545a30377 pkt_sched: QFQ - quick fair queue scheduler
This is an implementation of the Quick Fair Queue scheduler developed
by Fabio Checconi. The same algorithm is already implemented in ipfw
in FreeBSD. Fabio had an earlier version developed on Linux, I just
cleaned it up.  Thanks to Eric Dumazet for testing this under load.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 11:10:24 -07:00
David S. Miller
083dd8b8aa Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-04-04 10:39:12 -07:00
Florian Westphal
96120d86fe netfilter: xt_conntrack: fix inverted conntrack direction test
--ctdir ORIGINAL matches REPLY packets, and vv:

userspace sets "invert_flags &= ~XT_CONNTRACK_DIRECTION" in ORIGINAL
case.

Thus: (CTINFO2DIR(ctinfo) == IP_CT_DIR_ORIGINAL) ^
      !!(info->invert_flags & XT_CONNTRACK_DIRECTION))

yields "1 ^ 0", which is true -> returns false.

Reproducer:
iptables -I OUTPUT 1 -p tcp --syn -m conntrack --ctdir ORIGINAL

Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 17:06:21 +02:00
Eric Dumazet
7f5c6d4f66 netfilter: get rid of atomic ops in fast path
We currently use a percpu spinlock to 'protect' rule bytes/packets
counters, after various attempts to use RCU instead.

Lately we added a seqlock so that get_counters() can run without
blocking BH or 'writers'. But we really only need the seqcount in it.

Spinlock itself is only locked by the current/owner cpu, so we can
remove it completely.

This cleanups api, using correct 'writer' vs 'reader' semantic.

At replace time, the get_counters() call makes sure all cpus are done
using the old table.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 17:04:03 +02:00
Florian Westphal
b7225041e9 netfilter: xt_addrtype: replace rt6_lookup with nf_afinfo->route
This avoids pulling in the ipv6 module when using (ipv4-only) iptables
-m addrtype.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 17:01:43 +02:00
Florian Westphal
0fae2e7740 netfilter: af_info: add 'strict' parameter to limit lookup to .oif
ipv6 fib lookup can set RT6_LOOKUP_F_IFACE flag to restrict search
to an interface, but this flag cannot be set via struct flowi.

Also, it cannot be set via ip6_route_output: this function uses the
passed sock struct to determine if this flag is required
(by testing for nonzero sk_bound_dev_if).

Work around this by passing in an artificial struct sk in case
'strict' argument is true.

This is required to replace the rt6_lookup call in xt_addrtype.c with
nf_afinfo->route().

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 17:00:54 +02:00
Florian Westphal
31ad3dd64e netfilter: af_info: add network namespace parameter to route hook
This is required to eventually replace the rt6_lookup call in
xt_addrtype.c with nf_afinfo->route().

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 16:56:29 +02:00
Hans Schillstrom
a09d19779f IPVS: fix NULL ptr dereference in ip_vs_ctl.c ip_vs_genl_dump_daemons()
ipvsadm -ln --daemon will trigger a Null pointer exception because
ip_vs_genl_dump_daemons() uses skb_net() instead of skb_sknet().

To prevent others from NULL ptr a check is made in ip_vs.h skb_net().

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 15:25:18 +02:00
David Sterba
b4232a2277 netfilter: h323: bug in parsing of ASN1 SEQOF field
Static analyzer of clang found a dead store which appears to be a bug in
reading count of items in SEQOF field, only the lower byte of word is
stored. This may lead to corrupted read and communication shutdown.

The bug has been in the module since it's first inclusion into linux
kernel.

[Patrick: the bug is real, but without practical consequence since the
 largest amount of sequence-of members we parse is 30.]

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 15:21:02 +02:00
Jozsef Kadlecsik
2f9f28b212 netfilter: ipset: references are protected by rwlock instead of mutex
The timeout variant of the list:set type must reference the member sets.
However, its garbage collector runs at timer interrupt so the mutex
protection of the references is a no go. Therefore the reference protection
is converted to rwlock.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 15:19:25 +02:00
Jozsef Kadlecsik
512d06b5b6 netfilter: ipset: list:set timeout variant fixes
- the timeout value was actually not set
- the garbage collector was broken

The variant is fixed, the tests to the ipset testsuite are added.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-04-04 15:18:45 +02:00
Michał Mirosław
8a0427bb68 vlan: convert VLAN devices to use ndo_fix_features()
Note: get_flags was actually broken, because it should return the
flags capped with vlan_features. This is now done implicitly by
limiting netdev->hw_features.

RX checksumming offload control is (and was) broken, as there was no way
before to say whether it's done for tagged packets.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-02 22:49:12 -07:00
Michał Mirosław
6cb6a27c45 net: Call netdev_features_change() from netdev_update_features()
Issue FEAT_CHANGE notification when features are changed by
netdev_update_features().  This will allow changes made by extra constraints
on e.g. MTU change to be properly propagated like changes via ethtool.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-02 22:48:47 -07:00
Ilpo Järvinen
2fceec1337 tcp: len check is unnecessarily devastating, change to WARN_ON
All callers are prepared for alloc failures anyway, so this error
can safely be boomeranged to the callers domain without super
bad consequences. ...At worst the connection might go into a state
where each RTO tries to (unsuccessfully) re-fragment with such
a mis-sized value and eventually dies.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-01 21:47:41 -07:00
Wei Yongjun
2cab86bee8 sctp: malloc enough room for asconf-ack chunk
Sometime the ASCONF_ACK parameters can equal to the fourfold of
ASCONF parameters, this only happend in some special case:

  ASCONF parameter is :
    Unrecognized Parameter (4 bytes)
  ASCONF_ACK parameter should be:
    Error Cause Indication parameter (8 bytes header)
     + Error Cause (4 bytes header)
       + Unrecognized Parameter (4bytes)

Four 4bytes Unrecognized Parameters in ASCONF chunk will cause panic.

Pid: 0, comm: swapper Not tainted 2.6.38-next+ #22 Bochs Bochs
EIP: 0060:[<c0717eae>] EFLAGS: 00010246 CPU: 0
EIP is at skb_put+0x60/0x70
EAX: 00000077 EBX: c09060e2 ECX: dec1dc30 EDX: c09469c0
ESI: 00000000 EDI: de3c8d40 EBP: dec1dc58 ESP: dec1dc2c
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper (pid: 0, ti=dec1c000 task=c09aef20 task.ti=c0980000)
Stack:
 c09469c0 e1894fa4 00000044 00000004 de3c8d00 de3c8d00 de3c8d44 de3c8d40
 c09060e2 de25dd80 de3c8d40 dec1dc7c e1894fa4 dec1dcb0 00000040 00000004
 00000000 00000800 00000004 00000004 dec1dce0 e1895a2b dec1dcb4 de25d960
Call Trace:
 [<e1894fa4>] ? sctp_addto_chunk+0x4e/0x89 [sctp]
 [<e1894fa4>] sctp_addto_chunk+0x4e/0x89 [sctp]
 [<e1895a2b>] sctp_process_asconf+0x32f/0x3d1 [sctp]
 [<e188d554>] sctp_sf_do_asconf+0xf8/0x173 [sctp]
 [<e1890b02>] sctp_do_sm+0xb8/0x159 [sctp]
 [<e18a2248>] ? sctp_cname+0x0/0x52 [sctp]
 [<e189392d>] sctp_assoc_bh_rcv+0xac/0xe3 [sctp]
 [<e1897d76>] sctp_inq_push+0x2d/0x30 [sctp]
 [<e18a21b2>] sctp_rcv+0x7a7/0x83d [sctp]
 [<c077a95c>] ? ipv4_confirm+0x118/0x125
 [<c073a970>] ? nf_iterate+0x34/0x62
 [<c074789d>] ? ip_local_deliver_finish+0x0/0x194
 [<c074789d>] ? ip_local_deliver_finish+0x0/0x194
 [<c0747992>] ip_local_deliver_finish+0xf5/0x194
 [<c074789d>] ? ip_local_deliver_finish+0x0/0x194
 [<c0747a6e>] NF_HOOK.clone.1+0x3d/0x44
 [<c0747ab3>] ip_local_deliver+0x3e/0x44
 [<c074789d>] ? ip_local_deliver_finish+0x0/0x194
 [<c074775c>] ip_rcv_finish+0x29f/0x2c7
 [<c07474bd>] ? ip_rcv_finish+0x0/0x2c7
 [<c0747a6e>] NF_HOOK.clone.1+0x3d/0x44
 [<c0747cae>] ip_rcv+0x1f5/0x233
 [<c07474bd>] ? ip_rcv_finish+0x0/0x2c7
 [<c071dce3>] __netif_receive_skb+0x310/0x336
 [<c07221f3>] netif_receive_skb+0x4b/0x51
 [<e0a4ed3d>] cp_rx_poll+0x1e7/0x29c [8139cp]
 [<c072275e>] net_rx_action+0x65/0x13a
 [<c0445a54>] __do_softirq+0xa1/0x149
 [<c04459b3>] ? __do_softirq+0x0/0x149
 <IRQ>
 [<c0445891>] ? irq_exit+0x37/0x72
 [<c040a7e9>] ? do_IRQ+0x81/0x95
 [<c07b3670>] ? common_interrupt+0x30/0x38
 [<c0428058>] ? native_safe_halt+0xa/0xc
 [<c040f5d7>] ? default_idle+0x58/0x92
 [<c0408fb0>] ? cpu_idle+0x96/0xb2
 [<c0797989>] ? rest_init+0x5d/0x5f
 [<c09fd90c>] ? start_kernel+0x34b/0x350
 [<c09fd0cb>] ? i386_start_kernel+0xba/0xc1

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-01 21:45:51 -07:00
David S. Miller
5e58e5283a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-04-01 17:15:25 -07:00
Linus Torvalds
84daeb09ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  appletalk: Fix OOPS in atalk_release().
  mlx4: Fixing bad size of event queue buffer
  mlx4: Fixing use after free
  bonding:typo in comment
  sctp: Pass __GFP_NOWARN to hash table allocation attempts.
  connector: convert to synchronous netlink message processing
  fib: add rtnl locking in ip_fib_net_exit
  atm/solos-pci: Don't flap VCs when carrier state changes
  atm/solos-pci: Don't include frame pseudo-header on transmit hex-dump
  atm/solos-pci: Use VPI.VCI notation uniformly.
  Atheros, atl2: Fix mem leaks in error paths of atl2_set_eeprom
  netdev: fix mtu check when TSO is enabled
  net/usb: Ethernet quirks for the LG-VL600 4G modem
  phylib: phy_attach_direct: phy_init_hw can fail, add cleanup
  bridge: mcast snooping, fix length check of snooped MLDv1/2
  via-ircc: Pass PCI device pointer to dma_{alloc, free}_coherent()
  via-ircc: Use pci_{get, set}_drvdata() instead of static pointer variable
  net: gre: provide multicast mappings for ipv4 and ipv6
  bridge: Fix compilation warning in function br_stp_recalculate_bridge_id()
  net: Fix warnings caused by MAX_SKB_FRAGS change.
2011-04-01 08:53:50 -07:00
David S. Miller
c100c8f4c3 appletalk: Fix OOPS in atalk_release().
Commit 60d9f461a2 ("appletalk: remove
the BKL") added a dereference of "sk" before checking for NULL in
atalk_release().

Guard the code block completely, rather than partially, with the
NULL check.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 18:59:10 -07:00
Gustavo F. Padovan
220b881a77 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-2.6 2011-03-31 16:26:01 -03:00
Thomas Gleixner
6f5ef998b7 Bluetooth: Fix warning with hci_cmd_timer
After we made debugobjects working again, we got the following:

WARNING: at lib/debugobjects.c:262 debug_print_object+0x8e/0xb0()
Hardware name: System Product Name
ODEBUG: free active (active state 0) object type: timer_list hint: hci_cmd_timer+0x0/0x60
Pid: 2125, comm: dmsetup Tainted: G        W   2.6.38-06707-gc62b389 #110375
Call Trace:
 [<ffffffff8104700a>] warn_slowpath_common+0x7a/0xb0
 [<ffffffff810470b6>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff812d3a5e>] debug_print_object+0x8e/0xb0
 [<ffffffff81bd8810>] ? hci_cmd_timer+0x0/0x60
 [<ffffffff812d4685>] debug_check_no_obj_freed+0x125/0x230
 [<ffffffff810f1063>] ? check_object+0xb3/0x2b0
 [<ffffffff810f3630>] kfree+0x150/0x190
 [<ffffffff81be4d06>] ? bt_host_release+0x16/0x20
 [<ffffffff81be4d06>] bt_host_release+0x16/0x20
 [<ffffffff813a1907>] device_release+0x27/0xa0
 [<ffffffff812c519c>] kobject_release+0x4c/0xa0
 [<ffffffff812c5150>] ? kobject_release+0x0/0xa0
 [<ffffffff812c61f6>] kref_put+0x36/0x70
 [<ffffffff812c4d37>] kobject_put+0x27/0x60
 [<ffffffff813a21f7>] put_device+0x17/0x20
 [<ffffffff81bda4f9>] hci_free_dev+0x29/0x30
 [<ffffffff81928be6>] vhci_release+0x36/0x70
 [<ffffffff810fb366>] fput+0xd6/0x1f0
 [<ffffffff810f8fe6>] filp_close+0x66/0x90
 [<ffffffff810f90a9>] sys_close+0x99/0xf0
 [<ffffffff81d4c96b>] system_call_fastpath+0x16/0x1b

That timer was introduced with commit 6bd32326cda(Bluetooth: Use
proper timer for hci command timout)

Timer seems to be running when the thing is closed. Removing the timer
unconditionally fixes the problem. And yes, it needs to be fixed
before the HCI_UP check.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:25:26 -03:00
Andrei Emeltchenko
34bd0273b6 Bluetooth: delete hanging L2CAP channel
Sometimes L2CAP connection remains hanging. Make sure that
L2CAP channel is deleted.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:25:26 -03:00
Johan Hedberg
08ba53824a Bluetooth: Fix missing hci_dev_lock_bh in user_confirm_reply
The code was correctly calling _unlock at the end of the function but
there was no actual _lock call anywhere.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:25:25 -03:00
Gustavo F. Padovan
105721328f Bluetooth: Fix HCI_RESET command synchronization
We can't send new commands before a cmd_complete for the HCI_RESET command
shows up.

Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Reported-by: Justin P. Mattock <justinmattock@gmail.com>
Reported-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Ed Tomlinson <edt@aei.ca>
2011-03-31 14:25:25 -03:00
Suraj Sumangala
23e9fde2b3 Bluetooth: Increment unacked_frames count only the first transmit
This patch lets 'l2cap_pinfo.unacked_frames' be incremented only
the first time a frame is transmitted.

Previously it was being incremented for retransmitted packets
too resulting the value to cross the transmit window size.

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:25:25 -03:00
Johan Hedberg
80a1e1dbf6 Bluetooth: Add local Extended Inquiry Response (EIR) support
This patch adds automated creation of the local EIR data based on what
16-bit UUIDs are registered and what the device name is. This should
cover the majority use cases, however things like 32/128-bit UUIDs, TX
power and Device ID will need to be added later to be on par with what
bluetoothd is capable of doing (without the Management interface).

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:58 -03:00
Andrei Emeltchenko
e90165be9a Bluetooth: check L2CAP info_rsp ident and state
Information requests/responses are unbound to L2CAP channel. Patch
fixes issue arising when two devices connects at the same time to
each other. This way we do not process out of the context messages.
We are safe dropping info_rsp since info_timer is left running.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:58 -03:00
Gustavo F. Padovan
d1010240fa Bluetooth: Move bt_accept_enqueue() to outside __l2cap_chan_add
bt_accept_enqueue() is not really a channel action, so do it outside.
This patch is part of a set of patches to create an struct l2cap_chan to
have a clear separation between the struct sock and the L2CAP channel
stuff.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:57 -03:00
Szymon Janc
ce85ee13e6 Bluetooth: Enable support for out of band association model
If remote side reports oob availability or we are pairing initiator
use oob data for pairing if available.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:57 -03:00
Szymon Janc
2763eda6cc Bluetooth: Add add/remove_remote_oob_data management commands
This patch adds commands to add and remove remote OOB data to the managment
interface. Remote data is stored in kernel and can be used by corresponding
HCI commands and events when needed.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:57 -03:00
Szymon Janc
c35938b2f5 Bluetooth: Add read_local_oob_data management command
This patch adds a command to read local OOB data to the managment interface.
The command maps directly to the Read Local OOB Data HCI command.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:57 -03:00
Szymon Janc
8fce6357a9 Bluetooth: Allow for NULL data in mgmt_pending_add
Since index is in mgmt_hdr it is possible to have mgmt command with
no parameters that still needs to add itself to pending list.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:57 -03:00
Szymon Janc
c68fb7ff29 Bluetooth: Rename cmd to param in pending_cmd
This field holds not whole command but only command specific
parameters.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:56 -03:00
Szymon Janc
e0e185efba Bluetooth: Fix checkpatch error in cmtp.h
Do not use C99 // comments.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:56 -03:00
Szymon Janc
ffd13320aa Bluetooth: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
As warned by checkpatch.pl, use #include <linux/uaccess.h> instead of
<asm/uaccess.h>

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:56 -03:00
Szymon Janc
58aac468be Bluetooth: Do not use assignments in IF conditions
Fix checkpatch warnings concerning assignments in if conditions.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:56 -03:00
Szymon Janc
17f09a7e4e Bluetooth: Fix checkpatch errors, code style issues and typos in hidp
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:56 -03:00
Szymon Janc
8c20aa9ffc Bluetooth: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
As warned by checkpatch.pl, use #include <linux/uaccess.h> instead of
<asm/uaccess.h>

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:55 -03:00
Szymon Janc
3aad75a128 Bluetooth: Fix checkpatch errors and some code style issues in bnep
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:55 -03:00
Szymon Janc
a3d9bd4c00 Bluetooth: Opencode macros in bnep/core.c
BNEP_RX_TYPES and INCA macros have only one user each and don't provide
any benefits compared to opencoding them.

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:55 -03:00
Gustavo F. Padovan
2c6d1a2eec Bluetooth: Improve error message on wrong link type
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:54 -03:00
Johan Hedberg
b312b161ec Bluetooth: mgmt: Add support for setting the local name
This patch adds a new set_local_name management command as well as a
local_name_changed management event. With these user space can both
change the local name as well as monitor changes to it by others.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:54 -03:00
Johan Hedberg
dc4fe30b86 Bluetooth: mgmt: Add local name information to read_info reply
This patch adds the name of the adapter to the reply of the read_info
management command.

The management messages reserve 249 bytes for the name instead of 248
(like in the HCI spec) so that there is always a guarantee that it is
nul-terminated. That way it can safely be passed onto string
manipulation functions.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:54 -03:00
Johan Hedberg
1f6c6378c5 Bluetooth: Add define for the maximum name length on HCI level
This patch adds a clear define for the maximum device name length in HCI
messages and thereby avoids magic numbers in the code.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:54 -03:00
Gustavo F. Padovan
f0681a68dd Bluetooth: remove unnecessary function declaration
hci_notify() doesn't need declaration first.

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-31 14:22:53 -03:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
David S. Miller
c0951cbcfd ipv4: Use flowi4_init_output() in udp_sendmsg()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 04:54:27 -07:00
David S. Miller
1bba6ffeeb ipv4: Use flowi4_init_output() in cookie_v4_check()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 04:54:08 -07:00
David S. Miller
ef164ae356 ipv4: Use flowi4_init_output() in raw_sendmsg()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 04:53:51 -07:00
David S. Miller
538de0e01f ipv4: Use flowi4_init_output() in ip_send_reply()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 04:53:37 -07:00
David S. Miller
e79d9bc7ea ipv4: Use flowi4_init_output() in inet_connection_sock.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 04:53:20 -07:00
Eric Dumazet
0a5c047507 fib: add __rcu annotations
Add __rcu annotations and lockdep checks.

Add const qualifiers

node_parent() and node_parent_rcu() can use
rcu_dereference_index_check()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-31 01:51:35 -07:00
David S. Miller
a84b50ceb7 sctp: Pass __GFP_NOWARN to hash table allocation attempts.
Like DCCP and other similar pieces of code, there are mechanisms
here to try allocating smaller hash tables if the allocation
fails.  So pass in __GFP_NOWARN like the others do instead of
emitting a scary message.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30 17:51:36 -07:00
Eric Dumazet
e2666f8495 fib: add rtnl locking in ip_fib_net_exit
Daniel J Blueman reported a lockdep splat in trie_firstleaf(), caused by
RTNL being not locked before a call to fib_table_flush()

Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30 16:57:46 -07:00
Philip A. Prindeville
c031235b39 atm/solos-pci: Don't flap VCs when carrier state changes
Don't flap VCs when carrier state changes; higher-level protocols
can detect loss of connectivity and act accordingly. This is more
consistent with how other network interfaces work.

We no longer use release_vccs() so we can delete it.

release_vccs() was duplicated from net/atm/common.c; make the
corresponding function exported, since other code duplicates it
and could leverage it if it were public.

Signed-off-by: Philip A. Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30 16:53:38 -07:00
Jouni Malinen
ec15e68ba6 cfg80211: Add nl80211 event for deletion of a station entry
Indicate an NL80211_CMD_DEL_STATION event when a station entry in
mac80211 is deleted to match with the NL80211_CMD_NEW_STATION event
that is used when the entry was added. This is needed, e.g., to allow
user space to remove a peer from RSN IBSS Authenticator state machine
to avoid re-authentication and re-keying delays when the peer is not
reachable anymore.

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-30 14:15:18 -04:00
Helmut Schaa
be7974aa10 mac80211: Minor optimization in tx status handling
ieee80211_tx_status iterates over all tx rates the driver reports back
in order to
1) mark tx rates as invalid if the driver cannot have tried that rate
2) find the actually used tx rate for the final retransmission

By leaving the for loop when the first invalid rate index is found we
can move the rates_idx assignment after the loop and therefore save
a few assignments and conditionals.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-30 14:15:16 -04:00
Johannes Berg
c835b21405 mac80211: add comment about reordering
Took me a minute to figure this out, maybe
it's better documented...

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-30 14:15:12 -04:00
Linus Torvalds
50f3515828 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: Create a new key type "ceph".
  libceph: Get secret from the kernel keys api when mounting with key=NAME.
  ceph: Move secret key parsing earlier.
  libceph: fix null dereference when unregistering linger requests
  ceph: unlock on error in ceph_osdc_start_request()
  ceph: fix possible NULL pointer dereference
  ceph: flush msgr_wq during mds_client shutdown
2011-03-30 09:46:09 -07:00
Daniel Lezcano
79b569f0ec netdev: fix mtu check when TSO is enabled
In case the device where is coming from the packet has TSO enabled,
we should not check the mtu size value as this one could be bigger
than the expected value.

This is the case for the macvlan driver when the lower device has
TSO enabled. The macvlan inherit this feature and forward the packets
without fragmenting them. Then the packets go through dev_forward_skb
and are dropped. This patch fix this by checking TSO is not enabled
when we want to check the mtu size.

Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30 02:42:17 -07:00
Linus Lüssing
ff9a57a62a bridge: mcast snooping, fix length check of snooped MLDv1/2
"len = ntohs(ip6h->payload_len)" does not include the length of the ipv6
header itself, which the rest of this function assumes, though.

This leads to a length check less restrictive as it should be in the
following line for one thing. For another, it very likely leads to an
integer underrun when substracting the offset and therefore to a very
high new value of 'len' due to its unsignedness. This will ultimately
lead to the pskb_trim_rcsum() practically never being called, even in
the cases where it should.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30 02:28:20 -07:00
Timo Teräs
93ca3bb5df net: gre: provide multicast mappings for ipv4 and ipv6
My commit 6d55cb91a0 (gre: fix hard header destination
address checking) broke multicast.

The reason is that ip_gre used to get ipgre_header() calls with
zero destination if we have NOARP or multicast destination. Instead
the actual target was decided at ipgre_tunnel_xmit() time based on
per-protocol dissection.

Instead of allowing the "abuse" of ->header() calls with invalid
destination, this creates multicast mappings for ip_gre. This also
fixes "ip neigh show nud noarp" to display the proper multicast
mappings used by the gre device.

Reported-by: Doug Kehn <rdkehn@yahoo.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Doug Kehn <rdkehn@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30 00:10:47 -07:00
Balaji G
1459a3cc51 bridge: Fix compilation warning in function br_stp_recalculate_bridge_id()
net/bridge/br_stp_if.c: In function ‘br_stp_recalculate_bridge_id’:
net/bridge/br_stp_if.c:216:3: warning: ‘return’ with no value, in function returning non-void

Signed-off-by: G.Balaji <balajig81@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-29 23:37:23 -07:00
Tommi Virtanen
4b2a58abd1 libceph: Create a new key type "ceph".
This allows us to use existence of the key type as a feature test,
from userspace.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-29 12:11:24 -07:00
Tommi Virtanen
e2c3d29b42 libceph: Get secret from the kernel keys api when mounting with key=NAME.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-29 12:11:19 -07:00
Tommi Virtanen
8323c3aa74 ceph: Move secret key parsing earlier.
This makes the base64 logic be contained in mount option parsing,
and prepares us for replacing the homebew key management with the
kernel key retention service.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-29 12:11:16 -07:00
Sage Weil
fbdb919048 libceph: fix null dereference when unregistering linger requests
We should only clear r_osd if we are neither registered as a linger or a
regular request.  We may unregister as a linger while still registered as
a regular request (e.g., in reset_osd).  Incorrectly clearing r_osd there
leads to a null pointer dereference in __send_request.

Also simplify the parallel check in __unregister_request() where we just
removed r_osd_item and know it's empty.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-29 12:11:06 -07:00
Dan Carpenter
234af26ff1 ceph: unlock on error in ceph_osdc_start_request()
There was a missing unlock on the error path if __map_request() failed.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-29 08:59:54 -07:00
Linus Torvalds
cb1817b373 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
  xfrm: Restrict extended sequence numbers to esp
  xfrm: Check for esn buffer len in xfrm_new_ae
  xfrm: Assign esn pointers when cloning a state
  xfrm: Move the test on replay window size into the replay check functions
  netdev: bfin_mac: document TE setting in RMII modes
  drivers net: Fix declaration ordering in inline functions.
  cxgb3: Apply interrupt coalescing settings to all queues
  net: Always allocate at least 16 skb frags regardless of page size
  ipv4: Don't ip_rt_put() an error pointer in RAW sockets.
  net: fix ethtool->set_flags not intended -EINVAL return value
  mlx4_en: Fix loss of promiscuity
  tg3: Fix inline keyword usage
  tg3: use <linux/io.h> and <linux/uaccess.h> instead <asm/io.h> and <asm/uaccess.h>
  net: use CHECKSUM_NONE instead of magic number
  Net / jme: Do not use legacy PCI power management
  myri10ge: small rx_done refactoring
  bridge: notify applications if address of bridge device changes
  ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in ip_options_echo()
  can: c_can: Fix tx_bytes accounting
  can: c_can_platform: fix irq check in probe
  ...
2011-03-29 07:41:33 -07:00
Steffen Klassert
02aadf72fe xfrm: Restrict extended sequence numbers to esp
The IPsec extended sequence numbers are fully implemented just for
esp. So restrict the usage to esp until other protocols have
support too.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-28 23:34:53 -07:00
Steffen Klassert
e2b19125e9 xfrm: Check for esn buffer len in xfrm_new_ae
In xfrm_new_ae() we may overwrite the allocated esn replay state
buffer with a wrong size. So check that the new size matches the
original allocated size and return an error if this is not the case.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-28 23:34:52 -07:00
Steffen Klassert
af2f464e32 xfrm: Assign esn pointers when cloning a state
When we clone a xfrm state we have to assign the replay_esn
and the preplay_esn pointers to the state if we use the
new replay detection method. To this end, we add a
xfrm_replay_clone() function that allocates memory for
the replay detection and takes over the necessary values
from the original state.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-28 23:34:52 -07:00
Steffen Klassert
36ae0148db xfrm: Move the test on replay window size into the replay check functions
As it is, the replay check is just performed if the replay window of the
legacy implementation is nonzero. So we move the test on a nonzero replay
window inside the replay check functions to be sure we are testing for the
right implementation.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-28 23:34:51 -07:00
David S. Miller
4910ac6c52 ipv4: Don't ip_rt_put() an error pointer in RAW sockets.
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-28 16:51:15 -07:00
Daniel Halperin
499fe9a419 mac80211: fix aggregation frame release during timeout
Suppose the aggregation reorder buffer looks like this:

x-T-R1-y-R2,

where x and y are frames that have not been received, T is a received
frame that has timed out, and R1,R2 are received frames that have not
yet timed out. The proper behavior in this scenario is to move the
window past x (skipping it), release T and R1, and leave the window at y
until y is received or R2 times out.

As written, this code will instead leave the window at R1, because it
has not yet timed out. Fix this by exiting the reorder loop only when
the frame that has not timed out AND there are skipped frames earlier in
the current valid window.

Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-28 15:42:02 -04:00
Juuso Oikarinen
2b78ac9bfc cfg80211: fix BSS double-unlinking (continued)
This patch adds to the fix "fix BSS double-unlinking"
(commit 3207390a8b) by Johannes Berg.

It turns out, that the double-unlinking scenario can also occur if expired
BSS elements are removed whilst an interface is performing association.

To work around that, replace list_del with list_del_init also in the
"cfg80211_bss_expire" function, so that the check for whether the BSS still is
in the list works correctly in cfg80211_unlink_bss.

Signed-off-by: Juuso Oikarinen <juuso.oikarinen@nokia.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-28 15:42:02 -04:00
Mariusz Kozlowski
bef9bacc4e cfg80211:: fix possible NULL pointer dereference
In cfg80211_inform_bss_frame() wiphy is first dereferenced on privsz
initialisation and then it is checked for NULL. This patch fixes that.

Signed-off-by: Mariusz Kozlowski <mk@lab.zgora.pl>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-28 15:42:02 -04:00
Mariusz Kozlowski
67aa030c0d mac80211: fix possible NULL pointer dereference
This patch moves 'key' dereference after BUG_ON(!key) so that when key is NULL
we will see proper trace instead of oops.

Signed-off-by: Mariusz Kozlowski <mk@lab.zgora.pl>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-28 15:42:02 -04:00
Petr Štetiar
1f951a7f8b mac80211: fix NULL pointer dereference in ieee80211_key_alloc()
The ieee80211_key struct can be kfree()d several times in the function, for
example if some of the key setup functions fails beforehand, but there's no
check if the struct is still valid before we call memcpy() and INIT_LIST_HEAD()
on it.  In some cases (like it was in my case), if there's missing aes-generic
module it could lead to the following kernel OOPS:

	Unable to handle kernel NULL pointer dereference at virtual address 0000018c
	....
	PC is at memcpy+0x80/0x29c
	...
	Backtrace:
	[<bf11c5e4>] (ieee80211_key_alloc+0x0/0x234 [mac80211]) from [<bf1148b4>] (ieee80211_add_key+0x70/0x12c [mac80211])
	[<bf114844>] (ieee80211_add_key+0x0/0x12c [mac80211]) from [<bf070cc0>] (__cfg80211_set_encryption+0x2a8/0x464 [cfg80211])

Signed-off-by: Petr Štetiar <ynezz@true.cz>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-28 15:42:02 -04:00
Felix Fietkau
4dc217df68 mac80211: fix a crash in minstrel_ht in HT mode with no supported MCS rates
When a client connects in HT mode but does not provide any valid MCS
rates, the function that finds the next sample rate gets stuck in an
infinite loop.
Fix this by falling back to legacy rates if no usable MCS rates are found.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-28 15:42:01 -04:00
Stanislaw Gruszka
673e63c688 net: fix ethtool->set_flags not intended -EINVAL return value
After commit d5dbda2380 "ethtool: Add
support for vlan accleration.", drivers that have NETIF_F_HW_VLAN_TX,
and/or NETIF_F_HW_VLAN_RX feature, but do not allow enable/disable vlan
acceleration via ethtool set_flags, always return -EINVAL from that
function. Fix by returning -EINVAL only if requested features do not
match current settings and can not be changed by driver.

Change any driver that define ethtool->set_flags to use
ethtool_invalid_flags() to avoid similar problems in the future
(also on drivers that do not have the problem).

Tested with modified (to reproduce this bug) myri10ge driver.

Cc: stable@kernel.org # 2.6.37+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 23:35:24 -07:00
Cesar Eduardo Barros
3e49e6d520 net: use CHECKSUM_NONE instead of magic number
Two places in the kernel were doing skb->ip_summed = 0.

Change both to skb->ip_summed = CHECKSUM_NONE, which is more readable.

Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 23:35:05 -07:00
stephen hemminger
edf947f100 bridge: notify applications if address of bridge device changes
The mac address of the bridge device may be changed when a new interface
is added to the bridge. If this happens, then the bridge needs to call
the network notifiers to tickle any other systems that care. Since bridge
can be a module, this also means exporting the notifier function.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 23:35:02 -07:00
Jan Luebbe
8628bd8af7 ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in ip_options_echo()
The current handling of echoed IP timestamp options with prespecified
addresses is rather broken since the 2.2.x kernels. As far as i understand
it, it should behave like when originating packets.

Currently it will only timestamp the next free slot if:
 - there is space for *two* timestamps
 - some random data from the echoed packet taken as an IP is *not* a local IP

This first is caused by an off-by-one error. 'soffset' points to the next
free slot and so we only need to have 'soffset + 7 <= optlen'.

The second bug is using sptr as the start of the option, when it really is
set to 'skb_network_header(skb)'. I just use dptr instead which points to
the timestamp option.

Finally it would only timestamp for non-local IPs, which we shouldn't do.
So instead we exclude all unicast destinations, similar to what we do in
ip_options_compile().

Signed-off-by: Jan Luebbe <jluebbe@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 23:35:02 -07:00
Oliver Hartkopp
53914b6799 can: make struct proto const
can_ioctl is the only reason for struct proto to be non-const.
script/check-patch.pl suggests struct proto be const.

Setting the reference to the common can_ioctl() in all CAN protocols directly
removes the need to make the struct proto writable in af_can.c

Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 23:34:59 -07:00
Amerigo Wang
3b261ade42 net: remove useless comments in net/core/dev.c
The code itself can explain what it is doing, no need these comments.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 23:34:59 -07:00
Ben Hutchings
e0bccd315d rose: Add length checks to CALL_REQUEST parsing
Define some constant offsets for CALL_REQUEST based on the description
at <http://www.techfest.com/networking/wan/x25plp.htm> and the
definition of ROSE as using 10-digit (5-byte) addresses.  Use them
consistently.  Validate all implicit and explicit facilities lengths.
Validate the address length byte rather than either trusting or
assuming its value.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 17:59:04 -07:00
Dan Rosenberg
be20250c13 ROSE: prevent heap corruption with bad facilities
When parsing the FAC_NATIONAL_DIGIS facilities field, it's possible for
a remote host to provide more digipeaters than expected, resulting in
heap corruption.  Check against ROSE_MAX_DIGIS to prevent overflows, and
abort facilities parsing on failure.

Additionally, when parsing the FAC_CCITT_DEST_NSAP and
FAC_CCITT_SRC_NSAP facilities fields, a remote host can provide a length
of less than 10, resulting in an underflow in a memcpy size, causing a
kernel panic due to massive heap corruption.  A length of greater than
20 results in a stack overflow of the callsign array.  Abort facilities
parsing on these invalid length values.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 17:59:03 -07:00
Dan Rosenberg
d370af0ef7 irda: validate peer name and attribute lengths
Length fields provided by a peer for names and attributes may be longer
than the destination array sizes.  Validate lengths to prevent stack
buffer overflows.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 17:59:02 -07:00
Dan Rosenberg
d50e7e3604 irda: prevent heap corruption on invalid nickname
Invalid nicknames containing only spaces will result in an underflow in
a memcpy size calculation, subsequently destroying the heap and
panicking.

v2 also catches the case where the provided nickname is longer than the
buffer size, which can result in controllable heap corruption.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 17:59:02 -07:00
Steffen Klassert
e433430a0c dst: Clone child entry in skb_dst_pop
We clone the child entry in skb_dst_pop before we call
skb_dst_drop(). Otherwise we might kill the child right
before we return it to the caller.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 17:55:01 -07:00
Steffen Klassert
3bc07321cc xfrm: Force a dst refcount before entering the xfrm type handlers
Crypto requests might return asynchronous. In this case we leave
the rcu protected region, so force a refcount on the skb's
destination entry before we enter the xfrm type input/output
handlers.

This fixes a crash when a route is deleted whilst sending IPsec
data that is transformed by an asynchronous algorithm.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 17:55:01 -07:00
OGAWA Hirofumi
a271c5a0de NFS: Ensure that rpc_release_resources_task() can be called twice.
BUG: atomic_dec_and_test(): -1: atomic counter underflow at:
Pid: 2827, comm: mount.nfs Not tainted 2.6.38 #1
Call Trace:
 [<ffffffffa02223a0>] ? put_rpccred+0x44/0x14e [sunrpc]
 [<ffffffffa021bbe9>] ? rpc_ping+0x4e/0x58 [sunrpc]
 [<ffffffffa021c4a5>] ? rpc_create+0x481/0x4fc [sunrpc]
 [<ffffffffa022298a>] ? rpcauth_lookup_credcache+0xab/0x22d [sunrpc]
 [<ffffffffa028be8c>] ? nfs_create_rpc_client+0xa6/0xeb [nfs]
 [<ffffffffa028c660>] ? nfs4_set_client+0xc2/0x1f9 [nfs]
 [<ffffffffa028cd3c>] ? nfs4_create_server+0xf2/0x2a6 [nfs]
 [<ffffffffa0295d07>] ? nfs4_remote_mount+0x4e/0x14a [nfs]
 [<ffffffff810dd570>] ? vfs_kern_mount+0x6e/0x133
 [<ffffffffa029605a>] ? nfs_do_root_mount+0x76/0x95 [nfs]
 [<ffffffffa029643d>] ? nfs4_try_mount+0x56/0xaf [nfs]
 [<ffffffffa0297434>] ? nfs_get_sb+0x435/0x73c [nfs]
 [<ffffffff810dd59b>] ? vfs_kern_mount+0x99/0x133
 [<ffffffff810dd693>] ? do_kern_mount+0x48/0xd8
 [<ffffffff810f5b75>] ? do_mount+0x6da/0x741
 [<ffffffff810f5c5f>] ? sys_mount+0x83/0xc0
 [<ffffffff8100293b>] ? system_call_fastpath+0x16/0x1b

Well, so, I think this is real bug of nfs codes somewhere. With some
review, the code

rpc_call_sync()
    rpc_run_task
        rpc_execute()
            __rpc_execute()
                rpc_release_task()
                    rpc_release_resources_task()
                        put_rpccred()                <= release cred
    rpc_put_task
        rpc_do_put_task()
            rpc_release_resources_task()
                put_rpccred()                        <= release cred again

seems to be release cred unintendedly.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-27 17:55:36 +02:00
Mariusz Kozlowski
6b0ae4097c ceph: fix possible NULL pointer dereference
This patch fixes 'event_work' dereference before it is checked for NULL.

Signed-off-by: Mariusz Kozlowski <mk@lab.zgora.pl>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-26 13:41:20 -07:00
Linus Torvalds
00a2470546 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits)
  route: Take the right src and dst addresses in ip_route_newports
  ipv4: Fix nexthop caching wrt. scoping.
  ipv4: Invalidate nexthop cache nh_saddr more correctly.
  net: fix pch_gbe section mismatch warning
  ipv4: fix fib metrics
  mlx4_en: Removing HW info from ethtool -i report.
  net_sched: fix THROTTLED/RUNNING race
  drivers/net/a2065.c: Convert release_resource to release_region/release_mem_region
  drivers/net/ariadne.c: Convert release_resource to release_region/release_mem_region
  bonding: fix rx_handler locking
  myri10ge: fix rmmod crash
  mlx4_en: updated driver version to 1.5.4.1
  mlx4_en: Using blue flame support
  mlx4_core: reserve UARs for userspace consumers
  mlx4_core: maintain available field in bitmap allocator
  mlx4: Add blue flame support for kernel consumers
  mlx4_en: Enabling new steering
  mlx4: Add support for promiscuous mode in the new steering model.
  mlx4: generalization of multicast steering.
  mlx4_en: Reporting HW revision in ethtool -i
  ...
2011-03-25 21:02:22 -07:00
Julian Anastasov
1fbc784392 ipv4: do not ignore route errors
The "ipv4: Inline fib_semantic_match into check_leaf"
change forgets to return the route errors. check_leaf should
return the same results as fib_table_lookup.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-25 20:33:23 -07:00
Sage Weil
ef550f6f4f ceph: flush msgr_wq during mds_client shutdown
The release method for mds connections uses a backpointer to the
mds_client, so we need to flush the workqueue of any pending work (and
ceph_connection references) prior to freeing the mds_client.  This fixes
an oops easily triggered under UML by

 while true ; do mount ... ; umount ... ; done

Also fix an outdated comment: the flush in ceph_destroy_client only flushes
OSD connections out.  This bug is basically an artifact of the ceph ->
ceph+libceph conversion.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-25 13:27:48 -07:00
Linus Torvalds
40471856f2 Merge branch 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (28 commits)
  Cleanup XDR parsing for LAYOUTGET, GETDEVICEINFO
  NFSv4.1 convert layoutcommit sync to boolean
  NFSv4.1 pnfs_layoutcommit_inode fixes
  NFS: Determine initial mount security
  NFS: use secinfo when crossing mountpoints
  NFS: Add secinfo procedure
  NFS: lookup supports alternate client
  NFS: convert call_sync() to a function
  NFSv4.1 remove temp code that prevented ds commits
  NFSv4.1: layoutcommit
  NFSv4.1: filelayout driver specific code for COMMIT
  NFSv4.1: remove GETATTR from ds commits
  NFSv4.1: add generic layer hooks for pnfs COMMIT
  NFSv4.1: alloc and free commit_buckets
  NFSv4.1: shift filelayout_free_lseg
  NFSv4.1: pull out code from nfs_commit_release
  NFSv4.1: pull error handling out of nfs_commit_list
  NFSv4.1: add callback to nfs4_commit_done
  NFSv4.1: rearrange nfs_commit_rpcsetup
  NFSv4.1: don't send COMMIT to ds for data sync writes
  ...
2011-03-25 10:03:28 -07:00
David S. Miller
37e826c513 ipv4: Fix nexthop caching wrt. scoping.
Move the scope value out of the fib alias entries and into fib_info,
so that we always use the correct scope when recomputing the nexthop
cached source address.

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 18:06:47 -07:00
David S. Miller
436c3b66ec ipv4: Invalidate nexthop cache nh_saddr more correctly.
Any operation that:

1) Brings up an interface
2) Adds an IP address to an interface
3) Deletes an IP address from an interface

can potentially invalidate the nh_saddr value, requiring
it to be recomputed.

Perform the recomputation lazily using a generation ID.

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 17:42:21 -07:00
Trond Myklebust
0acd220192 Merge branch 'nfs-for-2.6.39' into nfs-for-next 2011-03-24 17:03:14 -04:00
Thomas Gleixner
b77dcf8460 Bluetooth: Fix warning with hci_cmd_timer
After we made debugobjects working again, we got the following:

WARNING: at lib/debugobjects.c:262 debug_print_object+0x8e/0xb0()
Hardware name: System Product Name
ODEBUG: free active (active state 0) object type: timer_list hint: hci_cmd_timer+0x0/0x60
Pid: 2125, comm: dmsetup Tainted: G        W   2.6.38-06707-gc62b389 #110375
Call Trace:
 [<ffffffff8104700a>] warn_slowpath_common+0x7a/0xb0
 [<ffffffff810470b6>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff812d3a5e>] debug_print_object+0x8e/0xb0
 [<ffffffff81bd8810>] ? hci_cmd_timer+0x0/0x60
 [<ffffffff812d4685>] debug_check_no_obj_freed+0x125/0x230
 [<ffffffff810f1063>] ? check_object+0xb3/0x2b0
 [<ffffffff810f3630>] kfree+0x150/0x190
 [<ffffffff81be4d06>] ? bt_host_release+0x16/0x20
 [<ffffffff81be4d06>] bt_host_release+0x16/0x20
 [<ffffffff813a1907>] device_release+0x27/0xa0
 [<ffffffff812c519c>] kobject_release+0x4c/0xa0
 [<ffffffff812c5150>] ? kobject_release+0x0/0xa0
 [<ffffffff812c61f6>] kref_put+0x36/0x70
 [<ffffffff812c4d37>] kobject_put+0x27/0x60
 [<ffffffff813a21f7>] put_device+0x17/0x20
 [<ffffffff81bda4f9>] hci_free_dev+0x29/0x30
 [<ffffffff81928be6>] vhci_release+0x36/0x70
 [<ffffffff810fb366>] fput+0xd6/0x1f0
 [<ffffffff810f8fe6>] filp_close+0x66/0x90
 [<ffffffff810f90a9>] sys_close+0x99/0xf0
 [<ffffffff81d4c96b>] system_call_fastpath+0x16/0x1b

That timer was introduced with commit 6bd32326cda(Bluetooth: Use
proper timer for hci command timout)

Timer seems to be running when the thing is closed. Removing the timer
unconditionally fixes the problem. And yes, it needs to be fixed
before the HCI_UP check.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-24 17:04:44 -03:00
Andrei Emeltchenko
a0cc9a1b57 Bluetooth: delete hanging L2CAP channel
Sometimes L2CAP connection remains hanging. Make sure that
L2CAP channel is deleted.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-24 17:04:44 -03:00
Johan Hedberg
6994ca5e8a Bluetooth: Fix missing hci_dev_lock_bh in user_confirm_reply
The code was correctly calling _unlock at the end of the function but
there was no actual _lock call anywhere.

Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-24 17:04:44 -03:00
Gustavo F. Padovan
f630cf0d54 Bluetooth: Fix HCI_RESET command synchronization
We can't send new commands before a cmd_complete for the HCI_RESET command
shows up.

Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Reported-by: Justin P. Mattock <justinmattock@gmail.com>
Reported-by: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Ed Tomlinson <edt@aei.ca>
2011-03-24 17:04:44 -03:00
Suraj Sumangala
52cb1c0d20 Bluetooth: Increment unacked_frames count only the first transmit
This patch lets 'l2cap_pinfo.unacked_frames' be incremented only
the first time a frame is transmitted.

Previously it was being incremented for retransmitted packets
too resulting the value to cross the transmit window size.

Signed-off-by: Suraj Sumangala <suraj@atheros.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-03-24 17:04:43 -03:00
Eric Dumazet
fcd13f42c9 ipv4: fix fib metrics
Alessandro Suardi reported that we could not change route metrics :

ip ro change default .... advmss 1400

This regression came with commit 9c150e82ac (Allocate fib metrics
dynamically). fib_metrics is no longer an array, but a pointer to an
array.

Reported-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Alessandro Suardi <alessandro.suardi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-24 11:49:54 -07:00
Bryan Schumaker
8f70e95f9f NFS: Determine initial mount security
When sec=<something> is not presented as a mount option,
we should attempt to determine what security flavor the
server is using.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:42 -04:00
Bryan Schumaker
7ebb931598 NFS: use secinfo when crossing mountpoints
A submount may use different security than the parent
mount does.  We should figure out what sec flavor the
submount uses at mount time.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:42 -04:00
Linus Torvalds
dc87c55120 Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.39' of git://linux-nfs.org/~bfields/linux:
  SUNRPC: Remove resource leak in svc_rdma_send_error()
  nfsd: wrong index used in inner loop
  nfsd4: fix comment and remove unused nfsd4_file fields
  nfs41: make sure nfs server return right ca_maxresponsesize_cached
  nfsd: fix compile error
  svcrpc: fix bad argument in unix_domain_find
  nfsd4: fix struct file leak
  nfsd4: minor nfs4state.c reshuffling
  svcrpc: fix rare race on unix_domain creation
  nfsd41: modify the members value of nfsd4_op_flags
  nfsd: add proc file listing kernel's gss_krb5 enctypes
  gss:krb5 only include enctype numbers in gm_upcall_enctypes
  NFSD, VFS: Remove dead code in nfsd_rename()
  nfsd: kill unused macro definition
  locks: use assign_type()
2011-03-24 08:20:39 -07:00
Akinobu Mita
e1dc1c81b9 rds: use little-endian bitops
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h.  This converts ext2 non-atomic bit operations to
little-endian bit operations.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Andy Grover <andy.grover@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:16 -07:00
Akinobu Mita
12ce22423a rds: stop including asm-generic/bitops/le.h directly
asm-generic/bitops/le.h is only intended to be included directly from
asm-generic/bitops/ext2-non-atomic.h or asm-generic/bitops/minix-le.h
which implements generic ext2 or minix bit operations.

This stops including asm-generic/bitops/le.h directly and use ext2
non-atomic bit operations instead.

It seems odd to use ext2_*_bit() on rds, but it will replaced with
__{set,clear,test}_bit_le() after introducing little endian bit operations
for all architectures.  This indirect step is necessary to maintain
bisectability for some architectures which have their own little-endian
bit operations.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Andy Grover <andy.grover@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:10 -07:00
Eric Dumazet
eb49a97363 ipv4: fix ip_rt_update_pmtu()
commit 2c8cec5c10 (Cache learned PMTU information in inetpeer) added
an extra inet_putpeer() call in ip_rt_update_pmtu().

This results in various problems, since we can free one inetpeer, while
it is still in use.

Ref: http://www.spinics.net/lists/netdev/msg159121.html

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 12:18:15 -07:00
David S. Miller
406b6f974d ipv4: Fallback to FIB local table in __ip_dev_find().
In commit 9435eb1cf0
("ipv4: Implement __ip_dev_find using new interface address hash.")
we reimplemented __ip_dev_find() so that it doesn't have to
do a full FIB table lookup.

Instead, it consults a hash table of addresses configured to
interfaces.

This works identically to the old code in all except one case,
and that is for loopback subnets.

The old code would match the loopback device for any IP address
that falls within a subnet configured to the loopback device.

Handle this corner case by doing the FIB lookup.

We could implement this via inet_addr_onlink() but:

1) Someone could configure many addresses to loopback and
   inet_addr_onlink() is a simple list traversal.

2) We know the old code works.

Reported-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-23 12:16:15 -07:00
David S. Miller
f6152737a9 tcp: Make undo_ssthresh arg to tcp_undo_cwr() a bool.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 19:37:11 -07:00
Yuchung Cheng
67d4120a17 tcp: avoid cwnd moderation in undo
In the current undo logic, cwnd is moderated after it was restored
to the value prior entering fast-recovery. It was moderated first
in tcp_try_undo_recovery then again in tcp_complete_cwr.

Since the undo indicates recovery was false, these moderations
are not necessary. If the undo is triggered when most of the
outstanding data have been acknowledged, the (restored) cwnd is
falsely pulled down to a small value.

This patch removes these cwnd moderations if cwnd is undone
  a) during fast-recovery
	b) by receiving DSACKs past fast-recovery

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 19:36:08 -07:00
Linus Lüssing
a7bff75b08 bridge: Fix possibly wrong MLD queries' ethernet source address
The ipv6_dev_get_saddr() is currently called with an uninitialized
destination address. Although in tests it usually seemed to nevertheless
always fetch the right source address, there seems to be a possible race
condition.

Therefore this commit changes this, first setting the destination
address and only after that fetching the source address.

Reported-by: Jan Beulich <JBeulich@novell.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 19:26:42 -07:00
Florian Westphal
9c7a4f9ce6 ipv6: ip6_route_output does not modify sk parameter, so make it const
This avoids explicit cast to avoid 'discards qualifiers'
compiler warning in a netfilter patch that i've been working on.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 19:17:36 -07:00
Eric Dumazet
94dcf29a11 kthread: use kthread_create_on_node()
ksoftirqd, kworker, migration, and pktgend kthreads can be created with
kthread_create_on_node(), to get proper NUMA affinities for their stack and
task_struct.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 17:44:01 -07:00
Linus Torvalds
ab70a1d7c7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  [net/9p]: Introduce basic flow-control for VirtIO transport.
  9p: use the updated offset given by generic_write_checks
  [net/9p] Don't re-pin pages on retrying virtqueue_add_buf().
  [net/9p] Set the condition just before waking up.
  [net/9p] unconditional wake_up to proc waiting for space on VirtIO ring
  fs/9p: Add v9fs_dentry2v9ses
  fs/9p: Attach writeback_fid on first open with WR flag
  fs/9p: Open writeback fid in O_SYNC mode
  fs/9p: Use truncate_setsize instead of vmtruncate
  net/9p: Fix compile warning
  net/9p: Convert the in the 9p rpc call path to GFP_NOFS
  fs/9p: Fix race in initializing writeback fid
2011-03-22 16:26:10 -07:00
Linus Torvalds
0adfc56ce8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: use watch/notify for changes in rbd header
  libceph: add lingering request and watch/notify event framework
  rbd: update email address in Documentation
  ceph: rename dentry_release -> d_release, fix comment
  ceph: add request to the tail of unsafe write list
  ceph: remove request from unsafe list if it is canceled/timed out
  ceph: move readahead default to fs/ceph from libceph
  ceph: add ino32 mount option
  ceph: update common header files
  ceph: remove debugfs debug cruft
  libceph: fix osd request queuing on osdmap updates
  ceph: preserve I_COMPLETE across rename
  libceph: Fix base64-decoding when input ends in newline.
2011-03-22 16:25:25 -07:00
Trond Myklebust
246408dcd5 SUNRPC: Never reuse the socket port after an xs_close()
If we call xs_close(), we're in one of two situations:
 - Autoclose, which means we don't expect to resend a request
 - bind+connect failed, which probably means the port is in use

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2011-03-22 18:42:33 -04:00
David S. Miller
db138908cc Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-03-22 14:36:18 -07:00
Venkateswararao Jujjuri (JV)
68da9ba4ee [net/9p]: Introduce basic flow-control for VirtIO transport.
Recent zerocopy work in the 9P VirtIO transport maps and pins
user buffers into kernel memory for the server to work on them.
Since the user process can initiate this kind of pinning with a simple
read/write call, thousands of IO threads initiated by the user process can
hog the system resources and could result into denial of service.

This patch introduces flow control to avoid that extreme scenario.

The ceiling limit to avoid denial of service attacks is set to relatively
high (nr_free_pagecache_pages()/4) so that it won't interfere with
regular usage, but can step in extreme cases to limit the total system
hang. Since we don't have a global structure to accommodate this variable,
I choose the virtio_chan as the home for this.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Reviewed-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:50 -05:00
Venkateswararao Jujjuri (JV)
316ad5501c [net/9p] Don't re-pin pages on retrying virtqueue_add_buf().
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:48 -05:00
Venkateswararao Jujjuri (JV)
a01a984035 [net/9p] Set the condition just before waking up.
Given that the sprious wake-ups are common, we need to move the
condition setting right next to the wake_up().  After setting the condition
to req->status = REQ_STATUS_RCVD, sprious wakeups may cause the
virtqueue back on the free list for someone else to use.
This may result in kernel panic while relasing the pinned pages
in p9_release_req_pages().

Also rearranged the while loop in req_done() for better redability.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:47 -05:00
Venkateswararao Jujjuri (JV)
53bda3e5b4 [net/9p] unconditional wake_up to proc waiting for space on VirtIO ring
Process may wait to get space on VirtIO ring to send a transaction to
VirtFS server. Current code just does a conditional wake_up() which
means only one process will be woken up even if multiple processes
are waiting.

This fix makes the wake_up unconditional. Hence we won't have any
processes waiting for-ever.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:19 -05:00
Aneesh Kumar K.V
472e7f9f8b net/9p: Fix compile warning
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:35 -05:00
Aneesh Kumar K.V
eeff66ef6e net/9p: Convert the in the 9p rpc call path to GFP_NOFS
Without this we can cause reclaim allocation in writepage.

[ 3433.448430] =================================
[ 3433.449117] [ INFO: inconsistent lock state ]
[ 3433.449117] 2.6.38-rc5+ #84
[ 3433.449117] ---------------------------------
[ 3433.449117] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
[ 3433.449117] kswapd0/505 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 3433.449117]  (iprune_sem){+++++-}, at: [<ffffffff810ebbab>] shrink_icache_memory+0x45/0x2b1
[ 3433.449117] {RECLAIM_FS-ON-W} state was registered at:
[ 3433.449117]   [<ffffffff8107fe5f>] mark_held_locks+0x52/0x70
[ 3433.449117]   [<ffffffff8107ff02>] lockdep_trace_alloc+0x85/0x9f
[ 3433.449117]   [<ffffffff810d353d>] slab_pre_alloc_hook+0x18/0x3c
[ 3433.449117]   [<ffffffff810d3fd5>] kmem_cache_alloc+0x23/0xa2
[ 3433.449117]   [<ffffffff8127be77>] idr_pre_get+0x2d/0x6f
[ 3433.449117]   [<ffffffff815434eb>] p9_idpool_get+0x30/0xae
[ 3433.449117]   [<ffffffff81540123>] p9_client_rpc+0xd7/0x9b0
[ 3433.449117]   [<ffffffff815427b0>] p9_client_clunk+0x88/0xdb
[ 3433.449117]   [<ffffffff811d56e5>] v9fs_evict_inode+0x3c/0x48
[ 3433.449117]   [<ffffffff810eb511>] evict+0x1f/0x87
[ 3433.449117]   [<ffffffff810eb5c0>] dispose_list+0x47/0xe3
[ 3433.449117]   [<ffffffff810eb8da>] evict_inodes+0x138/0x14f
[ 3433.449117]   [<ffffffff810d90e2>] generic_shutdown_super+0x57/0xe8
[ 3433.449117]   [<ffffffff810d91e8>] kill_anon_super+0x11/0x50
[ 3433.449117]   [<ffffffff811d4951>] v9fs_kill_super+0x49/0xab
[ 3433.449117]   [<ffffffff810d926e>] deactivate_locked_super+0x21/0x46
[ 3433.449117]   [<ffffffff810d9e84>] deactivate_super+0x40/0x44
[ 3433.449117]   [<ffffffff810ef848>] mntput_no_expire+0x100/0x109
[ 3433.449117]   [<ffffffff810f0aeb>] sys_umount+0x2f1/0x31c
[ 3433.449117]   [<ffffffff8102c87b>] system_call_fastpath+0x16/0x1b
[ 3433.449117] irq event stamp: 192941
[ 3433.449117] hardirqs last  enabled at (192941): [<ffffffff81568dcf>] _raw_spin_unlock_irq+0x2b/0x30
[ 3433.449117] hardirqs last disabled at (192940): [<ffffffff810b5f97>] shrink_inactive_list+0x290/0x2f5
[ 3433.449117] softirqs last  enabled at (188470): [<ffffffff8105fd65>] __do_softirq+0x133/0x152
[ 3433.449117] softirqs last disabled at (188455): [<ffffffff8102d7cc>] call_softirq+0x1c/0x28
[ 3433.449117]
[ 3433.449117] other info that might help us debug this:
[ 3433.449117] 1 lock held by kswapd0/505:
[ 3433.449117]  #0:  (shrinker_rwsem){++++..}, at: [<ffffffff810b52e2>] shrink_slab+0x38/0x15f
[ 3433.449117]
[ 3433.449117] stack backtrace:
[ 3433.449117] Pid: 505, comm: kswapd0 Not tainted 2.6.38-rc5+ #84
[ 3433.449117] Call Trace:
[ 3433.449117]  [<ffffffff8107fbce>] ? valid_state+0x17e/0x191
[ 3433.449117]  [<ffffffff81036896>] ? save_stack_trace+0x28/0x45
[ 3433.449117]  [<ffffffff81080426>] ? check_usage_forwards+0x0/0x87
[ 3433.449117]  [<ffffffff8107fcf4>] ? mark_lock+0x113/0x22c
[ 3433.449117]  [<ffffffff8108105f>] ? __lock_acquire+0x37a/0xcf7
[ 3433.449117]  [<ffffffff8107fc0e>] ? mark_lock+0x2d/0x22c
[ 3433.449117]  [<ffffffff81081077>] ? __lock_acquire+0x392/0xcf7
[ 3433.449117]  [<ffffffff810b14d2>] ? determine_dirtyable_memory+0x15/0x28
[ 3433.449117]  [<ffffffff81081a33>] ? lock_acquire+0x57/0x6d
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff81567d85>] ? down_read+0x47/0x5c
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff810b5385>] ? shrink_slab+0xdb/0x15f
[ 3433.449117]  [<ffffffff810b69bc>] ? kswapd+0x574/0x96a
[ 3433.449117]  [<ffffffff810b6448>] ? kswapd+0x0/0x96a
[ 3433.449117]  [<ffffffff810714e2>] ? kthread+0x7d/0x85
[ 3433.449117]  [<ffffffff8102d6d4>] ? kernel_thread_helper+0x4/0x10
[ 3433.449117]  [<ffffffff81569200>] ? restore_args+0x0/0x30
[ 3433.449117]  [<ffffffff81071465>] ? kthread+0x0/0x85
[ 3433.449117]  [<ffffffff8102d6d0>] ? kernel_thread_helper+0x0/0x10

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:35 -05:00
Yehuda Sadeh
a40c4f10e3 libceph: add lingering request and watch/notify event framework
Lingering requests are requests that are sent to the OSD normally but
tracked also after we get a successful request.  This keeps the OSD
connection open and resends the original request if the object moves to
another OSD.  The OSD can then send notification messages back to us
if another client initiates a notify.

This framework will be used by RBD so that the client gets notification
when a snapshot is created by another node or tool.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-22 11:33:55 -07:00
Julian Anastasov
04024b937a ipv4: optimize route adding on secondary promotion
Optimize the calling of fib_add_ifaddr for all
secondary addresses after the promoted one to start from
their place, not from the new place of the promoted
secondary. It will save some CPU cycles because we
are sure the promoted secondary was first for the subnet
and all next secondaries do not change their place.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 01:06:33 -07:00
Julian Anastasov
2d230e2b2c ipv4: remove the routes on secondary promotion
The secondary address promotion relies on fib_sync_down_addr
to remove all routes created for the secondary addresses when
the old primary address is deleted. It does not happen for cases
when the primary address is also in another subnet. Fix that
by deleting local and broadcast routes for all secondaries while
they are on device list and by faking that all addresses from
this subnet are to be deleted. It relies on fib_del_ifaddr being
able to ignore the IPs from the concerned subnet while checking
for duplication.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 01:06:33 -07:00
Julian Anastasov
e6abbaa272 ipv4: fix route deletion for IPs on many subnets
Alex Sidorenko reported for problems with local
routes left after IP addresses are deleted. It happens
when same IPs are used in more than one subnet for the
device.

	Fix fib_del_ifaddr to restrict the checks for duplicate
local and broadcast addresses only to the IFAs that use
our primary IFA or another primary IFA with same address.
And we expect the prefsrc to be matched when the routes
are deleted because it is possible they to differ only by
prefsrc. This patch prevents local and broadcast routes
to be leaked until their primary IP is deleted finally
from the box.

	As the secondary address promotion needs to delete
the routes for all secondaries that used the old primary IFA,
add option to ignore these secondaries from the checks and
to assume they are already deleted, so that we can safely
delete the route while these IFAs are still on the device list.

Reported-by: Alex Sidorenko <alexandre.sidorenko@hp.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 01:06:32 -07:00
Julian Anastasov
74cb3c108b ipv4: match prefsrc when deleting routes
fib_table_delete forgets to match the routes by prefsrc.
Callers can specify known IP in fc_prefsrc and we should remove
the exact route. This is needed for cases when same local or
broadcast addresses are used in different subnets and the
routes differ only in prefsrc. All callers that do not provide
fc_prefsrc will ignore the route prefsrc as before and will
delete the first occurence. That is how the ip route del default
magic works.

	Current callers are:

- ip_rt_ioctl where rtentry_to_fib_config provides fc_prefsrc only
when the provided device name matches IP label with colon.

- inet_rtm_delroute where RTA_PREFSRC is optional too

- fib_magic which deals with routes when deleting addresses
and where the fc_prefsrc is always set with the primary IP
for the concerned IFA.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 01:06:31 -07:00
Michał Mirosław
27660515a2 net: implement dev_disable_lro() hw_features compatibility
Implement compatibility with new hw_features for dev_disable_lro().
This is a transition path - dev_disable_lro() should be later
integrated into netdev_fix_features() after all drivers are converted.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 01:00:26 -07:00
Simon Horman
736561a01f IPVS: Use global mutex in ip_vs_app.c
As part of the work to make IPVS network namespace aware
__ip_vs_app_mutex was replaced by a per-namespace lock,
ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.

Unfortunately this implementation results in ipvs->app_key residing
in non-static storage which at the very least causes a lockdep warning.

This patch takes the rather heavy-handed approach of reinstating
__ip_vs_app_mutex which will cover access to the ipvs->list_head
of all network namespaces.

[   12.610000] IPVS: Creating netns size=2456 id=0
[   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[   12.640000] BUG: key ffff880003bbf1a0 not in .data!
[   12.640000] ------------[ cut here ]------------
[   12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
[   12.640000] Hardware name: Bochs
[   12.640000] Pid: 1, comm: swapper Tainted: G        W 2.6.38-kexec-06330-g69b7efe-dirty #122
[   12.650000] Call Trace:
[   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
[   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
[   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
[   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
[   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
[   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
[   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
[   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
[   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
[   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
[   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
[   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
[   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
[   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
[   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
[   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 20:39:24 -07:00
Eric Dumazet
f40f94fc6c ipvs: fix a typo in __ip_vs_control_init()
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 20:39:24 -07:00
Eric W. Biederman
9d2a8fa96a net ipv6: Fix duplicate /proc/sys/net/ipv6/neigh directory entries.
When I was fixing issues with unregisgtering tables under /proc/sys/net/ipv6/neigh
by adding a mount point it appears I missed a critical ordering issue, in the
ipv6 initialization.  I had not realized that ipv6_sysctl_register is called
at the very end of the ipv6 initialization and in particular after we call
neigh_sysctl_register from ndisc_init.

"neigh" needs to be initialized in ipv6_static_sysctl_register which is
the first ipv6 table to initialized, and definitely before ndisc_init.
This removes the weirdness of duplicate tables while still providing a
"neigh" mount point which prevents races in sysctl unregistering.

This was initially reported at https://bugzilla.kernel.org/show_bug.cgi?id=31232
Reported-by: sunkan@zappa.cx
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:23:34 -07:00
Neil Horman
ac0a121d79 net: fix incorrect spelling in drop monitor protocol
It was pointed out to me recently that my spelling could be better :)

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:20:26 -07:00
Arnd Bergmann
b20e7bbfc7 net/appletalk: fix atalk_release use after free
The BKL removal in appletalk introduced a use-after-free problem,
where atalk_destroy_socket frees a sock, but we still release
the socket lock on it.

An easy fix is to take an extra reference on the sock and sock_put
it when returning from atalk_release.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:18:00 -07:00
Eric Dumazet
674f211599 ipx: fix ipx_release()
Commit b0d0d915d1 (remove the BKL) added a regression, because
sock_put() can free memory while we are going to use it later.

Fix is to delay sock_put() _after_ release_sock().

Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:16:39 -07:00
James Chapman
8aa525a934 l2tp: fix possible oops on l2tp_eth module unload
A struct used in the l2tp_eth driver for registering network namespace
ops was incorrectly marked as __net_initdata, leading to oops when
module unloaded.

BUG: unable to handle kernel paging request at ffffffffa00ec098
IP: [<ffffffff8123dbd8>] ops_exit_list+0x7/0x4b
PGD 142d067 PUD 1431063 PMD 195da8067 PTE 0
Oops: 0000 [#1] SMP 
last sysfs file: /sys/module/l2tp_eth/refcnt
Call Trace:
 [<ffffffff8123dc94>] ? unregister_pernet_operations+0x32/0x93
 [<ffffffff8123dd20>] ? unregister_pernet_device+0x2b/0x38
 [<ffffffff81068b6e>] ? sys_delete_module+0x1b8/0x222
 [<ffffffff810c7300>] ? do_munmap+0x254/0x318
 [<ffffffff812c64e5>] ? page_fault+0x25/0x30
 [<ffffffff812c6952>] ? system_call_fastpath+0x16/0x1b

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:10:25 -07:00
Wei Yongjun
a454f0ccef xfrm: Fix initialize repl field of struct xfrm_state
Commit 'xfrm: Move IPsec replay detection functions to a separate file'
  (9fdc4883d9)
introduce repl field to struct xfrm_state, and only initialize it
under SA's netlink create path, the other path, such as pf_key,
ipcomp/ipcomp6 etc, the repl field remaining uninitialize. So if
the SA is created by pf_key, any input packet with SA's encryption
algorithm will cause panic.

    int xfrm_input()
    {
        ...
        x->repl->advance(x, seq);
        ...
    }

This patch fixed it by introduce new function __xfrm_init_state().

Pid: 0, comm: swapper Not tainted 2.6.38-next+ #14 Bochs Bochs
EIP: 0060:[<c078e5d5>] EFLAGS: 00010206 CPU: 0
EIP is at xfrm_input+0x31c/0x4cc
EAX: dd839c00 EBX: 00000084 ECX: 00000000 EDX: 01000000
ESI: dd839c00 EDI: de3a0780 EBP: dec1de88 ESP: dec1de64
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper (pid: 0, ti=dec1c000 task=c09c0f20 task.ti=c0992000)
Stack:
 00000000 00000000 00000002 c0ba27c0 00100000 01000000 de3a0798 c0ba27c0
 00000033 dec1de98 c0786848 00000000 de3a0780 dec1dea4 c0786868 00000000
 dec1debc c074ee56 e1da6b8c de3a0780 c074ed44 de3a07a8 dec1decc c074ef32
Call Trace:
 [<c0786848>] xfrm4_rcv_encap+0x22/0x27
 [<c0786868>] xfrm4_rcv+0x1b/0x1d
 [<c074ee56>] ip_local_deliver_finish+0x112/0x1b1
 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1
 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44
 [<c074ef77>] ip_local_deliver+0x3e/0x44
 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1
 [<c074ec03>] ip_rcv_finish+0x30a/0x332
 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332
 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44
 [<c074f188>] ip_rcv+0x20b/0x247
 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332
 [<c072797d>] __netif_receive_skb+0x373/0x399
 [<c0727bc1>] netif_receive_skb+0x4b/0x51
 [<e0817e2a>] cp_rx_poll+0x210/0x2c4 [8139cp]
 [<c072818f>] net_rx_action+0x9a/0x17d
 [<c0445b5c>] __do_softirq+0xa1/0x149
 [<c0445abb>] ? __do_softirq+0x0/0x149

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:08:28 -07:00
Sage Weil
6f6c700675 libceph: fix osd request queuing on osdmap updates
If we send a request to osd A, and the request's pg remaps to osd B and
then back to A in quick succession, we need to resend the request to A. The
old code was only calling kick_requests after processing all incremental
maps in a message, so it was very possible to not resend a request that
needed to be resent.  This would make the osd eventually time out (at least
with the current default of osd timeouts enabled).

The correct approach is to scan requests on every map incremental.  This
patch refactors the kick code in a few ways:
 - all requests are either on req_lru (in flight), req_unsent (ready to
   send), or req_notarget (currently map to no up osd)
 - mapping always done by map_request (previous map_osds)
 - if the mapping changes, we requeue.  requests are resent only after all
   map incrementals are processed.
 - some osd reset code is moved out of kick_requests into a separate
   function
 - the "kick this osd" functionality is moved to kick_osd_requests, as it
   is unrelated to scanning for request->pg->osd mapping changes

Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-21 12:24:19 -07:00
Felix Fietkau
8bc8aecdc5 mac80211: initialize sta->last_rx in sta_info_alloc
This field is used to determine the inactivity time. When in AP mode,
hostapd uses it for kicking out inactive clients after a while. Without this
patch, hostapd immediately deauthenticates a new client if it checks the
inactivity time before the client sends its first data frame.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-21 15:19:50 -04:00
Vasiliy Kulikov
961ed183a9 netfilter: ipt_CLUSTERIP: fix buffer overflow
'buffer' string is copied from userspace.  It is not checked whether it is
zero terminated.  This may lead to overflow inside of simple_strtoul().
Changli Gao suggested to copy not more than user supplied 'size' bytes.

It was introduced before the git epoch.  Files "ipt_CLUSTERIP/*" are
root writable only by default, however, on some setups permissions might be
relaxed to e.g. network admin user.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-20 15:42:52 +01:00
Eric Dumazet
db856674ac netfilter: xtables: fix reentrancy
commit f3c5c1bfd4 (make ip_tables reentrant) introduced a race in
handling the stackptr restore, at the end of ipt_do_table()

We should do it before the call to xt_info_rdunlock_bh(), or we allow
cpu preemption and another cpu overwrites stackptr of original one.

A second fix is to change the underflow test to check the origptr value
instead of 0 to detect underflow, or else we allow a jump from different
hooks.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-20 15:40:06 +01:00
Jozsef Kadlecsik
5c1aba4678 netfilter: ipset: fix checking the type revision at create command
The revision of the set type was not checked at the create command: if the
userspace sent a valid set type but with not supported revision number,
it'd create a loop.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-20 15:35:01 +01:00
Jozsef Kadlecsik
5e0c1eb7e6 netfilter: ipset: fix address ranges at hash:*port* types
The hash:*port* types with IPv4 silently ignored when address ranges
with non TCP/UDP were added/deleted from the set and used the first
address from the range only.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-20 15:33:26 +01:00
Herbert Xu
6b1e960fdb bridge: Reset IPCB when entering IP stack on NF_FORWARD
Whenever we enter the IP stack proper from bridge netfilter we
need to ensure that the skb is in a form the IP stack expects
it to be in.

The entry point on NF_FORWARD did not meet the requirements of
the IP stack, therefore leading to potential crashes/panics.

This patch fixes the problem.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-18 15:13:12 -07:00
Eric Dumazet
d870bfb9d3 vlan: should take into account needed_headroom
Commit c95b819ad7 (gre: Use needed_headroom)
made gre use needed_headroom instead of hard_header_len

This uncover a bug in vlan code.

We should make sure vlan devices take into account their
real_dev->needed_headroom or we risk a crash in ipgre_header(), because
we dont have enough room to push IP header in skb.

Reported-by: Diddi Oscarsson <diddi@diddi.se>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-18 15:13:12 -07:00
Ben Hutchings
3a7da39d16 ethtool: Compat handling for struct ethtool_rxnfc
This structure was accidentally defined such that its layout can
differ between 32-bit and 64-bit processes.  Add compat structure
definitions and an ioctl wrapper function.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: stable@kernel.org [2.6.30+]
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-18 15:13:11 -07:00
Roger Luethi
5e5069b41d ethtool: __ethtool_set_sg: check for function pointer before using it
__ethtool_set_sg does not check if dev->ethtool_ops->set_sg is defined
which can result in a NULL pointer dereference when ethtool is used to
change SG settings for drivers without SG support.

Signed-off-by: Roger Luethi <rl@hellgate.ch>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-18 15:13:10 -07:00
Vasiliy Kulikov
67c5c6cb81 econet: 4 byte infoleak to the network
struct aunhdr has 4 padding bytes between 'pad' and 'handle' fields on
x86_64.  These bytes are not initialized in the variable 'ah' before
sending 'ah' to the network.  This leads to 4 bytes kernel stack
infoleak.

This bug was introduced before the git epoch.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Phil Blundell <philb@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-18 15:12:15 -07:00
Linus Torvalds
e16b396ce3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)
  doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore
  Update cpuset info & webiste for cgroups
  dcdbas: force SMI to happen when expected
  arch/arm/Kconfig: remove one to many l's in the word.
  asm-generic/user.h: Fix spelling in comment
  drm: fix printk typo 'sracth'
  Remove one to many n's in a word
  Documentation/filesystems/romfs.txt: fixing link to genromfs
  drivers:scsi Change printk typo initate -> initiate
  serial, pch uart: Remove duplicate inclusion of linux/pci.h header
  fs/eventpoll.c: fix spelling
  mm: Fix out-of-date comments which refers non-existent functions
  drm: Fix printk typo 'failled'
  coh901318.c: Change initate to initiate.
  mbox-db5500.c Change initate to initiate.
  edac: correct i82975x error-info reported
  edac: correct i82975x mci initialisation
  edac: correct commented info
  fs: update comments to point correct document
  target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c
  ...

Trivial conflict in fs/eventpoll.c (spelling vs addition)
2011-03-18 10:37:40 -07:00
Linus Torvalds
7fd23a2471 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (48 commits)
  HID: add support for Logitech Driving Force Pro wheel
  HID: hid-ortek: remove spurious reference
  HID: add support for Ortek PKB-1700
  HID: roccat-koneplus: vorrect mode of sysfs attr 'sensor'
  HID: hid-ntrig: init settle and mode check
  HID: merge hid-egalax into hid-multitouch
  HID: hid-multitouch: Send events per slot if CONTACTCOUNT is missing
  HID: ntrig remove if and drop an indent
  HID: ACRUX - activate the device immediately after binding
  HID: ntrig: apply NO_INIT_REPORTS quirk
  HID: hid-magicmouse: Correct touch orientation direction
  HID: ntrig don't dereference unclaimed hidinput
  HID: Do not create input devices for feature reports
  HID: bt hidp: send Output reports using SET_REPORT on the Control channel
  HID: hid-sony.c: Fix sending Output reports to the Sixaxis
  HID: add support for Keytouch IEC 60945
  HID: Add HID Report Descriptor to sysfs
  HID: add IRTOUCH infrared USB to hid_have_special_driver
  HID: kernel oops in out_cleanup in function hidinput_connect
  HID: Add teletext/color keys - gyration remote - EU version (GYAR3101CKDE)
  ...
2011-03-18 10:35:30 -07:00
Linus Torvalds
179198373c Merge branch 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (54 commits)
  RPC: killing RPC tasks races fixed
  xprt: remove redundant check
  SUNRPC: Convert struct rpc_xprt to use atomic_t counters
  SUNRPC: Ensure we always run the tk_callback before tk_action
  sunrpc: fix printk format warning
  xprt: remove redundant null check
  nfs: BKL is no longer needed, so remove the include
  NFS: Fix a warning in fs/nfs/idmap.c
  Cleanup: Factor out some cut-and-paste code.
  cleanup: save 60 lines/100 bytes by combining two mostly duplicate functions.
  NFS: account direct-io into task io accounting
  gss:krb5 only include enctype numbers in gm_upcall_enctypes
  RPCRDMA: Fix FRMR registration/invalidate handling.
  RPCRDMA: Fix to XDR page base interpretation in marshalling logic.
  NFSv4: Send unmapped uid/gids to the server when using auth_sys
  NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr
  NFSv4: cleanup idmapper functions to take an nfs_server argument
  NFSv4: Send unmapped uid/gids to the server if the idmapper fails
  NFSv4: If the server sends us a numeric uid/gid then accept it
  NFSv4.1: reject zero layout with zeroed stripe unit
  ...
2011-03-17 17:40:00 -07:00
Jesper Juhl
4be34b9d69 SUNRPC: Remove resource leak in svc_rdma_send_error()
We leak the memory allocated to 'ctxt' when we return after
'ib_dma_mapping_error()' returns !=0.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-17 14:38:03 -04:00
Stanislav Kinsbursky
8e26de238f RPC: killing RPC tasks races fixed
RPC task RPC_TASK_QUEUED bit is set must be checked before trying to wake up
task rpc_killall_tasks() because task->tk_waitqueue can not be set (equal to
NULL).
Also, as Trond Myklebust mentioned, such approach (instead of checking
tk_waitqueue to NULL) allows us to "optimise away the call to
rpc_wake_up_queued_task() altogether for those
tasks that aren't queued".

Here is an example of dereferencing of tk_waitqueue equal to NULL:

CPU 0               	CPU 1				CPU 2
--------------------	---------------------	--------------------------
nfs4_run_open_task
rpc_run_task
rpc_execute
rpc_set_active
rpc_make_runnable
(waiting)
			rpc_async_schedule
			nfs4_open_prepare
			nfs_wait_on_sequence
						nfs_umount_begin
						rpc_killall_tasks
						rpc_wake_up_task
						rpc_wake_up_queued_task
						spin_lock(tk_waitqueue == NULL)
						BUG()
			rpc_sleep_on
			spin_lock(&q->lock)
			__rpc_sleep_on
			task->tk_waitqueue = q

Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-17 12:39:00 -04:00
j223yang@asset.uwaterloo.ca
ba3c578de2 xprt: remove redundant check
remove redundant check.

Signed-off-by: Jinqiu Yang <crindy646@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-17 12:39:00 -04:00
Trond Myklebust
a8de240a90 SUNRPC: Convert struct rpc_xprt to use atomic_t counters
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-17 12:38:59 -04:00
Trond Myklebust
e020c6800c SUNRPC: Ensure we always run the tk_callback before tk_action
This fixes a race in which the task->tk_callback() puts the rpc_task
to sleep, setting a new callback. Under certain circumstances, the current
code may end up executing the task->tk_action before it gets round to the
callback.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2011-03-17 12:38:41 -04:00
Jiri Kosina
65b06194c9 Merge branches 'dragonrise', 'hidraw-feature', 'multitouch', 'ntrig', 'roccat', 'upstream' and 'upstream-fixes' into for-linus 2011-03-17 14:31:46 +01:00
Linus Torvalds
f74b944419 Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  BKL: That's all, folks
  fs/locks.c: Remove stale FIXME left over from BKL conversion
  ipx: remove the BKL
  appletalk: remove the BKL
  x25: remove the BKL
  ufs: remove the BKL
  hpfs: remove the BKL
  drivers: remove extraneous includes of smp_lock.h
  tracing: don't trace the BKL
  adfs: remove the big kernel lock
2011-03-16 17:21:00 -07:00
Linus Torvalds
7a6362800c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits)
  bonding: enable netpoll without checking link status
  xfrm: Refcount destination entry on xfrm_lookup
  net: introduce rx_handler results and logic around that
  bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag
  bonding: wrap slave state work
  net: get rid of multiple bond-related netdevice->priv_flags
  bonding: register slave pointer for rx_handler
  be2net: Bump up the version number
  be2net: Copyright notice change. Update to Emulex instead of ServerEngines
  e1000e: fix kconfig for crc32 dependency
  netfilter ebtables: fix xt_AUDIT to work with ebtables
  xen network backend driver
  bonding: Improve syslog message at device creation time
  bonding: Call netif_carrier_off after register_netdevice
  bonding: Incorrect TX queue offset
  net_sched: fix ip_tos2prio
  xfrm: fix __xfrm_route_forward()
  be2net: Fix UDP packet detected status in RX compl
  Phonet: fix aligned-mode pipe socket buffer header reserve
  netxen: support for GbE port settings
  ...

Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c
with the staging updates.
2011-03-16 16:29:25 -07:00
Linus Torvalds
e6bee325e4 Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (76 commits)
  pch_uart: reference clock on CM-iTC
  pch_phub: add new device ML7213
  n_gsm: fix UIH control byte : P bit should be 0
  n_gsm: add a documentation
  serial: msm_serial_hs: Add MSM high speed UART driver
  tty_audit: fix tty_audit_add_data live lock on audit disabled
  tty: move cd1865.h to drivers/staging/tty/
  Staging: tty: fix build with epca.c driver
  pcmcia: synclink_cs: fix prototype for mgslpc_ioctl()
  Staging: generic_serial: fix double locking bug
  nozomi: don't use flush_scheduled_work()
  tty/serial: Relax the device_type restriction from of_serial
  MAINTAINERS: Update HVC file patterns
  tty: phase out of ioctl file pointer for tty3270 as well
  tty: forgot to remove ipwireless from drivers/char/pcmcia/Makefile
  pch_uart: Fix DMA channel miss-setting issue.
  pch_uart: fix exclusive access issue
  pch_uart: fix auto flow control miss-setting issue
  pch_uart: fix uart clock setting issue
  pch_uart : Use dev_xxx not pr_xxx
  ...

Fix up trivial conflicts in drivers/misc/pch_phub.c (same patch applied
twice, then changes to the same area in one branch)
2011-03-16 15:11:04 -07:00
Steffen Klassert
fbd5060875 xfrm: Refcount destination entry on xfrm_lookup
We return a destination entry without refcount if a socket
policy is found in xfrm_lookup. This triggers a warning on
a negative refcount when freeeing this dst entry. So take
a refcount in this case to fix it.

This refcount was forgotten when xfrm changed to cache bundles
instead of policies for outgoing flows.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-16 12:55:36 -07:00
Jiri Pirko
8a4eb5734e net: introduce rx_handler results and logic around that
This patch allows rx_handlers to better signalize what to do next to
it's caller. That makes skb->deliver_no_wcard no longer needed.

kernel-doc for rx_handler_result is taken from Nicolas' patch.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-16 12:53:54 -07:00
David S. Miller
ee0caa7956 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2011-03-16 11:12:57 -07:00
Thomas Graf
400b871ba6 netfilter ebtables: fix xt_AUDIT to work with ebtables
Even though ebtables uses xtables it still requires targets to
return EBT_CONTINUE instead of XT_CONTINUE. This prevented
xt_AUDIT to work as ebt module.

Upon Jan's suggestion, use a separate struct xt_target for
NFPROTO_BRIDGE having its own target callback returning
EBT_CONTINUE instead of cloning the module.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-16 18:32:13 +01:00
Linus Torvalds
0f6e0e8448 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (33 commits)
  AppArmor: kill unused macros in lsm.c
  AppArmor: cleanup generated files correctly
  KEYS: Add an iovec version of KEYCTL_INSTANTIATE
  KEYS: Add a new keyctl op to reject a key with a specified error code
  KEYS: Add a key type op to permit the key description to be vetted
  KEYS: Add an RCU payload dereference macro
  AppArmor: Cleanup make file to remove cruft and make it easier to read
  SELinux: implement the new sb_remount LSM hook
  LSM: Pass -o remount options to the LSM
  SELinux: Compute SID for the newly created socket
  SELinux: Socket retains creator role and MLS attribute
  SELinux: Auto-generate security_is_socket_class
  TOMOYO: Fix memory leak upon file open.
  Revert "selinux: simplify ioctl checking"
  selinux: drop unused packet flow permissions
  selinux: Fix packet forwarding checks on postrouting
  selinux: Fix wrong checks for selinux_policycap_netpeer
  selinux: Fix check for xfrm selinux context algorithm
  ima: remove unnecessary call to ima_must_measure
  IMA: remove IMA imbalance checking
  ...
2011-03-16 09:15:43 -07:00
Linus Torvalds
26a992dbc2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: (46 commits)
  fs/9p: Make the writeback_fid owned by root
  fs/9p: Writeback dirty data before setattr
  fs/9p: call vmtruncate before setattr 9p opeation
  fs/9p: Properly update inode attributes on link
  fs/9p: Prevent multiple inclusion of same header
  fs/9p: Workaround vfs rename rehash bug
  fs/9p: Mark directory inode invalid for many directory inode operations
  fs/9p: Add . and .. dentry revalidation flag
  fs/9p: mark inode attribute invalid on rename, unlink and setattr
  fs/9p: Add support for marking inode attribute invalid
  fs/9p: Initialize root inode number for dotl
  fs/9p: Update link count correctly on different file system operations
  fs/9p: Add drop_inode 9p callback
  fs/9p: Add direct IO support in cached mode
  fs/9p: Fix inode i_size update in file_write
  fs/9p: set default readahead pages in cached mode
  fs/9p: Move writeback fid to v9fs_inode
  fs/9p: Add v9fs_inode
  fs/9p: Don't set stat.st_blocks based on nrpages
  fs/9p: Add inode hashing
  ...
2011-03-16 08:58:09 -07:00
Linus Torvalds
bd2895eead Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix build failure introduced by s/freezeable/freezable/
  workqueue: add system_freezeable_wq
  rds/ib: use system_wq instead of rds_ib_fmr_wq
  net/9p: replace p9_poll_task with a work
  net/9p: use system_wq instead of p9_mux_wq
  xfs: convert to alloc_workqueue()
  reiserfs: make commit_wq use the default concurrency level
  ocfs2: use system_wq instead of ocfs2_quota_wq
  ext4: convert to alloc_workqueue()
  scsi/scsi_tgt_lib: scsi_tgtd isn't used in memory reclaim path
  scsi/be2iscsi,qla2xxx: convert to alloc_workqueue()
  misc/iwmc3200top: use system_wq instead of dedicated workqueues
  i2o: use alloc_workqueue() instead of create_workqueue()
  acpi: kacpi*_wq don't need WQ_MEM_RECLAIM
  fs/aio: aio_wq isn't used in memory reclaim path
  input/tps6507x-ts: use system_wq instead of dedicated workqueue
  cpufreq: use system_wq instead of dedicated workqueues
  wireless/ipw2x00: use system_wq instead of dedicated workqueues
  arm/omap: use system_wq in mailbox
  workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER
2011-03-16 08:20:19 -07:00
Dan Siemon
4a2b9c3756 net_sched: fix ip_tos2prio
ECN support incorrectly maps ECN BESTEFFORT packets to TC_PRIO_FILLER
(1) instead of TC_PRIO_BESTEFFORT (0)

This means ECN enabled flows are placed in pfifo_fast/prio low priority
band, giving ECN enabled flows [ECT(0) and CE codepoints] higher drop
probabilities.

This is rather unfortunate, given we would like ECN being more widely
used.

Ref : http://www.coverfire.com/archives/2011/03/13/pfifo_fast-and-ecn/

Signed-off-by: Dan Siemon <dan@coverfire.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Dave Täht <d@taht.net>
Cc: Jonathan Morton <chromatix99@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-15 18:53:54 -07:00
Randy Dunlap
986d4abbdd sunrpc: fix printk format warning
Fix printk format build warning:

net/sunrpc/xprtrdma/verbs.c:1463: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'dma_addr_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-15 20:17:32 -04:00
j223yang@asset.uwaterloo.ca
4d4a76f330 xprt: remove redundant null check
'req' is dereferenced before checked for NULL.
The patch simply removes the check.

Signed-off-by: Jinqiu Yang<crindy646@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-15 20:16:14 -04:00
Linus Torvalds
422e6c4bc4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (57 commits)
  tidy the trailing symlinks traversal up
  Turn resolution of trailing symlinks iterative everywhere
  simplify link_path_walk() tail
  Make trailing symlink resolution in path_lookupat() iterative
  update nd->inode in __do_follow_link() instead of after do_follow_link()
  pull handling of one pathname component into a helper
  fs: allow AT_EMPTY_PATH in linkat(), limit that to CAP_DAC_READ_SEARCH
  Allow passing O_PATH descriptors via SCM_RIGHTS datagrams
  readlinkat(), fchownat() and fstatat() with empty relative pathnames
  Allow O_PATH for symlinks
  New kind of open files - "location only".
  ext4: Copy fs UUID to superblock
  ext3: Copy fs UUID to superblock.
  vfs: Export file system uuid via /proc/<pid>/mountinfo
  unistd.h: Add new syscalls numbers to asm-generic
  x86: Add new syscalls for x86_64
  x86: Add new syscalls for x86_32
  fs: Remove i_nlink check from file system link callback
  fs: Don't allow to create hardlink for deleted file
  vfs: Add open by file handle support
  ...
2011-03-15 15:48:13 -07:00
James Morris
a002951c97 Merge branch 'next' into for-linus 2011-03-16 09:41:17 +11:00
Eric Dumazet
7313714775 xfrm: fix __xfrm_route_forward()
This function should return 0 in case of error, 1 if OK
commit 452edd598f (xfrm: Return dst directly from xfrm_lookup())
got it wrong.

Reported-and-bisected-by: Michael Smith <msmith@cbnco.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-15 15:26:43 -07:00
David S. Miller
c337ffb68e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-03-15 15:15:17 -07:00
Rémi Denis-Courmont
638be34459 Phonet: fix aligned-mode pipe socket buffer header reserve
When the pipe uses aligned-mode data packets, we must reserve 4 bytes
instead of 3 for the pipe protocol header. Otherwise the Phonet header
would not be aligned, resulting in potentially corrupted headers with
later unaligned memory writes.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-15 14:55:49 -07:00
David S. Miller
918690f981 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-03-15 13:57:18 -07:00
David S. Miller
31111c26d9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
2011-03-15 13:03:27 -07:00
Florian Westphal
2f5dc63123 netfilter: xt_addrtype: ipv6 support
The kernel will refuse certain types that do not work in ipv6 mode.
We can then add these features incrementally without risk of userspace
breakage.

Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 20:17:44 +01:00
Florian Westphal
de81bbea17 netfilter: ipt_addrtype: rename to xt_addrtype
Followup patch will add ipv6 support.

ipt_addrtype.h is retained for compatibility reasons, but no longer used
by the kernel.

Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 20:16:20 +01:00
John W. Linville
106af2c99a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-03-15 14:16:48 -04:00
Tommi Virtanen
b09734b1f4 libceph: Fix base64-decoding when input ends in newline.
It used to return -EINVAL because it thought the end was not aligned
to 4 bytes.

Clean up superfluous src < end test in if, the while itself guarantees
that.

Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-15 09:14:02 -07:00
Aneesh Kumar K.V
c0aa4caf4c net/9p: Implement syncfs 9P operation
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:38 -05:00
Venkateswararao Jujjuri (JV)
f735195d51 [net/9p] Small non-IO PDUs for zero-copy supporting transports.
If a transport prefers payload to be sent separate from the PDU
(P9_TRANS_PREF_PAYLOAD_SEP), there is no need to allocate msize
PDU buffers(struct p9_fcall).

This patch allocates only upto 4k buffers for this kind of transports
and there won't be any change to the legacy transports.

Hence, this patch on top of zero copy changes allows user to
specify higher msizes through the mount option
without hogging the kernel heap.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:36 -05:00
Venkateswararao Jujjuri (JV)
ca41bb3e21 [net/9p] Handle Zero Copy TREAD/RERROR case in !dotl case.
This takes care of copying out error buffers from user buffer
payloads when we are using zero copy.  This happens because the
only payload buffer the server has to respond to the request is
the user buffer given for the zero copy read.

Because we only use zerocopy when the amount of data to transfer
is greater than a certain size (currently 4K) and error strings are
limited to ERRMAX (currently 128) we don't need to worry about there
being sufficient space for the error to fit in the payload.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:36 -05:00
Venkateswararao Jujjuri (JV)
2c66523fd2 [net/9p] readdir zerocopy changes for 9P2000.L protocol.
Modify p9_client_readdir() to check the transport preference and act according
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:35 -05:00
Venkateswararao Jujjuri (JV)
1fc52481c2 [net/9p] Write side zerocopy changes for 9P2000.L protocol.
Modify p9_client_write() to check the transport preference and act accordingly.
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:35 -05:00
Venkateswararao Jujjuri (JV)
bb2f8a5515 [net/9p] Read side zerocopy changes for 9P2000.L protocol.
Modify p9_client_read() to check the transport preference and act accordingly.
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:35 -05:00
Venkateswararao Jujjuri (JV)
6f69c395ce [net/9p] Add preferences to transport layer.
This patch adds preferences field to the p9_trans_module.
Through this, now transport layer can express its preference about the
payload. i.e if payload neds to be part of the PDU or it prefers it
to be sent sepearetly so that the transport layer can handle it in
a better way.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:35 -05:00
Venkateswararao Jujjuri (JV)
4038866dab [net/9p] Add gup/zero_copy support to VirtIO transport layer.
Modify p9_virtio_request() and req_done() functions to support
additional payload sent down to the transport layer through
tc->pubuf and tc->pkbuf.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:35 -05:00
Venkateswararao Jujjuri (JV)
9bb6c10a4e [net/9p] Assign type of transaction to tc->pdu->id which is otherwise unsed.
This will be used by the transport layer to determine the out going
request type. Transport layer uses this information to correctly
place the mapped pages in the PDU. Patches following this will make
use of this to achieve zero copy.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:34 -05:00
Venkateswararao Jujjuri (JV)
022cae3655 [net/9p] Preparation and helper functions for zero copy
This patch prepares p9_fcall structure for zero copy. Added
fields send the payload buffer information to the transport layer.
In addition it adds a 'private' field for the transport layer to
store mapped/pinned page information so that it can be freed/unpinned
during req_done.

This patch also creates trans_common.[ch] to house helper functions.
It adds the following helper functions.

p9_release_req_pages - Release pages after the transaction.
p9_nr_pages - Return number of pages needed to accomodate the payload.
payload_gup - Translates user buffer into kernel pages.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-15 09:57:34 -05:00
Vasiliy Kulikov
6a8ab06077 ipv6: netfilter: ip6_tables: fix infoleak to userspace
Structures ip6t_replace, compat_ip6t_replace, and xt_get_revision are
copied from userspace.  Fields of these structs that are
zero-terminated strings are not checked.  When they are used as argument
to a format string containing "%s" in request_module(), some sensitive
information is leaked to userspace via argument of spawned modprobe
process.

The first bug was introduced before the git epoch;  the second was
introduced in 3bc3fe5e (v2.6.25-rc1);  the third is introduced by
6b7d31fc (v2.6.15-rc1).  To trigger the bug one should have
CAP_NET_ADMIN.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 13:37:13 +01:00
Vasiliy Kulikov
78b7987676 netfilter: ip_tables: fix infoleak to userspace
Structures ipt_replace, compat_ipt_replace, and xt_get_revision are
copied from userspace.  Fields of these structs that are
zero-terminated strings are not checked.  When they are used as argument
to a format string containing "%s" in request_module(), some sensitive
information is leaked to userspace via argument of spawned modprobe
process.

The first and the third bugs were introduced before the git epoch; the
second was introduced in 2722971c (v2.6.17-rc1).  To trigger the bug
one should have CAP_NET_ADMIN.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 13:36:05 +01:00
Vasiliy Kulikov
42eab94fff netfilter: arp_tables: fix infoleak to userspace
Structures ipt_replace, compat_ipt_replace, and xt_get_revision are
copied from userspace.  Fields of these structs that are
zero-terminated strings are not checked.  When they are used as argument
to a format string containing "%s" in request_module(), some sensitive
information is leaked to userspace via argument of spawned modprobe
process.

The first bug was introduced before the git epoch;  the second is
introduced by 6b7d31fc (v2.6.15-rc1);  the third is introduced by
6b7d31fc (v2.6.15-rc1).  To trigger the bug one should have
CAP_NET_ADMIN.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 13:35:21 +01:00
Changli Gao
4656c4d61a netfilter: xt_connlimit: remove connlimit_rnd_inited
A potential race condition when generating connlimit_rnd is also fixed.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 13:26:32 +01:00
Changli Gao
3e0d5149e6 netfilter: xt_connlimit: use hlist instead
The header of hlist is smaller than list.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 13:25:42 +01:00
Changli Gao
0e23ca14f8 netfilter: xt_connlimit: use kmalloc() instead of kzalloc()
All the members are initialized after kzalloc().

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 13:24:56 +01:00
Changli Gao
8183e3a88a netfilter: xt_connlimit: fix daddr connlimit in SNAT scenario
We use the reply tuples when limiting the connections by the destination
addresses, however, in SNAT scenario, the final reply tuples won't be
ready until SNAT is done in POSTROUING or INPUT chain, and the following
nf_conntrack_find_get() in count_tem() will get nothing, so connlimit
can't work as expected.

In this patch, the original tuples are always used, and an additional
member addr is appended to save the address in either end.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-15 13:23:28 +01:00
Al Viro
326be7b484 Allow passing O_PATH descriptors via SCM_RIGHTS datagrams
Just need to make sure that AF_UNIX garbage collector won't
confuse O_PATHed socket on filesystem for real AF_UNIX opened
socket.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-15 02:21:45 -04:00
Simon Horman
14e405461e IPVS: Add __ip_vs_control_{init,cleanup}_sysctl()
Break out the portions of __ip_vs_control_init() and
__ip_vs_control_cleanup() where aren't necessary when
CONFIG_SYSCTL is undefined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:37:01 +09:00
Simon Horman
fb1de432c1 IPVS: Conditionally define and use ip_vs_lblc{r}_table
ip_vs_lblc_table and ip_vs_lblcr_table, and code that uses them
are unnecessary when CONFIG_SYSCTL is undefined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:37:01 +09:00
Simon Horman
a7a86b8616 IPVS: Minimise ip_vs_leave when CONFIG_SYSCTL is undefined
Much of ip_vs_leave() is unnecessary if CONFIG_SYSCTL is undefined.

I tried an approach of breaking the now #ifdef'ed portions out
into a separate function. However this appeared to grow the
compiled code on x86_64 by about 200 bytes in the case where
CONFIG_SYSCTL is defined. So I have gone with the simpler though
less elegant #ifdef'ed solution for now.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:37:00 +09:00
Simon Horman
b27d777ec5 IPVS: Conditinally use sysctl_lblc{r}_expiration
In preparation for not including sysctl_lblc{r}_expiration in
struct netns_ipvs when CONFIG_SYCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:59 +09:00
Simon Horman
8e1b0b1b56 IPVS: Add expire_quiescent_template()
In preparation for not including sysctl_expire_quiescent_template in
struct netns_ipvs when CONFIG_SYCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:58 +09:00
Simon Horman
71a8ab6cad IPVS: Add sysctl_expire_nodest_conn()
In preparation for not including sysctl_expire_nodest_conn in
struct netns_ipvs when CONFIG_SYCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:58 +09:00
Simon Horman
7532e8d40c IPVS: Add sysctl_sync_ver()
In preparation for not including sysctl_sync_ver in
struct netns_ipvs when CONFIG_SYCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:57 +09:00
Simon Horman
59e0350ead IPVS: Add {sysctl_sync_threshold,period}()
In preparation for not including sysctl_sync_threshold in
struct netns_ipvs when CONFIG_SYCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:57 +09:00
Simon Horman
0cfa558e2c IPVS: Add sysctl_nat_icmp_send()
In preparation for not including sysctl_nat_icmp_send in
struct netns_ipvs when CONFIG_SYCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:56 +09:00
Simon Horman
84b3cee39f IPVS: Add sysctl_snat_reroute()
In preparation for not including sysctl_snat_reroute in
struct netns_ipvs when CONFIG_SYCTL is not defined.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:55 +09:00
Simon Horman
ba4fd7e966 IPVS: Add ip_vs_route_me_harder()
Add ip_vs_route_me_harder() to avoid repeating the same code twice.

Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:54 +09:00
Julian Anastasov
6ef757f965 ipvs: rename estimator functions
Rename ip_vs_new_estimator to ip_vs_start_estimator
and ip_vs_kill_estimator to ip_vs_stop_estimator to better
match their logic.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:54 +09:00
Julian Anastasov
ea9f22cce9 ipvs: optimize rates reading
Move the estimator reading from estimation_timer to user
context. ip_vs_read_estimator() will be used to decode the rate
values. As the decoded rates are not set by estimation timer
there is no need to reset them in ip_vs_zero_stats.

 	There is no need ip_vs_new_estimator() to encode stats
to rates, if the destination is in trash both the stats and the
rates are inactive.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:53 +09:00
Julian Anastasov
55a3d4e15c ipvs: properly zero stats and rates
Currently, the new percpu counters are not zeroed and
the zero commands do not work as expected, we still show the old
sum of percpu values. OTOH, we can not reset the percpu counters
from user context without causing the incrementing to use old
and bogus values.

 	So, as Eric Dumazet suggested fix that by moving all overhead
to stats reading in user context. Do not introduce overhead in
timer context (estimator) and incrementing (packet handling in
softirqs).

 	The new ustats0 field holds the zero point for all
counter values, the rates always use 0 as base value as before.
When showing the values to user space just give the difference
between counters and the base values. The only drawback is that
percpu stats are not zeroed, they are accessible only from /proc
and are new interface, so it should not be a compatibility problem
as long as the sum stats are correct after zeroing.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:52 +09:00
Julian Anastasov
2a0751af09 ipvs: reorganize tot_stats
The global tot_stats contains cpustats field just like the
stats for dest and svc, so better use it to simplify the usage
in estimation_timer. As tot_stats is registered as estimator
we can remove the special ip_vs_read_cpu_stats call for
tot_stats. Fix ip_vs_read_cpu_stats to be called under
stats lock because it is still used as synchronization between
estimation timer and user context (the stats readers).

 	Also, make sure ip_vs_stats_percpu_show reads properly
the u64 stats from user context.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:52 +09:00
Shan Wei
6060c74a3d netfilter:ipvs: use kmemdup
The semantic patch that makes this output is available
in scripts/coccinelle/api/memdup.cocci.

More information about semantic patching is available at
http://coccinelle.lip6.fr/

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:49 +09:00
Julian Anastasov
4a569c0c0f ipvs: remove _bh from percpu stats reading
ip_vs_read_cpu_stats is called only from timer, so
no need for _bh locks.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:48 +09:00
Julian Anastasov
097fc76a08 ipvs: avoid lookup for fwmark 0
Restore the previous behaviour to lookup for fwmark
service only when fwmark is non-null. This saves only CPU.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-03-15 09:36:48 +09:00
Mark Rustad
698e1d23cf net: dcbnl: Update copyright dates
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 17:02:42 -07:00
Sangtae Ha
b5ccd07337 tcp_cubic: fix low utilization of CUBIC with HyStart
HyStart sets the initial exit point of slow start.
Suppose that HyStart exits at 0.5BDP in a BDP network and no history exists.
If the BDP of a network is large, CUBIC's initial cwnd growth may be
too conservative to utilize the link.
CUBIC increases the cwnd 20% per RTT in this case.

Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:42 -07:00
Sangtae Ha
2b4636a5f8 tcp_cubic: make the delay threshold of HyStart less sensitive
Make HyStart less sensitive to abrupt delay variations due to buffer bloat.

Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:42 -07:00
stephen hemminger
3b585b3449 tcp_cubic: enable high resolution ack time if needed
This is a refined version of an earlier patch by Lucas Nussbaum.
Cubic needs RTT values in milliseconds. If HZ < 1000 then
the values will be too coarse.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:40 -07:00
stephen hemminger
17a6e9f1aa tcp_cubic: fix clock dependency
The hystart code was written with assumption that HZ=1000.
Replace the use of jiffies with bictcp_clock as a millisecond
real time clock.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reported-by: Lucas Nussbaum <lucas.nussbaum@loria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:39 -07:00
stephen hemminger
aac46324e1 tcp_cubic: make ack train delta value a parameter
Make the spacing between ACK's that indicates a train a tuneable
value like other hystart values.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:39 -07:00
stephen hemminger
c54b4b7655 tcp_cubic: fix comparison of jiffies
Jiffies wraps around therefore the correct way to compare is
to use cast to signed value.

Note: cubic is not using full jiffies value on 64 bit arch
because using full unsigned long makes struct bictcp grow too
large for the available ca_priv area.

Includes correction from Sangtae Ha to improve ack train detection.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:38 -07:00
stephen hemminger
febf081987 tcp: fix RTT for quick packets in congestion control
In the congestion control interface, the callback for each ACK
includes an estimated round trip time in microseconds.
Some algorithms need high resolution (Vegas style) but most only
need jiffie resolution.  If RTT is not accurate (like a retransmission)
-1 is used as a flag value.

When doing coarse resolution if RTT is less than a a jiffie
then 0 should be returned rather than no estimate. Otherwise algorithms
that expect good ack's to trigger slow start (like CUBIC Hystart)
will be confused.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:54:38 -07:00
Daniel Baluta
e5537bfc98 af_unix: update locking comment
We latch our state using a spinlock not a r/w kind of lock.

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:25:33 -07:00
stephen hemminger
a461c0297f bridge: skip forwarding delay if not using STP
If Spanning Tree Protocol is not enabled, there is no good reason for
the bridge code to wait for the forwarding delay period before enabling
the link. The purpose of the forwarding delay is to allow STP to
learn about other bridges before nominating itself.

The only possible impact is that when starting up a new port
the bridge may flood a packet now, where previously it might have
seen traffic from the other host and preseeded the forwarding table.

Includes change for local variable br already available in that func.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:06:49 -07:00
stephen hemminger
1faa4356a3 bridge: control carrier based on ports online
This makes the bridge device behave like a physical device.
In earlier releases the bridge always asserted carrier. This
changes the behavior so that bridge device carrier is on only
if one or more ports are in the forwarding state. This
should help IPv6 autoconfiguration, DHCP, and routing daemons.

I did brief testing with Network and Virt manager and they
seem fine, but since this changes behavior of bridge, it should
wait until net-next (2.6.39).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Tested-By: Adam Majer <adamm@zombino.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 14:29:02 -07:00
David S. Miller
201a11c1db Merge branch 'tipc-Mar14-2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/net-next-2.6 2011-03-14 13:49:53 -07:00
Daniel Turull
05aebe2e5d pktgen: bug fix in transmission headers with frags=0
(bug introduced by commit 26ad787962
(pktgen: speedup fragmented skbs)

The headers of pktgen were incorrectly added in a pktgen packet
without frags (frags=0). There was an offset in the pktgen headers.

The cause was in reusing the pgh variable as a return variable in skb_put
when adding the payload to the skb.

Signed-off-by: Daniel Turull <daniel.turull@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
2011-03-14 13:47:40 -07:00
Felix Fietkau
9db372fdd5 mac80211: fix channel type recalculation with HT and non-HT interfaces
When running an AP interface along with the cooked monitor interface created
by hostapd, adding an interface and deleting it again triggers a channel type
recalculation during which the (non-HT) monitor interface takes precedence
over the HT AP interface, thus causing the channel type to be set to non-HT.
Fix this by ensuring that a more wide channel type will not be overwritten
by a less wide channel type.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-14 14:46:58 -04:00
Helmut Schaa
cf28d7934c mac80211: Shortcut minstrel_ht rate setup for non-MRR capable devices
Devices without multi rate retry support won't be able to use all rates
as specified by mintrel_ht. Hence, we can simply skip setting up further
rates as the devices will only use the first one.

Also add a special case for devices with only two possible tx rates. We
use sample_rate -> max_prob_rate for sampling and max_tp_rate ->
max_prob_rate by default.

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-14 14:46:58 -04:00
Stephen Hemminger
fe8f661f2c netfilter: nf_conntrack: fix sysctl memory leak
Message in log because sysctl table was not empty at netns exit
 WARNING: at net/sysctl_net.c:84 sysctl_net_exit+0x2a/0x2c()

Instrumenting showed that the nf_conntrack_timestamp was the entry
that was being created but not cleared.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-14 19:20:44 +01:00
Linus Torvalds
5f40d42094 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: NFSROOT should default to "proto=udp"
  nfs4: remove duplicated #include
  NFSv4: nfs4_state_mark_reclaim_nograce() should be static
  NFSv4: Fix the setlk error handler
  NFSv4.1: Fix the handling of the SEQUENCE status bits
  NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
  NFSv4.1 reclaim complete must wait for completion
  NFSv4: remove duplicate clientid in struct nfs_client
  NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
  sunrpc: Propagate errors from xs_bind() through xs_create_sock()
  (try3-resend) Fix nfs_compat_user_ino64 so it doesn't cause problems if bit 31 or 63 are set in fileid
  nfs: fix compilation warning
  nfs: add kmalloc return value check in decode_and_add_ds
  SUNRPC: Remove resource leak in svc_rdma_send_error()
  nfs: close NFSv4 COMMIT vs. CLOSE race
  SUNRPC: Close a race in __rpc_wait_for_completion_task()
2011-03-14 11:19:50 -07:00
Patrick McHardy
42046e2e45 netfilter: x_tables: return -ENOENT for non-existant matches/targets
As Stephen correctly points out, we need to return -ENOENT in
xt_find_match()/xt_find_target() after the patch "netfilter: x_tables:
misuse of try_then_request_module" in order to properly indicate
a non-existant module to the caller.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-03-14 19:11:44 +01:00
Paul Gortmaker
1fa073803e tipc: delete extra semicolon blocking node deletion
Remove bogus semicolon only recently introduced in 34e46258cb
that blocks cleanup of nodes for N>1 on shutdown.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-14 12:21:12 -04:00
Al Viro
c9c6cac0c2 kill path_lookup()
all remaining callers pass LOOKUP_PARENT to it, so
flags argument can die; renamed to kern_path_parent()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-14 09:15:23 -04:00
Eric Dumazet
4e75db2e8f inetpeer: should use call_rcu() variant
After commit 7b46ac4e77 (inetpeer: Don't disable BH for initial
fast RCU lookup.), we should use call_rcu() to wait proper RCU grace
period.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 23:22:23 -07:00
Steffen Klassert
d8647b79c3 xfrm: Add user interface for esn and big anti-replay windows
This patch adds a netlink based user interface to configure
esn and big anti-replay windows. The new netlink attribute
XFRMA_REPLAY_ESN_VAL is used to configure the new implementation.
If the XFRM_STATE_ESN flag is set, we use esn and support for big
anti-replay windows for the configured state. If this flag is not
set we use the new implementation with 32 bit sequence numbers.
A big anti-replay window can be configured in this case anyway.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:31 -07:00
Steffen Klassert
2cd084678f xfrm: Add support for IPsec extended sequence numbers
This patch adds support for IPsec extended sequence numbers (esn)
as defined in RFC 4303. The bits to manage the anti-replay window
are based on a patch from Alex Badea.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:31 -07:00
Steffen Klassert
97e15c3a85 xfrm: Support anti-replay window size bigger than 32 packets
As it is, the anti-replay bitmap in struct xfrm_replay_state can
only accomodate 32 packets. Even though it is possible to configure
anti-replay window sizes up to 255 packets from userspace. So we
reject any packet with a sequence number within the configured window
but outside the bitmap. With this patch, we represent the anti-replay
window as a bitmap of variable length that can be accessed via the
new struct xfrm_replay_state_esn. Thus, we have no limit on the
window size anymore. To use the new anti-replay window implementantion,
new userspace tools are required. We leave the old implementation
untouched to stay in sync with old userspace tools.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:30 -07:00
Steffen Klassert
9fdc4883d9 xfrm: Move IPsec replay detection functions to a separate file
To support multiple versions of replay detection, we move the replay
detection functions to a separate file and make them accessible
via function pointers contained in the struct xfrm_replay.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:30 -07:00
Steffen Klassert
d212a4c290 esp6: Add support for IPsec extended sequence numbers
This patch adds IPsec extended sequence numbers support to esp6.
We use the authencesn crypto algorithm to handle esp with separate
encryption/authentication algorithms.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:29 -07:00
Steffen Klassert
0dc49e9b28 esp4: Add support for IPsec extended sequence numbers
This patch adds IPsec extended sequence numbers support to esp4.
We use the authencesn crypto algorithm to handle esp with separate
encryption/authentication algorithms.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:29 -07:00
Steffen Klassert
1ce3644ade xfrm: Use separate low and high order bits of the sequence numbers in xfrm_skb_cb
To support IPsec extended sequence numbers, we split the
output sequence numbers of xfrm_skb_cb in low and high order 32 bits
and we add the high order 32 bits to the input sequence numbers.
All users are updated accordingly.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 20:22:28 -07:00
David S. Miller
27b61ae2d7 Merge branch 'tipc-Mar13-2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/net-next-2.6 2011-03-13 18:49:11 -07:00
Hiroaki SHIMODA
46af31800b ipv4: Fix PMTU update.
On current net-next-2.6, when Linux receives ICMP Type: 3, Code: 4
(Destination unreachable (Fragmentation needed)),

  icmp_unreach
    -> ip_rt_frag_needed
         (peer->pmtu_expires is set here)
    -> tcp_v4_err
         -> do_pmtu_discovery
              -> ip_rt_update_pmtu
                   (peer->pmtu_expires is already set,
                    so check_peer_pmtu is skipped.)
                   -> check_peer_pmtu

check_peer_pmtu is skipped and MTU is not updated.

To fix this, let check_peer_pmtu execute unconditionally.
And some minor fixes
1) Avoid potential peer->pmtu_expires set to be zero.
2) In check_peer_pmtu, argument of time_before is reversed.
3) check_peer_pmtu expects peer->pmtu_orig is initialized as zero,
   but not initialized.

Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-13 18:37:49 -07:00
Allan Stephens
390bce4237 tipc: Eliminate obsolete routine for handling routed messages
Eliminates a routine that is used in handling messages arriving from
another cluster or zone. Such messages can no longer be received by TIPC
now that multi-cluster and multi-zone network support has been eliminated.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:19 -04:00
Allan Stephens
7945c1fb02 tipc: Eliminate remaining support for routing table messages
Gets rid of all remaining code relating to ROUTE_DISTRIBUTOR messages.
These messages were only used in multi-cluster and multi-zone networks,
which TIPC no longer supports. (For safety, TIPC now treats such messages
the same way that it handles other unrecognized messages.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:19 -04:00
Allan Stephens
50d492321a tipc: Remove bearer flag indicating existence of broadcast address
Eliminates the flag in the TIPC bearer structure that indicates if
the bearer supports broadcasting, since the flag is always set to 1
and serves no useful purpose.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:19 -04:00
Allan Stephens
f9107ebe7d tipc: Don't respond to neighbor discovery request on blocked bearer
Adds a check to prevent TIPC from trying to respond to an incoming
LINK_CONFIG request message if the associated bearer is currently
prohibited from sending messages.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:19 -04:00
Allan Stephens
d901a42b27 tipc: Eliminate unnecessary constant for neighbor discovery msg size
Eliminates an unnecessary constant that defines the size of a LINK_CONFIG
message, and uses one of the existing standard message size symbols in
its place. (The defunct constant was located in the wrong place anyway,
since it was grouped with other constants that define message users instead
of message sizes.)

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:18 -04:00
Allan Stephens
a2b58de2e3 tipc: Remove unused field in bearer structure
Eliminates a field in TIPC's bearer objects that is set, but never
referenced.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:18 -04:00
Allan Stephens
50d3e6399a tipc: Correct misnamed references to neighbor discovery domain
Renames items that are improperly labelled as "network scope" items
(which are represented by simple integer values) rather than "network
domain" items (which are represented by <Z.C.N>-type network addresses).
This change is purely cosmetic, and does not affect the operation of TIPC.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:18 -04:00
Allan Stephens
37b9c08a88 tipc: Optimizations to link creation code
Enhances link creation code as follows:

1) Detects illegal attempts to add a requested link earlier in the
   link creation process. This prevents TIPC from wasting time
   initializing a link object it then throws away, and also eliminates
   the code needed to do the throwing away.

2) Passes in the node object associated with the requested link.
   This allows TIPC to eliminate a search to locate the node object,
   as well as code that attempted to create the node if it doesn't
   exist.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:18 -04:00
Allan Stephens
fa2bae2d5b tipc: Give Tx of discovery responses priority over link messages
Delay releasing the node lock when processing a neighbor discovery
message until after the optional discovery response message has been
sent. This helps ensure that any link protocol messages sent by a
link endpoint created as a result of a neighbor discovery request
are received after the discovery response is received, thereby
giving the receiving node a chance to create a peer link endpoint to
consume those link protocol messages, if one does not already exist.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:18 -04:00
Allan Stephens
a728750e4f tipc: Cosmetic changes to neighbor discovery logic
Reworks the appearance of the routine that processes incoming
LINK_CONFIG messages to keep the main logic flow at a consistent level
of indentation, and to add comments outlining the various phases involved
in processing each message. This rework is being done to allow upcoming
enhancements to this routine to be integrated more cleanly.

The diff isn't really readable, so know that it was a case of the
old code being like:

	tipc_disc_recv_msg(..)
	{
		if (in_own_cluster(orig)) {
			...
			lines and lines of stuff
			...
		}
	}

which is now replaced with the more sane:

	tipc_disc_recv_msg(..)
	{
		if (!in_own_cluster(orig))
			return;
		...
		lines and lines of stuff
		...
	}

Instances of spin locking within the reindented block were replaced with
the identical tipc_node_[un]lock() abstractions.  Note that all these
changes are cosmetic in nature, and do not change the way LINK_CONFIG
messages are processed.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:18 -04:00
Allan Stephens
75f0aa4990 tipc: Fix redundant link field handling in link protocol message
Ensures that the "redundant link exists" field of the LINK_PROTOCOL
messages sent by a link endpoint is set if and only if the sending
node has at least one other working link to the peer node. Previously,
the bit was set only if there were at least 2 working links to the peer
node, meaning the bit was incorrectly left unset in messages sent by a
non-working link endpoint when exactly one alternate working link was
available. The revised code now takes the state of the link sending
the message into account when deciding if an alternate link exists.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:18 -04:00
Allan Stephens
77f167fcce tipc: make msg_set_redundant_link() consistent with other set ops
All the other boolean like msg_set_X(m) operations don't
export both a msg_set_X(a) and a msg_clear_X(m), but instead
just have the single msg_set_X(m, val) variant.

Make the redundant_link one consistent by having the set take
a value, and delete the msg_clear_redundant_link() anomoly.
This is a cosmetic change and should not change behaviour.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:17 -04:00
Paul Gortmaker
8f19afb2db tipc: cosmetic - function names are not to be full sentences
Function names like "tipc_node_has_redundant_links" are unweildy
and result in long lines even for simple lines.  The "has" doesn't
contribute any value add, so dropping that is a slight step in the
right direction.   This is a cosmetic change, basic result of:

for i in `grep -l tipc_node_has_ *` ; do sed -i s/tipc_node_has_/tipc_node_/ $i ; done

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:17 -04:00
Allan Stephens
e7b3acb6a8 tipc: Eliminate timestamp from link protocol messages
Removes support for the timestamp field of TIPC's link protocol messages.

This field was previously used to hold an OS-dependent timestamp value
that was used to assist in debugging early versions of TIPC. The field
has now been deemed unnecessary and has been removed from the latest TIPC
specification. This change has no impact on the operation of TIPC since
the field was set by TIPC, but never referenced.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:17 -04:00
Allan Stephens
34e46258cb tipc: manually inline net_start/stop, make assoc. vars static
Relocates network-related variables into the subsystem files where
they are now primarily used (following the recent rework of TIPC's
node table), and converts globals into locals where possible. Changes
the initialization of tipc_num_links from run-time to compile-time,
and eliminates the net_start routine that becomes empty as a result.
Also eliminates the corresponding net_stop routine by moving its
(trivial) content into the one location that called the routine.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:17 -04:00
Allan Stephens
672d99e19a tipc: Convert node object array to a hash table
Replaces the dynamically allocated array of pointers to the cluster's
node objects with a static hash table. Hash collisions are resolved
using chaining, with a typical hash chain having only a single node,
to avoid degrading performance during processing of incoming packets.
The conversion to a hash table reduces the memory requirements for
TIPC's node table to approximately the same size it had prior to
the previous commit.

In addition to the hash table itself, TIPC now also maintains a
linked list for the node objects, sorted by ascending network address.
This list allows TIPC to continue sending responses to user space
applications that request node and link information in sorted order.
The list also improves performance when name table update messages are
sent by making it easier to identify the nodes that must be notified.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:17 -04:00
Allan Stephens
f831c963b5 tipc: Eliminate configuration for maximum number of cluster nodes
Gets rid of the need for users to specify the maximum number of
cluster nodes supported by TIPC. TIPC now automatically provides
support for all 4K nodes allowed by its addressing scheme.

Note: This change sets TIPC's memory usage to the amount used by
a maximum size node table with 4K entries.  An upcoming patch that
converts the node table from a linear array to a hash table will
compact the node table to a more efficient design, but for clarity
it is nice to have all the Kconfig infrastruture go away separately.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:17 -04:00
Allan Stephens
d1bcb11544 tipc: Split up unified structure of network-related variables
Converts the fields of the global "tipc_net" structure into individual
variables.  Since the struct was never referenced as a complete unit,
its existence was pointless.  This will facilitate upcoming changes to
TIPC's node table and simpify upcoming relocation of the variables so
they are only visible to the files that actually use them.

This change is essentially cosmetic in nature, and doesn't affect the
operation of TIPC.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:17 -04:00
Allan Stephens
9df3b7eb6e tipc: Fix problem with missing link in "tipc-config -l" output
Removes a race condition that could cause TIPC's internal counter
of the number of links it has to neighboring nodes to have the
incorrect value if two independent threads of control simultaneously
create new link endpoints connecting to two different nodes using two
different bearers. Such under counting would result in TIPC failing to
list the final link(s) in its response to a configuration request to
list all of the node's links. The counter is now updated atomically
to ensure that simultaneous increments do not interfere with each
other.

Thanks go to Peter Butler <pbutler@pt.com> for his assistance in
diagnosing and fixing this problem.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:16 -04:00
Allan Stephens
71092ea122 tipc: Add support for SO_RCVTIMEO socket option
Adds support for the SO_RCVTIMEO socket option to TIPC's socket
receive routines.

Thanks go out to Raj Hegde <rajenhegde@yahoo.ca> for his contribution
to the development and testing this enhancement.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:16 -04:00
Allan Stephens
f137917332 tipc: Cosmetic changes to node subscription code
Relocates the code that notifies users of node subscriptions so that
it is adjacent to the rest of the routines that implement TIPC's node
subscription capability. Renames the name table routine that is
invoked by a node subscription to better reflect its purpose and to
be consistent with other, similar name table routines.

These changes are cosmetic in nature, and do not alter the behavior
of TIPC.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:16 -04:00
Allan Stephens
431697eb60 tipc: Prevent null pointer error when removing a node subscription
Prevents a null pointer dereference from occurring if a node subscription
is triggered at the same time that the subscribing port or publication is
terminating the subscription. The problem arises if the triggering routine
asynchronously activates and deregisters the node subscription while
deregistration is already underway -- the deregistration routine may find
that the pointer it has just verified to be non-NULL is now NULL.
To avoid this race condition the triggering routine now simply marks the
node subscription as defunct (to prevent it from re-activating)
instead of deregistering it. The subscription is now both deregistered
and destroyed only when the subscribing port or publication code terminates
the node subscription.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:16 -04:00
Allan Stephens
a3796f895f tipc: Add network address mask helper routines
Introduces a pair of helper routines that convert the network address
for a TIPC node into the network address for its cluster or zone.

This is a cosmetic change designed to avoid future errors caused by
the incorrect use of address bitmasks, and does not alter the existing
operation of TIPC.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:16 -04:00
Allan Stephens
aa84729484 tipc: Correct broadcast link peer info when displaying links
Fixes a typo in the calculation of the network address of a node's own
cluster when generating a response to the configuration command that
lists all of the node's links. The correct mask value for a <Z.C.N>
network address uses 1's for the 8-bit zone and 12-bit cluster parts
and 0's for the 12-bit node part.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:16 -04:00
Allan Stephens
0232fd0ac4 tipc: Allow receiving into iovec containing multiple entries
Enhances TIPC's socket receive routines to support iovec structures
containing more than a single entry. This change leverages existing
sk_buff routines to do most of the work; the only significant change
to TIPC itself is that an sk_buff now records how much data has been
already consumed as an numeric offset, rather than as a pointer to
the first unread data byte.

Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-03-13 16:35:16 -04:00
David S. Miller
bef55aebd5 decnet: Convert to use flowidn where applicable.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:55 -08:00
David S. Miller
1958b856c1 net: Put fl6_* macros to struct flowi6 and use them again.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:55 -08:00
David S. Miller
4c9483b2fb ipv6: Convert to use flowi6 where applicable.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:54 -08:00
David S. Miller
9cce96df5b net: Put fl4_* macros to struct flowi4 and use them again.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:54 -08:00
David S. Miller
f42454d632 ipv4: Kill fib_semantic_match declaration from fib_lookup.h
This function no longer exists.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:53 -08:00
David S. Miller
7e1dc7b6f7 net: Use flowi4 and flowi6 in xfrm layer.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:52 -08:00
David S. Miller
a1bbb0e698 netfilter: Use flowi4 and flowi6 in xt_TCPMSS
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:51 -08:00
David S. Miller
5a49d0e04d netfilter: Use flowi4 and flowi6 in nf_conntrack_h323_main
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:51 -08:00
David S. Miller
b6f21b2680 ipv4: Use flowi4 in UDP
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:50 -08:00
David S. Miller
3073e5ab92 netfilter: Use flowi4 in nf_nat_standalone.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:50 -08:00
David S. Miller
da91981bee ipv4: Use flowi4 in ipmr code.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:49 -08:00
David S. Miller
9ade22861f ipv4: Use flowi4 in FIB layer.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:49 -08:00
David S. Miller
9d6ec93801 ipv4: Use flowi4 in public route lookup interfaces.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:48 -08:00
David S. Miller
68a5e3dd0a ipv4: Use struct flowi4 internally in routing lookups.
We will change the externally visible APIs next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:48 -08:00
David S. Miller
22bd5b9b13 ipv4: Pass ipv4 flow objects into fib_lookup() paths.
To start doing these conversions, we need to add some temporary
flow4_* macros which will eventually go away when all the protocol
code paths are changed to work on AF specific flowi objects.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:47 -08:00
David S. Miller
56bb8059e1 net: Break struct flowi out into AF specific instances.
Now we have struct flowi4, flowi6, and flowidn for each address
family.  And struct flowi is just a union of them all.

It might have been troublesome to convert flow_cache_uli_match() but
as it turns out this function is completely unused and therefore can
be simply removed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:46 -08:00
David S. Miller
6281dcc94a net: Make flowi ports AF dependent.
Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*

This will let us to create AF optimal flow instances.

It will work because every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyways.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:46 -08:00
David S. Miller
1d28f42c1b net: Put flowi_* prefix on AF independent members of struct flowi
I intend to turn struct flowi into a union of AF specific flowi
structs.  There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:44 -08:00
David S. Miller
ca116922af xfrm: Eliminate "fl" and "pol" args to xfrm_bundle_ok().
There is only one caller of xfrm_bundle_ok(), and that always passes these
parameters as NULL.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:43 -08:00
David S. Miller
78fbfd8a65 ipv4: Create and use route lookup helpers.
The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:42 -08:00
Kevin Coffman
f8628220bb gss:krb5 only include enctype numbers in gm_upcall_enctypes
Make the value in gm_upcall_enctypes just the enctype values.
This allows the values to be used more easily elsewhere.

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:39:27 -05:00
Tom Tucker
5c635e09ce RPCRDMA: Fix FRMR registration/invalidate handling.
When the rpc_memreg_strategy is 5, FRMR are used to map RPC data.
This mode uses an FRMR to map the RPC data, then invalidates
(i.e. unregisers) the data in xprt_rdma_free. These FRMR are used
across connections on the same mount, i.e. if the connection goes
away on an idle timeout and reconnects later, the FRMR are not
destroyed and recreated.

This creates a problem for transport errors because the WR that
invalidate an FRMR may be flushed (i.e. fail) leaving the
FRMR valid. When the FRMR is later used to map an RPC it will fail,
tearing down the transport and starting over. Over time, more and
more of the FRMR pool end up in the wrong state resulting in
seemingly random disconnects.

This fix keeps track of the FRMR state explicitly by setting it's
state based on the successful completion of a reg/inv WR. If the FRMR
is ever used and found to be in the wrong state, an invalidate WR
is prepended, re-syncing the FRMR state and avoiding the connection loss.

Signed-off-by: Tom Tucker <tom@ogc.us>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:39:27 -05:00
Tom Tucker
bd7ea31b9e RPCRDMA: Fix to XDR page base interpretation in marshalling logic.
The RPCRDMA marshalling logic assumed that xdr->page_base was an
offset into the first page of xdr->page_list. It is in fact an
offset into the xdr->page_list itself, that is, it selects the
first page in the page_list and the offset into that page.

The symptom depended in part on the rpc_memreg_strategy, if it was
FRMR, or some other one-shot mapping mode, the connection would get
torn down on a base and bounds error. When the badly marshalled RPC
was retransmitted it would reconnect, get the error, and tear down the
connection again in a loop forever. This resulted in a hung-mount. For
the other modes, it would result in silent data corruption. This bug is
most easily reproduced by writing more data than the filesystem
has space for.

This fix corrects the page_base assumption and otherwise simplifies
the iov mapping logic.

Signed-off-by: Tom Tucker <tom@ogc.us>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:39:27 -05:00
Andy Adamson
cbdabc7f8b NFSv4.1: filelayout async error handler
Use our own async error handler.
Mark the layout as failed and retry i/o through the MDS on specified errors.

Update the mds_offset in nfs_readpage_retry so that a failed short-read retry
to a DS gets correctly resent through the MDS.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:43 -05:00
Fred Isaman
eabf5baaaa RPC: clarify rpc_run_task error handling
rpc_run_task can only fail if it is not passed in a preallocated task.
However, that is not at all clear with the current code.  So
remove several impossible to occur failure checks.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:40 -05:00
Fred Isaman
cee6a5372f RPC: remove check for impossible condition in rpc_make_runnable
queue_work() only returns 0 or 1, never a negative value.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:40 -05:00
John W. Linville
38c091590f mac80211: implement support for cfg80211_ops->{get,set}_ringparam
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-11 15:34:10 -05:00
John W. Linville
3677713b79 wireless: add support for ethtool_ops->{get,set}_ringparam
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-11 14:16:58 -05:00
Jason Young
808118cb41 mac80211: do not enable ps if 802.1x controlled port is unblocked
If dynamic_ps is disabled, enabling power save before the 4-way
handshake completes may delay the station from being authorized to
send/receive traffic, i.e. increase roaming times. It also may result in
a failed 4-way handshake depending on the AP's timing requirements and
beacon interval, and the station's listen interval.

To fix this, prevent power save from being enabled while the station
isn't authorized and recalculate power save whenever the station's
authorized state changes.

Signed-off-by: Jason Young <a.young.jason@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-11 14:15:37 -05:00
John W. Linville
409ec36c32 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-03-11 14:11:11 -05:00
David S. Miller
1b7fe59322 ipv4: Kill flowi arg to fib_select_multipath()
Completely unused.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 17:03:45 -08:00
David S. Miller
ff3fccb3d0 ipv4: Remove unnecessary test from ip_mkroute_input()
fl->oif will always be zero on the input path, so there is no reason
to test for that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 17:01:01 -08:00
David S. Miller
dbdd9a52e3 ipv4: Remove redundant RCU locking in ip_check_mc().
All callers are under rcu_read_lock() protection already.

Rename to ip_check_mc_rcu() to make it even more clear.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 16:37:26 -08:00
David S. Miller
33175d84ee Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x_cmn.c
2011-03-10 14:26:00 -08:00
stephen hemminger
6dfbd87a20 ip6ip6: autoload ip6 tunnel
Add necessary alias to autoload ip6ip6 tunnel module.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 14:18:48 -08:00
David S. Miller
bef6e7e768 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ 2011-03-10 14:00:44 -08:00
Randy Dunlap
dcbcdf22f5 net: bridge builtin vs. ipv6 modular
When configs BRIDGE=y and IPV6=m, this build error occurs:

br_multicast.c:(.text+0xa3341): undefined reference to `ipv6_dev_get_saddr'

BRIDGE_IGMP_SNOOPING is boolean; if it were tristate, then adding
	depends on IPV6 || IPV6=n
to BRIDGE_IGMP_SNOOPING would be a good fix.  As it is currently,
making BRIDGE depend on the IPV6 config works.

Reported-by: Patrick Schaaf <netdev@bof.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 13:45:57 -08:00
Ben Hutchings
4cea288aaf sunrpc: Propagate errors from xs_bind() through xs_create_sock()
xs_create_sock() is supposed to return a pointer or an ERR_PTR-encoded
error, but it currently returns 0 if xs_bind() fails.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org [v2.6.37]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:58 -05:00
Jesper Juhl
a5e5026810 SUNRPC: Remove resource leak in svc_rdma_send_error()
We leak the memory allocated to 'ctxt' when we return after
'ib_dma_mapping_error()' returns !=0.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:54 -05:00
Trond Myklebust
bf294b41ce SUNRPC: Close a race in __rpc_wait_for_completion_task()
Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.

For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.

This patch fixes a race which exists due to the fact that
rpciod calls rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then subsequently calls
rpc_put_task() without ensuring that these two steps are done atomically.

In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we drop trying to grab the spin lock,
and immediately free up the rpc_task.

Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:52 -05:00
Stephen Hemminger
a252bebe22 tcp: mark tcp_congestion_ops read_mostly
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 00:40:17 -08:00
David S. Miller
cc7e17ea04 ipv4: Optimize flow initialization in fib_validate_source().
Like in commit 44713b67db
("ipv4: Optimize flow initialization in output route lookup."
we can optimize the on-stack flow setup to only initialize
the members which are actually used.

Otherwise we bzero the entire structure, then initialize
explicitly the first half of it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 20:57:50 -08:00
David S. Miller
67e28ffd86 ipv4: Optimize flow initialization in input route lookup.
Like in commit 44713b67db
("ipv4: Optimize flow initialization in output route lookup."
we can optimize the on-stack flow setup to only initialize
the members which are actually used.

Otherwise we bzero the entire structure, then initialize
explicitly the first half of it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 20:42:07 -08:00
David S. Miller
7343ff31eb ipv6: Don't create clones of host routes.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=29252
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=30462

In commit d80bc0fd26 ("ipv6: Always
clone offlink routes.") we forced the kernel to always clone offlink
routes.

The reason we do that is to make sure we never bind an inetpeer to a
prefixed route.

The logic turned on here has existed in the tree for many years,
but was always off due to a protecting CPP define.  So perhaps
it's no surprise that there is a logic bug here.

The problem is that we canot clone a route that is already a
host route (ie. has DST_HOST set).  Because if we do, an identical
entry already exists in the routing tree and therefore the
ip6_rt_ins() call is going to fail.

This sets off a series of failures and high cpu usage, because when
ip6_rt_ins() fails we loop retrying this operation a few times in
order to handle a race between two threads trying to clone and insert
the same host route at the same time.

Fix this by simply using the route as-is when DST_HOST is set.

Reported-by: slash@ac.auone-net.jp
Reported-by: Ernst Sjöstrand <ernstp@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 19:55:25 -08:00
J. Bruce Fields
352b5d13c0 svcrpc: fix bad argument in unix_domain_find
"After merging the nfsd tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

net/sunrpc/svcauth_unix.c: In function 'unix_domain_find':
net/sunrpc/svcauth_unix.c:58: warning: passing argument 1 of
+'svcauth_unix_domain_release' from incompatible pointer type
net/sunrpc/svcauth_unix.c:41: note: expected 'struct auth_domain *' but
argument
+is of type 'struct unix_domain *'

Introduced by commit 8b3e07ac90 ("svcrpc: fix rare race on unix_domain
creation")."

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-09 22:40:30 -05:00
Vasiliy Kulikov
8909c9ad8f net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules
Since a8f80e8ff9 any process with
CAP_NET_ADMIN may load any module from /lib/modules/.  This doesn't mean
that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are
limited to /lib/modules/**.  However, CAP_NET_ADMIN capability shouldn't
allow anybody load any module not related to networking.

This patch restricts an ability of autoloading modules to netdev modules
with explicit aliases.  This fixes CVE-2011-1019.

Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior
of loading netdev modules by name (without any prefix) for processes
with CAP_SYS_MODULE to maintain the compatibility with network scripts
that use autoloading netdev modules by aliases like "eth0", "wlan0".

Currently there are only three users of the feature in the upstream
kernel: ipip, ip_gre and sit.

    root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
    root@albatros:~# grep Cap /proc/$$/status
    CapInh:	0000000000000000
    CapPrm:	fffffff800001000
    CapEff:	fffffff800001000
    CapBnd:	fffffff800001000
    root@albatros:~# modprobe xfs
    FATAL: Error inserting xfs
    (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
    root@albatros:~# lsmod | grep xfs
    root@albatros:~# ifconfig xfs
    xfs: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep xfs
    root@albatros:~# lsmod | grep sit
    root@albatros:~# ifconfig sit
    sit: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep sit
    root@albatros:~# ifconfig sit0
    sit0      Link encap:IPv6-in-IPv4
	      NOARP  MTU:1480  Metric:1

    root@albatros:~# lsmod | grep sit
    sit                    10457  0
    tunnel4                 2957  1 sit

For CAP_SYS_MODULE module loading is still relaxed:

    root@albatros:~# grep Cap /proc/$$/status
    CapInh:	0000000000000000
    CapPrm:	ffffffffffffffff
    CapEff:	ffffffffffffffff
    CapBnd:	ffffffffffffffff
    root@albatros:~# ifconfig xfs
    xfs: error fetching interface information: Device not found
    root@albatros:~# lsmod | grep xfs
    xfs                   745319  0

Reference: https://lkml.org/lkml/2011/2/24/203

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-03-10 10:25:19 +11:00
Daniel Turull
03a14ab134 pktgen: fix errata in show results
The units in show_results in pktgen were not correct.
The results are in usec but it was displayed nsec.

Reported-by: Jong-won Lee <ljw@handong.edu>
Signed-off-by: Daniel Turull <daniel.turull@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 14:11:00 -08:00
Mario Schuknecht
2f4e1b3970 tcp: ioctl type SIOCOUTQNSD returns amount of data not sent
In contrast to SIOCOUTQ which returns the amount of data sent
but not yet acknowledged plus data not yet sent this patch only
returns the data not sent.

For various methods of live streaming bitrate control it may
be helpful to know how much data are in the tcp outqueue are
not sent yet.

Signed-off-by: Mario Schuknecht <m.schuknecht@dresearch.de>
Signed-off-by: Steffen Sledz <sledz@dresearch.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 14:08:09 -08:00
David S. Miller
ee3f1aaf93 ipv4: Lookup multicast routes by rtable using helper.
Create a common helper for this operation, since we do
it identically in three spots.

Suggested by Eric Dumazet.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 14:06:20 -08:00
David S. Miller
6c91afe1a9 ipv4: Fix erroneous uses of ifa_address.
In usual cases ifa_address == ifa_local, but in the case where
SIOCSIFDSTADDR sets the destination address on a point-to-point
link, ifa_address gets set to that destination address.

Therefore we should use ifa_local when we want the local interface
address.

There were two cases where the selection was done incorrectly:

1) When devinet_ioctl() does matching, it checks ifa_address even
   though gifconf correct reported ifa_local to the user

2) IN_DEV_ARP_NOTIFY handling sends a gratuitous ARP using
   ifa_address instead of ifa_local.

Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 13:27:16 -08:00
Daniel Halperin
8d5eab5aa6 mac80211: update minstrel_ht sample rate when probe is set
Waiting until the status is received can cause the same rate to be
probed multiple times consecutively.

Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-09 16:10:58 -05:00
Scott James Remnant
4d9d88d121 net/wireless: add COUNTRY to to regulatory device uevent
Regulatory devices issue change uevents to inform userspace of a need
to call the crda tool; however these can often be sent before udevd is
running, and were not previously included in the results of
udevadm trigger (which requests a new change event using the /uevent
attribute of the sysfs object).

Add a uevent function to the device type which includes the COUNTRY
information from the last request if it has yet to be processed, the
case of multiple requests is already handled in the code by checking
whether an unprocessed one is queued in the same manner and refusing
to queue a new one.

The existing udev rule continues to work as before.

Signed-off-by: Scott James Remnant <keybuk@google.com>
Acked-By: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-03-09 16:10:57 -05:00
Rémi Denis-Courmont
a015f6f499 Phonet: kill the ST-Ericsson pipe controller Kconfig
This is now a run-time choice so that a single kernel can support both
old and new generation ISI modems. Support for manually enabling the
pipe flow is removed as it did not work properly, does not fit well
with the socket API, and I am not aware of any use at the moment.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:33 -08:00
Rémi Denis-Courmont
297edb6003 Phonet: support active connection without pipe controller on modem
This provides support for newer ISI modems with no need for the
earlier experimental compile-time alternative choice. With this,
we can now use the same kernel and userspace with both types of
modems.

This also avoids confusing two different and incompatible state
machines, actively connected vs accepted sockets, and adds
connection response error handling (processing "SYN/RST" of sorts).

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:33 -08:00
Rémi Denis-Courmont
acaf7df610 Phonet: provide pipe socket option to retrieve the pipe identifier
User-space sometimes needs this information. In particular, the GPRS
context or the AT commands pipe setups may use the pipe handle as a
reference.

This removes the settable pipe handle with CONFIG_PHONET_PIPECTRLR.
It did not handle error cases correctly. Furthermore, the kernel
*could* implement a smart scheme for allocating handles (if ever
needed), but userspace really cannot.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:32 -08:00
Rémi Denis-Courmont
f7ae8d59f6 Phonet: allocate sock from accept syscall rather than soft IRQ
This moves most of the accept logic to process context like other
socket stacks do. Then we can use a few more common socket helpers
and simplify a bit.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:32 -08:00
Rémi Denis-Courmont
44c9ab16d2 Phonet: factor common code to send control messages
With the addition of the pipe controller, there is now quite a bit
of repetitive code for small signaling messages. Lets factor it.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:31 -08:00
Rémi Denis-Courmont
0ebbf31863 Phonet: correct pipe backlog callback return values
In some cases, the Phonet pipe backlog callbacks returned negative
errno instead of NET_RX_* values.

In other cases, NET_RX_DROP was returned for invalid packets, even
though it seems only intended for buffering problems (not for
deliberately discarded packets).

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:30 -08:00
Rémi Denis-Courmont
b765e84f96 Phonet: return an error when packet TX fails
Phonet assumes that packets are never dropped. We try our best to
avoid this situation. But lets return ENOBUFS if queueing to the
network device fails so that the caller knows things went wrong.

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:30 -08:00
Rémi Denis-Courmont
c69d4407d8 Phonet: fix NULL dereference on TX path with implicit source
The previous Phonet patch series introduced per-socket implicit
destination (i.e. connect()). In that case, the destination
socket address is NULL in the transmit function.
However commit a8059512b1
("Phonet: implement per-socket destination/peer address")
is incomplete and would trigger a NULL dereference.
(Fortunately, the code is not in released kernel, and in fact
 currently not reachable.)

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09 11:59:29 -08:00