Commit Graph

17262 Commits

Author SHA1 Message Date
Trond Myklebust
68c404b18f Merge branch 'bugfixes' into nfs-for-2.6.38
Conflicts:
	fs/nfs/nfs2xdr.c
	fs/nfs/nfs3xdr.c
	fs/nfs/nfs4xdr.c
2011-01-10 14:48:02 -05:00
Trond Myklebust
6650239a4b NFS: Don't use vm_map_ram() in readdir
vm_map_ram() is not available on NOMMU platforms, and causes trouble
on incoherrent architectures such as ARM when we access the page data
through both the direct and the virtual mapping.

The alternative is to use the direct mapping to access page data
for the case when we are not crossing a page boundary, but to copy
the data into a linear scratch buffer when we are accessing data
that spans page boundaries.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: stable@kernel.org  [2.6.37]
2011-01-10 14:45:01 -05:00
Andy Adamson
4a19de0f4b NFS rename client back channel transport field
Differentiate from server backchannel

Signed-off-by: Andy Adamson <andros@netapp.com>
Acked-by: Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:25 -05:00
Andy Adamson
2c2618c6f2 NFS associate sessionid with callback connection
The sessions based callback service is started prior to the CREATE_SESSION call
so that it can handle CB_NULL requests which can be sent before the
CREATE_SESSION call returns and the session ID is known.

Set the callback sessionid after a sucessful CREATE_SESSION.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:24 -05:00
Andy Adamson
16b2d1e1d1 SUNRPC register and unregister the back channel transport
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:23 -05:00
Andy Adamson
1f11a034cd SUNRPC new transport for the NFSv4.1 shared back channel
Move the current sock create and destroy routines into the new transport ops.
Back channel socket will be destroyed by the svc_closs_all call in svc_destroy.

Added check: only TCP supported on shared back channel.

Signed-off-by: Andy Adamson <andros@netapp.com>
Acked-by: Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:23 -05:00
Andy Adamson
71e161a6a9 SUNRPC fix bc_send print
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:23 -05:00
Andy Adamson
4b5b3ba16b SUNRPC move svc_drop to caller of svc_process_common
The NFSv4.1 shared back channel does not need to call svc_drop because the
callback service never outlives the single connection it services, and it
reuses it's buffers and keeps the trasport.

Signed-off-by: Andy Adamson <andros@netapp.com>
Acked-by: Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:23 -05:00
Joel Sing
9fc3bbb4a7 ipv4/route.c: respect prefsrc for local routes
The preferred source address is currently ignored for local routes,
which results in all local connections having a src address that is the
same as the local dst address. Fix this by respecting the preferred source
address when it is provided for local routes.

This bug can be demonstrated as follows:

 # ifconfig dummy0 192.168.0.1
 # ip route show table local | grep local.*dummy0
 local 192.168.0.1 dev dummy0  proto kernel  scope host  src 192.168.0.1
 # ip route change table local local 192.168.0.1 dev dummy0 \
     proto kernel scope host src 127.0.0.1
 # ip route show table local | grep local.*dummy0
 local 192.168.0.1 dev dummy0  proto kernel  scope host  src 127.0.0.1

We now establish a local connection and verify the source IP
address selection:

 # nc -l 192.168.0.1 3128 &
 # nc 192.168.0.1 3128 &
 # netstat -ant | grep 192.168.0.1:3128.*EST
 tcp        0      0 192.168.0.1:3128        192.168.0.1:33228 ESTABLISHED
 tcp        0      0 192.168.0.1:33228       192.168.0.1:3128  ESTABLISHED

Signed-off-by: Joel Sing <jsing@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-04 11:35:12 -08:00
Trond Myklebust
beb0f0a9fb kernel panic when mount NFSv4
On Tue, 2010-12-14 at 16:58 +0800, Mi Jinlong wrote:
> Hi,
>
> When testing NFSv4 at RHEL6 with kernel 2.6.32, I got a kernel panic
> at NFS client's __rpc_create_common function.
>
> The panic place is:
>   rpc_mkpipe
>     __rpc_lookup_create()          <=== find pipefile *idmap*
>     __rpc_mkpipe()                 <=== pipefile is *idmap*
>       __rpc_create_common()
>        ******  BUG_ON(!d_unhashed(dentry)); ******    *panic*
>
> It means that the dentry's d_flags have be set DCACHE_UNHASHED,
> but it should not be set here.
>
> Is someone known this bug? or give me some idea?
>
> A reproduce program is append, but it can't reproduce the bug every time.
> the export is: "/nfsroot       *(rw,no_root_squash,fsid=0,insecure)"
>
> And the panic message is append.
>
> ============================================================================
> #!/bin/sh
>
> LOOPTOTAL=768
> LOOPCOUNT=0
> ret=0
>
> while [ $LOOPCOUNT -ne $LOOPTOTAL ]
> do
> 	((LOOPCOUNT += 1))
> 	service nfs restart
> 	/usr/sbin/rpc.idmapd
> 	mount -t nfs4 127.0.0.1:/ /mnt|| return 1;
> 	ls -l /var/lib/nfs/rpc_pipefs/nfs/*/
> 	umount /mnt
> 	echo $LOOPCOUNT
> done
>
> ===============================================================================
> Code: af 60 01 00 00 89 fa 89 f0 e8 64 cf 89 f0 e8 5c 7c 64 cf 31 c0 8b 5c 24 10 8b
> 74 24 14 8b 7c 24 18 8b 6c 24 1c 83 c4 20 c3 <0f> 0b eb fc 8b 46 28 c7 44 24 08 20
> de ee f0 c7 44 24 04 56 ea
> EIP:[<f0ee92ea>] __rpc_create_common+0x8a/0xc0 [sunrpc] SS:ESP 0068:eccb5d28
> ---[ end trace 8f5606cd08928ed2]---
> Kernel panic - not syncing: Fatal exception
> Pid:7131, comm: mount.nfs4 Tainted: G     D   -------------------2.6.32 #1
> Call Trace:
>  [<c080ad18>] ? panic+0x42/0xed
>  [<c080e42c>] ? oops_end+0xbc/0xd0
>  [<c040b090>] ? do_invalid_op+0x0/0x90
>  [<c040b10f>] ? do_invalid_op+0x7f/0x90
>  [<f0ee92ea>] ? __rpc_create_common+0x8a/0xc0[sunrpc]
>  [<f0edc433>] ? rpc_free_task+0x33/0x70[sunrpc]
>  [<f0ed6508>] ? prc_call_sync+0x48/0x60[sunrpc]
>  [<f0ed656e>] ? rpc_ping+0x4e/0x60[sunrpc]
>  [<f0ed6eaf>] ? rpc_create+0x38f/0x4f0[sunrpc]
>  [<c080d80b>] ? error_code+0x73/0x78
>  [<f0ee92ea>] ? __rpc_create_common+0x8a/0xc0[sunrpc]
>  [<c0532bda>] ? d_lookup+0x2a/0x40
>  [<f0ee94b1>] ? rpc_mkpipe+0x111/0x1b0[sunrpc]
>  [<f10a59f4>] ? nfs_create_rpc_client+0xb4/0xf0[nfs]
>  [<f10d6c6d>] ? nfs_fscache_get_client_cookie+0x1d/0x50[nfs]
>  [<f10d3fcb>] ? nfs_idmap_new+0x7b/0x140[nfs]
>  [<c05e76aa>] ? strlcpy+0x3a/0x60
>  [<f10a60ca>] ? nfs4_set_client+0xea/0x2b0[nfs]
>  [<f10a6d0c>] ? nfs4_create_server+0xac/0x1b0[nfs]
>  [<c04f1400>] ? krealloc+0x40/0x50
>  [<f10b0e8b>] ? nfs4_remote_get_sb+0x6b/0x250[nfs]
>  [<c04f14ec>] ? kstrdup+0x3c/0x60
>  [<c0520739>] ? vfs_kern_mount+0x69/0x170
>  [<f10b1a3c>] ? nfs_do_root_mount+0x6c/0xa0[nfs]
>  [<f10b1b47>] ? nfs4_try_mount+0x37/0xa0[nfs]
>  [<f10afe6d>] ? nfs4_validate_text_mount_data+-x7d/0xf0[nfs]
>  [<f10b1c42>] ? nfs4_get_sb+0x92/0x2f0
>  [<c0520739>] ? vfs_kern_mount+0x69/0x170
>  [<c05366d2>] ? get_fs_type+0x32/0xb0
>  [<c052089f>] ? do_kern_mount+0x3f/0xe0
>  [<c053954f>] ? do_mount+0x2ef/0x740
>  [<c0537740>] ? copy_mount_options+0xb0/0x120
>  [<c0539a0e>] ? sys_mount+0x6e/0xa0

Hi,

Does the following patch fix the problem?

Cheers
  Trond

--------------------------
SUNRPC: Fix a BUG in __rpc_create_common

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Mi Jinlong reports:

When testing NFSv4 at RHEL6 with kernel 2.6.32, I got a kernel panic
at NFS client's __rpc_create_common function.

The panic place is:
  rpc_mkpipe
      __rpc_lookup_create()          <=== find pipefile *idmap*
      __rpc_mkpipe()                 <=== pipefile is *idmap*
        __rpc_create_common()
         ******  BUG_ON(!d_unhashed(dentry)); ****** *panic*

The test is wrong: we can find ourselves with a hashed negative dentry here
if the idmapper tried to look up the file before we got round to creating
it.

Just replace the BUG_ON() with a d_drop(dentry).

Reported-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-04 13:10:38 -05:00
Florian Westphal
e6f26129eb bridge: stp: ensure mac header is set
commit bf9ae5386b
(llc: use dev_hard_header) removed the
skb_reset_mac_header call from llc_mac_hdr_init.

This seems fine itself, but br_send_bpdu() invokes ebtables LOCAL_OUT.

We oops in ebt_basic_match() because it assumes eth_hdr(skb) returns
a meaningful result.

Cc: acme@ghostprotocols.net
References: https://bugzilla.kernel.org/show_bug.cgi?id=24532
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-03 12:09:33 -08:00
Tomas Winkler
9d89081d69 bridge: fix br_multicast_ipv6_rcv for paged skbs
use pskb_may_pull to access ipv6 header correctly for paged skbs
It was omitted in the bridge code leading to crash in blind
__skb_pull

since the skb is cloned undonditionally we also simplify the
the exit path

this fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=25202

Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 IEEE 802.11: authenticated
Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 IEEE 802.11: associated (aid 2)
Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 RADIUS: starting accounting session 4D0608A3-00000005
Dec 15 14:36:41 User-PC kernel: [175576.120287] ------------[ cut here ]------------
Dec 15 14:36:41 User-PC kernel: [175576.120452] kernel BUG at include/linux/skbuff.h:1178!
Dec 15 14:36:41 User-PC kernel: [175576.120609] invalid opcode: 0000 [#1] SMP
Dec 15 14:36:41 User-PC kernel: [175576.120749] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
Dec 15 14:36:41 User-PC kernel: [175576.121035] Modules linked in: approvals binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.122712]
Dec 15 14:36:41 User-PC kernel: [175576.122769] Pid: 0, comm: kworker/0:0 Tainted: G        W   2.6.37-rc5-wl+ #3 1015PE/1016P
Dec 15 14:36:41 User-PC kernel: [175576.123012] EIP: 0060:[<f83edd65>] EFLAGS: 00010283 CPU: 1
Dec 15 14:36:41 User-PC kernel: [175576.123193] EIP is at br_multicast_rcv+0xc95/0xe1c [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.123362] EAX: 0000001c EBX: f5626318 ECX: 00000000 EDX: 00000000
Dec 15 14:36:41 User-PC kernel: [175576.123550] ESI: ec512262 EDI: f5626180 EBP: f60b5ca0 ESP: f60b5bd8
Dec 15 14:36:41 User-PC kernel: [175576.123737]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Dec 15 14:36:41 User-PC kernel: [175576.123902] Process kworker/0:0 (pid: 0, ti=f60b4000 task=f60a8000 task.ti=f60b0000)
Dec 15 14:36:41 User-PC kernel: [175576.124137] Stack:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ec556500 f6d06800 f60b5be8 c01087d8 ec512262 00000030 00000024 f5626180
Dec 15 14:36:41 User-PC kernel: [175576.124181]  f572c200 ef463440 f5626300 3affffff f6d06dd0 e60766a4 000000c4 f6d06860
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ffffffff ec55652c 00000001 f6d06844 f60b5c64 c0138264 c016e451 c013e47d
Dec 15 14:36:41 User-PC kernel: [175576.124181] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01087d8>] ? sched_clock+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0138264>] ? enqueue_entity+0x174/0x440
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e451>] ? sched_clock_cpu+0x131/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c013e47d>] ? select_task_rq_fair+0x2ad/0x730
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0524fc1>] ? nf_iterate+0x71/0x90
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4914>] ? br_handle_frame_finish+0x184/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ? br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e46e9>] ? br_handle_frame+0x189/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ? br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4560>] ? br_handle_frame+0x0/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff026>] ? __netif_receive_skb+0x1b6/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04f7a30>] ? skb_copy_bits+0x110/0x210
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0503a7f>] ? netif_receive_skb+0x6f/0x80
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cb74c>] ? ieee80211_deliver_skb+0x8c/0x1a0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cc836>] ? ieee80211_rx_handlers+0xeb6/0x1aa0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff1f0>] ? __netif_receive_skb+0x380/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e242>] ? sched_clock_local+0xb2/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c012b688>] ? default_spin_lock_flags+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ? _raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cd621>] ? ieee80211_prepare_and_rx_handle+0x201/0xa90 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82ce154>] ? ieee80211_rx+0x2a4/0x830 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f815a8d6>] ? iwl_update_stats+0xa6/0x2a0 [iwlcore]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8499212>] ? iwlagn_rx_reply_rx+0x292/0x3b0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ? _raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8483697>] ? iwl_rx_handle+0xe7/0x350 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8486ab7>] ? iwl_irq_tasklet+0xf7/0x5c0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01aece1>] ? __rcu_process_callbacks+0x201/0x2d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150d05>] ? tasklet_action+0xc5/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150a07>] ? __do_softirq+0x97/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d910c>] ? nmi_stack_correct+0x2f/0x34
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150970>] ? __do_softirq+0x0/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  <IRQ>
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01508f5>] ? irq_exit+0x65/0x70
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05df062>] ? do_IRQ+0x52/0xc0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01036b0>] ? common_interrupt+0x30/0x38
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c03a1fc2>] ? intel_idle+0xc2/0x160
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04daebb>] ? cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0101dea>] ? cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d2702>] ? start_secondary+0x1e8/0x1ee

Cc: David Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-03 11:26:34 -08:00
Dan Rosenberg
9f260e0efa CAN: Use inode instead of kernel address for /proc file
Since the socket address is just being used as a unique identifier, its
inode number is an alternative that does not leak potentially sensitive
information.

CC-ing stable because MITRE has assigned CVE-2010-4565 to the issue.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-31 11:13:27 -08:00
Linus Torvalds
d7c1255a3a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
  ipv4: dont create routes on down devices
  epic100: hamachi: yellowfin: Fix skb allocation size
  sundance: Fix oopses with corrupted skb_shared_info
  Revert "ipv4: Allow configuring subnets as local addresses"
  USB: mcs7830: return negative if auto negotiate fails
  irda: prevent integer underflow in IRLMP_ENUMDEVICES
  tcp: fix listening_get_next()
  atl1c: Do not use legacy PCI power management
  mac80211: fix mesh forwarding
  MAINTAINERS: email address change
  net: Fix range checks in tcf_valid_offset().
  net_sched: sch_sfq: fix allot handling
  hostap: remove netif_stop_queue from init
  mac80211/rt2x00: add ieee80211_tx_status_ni()
  typhoon: memory corruption in typhoon_get_drvinfo()
  net: Add USB PID for new MOSCHIP USB ethernet controller MCS7832 variant
  net_sched: always clone skbs
  ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed.
  netlink: fix gcc -Wconversion compilation warning
  asix: add USB ID for Logitec LAN-GTJ U2A
  ...
2010-12-26 12:06:56 -08:00
Eric Dumazet
fc75fc8339 ipv4: dont create routes on down devices
In ip_route_output_slow(), instead of allowing a route to be created on
a not UPed device, report -ENETUNREACH immediately.

# ip tunnel add mode ipip remote 10.16.0.164 local
10.16.0.72 dev eth0
# (Note : tunl1 is down)
# ping -I tunl1 10.1.2.3
PING 10.1.2.3 (10.1.2.3) from 192.168.18.5 tunl1: 56(84) bytes of data.
(nothing)
# ./a.out tunl1
# ip tunnel del tunl1
Message from syslogd@shelby at Dec 22 10:12:08 ...
  kernel: unregister_netdevice: waiting for tunl1 to become free.
Usage count = 3

After patch:
# ping -I tunl1 10.1.2.3
connect: Network is unreachable

Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-25 20:05:31 -08:00
David S. Miller
e058464990 Revert "ipv4: Allow configuring subnets as local addresses"
This reverts commit 4465b46900.

Conflicts:

	net/ipv4/fib_frontend.c

As reported by Ben Greear, this causes regressions:

> Change 4465b46900 caused rules
> to stop matching the input device properly because the
> FLOWI_FLAG_MATCH_ANY_IIF is always defined in ip_dev_find().
>
> This breaks rules such as:
>
> ip rule add pref 512 lookup local
> ip rule del pref 0 lookup local
> ip link set eth2 up
> ip -4 addr add 172.16.0.102/24 broadcast 172.16.0.255 dev eth2
> ip rule add to 172.16.0.102 iif eth2 lookup local pref 10
> ip rule add iif eth2 lookup 10001 pref 20
> ip route add 172.16.0.0/24 dev eth2 table 10001
> ip route add unreachable 0/0 table 10001
>
> If you had a second interface 'eth0' that was on a different
> subnet, pinging a system on that interface would fail:
>
>   [root@ct503-60 ~]# ping 192.168.100.1
>   connect: Invalid argument

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-23 12:03:57 -08:00
Dan Rosenberg
fdac1e0697 irda: prevent integer underflow in IRLMP_ENUMDEVICES
If the user-provided len is less than the expected offset, the
IRLMP_ENUMDEVICES getsockopt will do a copy_to_user() with a very large
size value.  While this isn't be a security issue on x86 because it will
get caught by the access_ok() check, it may leak large amounts of kernel
heap on other architectures.  In any event, this patch fixes it.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-23 10:09:43 -08:00
Eric Dumazet
1bde5ac493 tcp: fix listening_get_next()
Alexey Vlasov found /proc/net/tcp could sometime loop and display
millions of sockets in LISTEN state.

In 2.6.29, when we converted TCP hash tables to RCU, we left two
sk_next() calls in listening_get_next().

We must instead use sk_nulls_next() to properly detect an end of chain.

Reported-by: Alexey Vlasov <renton@renton.name>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-23 09:32:46 -08:00
David S. Miller
b7e03ec9a6 Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-12-22 17:34:40 -08:00
Johannes Berg
b51aff057c mac80211: fix mesh forwarding
Under memory pressure, the mac80211 mesh code
may helpfully print a message that it failed
to clone a mesh frame and then will proceed
to crash trying to use it anyway. Fix that.

Cc: stable@kernel.org [2.6.27+]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-22 13:36:35 -05:00
Joe Perches
b3bcedadf2 net/sunrpc/clnt.c: Convert sprintf_symbol to %ps
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-21 11:51:26 -05:00
Linus Torvalds
9d5004fcf6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: handle partial result from get_user_pages
  ceph: mark user pages dirty on direct-io reads
  ceph: fix null pointer dereference in ceph_init_dentry for nfs reexport
  ceph: fix direct-io on non-page-aligned buffers
  ceph: fix msgr_init error path
2010-12-20 21:32:20 -08:00
Eric Dumazet
aa3e219997 net_sched: sch_sfq: fix allot handling
When deploying SFQ/IFB here at work, I found the allot management was
pretty wrong in sfq, even changing allot from short to int...

We should init allot for each new flow, not using a previous value found
in slot.

Before patch, I saw bursts of several packets per flow, apparently
denying the default "quantum 1514" limit I had on my SFQ class.

class sfq 11:1 parent 11: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 7p requeues 0 
 allot 11546 

class sfq 11:46 parent 11: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 1p requeues 0 
 allot -23873 

class sfq 11:78 parent 11: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 5p requeues 0 
 allot 11393 

After patch, better fairness among each flow, allot limit being
respected, allot is positive :

class sfq 11:e parent 11: 
 (dropped 0, overlimits 0 requeues 86) 
 backlog 0b 3p requeues 86 
 allot 596 

class sfq 11:94 parent 11: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 3p requeues 0 
 allot 1468 

class sfq 11:a4 parent 11: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 4p requeues 0 
 allot 650 

class sfq 11:bb parent 11: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 3p requeues 0 
 allot 596 

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-20 13:18:16 -08:00
David Stevens
ad0081e43a ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed.
This patch modifies IPsec6 to fragment IPv6 packets that are
locally generated as needed.

This version of the patch only fragments in tunnel mode, so that fragment
headers will not be obscured by ESP in transport mode.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-19 20:22:23 -08:00
Henry C Chang
361cf40519 ceph: handle partial result from get_user_pages
The get_user_pages() helper can return fewer than the requested pages.
Error out in that case, and clean up the partial result.

Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-12-17 09:55:59 -08:00
Henry C Chang
b6aa5901c7 ceph: mark user pages dirty on direct-io reads
For read operation, we have to set the argument _write_ of get_user_pages
to 1 since we will write data to pages. Also, we need to SetPageDirty before
releasing these pages.

Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-12-17 09:54:40 -08:00
stephen hemminger
29ba5fed1b ipv6: don't flush routes when setting loopback down
When loopback device is being brought down, then keep the route table
entries because they are special. The entries in the local table for
linklocal routes and ::1 address should not be purged.

This is a sub optimal solution to the problem and should be replaced
by a better fix in future.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 18:26:26 -08:00
Wei Yongjun
7d743b7e95 sctp: fix the return value of getting the sctp partial delivery point
Get the sctp partial delivery point using SCTP_PARTIAL_DELIVERY_POINT
socket option should return 0 if success, not -ENOTSUPP.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 14:48:44 -08:00
David Stevens
76d661586c bridge: fix IPv6 queries for bridge multicast snooping
This patch fixes a missing ntohs() for bridge IPv6 multicast snooping.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 14:41:23 -08:00
Octavian Purdila
fcbdf09d96 net: fix nulls list corruptions in sk_prot_alloc
Special care is taken inside sk_port_alloc to avoid overwriting
skc_node/skc_nulls_node. We should also avoid overwriting
skc_bind_node/skc_portaddr_node.

The patch fixes the following crash:

 BUG: unable to handle kernel paging request at fffffffffffffff0
 IP: [<ffffffff812ec6dd>] udp4_lib_lookup2+0xad/0x370
 [<ffffffff812ecc22>] __udp4_lib_lookup+0x282/0x360
 [<ffffffff812ed63e>] __udp4_lib_rcv+0x31e/0x700
 [<ffffffff812bba45>] ? ip_local_deliver_finish+0x65/0x190
 [<ffffffff812bbbf8>] ? ip_local_deliver+0x88/0xa0
 [<ffffffff812eda35>] udp_rcv+0x15/0x20
 [<ffffffff812bba45>] ip_local_deliver_finish+0x65/0x190
 [<ffffffff812bbbf8>] ip_local_deliver+0x88/0xa0
 [<ffffffff812bb2cd>] ip_rcv_finish+0x32d/0x6f0
 [<ffffffff8128c14c>] ? netif_receive_skb+0x99c/0x11c0
 [<ffffffff812bb94b>] ip_rcv+0x2bb/0x350
 [<ffffffff8128c14c>] netif_receive_skb+0x99c/0x11c0

Signed-off-by: Leonard Crestez <lcrestez@ixiacom.com>
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 14:26:56 -08:00
Andrey Vagin
d3052b557a ipv6: delete expired route in ip6_pmtu_deliver
The first big packets sent to a "low-MTU" client correctly
triggers the creation of a temporary route containing the reduced MTU.

But after the temporary route has expired, new ICMP6 "packet too big"
will be sent, rt6_pmtu_discovery will find the previous EXPIRED route
check that its mtu isn't bigger then in icmp packet and do nothing
before the temporary route will not deleted by gc.

I make the simple experiment:
while :; do
    time ( dd if=/dev/zero bs=10K count=1 | ssh hostname dd of=/dev/null ) || break;
done

The "time" reports real 0m0.197s if a temporary route isn't expired, but
it reports real 0m52.837s (!!!!) immediately after a temporare route has
expired.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 12:28:13 -08:00
Chuck Lever
bf2695516d SUNRPC: New xdr_streams XDR decoder API
Now that all client-side XDR decoder routines use xdr_streams, there
should be no need to support the legacy calling sequence [rpc_rqst *,
__be32 *, RPC res *] anywhere.  We can construct an xdr_stream in the
generic RPC code, instead of in each decoder function.

This is a refactoring change.  It should not cause different behavior.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-16 12:37:25 -05:00
Chuck Lever
9f06c719f4 SUNRPC: New xdr_streams XDR encoder API
Now that all client-side XDR encoder routines use xdr_streams, there
should be no need to support the legacy calling sequence [rpc_rqst *,
__be32 *, RPC arg *] anywhere.  We can construct an xdr_stream in the
generic RPC code, instead of in each encoder function.

Also, all the client-side encoder functions return 0 now, making a
return value superfluous.  Take this opportunity to convert them to
return void instead.

This is a refactoring change.  It should not cause different behavior.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-16 12:37:25 -05:00
Chuck Lever
1ac7c23e4a SUNRPC: Determine value of "nrprocs" automatically
Clean up.

Just fixed a panic where the nrprocs field in a different upper layer
client was set by hand incorrectly.  Use the compiler-generated method
used by the other upper layer protocols.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-16 12:37:25 -05:00
Chuck Lever
4129ccf303 SUNRPC: Avoid return code checking in rpcbind XDR encoder functions
Clean up.

The trend in the other XDR encoder functions is to BUG() when encoding
problems occur, since a problem here is always due to a local coding
error.  Then, instead of a status, zero is unconditionally returned.

Update the rpcbind XDR encoders to behave this way.

To finish the update, use the new-style be32_to_cpup() and
cpu_to_be32() macros, and compute the buffer sizes using raw integers
instead of sizeof().  This matches the conventions used in other XDR
functions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-16 12:37:25 -05:00
David S. Miller
82cc4f5cb8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-12-15 09:43:13 -08:00
Linus Torvalds
b4fe2a0342 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (75 commits)
  pppoe.c: Fix kernel panic caused by __pppoe_xmit
  WAN: Fix a TX IRQ causing BUG() in PC300 and PCI200SYN drivers.
  bnx2x: Advance a version number to 1.60.01-0
  bnx2x: Fixed a compilation warning
  bnx2x: LSO code was broken on BE platforms
  qlge: Fix deadlock when cancelling worker.
  net: fix skb_defer_rx_timestamp()
  cxgb4vf: Ingress Queue Entry Size needs to be 64 bytes
  phy: add the IC+ IP1001 driver
  atm: correct sysfs 'device' link creation and parent relationships
  MAINTAINERS: remove me from tulip
  SCTP: Fix SCTP_SET_PEER_PRIMARY_ADDR to accpet v4mapped address
  enic: Bug Fix: Pass napi reference to the isr that services receive queue
  ipv6: fix nl group when advertising a new link
  connector: add module alias
  net: Document the kernel_recvmsg() function
  r8169: Fix runtime power management
  hso: IP checksuming doesn't work on GE0301 option cards
  xfrm: Fix xfrm_state_migrate leak
  net: Convert netpoll blocking api in bonding driver to be a counter
  ...
2010-12-14 17:33:40 -08:00
Sage Weil
d96c9043d1 ceph: fix msgr_init error path
create_workqueue() returns NULL on failure.

Signed-off-by: Sage Weil <sage@newdream.net>
2010-12-13 20:30:28 -08:00
Herton Ronaldo Krzesinski
8808f64171 mac80211: avoid calling ieee80211_work_work unconditionally
On suspend, there might be usb wireless drivers which wrongly trigger
the warning in ieee80211_work_work. If an usb driver doesn't have a
suspend hook, the usb stack will disconnect the device. On disconnect,
a mac80211 driver calls ieee80211_unregister_hw, which calls dev_close,
which calls ieee80211_stop, and in the end calls ieee80211_work_purge->
ieee80211_work_work.

The problem is that this call to ieee80211_work_purge comes after
mac80211 is suspended, triggering the warning even when we don't have
work queued in work_list (the expected case when already suspended),
because it always calls ieee80211_work_work.

So, just call ieee80211_work_work in ieee80211_work_purge if we really
have to abort work. This addresses the warning reported at
https://bugzilla.kernel.org/show_bug.cgi?id=24402

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-13 14:55:08 -05:00
Tim Harvey
c926d006c1 mac80211: Fix NULL-pointer deference on ibss merge when not ready
dev_open will eventually call ieee80211_ibss_join which sets up the
skb used for beacons/probe-responses however it is possible to
receive beacons that attempt to merge before this occurs causing
a null pointer dereference.  Check ssid_len as that is the last
thing set in ieee80211_ibss_join.

This occurs quite easily in the presence of adhoc nodes with hidden SSID's

revised previous patch to check further up based on irc feedback

Signed-off-by: Tim Harvey <harvey.tim@gmail.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-12-13 14:53:46 -05:00
John W. Linville
10c38c3306 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-2.6 2010-12-13 14:41:23 -05:00
Eric Dumazet
a19faf0250 net: fix skb_defer_rx_timestamp()
After commit c1f19b51d1 (net: support time stamping in phy devices.),
kernel might crash if CONFIG_NETWORK_PHY_TIMESTAMPING=y and
skb_defer_rx_timestamp() handles a packet without an ethernet header.

Fixes kernel bugzilla #24102

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=24102
Reported-and-tested-by: Andrew Watts <akwatts@ymail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 16:20:56 -08:00
Dan Williams
d9ca676bcb atm: correct sysfs 'device' link creation and parent relationships
The ATM subsystem was incorrectly creating the 'device' link for ATM
nodes in sysfs.  This led to incorrect device/parent relationships
exposed by sysfs and udev.  Instead of rolling the 'device' link by hand
in the generic ATM code, pass each ATM driver's bus device down to the
sysfs code and let sysfs do this stuff correctly.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 15:45:05 -08:00
Wei Yongjun
40a010395c SCTP: Fix SCTP_SET_PEER_PRIMARY_ADDR to accpet v4mapped address
SCTP_SET_PEER_PRIMARY_ADDR does not accpet v4mapped address, using
v4mapped address in SCTP_SET_PEER_PRIMARY_ADDR socket option will
get -EADDRNOTAVAIL error if v4map is enabled. This patch try to
fix it by mapping v4mapped address to v4 address if allowed.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 15:29:49 -08:00
David S. Miller
e91db5cd6f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2010-12-10 12:51:02 -08:00
Nicolas Dichtel
5f75a1042f ipv6: fix nl group when advertising a new link
New idev are advertised with NL group RTNLGRP_IPV6_IFADDR, but
should use RTNLGRP_IPV6_IFINFO.
Bug was introduced by commit 8d7a76c9.

Signed-off-by: Wang Xuefu <xuefu.wang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 12:49:36 -08:00
Martin Lucina
c1249c0aae net: Document the kernel_recvmsg() function
Signed-off-by: Martin Lucina <mato@kotelna.sk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 11:13:18 -08:00
Thomas Egerer
78347c8c6b xfrm: Fix xfrm_state_migrate leak
xfrm_state_migrate calls kfree instead of xfrm_state_put to free
a failed state. According to git commit 553f9118 this can cause
memory leaks.

Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-09 20:35:27 -08:00
David S. Miller
4e085e76cb econet: Fix crash in aun_incoming().
Unconditional use of skb->dev won't work here,
try to fetch the econet device via skb_dst()->dev
instead.

Suggested by Eric Dumazet.

Reported-by: Nelson Elhage <nelhage@ksplice.com>
Tested-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-08 20:51:15 -08:00
Eric Dumazet
f19872575f tcp: protect sysctl_tcp_cookie_size reads
Make sure sysctl_tcp_cookie_size is read once in
tcp_cookie_size_check(), or we might return an illegal value to caller
if sysctl_tcp_cookie_size is changed by another cpu.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: William Allen Simpson <william.allen.simpson@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-08 12:34:09 -08:00