Commit Graph

17896 Commits

Author SHA1 Message Date
Lucian Adrian Grijincu
c486da3439 sysctl: ipv6: use correct net in ipv6_sysctl_rtcache_flush
Before this patch issuing these commands:

  fd = open("/proc/sys/net/ipv6/route/flush")
  unshare(CLONE_NEWNET)
  write(fd, "stuff")

would flush the newly created net, not the original one.

The equivalent ipv4 code is correct (stores the net inside ->extra1).
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-25 11:01:56 -08:00
Linus Torvalds
ef3242859f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
  Added support for usb ethernet (0x0fe6, 0x9700)
  r8169: fix RTL8168DP power off issue.
  r8169: correct settings of rtl8102e.
  r8169: fix incorrect args to oob notify.
  DM9000B: Fix PHY power for network down/up
  DM9000B: Fix reg_save after spin_lock in dm9000_timeout
  net_sched: long word align struct qdisc_skb_cb data
  sfc: lower stack usage in efx_ethtool_self_test
  bridge: Use IPv6 link-local address for multicast listener queries
  bridge: Fix MLD queries' ethernet source address
  bridge: Allow mcast snooping for transient link local addresses too
  ipv6: Add IPv6 multicast address flag defines
  bridge: Add missing ntohs()s for MLDv2 report parsing
  bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report
  bridge: Fix IPv6 multicast snooping by storing correct protocol type
  p54pci: update receive dma buffers before and after processing
  fix cfg80211_wext_siwfreq lock ordering...
  rt2x00: Fix WPA TKIP Michael MIC failures.
  ath5k: Fix fast channel switching
  tcp: undo_retrans counter fixes
  ...
2011-02-23 16:02:00 -08:00
David S. Miller
d3bd1b4c89 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-02-22 11:53:05 -08:00
Linus Lüssing
fe29ec41aa bridge: Use IPv6 link-local address for multicast listener queries
Currently the bridge multicast snooping feature periodically issues
IPv6 general multicast listener queries to sense the absence of a
listener.

For this, it uses :: as its source address - however RFC 2710 requires:
"To be valid, the Query message MUST come from a link-local IPv6 Source
Address". Current Linux kernel versions seem to follow this requirement
and ignore our bogus MLD queries.

With this commit a link local address from the bridge interface is being
used to issue the MLD query, resulting in other Linux devices which are
multicast listeners in the network to respond with a MLD response (which
was not the case before).

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:29 -08:00
Linus Lüssing
36cff5a10c bridge: Fix MLD queries' ethernet source address
Map the IPv6 header's destination multicast address to an ethernet
source address instead of the MLD queries multicast address.

For instance for a general MLD query (multicast address in the MLD query
set to ::), this would wrongly be mapped to 33:33:00:00:00:00, although
an MLD queries destination MAC should always be 33:33:00:00:00:01 which
matches the IPv6 header's multicast destination ff02::1.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:28 -08:00
Linus Lüssing
e4de9f9e83 bridge: Allow mcast snooping for transient link local addresses too
Currently the multicast bridge snooping support is not active for
link local multicast. I assume this has been done to leave
important multicast data untouched, like IPv6 Neighborhood Discovery.

In larger, bridged, local networks it could however be desirable to
optimize for instance local multicast audio/video streaming too.

With the transient flag in IPv6 multicast addresses we have an easy
way to optimize such multimedia traffic without tempering with the
high priority multicast data from well-known addresses.

This patch alters the multicast bridge snooping for IPv6, to take
effect for transient multicast addresses instead of non-link-local
addresses.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:28 -08:00
Linus Lüssing
d41db9f3f7 bridge: Add missing ntohs()s for MLDv2 report parsing
The nsrcs number is 2 Byte wide, therefore we need to call ntohs()
before using it.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:27 -08:00
Linus Lüssing
649e984d00 bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report
We actually want a pointer to the grec_nsrcr and not the following
field. Otherwise we can get very high values for *nsrcs as the first two
bytes of the IPv6 multicast address are being used instead, leading to
a failing pskb_may_pull() which results in MLDv2 reports not being
parsed.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:26 -08:00
Linus Lüssing
9cc6e0c4c4 bridge: Fix IPv6 multicast snooping by storing correct protocol type
The protocol type for IPv6 entries in the hash table for multicast
bridge snooping is falsely set to ETH_P_IP, marking it as an IPv4
address, instead of setting it to ETH_P_IPV6, which results in negative
look-ups in the hash table later.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:26 -08:00
Linus Torvalds
8bd89ca220 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: keep reference to parent inode on ceph_dentry
  ceph: queue cap_snaps once per realm
  libceph: fix socket write error handling
  libceph: fix socket read error handling
2011-02-21 15:01:38 -08:00
Daniel J Blueman
4f919a3bc5 fix cfg80211_wext_siwfreq lock ordering...
I previously managed to reproduce a hang while scanning wireless
channels (reproducible with airodump-ng hopping channels); subsequent
lockdep instrumentation revealed a lock ordering issue.

Without knowing the design intent, it looks like the locks should be
taken in reverse order; please comment.

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.38-rc5-341cd #4
-------------------------------------------------------
airodump-ng/15445 is trying to acquire lock:
 (&rdev->devlist_mtx){+.+.+.}, at: [<ffffffff816b1266>]
cfg80211_wext_siwfreq+0xc6/0x100

but task is already holding lock:
 (&wdev->mtx){+.+.+.}, at: [<ffffffff816b125c>] cfg80211_wext_siwfreq+0xbc/0x100

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&wdev->mtx){+.+.+.}:
       [<ffffffff810a79d6>] lock_acquire+0xc6/0x280
       [<ffffffff816d6bce>] mutex_lock_nested+0x6e/0x4b0
       [<ffffffff81696080>] cfg80211_netdev_notifier_call+0x430/0x5f0
       [<ffffffff8109351b>] notifier_call_chain+0x8b/0x100
       [<ffffffff810935b1>] raw_notifier_call_chain+0x11/0x20
       [<ffffffff81576d92>] call_netdevice_notifiers+0x32/0x60
       [<ffffffff815771a4>] __dev_notify_flags+0x34/0x80
       [<ffffffff81577230>] dev_change_flags+0x40/0x70
       [<ffffffff8158587c>] do_setlink+0x1fc/0x8d0
       [<ffffffff81586042>] rtnl_setlink+0xf2/0x140
       [<ffffffff81586923>] rtnetlink_rcv_msg+0x163/0x270
       [<ffffffff8159d741>] netlink_rcv_skb+0xa1/0xd0
       [<ffffffff815867b0>] rtnetlink_rcv+0x20/0x30
       [<ffffffff8159d39a>] netlink_unicast+0x2ba/0x300
       [<ffffffff8159dd57>] netlink_sendmsg+0x267/0x3e0
       [<ffffffff8155e364>] sock_sendmsg+0xe4/0x110
       [<ffffffff8155f3a3>] sys_sendmsg+0x253/0x3b0
       [<ffffffff81003192>] system_call_fastpath+0x16/0x1b

-> #0 (&rdev->devlist_mtx){+.+.+.}:
       [<ffffffff810a7222>] __lock_acquire+0x1622/0x1d10
       [<ffffffff810a79d6>] lock_acquire+0xc6/0x280
       [<ffffffff816d6bce>] mutex_lock_nested+0x6e/0x4b0
       [<ffffffff816b1266>] cfg80211_wext_siwfreq+0xc6/0x100
       [<ffffffff816b2fad>] ioctl_standard_call+0x5d/0xd0
       [<ffffffff816b3223>] T.808+0x163/0x170
       [<ffffffff816b326a>] wext_handle_ioctl+0x3a/0x90
       [<ffffffff815798d2>] dev_ioctl+0x6f2/0x830
       [<ffffffff8155cf3d>] sock_ioctl+0xfd/0x290
       [<ffffffff8117dffd>] do_vfs_ioctl+0x9d/0x590
       [<ffffffff8117e53a>] sys_ioctl+0x4a/0x80
       [<ffffffff81003192>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

2 locks held by airodump-ng/15445:
 #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81586782>] rtnl_lock+0x12/0x20
 #1:  (&wdev->mtx){+.+.+.}, at: [<ffffffff816b125c>]
cfg80211_wext_siwfreq+0xbc/0x100

stack backtrace:
Pid: 15445, comm: airodump-ng Not tainted 2.6.38-rc5-341cd #4
Call Trace:
 [<ffffffff810a3f0a>] ? print_circular_bug+0xfa/0x100
 [<ffffffff810a7222>] ? __lock_acquire+0x1622/0x1d10
 [<ffffffff810a1f99>] ? trace_hardirqs_off_caller+0x29/0xc0
 [<ffffffff810a79d6>] ? lock_acquire+0xc6/0x280
 [<ffffffff816b1266>] ? cfg80211_wext_siwfreq+0xc6/0x100
 [<ffffffff810a31d7>] ? mark_held_locks+0x67/0x90
 [<ffffffff816d6bce>] ? mutex_lock_nested+0x6e/0x4b0
 [<ffffffff816b1266>] ? cfg80211_wext_siwfreq+0xc6/0x100
 [<ffffffff810a31d7>] ? mark_held_locks+0x67/0x90
 [<ffffffff816b1266>] ? cfg80211_wext_siwfreq+0xc6/0x100
 [<ffffffff816b1266>] ? cfg80211_wext_siwfreq+0xc6/0x100
 [<ffffffff816b2fad>] ? ioctl_standard_call+0x5d/0xd0
 [<ffffffff8157818b>] ? __dev_get_by_name+0x9b/0xc0
 [<ffffffff816b2f50>] ? ioctl_standard_call+0x0/0xd0
 [<ffffffff816b3223>] ? T.808+0x163/0x170
 [<ffffffff8112ddf2>] ? might_fault+0x72/0xd0
 [<ffffffff816b326a>] ? wext_handle_ioctl+0x3a/0x90
 [<ffffffff8112de3b>] ? might_fault+0xbb/0xd0
 [<ffffffff815798d2>] ? dev_ioctl+0x6f2/0x830
 [<ffffffff810a1bae>] ? put_lock_stats+0xe/0x40
 [<ffffffff810a1c8c>] ? lock_release_holdtime+0xac/0x150
 [<ffffffff8155cf3d>] ? sock_ioctl+0xfd/0x290
 [<ffffffff8117dffd>] ? do_vfs_ioctl+0x9d/0x590
 [<ffffffff8116c8ff>] ? fget_light+0x1df/0x3c0
 [<ffffffff8117e53a>] ? sys_ioctl+0x4a/0x80
 [<ffffffff81003192>] ? system_call_fastpath+0x16/0x1b

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-21 15:14:25 -05:00
Yuchung Cheng
c24f691b56 tcp: undo_retrans counter fixes
Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
not set or undo_retrans is already 0. This happens when sender receives
more DSACK ACKs than packets retransmitted during the current
undo phase. This may also happen when sender receives DSACK after
the undo operation is completed or cancelled.

Fix another bug that undo_retrans is incorrectly incremented when
sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
is rare but not impossible.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-21 11:31:18 -08:00
Tejun Heo
43d133c18b Merge branch 'master' into for-2.6.39 2011-02-21 09:43:56 +01:00
Eric W. Biederman
5f04d5068a net: Fix more stale on-stack list_head objects.
From: Eric W. Biederman <ebiederm@xmission.com>

In the beginning with batching unreg_list was a list that was used only
once in the lifetime of a network device (I think).  Now we have calls
using the unreg_list that can happen multiple times in the life of a
network device like dev_deactivate and dev_close that are also using the
unreg_list.  In addition in unregister_netdevice_queue we also do a
list_move because for devices like veth pairs it is possible that
unregister_netdevice_queue will be called multiple times.

So I think the change below to fix dev_deactivate which Eric D. missed
will fix this problem.  Now to go test that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-20 11:49:45 -08:00
Jiri Bohac
2205a6ea93 sctp: fix reporting of unknown parameters
commit 5fa782c2f5 re-worked the
handling of unknown parameters. sctp_init_cause_fixed() can now
return -ENOSPC if there is not enough tailroom in the error
chunk skb. When this happens, the error header is not appended to
the error chunk. In that case, the payload of the unknown parameter
should not be appended either.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-19 19:06:55 -08:00
Eric Dumazet
91035f0b7d tcp: fix inet_twsk_deschedule()
Eric W. Biederman reported a lockdep splat in inet_twsk_deschedule()

This is caused by inet_twsk_purge(), run from process context,
and commit 575f4cd5a5 (net: Use rcu lookups in inet_twsk_purge.)
removed the BH disabling that was necessary.

Add the BH disabling but fine grained, right before calling
inet_twsk_deschedule(), instead of whole function.

With help from Linus Torvalds and Eric W. Biederman

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Daniel Lezcano <daniel.lezcano@free.fr>
CC: Pavel Emelyanov <xemul@openvz.org>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: stable <stable@kernel.org> (# 2.6.33+)
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-19 18:59:04 -08:00
David S. Miller
ece639caa3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-02-19 16:42:37 -08:00
Linus Torvalds
4c3021da45 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
  net: deinit automatic LIST_HEAD
  net: dont leave active on stack LIST_HEAD
  net: provide default_advmss() methods to blackhole dst_ops
  tg3: Restrict phy ioctl access
  drivers/net: Call netif_carrier_off at the end of the probe
  ixgbe: work around for DDP last buffer size
  ixgbe: fix panic due to uninitialised pointer
  e1000e: flush all writebacks before unload
  e1000e: check down flag in tasks
  isdn: hisax: Use l2headersize() instead of dup (and buggy) func.
  arp_notify: unconditionally send gratuitous ARP for NETDEV_NOTIFY_PEERS.
  cxgb4vf: Use defined Mailbox Timeout
  cxgb4vf: Quiesce Virtual Interfaces on shutdown ...
  cxgb4vf: Behave properly when CONFIG_DEBUG_FS isn't defined ...
  cxgb4vf: Check driver parameters in the right place ...
  pch_gbe: Fix the MAC Address load issue.
  iwlwifi: Delete iwl3945_good_plcp_health.
  net/can/softing: make CAN_SOFTING_CS depend on CAN_SOFTING
  netfilter: nf_iterate: fix incorrect RCU usage
  pch_gbe: Fix the issue that the receiving data is not normal.
  ...
2011-02-18 14:15:05 -08:00
Stanislaw Gruszka
05e7c99136 mac80211: fix conn_mon_timer running after disassociate
Low level driver could pass rx frames to us after disassociate, what
can lead to run conn_mon_timer by ieee80211_sta_rx_notify(). That
is obviously wrong, but nothing happens until we unload modules and
resources are used after free. If kernel debugging is enabled following
warning could be observed:

WARNING: at lib/debugobjects.c:259 debug_print_object+0x65/0x70()
Hardware name: HP xw8600 Workstation
ODEBUG: free active (active state 0) object type: timer_list
Modules linked in: iwlagn(-) iwlcore mac80211 cfg80211 aes_x86_64 aes_generic fuse cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod uinput hp_wmi sparse_keymap sg wmi arc4 microcode serio_raw ecb tg3 shpchp rfkill ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif firewire_ohci firewire_core crc_itu_t mptsas mptscsih mptbase scsi_transport_sas ahci libahci pata_acpi ata_generic ata_piix floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: cfg80211]
Pid: 13827, comm: rmmod Tainted: G        W   2.6.38-rc4-wl+ #22
Call Trace:
 [<ffffffff810649cf>] ? warn_slowpath_common+0x7f/0xc0
 [<ffffffff81064ac6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff81226fc5>] ? debug_print_object+0x65/0x70
 [<ffffffff81227625>] ? debug_check_no_obj_freed+0x125/0x210
 [<ffffffff8109ebd7>] ? debug_check_no_locks_freed+0xf7/0x170
 [<ffffffff81156092>] ? kfree+0xc2/0x2f0
 [<ffffffff813ec5c5>] ? netdev_release+0x45/0x60
 [<ffffffff812f1067>] ? device_release+0x27/0xa0
 [<ffffffff81216ddd>] ? kobject_release+0x8d/0x1a0
 [<ffffffff81216d50>] ? kobject_release+0x0/0x1a0
 [<ffffffff812183b7>] ? kref_put+0x37/0x70
 [<ffffffff81216c57>] ? kobject_put+0x27/0x60
 [<ffffffff813d5d1b>] ? netdev_run_todo+0x1ab/0x270
 [<ffffffff813e771e>] ? rtnl_unlock+0xe/0x10
 [<ffffffffa0581188>] ? ieee80211_unregister_hw+0x58/0x120 [mac80211]
 [<ffffffffa0377ed7>] ? iwl_pci_remove+0xdb/0x22a [iwlagn]
 [<ffffffff8123cde2>] ? pci_device_remove+0x52/0x120
 [<ffffffff812f5205>] ? __device_release_driver+0x75/0xe0
 [<ffffffff812f5348>] ? driver_detach+0xd8/0xe0
 [<ffffffff812f4111>] ? bus_remove_driver+0x91/0x100
 [<ffffffff812f5b62>] ? driver_unregister+0x62/0xa0
 [<ffffffff8123d194>] ? pci_unregister_driver+0x44/0xa0
 [<ffffffffa0377df5>] ? iwl_exit+0x15/0x1c [iwlagn]
 [<ffffffff810ab492>] ? sys_delete_module+0x1a2/0x270
 [<ffffffff81498889>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8100bf42>] ? system_call_fastpath+0x16/0x1b

Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-18 16:47:37 -05:00
Eric Dumazet
ceaaec98ad net: deinit automatic LIST_HEAD
commit 9b5e383c11 (net: Introduce
unregister_netdevice_many()) left an active LIST_HEAD() in
rollback_registered(), with possible memory corruption.

Even if device is freed without touching its unreg_list (and therefore
touching the previous memory location holding LISTE_HEAD(single), better
close the bug for good, since its really subtle.

(Same fix for default_device_exit_batch() for completeness)

Reported-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Eric W. Biderman <ebiderman@xmission.com>
Tested-by: Eric W. Biderman <ebiderman@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Octavian Purdila <opurdila@ixiacom.com>
CC: stable <stable@kernel.org> [.33+]
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-18 11:49:36 -08:00
Linus Torvalds
f87e6f4793 net: dont leave active on stack LIST_HEAD
Eric W. Biderman and Michal Hocko reported various memory corruptions
that we suspected to be related to a LIST head located on stack, that
was manipulated after thread left function frame (and eventually exited,
so its stack was freed and reused).

Eric Dumazet suggested the problem was probably coming from commit
443457242b (net: factorize
sync-rcu call in unregister_netdevice_many)

This patch fixes __dev_close() and dev_close() to properly deinit their
respective LIST_HEAD(single) before exiting.

References: https://lkml.org/lkml/2011/2/16/304
References: https://lkml.org/lkml/2011/2/14/223

Reported-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Eric W. Biderman <ebiderman@xmission.com>
Tested-by: Eric W. Biderman <ebiderman@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-18 11:49:35 -08:00
Eric Dumazet
214f45c91b net: provide default_advmss() methods to blackhole dst_ops
Commit 0dbaee3b37 (net: Abstract default ADVMSS behind an
accessor.) introduced a possible crash in tcp_connect_init(), when
dst->default_advmss() is called from dst_metric_advmss()

Reported-by: George Spelvin <linux@horizon.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-18 11:39:01 -08:00
Joerg Marx
0af320fb46 netfilter: ip6t_LOG: fix a flaw in printing the MAC
The flaw was in skipping the second byte in MAC header due to increasing
the pointer AND indexed access starting at '1'.

Signed-off-by: Joerg Marx <joerg.marx@secunet.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-17 16:23:40 +01:00
Florian Westphal
d503b30bd6 netfilter: tproxy: do not assign timewait sockets to skb->sk
Assigning a socket in timewait state to skb->sk can trigger
kernel oops, e.g. in nfnetlink_log, which does:

if (skb->sk) {
        read_lock_bh(&skb->sk->sk_callback_lock);
        if (skb->sk->sk_socket && skb->sk->sk_socket->file) ...

in the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
is invalid.

Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
or xt_TPROXY must not assign a timewait socket to skb->sk.

This does the latter.

If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment,
thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.

The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
listener socket.

Cc: Balazs Scheidler <bazsi@balabit.hu>
Cc: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-17 11:32:38 +01:00
Vladislav P
840af824b2 Bluetooth: Release BTM while sleeping to avoid deadlock
Signed-off-by: Vladislav P <vladisslav@inbox.ru>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-16 15:54:11 -03:00
Ian Campbell
d11327ad66 arp_notify: unconditionally send gratuitous ARP for NETDEV_NOTIFY_PEERS.
NETDEV_NOTIFY_PEER is an explicit request by the driver to send a link
notification while NETDEV_UP/NETDEV_CHANGEADDR generate link
notifications as a sort of side effect.

In the later cases the sysctl option is present because link
notification events can have undesired effects e.g. if the link is
flapping. I don't think this applies in the case of an explicit
request from a driver.

This patch makes NETDEV_NOTIFY_PEER unconditional, if preferred we
could add a new sysctl for this case which defaults to on.

This change causes Xen post-migration ARP notifications (which cause
switches to relearn their MAC tables etc) to be sent by default.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-14 17:47:15 -08:00
David S. Miller
8bc26a008f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-02-14 12:51:42 -08:00
David S. Miller
af756e9d88 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-02-14 11:16:12 -08:00
Patrick McHardy
de9963f0f2 netfilter: nf_iterate: fix incorrect RCU usage
As noticed by Eric, nf_iterate doesn't use RCU correctly by
accessing the prev pointer of a RCU protected list element when
a verdict of NF_REPEAT is issued.

Fix by jumping backwards to the hook invocation directly instead
of loading the previous list element before continuing the list
iteration.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-14 17:35:07 +01:00
Jesper Juhl
d3337de52a Don't potentially dereference NULL in net/dcb/dcbnl.c:dcbnl_getapp()
nla_nest_start() may return NULL. If it does then we'll blow up in
nla_nest_end() when we dereference the pointer.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13 11:21:14 -08:00
John Fastabend
7ec79270d7 net: dcb: application priority is per net_device
The app_data priority may not be the same for all net devices.
In order for stacks with application notifiers to identify the
specific net device dcb_app_type should be passed in the ptr.

This allows handlers to use dev_get_by_name() to pin priority
to net devices.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13 11:02:39 -08:00
Herbert Xu
8a870178c0 bridge: Replace mp->mglist hlist with a bool
As it turns out we never need to walk through the list of multicast
groups subscribed by the bridge interface itself (the only time we'd
want to do that is when we shut down the bridge, in which case we
simply walk through all multicast groups), we don't really need to
keep an hlist for mp->mglist.

This means that we can replace it with just a single bit to indicate
whether the bridge interface is subscribed to a group.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-12 01:05:42 -08:00
Herbert Xu
24f9cdcbd7 bridge: Fix timer typo that may render snooping less effective
In a couple of spots where we are supposed to modify the port
group timer (p->timer) we instead modify the bridge interface
group timer (mp->timer).

The effect of this is mostly harmless.  However, it can cause
port subscriptions to be longer than they should be, thus making
snooping less effective.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-11 21:59:37 -08:00
Herbert Xu
6b0d6a9b42 bridge: Fix mglist corruption that leads to memory corruption
The list mp->mglist is used to indicate whether a multicast group
is active on the bridge interface itself as opposed to one of the
constituent interfaces in the bridge.

Unfortunately the operation that adds the mp->mglist node to the
list neglected to check whether it has already been added.  This
leads to list corruption in the form of nodes pointing to itself.

Normally this would be quite obvious as it would cause an infinite
loop when walking the list.  However, as this list is never actually
walked (which means that we don't really need it, I'll get rid of
it in a subsequent patch), this instead is hidden until we perform
a delete operation on the affected nodes.

As the same node may now be pointed to by more than one node, the
delete operations can then cause modification of freed memory.

This was observed in practice to cause corruption in 512-byte slabs,
most commonly leading to crashes in jbd2.

Thanks to Josef Bacik for pointing me in the right direction.

Reported-by: Ian Page Hands <ihands@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-11 21:59:37 -08:00
Steffen Klassert
946bf5ee3c ip_gre: Add IPPROTO_GRE to flowi in ipgre_tunnel_xmit
Commit 5811662b15 ("net: use the macros
defined for the members of flowi") accidentally removed the setting of
IPPROTO_GRE from the struct flowi in ipgre_tunnel_xmit. This patch
restores it.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-11 11:23:12 -08:00
Hiroaki SHIMODA
0b15093219 xfrm: avoid possible oopse in xfrm_alloc_dst
Commit 80c802f307 (xfrm: cache bundles instead of policies for
outgoing flows) introduced possible oopse when dst_alloc returns NULL.

Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-10 23:08:33 -08:00
Linus Torvalds
e128c5e26b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits)
  virtio_net: Add schedule check to napi_enable call
  x25: Do not reference freed memory.
  pch_can: fix tseg1/tseg2 setting issue
  isdn: hysdn: Kill (partially buggy) CVS regision log reporting.
  can: softing_cs needs slab.h
  pch_gbe: Fix the issue which a driver locks when rx offload is set by ethtool
  netfilter: nf_conntrack: set conntrack templates again if we return NF_REPEAT
  pch_can: fix module reload issue with MSI
  pch_can: fix rmmod issue
  pch_can: fix 800k comms issue
  net: Fix lockdep regression caused by initializing netdev queues too early.
  net/caif: Fix dangling list pointer in freed object on error.
  USB CDC NCM errata updates for cdc_ncm host driver
  CDC NCM errata updates for cdc.h
  ixgbe: update version string
  ixgbe: cleanup variable initialization
  ixgbe: limit VF access to network traffic
  ixgbe: fix for 82599 erratum on Header Splitting
  ixgbe: fix variable set but not used warnings by gcc 4.6
  e1000: add support for Marvell Alaska M88E1118R PHY
  ...
2011-02-10 12:05:09 -08:00
David S. Miller
96642d42f0 x25: Do not reference freed memory.
In x25_link_free(), we destroy 'nb' before dereferencing
'nb->dev'.  Don't do this, because 'nb' might be freed
by then.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-09 22:36:13 -08:00
David S. Miller
ae0935776c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-02-09 12:40:21 -08:00
Eliad Peller
a7b545f7fe mac80211: add missing locking in ieee80211_reconfig
When suspending an associated system, and then resuming,
the station vif is being reconfigured without taking the
sdata->u.mgd.mtx lock, which results in the following warning:

WARNING: at net/mac80211/mlme.c:101 ieee80211_ap_probereq_get+0x58/0xb8 [mac80211]()
Modules linked in: wl12xx_sdio wl12xx firmware_class crc7 mac80211 cfg80211 [last unloaded: crc7]
Backtrace:
[<c005432c>] (dump_backtrace+0x0/0x118) from [<c0376e28>] (dump_stack+0x20/0x24)
 r7:00000000 r6:bf12d6ec r5:bf154aac r4:00000065
[<c0376e08>] (dump_stack+0x0/0x24) from [<c0079104>] (warn_slowpath_common+0x5c/0x74)
[<c00790a8>] (warn_slowpath_common+0x0/0x74) from [<c0079148>] (warn_slowpath_null+0x2c/0x34)
 r9:000024ff r8:cd006460 r7:00000001 r6:00000000 r5:00000000
r4:cf1394a0
[<c007911c>] (warn_slowpath_null+0x0/0x34) from [<bf12d6ec>] (ieee80211_ap_probereq_get+0x58/0xb8 [mac80211])
[<bf12d694>] (ieee80211_ap_probereq_get+0x0/0xb8 [mac80211]) from [<bf19cd04>] (wl1271_cmd_build_ap_probe_req+0x30/0xf8 [wl12xx])
 r4:cd007440
[<bf19ccd4>] (wl1271_cmd_build_ap_probe_req+0x0/0xf8 [wl12xx]) from [<bf1995f4>] (wl1271_op_bss_info_changed+0x4c4/0x808 [wl12xx])
 r5:cd007440 r4:000003b4
[<bf199130>] (wl1271_op_bss_info_changed+0x0/0x808 [wl12xx]) from [<bf122168>] (ieee80211_bss_info_change_notify+0x1a4/0x1f8 [mac80211])
[<bf121fc4>] (ieee80211_bss_info_change_notify+0x0/0x1f8 [mac80211]) from [<bf141e80>] (ieee80211_reconfig+0x4d0/0x668 [mac80211])
 r8:cf0eeea4 r7:cd00671c r6:00000000 r5:cd006460 r4:cf1394a0
[<bf1419b0>] (ieee80211_reconfig+0x0/0x668 [mac80211]) from [<bf137dd4>] (ieee80211_resume+0x60/0x70 [mac80211])
[<bf137d74>] (ieee80211_resume+0x0/0x70 [mac80211]) from [<bf0eb930>] (wiphy_resume+0x6c/0x7c [cfg80211])
 r5:cd006248 r4:cd006110
[<bf0eb8c4>] (wiphy_resume+0x0/0x7c [cfg80211]) from [<c0241024>] (legacy_resume+0x38/0x70)
 r7:00000000 r6:00000000 r5:cd006248 r4:cd0062fc
[<c0240fec>] (legacy_resume+0x0/0x70) from [<c0241478>] (device_resume+0x168/0x1a0)
 r8:c04ca8d8 r7:cd00627c r6:00000010 r5:cd006248 r4:cd0062fc
[<c0241310>] (device_resume+0x0/0x1a0) from [<c0241600>] (dpm_resume_end+0xf8/0x3bc)
 r7:00000000 r6:00000005 r5:cd006248 r4:cd0062fc
[<c0241508>] (dpm_resume_end+0x0/0x3bc) from [<c00b2a24>] (suspend_devices_and_enter+0x1b0/0x204)
[<c00b2874>] (suspend_devices_and_enter+0x0/0x204) from [<c00b2b68>] (enter_state+0xf0/0x148)
 r7:c037e978 r6:00000003 r5:c043d807 r4:00000000
[<c00b2a78>] (enter_state+0x0/0x148) from [<c00b20a4>] (state_store+0xa4/0xcc)
 r7:c037e978 r6:00000003 r5:00000003 r4:c043d807
[<c00b2000>] (state_store+0x0/0xcc) from [<c01fc90c>] (kobj_attr_store+0x20/0x24)
[<c01fc8ec>] (kobj_attr_store+0x0/0x24) from [<c0157120>] (sysfs_write_file+0x11c/0x150)
[<c0157004>] (sysfs_write_file+0x0/0x150) from [<c0100f84>] (vfs_write+0xc0/0x14c)
[<c0100ec4>] (vfs_write+0x0/0x14c) from [<c01010e4>] (sys_write+0x4c/0x78)
 r8:40126000 r7:00000004 r6:cf1a7c80 r5:00000000 r4:00000000
[<c0101098>] (sys_write+0x0/0x78) from [<c00500c0>] (ret_fast_syscall+0x0/0x30)
 r8:c00502c8 r7:00000004 r6:403525e8 r5:40126000 r4:00000004

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-09 15:35:13 -05:00
John W. Linville
5dc0fa782a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/padovan/bluetooth-2.6 2011-02-09 15:30:42 -05:00
Pablo Neira Ayuso
c317428644 netfilter: nf_conntrack: set conntrack templates again if we return NF_REPEAT
The TCP tracking code has a special case that allows to return
NF_REPEAT if we receive a new SYN packet while in TIME_WAIT state.

In this situation, the TCP tracking code destroys the existing
conntrack to start a new clean session.

[DESTROY] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=38925 dport=8000 src=192.168.1.2 dst=192.168.1.100 sport=8000 dport=38925 [ASSURED]
    [NEW] tcp      6 120 SYN_SENT src=192.168.0.2 dst=192.168.1.2 sport=38925 dport=8000 [UNREPLIED] src=192.168.1.2 dst=192.168.1.100 sport=8000 dport=38925

However, this is a problem for the iptables' CT target event filtering
which will not work in this case since the conntrack template will not
be there for the new session. To fix this, we reassign the conntrack
template to the packet if we return NF_REPEAT.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-09 08:08:20 +01:00
David S. Miller
8d3bdbd55a net: Fix lockdep regression caused by initializing netdev queues too early.
In commit aa94210411 ("net: init ingress
queue") we moved the allocation and lock initialization of the queues
into alloc_netdev_mq() since register_netdevice() is way too late.

The problem is that dev->type is not setup until the setup()
callback is invoked by alloc_netdev_mq(), and the dev->type is
what determines the lockdep class to use for the locks in the
queues.

Fix this by doing the queue allocation after the setup() callback
runs.

This is safe because the setup() callback is not allowed to make any
state changes that need to be undone on error (memory allocations,
etc.).  It may, however, make state changes that are undone by
free_netdev() (such as netif_napi_add(), which is done by the
ipoib driver's setup routine).

The previous code also leaked a reference to the &init_net namespace
object on RX/TX queue allocation failures.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 15:02:50 -08:00
David S. Miller
b2df5a8446 net/caif: Fix dangling list pointer in freed object on error.
rtnl_link_ops->setup(), and the "setup" callback passed to alloc_netdev*(),
cannot make state changes which need to be undone on failure.  There is
no cleanup mechanism available at this point.

So we have to add the caif private instance to the global list once we
are sure that register_netdev() has succedded in ->newlink().

Otherwise, if register_netdev() fails, the caller will invoke free_netdev()
and we will have a reference to freed up memory on the chnl_net_list.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-08 14:31:31 -08:00
David S. Miller
e0985f27dd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-02-08 12:03:54 -08:00
David S. Miller
429a01a70f Merge branch 'batman-adv/merge' of git://git.open-mesh.org/ecsv/linux-merge 2011-02-07 19:54:14 -08:00
Sven Eckelmann
531c9da8c8 batman-adv: Linearize fragment packets before merge
We access the data inside the skbs of two fragments directly using memmove
during the merge. The data of the skb could span over multiple skb pages. An
direct access without knowledge about the pages would lead to an invalid memory
access.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
[lindner_marek@yahoo.de: Move return from function to the end]
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2011-02-08 00:54:31 +01:00
andrew hendry
95c3043008 x25: possible skb leak on bad facilities
Originally x25_parse_facilities returned
-1 for an error
 0 meaning 0 length facilities
>0 the length of the facilities parsed.

5ef41308f9 ("x25: Prevent crashing when parsing bad X.25 facilities") introduced more
error checking in x25_parse_facilities however used 0 to indicate bad parsing
a6331d6f9a ("memory corruption in X.25 facilities parsing") followed this further for
DTE facilities, again using 0 for bad parsing.

The meaning of 0 got confused in the callers.
If the facilities are messed up we can't determine where the data starts.
So patch makes all parsing errors return -1 and ensures callers close and don't use the skb further.

Reported-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-07 13:41:38 -08:00
Felix Fietkau
fc7c976dc7 mac80211: fix the skb cloned check in the tx path
Using skb_header_cloned to check if it's safe to write to the skb is not
enough - mac80211 also touches the tailroom of the skb.
Initially this check was only used to increase a counter, however this
commit changed the code to also skip skb data reallocation if no extra
head/tailroom was needed:

commit 4cd06a344d
mac80211: skip unnecessary pskb_expand_head calls

It added a regression at least with iwl3945, which is fixed by this patch.

Reported-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Tested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-07 16:02:14 -05:00
Linus Torvalds
44f2c5c841 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (68 commits)
  net: can: janz-ican3: world-writable sysfs termination file
  net: can: at91_can: world-writable sysfs files
  MAINTAINERS: update email ids of the be2net driver maintainers.
  bridge: Don't put partly initialized fdb into hash
  r8169: prevent RxFIFO induced loops in the irq handler.
  r8169: RxFIFO overflow oddities with 8168 chipsets.
  r8169: use RxFIFO overflow workaround for 8168c chipset.
  include/net/genetlink.h: Allow genlmsg_cancel to accept a NULL argument
  net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.
  net: Support compat SIOCGETVIFCNT ioctl in ipv4.
  net: Fix bug in compat SIOCGETSGCNT handling.
  niu: Fix races between up/down and get_stats.
  tcp_ecn is an integer not a boolean
  atl1c: Add missing PCI device ID
  s390: Fix possibly wrong size in strncmp (smsgiucv)
  s390: Fix wrong size in memcmp (netiucv)
  qeth: allow OSA CHPARM change in suspend state
  qeth: allow HiperSockets framesize change in suspend
  qeth: add more strict MTU checking
  qeth: show new mac-address if its setting fails
  ...
2011-02-04 13:20:01 -08:00
Pavel Emelyanov
1158f762e5 bridge: Don't put partly initialized fdb into hash
The fdb_create() puts a new fdb into hash with only addr set. This is
not good, since there are callers, that search the hash w/o the lock
and access all the other its fields.

Applies to current netdev tree.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-04 13:02:36 -08:00
David S. Miller
e2d57766e6 net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-03 18:05:29 -08:00
David S. Miller
ca6b8bb097 net: Support compat SIOCGETVIFCNT ioctl in ipv4.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-03 17:24:28 -08:00
David S. Miller
0033d5ad27 net: Fix bug in compat SIOCGETSGCNT handling.
Commit 709b46e8d9 ("net: Add compat
ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT") added the
correct plumbing to handle SIOCGETSGCNT properly.

However, whilst definiting a proper "struct compat_sioc_sg_req" it
isn't actually used in ipmr_compat_ioctl().

Correct this oversight.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-03 17:21:31 -08:00
David S. Miller
0bc0be7f20 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2011-02-02 15:52:23 -08:00
Andy Gospodarek
6d152e23ad gro: reset skb_iif on reuse
Like Herbert's change from a few days ago:

66c46d741e gro: Reset dev pointer on reuse

this may not be necessary at this point, but we should still clean up
the skb->skb_iif.  If not we may end up with an invalid valid for
skb->skb_iif when the skb is reused and the check is done in
__netif_receive_skb.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-02 14:53:25 -08:00
Johannes Berg
4334ec8518 mac80211: fix TX status cookie in HW offload case
When the off-channel TX is done with remain-on-channel
offloaded to hardware, the reported cookie is wrong as
in that case we shouldn't use the SKB as the cookie but
need to instead use the corresponding r-o-c cookie
(XOR'ed with 2 to prevent API mismatches).

Fix this by keeping track of the hw_roc_skb pointer
just for the status processing and use the correct
cookie to report in this case. We can't use the
hw_roc_skb pointer itself because it is NULL'ed when
the frame is transmitted to prevent it being used
twice.

This fixes a bug where the P2P state machine in the
supplicant gets stuck because it never gets a correct
result for its transmitted frame.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-02-02 16:38:59 -05:00
Bao Liang
e733fb6208 Bluetooth: Set conn state to BT_DISCONN to avoid multiple responses
This patch fixes a minor issue that two connection responses will be sent
for one L2CAP connection request. If the L2CAP connection request is first
blocked due to security reason and responded with reason "security block",
the state of the connection remains BT_CONNECT2. If a pairing procedure
completes successfully before the ACL connection is down, local host will
send another connection complete response. See the following packets
captured by hcidump.

2010-12-07 22:21:24.928096 < ACL data: handle 12 flags 0x00 dlen 16
    0000: 0c 00 01 00 03 19 08 00  41 00 53 00 03 00 00 00  ........A.S.....
... ...

2010-12-07 22:21:35.791747 > HCI Event: Auth Complete (0x06) plen 3
    status 0x00 handle 12
... ...

2010-12-07 22:21:35.872372 > ACL data: handle 12 flags 0x02 dlen 16
    L2CAP(s): Connect rsp: dcid 0x0054 scid 0x0040 result 0 status 0
      Connection successful

Signed-off-by: Liang Bao <tim.bao@gmail.com>
Acked-by: Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-02-02 12:47:59 -02:00
Pablo Neira Ayuso
3db7e93d33 netfilter: ecache: always set events bits, filter them later
For the following rule:

iptables -I PREROUTING -t raw -j CT --ctevents assured

The event delivered looks like the following:

 [UPDATE] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]

Note that the TCP protocol state is not included. For that reason
the CT event filtering is not very useful for conntrackd.

To resolve this issue, instead of conditionally setting the CT events
bits based on the ctmask, we always set them and perform the filtering
in the late stage, just before the delivery.

Thus, the event delivered looks like the following:

 [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 16:06:30 +01:00
Pablo Neira Ayuso
9d0db8b6b1 netfilter: arpt_mangle: fix return values of checkentry
In 135367b "netfilter: xtables: change xt_target.checkentry return type",
the type returned by checkentry was changed from boolean to int, but the
return values where not adjusted.

arptables: Input/output error

This broke arptables with the mangle target since it returns true
under success, which is interpreted by xtables as >0, thus
returning EIO.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-01 16:03:46 +01:00
Tejun Heo
c534a107e8 rds/ib: use system_wq instead of rds_ib_fmr_wq
With cmwq, there's no reason to use dedicated rds_ib_fmr_wq - it's not
in the memory reclaim path and the maximum number of concurrent work
items is bound by the number of devices.  Drop it and use system_wq
instead.  This rds_ib_fmr_init/exit() noops.  Both removed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Andy Grover <andy.grover@oracle.com>
2011-02-01 11:42:43 +01:00
Tejun Heo
aa70c585b1 net/9p: replace p9_poll_task with a work
Now that cmwq can handle high concurrency, it's more efficient to use
work than a dedicated kthread.  Convert p9_poll_proc() to a work
function for p9_poll_work and make p9_pollwake() schedule it on each
poll event.  The work is sync flushed on module exit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: v9fs-developer@lists.sourceforge.net
2011-02-01 11:42:43 +01:00
Tejun Heo
61edeeed91 net/9p: use system_wq instead of p9_mux_wq
With cmwq, there's no reason to use a dedicated workqueue in trans_fd.
Drop p9_mux_wq and use system_wq instead.  The used work items are
already sync canceled in p9_conn_destroy() and doesn't require further
synchronization.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: v9fs-developer@lists.sourceforge.net
2011-02-01 11:42:43 +01:00
Eric W. Biederman
bf36076a67 net: Fix ipv6 neighbour unregister_sysctl_table warning
In my testing of 2.6.37 I was occassionally getting a warning about
sysctl table entries being unregistered in the wrong order.  Digging
in it turns out this dates back to the last great sysctl reorg done
where Al Viro introduced the requirement that sysctl directories
needed to be created before and destroyed after the files in them.

It turns out that in that great reorg /proc/sys/net/ipv6/neigh was
overlooked.  So this patch fixes that oversight and makes an annoying
warning message go away.

>------------[ cut here ]------------
>WARNING: at kernel/sysctl.c:1992 unregister_sysctl_table+0x134/0x164()
>Pid: 23951, comm: kworker/u:3 Not tainted 2.6.37-350888.2010AroraKernelBeta.fc14.x86_64 #1
>Call Trace:
> [<ffffffff8103e034>] warn_slowpath_common+0x80/0x98
> [<ffffffff8103e061>] warn_slowpath_null+0x15/0x17
> [<ffffffff810452f8>] unregister_sysctl_table+0x134/0x164
> [<ffffffff810e7834>] ? kfree+0xc4/0xd1
> [<ffffffff813439b2>] neigh_sysctl_unregister+0x22/0x3a
> [<ffffffffa02cd14e>] addrconf_ifdown+0x33f/0x37b [ipv6]
> [<ffffffff81331ec2>] ? skb_dequeue+0x5f/0x6b
> [<ffffffffa02ce4a5>] addrconf_notify+0x69b/0x75c [ipv6]
> [<ffffffffa02eb953>] ? ip6mr_device_event+0x98/0xa9 [ipv6]
> [<ffffffff813d2413>] notifier_call_chain+0x32/0x5e
> [<ffffffff8105bdea>] raw_notifier_call_chain+0xf/0x11
> [<ffffffff8133cdac>] call_netdevice_notifiers+0x45/0x4a
> [<ffffffff8133d2b0>] rollback_registered_many+0x118/0x201
> [<ffffffff8133d3af>] unregister_netdevice_many+0x16/0x6d
> [<ffffffff8133d571>] default_device_exit_batch+0xa4/0xb8
> [<ffffffff81337c42>] ? cleanup_net+0x0/0x194
> [<ffffffff81337a2a>] ops_exit_list+0x4e/0x56
> [<ffffffff81337d36>] cleanup_net+0xf4/0x194
> [<ffffffff81053318>] process_one_work+0x187/0x280
> [<ffffffff8105441b>] worker_thread+0xff/0x19f
> [<ffffffff8105431c>] ? worker_thread+0x0/0x19f
> [<ffffffff8105776d>] kthread+0x7d/0x85
> [<ffffffff81003824>] kernel_thread_helper+0x4/0x10
> [<ffffffff810576f0>] ? kthread+0x0/0x85
> [<ffffffff81003820>] ? kernel_thread_helper+0x0/0x10
>---[ end trace 8a7e9310b35e9486 ]---

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31 20:54:17 -08:00
Tom Herbert
8587523640 net: Check rps_flow_table when RPS map length is 1
In get_rps_cpu, add check that the rps_flow_table for the device is
NULL when trying to take fast path when RPS map length is one.
Without this, RFS is effectively disabled if map length is one which
is not correct.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31 16:23:42 -08:00
Linus Torvalds
0fd08c5545 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: NFSv4 readdir loses entries
  NFS: Micro-optimize nfs4_decode_dirent()
  NFS: Fix an NFS client lockdep issue
  NFS construct consistent co_ownerid for v4.1
  NFS: nfs_wcc_update_inode() should set nfsi->attr_gencount
  NFS improve pnfs_put_deviceid_cache debug print
  NFS fix cb_sequence error processing
  NFS do not find client in NFSv4 pg_authenticate
  NLM: Fix "kernel BUG at fs/lockd/host.c:417!" or ".../host.c:283!"
  NFS: Prevent memory allocation failure in nfsacl_encode()
  NFS: nfsacl_{encode,decode} should return signed integer
  NFS: Fix "kernel BUG at fs/nfs/nfs3xdr.c:1338!"
  NFS: Fix "kernel BUG at fs/aio.c:554!"
  NFS4: Avoid potential NULL pointer dereference in decode_and_add_ds().
  NFS: fix handling of malloc failure during nfs_flush_multi()
2011-02-01 09:41:02 +10:00
Roland Dreier
ec831ea72e net: Add default_mtu() methods to blackhole dst_ops
When an IPSEC SA is still being set up, __xfrm_lookup() will return
-EREMOTE and so ip_route_output_flow() will return a blackhole route.
This can happen in a sndmsg call, and after d33e455337 ("net: Abstract
default MTU metric calculation behind an accessor.") this leads to a
crash in ip_append_data() because the blackhole dst_ops have no
default_mtu() method and so dst_mtu() calls a NULL pointer.

Fix this by adding default_mtu() methods (that simply return 0, matching
the old behavior) to the blackhole dst_ops.

The IPv4 part of this patch fixes a crash that I saw when using an IPSEC
VPN; the IPv6 part is untested because I don't have an IPv6 VPN, but it
looks to be needed as well.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-31 13:16:00 -08:00
David S. Miller
81c2bdb688 Merge branch 'batman-adv/merge-oopsonly' of git://git.open-mesh.org/ecsv/linux-merge 2011-01-30 22:16:34 -08:00
Sven Eckelmann
1181e1daac batman-adv: Make vis info stack traversal threadsafe
The batman-adv vis server has to a stack which stores all information
about packets which should be send later. This stack is protected
with a spinlock that is used to prevent concurrent write access to it.

The send_vis_packets function has to take all elements from the stack
and send them to other hosts over the primary interface. The send will
be initiated without the lock which protects the stack.

The implementation using list_for_each_entry_safe has the problem that
it stores the next element as "safe ptr" to allow the deletion of the
current element in the list. The list may be modified during the
unlock/lock pair in the loop body which may make the safe pointer
not pointing to correct next element.

It is safer to remove and use the first element from the stack until no
elements are available. This does not need reduntant information which
would have to be validated each time the lock was removed.

Reported-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-01-30 10:32:08 +01:00
Sven Eckelmann
dda9fc6b2c batman-adv: Remove vis info element in free_info
The free_info function will be called when no reference to the info
object exists anymore. It must be ensured that the allocated memory
gets freed and not only the elements which are managed by the info
object.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-01-30 10:32:06 +01:00
Sven Eckelmann
2674c15870 batman-adv: Remove vis info on hashing errors
A newly created vis info object must be removed when it couldn't be
added to the hash. The old_info which has to be replaced was already
removed and isn't related to the hash anymore.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-01-30 10:32:02 +01:00
Eric W. Biederman
709b46e8d9 net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT
SIOCGETSGCNT is not a unique ioctl value as it it maps tio SIOCPROTOPRIVATE +1,
which unfortunately means the existing infrastructure for compat networking
ioctls is insufficient.  A trivial compact ioctl implementation would conflict
with:

SIOCAX25ADDUID
SIOCAIPXPRISLT
SIOCGETSGCNT_IN6
SIOCGETSGCNT
SIOCRSSCAUSE
SIOCX25SSUBSCRIP
SIOCX25SDTEFACILITIES

To make this work I have updated the compat_ioctl decode path to mirror the
the normal ioctl decode path.  I have added an ipv4 inet_compat_ioctl function
so that I can have ipv4 specific compat ioctls.   I have added a compat_ioctl
function into struct proto so I can break out ioctls by which kind of ip socket
I am using.  I have added a compat_raw_ioctl function because SIOCGETSGCNT only
works on raw sockets.  I have added a ipmr_compat_ioctl that mirrors the normal
ipmr_ioctl.

This was necessary because unfortunately the struct layout for the SIOCGETSGCNT
has unsigned longs in it so changes between 32bit and 64bit kernels.

This change was sufficient to run a 32bit ip multicast routing daemon on a
64bit kernel.

Reported-by: Bill Fenner <fenner@aristanetworks.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-30 01:14:38 -08:00
Eric W. Biederman
13ad17745c net: Fix ip link add netns oops
Ed Swierk <eswierk@bigswitch.com> writes:
> On 2.6.35.7
>  ip link add link eth0 netns 9999 type macvlan
> where 9999 is a nonexistent PID triggers an oops and causes all network functions to hang:
> [10663.821898] BUG: unable to handle kernel NULL pointer dereference at 000000000000006d
>  [10663.821917] IP: [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
>  [10663.821933] PGD 1d3927067 PUD 22f5c5067 PMD 0
>  [10663.821944] Oops: 0000 [#1] SMP
>  [10663.821953] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
>  [10663.821959] CPU 3
>  [10663.821963] Modules linked in: macvlan ip6table_filter ip6_tables rfcomm ipt_MASQUERADE binfmt_misc iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sco ipt_REJECT bnep l2cap xt_tcpudp iptable_filter ip_tables x_tables bridge stp vboxnetadp vboxnetflt vboxdrv kvm_intel kvm parport_pc ppdev snd_hda_codec_intelhdmi snd_hda_codec_conexant arc4 iwlagn iwlcore mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi i915 snd_seq_midi_event snd_seq thinkpad_acpi drm_kms_helper btusb tpm_tis nvram uvcvideo snd_timer snd_seq_device bluetooth videodev v4l1_compat v4l2_compat_ioctl32 tpm drm tpm_bios snd cfg80211 psmouse serio_raw intel_ips soundcore snd_page_alloc intel_agp i2c_algo_bit video output netconsole configfs lp parport usbhid hid e1000e sdhci_pci ahci libahci sdhci led_class
>  [10663.822155]
>  [10663.822161] Pid: 6000, comm: ip Not tainted 2.6.35-23-generic #41-Ubuntu 2901CTO/2901CTO
>  [10663.822167] RIP: 0010:[<ffffffff8149c2fa>] [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
>  [10663.822177] RSP: 0018:ffff88014aebf7b8 EFLAGS: 00010286
>  [10663.822182] RAX: 00000000fffffff4 RBX: ffff8801ad900800 RCX: 0000000000000000
>  [10663.822187] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffff88014ad63000
>  [10663.822191] RBP: ffff88014aebf808 R08: 0000000000000041 R09: 0000000000000041
>  [10663.822196] R10: 0000000000000000 R11: dead000000200200 R12: ffff88014aebf818
>  [10663.822201] R13: fffffffffffffffd R14: ffff88014aebf918 R15: ffff88014ad62000
>  [10663.822207] FS: 00007f00c487f700(0000) GS:ffff880001f80000(0000) knlGS:0000000000000000
>  [10663.822212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  [10663.822216] CR2: 000000000000006d CR3: 0000000231f19000 CR4: 00000000000026e0
>  [10663.822221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  [10663.822226] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>  [10663.822231] Process ip (pid: 6000, threadinfo ffff88014aebe000, task ffff88014afb16e0)
>  [10663.822236] Stack:
>  [10663.822240] ffff88014aebf808 ffffffff814a2bb5 ffff88014aebf7e8 00000000a00ee8d6
>  [10663.822251] <0> 0000000000000000 ffffffffa00ef940 ffff8801ad900800 ffff88014aebf818
>  [10663.822265] <0> ffff88014aebf918 ffff8801ad900800 ffff88014aebf858 ffffffff8149c413
>  [10663.822281] Call Trace:
>  [10663.822290] [<ffffffff814a2bb5>] ? dev_addr_init+0x75/0xb0
>  [10663.822298] [<ffffffff8149c413>] dev_alloc_name+0x43/0x90
>  [10663.822307] [<ffffffff814a85ee>] rtnl_create_link+0xbe/0x1b0
>  [10663.822314] [<ffffffff814ab2aa>] rtnl_newlink+0x48a/0x570
>  [10663.822321] [<ffffffff814aafcc>] ? rtnl_newlink+0x1ac/0x570
>  [10663.822332] [<ffffffff81030064>] ? native_x2apic_icr_read+0x4/0x20
>  [10663.822339] [<ffffffff814a8c17>] rtnetlink_rcv_msg+0x177/0x290
>  [10663.822346] [<ffffffff814a8aa0>] ? rtnetlink_rcv_msg+0x0/0x290
>  [10663.822354] [<ffffffff814c25d9>] netlink_rcv_skb+0xa9/0xd0
>  [10663.822360] [<ffffffff814a8a85>] rtnetlink_rcv+0x25/0x40
>  [10663.822367] [<ffffffff814c223e>] netlink_unicast+0x2de/0x2f0
>  [10663.822374] [<ffffffff814c303e>] netlink_sendmsg+0x1fe/0x2e0
>  [10663.822383] [<ffffffff81488533>] sock_sendmsg+0xf3/0x120
>  [10663.822391] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
>  [10663.822400] [<ffffffff81168656>] ? __d_lookup+0x136/0x150
>  [10663.822406] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
>  [10663.822414] [<ffffffff812b7a0d>] ? _atomic_dec_and_lock+0x4d/0x80
>  [10663.822422] [<ffffffff8116ea90>] ? mntput_no_expire+0x30/0x110
>  [10663.822429] [<ffffffff81486ff5>] ? move_addr_to_kernel+0x65/0x70
>  [10663.822435] [<ffffffff81493308>] ? verify_iovec+0x88/0xe0
>  [10663.822442] [<ffffffff81489020>] sys_sendmsg+0x240/0x3a0
> [10663.822450] [<ffffffff8111e2a9>] ? __do_fault+0x479/0x560
>  [10663.822457] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
>  [10663.822465] [<ffffffff8116cf4a>] ? alloc_fd+0x10a/0x150
>  [10663.822473] [<ffffffff8158d76e>] ? do_page_fault+0x15e/0x350
>  [10663.822482] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
>  [10663.822487] Code: 90 48 8d 78 02 be 25 00 00 00 e8 92 1d e2 ff 48 85 c0 75 cf bf 20 00 00 00 e8 c3 b1 c6 ff 49 89 c7 b8 f4 ff ff ff 4d 85 ff 74 bd <4d> 8b 75 70 49 8d 45 70 48 89 45 b8 49 83 ee 58 eb 28 48 8d 55
>  [10663.822618] RIP [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
>  [10663.822627] RSP <ffff88014aebf7b8>
>  [10663.822631] CR2: 000000000000006d
>  [10663.822636] ---[ end trace 3dfd6c3ad5327ca7 ]---

This bug was introduced in:
commit 81adee47df
Author: Eric W. Biederman <ebiederm@aristanetworks.com>
Date:   Sun Nov 8 00:53:51 2009 -0800

    net: Support specifying the network namespace upon device creation.

    There is no good reason to not support userspace specifying the
    network namespace during device creation, and it makes it easier
    to create a network device and pass it to a child network namespace
    with a well known name.

    We have to be careful to ensure that the target network namespace
    for the new device exists through the life of the call.  To keep
    that logic clear I have factored out the network namespace grabbing
    logic into rtnl_link_get_net.

    In addtion we need to continue to pass the source network namespace
    to the rtnl_link_ops.newlink method so that we can find the base
    device source network namespace.

    Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
    Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Where apparently I forgot to add error handling to the path where we create
a new network device in a new network namespace, and pass in an invalid pid.

Cc: stable@kernel.org
Reported-by: Ed Swierk <eswierk@bigswitch.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-30 01:14:15 -08:00
Herbert Xu
66c46d741e gro: Reset dev pointer on reuse
On older kernels the VLAN code may zero skb->dev before dropping
it and causing it to be reused by GRO.

Unfortunately we didn't reset skb->dev in that case which causes
the next GRO user to get a bogus skb->dev pointer.

This particular problem no longer happens with the current upstream
kernel due to changes in VLAN processing.

However, for correctness we should still reset the skb->dev pointer
in the GRO reuse function in case a future user does the same thing.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-29 22:36:24 -08:00
David S. Miller
8f2771f2b8 ipv6: Remove route peer binding assertions.
They are bogus.  The basic idea is that I wanted to make sure
that prefixed routes never bind to peers.

The test I used was whether RTF_CACHE was set.

But first of all, the RTF_CACHE flag is set at different spots
depending upon which ip6_rt_copy() caller you're talking about.

I've validated all of the code paths, and even in the future
where we bind peers more aggressively (for route metric COW'ing)
we never bind to prefix'd routes, only fully specified ones.
This even applies when addrconf or icmp6 routes are allocated.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27 14:55:22 -08:00
Eric Dumazet
c2aa3665cf net: add kmemcheck annotation in __alloc_skb()
pskb_expand_head() triggers a kmemcheck warning when copy of
skb_shared_info is done in pskb_expand_head()

This is because destructor_arg field is not necessarily initialized at
this point. Add kmemcheck_annotate_variable() call in __alloc_skb() to
instruct kmemcheck this is a normal situation.

Resolves bugzilla.kernel.org 27212

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=27212
Reported-by: Christian Casteyde <casteyde.christian@free.fr>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27 14:41:06 -08:00
Kurt Van Dijck
6d3a9a6854 net: fix validate_link_af in rtnetlink core
I'm testing an API that uses IFLA_AF_SPEC attribute.
In the rtnetlink core , the set_link_af() member
of the rtnl_af_ops struct receives the nested attribute
(as I expected), but the validate_link_af() member
receives the parent attribute.
IMO, this patch fixes this.

Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27 14:39:21 -08:00
Eric Dumazet
389f2a18c6 econet: remove compiler warnings
net/econet/af_econet.c: In function ‘econet_sendmsg’:
net/econet/af_econet.c:494: warning: label ‘error’ defined but not used
net/econet/af_econet.c:268: warning: unused variable ‘sk’

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Phil Blundell <philb@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-27 14:15:54 -08:00
David S. Miller
7cc2edb834 xfrm6: Don't forget to propagate peer into ipsec route.
Like ipv4, we have to propagate the ipv6 route peer into
the ipsec top-level route during instantiation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-26 13:41:03 -08:00
David S. Miller
9b6941d8b1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-01-26 11:49:49 -08:00
Linus Lüssing
dd58ddc692 batman-adv: Fix kernel panic when fetching vis data on a vis server
The hash_iterate removal introduced a bug leading to a kernel panic when
fetching the vis data on a vis server. That commit forgot to rename one
variable name, which this commit fixes now.

Reported-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
2011-01-25 23:58:33 +01:00
Jerry Chu
44f5324b5d TCP: fix a bug that triggers large number of TCP RST by mistake
This patch fixes a bug that causes TCP RST packets to be generated
on otherwise correctly behaved applications, e.g., no unread data
on close,..., etc. To trigger the bug, at least two conditions must
be met:

1. The FIN flag is set on the last data packet, i.e., it's not on a
separate, FIN only packet.
2. The size of the last data chunk on the receive side matches
exactly with the size of buffer posted by the receiver, and the
receiver closes the socket without any further read attempt.

This bug was first noticed on our netperf based testbed for our IW10
proposal to IETF where a large number of RST packets were observed.
netperf's read side code meets the condition 2 above 100%.

Before the fix, tcp_data_queue() will queue the last skb that meets
condition 1 to sk_receive_queue even though it has fully copied out
(skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
tcp_recvmsg() often returns all the copied out data successfully
without actually consuming the skb, due to a check
"if ((chunk = len - tp->ucopy.len) != 0) {"
and
"len -= chunk;"
after tcp_prequeue_process() that causes "len" to become 0 and an
early exit from the big while loop.

I don't see any reason not to free the skb whose data have been fully
consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
get there if MSG_PEEK is on. Am I missing some arcane cases related
to urgent data?

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-25 13:46:30 -08:00
Felix Fietkau
eb3e554b4b mac80211: fix a crash in ieee80211_beacon_get_tim on change_interface
Some drivers (e.g. ath9k) do not always disable beacons when they're
supposed to. When an interface is changed using the change_interface op,
the mode specific sdata part is in an undefined state and trying to
get a beacon at this point can produce weird crashes.

To fix this, add a check for ieee80211_sdata_running before using
anything from the sdata.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-01-25 16:28:56 -05:00
David S. Miller
73a8bd74e2 ipv6: Revert 'administrative down' address handling changes.
This reverts the following set of commits:

d1ed113f16 ("ipv6: remove duplicate neigh_ifdown")
29ba5fed1b ("ipv6: don't flush routes when setting loopback down")
9d82ca98f7 ("ipv6: fix missing in6_ifa_put in addrconf")
2de7957072 ("ipv6: addrconf: don't remove address state on ifdown if the address is being kept")
8595805aaf ("IPv6: only notify protocols if address is compeletely gone")
27bdb2abcc ("IPv6: keep tentative addresses in hash table")
93fa159abe ("IPv6: keep route for tentative address")
8f37ada5b5 ("IPv6: fix race between cleanup and add/delete address")
84e8b803f1 ("IPv6: addrconf notify when address is unavailable")
dc2b99f71e ("IPv6: keep permanent addresses on admin down")

because the core semantic change to ipv6 address handling on ifdown
has broken some things, in particular "disable_ipv6" sysctl handling.

Stephen has made several attempts to get things back in working order,
but nothing has restored disable_ipv6 fully yet.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-25 12:49:08 -08:00
Andy Adamson
778be232a2 NFS do not find client in NFSv4 pg_authenticate
The information required to find the nfs_client cooresponding to the incoming
back channel request is contained in the NFS layer. Perform minimal checking
in the RPC layer pg_authenticate method, and push more detailed checking into
the NFS layer where the nfs_client can be found.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-25 15:26:51 -05:00
Sage Weil
42961d2333 libceph: fix socket write error handling
Pass errors from writing to the socket up the stack.  If we get -EAGAIN,
return 0 from the helper to simplify the callers' checks.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-01-25 08:19:34 -08:00
Sage Weil
98bdb0aa00 libceph: fix socket read error handling
If we get EAGAIN when trying to read from the socket, it is not an error.
Return 0 from the helper in this case to simplify the error handling cases
in the caller (indirectly, try_read).

Fix try_read to pass any error to it's caller (con_work) instead of almost
always returning 0.  This let's us respond to things like socket
disconnects.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-01-25 08:17:48 -08:00
Tejun Heo
ada609ee2a workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER
WQ_RESCUER is now an internal flag and should only be used in the
workqueue implementation proper.  Use WQ_MEM_RECLAIM instead.

This doesn't introduce any functional difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: dm-devel@redhat.com
Cc: Neil Brown <neilb@suse.de>
2011-01-25 14:35:54 +01:00
Eugene Teo
b7c7d01aae net: clear heap allocation for ethtool_get_regs()
There is a conflict between commit b00916b1 and a77f5db3. This patch resolves
the conflict by clearing the heap allocation in ethtool_get_regs().

Cc: stable@kernel.org
Signed-off-by: Eugene Teo <eugeneteo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 21:05:17 -08:00
David S. Miller
d80bc0fd26 ipv6: Always clone offlink routes.
Do not handle PMTU vs. route lookup creation any differently
wrt. offlink routes, always clone them.

Reported-by: PK <runningdoglackey@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 16:01:58 -08:00
John Fastabend
3dce38a02d dcbnl: make get_app handling symmetric for IEEE and CEE DCBx
The IEEE get/set app handlers use generic routines and do not
require the net_device to implement the dcbnl_ops routines. This
patch makes it symmetric so user space and drivers do not have
to handle the CEE version and IEEE DCBx versions differently.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 15:19:55 -08:00
Eric Dumazet
fd0273c503 tcp: fix bug in listening_get_next()
commit a8b690f98b (tcp: Fix slowness in read /proc/net/tcp)
introduced a bug in handling of SYN_RECV sockets.

st->offset represents number of sockets found since beginning of
listening_hash[st->bucket].

We should not reset st->offset when iterating through
syn_table[st->sbucket], or else if more than ~25 sockets (if
PAGE_SIZE=4096) are in SYN_RECV state, we exit from listening_get_next()
with a too small st->offset

Next time we enter tcp_seek_last_pos(), we are not able to seek past
already found sockets.

Reported-by: PK <runningdoglackey@yahoo.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 14:41:20 -08:00
David S. Miller
3408404a4c inetpeer: Use correct AVL tree base pointer in inet_getpeer().
Family was hard-coded to AF_INET but should be daddr->family.

This fixes crashes when unlinking ipv6 peer entries, since the
unlink code was looking up the base pointer properly.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 14:38:09 -08:00
Michal Schmidt
d1dc7abf2f GRO: fix merging a paged skb after non-paged skbs
Suppose that several linear skbs of the same flow were received by GRO. They
were thus merged into one skb with a frag_list. Then a new skb of the same flow
arrives, but it is a paged skb with data starting in its frags[].

Before adding the skb to the frag_list skb_gro_receive() will of course adjust
the skb to throw away the headers. It correctly modifies the page_offset and
size of the frag, but it leaves incorrect information in the skb:
 ->data_len is not decreased at all.
 ->len is decreased only by headlen, as if no change were done to the frag.
Later in a receiving process this causes skb_copy_datagram_iovec() to return
-EFAULT and this is seen in userspace as the result of the recv() syscall.

In practice the bug can be reproduced with the sfc driver. By default the
driver uses an adaptive scheme when it switches between using
napi_gro_receive() (with skbs) and napi_gro_frags() (with pages). The bug is
reproduced when under rx load with enough successful GRO merging the driver
decides to switch from the former to the latter.

Manual control is also possible, so reproducing this is easy with netcat:
 - on machine1 (with sfc): nc -l 12345 > /dev/null
 - on machine2: nc machine1 12345 < /dev/zero
 - on machine1:
   echo 1 > /sys/module/sfc/parameters/rx_alloc_method  # use skbs
   echo 2 > /sys/module/sfc/parameters/rx_alloc_method  # use pages
 - See that nc has quit suddenly.

[v2: Modified by Eric Dumazet to avoid advancing skb->data past the end
     and to use a temporary variable.]

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 14:27:18 -08:00
David S. Miller
e92427b289 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2011-01-24 13:17:06 -08:00
Eric Dumazet
c506653d35 net: arp_ioctl() must hold RTNL
Commit 941666c2e3 "net: RCU conversion of dev_getbyhwaddr() and
arp_ioctl()" introduced a regression, reported by Jamie Heilman.
"arp -Ds 192.168.2.41 eth0 pub" triggered the ASSERT_RTNL() assert
in pneigh_lookup()

Removing RTNL requirement from arp_ioctl() was a mistake, just revert
that part.

Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 13:16:16 -08:00
Thomas Jacob
08b5194b5d netfilter: xt_iprange: Incorrect xt_iprange boundary check for IPv6
iprange_ipv6_sub was substracting 2 unsigned ints and then casting
the result to int to find out whether they are lt, eq or gt each
other, this doesn't work if the full 32 bits of each part
can be used in IPv6 addresses. Patch should remedy that without
significant performance penalties. Also number of ntohl
calls can be reduced this way (Jozsef Kadlecsik).

Signed-off-by: Thomas Jacob <jacob@internet24.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-24 21:35:36 +01:00
Pablo Neira Ayuso
c71caf4114 netfilter: ctnetlink: fix missing refcount increment during dumps
In 13ee6ac netfilter: fix race in conntrack between dump_table and
destroy, we recovered spinlocks to protect the dump of the conntrack
table according to reports from Stephen and acknowledgments on the
issue from Eric.

In that patch, the refcount bump that allows to keep a reference
to the current ct object was removed. However, we still decrement
the refcount for that object in the output path of
ctnetlink_dump_table():

        if (last)
                nf_ct_put(last)

Cc: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-01-24 19:01:07 +01:00
Rusty Russell
577d6a7c3a module: fix missing semicolons in MODULE macro usage
You always needed them when you were a module, but the builtin versions
of the macros used to be more lenient.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-01-24 14:32:54 +10:30
Eric Dumazet
9190b3b320 net_sched: accurate bytes/packets stats/rates
In commit 44b8288308 (net_sched: pfifo_head_drop problem), we fixed
a problem with pfifo_head drops that incorrectly decreased
sch->bstats.bytes and sch->bstats.packets

Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
previously enqueued packet, and bstats cannot be changed, so
bstats/rates are not accurate (over estimated)

This patch changes the qdisc_bstats updates to be done at dequeue() time
instead of enqueue() time. bstats counters no longer account for dropped
frames, and rates are more correct, since enqueue() bursts dont have
effect on dequeue() rate.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-20 23:31:33 -08:00