Commit Graph

222872 Commits

Author SHA1 Message Date
KOSAKI Motohiro
a0b0f58cdd ksm: annotate ksm_thread_mutex is no deadlock source
commit 62b61f611e ("ksm: memory hotremove migration only") caused the
following new lockdep warning.

  =======================================================
  [ INFO: possible circular locking dependency detected ]
  -------------------------------------------------------
  bash/1621 is trying to acquire lock:
   ((memory_chain).rwsem){.+.+.+}, at: [<ffffffff81079339>]
  __blocking_notifier_call_chain+0x69/0xc0

  but task is already holding lock:
   (ksm_thread_mutex){+.+.+.}, at: [<ffffffff8113a3aa>]
  ksm_memory_callback+0x3a/0xc0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (ksm_thread_mutex){+.+.+.}:
       [<ffffffff8108b70a>] lock_acquire+0xaa/0x140
       [<ffffffff81505d74>] __mutex_lock_common+0x44/0x3f0
       [<ffffffff81506228>] mutex_lock_nested+0x48/0x60
       [<ffffffff8113a3aa>] ksm_memory_callback+0x3a/0xc0
       [<ffffffff8150c21c>] notifier_call_chain+0x8c/0xe0
       [<ffffffff8107934e>] __blocking_notifier_call_chain+0x7e/0xc0
       [<ffffffff810793a6>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff813afbfb>] memory_notify+0x1b/0x20
       [<ffffffff81141b7c>] remove_memory+0x1cc/0x5f0
       [<ffffffff813af53d>] memory_block_change_state+0xfd/0x1a0
       [<ffffffff813afd62>] store_mem_state+0xe2/0xf0
       [<ffffffff813a0bb0>] sysdev_store+0x20/0x30
       [<ffffffff811bc116>] sysfs_write_file+0xe6/0x170
       [<ffffffff8114f398>] vfs_write+0xc8/0x190
       [<ffffffff8114fc14>] sys_write+0x54/0x90
       [<ffffffff810028b2>] system_call_fastpath+0x16/0x1b

  -> #0 ((memory_chain).rwsem){.+.+.+}:
       [<ffffffff8108b5ba>] __lock_acquire+0x155a/0x1600
       [<ffffffff8108b70a>] lock_acquire+0xaa/0x140
       [<ffffffff81506601>] down_read+0x51/0xa0
       [<ffffffff81079339>] __blocking_notifier_call_chain+0x69/0xc0
       [<ffffffff810793a6>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff813afbfb>] memory_notify+0x1b/0x20
       [<ffffffff81141f1e>] remove_memory+0x56e/0x5f0
       [<ffffffff813af53d>] memory_block_change_state+0xfd/0x1a0
       [<ffffffff813afd62>] store_mem_state+0xe2/0xf0
       [<ffffffff813a0bb0>] sysdev_store+0x20/0x30
       [<ffffffff811bc116>] sysfs_write_file+0xe6/0x170
       [<ffffffff8114f398>] vfs_write+0xc8/0x190
       [<ffffffff8114fc14>] sys_write+0x54/0x90
       [<ffffffff810028b2>] system_call_fastpath+0x16/0x1b

But it's a false positive.  Both memory_chain.rwsem and ksm_thread_mutex
have an outer lock (mem_hotplug_mutex).  So they cannot deadlock.

Thus, This patch annotate ksm_thread_mutex is not deadlock source.

[akpm@linux-foundation.org: update comment, from Hugh]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:15 -08:00
KOSAKI Motohiro
20d6c96b5f mem-hotplug: introduce {un}lock_memory_hotplug()
Presently hwpoison is using lock_system_sleep() to prevent a race with
memory hotplug.  However lock_system_sleep() is a no-op if
CONFIG_HIBERNATION=n.  Therefore we need a new lock.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:15 -08:00
Andrew Morton
4fe65cab84 Documentation/filesystems/vfs.txt: fix ->repeasepage() description
->releasepage() does not remove the page from the mapping.

Acked-by: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:15 -08:00
Jeremy Fitzhardinge
64141da587 vmalloc: eagerly clear ptes on vunmap
On stock 2.6.37-rc4, running:

  # mount lilith:/export /mnt/lilith
  # find  /mnt/lilith/ -type f -print0 | xargs -0 file

crashes the machine fairly quickly under Xen.  Often it results in oops
messages, but the couple of times I tried just now, it just hung quietly
and made Xen print some rude messages:

    (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp
    3000000000000000) for mfn 1d7058 (pfn 18fa7)
    (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms
    (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp
    1000000000000000) for mfn 1d2e04 (pfn 1d1fb)
    (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04

Which means the domain tried to map a pagetable page RW, which would
allow it to map arbitrary memory, so Xen stopped it.  This is because
vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had
finished with them, and those pages got recycled as pagetable pages
while still having these RW aliases.

Removing those mappings immediately removes the Xen-visible aliases, and
so it has no problem with those pages being reused as pagetable pages.
Deferring the TLB flush doesn't upset Xen because it can flush the TLB
itself as needed to maintain its invariants.

When unmapping a region in the vmalloc space, clear the ptes
immediately.  There's no point in deferring this because there's no
amortization benefit.

The TLBs are left dirty, and they are flushed lazily to amortize the
cost of the IPIs.

This specific motivation for this patch is an oops-causing regression
since 2.6.36 when using NFS under Xen, triggered by the NFS client's use
of vm_map_ram() introduced in 56e4ebf877 ("NFS: readdir with vmapped
pages") .  XFS also uses vm_map_ram() and could cause similar problems.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:15 -08:00
Andres Salomon
853ff88324 cs5535-gpio: apply CS5536 errata workaround for GPIOs
The AMD Geode CS5536 Companion Device Silicon Revision B1 Specification
Update mentions the follow as issue #36:

 "Atomic write transactions to the atomic GPIO High Bank Feature Bit
  registers should only affect the bits selected [...]"

 "after Suspend, an atomic write transaction [...] will clear all
  non-selected bits of the accessed register."

In other words, writing to the high bank for a single GPIO bit will
clear every other GPIO bit (but only sometimes after a suspend).

The workaround described is obvious and simple; do a read-modify-write.
This patch does that, and documents why we're doing it.

Signed-off-by: Andres Salomon <dilinger@queued.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:15 -08:00
Frederic Weisbecker
238af8751f reiserfs: don't acquire lock recursively in reiserfs_acl_chmod
reiserfs_acl_chmod() can be called by reiserfs_set_attr() and then take
the reiserfs lock a second time.  Thereafter it may call journal_begin()
that definitely requires the lock not to be nested in order to release
it before taking the journal mutex because the reiserfs lock depends on
the journal mutex already.

So, aviod nesting the lock in reiserfs_acl_chmod().

Reported-by: Pawel Zawora <pzawora@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Pawel Zawora <pzawora@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: <stable@kernel.org>		[2.6.32.x+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:15 -08:00
Johannes Berg
0bae35e14b leds: fix up dependencies
It's not useful to build LED triggers when there's no LEDs that can be
triggered by them.  Therefore, fix up the dependencies so that this
cannot happen, and fix a few users that select triggers to depend on
LEDS_CLASS as well (there is also one user that also selects LEDS_CLASS,
which is OK).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Arnd Hannemann <arnd@arndnet.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Richard Purdie <rpurdie@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:15 -08:00
Wu Fengguang
e172662d11 vmstat: fix dirty threshold ordering
The nr_dirty_[background_]threshold fields are misplaced before the
numa_* fields, and users will read strange values.

This is the right order.  Before patch, nr_dirty_background_threshold
will read as 0 (the value from numa_miss).

	numa_hit 128501
	numa_miss 0
	numa_foreign 0
	numa_interleave 7388
	numa_local 128501
	numa_other 0
	nr_dirty_threshold 144291
	nr_dirty_background_threshold 72145

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Michael Rubin <mrubin@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:14 -08:00
Zeng Zhaoming
55cfaa3cbd mm/mempolicy.c: add rcu read lock to protect pid structure
find_task_by_vpid() should be protected by rcu_read_lock(), to prevent
free_pid() reclaiming pid.

Signed-off-by: Zeng Zhaoming <zengzm.kernel@gmail.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:14 -08:00
Dean Nelson
1f64d69c7a mm/hugetlb.c: avoid double unlock_page() in hugetlb_fault()
Have hugetlb_fault() call unlock_page(page) only if it had previously
called lock_page(page).

Setting CONFIG_DEBUG_VM=y and then running the libhugetlbfs test suite,
resulted in the tripping of VM_BUG_ON(!PageLocked(page)) in
unlock_page() having been called by hugetlb_fault() when page ==
pagecache_page.  This patch remedied the problem.

Signed-off-by: Dean Nelson <dnelson@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-12-02 14:51:14 -08:00
Linus Torvalds
94c35de9a9 Merge branch 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
* 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (27 commits)
  Staging: rt2870: Add USB ID for Buffalo Airstation WLI-UC-GN
  staging: easycap needs smp_lock.h, fixes build error
  Staging: batman-adv: ensure that eth_type_trans gets linear memory
  Staging: batman-adv: Don't remove interface with spinlock held
  staging: brcm80211: updated maintainers contact information
  staging: fix winbond build, needs delay.h
  Staging: line6: fix up my fixup for some sysfs attribute permissions
  Staging: zram: fix up my fixup for some sysfs attribute permissions
  Staging: udlfb: fix up my fixup for some sysfs attribute permissions
  Staging: samsung-laptop: fix up my fixup for some sysfs attribute permissions
  Staging: iio: adis16220: fix up my fixup for some sysfs attribute permissions
  Staging: frontier: fix up my fixup for some sysfs attribute permissions
  Staging: asus_oled: fix up my fixup for some sysfs attribute permissions
  staging: spectra: fix build error
  Staging: intel_sst: fix memory leak
  Staging: rtl8712: signedness bug in init
  staging: rtl8187se: Change panic to warn when RF switch turned off
  staging: comedi: fix memory leak
  Staging: quickstart: free after input_unregister_device()
  Staging: speakup: free after input_unregister_device()
  ...
2010-12-02 12:59:11 -08:00
Linus Torvalds
8733cb29d6 Merge branch 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  uio: Change mail address of Hans J. Koch
  driver core: prune docs about device_interface
  driver core: the development tree has switched to git
2010-12-02 12:58:36 -08:00
Linus Torvalds
eed5ee1a3a Merge branch 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  serial: mfd: adjust the baud rate setting
  TTY: open/hangup race fixup
  TTY: don't allow reopen when ldisc is changing
  NET: wan/x25, fix ldisc->open retval
  TTY: ldisc, fix open flag handling
  serial8250: Mark console as CON_ANYTIME
2010-12-02 12:58:16 -08:00
Linus Torvalds
435a5aebf6 Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: fix autosuspend bug in usb-serial
  USB: ehci: disable LPM and PPCD for nVidia MCP89 chips
  USB: serial: ftdi_sio: Vardaan USB RS422/485 converter PID added
  USB: yurex: add .llseek fop to file_operations
  USB: ftdi_sio: Add ID for RT Systems USB-29B radio cable
  usb: musb: do not use dma for control transfers
  usb: musb: gadget: fix compilation warning
  usb: musb: clear RXCSR_AUTOCLEAR before PIO read
  usb: musb: unmap dma buffer when switching to PIO
  xhci: Don't let the USB core disable SuperSpeed ports.
  xhci: Setup array of USB 2.0 and USB 3.0 ports.
  xhci: Fix reset-device and configure-endpoint commands
2010-12-02 12:57:35 -08:00
Linus Torvalds
2e5c26de1d Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  watchdog: it8712f_wdt: add note to Kconfig
  watchdog: gef_wdt: include fs.h
  watchdog: bcm63xx_wdt: improve platform part.
  watchdog: iTCO_wdt: TCO Watchdog patch for Intel Patsburg PCH
2010-12-02 12:11:31 -08:00
Linus Torvalds
75318ec327 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB: Fix information leak in marshalling code
  IB/pack: Remove some unused code added by the IBoE patches
  IB/mlx4: Fix IBoE link state
  IB/mlx4: Fix IBoE reported link rate
  mlx4_core: Workaround firmware bug in query dev cap
  IB/mlx4: Fix memory ordering of VLAN insertion control bits
  MAINTAINERS: Update NetEffect entry
2010-12-02 12:10:56 -08:00
Linus Torvalds
8cb280c90f Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: only run xfs_error_test if error injection is active
  xfs: avoid moving stale inodes in the AIL
  xfs: delayed alloc blocks beyond EOF are valid after writeback
  xfs: push stale, pinned buffers on trylock failures
  xfs: fix failed write truncation handling.
2010-12-02 09:13:36 -08:00
Linus Torvalds
8fed709f34 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
  regulator: fix kernel-doc for set_consumer_device_supply
  regulator: enable supply regulator only when use count is zero
  regulator: twl-regulator - fix twlreg_set_mode
  regulator: lock supply in regulator enable
  regulator: Return proper error for regulator_register()
  regulator: Ensure enough delay time for enabling regulator
  regulator: Remove a redundant device_remove_file call in create_regulator
  regulator: Staticise mc13783_powermisc_rmw()
  regulator: regulator disable supply fix
2010-12-02 08:06:16 -08:00
Linus Torvalds
53f517a1f6 Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  [media] v4l: Remove module_name argument to the v4l2_i2c_new_subdev* functions
  [media] v4l: Remove hardcoded module names passed to v4l2_i2c_new_subdev* (2)
2010-12-02 08:05:56 -08:00
Linus Torvalds
04ed0978d5 Merge branch 'rbd-sysfs' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'rbd-sysfs' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: replace the rbd sysfs interface
2010-12-02 08:05:22 -08:00
Linus Torvalds
8520eeaa12 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: fix parsing of hostname in dfs referrals
  cifs: display fsc in /proc/mounts
  cifs: enable fscache iff fsc mount option is used explicitly
  cifs: allow fsc mount option only if CONFIG_CIFS_FSCACHE is set
  cifs: Handle extended attribute name cifs_acl to generate cifs acl blob (try #4)
  cifs: Misc. cleanup in cifsacl handling [try #4]
  cifs: trivial comment fix for cifs_invalidate_mapping
  [CIFS] fs/cifs/Kconfig: CIFS depends on CRYPTO_HMAC
  cifs: don't take extra tlink reference in initiate_cifs_search
  cifs: Percolate error up to the caller during get/set acls [try #4]
  cifs: fix another memleak, in cifs_root_iget
  cifs: fix potential use-after-free in cifs_oplock_break_put
2010-12-02 08:04:21 -08:00
Wim Van Sebroeck
4fc3680894 watchdog: it8712f_wdt: add note to Kconfig
On some motherboards the it8712f watchdog does not work unless
the game port was enabled. see Bug 13140. We therefor add a note
to Kconfig.

Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-12-02 14:10:32 +00:00
Wolfram Sang & Martyn Welch
f6e0722fc3 watchdog: gef_wdt: include fs.h
Add missing include "linux/fs.h".
This fixes compile failure.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Martyn Welch <martyn.welch@ge.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-12-02 14:10:21 +00:00
Wim Van Sebroeck
e6c3b699b2 watchdog: bcm63xx_wdt: improve platform part.
* fix devinit and devexit sections
* fix platform removal code so that the iounmap happens after the removal of the timer.
* changes the reboot_notifier by a platform shutdown method.

Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-12-02 14:10:16 +00:00
Seth Heasley
c54fb81174 watchdog: iTCO_wdt: TCO Watchdog patch for Intel Patsburg PCH
This patch adds an additional LPC Controller DeviceID for the Intel Patsburg PCH for TCO Watchdog.

Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2010-12-02 14:10:05 +00:00
Roland Dreier
7adce751ce Merge branches 'misc', 'mlx4' and 'nes' into for-next 2010-12-01 16:33:47 -08:00
Vasiliy Kulikov
91a4d157d0 IB: Fix information leak in marshalling code
ib_ucm_init_qp_attr() and ucma_init_qp_attr() pass struct ib_uverbs_qp_attr
with reserved, qp_state, {ah_attr,alt_ah_attr}{reserved,->grh.reserved}
fields uninitialized to copy_to_user().  This leads to leaking of
contents of kernel stack memory to userspace.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-12-01 16:33:18 -08:00
Or Gerlitz
f55864a4f4 IB/pack: Remove some unused code added by the IBoE patches
Remove unused functions added by commit ff7f5aab35 ("IB/pack: IBoE UD
packet packing support").

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
2010-12-01 16:30:18 -08:00
Eli Cohen
21d606090e IB/mlx4: Fix IBoE link state
Use netif_running() and netif_carrier_ok() to report link state,
exactly as is done to report Ethernet link state in sysfs.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-12-01 16:11:29 -08:00
Eli Cohen
328266c561 IB/mlx4: Fix IBoE reported link rate
The link rate is the product of the link speed in the link width. For
Etherent ports the rate is 10G, so we use 1 for the width and 4 for
speed to get the correct rate.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-12-01 16:10:35 -08:00
Eli Cohen
58d74bb1d9 mlx4_core: Workaround firmware bug in query dev cap
ConnectX firmware is supposed to report the number blue flame
registers per page as log2 of the value.  However, due to a firmware
bug, it reports actual number.  This patch works around this by
checking if the number of registers calculated fits within a page.  If
it does not, we use 8 registers per page.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-12-01 16:08:47 -08:00
Yehuda Sadeh
dfc5606dc5 rbd: replace the rbd sysfs interface
The new interface creates directories per mapped image
and under each it creates a subdir per available snapshot.
This allows keeping a cleaner interface within the sysfs
guidelines. The ABI documentation was updated too.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2010-12-01 15:53:22 -08:00
Eli Cohen
e27535b9c6 IB/mlx4: Fix memory ordering of VLAN insertion control bits
We must fully update the control segment before marking it as valid,
so that hardware doesn't start executing it before we're ready.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>

[ Move VLAN control bit setting to before wmb().  - Roland ]

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-12-01 11:08:54 -08:00
Chien Tung
e3d33cb132 MAINTAINERS: Update NetEffect entry
Correct web link as www.neteffect.com is no longer valid.  Remove
Chien Tung as maintainer.  I am moving on to other responsibilities at
Intel.  Thanks for all the fish.

Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-12-01 10:56:24 -08:00
Dave Chinner
c76febef57 xfs: only run xfs_error_test if error injection is active
Recent tests writing lots of small files showed the flusher thread
being CPU bound and taking a long time to do allocations on a debug
kernel. perf showed this as the prime reason:

             samples  pcnt function                    DSO
             _______ _____ ___________________________ _________________

           224648.00 36.8% xfs_error_test              [kernel.kallsyms]
            86045.00 14.1% xfs_btree_check_sblock      [kernel.kallsyms]
            39778.00  6.5% prandom32                   [kernel.kallsyms]
            37436.00  6.1% xfs_btree_increment         [kernel.kallsyms]
            29278.00  4.8% xfs_btree_get_rec           [kernel.kallsyms]
            27717.00  4.5% random32                    [kernel.kallsyms]

Walking btree blocks during allocation checking them requires each
block (a cache hit, so no I/O) call xfs_error_test(), which then
does a random32() call as the first operation.  IOWs, ~50% of the
CPU is being consumed just testing whether we need to inject an
error, even though error injection is not active.

Kill this overhead when error injection is not active by adding a
global counter of active error traps and only calling into
xfs_error_test when fault injection is active.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-01 07:40:20 -06:00
Dave Chinner
de25c1818c xfs: avoid moving stale inodes in the AIL
When an inode has been marked stale because the cluster is being
freed, we don't want to (re-)insert this inode into the AIL. There
is a race condition where the cluster buffer may be unpinned before
the inode is inserted into the AIL during transaction committed
processing. If the buffer is unpinned before the inode item has been
committed and inserted, then it is possible for the buffer to be
released and hence processthe stale inode callbacks before the inode
is inserted into the AIL.

In this case, we then insert a clean, stale inode into the AIL which
will never get removed by an IO completion. It will, however, get
reclaimed and that triggers an assert in xfs_inode_free()
complaining about freeing an inode still in the AIL.

This race can be avoided by not moving stale inodes forward in the AIL
during transaction commit completion processing. This closes the
race condition by ensuring we never insert clean stale inodes into
the AIL. It is safe to do this because a dirty stale inode, by
definition, must already be in the AIL.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-01 07:40:20 -06:00
Dave Chinner
309c848002 xfs: delayed alloc blocks beyond EOF are valid after writeback
There is an assumption in the parts of XFS that flushing a dirty
file will make all the delayed allocation blocks disappear from an
inode. That is, that after calling xfs_flush_pages() then
ip->i_delayed_blks will be zero.

This is an invalid assumption as we may have specualtive
preallocation beyond EOF and they are recorded in
ip->i_delayed_blks. A flush of the dirty pages of an inode will not
change the state of these blocks beyond EOF, so a non-zero
deeelalloc block count after a flush is valid.

The bmap code has an invalid ASSERT() that needs to be removed, and
the swapext code has a bug in that while it swaps the data forks
around, it fails to swap the i_delayed_blks counter associated with
the fork and hence can get the block accounting wrong.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-01 07:40:20 -06:00
Dave Chinner
90810b9e82 xfs: push stale, pinned buffers on trylock failures
As reported by Nick Piggin, XFS is suffering from long pauses under
highly concurrent workloads when hosted on ramdisks. The problem is
that an inode buffer is stuck in the pinned state in memory and as a
result either the inode buffer or one of the inodes within the
buffer is stopping the tail of the log from being moved forward.

The system remains in this state until a periodic log force issued
by xfssyncd causes the buffer to be unpinned. The main problem is
that these are stale buffers, and are hence held locked until the
transaction/checkpoint that marked them state has been committed to
disk. When the filesystem gets into this state, only the xfssyncd
can cause the async transactions to be committed to disk and hence
unpin the inode buffer.

This problem was encountered when scaling the busy extent list, but
only the blocking lock interface was fixed to solve the problem.
Extend the same fix to the buffer trylock operations - if we fail to
lock a pinned, stale buffer, then force the log immediately so that
when the next attempt to lock it comes around, it will have been
unpinned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-01 07:40:20 -06:00
Dave Chinner
c726de4409 xfs: fix failed write truncation handling.
Since the move to the new truncate sequence we call xfs_setattr to
truncate down excessively instanciated blocks.  As shown by the testcase
in kernel.org BZ #22452 that doesn't work too well.  Due to the confusion
of the internal inode size, and the VFS inode i_size it zeroes data that
it shouldn't.

But full blown truncate seems like overkill here.  We only instanciate
delayed allocations in the write path, and given that we never released
the iolock we can't have converted them to real allocations yet either.

The only nasty case is pre-existing preallocation which we need to skip.
We already do this for page discard during writeback, so make the delayed
allocation block punching a generic function and call it from the failed
write path as well as xfs_aops_discard_page. The callers are
responsible for ensuring that partial blocks are not truncated away,
and that they hold the ilock.

Based on a fix originally from Christoph Hellwig. This version used
filesystem blocks as the range unit.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-12-01 07:40:19 -06:00
Linus Torvalds
fb82155d5c Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: add workaround for dce3 ddc line vbios bug
  drm/radeon/kms: fix interlaced and doublescan handling
  drm/radeon/kms: fix typos in disabled vbios code
  Revert "drm/i915/dp: use VBT provided eDP params if available"
  drm/i915: Clear pfit registers when not used by any outputs
  drm: record monitor status in output_poll_execute
  drm: Set connector DPMS status to ON in drm_crtc_helper_set_config
  drm/i915: fix regression due to ba3d8d749b
  Revert "drm/radeon/kms: fix typo in r600 cs checker"
  drm/i915/sdvo: Always add a 30ms delay to make SDVO TV detection reliable
  MAINTAINERS: INTEL DRM DRIVERS list (intel-gfx) is subscribers-only
  drm/i915/sdvo: Always fallback to querying the shared DDC line
  drm/i915: Handle pagefaults in execbuffer user relocations
  drm/i915/sdvo: Only enable HDMI encodings only if the commandset is supported
  drm/radeon/kms: fix resume regression for some r5xx laptops
  drm/radeon/kms: fix regression in rs4xx i2c setup
  drm/i915: Only save/restore cursor regs if !KMS
  drm/i915: Prevent integer overflow when validating the execbuffer
2010-11-30 20:13:35 -08:00
Alex Deucher
3074adc8b6 drm/radeon/kms: add workaround for dce3 ddc line vbios bug
fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=23752

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc:stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-12-01 12:13:37 +10:00
Alex Deucher
c49948f4bd drm/radeon/kms: fix interlaced and doublescan handling
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-12-01 12:13:23 +10:00
Alex Deucher
0ec80d6456 drm/radeon/kms: fix typos in disabled vbios code
6xx/7xx was hitting the wrong BUS_CNTL reg and bits.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-12-01 12:13:10 +10:00
Dave Airlie
150f8815bb Merge remote branch 'intel/drm-intel-fixes' of /ssd/git/drm-next into drm-fixes
* 'intel/drm-intel-fixes' of /ssd/git/drm-next:
  Revert "drm/i915/dp: use VBT provided eDP params if available"
  drm/i915: Clear pfit registers when not used by any outputs
  drm/i915: fix regression due to ba3d8d749b
2010-12-01 12:10:34 +10:00
Linus Torvalds
22a5b566c8 Merge branch 'for_linus' of git://github.com/at91linux/linux-2.6-at91
* 'for_linus' of git://github.com/at91linux/linux-2.6-at91:
  at91/board-yl-9200: fix typo in video support
  atmel_spi: fix warning In function 'atmel_spi_dma_map_xfer'
  at91/picotux200: remove commenting usb device and dataflash support
  at91: rename rm9200ek and rm9200dk board file name
  at91rm9200ek: fix warning: 'ek_mmc_data' defined but not used
  at91rm9200dk: fix warning: 'dk_mmc_data' defined but not used
  at91: Convert remaining boards to new-style UART initialization
  at91: merge all at91rm9200 defconfig in one single file
2010-11-30 17:57:57 -08:00
Oleg Nesterov
114279be21 exec: copy-and-paste the fixes into compat_do_execve() paths
Note: this patch targets 2.6.37 and tries to be as simple as possible.
That is why it adds more copy-and-paste horror into fs/compat.c and
uglifies fs/exec.c, this will be cleanuped later.

compat_copy_strings() plays with bprm->vma/mm directly and thus has
two problems: it lacks the RLIMIT_STACK check and argv/envp memory
is not visible to oom killer.

Export acct_arg_size() and get_arg_page(), change compat_copy_strings()
to use get_arg_page(), change compat_do_execve() to do acct_arg_size(0)
as do_execve() does.

Add the fatal_signal_pending/cond_resched checks into compat_count() and
compat_copy_strings(), this matches the code in fs/exec.c and certainly
makes sense.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-30 17:56:38 -08:00
Oleg Nesterov
3c77f84572 exec: make argv/envp memory visible to oom-killer
Brad Spengler published a local memory-allocation DoS that
evades the OOM-killer (though not the virtual memory RLIMIT):
http://www.grsecurity.net/~spender/64bit_dos.c

execve()->copy_strings() can allocate a lot of memory, but
this is not visible to oom-killer, nobody can see the nascent
bprm->mm and take it into account.

With this patch get_arg_page() increments current's MM_ANONPAGES
counter every time we allocate the new page for argv/envp. When
do_execve() succeds or fails, we change this counter back.

Technically this is not 100% correct, we can't know if the new
page is swapped out and turn MM_ANONPAGES into MM_SWAPENTS, but
I don't think this really matters and everything becomes correct
once exec changes ->mm or fails.

Reported-by: Brad Spengler <spender@grsecurity.net>
Reviewed-and-discussed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-30 17:56:37 -08:00
Feng Tang
a5880a9e5b serial: mfd: adjust the baud rate setting
Previous baud rate setting code only has been tested with 3.5M/9600/
115200/230400/460800 bps, and recently we got a 3M bps device to test,
which needs to modify current MUL register setting, and with this
patch 2.5M/2M/1.5M/1M/0.5M should also work as they just use a MUL
value scale down from 3M's.

Also got some reference register setting from silicon guys for
different baud rates, which tries to keep the pre-scalar register value
to 16.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-11-30 17:32:32 -08:00
Greg Kroah-Hartman
b7a5100bc2 Merge branch 'for-greg' of git://gitorious.org/usb/usb into work 2010-11-30 15:52:04 -08:00
Greg Kroah-Hartman
8244272341 Merge branch 'for-usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci into work 2010-11-30 15:38:41 -08:00